Computer Vision - part II



Review of the main parts of Section B of the course.
School of Computer Science & Statistics, Trinity College Dublin, Dublin 2, Ireland - www.scss.tcd.ie

2nd half of the Vision Course - on one page
- 3D vision: camera calibration, stereo, single moving camera, photometric stereo, radiometry
- Structure from Motion, VSLAM
- Feature extraction: dense (HOG); sparse (SIFT, SURF)
- Classification: training & evaluation (ROC)
- Feature selection: high-dimensional data
- Applications: recognition, Photosynth, CBIR

3D Vision - Camera Calibration
- Pinhole camera model
- Extrinsics and intrinsics
- Zhang's method
- Know the main mathematical structure of the method
- Know how it is applied practically (see the sketch below)
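
As a concrete anchor for the practical side, here is a minimal sketch of Zhang's method as implemented in OpenCV (the document itself points to this implementation later). The `calib_*.jpg` filenames and the 9x6 inner-corner board size are assumptions for illustration only.

```python
import glob
import cv2
import numpy as np

# Hypothetical checkerboard with 9x6 inner corners; the board is
# planar, so its model points all have Z = 0 (the key assumption
# Zhang's homography-based method exploits).
pattern = (9, 6)
objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2)

obj_points, img_points = [], []
for path in glob.glob("calib_*.jpg"):  # placeholder filenames
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, pattern)
    if found:
        obj_points.append(objp)     # planar board points
        img_points.append(corners)  # their detected image projections

# Solves for the intrinsic matrix K, distortion coefficients, and
# per-view extrinsics (R, t) from the plane-to-image homographies.
rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_points, img_points, gray.shape[::-1], None, None)
```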

Stereo Vision
- Epipolar geometry
- Canonical configuration
- Calculation of depth (see the relation below)
- Assumptions & limitations
- Solving the correspondence problem
- Constraints to apply
- Bottom-up (regions to features) vs top-down (features to regions)
- Disparity - the PMF algorithm
- Middlebury stereo vision page: vision.middlebury.edu/stereo/
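
In the canonical (parallel-camera) configuration, the depth calculation reduces to similar triangles. Writing f for the focal length, B for the baseline, and d for the disparity between the left and right image coordinates of the same point:

```latex
Z = \frac{f\,B}{d}, \qquad d = x_{l} - x_{r}
```

Note that depth resolution degrades as disparity shrinks, which is why distant points are recovered less accurately.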

VSLAM - Davison
- Real time - optimised for speed
- EKF-based approach (a generic predict/update sketch follows)
- Shi-Tomasi feature extraction
- Patch around each keypoint
- Orientation and warp-function assessment
- Building a sparse 3D map
- EKF update
- Limit the search space - limit the number of keypoints
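
A minimal generic EKF predict/update sketch, to fix the structure of the filtering step. This is not Davison's exact MonoSLAM state parameterisation: f, F, h, H below are placeholder motion and measurement models supplied by the caller.

```python
import numpy as np

def ekf_step(x, P, z, f, F, h, H, Q, R):
    # Predict: propagate state and covariance through the motion model.
    x_pred = f(x)
    F_k = F(x)
    P_pred = F_k @ P @ F_k.T + Q

    # Update: correct with the measurement (e.g. a matched image patch).
    H_k = H(x_pred)
    y = z - h(x_pred)                      # innovation
    S = H_k @ P_pred @ H_k.T + R           # innovation covariance
    K = P_pred @ H_k.T @ np.linalg.inv(S)  # Kalman gain
    x_new = x_pred + K @ y
    P_new = (np.eye(len(x)) - K @ H_k) @ P_pred
    return x_new, P_new
```

The innovation covariance S also defines the search ellipse for each keypoint, which is how the approach limits the search space.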

SFM - Pollefeys
- Offline process - optimised for accuracy
- Locate keypoints - Shi & Tomasi
- Solve the F matrix - know the steps
- Use only key frames
- Solve for close views
- Find the calibration matrix
- Dense surface estimation
- Multi-view linking
- 3D surface reconstruction & texture

People Tracking with HOG
- What is a HOG? "Histograms of Oriented Gradients for Human Detection", Dalal and Triggs, CVPR 2005
- Break the image into cells
- Calculate the HOG for each cell
- Normalise over overlapping blocks
- HOG data used in classification: Support Vector Machine or other classifiers (see the sketch below)
- Practical issues (effect of smoothing, sampling scales, etc.)
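
As a hedged sketch of the full pipeline in practice: OpenCV ships a Dalal-Triggs-style HOG descriptor with a pretrained linear-SVM people detector. The filename is a placeholder.

```python
import cv2

img = cv2.imread("street.jpg")  # hypothetical input image

# HOGDescriptor implements the pipeline above: gradient histograms
# per cell, normalisation over overlapping blocks, then a linear SVM.
hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

# Slide the detection window over the image at multiple scales.
rects, weights = hog.detectMultiScale(img, winStride=(8, 8), scale=1.05)
for (x, y, w, h) in rects:
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
```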

People Tracking - HOF
- Improvement when combined with HOG
- Motion used in activity recognition
- Differential optical flow (see the sketch below)
- Motion Boundary Histogram
- Internal Motion Histogram
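
A minimal sketch of dense differential optical flow between two consecutive frames. The filenames are placeholders, and Farneback's method stands in for the flow estimator; the lecture does not prescribe a specific one.

```python
import cv2

prev = cv2.imread("frame0.png", cv2.IMREAD_GRAYSCALE)  # hypothetical
curr = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)  # filenames

flow = cv2.calcOpticalFlowFarneback(
    prev, curr, None,
    pyr_scale=0.5, levels=3, winsize=15,
    iterations=3, poly_n=5, poly_sigma=1.2, flags=0)

# flow[..., 0] and flow[..., 1] hold per-pixel x/y displacements;
# histograms of these (and of their spatial derivatives, as in the
# motion boundary histograms) give the descriptors combined with HOG.
```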

SIFT
- Recognise 3D objects in 2D images
- Challenges that must be overcome: scale, viewpoint, lighting, occlusion, noise
- SIFT - sparse features
- Detecting features in scale space: Difference of Gaussians at different scales
- Find approximate feature locations in scale space
- Select and precisely fit keypoints, rejecting edge responses using principal curvatures
- Histogram of gradients (HOG) computed around the feature point
- Rotational invariance by measuring relative to the primary direction
- Illumination invariance through normalisation
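
A hedged illustration of SIFT in practice, from detection through matching. The filenames are placeholders, and OpenCV >= 4.4 is assumed (where SIFT lives in the main module).

```python
import cv2

img1 = cv2.imread("view1.jpg", cv2.IMREAD_GRAYSCALE)  # hypothetical
img2 = cv2.imread("view2.jpg", cv2.IMREAD_GRAYSCALE)  # filenames

# Detect keypoints in scale space and compute their descriptors.
sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# Lowe's ratio test: keep a match only if it is clearly better
# than the second-best candidate.
bf = cv2.BFMatcher(cv2.NORM_L2)
matches = bf.knnMatch(des1, des2, k=2)
good = [m for m, n in matches if m.distance < 0.75 * n.distance]
```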

SURF
- Keypoint detection: approximation of the Hessian
- Box filters for Dxx, Dyy and Dxy (= Dyx, since the Hessian is symmetric)
- Use of the integral image (see the sketch below)
- Scale space through scaling the filters
- Non-maximal suppression in a 3×3×3 region
- Descriptor: Haar wavelet responses in a rotated window
- Accelerated matching due to the contrast measure (sign of the Laplacian)
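
A minimal sketch of the integral-image trick SURF relies on: after one cumulative-sum pass, any box-filter response costs four array lookups, independent of the box size.

```python
import numpy as np

def integral_image(img):
    # Sum of all pixels above and to the left, inclusive.
    return img.cumsum(axis=0).cumsum(axis=1)

def box_sum(ii, r0, c0, r1, c1):
    # Sum over rows r0..r1 and columns c0..c1 (inclusive),
    # using only four lookups in the integral image.
    total = ii[r1, c1]
    if r0 > 0:
        total -= ii[r0 - 1, c1]
    if c0 > 0:
        total -= ii[r1, c0 - 1]
    if r0 > 0 and c0 > 0:
        total += ii[r0 - 1, c0 - 1]
    return total

img = np.arange(16, dtype=np.float64).reshape(4, 4)
ii = integral_image(img)
assert box_sum(ii, 1, 1, 3, 3) == img[1:4, 1:4].sum()
```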

Classification
- Bayesian classifiers: model the PDF of the classes
- KNN
- Maximum likelihood
- Mahalanobis distance
- Performance evaluation (see the sketch below)
- Overfitting / selection bias
- ROC curve analysis
- Cross-validation / bootstrapping
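
A minimal sketch of honest classifier evaluation, assuming scikit-learn is available; the random features are purely illustrative.

```python
import numpy as np
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import roc_curve, auc

# Two synthetic classes in 5 dimensions (illustrative data only).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (100, 5)), rng.normal(1, 1, (100, 5))])
y = np.array([0] * 100 + [1] * 100)

# Cross-validation guards against overfitting / selection bias:
# every accuracy estimate comes from data the model never saw.
knn = KNeighborsClassifier(n_neighbors=5)
print("5-fold CV accuracy:", cross_val_score(knn, X, y, cv=5).mean())

# ROC analysis on a held-out split: sweep the decision threshold
# and trace true-positive rate against false-positive rate.
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)
knn.fit(Xtr, ytr)
scores = knn.predict_proba(Xte)[:, 1]
fpr, tpr, _ = roc_curve(yte, scores)
print("AUC:", auc(fpr, tpr))
```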

Feature Selection
- The problem: high-dimensional data, data-set imbalance
- PCA: find the features with the most variance
- LDA: find the features that are most separable (see the sketch below)
- Advanced methods to achieve classification: SVM, manifolds
- It is all about getting good features
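
A short sketch contrasting the two projections, assuming scikit-learn; the random features are placeholders.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (50, 10)), rng.normal(0.5, 1, (50, 10))])
y = np.array([0] * 50 + [1] * 50)

# PCA is unsupervised: it keeps the directions of largest variance,
# whether or not those directions separate the classes.
Xp = PCA(n_components=2).fit_transform(X)

# LDA is supervised: it projects onto directions that maximise
# between-class over within-class scatter (one axis for two classes).
Xl = LinearDiscriminantAnalysis(n_components=1).fit_transform(X, y)
```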

Applications: Recognition
- Face recognition: individual recognition; face-class recognition
- Eigenfaces
- Restrictions on data format
- 3D object recognition: SIFT and SURF
- Strong features
- Geometric relationships as the matching criterion

Applications: Photosynth
- Feature extraction: SIFT
- Camera calibration: PTLens
- F-matrix calculation: RANSAC (see the sketch below)
- 3D point cloud
- Hyperlinks between images
- Image selection based on view angle and scale
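
A minimal sketch of robust F-matrix estimation with RANSAC, continuing from the SIFT matching sketch earlier (kp1, kp2 and the "good" match list are assumed to come from there).

```python
import cv2
import numpy as np

# Matched keypoint coordinates from the two views.
pts1 = np.float32([kp1[m.queryIdx].pt for m in good])
pts2 = np.float32([kp2[m.trainIdx].pt for m in good])

# RANSAC fits F to random minimal samples and keeps the hypothesis
# with the most inliers: matches lying within ~3 px of the epipolar
# line predicted by F.
F, mask = cv2.findFundamentalMat(
    pts1, pts2, cv2.FM_RANSAC,
    ransacReprojThreshold=3.0, confidence=0.99)
print("inliers:", int(mask.sum()), "of", len(pts1))
```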

Applications: CBIR
- Search for images based on content
- Feature extraction: global vs local
- Feature vector
- Fast comparison: histograms (see the sketch below)
- Earth Mover's Distance
- Relevance feedback
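
A minimal sketch of CBIR-style fast comparison using global colour histograms; the filenames are placeholders.

```python
import cv2

def colour_hist(path):
    img = cv2.imread(path)
    hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
    # 2D hue/saturation histogram as a global feature vector.
    hist = cv2.calcHist([hsv], [0, 1], None, [32, 32],
                        [0, 180, 0, 256])
    return cv2.normalize(hist, hist).flatten()

h1 = colour_hist("query.jpg")      # hypothetical filenames
h2 = colour_hist("candidate.jpg")

# Higher correlation = more similar content. This is one of several
# cv2.compareHist metrics; the Earth Mover's Distance is a costlier
# but finer-grained alternative.
score = cv2.compareHist(h1, h2, cv2.HISTCMP_CORREL)
```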

Exam
- Section A: Dr Pitie - 3 questions
- Section B: Dr Lacey - 3 questions
- Do 4 questions from the 6: 2 questions from each section
- All questions have the same structure:
- Theory / knowledge part [7 marks]
- Practical problem-solving part [18 marks]

Previous question

Part A: Analyse the differences between the SIFT and SURF feature detectors, comparing feature keypoint identification methods and methods for achieving scale invariance and orientation invariance.

Part B: Your new employer, the Dublin Virtual Tourist Board, wants to create an interactive web site that allows users to explore major landmarks in Dublin online. Given recent budget cuts, the only equipment they can give you is a good-quality SLR camera and a reasonably powerful computer. You need to propose a design that will achieve their objective. Your design should include a detailed description of the steps required to process the captured images and an analysis of any limitations in the performance of the system. Please clearly state any assumptions that you make in the design of the system.

Model answer - Part A

SIFT multi-scale feature point detection: a scale-space pyramid is created by sub-sampling the image to produce images of different sizes... explain. Low-contrast or poor edge responses below a threshold are rejected. Keypoints lying along edges are removed by examining the principal curvatures at the point: if the response is strong in one direction and weak in the perpendicular direction, the keypoint is rejected. [2 marks]

SURF feature point detection: SURF uses an approximation to the Hessian to find keypoints in the image. The Hessian is the matrix of second-order partial derivatives. SURF approximates these partial derivatives using binary box-filter masks. The masks are convolved with the integral image to find the features. The integral image is constructed by summing all pixels above and to the left of the current pixel. [1 mark]

SURF multi-scale feature point detection: multi-scale detection is achieved by scaling up the size of the binary masks. Keypoints are detected in a 3×3 neighbourhood; if they are also present in the scale above and below, they are marked as potential keypoints. [1 mark]

SIFT orientation invariance: SIFT calculates the histogram of orientation gradients in a window around the keypoint. The gradient strength and the distance from the keypoint weight the values in the histogram. The histogram is smoothed and thresholded. If there is more than one dominant direction in the histogram, a second keypoint is generated with that orientation. [1 mark]

SURF orientation invariance: SURF calculates the responses of horizontal and vertical Haar wavelets in a sliding window around the keypoint. The angle of the sliding window is a configurable parameter. [1 mark]

SURF is faster than SIFT, and the SURF feature orientation vector is less prone to being corrupted by noise because it is calculated over the area of the Haar wavelet rather than from a single pixel's edge direction. [1 mark]

Model answer - Part B

Students should highlight two main solutions:
1. A solution based on extracting 3D surfaces from the images and allowing users to browse the database of photographs based on their location within the 3D model.
2. A solution based on extracting 3D surfaces from the images and also extracting textures from them, building a fully textured 3D model that the user can explore. [2 marks]

In both solutions students should cover the following key issues:

Camera calibration: calibrate the camera using a checkerboard pattern and Zhang's approach as implemented in OpenCV (this limits the camera to one focal length). An alternative, and preferred, approach is to exploit the information contained in the JPEG header of the image file and use PTLens to determine the camera intrinsic parameters. A third (less favoured) approach is to perform self-calibration; again this leads to the limitation of a single camera focal length. The single-focal-length limitation could be counteracted by taking several different sequences using different focal lengths / lenses and combining the separately calculated 3D models. [2 marks]

Feature extraction: use a feature extraction system such as Shi-Tomasi, SIFT, SURF, etc. to identify keypoints across the images. SIFT and SURF are preferable as their features can be matched at multiple scales and are more distinctive. [2 marks]

Stereo view set-up: calculate the F matrix between image pairs using RANSAC and count the feature points that are inliers; iterate until high confidence has been achieved. The validity of the stereo calculation needs to be assessed: if the baseline between the two views is small (the angle between the views is less than 10°) then the calculation of the F matrix will be ill-conditioned. This can be verified in two ways: 1. where the estimated camera positions are very close, reject the match; 2. following Pollefeys, if the stereo match estimated using the epipolar lines from the F matrix is better than using a simple 2D planar homography, then this is a good stereo pair; otherwise reject it. [4 marks]

Dense 3D point matching: having identified the good stereo pairs, dense stereo matching should be performed along the epipolar lines. Images may be rectified into the canonical configuration in order to speed up the matching process. Constraints such as the disparity limit, the ordering constraint and others apply - describe these. [3 marks]

Multi-view linking: the points generated from multiple stereo pairs must be merged. Noise and camera calibration errors mean that the same physical point may be recorded at different 3D positions in different views. Describe how this works... [3 marks]

Depending on the approach taken by the student:
1. 3D model approach: one way to display the images is to build a 3D surface from the 3D point cloud and texture it using the texture information from the 2D images. To achieve this, the 3D depth map would have to be smoothed to remove the impact of noise; then a 3D polygonal mesh would be built using Delaunay triangulation or similar (see the sketch after this list). The texture for the polygons... explain how to build a model. [2 marks]
2. 3D browsing of the photo database: taking the Photosynth approach, the 3D model is used to explore the database of original images. The user sees the image from the database that is closest to the view direction and scale of the current view of the 3D model. If the user changes their view position or zooms... explain how the Photosynth approach works. [2 marks]
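
For the meshing step, a minimal sketch of Delaunay triangulation over a point cloud, assuming SciPy is available; the random points are placeholders for a real reconstruction.

```python
import numpy as np
from scipy.spatial import Delaunay

rng = np.random.default_rng(0)
pts3d = rng.uniform(0, 1, (200, 3))  # placeholder point cloud

# For a roughly fronto-parallel surface, triangulate in 2D (x, y)
# and lift the resulting triangles back onto the 3D points.
tri = Delaunay(pts3d[:, :2])
mesh_faces = tri.simplices  # (n_triangles, 3) vertex indices
```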

General advice:
- Use diagrams where appropriate
- Use bullet points where appropriate
- Use flowcharts where appropriate
- Long rambling answers tend not to pick up marks - be concise and to the point
- Answer all questions in a separate answer book

Best of luck!
School of Computer Science & Statistics, Trinity College Dublin, Dublin 2, Ireland - www.scss.tcd.ie