Object Detection - Basics 1


Object Detection - Basics [1] Lecture 28. See Sections 10.1.1, 10.1.2, and 10.1.3 in Reinhard Klette: Concise Computer Vision, Springer-Verlag, London, 2014. [1] See last slide for copyright information. 1 / 33

Agenda 1 Localization, Classification, Evaluation 2 Descriptors, Classifiers, Learning 3 Performance of Object Detectors 4 Descriptor Example: Histogram of Oriented Gradients 2 / 33

Localization: Localization, classification, and evaluation are the three basic steps of an object detection system. Object candidates are localized within a rectangular bounding box. 3 / 33

Classification: Localized object candidates are mapped by classification into either detected objects or rejected candidates. Face detection example: one false-positive and two false-negatives (not counting the side view of a face). 4 / 33

Evaluation: A true-positive, also called a detection, is a correctly detected object. A false-positive, also called a false detection, occurs if we detect an object where there is none. A false-negative denotes a case where we miss an object. A true-negative describes the cases where non-object regions are correctly identified as non-object regions (typically not of interest). 5 / 33

Which one is TP or FP or FN or TN? 6 / 33

Agenda 1 Localization, Classification, Evaluation 2 Descriptors, Classifiers, Learning 3 Performance of Object Detectors 4 Descriptor Example: Histogram of Oriented Gradients 7 / 33

Descriptors: Classification is membership in pairwise-disjoint classes, which are subsets of R^n; the dimension n > 0 is defined by the descriptors used. A descriptor x = (x_1, ..., x_n) is a point in the n-dimensional descriptor space R^n, representing measured or calculated property values in a given order. Two examples: n = 128 for SIFT; n = 2 on the next page, where the descriptor space is defined by the properties perimeter and area, e.g. descriptor x_1 = (621.605, 10940) for Segment 1. 8 / 33

Example: 2D Descriptor Space. Left: Regions in a segmented image (labeled 1 to 6). Right: Descriptor space with axes perimeter (200 to 2,600) and area (10,000 to 80,000), showing the six descriptors and a separating blue line labeled +1 / -1. The blue line defines a binary classifier; it subdivides the descriptor space into two half-planes such that descriptors in one half-plane have value +1 assigned (i.e. +1 is a class identifier), and -1 if in the other half-plane. 9 / 33

Classifiers: A classifier (i.e. a partitioning of the descriptor space) assigns class numbers to descriptors. Training: a given set {x_1, ..., x_m} of already-classified descriptors (the learning set) is used for defining the partitioning (the classifier). Application: the classifier is applied to descriptors generated for recorded data. General classifier: assigns class numbers 1, 2, ..., k for k > 1 classes, and 0 for "not classified". Binary classifier: assigns class numbers -1 or +1. 10 / 33

Weak or Strong Classifiers: A classifier is weak if it does not perform up to expectations (e.g., it might be just a bit better than random guessing). Multiple weak classifiers can be combined into a strong classifier, aiming at a satisfactory solution of a classification problem. Weak or strong classifiers can be general-case (i.e. multi-class) classifiers or just binary classifiers; being binary alone does not make a classifier weak. Example: AdaBoost defines a statistical combination of multiple weak classifiers into one strong classifier (see later); a sketch of the idea follows below. 11 / 33
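To make the combination idea concrete, here is a minimal AdaBoost-style sketch in Python (not from the slides): the weak classifiers are hypothetical one-dimensional threshold stumps, and the strong classifier is their weighted vote.

```python
import numpy as np

def stump_predict(X, dim, thresh, sign):
    # Weak classifier: threshold one descriptor component, output -1 or +1
    return sign * np.where(X[:, dim] >= thresh, 1, -1)

def adaboost_train(X, y, rounds=10):
    m, n = X.shape
    w = np.full(m, 1.0 / m)                  # descriptor weights
    strong = []                              # list of (alpha, stump) pairs
    for _ in range(rounds):
        best = None
        for dim in range(n):                 # pick the stump with lowest weighted error
            for thresh in np.unique(X[:, dim]):
                for sign in (+1, -1):
                    err = np.sum(w[stump_predict(X, dim, thresh, sign) != y])
                    if best is None or err < best[0]:
                        best = (err, (dim, thresh, sign))
        err, stump = best
        err = max(err, 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)            # weight of this weak classifier
        w *= np.exp(-alpha * y * stump_predict(X, *stump))  # re-weight misclassified descriptors
        w /= w.sum()
        strong.append((alpha, stump))
    return strong

def adaboost_classify(strong, X):
    # Strong classifier: weighted vote of all weak classifiers
    score = sum(a * stump_predict(X, *s) for a, s in strong)
    return np.sign(score)

# Example with two assumed 2D descriptor clouds:
X = np.vstack([np.random.randn(20, 2) + 2, np.random.randn(20, 2) - 2])
y = np.array([+1] * 20 + [-1] * 20)
strong = adaboost_train(X, y, rounds=5)
print(np.mean(adaboost_classify(strong, X) == y))        # training accuracy
```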

Example 1: Binary Classifier by Linear Separation. We define a binary classifier by constructing a hyperplane Π in R^n, for n ≥ 1. Vector w ∈ R^n is the weight vector, and the real number b ∈ R is the bias of Π: Π : w · x + b = 0. Example: for n = 2 or n = 3, w is the gradient or normal orthogonal to the defined line or plane Π, respectively. 12 / 33

Example 1: Continued. [Figure: two descriptor distributions with separating line Π in the (x_1, x_2) plane.] Left: Linearly separable distribution of descriptors pre-classified to be either in class +1 (green descriptors) or -1 (red descriptors). Right: Not linearly separable; the sum of the shown distances (black line segments) of misclassified descriptors defines the total error for Π. 13 / 33

Example 1: Continued. h(x) = w · x + b. h(x) ≥ 0: one side of the hyperplane (including the plane itself) defines value +1. h(x) < 0: the other side (not including the plane itself) defines value -1. A linear classifier defined by w and b can be calculated for a distribution of (pre-classified) training descriptors in the n-dimensional descriptor space. The error for a misclassified descriptor x is its perpendicular distance d_2(x, Π) = |w · x + b| / ||w||_2 to the hyperplane Π. Task: Calculate Π such that the total error for all misclassified training descriptors is minimized; see the sketch below. 14 / 33
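A minimal Python sketch of this linear binary classifier and its error measure; the weight vector, bias, and example descriptors are assumed values, not taken from the slides.

```python
import numpy as np

w = np.array([2.0, -1.0])   # assumed weight vector (normal of the hyperplane)
b = -3.0                    # assumed bias

def classify(x):
    """Return +1 on one side of the hyperplane (including it), -1 on the other."""
    return 1 if np.dot(w, x) + b >= 0 else -1

def distance_to_plane(x):
    """Perpendicular distance d_2(x, Pi) = |w . x + b| / ||w||_2."""
    return abs(np.dot(w, x) + b) / np.linalg.norm(w)

# Total error over a set of assumed pre-classified training descriptors:
X = np.array([[1.0, 2.0], [4.0, 0.5], [0.5, 4.0]])
y = np.array([-1, +1, -1])
total_error = sum(distance_to_plane(x) for x, label in zip(X, y)
                  if classify(x) != label)
print(total_error)
```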

Example 2: Classification by Using a Binary Decision Tree. The classifier is defined by binary decisions at split nodes in a tree (i.e. "yes" or "no"). Each decision is formalized by a rule, and for given input data it can be tested whether they satisfy the rule or not; accordingly, we proceed with the identified successor node in the tree. Each leaf node of the tree finally defines an assignment of data arriving at this node into classes. Example: each leaf node identifies exactly one class in R^n; see the next slide for n = 2. 15 / 33

Example 2: Continued. Left: Decision tree with split rules x_1 < 100 at the root, x_2 > 60 and x_1 > 160 at the next level, and x_1 + x_2 < 120 further down, each with yes/no branches. Right: Resulting subdivision of the 2D descriptor space (axes x_1 and x_2, ranging up to 200). The tested rules in the shown example of a tree define straight lines in the 2D descriptor space; descriptors arriving at one of the leaf nodes are then in one of the shown subsets of R^2. A small sketch follows below. 16 / 33
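A small Python sketch of such a decision tree, hard-coding the split rules shown above; the exact placement of the fourth split and the class labels at the leaves are assumed for illustration.

```python
def classify(x1, x2):
    # Each path from the root to a leaf corresponds to one region of the
    # 2D descriptor space bounded by the straight lines of the tested rules.
    if x1 < 100:
        if x2 > 60:
            return "class A"
        return "class B" if x1 + x2 < 120 else "class C"
    return "class D" if x1 > 160 else "class E"

print(classify(80, 70))    # x1 < 100 and x2 > 60  -> "class A"
print(classify(170, 30))   # x1 >= 100 and x1 > 160 -> "class D"
```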

Trees, Forests, Cascades of Binary Classifiers: A single decision tree (defined by at least one split node) can be considered an example of a weak classifier. A set of decision trees, called a forest, can then be used for defining a strong classifier. Observation: a single decision tree provides a way to partition a descriptor space into multiple regions (i.e. classes). When applying binary classifiers defined by linear separation, we need to combine several of those (e.g. in a cascade) to achieve a similar partitioning of a descriptor space. 17 / 33

Learning: Learning is the process of defining or training a classifier based on a set of descriptors. Classification is the actual application of the classifier. During classification we may also identify some misbehavior, and this can lead again to another phase of learning. The set of descriptors used for learning may be pre-classified or not. Supervised learning: we have a mechanism for assigning class numbers to descriptors (e.g. manually, based on expertise such as "yes, the driver does have closed eyes in this image"). Unsupervised learning: we do not have prior knowledge about class memberships of descriptors, e.g. for randomly selected patches in an image: is this a typical patch for a pedestrian or not? 18 / 33

Unsupervised Learning: Two Examples. The data distribution in the learning set decides about the classifier. Clustering: apply a clustering algorithm to a given set of descriptors for identifying a separation of R^n into classes. Example: analyze the density of the distribution of given descriptors in R^n; a region having a dense distribution defines a seed point of one class, and then we assign all descriptors to the identified seed points by applying, for example, the nearest-neighbor rule (see the sketch below). Learn rules at split nodes in a decision tree: learn decision rules at split nodes, e.g. by having a general scheme for how to define such rules, and optimise parameters by maximising the information gain at this split node (e.g. an equal number of training descriptors passing to either the left or the right successor). 19 / 33
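A minimal Python sketch of the seed-based clustering idea: assumed seed points (e.g. found as density maxima of the descriptor distribution) define the classes, and each descriptor is assigned to its nearest seed by the nearest-neighbor rule.

```python
import numpy as np

def assign_to_seeds(descriptors, seeds):
    """Return, for each descriptor, the index of the nearest seed point."""
    labels = []
    for x in descriptors:
        dists = [np.linalg.norm(x - s) for s in seeds]   # Euclidean distances
        labels.append(int(np.argmin(dists)))             # nearest-neighbor rule
    return labels

# Assumed example descriptors and seed points in a 2D descriptor space:
descriptors = np.array([[1.0, 1.2], [0.9, 0.8], [5.1, 4.9], [5.3, 5.2]])
seeds = np.array([[1.0, 1.0], [5.0, 5.0]])
print(assign_to_seeds(descriptors, seeds))   # -> [0, 0, 1, 1]
```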

Positive (for "Pedestrian") and Negative Class Examples 20 / 33

Combined Learning Approaches: There are also cases where we may combine supervised learning with strategies known from unsupervised learning. Example. Supervised: decide whether a given bounding box shows a pedestrian, or decide for a patch (a subwindow of a bounding box) whether it possibly belongs to a pedestrian. Unsupervised: generate a decision tree, e.g. by maximising information gain at split nodes. Result: assign class probabilities to a leaf node in the generated tree according to the percentages of pre-classified descriptors arriving at this leaf node (a small sketch follows below). 21 / 33
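A tiny Python sketch of this last step, with assumed example counts: leaf class probabilities are just the fractions of pre-classified descriptors arriving at that leaf.

```python
# Assumed counts of pre-classified descriptors arriving at one leaf node:
leaf_counts = {"pedestrian": 18, "non-pedestrian": 6}
total = sum(leaf_counts.values())
leaf_probabilities = {c: n / total for c, n in leaf_counts.items()}
print(leaf_probabilities)   # -> {'pedestrian': 0.75, 'non-pedestrian': 0.25}
```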

Agenda 1 Localization, Classification, Evaluation 2 Descriptors, Classifiers, Learning 3 Performance of Object Detectors 4 Descriptor Example: Histogram of Oriented Gradients 22 / 33

Object Detector and Measures: An object detector is defined by applying a classifier to an object detection problem. We assume that any decision made can be evaluated as being either correct or false. Evaluations of designed object detectors are required to compare their performance under particular conditions. There are common measures in pattern recognition and information retrieval for the performance evaluation of classifiers. 23 / 33

Basic Definitions: Let tp and fp denote the numbers of true-positives and false-positives, respectively. Let tn and fn denote the numbers of true-negatives and false-negatives, respectively. What are the numbers for the example on Slide 6? Note: the image alone does not indicate how many non-object regions have been analyzed (and correctly identified as being no faces); thus we cannot specify the number tn; we need to analyze the applied classifier for obtaining tn. 24 / 33

PR, RC, MR, and FPPI: Precision is the ratio of true-positives compared to all detections. Recall (or sensitivity) is the ratio of true-positives to all potentially possible detections:
PR = tp / (tp + fp) and RC = tp / (tp + fn)
PR = 1: no false-positive is detected. RC = 1: all visible objects are detected and there is no false-negative. The miss rate is the ratio of false-negatives to all objects. False-positives per image is the ratio of false-positives to all detected objects:
MR = fn / (tp + fn) = 1 - RC and FPPI = fp / (tp + fp) = 1 - PR
MR = 0: all visible objects are detected. FPPI = 0: detected objects are correctly classified. 25 / 33

TNR and AC: tn is not a common entry for performance measures but, if available, we also have TNR and AC. The true-negative rate (or specificity) is the ratio of true-negatives to all decisions in no-object regions. Accuracy is the ratio of correct decisions to all decisions:
TNR = tn / (tn + fp) and AC = (tp + tn) / (tp + tn + fp + fn)
26 / 33
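A small Python sketch computing the measures from the last two slides (PR, RC, MR, FPPI, and, if tn is available, TNR and AC) from the four counts; the example counts are assumed values.

```python
def measures(tp, fp, fn, tn=None):
    pr = tp / (tp + fp)                    # precision
    rc = tp / (tp + fn)                    # recall (sensitivity)
    mr = fn / (tp + fn)                    # miss rate = 1 - RC
    fppi = fp / (tp + fp)                  # false-positives per image = 1 - PR
    out = {"PR": pr, "RC": rc, "MR": mr, "FPPI": fppi}
    if tn is not None:                     # TNR and AC only if tn is available
        out["TNR"] = tn / (tn + fp)        # specificity
        out["AC"] = (tp + tn) / (tp + tn + fp + fn)
    return out

print(measures(tp=8, fp=2, fn=1, tn=50))   # assumed example counts
```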

Detected? How to decide whether a detected object is a true-positive? Assume: objects in images have been locally identified (e.g. manually) by bounding boxes, serving as the ground truth. Detected objects are matched with these ground-truth boxes by calculating the ratio of areas of overlapping regions
a_o = A(D ∩ T) / A(D ∪ T)
where A denotes the area of a region in an image, D is the detected bounding box of the object, and T is the bounding box of the matched ground-truth object. If a_o ≥ T, say for T = 0.5 (here T denotes a threshold, not the ground-truth box), the detected object is taken as a true-positive. If there is more than one possible match for a detected bounding box, then use the one with the largest a_o value. 27 / 33
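A minimal Python sketch of the overlap ratio a_o for axis-aligned boxes given as (x_min, y_min, x_max, y_max); the example boxes are assumed values.

```python
def overlap_ratio(D, T):
    ix = max(0.0, min(D[2], T[2]) - max(D[0], T[0]))   # width of D ∩ T
    iy = max(0.0, min(D[3], T[3]) - max(D[1], T[1]))   # height of D ∩ T
    inter = ix * iy                                    # A(D ∩ T)
    area_d = (D[2] - D[0]) * (D[3] - D[1])
    area_t = (T[2] - T[0]) * (T[3] - T[1])
    union = area_d + area_t - inter                    # A(D ∪ T)
    return inter / union

detected = (10, 10, 60, 80)
ground_truth = (20, 15, 70, 85)
print(overlap_ratio(detected, ground_truth) >= 0.5)    # true-positive?
```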

Agenda 1 Localization, Classification, Evaluation 2 Descriptors, Classifiers, Learning 3 Performance of Object Detectors 4 Descriptor Example: Histogram of Oriented Gradients 28 / 33

Scanning an Image for Object Candidates: (1) A window of the size of the expected bounding box scans through an image. (2) The scan stops at potential object candidates. (3) If a potential bounding box has been identified, a process for descriptor calculation starts. The histogram of oriented gradients (HoG) is a common way to derive a descriptor for the bounding box of an object candidate. A sliding-window sketch follows below. 29 / 33
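A minimal Python sketch of the sliding-window scan; the descriptor and classifier functions are passed in (for example the HoG descriptor of the next slides and a binary classifier), and the dummy functions in the usage example are assumed placeholders only.

```python
import numpy as np

def scan_image(img, box_h, box_w, descriptor_fn, classify_fn, stride=8):
    detections = []
    H, W = img.shape
    for y in range(0, H - box_h + 1, stride):          # top-down scan
        for x in range(0, W - box_w + 1, stride):      # left-to-right scan
            window = img[y:y + box_h, x:x + box_w]     # candidate bounding box
            if classify_fn(descriptor_fn(window)) == +1:
                detections.append((x, y, box_w, box_h))
    return detections

# Usage with assumed placeholder descriptor/classifier functions:
img = np.random.rand(128, 128)
boxes = scan_image(img, 64, 32,
                   descriptor_fn=lambda w: w.mean(),
                   classify_fn=lambda d: +1 if d > 0.5 else -1)
print(len(boxes))
```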

Bounding Box, Blocks, and Cells A bounding box (here: of a pedestrian) is subdivided into blocks, and each block into smaller cells for calculating the HoG Yellow solid or dashed blocks are subdivided into red cells; a block moves left to right, top down, through a bounding box Right: Magnitudes of gradient vectors 30 / 33

Algorithm for Calculating the HoG Descriptor
1. Preprocessing: intensity normalization and smoothing.
2. Calculate an edge map: gradient magnitudes and gradient angles for each pixel, generating a magnitude map I_m and an angle map I_a.
3. Spatial binning: group pixels into non-overlapping cells (e.g. 8 x 8); accumulate the magnitude values in I_m into direction bins (e.g., nine bins for intervals of 20 degrees each) to obtain a voting vector for each cell.
4. Normalize voting values for generating a descriptor: group cells (e.g., 2 x 2) into one block; normalize the voting vectors over each block, and combine them into one block vector.
5. Concatenation: append all block vectors consecutively; this produces the final HoG descriptor.
A sketch of these steps follows below. 31 / 33
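A minimal NumPy sketch of steps 2 to 5 of this algorithm for a grayscale image given as a 2D array; the cell size, block size, and nine 20-degree bins follow the example values above, while the block layout (overlapping blocks with a stride of one cell) and the normalization constant are assumptions.

```python
import numpy as np

def hog_descriptor(img, cell=8, block=2, bins=9):
    img = img.astype(np.float64)
    # Step 2: edge map - gradient magnitudes I_m and angles I_a per pixel
    gy, gx = np.gradient(img)
    mag = np.hypot(gx, gy)                      # I_m
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180  # I_a, unsigned gradient directions
    # Step 3: spatial binning - one voting vector per non-overlapping cell
    H, W = img.shape
    ncy, ncx = H // cell, W // cell
    votes = np.zeros((ncy, ncx, bins))
    bin_width = 180 / bins                      # e.g. 20 degrees per bin
    for i in range(ncy):
        for j in range(ncx):
            m = mag[i*cell:(i+1)*cell, j*cell:(j+1)*cell].ravel()
            a = ang[i*cell:(i+1)*cell, j*cell:(j+1)*cell].ravel()
            idx = np.minimum((a // bin_width).astype(int), bins - 1)
            for k in range(bins):               # accumulate magnitudes per direction bin
                votes[i, j, k] = m[idx == k].sum()
    # Steps 4-5: group cells into blocks, normalize, and concatenate
    block_vectors = []
    for i in range(ncy - block + 1):
        for j in range(ncx - block + 1):
            v = votes[i:i+block, j:j+block, :].ravel()
            block_vectors.append(v / (np.linalg.norm(v) + 1e-6))
    return np.concatenate(block_vectors)        # final HoG descriptor

# Example: HoG descriptor for an assumed 64 x 32 bounding-box image
img = np.random.rand(64, 32)
print(hog_descriptor(img).shape)
```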

Two Examples: The length of the vectors in nine different directions in each cell represents the accumulated magnitude of gradient vectors for one of those nine directions. 32 / 33

Copyright Information: This slide show was prepared by Reinhard Klette with kind permission from Springer Science+Business Media B.V. The slide show can be used freely for presentations. However, all the material is copyrighted. R. Klette. Concise Computer Vision. © Springer-Verlag, London, 2014. In case of citation: just cite the book, that's fine. 33 / 33