Semantic Recognition: Object Detection and Scene Segmentation

1 Semantic Recognition: Object Detection and Scene Segmentation. Xuming He, Computer Vision Research Group, NICTA. Robotic Vision Summer School 2015. Acknowledgement: slides from Fei-Fei Li, R. Fergus, A. Torralba, K. Grauman.

2 Semantic scene understanding Semantic recognition tasks: Object detection Semantic segmentation

3 Semantic scene understanding Semantic recognition tasks: Object detection Semantic segmentation Activity recognition Scene layout estimation Scene categorization Geo-localization

4 Outline Object (category) detection Basic framework of object detection Case study: Viola-Jones, DPM and R-CNN Discussions Semantic scene segmentation Pixel labeling and CRF Design of CRF models Discussions

5 Object detection: why is it so hard? Illumination, object pose, clutter, occlusions, intra-class appearance, viewpoint. (Slide credit: Kristen Grauman)

6 Object detection: why is it so hard? Realistic scenes are crowded and cluttered, and contain overlapping objects. (Slide credit: Kristen Grauman)

7 What works reliably today: (semi-)rigid objects, e.g., face, car, pedestrian, license plate/traffic sign.

8 Generic object detection: basic framework. Training stage (build/train object model): choose an object representation; learn or fit parameters of the object model. Inference/prediction stage: generate candidates in a new image; score the candidates. (Slide credit: Kristen Grauman)

9 Case study I: Viola-Jones face detector Overview: A seminal approach to real-time object detection (Viola and Jones, IJCV 2004) Object / feature representation: Global template with Haar wavelet features Object model and scoring: Classifying object candidates into face/non-face Use boosted combination of discriminative features as final classifier

10 Viola-Jones detector: Haar wavelets. Rectangular filters: the feature output is the difference between the pixel sums of adjacent regions. Efficiently computable with the integral image, whose value at (x,y) is the sum of pixels above and to the left of (x,y): any rectangle sum can then be computed in constant time. (Slide credit: Kristen Grauman)
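
The constant-time rectangle sum can be illustrated with a minimal pure-Python sketch. The function names and the zero-padded layout (one extra row and column) are assumptions made for this example, not the layout used by any particular implementation.

```python
def integral_image(img):
    """Return ii with one extra row/column of zeros, so that
    ii[y][x] is the sum of img[r][c] over all r < y and c < x."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of the pixels inside a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def two_rect_feature(ii, top, left, height, width):
    """Two-rectangle Haar-like feature: left half minus right half."""
    half = width // 2
    return (rect_sum(ii, top, left, height, half)
            - rect_sum(ii, top, left + half, height, half))


img = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]
ii = integral_image(img)
total = rect_sum(ii, 0, 0, 3, 4)            # whole image
inner = rect_sum(ii, 1, 1, 2, 2)            # 6 + 7 + 10 + 11
feature = two_rect_feature(ii, 0, 0, 3, 4)  # left two columns minus right two
```

Once the table is built in a single pass, every rectangle sum costs four lookups regardless of the rectangle size, which is what makes evaluating many Haar-like features per window affordable.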

11 Viola-Jones detector: features. Which subset of these features should we use to determine if a window has a face? Considering all possible filter parameters (position, scale, and type), there are 180,000+ possible features associated with each 24 x 24 window. Use AdaBoost both to select the informative features and to build the classifier. (Slide credit: Kristen Grauman)

12 Viola-Jones detector: Boosting. Defines a classifier using an additive model: $F(x) = \sum_t \alpha_t h_t(x)$, where $F$ is the strong classifier, $x$ the feature vector, $\alpha_t$ a weight and $h_t$ a weak classifier. Training: incrementally select weak classifiers. During each step, we select a weak learner that does well on examples that were hard for the previous weak learners. Hardness is captured by weights attached to training examples.

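13 Viola-Jones detector: AdaBoost. Want to select the single rectangle feature and threshold that best separates positive (faces) and negative (non-faces) training examples, in terms of weighted error. Resulting weak classifier: threshold the feature output, $h_t(x) = +1$ if $f_t(x) > \theta_t$ and $-1$ otherwise, where $f_t$ is the selected rectangle feature and $\theta_t$ its threshold. [Figure: outputs of a possible rectangle feature on faces and non-faces.] For the next round, reweight the examples according to errors and choose another filter/threshold combination. (Slide credit: Kristen Grauman)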
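
One boosting round of this kind can be sketched as follows. This is the textbook discrete AdaBoost update with labels in {+1, -1} and threshold stumps, used here as an assumption; the original Viola-Jones paper states the update in a slightly different but related form. All function names and the toy data are hypothetical.

```python
import math


def best_stump(values, labels, weights):
    """For one feature, pick the threshold and polarity with the lowest
    weighted error. values: feature outputs; labels: +1 / -1."""
    best = None
    for theta in sorted(set(values)):
        for polarity in (+1, -1):
            preds = [polarity if v >= theta else -polarity for v in values]
            err = sum(w for p, y, w in zip(preds, labels, weights) if p != y)
            if best is None or err < best[0]:
                best = (err, theta, polarity, preds)
    return best


def adaboost_round(features, labels, weights):
    """features: one list of values per candidate feature.
    Returns (feature index, threshold, polarity, alpha, new weights)."""
    choice = None
    for j, values in enumerate(features):
        err, theta, polarity, preds = best_stump(values, labels, weights)
        if choice is None or err < choice[1]:
            choice = (j, err, theta, polarity, preds)
    j, err, theta, polarity, preds = choice
    err = min(max(err, 1e-10), 1 - 1e-10)       # guard against log(0)
    alpha = 0.5 * math.log((1 - err) / err)     # weight of this weak classifier
    # Misclassified examples are up-weighted, correct ones down-weighted.
    new_w = [w * math.exp(-alpha * y * p)
             for w, y, p in zip(weights, labels, preds)]
    z = sum(new_w)
    return j, theta, polarity, alpha, [w / z for w in new_w]


# Toy data: two candidate features evaluated on five training windows.
features = [[1, 2, 3, 4, 5], [2, 5, 4, 1, 3]]
labels = [-1, +1, -1, +1, +1]
weights = [0.2] * 5
j, theta, polarity, alpha, new_weights = adaboost_round(features, labels, weights)
```

In this toy run the first feature wins with one misclassified example, and that example ends up carrying half of the total weight in the next round, so the next weak classifier is pushed to get it right.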

14 Viola-Jones detector: Learned model First two features selected

15 Viola-Jones detector: Candidate generation. Sliding window at multiple scales; each window is passed to the face/non-face classifier. (Slide credit: Kristen Grauman)
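
A multi-scale sliding window can be sketched as a generator of candidate boxes. The parameter names and defaults below (base window of 24 pixels, scale step, stride as a fraction of the window size) are illustrative assumptions, and the sketch grows the window rather than shrinking the image.

```python
def sliding_windows(img_w, img_h, base=24, scale_step=1.25, stride_frac=0.1):
    """Yield (x, y, size) for square candidate windows at multiple scales.
    scale_step must be greater than 1 so that the window keeps growing."""
    size = base
    while size <= min(img_w, img_h):
        stride = max(1, int(size * stride_frac))
        for y in range(0, img_h - size + 1, stride):
            for x in range(0, img_w - size + 1, stride):
                yield x, y, size
        size = int(round(size * scale_step))


# Coarse settings on a tiny 48 x 48 image so the output stays small.
windows = list(sliding_windows(48, 48, base=24, scale_step=2.0, stride_frac=0.5))
```

Each yielded window would then be handed to the face/non-face classifier; with realistic settings the number of windows per image is very large, which motivates cheap early rejection.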

16 Cascading classifiers for detection. Form a cascade with low false negative rates early on. Apply less accurate but faster classifiers first to immediately discard windows that clearly appear to be negative. (Slide credit: Kristen Grauman)
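
The early-rejection idea can be sketched in a few lines. The stage functions and thresholds here are toy stand-ins (mean brightness and contrast of a flat list of pixel values), not the boosted stages of the actual detector.

```python
def cascade_classify(window, stages):
    """stages: list of (score_fn, threshold), ordered cheapest first.
    Returns (accepted, number of stages evaluated); a window is rejected
    as soon as one stage scores below its threshold."""
    for n_evaluated, (score_fn, threshold) in enumerate(stages, start=1):
        if score_fn(window) < threshold:
            return False, n_evaluated
    return True, len(stages)


# Hypothetical stages: a cheap brightness test, then a contrast test.
stages = [
    (lambda w: sum(w) / len(w), 0.2),
    (lambda w: max(w) - min(w), 0.5),
]

dark = cascade_classify([0.0, 0.1, 0.1, 0.0], stages)
flat = cascade_classify([0.4, 0.5, 0.4, 0.5], stages)
textured = cascade_classify([0.1, 0.9, 0.2, 0.8], stages)
```

Most windows in a typical image look like the first case and leave the cascade after the cheapest stage, so the average cost per window stays close to the cost of that stage.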

17 Detecting profile faces? Can we use the same detector?

18 Viola-Jones detector: Strengths. Sliding window detection and global appearance descriptors: simple detection protocol to implement; good feature choices are critical; past successes for certain classes. Implementation: OpenCV. (Slide credit: Kristen Grauman)

19 Viola-Jones detector: Limitations. Non-rigid, deformable objects are not captured well by representations that assume a fixed 2D structure or a fixed viewpoint. Objects with less regular textures are not captured well by holistic appearance-based descriptions. (Slide credit: Kristen Grauman)

20 Case Study II: Deformable Part-based Models Overview: Felzenszwalb et al. PAMI 10, and winner of the PASCAL detection challenge (2008,2009) Part-based representation: Global (root) template + deformable parts Trained from global bounding-boxes only

21 DPMs: Part-based representation. Objects are decomposed into parts and spatial relations among parts (Fischler and Elschlager, 1973).

22 DPMs: Object representation Based on HOG features (1 root + 6 parts) Full model is a mixture of deformable part-based models

23 DPMs: Object model Object candidates obtained in a multi-scale fashion

24 DPMs: Object model Score of candidates
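
The slide's own formula is not preserved in the transcription. As a reference point, the score of a placement in the cited work of Felzenszwalb et al. is usually written as below; the symbol names follow that paper's convention and should be read as an assumption about what the slide showed.

```latex
\[
\mathrm{score}(p_0, \dots, p_n)
  = \sum_{i=0}^{n} F_i \cdot \phi(H, p_i)
  - \sum_{i=1}^{n} d_i \cdot \phi_d(dx_i, dy_i) + b,
\qquad
\phi_d(dx, dy) = (dx,\, dy,\, dx^2,\, dy^2)
\]
```

Here $p_0$ is the root location and $p_1, \dots, p_n$ the part locations, $F_i$ are the filters, $\phi(H, p_i)$ is the HOG feature vector extracted from the feature pyramid $H$ at $p_i$, $d_i$ are the deformation parameters, $(dx_i, dy_i)$ is the displacement of part $i$ from its anchor position, and $b$ is a bias term.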

25 DPMs: Learning object models Training data consists of images with labeled bounding boxes Need to learn the model structure, filters and deformation costs: Latent SVM

26 DPMs: Candidate generation and scoring. Detection: defined by a high-scoring root location; relies on an overall score based on the placement of the parts; computed efficiently with dynamic programming / belief propagation (BP).
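
To make the root-plus-parts scoring concrete, here is a brute-force sketch on a 1D strip of locations with a purely quadratic deformation cost. It is a simplification under stated assumptions: the real model works on 2D HOG pyramids, and the cited DPM work replaces the inner maximisation with generalized distance transforms to avoid the quadratic cost. All names and numbers are hypothetical.

```python
def star_model_scores(root_scores, parts):
    """Score every root location of a star-structured part model.
    parts: list of (part_scores, offset, quad_cost), where offset is the
    part's anchor relative to the root and quad_cost penalises squared
    displacement from that anchor."""
    n = len(root_scores)
    total = list(root_scores)
    for part_scores, offset, quad_cost in parts:
        for x in range(n):
            anchor = x + offset
            # Best trade-off between part appearance and deformation.
            total[x] += max(
                s - quad_cost * (p - anchor) ** 2
                for p, s in enumerate(part_scores)
            )
    return total


root_scores = [0.0, 1.0, 0.75, 0.0]
parts = [([0.0, 0.0, 2.0, 0.0], 0, 0.5)]   # one part, strong response at index 2
scores = star_model_scores(root_scores, parts)
best_root = max(range(len(scores)), key=scores.__getitem__)
```

Because each part depends only on the root, the parts can be maximised independently for every root location, which is what makes exact inference tractable in a star or tree model.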

27 Deformable Part-based Models: Results Car detections

28 Deformable Part-based Models: Results Person detections

29 Other part-based representations: tree models, articulated objects.

30 Part-based models: Pose estimation Pose estimation in video (Ramanan et al, 2007) Running on street Dancing

31 DPMs: Strengths. Part-based representation: flexible object model with deformation/pose. Discriminatively learned with bounding box annotation. Past successes in PASCAL detection challenges. Implementation: [link not preserved]. (Slide credit: Kristen Grauman)

32 DPMs: Limitations. Manually designed feature (HOG). Pre-defined object-part structure. Trainable classifier is often generic (e.g. SVM). Where next? Better classifiers? Or keep building more features? Pipeline: object candidates → hand-designed feature extraction → trainable classifier → object class. (Slide credit: Kristen Grauman)

33 Case Study III: Regions with CNN features Overview: Girshick et al. CVPR 14, and significant improvement on the PASCAL VOC Learned object representation based on Convolutional Neural Network. Candidate generation by region proposal (objectness)

34 RCNN: Object representation. Learn a feature hierarchy all the way from pixels to classifier. Each layer extracts features from the output of the previous layer. Train all layers jointly. Pipeline: object candidates → Layer 1 → Layer 2 → Layer 3 → simple classifier.

35 Layer 1: Top-9 Patches Patches from validation images that give maximal activation of a given feature map

36 Layer 2: Top-9 Patches

37 Layer 3: Top-9 Patches

38 RCNN: Candidate generation. Class-generic object detection, or objectness (e.g., Alexe, Deselaers, and Ferrari, 2010). [Figure panels: saliency, edge map, segments.] RCNN uses the Selective Search method (van de Sande, Uijlings, Gevers, Smeulders, 2011).

39 RCNN: Detection pipeline and results. Strength: significant improvement on public benchmarks (mAP = 53.7% on PASCAL VOC 2010 vs. ~35% with DPM). Implementation: https://github.com/rbgirshick/rcnn

40 Outline Object (category) detection Basic framework of object detection Case study: Viola-Jones, DPM and R-CNN Discussions Semantic scene segmentation Pixel labeling and CRF Design of CRF models Discussions

41 Semantic scene understanding Semantic recognition tasks: Object detection Semantic segmentation

42 Pixel labeling problem Problem formulation Assign predefined labels to image elements Multiple label spaces Typical settings Semantic object class Segmentation + recognition Object instance labeling Geometric class labeling etc. (Gould and He, CACM 2014)

43 Pixel labeling problem. Surface layout (Hoiem, Efros & Hebert, ICCV05; Gupta et al., ECCV 2010): classes are support, vertical (planar left/center/right, non-planar porous, non-planar solid) and sky. Geometry + semantic segmentation (Gould et al., ICCV09).

44 A local solution Multiclass segmentation Image inputs A pixel-wise classifier energy function

45 Challenges in local approaches. Local cues can be ambiguous for scene analysis. Objects/regions are correlated in a scene. [Figure: ambiguous regions for which a local classifier gives P = .5 to both Sky and Water.] (He et al., CVPR 2004)

46 Adding contextual information Incorporating spatial context Labels are generally spatially smooth Image inputs Local image cues Contextual information

47 An example: A simple smooth model. Same labeling for neighboring pixels unless an intensity gradient exists. [Figure: result with unary only vs. with added pairwise term.] (Shotton et al., ECCV 2006)
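
A smoothness term of this kind is often written as a contrast-sensitive Potts model: neighbouring pixels pay a penalty for taking different labels, and the penalty shrinks where the intensity changes sharply. The sketch below evaluates such an energy on a tiny grid; the exact functional form, the parameters `lam` and `beta`, and the toy data are assumptions for illustration rather than the formulation on the slide.

```python
import math


def crf_energy(labels, unary, image, lam=1.0, beta=1.0):
    """labels[y][x]: label index; unary[y][x][l]: cost of label l;
    image[y][x]: intensity. Neighbours with different labels pay
    lam * exp(-beta * (I_p - I_q)^2)."""
    h, w = len(labels), len(labels[0])
    energy = 0.0
    for y in range(h):
        for x in range(w):
            energy += unary[y][x][labels[y][x]]
            for dy, dx in ((0, 1), (1, 0)):   # right and down neighbours
                ny, nx = y + dy, x + dx
                if ny < h and nx < w and labels[y][x] != labels[ny][nx]:
                    diff = image[y][x] - image[ny][nx]
                    energy += lam * math.exp(-beta * diff * diff)
    return energy


# A 1 x 4 image with an intensity edge between the second and third pixel.
image = [[0.0, 0.0, 1.0, 1.0]]
unary = [[[0.0, 1.0], [0.0, 1.0], [1.0, 0.0], [1.0, 0.0]]]

aligned = crf_energy([[0, 0, 1, 1]], unary, image, lam=1.0, beta=4.0)
all_zero = crf_energy([[0, 0, 0, 0]], unary, image, lam=1.0, beta=4.0)
misaligned = crf_energy([[0, 1, 1, 1]], unary, image, lam=1.0, beta=4.0)
```

The labeling whose boundary coincides with the intensity edge has the lowest energy of the three, while a boundary placed inside a uniform region pays the full smoothness penalty.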

48 Conditional Random Field framework Input Output CRF model Unary potential Pairwise potential Higher-order potential Examples: surface normal, object class, depth, etc.
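
The model formula itself is not preserved in the transcription. A standard way to write a CRF with the three kinds of potentials named above is the following; the notation is an assumption, with $\mathbf{x}$ the input image, $\mathbf{y}$ the output labeling, $\mathcal{E}$ the set of neighbouring pairs and $\mathcal{C}$ the set of higher-order cliques.

```latex
\[
E(\mathbf{y}; \mathbf{x})
  = \sum_{i} \psi_i(y_i; \mathbf{x})
  + \sum_{(i,j) \in \mathcal{E}} \psi_{ij}(y_i, y_j; \mathbf{x})
  + \sum_{c \in \mathcal{C}} \psi_c(\mathbf{y}_c; \mathbf{x}),
\qquad
P(\mathbf{y} \mid \mathbf{x}) = \frac{1}{Z(\mathbf{x})} \exp\{-E(\mathbf{y}; \mathbf{x})\}
\]
```

The three sums are the unary, pairwise and higher-order potentials respectively, and $Z(\mathbf{x})$ is the normalising partition function, so the most probable labeling is the one with the lowest energy.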

49 Conditional Random Field framework Energy minimization perspective Label prediction: MAP estimation Global optimization of combinatorial problems Design choices in scene modeling Feature representation Modeling context Integrating top-down information
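
MAP estimation as energy minimisation can be shown exactly on a chain, where dynamic programming (the Viterbi recursion) finds the global optimum. This is a deliberately simplified sketch: image grids contain loops, and practical systems typically rely on graph cuts or approximate message passing instead. Function names and the toy costs are hypothetical.

```python
def chain_map(unary, pairwise_cost):
    """Exact minimum-energy labeling of a chain CRF.
    unary[i][l]: cost of label l at node i;
    pairwise_cost(a, b): cost of adjacent labels a, b."""
    n, k = len(unary), len(unary[0])
    cost = list(unary[0])     # best cost of a prefix ending in each label
    back = []                 # back-pointers for recovering the labeling
    for i in range(1, n):
        new_cost, ptr = [], []
        for b in range(k):
            best_a = min(range(k), key=lambda a: cost[a] + pairwise_cost(a, b))
            new_cost.append(cost[best_a] + pairwise_cost(best_a, b) + unary[i][b])
            ptr.append(best_a)
        cost = new_cost
        back.append(ptr)
    last = min(range(k), key=lambda l: cost[l])
    labels = [last]
    for ptr in reversed(back):
        labels.append(ptr[labels[-1]])
    labels.reverse()
    return labels, cost[last]


def potts(a, b):
    """Constant penalty whenever neighbouring labels differ."""
    return 0.0 if a == b else 0.5


# The second node weakly prefers label 1 on its own (a noisy local cue).
unary = [[0.0, 1.0], [0.6, 0.4], [0.0, 1.0], [0.0, 1.0]]
local_only = [min(range(2), key=u.__getitem__) for u in unary]
map_labels, map_energy = chain_map(unary, potts)
```

Taking each node's cheapest label independently keeps the isolated outlier, whereas the jointly optimal labeling smooths it away because two label changes cost more than overriding one weak local preference.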

50 Image features and unary potentials. Manually designed features. Stuff class: local features. Thing class: + shape cues. Global image features. Deep network features (Long, Shelhamer and Darrell, arXiv 2014) (Farabet et al., PAMI 2013)

51 Image features and unary potentials

52 Pixel vs superpixel. Pixel representation: redundancy, leading to complex models. Superpixel representation (over-segmentation): reduced model size; larger support for feature extraction; fast and regular-shaped, e.g. SLIC (Achanta et al., PAMI 2012), implemented in VLFeat; but irregular graph and inaccurate object boundaries. Better to combine both representations.

53 Modeling context Local context Bottom-up grouping Superpixels to labels Regional context Pairwise interaction between neighboring regions Spatial relations between regions (Galleguillos et al., ICCV07, CVPR08)

54 Modeling longer-range context. Fully-connected CRFs (Krähenbühl and Koltun, NIPS 2011). Higher-order models (Kohli et al., CVPR08; Park & Gould, ECCV12)

55 Integrating object-specific cues Previous potentials: smoothing Object shape mask as a top-down cue Integrating scene classification, object detection, etc (Yao, et al. CVPR 2012)

56 Integrating object-specific cues Object shape mask as a top-down cue Integrating object detection with semantic video labeling (Liu, et al. WACV & CVPR 2015) NICTA Copyright 2012 From imagination to impact

57 Datasets and software. Datasets: Stanford Background Dataset; Microsoft Research Cambridge Dataset; Pascal VOC; LabelMe Dataset; MS COCO Dataset. Software packages: Darwin software framework; ALE (Automatic Labeling Environment).

58 Summary. Pixel labeling and CRF framework. Design choices in semantic segmentation: image feature representation; modeling context (short-range vs. long-range); integrating top-down information at object and scene level. Ongoing research directions: deep network features and CRF framework; nonparametric label transfer; multiple modality in scene labeling. Reference: Gould and He, Scene Understanding by Labeling Pixels. Communications of the ACM, 2014.

Lecture 6: CNNs for Detection, Tracking, and Segmentation Object Detection

Lecture 6: CNNs for Detection, Tracking, and Segmentation Object Detection CSED703R: Deep Learning for Visual Recognition (206S) Lecture 6: CNNs for Detection, Tracking, and Segmentation Object Detection Bohyung Han Computer Vision Lab. bhhan@postech.ac.kr 2 3 Object detection

More information

Local features and matching. Image classification & object localization

Local features and matching. Image classification & object localization Overview Instance level search Local features and matching Efficient visual recognition Image classification & object localization Category recognition Image classification: assigning a class label to

More information

Using geometry and related things

Using geometry and related things Using geometry and related things Region labels + Boundaries and objects Stronger geometric constraints from domain knowledge Reasoning on aspects and poses 3D point clouds Qualitative More quantitative

More information

Deformable Part Models with CNN Features

Deformable Part Models with CNN Features Deformable Part Models with CNN Features Pierre-André Savalle 1, Stavros Tsogkas 1,2, George Papandreou 3, Iasonas Kokkinos 1,2 1 Ecole Centrale Paris, 2 INRIA, 3 TTI-Chicago Abstract. In this work we

More information

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks SHAOQING REN, KAIMING HE, ROSS GIRSHICK, JIAN SUN Göksu Erdoğan Object Detection Detection as Regression? DOG, (x, y, w, h)

More information

Module 5. Deep Convnets for Local Recognition Joost van de Weijer 4 April 2016

Module 5. Deep Convnets for Local Recognition Joost van de Weijer 4 April 2016 Module 5 Deep Convnets for Local Recognition Joost van de Weijer 4 April 2016 Previously, end-to-end.. Dog Slide credit: Jose M 2 Previously, end-to-end.. Dog Learned Representation Slide credit: Jose

More information

Object Recognition. Selim Aksoy. Bilkent University saksoy@cs.bilkent.edu.tr

Object Recognition. Selim Aksoy. Bilkent University saksoy@cs.bilkent.edu.tr Image Classification and Object Recognition Selim Aksoy Department of Computer Engineering Bilkent University saksoy@cs.bilkent.edu.tr Image classification Image (scene) classification is a fundamental

More information

The Visual Internet of Things System Based on Depth Camera

The Visual Internet of Things System Based on Depth Camera The Visual Internet of Things System Based on Depth Camera Xucong Zhang 1, Xiaoyun Wang and Yingmin Jia Abstract The Visual Internet of Things is an important part of information technology. It is proposed

More information

Object Detection based on Convolutional Neural Network

Object Detection based on Convolutional Neural Network Object Detection based on Convolutional Neural Network Shijian Tang Department of Electrical Engineering Stanford University sjtang@stanford.edu Ye Yuan Department of Computer Science Stanford University

More information

Sliding windows and face detection

Sliding windows and face detection Slidi windows and face detection Tuesday, Nov 10 Kristen Grauman UT Austin Last time Modeli categories with local features and spatial information: Histograms configurations of visual words to capture

More information

Convolutional Feature Maps

Convolutional Feature Maps Convolutional Feature Maps Elements of efficient (and accurate) CNN-based object detection Kaiming He Microsoft Research Asia (MSRA) ICCV 2015 Tutorial on Tools for Efficient Object Detection Overview

More information

Object Detectors Emerge in Deep Scene CNNs

Object Detectors Emerge in Deep Scene CNNs Object Detectors Emerge in Deep Scene CNNs Bolei Zhou, Aditya Khosla, Agata Lapedriza, Aude Oliva, Antonio Torralba Massachusetts Institute of Technology CNN for Object Recognition Large-scale image classification

More information

Scalable Object Detection by Filter Compression with Regularized Sparse Coding

Scalable Object Detection by Filter Compression with Regularized Sparse Coding Scalable Object Detection by Filter Compression with Regularized Sparse Coding Ting-Hsuan Chao, Yen-Liang Lin, Yin-Hsi Kuo, and Winston H Hsu National Taiwan University, Taipei, Taiwan Abstract For practical

More information

Robust Real-time Object Detection by Paul Viola and Michael Jones ICCV 2001 Workshop on Statistical and Computation Theories of Vision

Robust Real-time Object Detection by Paul Viola and Michael Jones ICCV 2001 Workshop on Statistical and Computation Theories of Vision Robust Real-time Object Detection by Paul Viola and Michael Jones ICCV 2001 Workshop on Statistical and Computation Theories of Vision Presentation by Gyozo Gidofalvi Computer Science and Engineering Department

More information

Pedestrian Detection with RCNN

Pedestrian Detection with RCNN Pedestrian Detection with RCNN Matthew Chen Department of Computer Science Stanford University mcc17@stanford.edu Abstract In this paper we evaluate the effectiveness of using a Region-based Convolutional

More information

Lecture 6: Classification & Localization. boris. ginzburg@intel.com

Lecture 6: Classification & Localization. boris. ginzburg@intel.com Lecture 6: Classification & Localization boris. ginzburg@intel.com 1 Agenda ILSVRC 2014 Overfeat: integrated classification, localization, and detection Classification with Localization Detection. 2 ILSVRC-2014

More information

CS 1699: Intro to Computer Vision. Deep Learning. Prof. Adriana Kovashka University of Pittsburgh December 1, 2015

CS 1699: Intro to Computer Vision. Deep Learning. Prof. Adriana Kovashka University of Pittsburgh December 1, 2015 CS 1699: Intro to Computer Vision Deep Learning Prof. Adriana Kovashka University of Pittsburgh December 1, 2015 Today: Deep neural networks Background Architectures and basic operations Applications Visualizing

More information

Robust Real-Time Face Detection

Robust Real-Time Face Detection Robust Real-Time Face Detection International Journal of Computer Vision 57(2), 137 154, 2004 Paul Viola, Michael Jones 授 課 教 授 : 林 信 志 博 士 報 告 者 : 林 宸 宇 報 告 日 期 :96.12.18 Outline Introduction The Boost

More information

Classification using intersection kernel SVMs is efficient

Classification using intersection kernel SVMs is efficient Classification using intersection kernel SVMs is efficient Jitendra Malik UC Berkeley Joint work with Subhransu Maji and Alex Berg Fast intersection kernel SVMs and other generalizations of linear SVMs

More information

The Middle Child Problem: Revisiting Parametric Min-cut and Seeds for Object Proposals

The Middle Child Problem: Revisiting Parametric Min-cut and Seeds for Object Proposals The Middle Child Problem: Revisiting Parametric Min-cut and Seeds for Object Proposals Ahmad Humayun Fuxin Li James M. Rehg Georgia Institute of Technology Oregon State University http://cpl.cc.gatech.edu/projects/poise

More information

Learning Detectors from Large Datasets for Object Retrieval in Video Surveillance

Learning Detectors from Large Datasets for Object Retrieval in Video Surveillance 2012 IEEE International Conference on Multimedia and Expo Learning Detectors from Large Datasets for Object Retrieval in Video Surveillance Rogerio Feris, Sharath Pankanti IBM T. J. Watson Research Center

More information

Cascade Region Regression for Robust Object Detection

Cascade Region Regression for Robust Object Detection Large Scale Visual Recognition Challenge 2015 (ILSVRC2015) Cascade Region Regression for Robust Object Detection Jiankang Deng Team Name: Amax Centre for Quantum Computation & Intelligent Systems (QCIS),

More information

RCNN, Fast RCNN, Faster RCNN

RCNN, Fast RCNN, Faster RCNN RCNN, Fast RCNN, Faster RCNN Topics of the lecture: Problem statement Review of slow R-CNN Review of Fast R-CNN Review of Faster R-CNN Presented by: Roi Shikler & Gil Elbaz Advisor: Prof. Michael Lindenbaum

More information

Administrivia. Traditional Recognition Approach. Overview. CMPSCI 370: Intro. to Computer Vision Deep learning

Administrivia. Traditional Recognition Approach. Overview. CMPSCI 370: Intro. to Computer Vision Deep learning : Intro. to Computer Vision Deep learning University of Massachusetts, Amherst April 19/21, 2016 Instructor: Subhransu Maji Finals (everyone) Thursday, May 5, 1-3pm, Hasbrouck 113 Final exam Tuesday, May

More information

Pedestrian Detection using R-CNN

Pedestrian Detection using R-CNN Pedestrian Detection using R-CNN CS676A: Computer Vision Project Report Advisor: Prof. Vinay P. Namboodiri Deepak Kumar Mohit Singh Solanki (12228) (12419) Group-17 April 15, 2016 Abstract Pedestrian detection

More information

Recognizing Cats and Dogs with Shape and Appearance based Models. Group Member: Chu Wang, Landu Jiang

Recognizing Cats and Dogs with Shape and Appearance based Models. Group Member: Chu Wang, Landu Jiang Recognizing Cats and Dogs with Shape and Appearance based Models Group Member: Chu Wang, Landu Jiang Abstract Recognizing cats and dogs from images is a challenging competition raised by Kaggle platform

More information

Recognition. Sanja Fidler CSC420: Intro to Image Understanding 1 / 28

Recognition. Sanja Fidler CSC420: Intro to Image Understanding 1 / 28 Recognition Topics that we will try to cover: Indexing for fast retrieval (we still owe this one) History of recognition techniques Object classification Bag-of-words Spatial pyramids Neural Networks Object

More information

Semantic Image Segmentation and Web-Supervised Visual Learning

Semantic Image Segmentation and Web-Supervised Visual Learning Semantic Image Segmentation and Web-Supervised Visual Learning Florian Schroff Andrew Zisserman University of Oxford, UK Antonio Criminisi Microsoft Research Ltd, Cambridge, UK Outline Part I: Semantic

More information

Edge Boxes: Locating Object Proposals from Edges

Edge Boxes: Locating Object Proposals from Edges Edge Boxes: Locating Object Proposals from Edges C. Lawrence Zitnick and Piotr Dollár Microsoft Research Abstract. The use of object proposals is an effective recent approach for increasing the computational

More information

Multi-view Face Detection Using Deep Convolutional Neural Networks

Multi-view Face Detection Using Deep Convolutional Neural Networks Multi-view Face Detection Using Deep Convolutional Neural Networks Sachin Sudhakar Farfade Yahoo fsachin@yahoo-inc.com Mohammad Saberian Yahoo saberian@yahooinc.com Li-Jia Li Yahoo lijiali@cs.stanford.edu

More information

Finding people in repeated shots of the same scene

Finding people in repeated shots of the same scene Finding people in repeated shots of the same scene Josef Sivic 1 C. Lawrence Zitnick Richard Szeliski 1 University of Oxford Microsoft Research Abstract The goal of this work is to find all occurrences

More information

Automatic 3D Reconstruction via Object Detection and 3D Transformable Model Matching CS 269 Class Project Report

Automatic 3D Reconstruction via Object Detection and 3D Transformable Model Matching CS 269 Class Project Report Automatic 3D Reconstruction via Object Detection and 3D Transformable Model Matching CS 69 Class Project Report Junhua Mao and Lunbo Xu University of California, Los Angeles mjhustc@ucla.edu and lunbo

More information

Geodesic Object Proposals

Geodesic Object Proposals Geodesic Object Proposals Philipp Krähenbühl 1 and Vladlen Koltun 2 1 Stanford University 2 Adobe Research Abstract. We present an approach for identifying a set of candidate objects in a given image.

More information

Ensemble of Exemplar-SVMs for Object Detection and Beyond

Ensemble of Exemplar-SVMs for Object Detection and Beyond Ensemble of -SVMs for Object Detection and Beyond Tomasz Malisiewicz Carnegie Mellon University Abhinav Gupta Carnegie Mellon University Alexei A. Efros Carnegie Mellon University Abstract This paper proposes

More information

Localizing 3D cuboids in single-view images

Localizing 3D cuboids in single-view images Localizing 3D cuboids in single-view images Jianxiong Xiao Bryan C. Russell Antonio Torralba Massachusetts Institute of Technology University of Washington Abstract In this paper we seek to detect rectangular

More information

CNN-aware Binary Map... for General Semantic Segmentation

CNN-aware Binary Map... for General Semantic Segmentation CNN-aware Binary Map for Mahdyar Ravanbakhsh, Hossein Mousavi, Moin Nabi, Mohammad Rastegari, Carlo Regazzoni -II September 206 CNN-aware Binary Map /8 Low-level VS Semantic human Low-level Segmentatio

More information

High Level Describable Attributes for Predicting Aesthetics and Interestingness

High Level Describable Attributes for Predicting Aesthetics and Interestingness High Level Describable Attributes for Predicting Aesthetics and Interestingness Sagnik Dhar Vicente Ordonez Tamara L Berg Stony Brook University Stony Brook, NY 11794, USA tlberg@cs.stonybrook.edu Abstract

More information

MVA ENS Cachan. Lecture 2: Logistic regression & intro to MIL Iasonas Kokkinos Iasonas.kokkinos@ecp.fr

MVA ENS Cachan. Lecture 2: Logistic regression & intro to MIL Iasonas Kokkinos Iasonas.kokkinos@ecp.fr Machine Learning for Computer Vision 1 MVA ENS Cachan Lecture 2: Logistic regression & intro to MIL Iasonas Kokkinos Iasonas.kokkinos@ecp.fr Department of Applied Mathematics Ecole Centrale Paris Galen

More information

Shared Parts for Deformable Part-based Models

Shared Parts for Deformable Part-based Models Shared Parts for Deformable Part-based Models Patrick Ott and Mark Everingham School of Computing University of Leeds {ott me}@comp.leeds.ac.uk Abstract The deformable part-based model (DPM) proposed by

More information

Bert Huang Department of Computer Science Virginia Tech

Bert Huang Department of Computer Science Virginia Tech This paper was submitted as a final project report for CS6424/ECE6424 Probabilistic Graphical Models and Structured Prediction in the spring semester of 2016. The work presented here is done by students

More information

What, Where & How Many? Combining Object Detectors and CRFs

What, Where & How Many? Combining Object Detectors and CRFs What, Where & How Many? Combining Object Detectors and CRFs L ubor Ladický, Paul Sturgess, Karteek Alahari, Chris Russell, and Philip H.S. Torr Oxford Brookes University http://cms.brookes.ac.uk/research/visiongroup

More information

A Dynamic Conditional Random Field Model for Joint Labeling of Object and Scene Classes

A Dynamic Conditional Random Field Model for Joint Labeling of Object and Scene Classes A Dynamic Conditional Random Field Model for Joint Labeling of Object and Scene Classes Christian Wojek and Bernt Schiele {wojek, schiele}@cs.tu-darmstadt.de Computer Science Department TU Darmstadt Abstract.

More information

ColorCrack: Identifying Cracks in Glass

ColorCrack: Identifying Cracks in Glass ColorCrack: Identifying Cracks in Glass James Max Kanter Massachusetts Institute of Technology 77 Massachusetts Ave Cambridge, MA 02139 kanter@mit.edu Figure 1: ColorCrack automatically identifies cracks

More information

Latest Advances in Deep Learning. Yao Chou

Latest Advances in Deep Learning. Yao Chou Latest Advances in Deep Learning Yao Chou Outline Introduction Images Classification Object Detection R-CNN Traditional Feature Descriptor Selective Search Implementation Latest Application Deep Learning

More information

Segmentation as Selective Search for Object Recognition

Segmentation as Selective Search for Object Recognition Segmentation as Selective Search for Object Recognition Koen E. A. van de Sande Jasper R. R. Uijlings Theo Gevers Arnold W. M. Smeulders University of Amsterdam University of Trento Amsterdam, The Netherlands

More information

Towards a Deep Learning Framework for Unconstrained Face Detection

Towards a Deep Learning Framework for Unconstrained Face Detection Towards a Deep Learning Framework for Unconstrained Face Detection Yutong Zheng Chenchen Zhu Khoa Luu Chandrasekhar Bhagavatula T. Hoang Ngan Le Marios Savvides CyLab Biometrics Center and the Department

More information

Fast Semantic Segmentation of 3D Point Clouds using a Dense CRF with Learned Parameters

Fast Semantic Segmentation of 3D Point Clouds using a Dense CRF with Learned Parameters Fast Semantic Segmentation of 3D Point Clouds using a Dense CRF with Learned Parameters Daniel Wolf, Johann Prankl and Markus Vincze Abstract In this paper, we present an efficient semantic segmentation

More information

Facade Segmentation in a Multi-View Scenario

Facade Segmentation in a Multi-View Scenario Facade in a Multi-View Scenario Michal Recky, Andreas Wendel, and Franz Leberl {recky, wendel, leberl}@icg.tugraz.at Institute for Computer Graphics and Vision (ICG) Graz University of Technology, Austria

More information

3D Model based Object Class Detection in An Arbitrary View

3D Model based Object Class Detection in An Arbitrary View 3D Model based Object Class Detection in An Arbitrary View Pingkun Yan, Saad M. Khan, Mubarak Shah School of Electrical Engineering and Computer Science University of Central Florida http://www.eecs.ucf.edu/

More information

Holistic Scene Understanding

Holistic Scene Understanding Holistic Scene Understanding Jian Yao August 6, 2015 1 License This code is copyright 2013 Jian Yao, Sanja Fidler and Raquel Urtasun. It is released for personal or academic use only. Any commercial use

More information

Object Categorization using Co-Occurrence, Location and Appearance

Object Categorization using Co-Occurrence, Location and Appearance Object Categorization using Co-Occurrence, Location and Appearance Carolina Galleguillos Andrew Rabinovich Serge Belongie Department of Computer Science and Engineering University of California, San Diego

More information

Practical Tour of Visual tracking. David Fleet and Allan Jepson January, 2006

Practical Tour of Visual tracking. David Fleet and Allan Jepson January, 2006 Practical Tour of Visual tracking David Fleet and Allan Jepson January, 2006 Designing a Visual Tracker: What is the state? pose and motion (position, velocity, acceleration, ) shape (size, deformation,

More information

Weakly Supervised Object Boundaries Supplementary material

Weakly Supervised Object Boundaries Supplementary material Weakly Supervised Object Boundaries Supplementary material Anna Khoreva Rodrigo Benenson Mohamed Omran Matthias Hein 2 Bernt Schiele Max Planck Institute for Informatics, Saarbrücken, Germany 2 Saarland

More information

Region-oriented Convolutional Networks for Object Retrieval
