Lecture 6: CNNs for Detection, Tracking, and Segmentation Object Detection


CSED703R: Deep Learning for Visual Recognition (2016S)
Lecture 6: CNNs for Detection, Tracking, and Segmentation
Bohyung Han, Computer Vision Lab.

Object Detection

Region-based CNN (R-CNN)
- Pipeline: input image, extract region proposals, compute CNN features, classify.
- Region proposals: any proposal method (e.g., selective search, EdgeBoxes).
- CNN features: any architecture.
- Classification: softmax or SVM.
- Each proposal is evaluated independently.
- Bounding box regression improves detection accuracy.
- Mean average precision (mAP): 53.7% with bounding box regression on the VOC 2010 test set.
[Girshick14] R. Girshick, J. Donahue, T. Darrell, J. Malik: Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation. CVPR 2014

Selective Search

Motivation
- A sliding window approach is not feasible for object detection with convolutional neural networks.
- We need a faster method to identify object candidates.

Finding object proposals
- Greedy hierarchical superpixel segmentation
- Diversification of superpixel construction and merging
  - Using a variety of color spaces
  - Using different similarity measures
  - Varying starting regions
[Uijlings13] J. R. R. Uijlings, K. E. A. van de Sande, T. Gevers, A. W. M. Smeulders: Selective Search for Object Recognition. IJCV 2013
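The greedy hierarchical grouping behind selective search can be illustrated with a toy sketch. This is not the published algorithm: it assumes each initial region is summarized by a single scalar "color" and a pixel count, and it uses the absolute color difference as the only similarity measure, whereas the method described above combines several similarity measures and color spaces. The function name `greedy_grouping` and the data layout are hypothetical.

```python
def greedy_grouping(colors, sizes, adjacency):
    """Toy greedy hierarchical grouping.

    colors, sizes: dict region_id -> mean color (float) / pixel count (int).
    adjacency: iterable of (a, b) pairs of neighbouring region ids.
    Returns every region ever formed, as a frozenset of initial region ids.
    """
    colors, sizes = dict(colors), dict(sizes)
    members = {r: frozenset([r]) for r in colors}
    edges = {frozenset(e) for e in adjacency}
    proposals = [members[r] for r in sorted(members)]
    next_id = max(colors) + 1
    while edges:
        # Pick the most similar pair of neighbours (smallest color difference).
        a, b = min(
            (tuple(sorted(e)) for e in edges),
            key=lambda p: (abs(colors[p[0]] - colors[p[1]]), p),
        )
        new = next_id
        next_id += 1
        sizes[new] = sizes[a] + sizes[b]
        colors[new] = (colors[a] * sizes[a] + colors[b] * sizes[b]) / sizes[new]
        members[new] = members[a] | members[b]
        proposals.append(members[new])  # every merged region is a proposal
        # Neighbours of a or b become neighbours of the merged region.
        rewired = set()
        for e in edges:
            rest = e - {a, b}
            if len(rest) == 2:
                rewired.add(e)
            elif len(rest) == 1:
                (other,) = rest
                rewired.add(frozenset((other, new)))
            # len(rest) == 0 is the merged edge itself and is dropped.
        edges = rewired
    return proposals


# Four regions in a row: two dark ones followed by two bright ones.
colors = {0: 0.10, 1: 0.12, 2: 0.80, 3: 0.85}
sizes = {0: 10, 1: 10, 2: 10, 3: 10}
adjacency = [(0, 1), (1, 2), (2, 3)]
proposals = greedy_grouping(colors, sizes, adjacency)
```

In this toy input the two dark regions merge first, then the two bright ones, and finally everything merges into one region, so the proposal set contains regions at all scales of the hierarchy.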

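Bounding Box Regression

Learning a transformation of the bounding box
- Region proposal: $P = (P_x, P_y, P_w, P_h)$
- Ground truth: $G = (G_x, G_y, G_w, G_h)$
- Transformation: $d_x(P), d_y(P), d_w(P), d_h(P)$

$$\hat{G}_x = P_w \, d_x(P) + P_x, \qquad \hat{G}_y = P_h \, d_y(P) + P_y$$
$$\hat{G}_w = P_w \exp(d_w(P)), \qquad \hat{G}_h = P_h \exp(d_h(P))$$

Each $d_\star(P) = \mathbf{w}_\star^\top \phi_5(P)$ is a linear function of the CNN pool5 feature $\phi_5(P)$, with weights learned by regularized least squares:

$$\mathbf{w}_\star = \operatorname*{argmin}_{\hat{\mathbf{w}}_\star} \sum_i \left( t_\star^i - \hat{\mathbf{w}}_\star^\top \phi_5(P^i) \right)^2 + \lambda \lVert \hat{\mathbf{w}}_\star \rVert^2$$

with regression targets $t_x = (G_x - P_x)/P_w$, $t_y = (G_y - P_y)/P_h$, $t_w = \log(G_w/P_w)$, $t_h = \log(G_h/P_h)$.

Detection Results
- VOC 2010 test set
- Feature analysis on VOC 2007 test set

Fast R-CNN
- A fast version of R-CNN: 9x faster in training and 213x faster in testing than R-CNN
- A single feature computation and ROI pooling using object proposals
- Bounding box regression integrated into the network
- Single-stage training using a multi-task loss
[Girshick15] R. Girshick: Fast R-CNN. ICCV 2015

Faster R-CNN
- Fast R-CNN + RPN
- Proposal computation integrated into the network
- Marginal cost of proposals: 10 ms
[Ren15] S. Ren, K. He, R. Girshick, J. Sun: Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. NIPS 2015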
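The bounding box transformation at the start of this page can be sketched in a few lines. The sketch assumes the usual R-CNN parameterization, with boxes given as (center x, center y, width, height); the function names are illustrative and do not come from the lecture.

```python
import math


def apply_box_transform(proposal, deltas):
    """Map a proposal P to a predicted box using offsets (dx, dy, dw, dh)."""
    px, py, pw, ph = proposal
    dx, dy, dw, dh = deltas
    gx = pw * dx + px        # scale-invariant shift of the center
    gy = ph * dy + py
    gw = pw * math.exp(dw)   # log-space scaling of width and height
    gh = ph * math.exp(dh)
    return (gx, gy, gw, gh)


def regression_targets(proposal, ground_truth):
    """Inverse mapping: the targets the regressor is trained to predict."""
    px, py, pw, ph = proposal
    gx, gy, gw, gh = ground_truth
    return (
        (gx - px) / pw,
        (gy - py) / ph,
        math.log(gw / pw),
        math.log(gh / ph),
    )


proposal = (50.0, 40.0, 20.0, 10.0)
ground_truth = (54.0, 41.0, 30.0, 12.0)
targets = regression_targets(proposal, ground_truth)
recovered = apply_box_transform(proposal, targets)
```

Applying the transform to the exact targets recovers the ground-truth box, which is a quick sanity check that the two functions are inverses of each other.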

Object Detection Performance
- The R-CNN family achieves state-of-the-art performance in object detection.
- [Chart: Pascal VOC 2007 object detection mAP (%); highlighted entry: Faster R-CNN with ResNet]

Visual Tracking with Convolutional Neural Networks

Visual Tracking: MDNet (Multi-Domain Network)

Main Idea
- Train shared features and domain-specific classifiers jointly.
- Multi-domain learning: separate the shared layers from the domain-specific layers.
- [Figure: a shared feature representation feeding domain-specific classifiers for Domain 1, Domain 2, Domain 3, Domain 4; transfer to a new domain]
[Nam15] Hyeonseob Nam, Bohyung Han: Learning Multi-Domain Convolutional Neural Networks for Visual Tracking. CVPR 2016
The winner of the Visual Object Tracking Challenge 2015

Multi-Domain Learning
- [Figure: training iterations #nk+1, #nk+2, ...]

Online Tracking using MDNet Features
- Transfer the shared features to a new sequence.
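The idea of jointly training shared features and domain-specific classifiers can be made concrete with a deliberately tiny sketch. It is not MDNet: the "shared feature" here is a single scalar weight, each domain-specific classifier is a single logistic weight, and the data are made-up one-dimensional samples. It also assumes that each training iteration uses samples from one domain only, cycling through the domains.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_multi_domain(domains, n_cycles=50, lr=0.5):
    """domains: list of datasets, each a list of (x, label), label in {0, 1}."""
    shared_w = 0.1                    # shared feature: h = shared_w * x
    heads = [0.1 for _ in domains]    # one binary classifier per domain
    for _ in range(n_cycles):
        for k, data in enumerate(domains):   # one domain per iteration
            g_head = g_shared = 0.0
            for x, y in data:
                h = shared_w * x
                err = sigmoid(heads[k] * h) - y   # d(cross-entropy)/d(logit)
                g_head += err * h
                g_shared += err * heads[k] * x
            # Only the active domain's head is updated; the shared weight
            # receives gradient from every domain in turn.
            heads[k] -= lr * g_head / len(data)
            shared_w -= lr * g_shared / len(data)
    return shared_w, heads


# Two toy "domains" (sequences): positive x is target, negative x is background.
domains = [
    [(1.0, 1), (-1.0, 0)],
    [(2.0, 1), (-2.0, 0)],
]
shared_w, heads = train_multi_domain(domains)
scores = [sigmoid(heads[k] * shared_w * 1.5) for k in range(len(domains))]
```

For a new sequence, the shared weight would be kept and a fresh domain-specific head would be initialized, which mirrors the "transfer to a new domain" step on the slide.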

Online Tracking using MDNet Features
- Transfer the shared features to the new sequence.
- In each frame, the target state is the candidate x with the highest positive score (argmax over x).

Online Tracking: Overview
1. Draw target candidates.
2. Find the optimal state.
3. Collect training samples.
4. Update the CNN if needed (fine-tuning).
5. Repeat for the next frame.

Online Network Update
- Long-term update
  - Performed at regular intervals
  - Uses long-term training samples
  - For robustness
- Short-term update
  - Performed at abrupt appearance changes (positive score below 0.5)
  - Uses short-term training samples
  - For adaptiveness

Hard Negative Mining
- Provide a hard minibatch in each training iteration.
- Positive samples: randomly drawn from the pool of positive samples.
- Negative samples: randomly draw candidates from the pool of negative samples, then select the ones with the highest scores.
- The selected positives and hard negatives form a minibatch for training the CNN.
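The minibatch construction with hard negative mining described above can be sketched as follows. The function name, the batch sizes, and the `score_fn` callback (standing in for the current network's positive score) are all assumptions for illustration; the lecture does not specify them.

```python
import random


def build_hard_minibatch(positives, negatives, score_fn,
                         n_pos=2, n_neg_candidates=6, n_neg=3, rng=None):
    """Return (positive samples, hard negative samples) for one iteration."""
    rng = rng or random.Random(0)
    # Positives are drawn at random from the positive pool.
    pos_batch = rng.sample(positives, n_pos)
    # Negatives: draw random candidates, then keep the ones the current
    # model scores highest, i.e. the ones it confuses with the target.
    candidates = rng.sample(negatives, n_neg_candidates)
    hard_negs = sorted(candidates, key=score_fn, reverse=True)[:n_neg]
    return pos_batch, hard_negs


# Toy pools; the dictionary stands in for the CNN's positive scores.
positives = ["p0", "p1", "p2", "p3"]
negatives = ["n0", "n1", "n2", "n3", "n4", "n5"]
fake_scores = {"n0": 0.1, "n1": 0.9, "n2": 0.3, "n3": 0.7, "n4": 0.2, "n5": 0.8}

pos_batch, hard_negs = build_hard_minibatch(
    positives, negatives, score_fn=fake_scores.get
)
```

Because the scores come from the current model, the set of hard negatives changes as training progresses, so later minibatches concentrate on the remaining confusing background samples.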

Hard Negative Mining
- [Figure: positive and negative samples in the 1st, 5th, and 30th minibatches, over training iterations]

Bounding Box Regression
- Improve the localization quality.
- Used in DPM [Felzenszwalb et al., PAMI 10] and R-CNN [Girshick et al., CVPR 14].
- Frame with ground truth: train a bounding box regression model using positive samples.
- Frame with a tracking result: adjust the tracking result by bounding box regression.

Results on OTB100 [Wu15]
- Protocol: MDNet is trained with 58 sequences from {VOT2013, VOT2014, VOT2015}, excluding those in {OTB100}.
- Distance precision and overlap success rate by One-Pass Evaluation (OPE)
[Wu15] Y. Wu, J. Lim, M.-H. Yang: Object Tracking Benchmark. TPAMI 2015

Results on VOT2015
- [Figure legend: ground truth; ours (5 repetitions)]
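The two OPE measures named above, distance precision and overlap success rate, can be computed as in the sketch below. It assumes boxes are given as (x, y, width, height) with (x, y) the top-left corner, and it evaluates a single threshold for each measure (20 pixels and 0.5 overlap are common choices, used here as assumptions); benchmark plots sweep these thresholds.

```python
import math


def iou(a, b):
    """Intersection over union of two (x, y, w, h) boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    iw = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    ih = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def center_error(a, b):
    """Euclidean distance between the box centers."""
    return math.hypot(
        (a[0] + a[2] / 2) - (b[0] + b[2] / 2),
        (a[1] + a[3] / 2) - (b[1] + b[3] / 2),
    )


def distance_precision(preds, gts, threshold=20.0):
    """Fraction of frames whose center error is within the threshold."""
    hits = sum(center_error(p, g) <= threshold for p, g in zip(preds, gts))
    return hits / len(gts)


def overlap_success(preds, gts, threshold=0.5):
    """Fraction of frames whose overlap with ground truth exceeds the threshold."""
    hits = sum(iou(p, g) > threshold for p, g in zip(preds, gts))
    return hits / len(gts)


# Two frames: a perfect prediction and one that has drifted off the target.
gts = [(0.0, 0.0, 10.0, 10.0), (0.0, 0.0, 10.0, 10.0)]
preds = [(0.0, 0.0, 10.0, 10.0), (30.0, 0.0, 10.0, 10.0)]
precision = distance_precision(preds, gts)
success = overlap_success(preds, gts)
```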

Semantic Segmentation
- Segmenting images based on their semantic notion
- Given an input image, obtain a pixel-wise segmentation mask using a deep convolutional neural network (CNN).
- [Figure: query image and its semantic segmentation]

Semantic Segmentation by Fully Convolutional Network

Semantic Segmentation using CNN
- From an image classification network to a Fully Convolutional Network (FCN)
- Interpreting fully connected layers as convolution layers: each fully connected layer is identical to a convolution layer with a large spatial filter that covers the entire input field.
- [Figure: pool, fc6, fc7 shown as fully connected layers and as convolution layers, for the original and for a larger input field]
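The equivalence between a fully connected layer and a convolution whose filter covers the entire input field can be checked numerically. The sketch below uses plain nested lists, a single output unit, stride 1, and no padding; the helper names are illustrative.

```python
def fc_forward(x_flat, w_flat, bias):
    """One fully connected output unit on a flattened input."""
    return sum(xi * wi for xi, wi in zip(x_flat, w_flat)) + bias


def conv_valid(feature, kernel, bias):
    """Single-output-channel convolution (cross-correlation), no padding.

    feature: [C][H][W], kernel: [C][kh][kw]. Returns an [H-kh+1][W-kw+1] map.
    """
    n_ch, height, width = len(feature), len(feature[0]), len(feature[0][0])
    kh, kw = len(kernel[0]), len(kernel[0][0])
    out = []
    for i in range(height - kh + 1):
        row = []
        for j in range(width - kw + 1):
            s = bias
            for c in range(n_ch):
                for u in range(kh):
                    for v in range(kw):
                        s += feature[c][i + u][j + v] * kernel[c][u][v]
            row.append(s)
        out.append(row)
    return out


def flatten(t):
    return [v for channel in t for row in channel for v in row]


# A 2-channel 2x2 feature map and FC weights with the same layout.
feature = [[[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0], [7.0, 8.0]]]
weights = [[[1.0, 0.0], [0.0, 1.0]], [[0.5, 0.5], [0.0, 0.0]]]
bias = 1.0

fc_out = fc_forward(flatten(feature), flatten(weights), bias)
conv_out = conv_valid(feature, weights, bias)   # 1x1 map, same value

# On a larger 3x3 input, the same weights yield a 2x2 map of scores.
larger = [[[1.0, 2.0, 0.0], [3.0, 4.0, 0.0], [0.0, 0.0, 0.0]],
          [[5.0, 6.0, 0.0], [7.0, 8.0, 0.0], [0.0, 0.0, 0.0]]]
score_map = conv_valid(larger, weights, bias)
```

The last step is the point of the reinterpretation: once the layer is a convolution, a larger input produces a spatial map of class scores instead of a single vector.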

FCN for Semantic Segmentation

Network architecture [Long15]
- End-to-end CNN architecture for semantic segmentation
- Interpret fully connected layers as convolutional layers.
- [Figure: 500x500x3 input; 16x16x21 score map; deconvolution to the segmentation output. Convolutional layers: pretrained on ImageNet, fine-tuned for segmentation. Deconvolution: fixed, 64x64 bilinear interpolation.]

Deconvolution Filter
- Bilinear interpolation filter
- Same filter for every class
- No filter learning!

How does this deconvolution work?
- The deconvolution layer is fixed.
- Fine-tune the convolutional layers of the network with segmentation ground truth.
[Long15] J. Long, E. Shelhamer, T. Darrell: Fully Convolutional Networks for Semantic Segmentation. CVPR 2015

Skip Architecture
- Ensemble of three different scales
- Combining complementary features
- [Figure labels: more semantic; more detailed]

Limitations of FCN-based Semantic Segmentation
- Coarse output score map
  - A single bilinear filter should handle the variations in all kinds of object classes.
  - Difficult to capture the detailed structure of objects in the image
- Fixed-size receptive field
  - Unable to handle multiple scales
  - Difficult to delineate objects that are too small or too large compared to the size of the receptive field
- Noisy predictions due to the skip architecture
  - Trade-off between details and noise
  - Minor quantitative performance improvement
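A fixed bilinear interpolation filter of the kind mentioned above can be built in closed form. The construction below is a commonly used recipe for initializing upsampling (deconvolution) filters; whether the lecture's network uses exactly these weights is an assumption. A kernel of size 64 corresponds to upsampling by a factor of 32.

```python
def bilinear_weights_1d(size):
    """1D bilinear (triangle) weights for a kernel of the given size."""
    factor = (size + 1) // 2
    center = factor - 1 if size % 2 == 1 else factor - 0.5
    return [1 - abs(i - center) / factor for i in range(size)]


def bilinear_kernel(size):
    """2D bilinear kernel as the outer product of the 1D weights."""
    w = bilinear_weights_1d(size)
    return [[a * b for b in w] for a in w]


kernel4 = bilinear_kernel(4)     # kernel for 2x upsampling
kernel64 = bilinear_kernel(64)   # kernel for 32x upsampling
```

Since the same kernel is applied to every class channel and is never updated, the upsampling step adds no learned parameters, which is what "no filter learning" refers to.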

Results and Limitations
- [Figure: qualitative comparison. Columns: input image, ground truth (GT), FCN-32s, FCN-16s, FCN-8s]


More information

Rich feature hierarchies for accurate object detection and semantic segmentation

Rich feature hierarchies for accurate object detection and semantic segmentation Rich feature hierarchies for accurate object detection and semantic segmentation Ross Girshick Jeff Donahue Trevor Darrell Jitendra Malik UC Berkeley {rbg,jdonahue,trevor,malik}@eecs.berkeley.edu Abstract

More information

Denoising Convolutional Autoencoders for Noisy Speech Recognition

Denoising Convolutional Autoencoders for Noisy Speech Recognition Denoising Convolutional Autoencoders for Noisy Speech Recognition Mike Kayser Stanford University mkayser@stanford.edu Victor Zhong Stanford University vzhong@stanford.edu Abstract We propose the use of

More information

Master s Program in Information Systems

Master s Program in Information Systems The University of Jordan King Abdullah II School for Information Technology Department of Information Systems Master s Program in Information Systems 2006/2007 Study Plan Master Degree in Information Systems

More information

Are we ready for Autonomous Driving? The KITTI Vision Benchmark Suite

Are we ready for Autonomous Driving? The KITTI Vision Benchmark Suite Are we ready for Autonomous Driving? The KITTI Vision Benchmark Suite Philip Lenz 1 Andreas Geiger 2 Christoph Stiller 1 Raquel Urtasun 3 1 KARLSRUHE INSTITUTE OF TECHNOLOGY 2 MAX-PLANCK-INSTITUTE IS 3

More information

Convolutional Neural Networks with Intra-layer Recurrent Connections for Scene Labeling

Convolutional Neural Networks with Intra-layer Recurrent Connections for Scene Labeling Convolutional Neural Networks with Intra-layer Recurrent Connections for Scene Labeling Ming Liang Xiaolin Hu Bo Zhang Tsinghua National Laboratory for Information Science and Technology (TNList) Department

More information

Classification using intersection kernel SVMs is efficient

Classification using intersection kernel SVMs is efficient Classification using intersection kernel SVMs is efficient Jitendra Malik UC Berkeley Joint work with Subhransu Maji and Alex Berg Fast intersection kernel SVMs and other generalizations of linear SVMs

More information

Lecture 14: Convolutional neural networks for computer vision

Lecture 14: Convolutional neural networks for computer vision Lecture 14: Convolutional neural networks for computer vision Dr. Richard E. Turner (ret26@cam.ac.uk) November 20, 2014 Big picture Goal: how to produce good internal representations of the visual world

More information

Monza: Image Classification of Vehicle Make and Model Using Convolutional Neural Networks and Transfer Learning

Monza: Image Classification of Vehicle Make and Model Using Convolutional Neural Networks and Transfer Learning Monza: Image Classification of Vehicle Make and Model Using Convolutional Neural Networks and Transfer Learning Derrick Liu Stanford University lediur@stanford.edu Yushi Wang Stanford University yushiw@stanford.edu

More information

Interpreting American Sign Language with Kinect

Interpreting American Sign Language with Kinect Interpreting American Sign Language with Kinect Frank Huang and Sandy Huang December 16, 2011 1 Introduction Accurate and real-time machine translation of sign language has the potential to significantly

More information

Taking a Deeper Look at Pedestrians

Taking a Deeper Look at Pedestrians Taking a Deeper Look at Pedestrians Jan Hosang Mohamed Omran Rodrigo Benenson Bernt Schiele Max Planck Institute for Informatics Saarbrücken, Germany firstname.lastname@mpi-inf.mpg.de Abstract In this

More information

Tracking Algorithms. Lecture17: Stochastic Tracking. Joint Probability and Graphical Model. Probabilistic Tracking

Tracking Algorithms. Lecture17: Stochastic Tracking. Joint Probability and Graphical Model. Probabilistic Tracking Tracking Algorithms (2015S) Lecture17: Stochastic Tracking Bohyung Han CSE, POSTECH bhhan@postech.ac.kr Deterministic methods Given input video and current state, tracking result is always same. Local

More information

arxiv: v2 [cs.cv] 30 Dec 2016

arxiv: v2 [cs.cv] 30 Dec 2016 FastMask: Segment Multi-scale Object Candidates in One Shot Hexiang Hu University of California, Los Angeles Los Angeles, CA hexiang.frank.hu@gmail.com Shiyi Lan Fudan University Shanghai, China sylan14@fudan.edu.cn

More information

arxiv:1409.1556v6 [cs.cv] 10 Apr 2015

arxiv:1409.1556v6 [cs.cv] 10 Apr 2015 VERY DEEP CONVOLUTIONAL NETWORKS FOR LARGE-SCALE IMAGE RECOGNITION Karen Simonyan & Andrew Zisserman + Visual Geometry Group, Department of Engineering Science, University of Oxford {karen,az}@robots.ox.ac.uk

More information

arxiv: v2 [cs.cv] 1 Jul 2015

arxiv: v2 [cs.cv] 1 Jul 2015 Discovering Characteristic Landmarks on Ancient Coins using Convolutional Networks Jongpil Kim and Vladimir Pavlovic Rutgers, The State University of New Jersey {jpkim, vladimir}@cs.rutgers.edu arxiv:1506.09174v2

More information

End-to-End Deep Learning for Person Search

End-to-End Deep Learning for Person Search End-to-End Deep Learning for Person Search Tong Xiao 1 Shuang Li 1 Bochao Wang 2 Liang Lin 2 Xiaogang Wang 1 1 The Chinese University of Hong Kong 2 Sun Yat-Sen University {xiaotong,sli,xgwang}@ee.cuhk.edu.hk,

More information

Big Data Text Mining and Visualization. Anton Heijs

Big Data Text Mining and Visualization. Anton Heijs Copyright 2007 by Treparel Information Solutions BV. This report nor any part of it may be copied, circulated, quoted without prior written approval from Treparel7 Treparel Information Solutions BV Delftechpark

More information

arxiv: v1 [cs.cv] 23 Jan 2015

arxiv: v1 [cs.cv] 23 Jan 2015 Taking a Deeper Look at Pedestrians Jan Hosang Mohamed Omran Rodrigo Benenson Bernt Schiele Max Planck Institute for Informatics Saarbrücken, Germany firstname.lastname@mpi-inf.mpg.de arxiv:50.05790v [cs.cv]

More information

Research Statement: Towards Detailed Recognition of Visual Categories

Research Statement: Towards Detailed Recognition of Visual Categories As humans, we have a remarkable ability to perceive the world around us in minute detail purely from the light that is reflected off it we can estimate material and metric properties of objects, localize

More information

A Learning Based Method for Super-Resolution of Low Resolution Images

A Learning Based Method for Super-Resolution of Low Resolution Images A Learning Based Method for Super-Resolution of Low Resolution Images Emre Ugur June 1, 2004 emre.ugur@ceng.metu.edu.tr Abstract The main objective of this project is the study of a learning based method

More information

Learning and transferring mid-level image representions using convolutional neural networks

Learning and transferring mid-level image representions using convolutional neural networks Willow project-team Learning and transferring mid-level image representions using convolutional neural networks Maxime Oquab, Léon Bottou, Ivan Laptev, Josef Sivic 1 Image classification (easy) Is there

More information

Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition

Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition Kaiming He 1, Xiangyu Zhang 2, Shaoqing Ren 3, and Jian Sun 1 1 Microsoft Research 2 Xi an Jiaotong University 3 University

More information

LocNet: Improving Localization Accuracy for Object Detection

LocNet: Improving Localization Accuracy for Object Detection LocNet: Improving Localization Accuracy for Object Detection Spyros Gidaris, Nikos Komodakis To cite this version: Spyros Gidaris, Nikos Komodakis. LocNet: Improving Localization Accuracy for Object Detection.

More information

arxiv: v1 [cs.cv] 9 Apr 2016

arxiv: v1 [cs.cv] 9 Apr 2016 1 T-CNN: Tubelets with Convolutional Neural Networks for Object Detection from Videos arxiv:1604.02532v1 [cs.cv] 9 Apr 2016 Kai Kang, Hongsheng Li, Junjie Yan, Xingyu Zeng, Bin Yang, Tong Xiao, Cong Zhang,

More information