Compacting ConvNets for End-to-End Learning

Compacting ConvNets for End-to-End Learning. Jose M. Alvarez. Joint work with Lars Petersson, Hao Zhou, Fatih Porikli.

Success of CNN: Image Classification. Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton. ImageNet Classification with Deep Convolutional Neural Networks. NIPS 2012

Success of CNN: Object Detection. Shaoqing Ren, Kaiming He, Ross Girshick, Jian Sun. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. arXiv:1506.01497

Success of CNN: Semantic Segmentation. Jifeng Dai, Kaiming He, Jian Sun. BoxSup: Exploiting Bounding Boxes to Supervise Convolutional Networks for Semantic Segmentation. arXiv:1503.01640

Success of CNN: Image Captioning and Video Classification. Andrej Karpathy, Li Fei-Fei. Deep Visual-Semantic Alignments for Generating Image Descriptions. CVPR 2015

Key of success: better training algorithms (batch normalization, initializations, momentum).
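The momentum point above can be made concrete with a minimal sketch of the classical momentum update (the function name and the toy quadratic objective are illustrative, not from the slides):

```python
def sgd_momentum_step(w, grad, velocity, lr=0.01, mu=0.9):
    """Classical momentum: velocity accumulates an exponentially decaying
    average of past gradients, smoothing the descent path."""
    velocity = mu * velocity - lr * grad
    return w + velocity, velocity

# Toy run: minimize f(w) = w^2 (gradient 2w) starting from w = 1.0
w, v = 1.0, 0.0
for _ in range(100):
    w, v = sgd_momentum_step(w, 2 * w, v)
```

In real training the same update is applied elementwise to every weight tensor; frameworks bundle it into their SGD optimizers.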

Key of success: better training algorithms; large amounts of data / labels.

Key of success: better training algorithms; large amounts of data / labels; hardware / storage (GPUs, parallel systems). [Chart: GPU memory in GB for GTX-580, Titan Black ('14), Titan X ('15)]

Key of success: better training algorithms; large amounts of data / labels; hardware / storage; a larger community of researchers.

Key of success: enabled larger networks. [Chart: number of parameters in millions for LeNet-5, AlexNet, VGGNet-16]


Challenges: embedded devices with limited resources / power (Jetson TK1 in 2014, Jetson TX1 in 2015/16).

Challenges: embedded devices with limited resources / power. Memory is a limiting factor; real-time operation is required.

Computational Cost: AlexNet. The forward pass is time consuming.

Computational Cost: AlexNet. Memory bottleneck.

Computational Cost: VGGNet. Memory bottleneck.
conv3-64  x 2 :      38,720
conv3-128 x 2 :     221,440
conv3-256 x 3 :   1,475,328
conv3-512 x 3 :   5,899,776
conv3-512 x 3 :   7,079,424
fc1           : 102,764,544
fc2           :  16,781,312
fc3           :   4,097,000
TOTAL         : 138,357,544
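The per-layer totals on this slide follow directly from the layer shapes. A short sketch that reproduces them (the helper names are mine; the layer configuration is VGG-16 as listed on the slide):

```python
def conv_params(c_in, c_out, k=3):
    """Weights (k*k*c_in per filter) plus one bias per filter."""
    return c_out * (k * k * c_in + 1)

def fc_params(n_in, n_out):
    """Fully connected layer: one weight per input plus a bias, per output."""
    return n_out * (n_in + 1)

# VGG-16 conv stacks as listed on the slide
convs = [(3, 64), (64, 64),                   # conv3-64  x 2
         (64, 128), (128, 128),               # conv3-128 x 2
         (128, 256), (256, 256), (256, 256),  # conv3-256 x 3
         (256, 512), (512, 512), (512, 512),  # conv3-512 x 3
         (512, 512), (512, 512), (512, 512)]  # conv3-512 x 3
fcs = [(512 * 7 * 7, 4096), (4096, 4096), (4096, 1000)]

total = (sum(conv_params(i, o) for i, o in convs)
         + sum(fc_params(i, o) for i, o in fcs))
print(total)  # → 138357544, matching the slide
```

Note that fc1 alone (102.8M) accounts for roughly 74% of the network, which is why the fully connected layers are the memory bottleneck.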

Do we need all these parameters?

Over-Parameterization: needed for the highly non-convex optimization. [1] Anna Choromanska, Mikael Henaff, Michael Mathieu, Gérard Ben Arous, Yann LeCun. The Loss Surfaces of Multilayer Networks. AISTATS 2015

Over-Parameterization: needed for the highly non-convex optimization; deeper structures have larger learning capacity. [1] Guido Montúfar, Razvan Pascanu, Kyunghyun Cho, Yoshua Bengio. On the Number of Linear Regions of Deep Neural Networks. NIPS 2014

Over-Parameterization: needed for the highly non-convex optimization; deeper structures have larger learning capacity. From images to video: even larger nets? A. Karpathy et al. Large-scale Video Classification with Convolutional Neural Networks. CVPR 2014

Compacting CNN

Compacting CNN: network distillation, network pruning, structured parameters, ours.

Compacting CNN Network distillation

Compacting CNN: network distillation. A large network learns from the data; labels are generated using the trained network; smaller nets are trained on the outputs of the softmax layer (soft targets). Geoffrey Hinton, Oriol Vinyals, Jeff Dean. Distilling the Knowledge in a Neural Network. NIPS Workshop 2015
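The soft-target idea above can be sketched with plain numpy: the student matches the teacher's temperature-softened distribution rather than the hard labels. This is a minimal illustration of the soft-target term only (the hard-label term of the full loss is omitted, and the logits below are toy values):

```python
import numpy as np

def softmax(logits, T=1.0):
    """Softmax with temperature T; higher T gives a softer distribution."""
    z = np.asarray(logits, dtype=float) / T
    z -= z.max()                      # numerical stability
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, T=4.0):
    """Cross-entropy between the softened teacher and student distributions."""
    p_teacher = softmax(teacher_logits, T)
    p_student = softmax(student_logits, T)
    return -np.sum(p_teacher * np.log(p_student + 1e-12))

# Toy example: the teacher is confident about class 0, but at T=4 the soft
# targets still carry information about the relative ranking of classes 1, 2.
teacher = [6.0, 2.0, -1.0]
student = [1.0, 0.5, 0.2]
print(distillation_loss(student, teacher, T=4.0))
```

The loss is minimized when the student reproduces the teacher's softened distribution, which is what lets a smaller net absorb the larger net's "dark knowledge".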

Compacting CNN: network distillation (II). Use intermediate layers to guide the training. Adriana Romero, Nicolas Ballas, Samira Ebrahimi Kahou, Antoine Chassang, Carlo Gatta, Yoshua Bengio. FitNets: Hints for Thin Deep Nets. ICLR 2015

Compacting CNN. Pros: in general, better generalization and faster inference; equal or slightly better performance. Cons: requires a larger network to learn from.

Compacting CNN: network pruning. Directly remove unimportant parameters during training; classic approaches require second derivatives. Removing parameters plus quantization [1] gives good compression rates (orthogonal to other approaches). [1] S. Han, H. Mao, W. J. Dally. Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding. CoRR, abs/1510.00149, 2015
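The pruning stage of Deep Compression keeps only the largest-magnitude weights. A minimal sketch of that stage alone (the quantization and Huffman-coding stages are omitted; function and variable names are mine):

```python
import numpy as np

def prune_by_magnitude(weights, sparsity=0.9):
    """Zero out the smallest-magnitude fraction `sparsity` of the weights."""
    w = np.asarray(weights, dtype=float)
    k = int(sparsity * w.size)
    if k == 0:
        return w.copy()
    # k-th smallest absolute value becomes the pruning threshold
    threshold = np.partition(np.abs(w).ravel(), k - 1)[k - 1]
    mask = np.abs(w) > threshold
    return w * mask

rng = np.random.default_rng(0)
w = rng.normal(size=(256, 256))          # stand-in for a layer's weights
w_pruned = prune_by_magnitude(w, sparsity=0.9)
print(1 - np.count_nonzero(w_pruned) / w.size)   # close to 0.9
```

In the full method this is interleaved with retraining, so the surviving weights adapt to compensate for the removed ones.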

Compacting CNN: network distillation, network pruning, structured parameters.

Compacting CNN: Structured parameters. Low-rank approximations. Max Jaderberg, Andrea Vedaldi, Andrew Zisserman. Speeding up Convolutional Neural Networks with Low Rank Expansions. BMVC 2014

Compacting CNN: Structured parameters Low rank approximations (II) Emily Denton, Wojciech Zaremba, Joan Bruna, Yann LeCun, Rob Fergus. Exploiting Linear Structure Within Convolutional Networks for Efficient Evaluation. NIPS 2014

Compacting CNN: Structured parameters. Low-rank approximations (III). Weights are approximated by a sum of rank-1 tensors. Emily Denton, Wojciech Zaremba, Joan Bruna, Yann LeCun, Rob Fergus. Exploiting Linear Structure Within Convolutional Networks for Efficient Evaluation. NIPS 2014
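For a 2-D weight matrix, the sum-of-rank-1-terms idea is exactly a truncated SVD; tensor decompositions of conv filters generalize the same principle. A minimal matrix-case sketch (the toy rank-8 matrix is illustrative, not a trained layer):

```python
import numpy as np

def low_rank_approx(W, rank):
    """Approximate W by a sum of `rank` rank-1 terms via truncated SVD.
    Storage drops from m*n values to rank*(m + n)."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank]

# Toy example: a 64x64 matrix that is genuinely rank-8 plus small noise
rng = np.random.default_rng(1)
W = rng.normal(size=(64, 8)) @ rng.normal(size=(8, 64)) \
    + 0.01 * rng.normal(size=(64, 64))
W8 = low_rank_approx(W, rank=8)
rel_err = np.linalg.norm(W - W8) / np.linalg.norm(W)
print(rel_err)   # small: rank-8 factors capture almost all of W
```

The catch, as the next slides note, is that trained filters are rarely this close to low-rank, so the achievable compression depends on the layer.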

Compacting CNN: Structured parameters. Weak points: needs a fully trained full-rank network; not all filters can be approximated; theoretical speed-ups come with a drop in performance. Emily Denton, Wojciech Zaremba, Joan Bruna, Yann LeCun, Rob Fergus. Exploiting Linear Structure Within Convolutional Networks for Efficient Evaluation. NIPS 2014

Compacting CNN: Structured parameters. Weak points: needs a fully trained full-rank network; not all filters can be approximated; drop in performance. Strengths: potential to aid regularization during or after training; parameter sharing within the layer.

Compacting CNN: Structured parameters. Low-rank approximations (IV). VGG nets restrict filter size during training: same receptive field, deeper networks (more nonlinearities), fewer parameters (49C^2 vs 3x(3x3)C^2 = 27C^2). K. Simonyan, A. Zisserman. Very Deep Convolutional Networks for Large-Scale Image Recognition. ICLR 2015
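The 49C^2 vs 27C^2 comparison on the slide is easy to verify: a stack of three 3x3 convolutions has the same 7x7 receptive field as a single 7x7 convolution, but far fewer weights (biases ignored, as on the slide; C = 512 is just an example width):

```python
C = 512                            # channels in and out (example value)
one_7x7 = 7 * 7 * C * C            # single 7x7 conv layer: 49 C^2
three_3x3 = 3 * (3 * 3 * C * C)    # three stacked 3x3 layers: 27 C^2
saving = 1 - three_3x3 / one_7x7
print(one_7x7, three_3x3, saving)  # 27/49 of the weights: ~45% saved
```

The stack also interleaves three nonlinearities where the 7x7 layer has one, which is the "deeper networks" point above.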

Compacting CNN: Structured parameters. Low-rank approximations (Ours [1]). Filter restriction during training: larger receptive fields, deeper networks (more nonlinearities), parameter sharing, fewer parameters. [1] Joint work with Lars Petersson. Under review

Compacting CNN: Structured parameters. Low-rank approximations (Ours). ImageNet results (AlexNet). Baseline: Alex Krizhevsky, Ilya Sutskever, Geoffrey Hinton. ImageNet Classification with Deep Convolutional Neural Networks. NIPS 2012

Compacting CNN: Structured parameters. Low-rank approximations (Ours). Stereo matching results: Ours-3 (32K), Ours-1 (32K), Ours-1 (48K). Baseline: Jure Zbontar, Yann LeCun. Computing the Stereo Matching Cost with a Convolutional Neural Network. CVPR 2015

Memory?

Computational Cost: VGGNet. Memory bottleneck.
conv3-64  x 2 :      38,720
conv3-128 x 2 :     221,440
conv3-256 x 3 :   1,475,328
conv3-512 x 3 :   5,899,776
conv3-512 x 3 :   7,079,424
fc1           : 102,764,544
fc2           :  16,781,312
fc3           :   4,097,000
TOTAL         : 138,357,544

Computational Cost: AlexNet. Memory bottleneck.

Memory Bottleneck. Sparse constraints during training (Ours [2]): directly reduce the number of neurons; select the optimum number of neurons; significant memory reductions with a minor drop in performance. [2] Joint work with Hao Zhou, Fatih Porikli. Under review
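A generic way to make a training penalty remove whole neurons (rather than individual weights) is a group-sparsity term over each neuron's incoming weights. The sketch below illustrates that general idea only; it is not the authors' under-review formulation, and all names and the toy layer are mine:

```python
import numpy as np

def group_l21_penalty(W):
    """Sum of L2 norms of each output neuron's incoming weights (rows of W).
    Minimizing this drives entire rows, i.e. whole neurons, toward zero,
    so the layer width is selected during training instead of fixed a priori."""
    return float(np.sum(np.linalg.norm(W, axis=1)))

def count_active_neurons(W, tol=1e-3):
    """Neurons whose incoming-weight norm exceeds tol are still 'alive'."""
    return int(np.sum(np.linalg.norm(W, axis=1) > tol))

# Toy layer of 8 neurons with 16 inputs; pretend the penalty has already
# driven 3 rows (neurons 1, 4, 6) to numerically zero during training.
rng = np.random.default_rng(2)
W = rng.normal(size=(8, 16))
W[[1, 4, 6]] = 1e-6
print(count_active_neurons(W))  # → 5
```

Zeroed rows can then be physically removed from the layer (and the matching columns from the next layer), which is what turns sparsity into actual memory savings.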

Do we need all these parameters?

Compacting ConvNets for End-to-End Learning. Jose M. Alvarez. Joint work with Lars Petersson, Hao Zhou, Fatih Porikli.