Getting Started with Caffe Julien Demouth, Senior Engineer

Similar documents
CS231n Caffe Tutorial

Lecture 6: CNNs for Detection, Tracking, and Segmentation Object Detection

Fast R-CNN. Author: Ross Girshick Speaker: Charlie Liu Date: Oct, 13 th. Girshick, R. (2015). Fast R-CNN. arxiv preprint arxiv:

Lecture 6: Classification & Localization. boris. ginzburg@intel.com

Fast R-CNN Object detection with Caffe

Convolutional Feature Maps

Steven C.H. Hoi School of Information Systems Singapore Management University

MulticoreWare. Global Company, 250+ employees HQ = Sunnyvale, CA Other locations: US, China, India, Taiwan

CS 1699: Intro to Computer Vision. Deep Learning. Prof. Adriana Kovashka University of Pittsburgh December 1, 2015

Bert Huang Department of Computer Science Virginia Tech

DEEP LEARNING WITH GPUS

HE Shuncheng March 20, 2016

Compacting ConvNets for end to end Learning

Pedestrian Detection with RCNN

Module 5. Deep Convnets for Local Recognition Joost van de Weijer 4 April 2016

Convolutional Networks for Stock Trading

Applying Deep Learning to Car Data Logging (CDL) and Driver Assessor (DA) October 22-Oct-15

Applications of Deep Learning to the GEOINT mission. June 2015

Advanced analytics at your hands

SIGNAL INTERPRETATION

GPU-Based Deep Learning Inference:

Pedestrian Detection using R-CNN

Sense Making in an IOT World: Sensor Data Analysis with Deep Learning

InstaNet: Object Classification Applied to Instagram Image Streams

Image and Video Understanding

Deep Learning and GPUs. Julie Bernauer

Denoising Convolutional Autoencoders for Noisy Speech Recognition

Deep Learning GPU-Based Hardware Platform

PhD in Computer Science and Engineering Bologna, April Machine Learning. Marco Lippi. Marco Lippi Machine Learning 1 / 80

Deep Residual Networks

LIBSVX and Video Segmentation Evaluation

CIKM 2015 Melbourne Australia Oct. 22, 2015 Building a Better Connected World with Data Mining and Artificial Intelligence Technologies

Administrivia. Traditional Recognition Approach. Overview. CMPSCI 370: Intro. to Computer Vision Deep learning

Applications to Computational Financial and GPU Computing. May 16th. Dr. Daniel Egloff

Image Classification for Dogs and Cats

Apache Spark : Fast and Easy Data Processing Sujee Maniyam Elephant Scale LLC sujee@elephantscale.com

Part I Courses Syllabus

CUDA Optimization with NVIDIA Tools. Julien Demouth, NVIDIA

Image Caption Generator Based On Deep Neural Networks

EdVidParse: Detecting People and Content in Educational Videos

HPC with Multicore and GPUs

CORRIGENDUM TO TENDER FOR HIGH PERFORMANCE SERVER

Deformable Part Models with CNN Features

E6895 Advanced Big Data Analytics Lecture 14:! NVIDIA GPU Examples and GPU on ios devices

Workshop on Android and Applications Development

Pattern Insight Clone Detection

Big Data in medical image processing

Fast Accurate Fish Detection and Recognition of Underwater Images with Fast R-CNN

arxiv: v2 [cs.cv] 27 Sep 2015

Deep Learning For Text Processing

Latest Advances in Deep Learning. Yao Chou

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

NVIDIA CUDA GETTING STARTED GUIDE FOR MICROSOFT WINDOWS

arxiv: v1 [cs.cv] 18 May 2015

Outline. Introduction. State-of-the-art Forensic Methods. Hardware-based Workload Forensics. Experimental Results. Summary. OS level Hypervisor level

CNN Based Object Detection in Large Video Images. WangTao, IQIYI ltd

Here to take you beyond Mobile Application development using Android Course details

GPU Renderfarm with Integrated Asset Management & Production System (AMPS)

OMU350 Operations Manager 9.x on UNIX/Linux Advanced Administration

High Level Design Distributed Network Traffic Controller

vnas Series All-in-one NAS with virtualization platform

AW-HE60 Firmware Upgrade Procedure

Convolutional Neural Networks with Intra-layer Recurrent Connections for Scene Labeling

CAP 6412 Advanced Computer Vision

SQL Server Business Intelligence

Predict Influencers in the Social Network

Copyright by Parallels Holdings, Ltd. All rights reserved.

Topology Aware Analytics for Elastic Cloud Services

TEGRA X1 DEVELOPER TOOLS SEBASTIEN DOMINE, SR. DIRECTOR SW ENGINEERING

TRANSLATION PROXY A Hands-Off Option for Managing Multilingual Websites

RIPL. An Image Processing DSL. Rob Stewart & Deepayan Bhowmik. 1st May, Heriot Watt University

Using Mobile Processors for Cost Effective Live Video Streaming to the Internet

Introduction to GPU hardware and to CUDA

Accelerating CFD using OpenFOAM with GPUs

RADIANTGRID PLATFORM INSTALLATION PREREQUISITES

Packet-based Network Traffic Monitoring and Analysis with GPUs

arxiv: v6 [cs.cv] 10 Apr 2015

Lavastorm Analytic Library Predictive and Statistical Analytics Node Pack FAQs

Client-aware Cloud Storage

Implementation of Neural Networks with Theano.

Graphics Cards and Graphics Processing Units. Ben Johnstone Russ Martin November 15, 2011

VALAR: A BENCHMARK SUITE TO STUDY THE DYNAMIC BEHAVIOR OF HETEROGENEOUS SYSTEMS

CS 377: Operating Systems. Outline. A review of what you ve learned, and how it applies to a real operating system. Lecture 25 - Linux Case Study

ON THE IMPLEMENTATION OF ADAPTIVE FLOW MEASUREMENT IN THE SDN-ENABLED NETWORK: A PROTOTYPE

Dell Fabric Manager Installation Guide 1.0.0

Good Morning Wireless! SSID: MSFTOPEN No Username or Password Required

ANDROID DEVELOPER TOOLS TRAINING GTC Sébastien Dominé, NVIDIA

OpenFlow with Intel Voravit Tanyingyong, Markus Hidell, Peter Sjödin

GURLS: A Least Squares Library for Supervised Learning

LSI SAS inside 60% of servers. 21 million LSI SAS & MegaRAID solutions shipped over last 3 years. 9 out of 10 top server vendors use MegaRAID

How to use PDFlib products with PHP

Quick Start Guide for Parallels Virtuozzo

CUDA in the Cloud Enabling HPC Workloads in OpenStack With special thanks to Andrew Younge (Indiana Univ.) and Massimo Bernaschi (IAC-CNR)

MDSplus Automated Build and Distribution System

Transcription:

Getting Started with Caffe Julien Demouth, Senior Engineer

What is Caffe? Open Source Framework for Deep Learning http://github.com/bvlc/caffe Developed by the Berkeley Vision and Learning Center (BVLC) C++/CUDA architecture Command line, Python, MATLAB interfaces Pre-processing and deployment tools, reference models and examples Image data management Seamless GPU acceleration Large community of contributors to the open-source project 2

What is Caffe? End-to-end Deep Learning for the Practitioner and Developer Prototype Train Deploy 3

Caffe Features Data Pre-processing and Management Data ingest formats LevelDB or LMDB database In-memory (C++ and Python only) HDF5 Image files Pre-processing tools LevelDB/LMDB creation from raw images Training and validation set creation with shuffling Mean-image generation Data transformations Image cropping, resizing, scaling and mirroring Mean subtraction $CAFFE_ROOT/build/tools 4

Caffe Features Deep Learning Model Definition Protobuf model format Strongly typed format Human readable Auto-generates and checks Caffe code Developed by Google Used to define network architecture and training parameters No coding required! name: conv1 type: Convolution bottom: data top: conv1 convolution_param { num_output: 20 kernel_size: 5 stride: 1 weight_filler { type: xavier } } 5

Caffe Features Deep Learning Model Definition Loss functions: Classification Softmax Hinge loss Linear regression Euclidean loss Attributes/multiclassification Sigmoid cross entropy loss and more Available layer types: Convolution Pooling Normalization Activation functions: ReLU Sigmoid Tanh and more 6

Caffe Features Deep Neural Network Training Network training also requires no coding just define a solver file net: lenet_train.prototxt base_lr: 0.01 momentum: 0.9 max_iter: 10000 snapshot_prefix: lenet_snapshot solver_mode: GPU All you need to run things on the GPU > caffe train solver lenet_solver.prototxt gpu 0 Multiple optimization algorithms available: SGD (+momentum), ADAGRAD, NAG 7

Caffe Features Monitoring the Training Process Output to stdout: To visualize pipe, parse and plot or use DIGITS 8

Caffe Features Deep Neural Network Deployment Standard, compact model format caffe train produces a binary.caffemodel file Easily integrate trained models into data pipelines Deploy against new data using command line, Python or MATLAB interfaces Deploy models across HW and OS environments.caffemodel files transfer to any other Caffe installation (including DIGITS) 9

Caffe Features Deep Neural Network Sharing Caffe Model Zoo hosts community shared models Benefit from networks that you could not practically train yourself https://github.com/bvlc/caffe/wiki/model-zoo Caffe comes with unrestricted use of BVLC models: AlexNet R-CNN GoogLeNet Caffe model directory Solver + model prototxt(s) readme.md containing: Caffe version URL and SHA1 of.caffemodel License Description of training data 10

Caffe Features Extensible Code Layer Protocol == Class Interface Define a class in C++ or Python to extend Layer Include your new layer in a network prototxt layer { type: "Python" python_param { module: "layers" layer: "EuclideanLoss" } } 11

Caffe Features Extend Caffe from C++ Start from the caffe::layer class Implement the member functions LayerSetUp Reshape Forward_cpu/Forward_gpu Backward_cpu/Backward_gpu 12

Caffe Features Extend Caffe from C++ Write the CUDA kernel Forward_gpu launches the CUDA kernel 13

Caffe Features Extend Caffe from C++ Extend the Protobuffer file 14

Caffe Example Applications 15

Example applications Use Case 1: Classification of Images Object Scene Style http://demo.caffe.berkeleyvision.org/ http://places.csail.mit.edu/ http://demo.vislab.berkeleyvision.org/ B. Zhou et al. NIPS 14 Karayev et al. Recognizing Image Style. BMVC14 Open source demo code: $CAFFE_ROOT/examples/web_demo 16

Example applications Use Case 2: Localization Traditional CNN ROI pooling layer Objects Softmax Candidate object regions FC layers Regressor Bounding boxes (Fast) Region based Convolutional Networks (R-CNN) Ross Girshick, Microsoft Research https://github.com/rbgirshick/fast-rcnn 17

Example applications Use Case 3: Pixel Level Classification and Segmentation http://fcn.berkeleyvision.org Long, Shelhamer, Darrell, Fully convolutional networks for semantic segmentation, CVPR 2015 18

Example applications Use Case 4: Sequence Learning Recurrent Neural Networks (RNNs) and Long Short Term Memory (LSTM) Jeff Donahue et al. Video Language Dynamic data Current Caffe pull request to add support https://github.com/bvlc/caffe/pull/1873 http://arxiv.org/abs/1411.4389 Jeff Donahue et al. 19

Example applications Use Case 5: Transfer Learning Lots of data Just change a few lines in the model prototxt file New data CNN Object Classifier layer { name: data type: Data data_param { source: ilsvrc12_train } } layer { name: fc8 type: InnerProduct inner_product_param { { num_output: 1000 } } Transfer weights Dog vs. Cat Top 10 in 10 mins after finetuning layer { name: data type: Data data_param: { source: dogcat_train } } layer { name: fc8-dogcat type: InnerProduct inner_product_param num_output: 2 } } 20

Caffe Setup and Performance 21

Caffe Setup Tried and tested by BVLC on Ubuntu 14.04/12.04 and OS X 10.8+ Also demonstrated to compile on RHEL, Fedora and CentOS Download source from https://github.com/bvlc/caffe Unofficial 64-bit Windows port https://github.com/niuzhiheng/caffe Linux setup (see http://caffe.berkeleyvision.org/installation.html) Download Install pre-requisites Install CUDA and cudnn for GPU acceleration Compile using make 22

GPU Acceleration -gpu N flag tells caffe which gpu to use Alternatively, specify solver_mode: GPU in solver.prototxt 16.2s 10x 163s Tesla K40m, cudnn v3-rc ECC off, autoboost on Intel Xeon E5-2698 v3 @ 2.60GHz, 3.6GHz turbo, (16 cores total), HT off Benchmark: Train Caffenet model, 20 iterations, 256x256 images, mini-batch size 256 23

cudnn integration http://developer.nvidia.com/cudnn Drop-in support Install cudnn, uncomment USE_CUDNN :=1 in Makefile.config before build 24

Caffe Mobile Deployment Jetson TX1 Up to 340 img/s on Alexnet (FP16) No need to change code Compile Caffe Copy a trained.caffemodel to TX1 Run *Source: http://petewarden.com/2014/10/25/how-to-run-the-caffe-deep-learning-vision-library-on-nvidias-jetson-mobile-gpu-board/ 25

Hands-on lab Preview bit.ly/dlnvlab3 Use data pre-processing tools Edit a network definition Train a model Improve classification accuracy by modifying network parameters Visualize trained network weights Deploy a model using Python 26

Hands-on Lab 1. Create an account at nvidia.qwiklab.com 2. Go to LHC 2015 Workshop lab 3. Start the lab and enjoy! Only requires a supported browser, no NVIDIA GPU necessary! 27