Image and Video Understanding

Size: px
Start display at page:

Download "Image and Video Understanding"

Transcription

1 Image and Video Understanding 2VO WS Christoph Feichtenhofer, Axel Pinz Slide credits: Many thanks to all the great computer vision researchers on which this presentation relies on. Most material is taken from tutorials at NIPS, CVPR and BMVC conferences.

2 Outline Convolutional Networs (ConvNets) for Image Classification Operations in each layer Architecture Training Visualizations Krizhevsky, A., Sutskever, I. and Hinton, G. E., ImageNet Classification with Deep Convolutional Neural Networks, NIPS 2012 M. Zeiler & R. Fergus, Visualizing and Understanding Convolutional Networks, ECCV, 2014 A. Mahendran and A. Vedaldi. Understanding Deep Image Representations by Inverting Them CVPR 2015 State of the art methods Applications of ConvNets K. Simonyan and A. Zisserman, Very Deep Convolutional Networks for Large-Scale Image Recognition. ICLR S. Ioffe, C. Szegedy, Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, ICML 2015 K. He, X. Zhang, S. Ren, & j. Sun. Deep Residual Learning for Image Recognition. arxiv 2015.

3 Goal: Scene Understanding Car Person Leave a car in the desert Scenes Objects Actions Activities Desert Temporal input sequence Get out of car Open door

4 One application: Image retrieval

5 Deep Learning - breakthrough in visual and speech recognition Credit: B. Ginzburg

6 ConvNet Detection Examples

7 Credit: R. Fergus Summary: Compare: SIFT Descriptor Image Pixels Apply Gabor filters Spatial pool (Sum) Normalize to unit length Feature Vector

8 What are the weakest links limiting performance? Replace each component of the deformable part model detector with humans Good Features (part detection) and accurate Localization (NMS) are most important [Parikh & Zitnick CVPR 10]

9 Typical visual recognition pipeline Select / develop features Add on top of this Machine Learning for multi-class recognition and train classifier Image/Video Pixels Feature Extractor e.g. SIFT, HoG... Trainable Classifier e.g. SVM Object Class

10 Intuition Behind Deep Neural Nets Build features automatically based on training data Combine feature extraction and classification All the way from pixels classifier Each box is a feature detector One layer extracts features from output of previous layer Layer 1 Layer 2 Layer 3 CAR

11 Some Key Ingredients for Convolutional Neural Networks

12 Neural networks trained via backpropagation B-Prop Back-propagate error signal to get derivatives for learning parameters θ Compare outputs with correct answer to get error signal L outputs F-Prop Training hidden layers F-Prop / B-Prop (features) Learning by stochastic gradient descent (SGD): A) Compute loss L on weights small mini-batch of data input vector B) Compute gradient w.r.t. θ C) Use gradient to update θ (make a step in the opposite direction) Non-linearity Credit: G. Hinton

13 Neural networks trained via backpropagation Cat Compare outputs with correct answer to get error signal Back-propagate error signal to get derivatives for learning parameters θ Training F-Prop / B-Prop Learning by SGD: A) Compute loss on small mini-batch B) Compute gradient w.r.t. θ C) Use gradient to update θ input vector Credit: G. Hinton

14 Neural networks trained via backpropagation Dog Compare outputs with correct answer to get error signal Back-propagate error signal to get derivatives for learning parameters θ Training F-Prop / B-Prop Learning by SGD: A) Compute loss on small mini-batch B) Compute gradient w.r.t. θ C) Use gradient to update θ input vector Credit: G. Hinton

15 Credit: G. Hinton Neural networks testing Cat F-Prop

16 Credit: G. Hinton Neural networks testing Dog F-Prop

17

18

19

20

21

22

23

24

25

26

27

28

29

30

31

32 Summary: Modularity of backpropagation Credit: A. Vedaldi

33 Credit: M. A. Ranzato Motivation: Images as a composition of local parts Pixel-based representation Too many parameters!

34 Motivation: Images as a composition of local parts Patch-based representation Credit: M. A. Ranzato

35 Motivation: Images as a composition of local parts Patch-based representation Credit: M. A. Ranzato Still too many parameters!

36 Motivation: Images as a composition of local parts Convolution example Credit: M. A. Ranzato

37 Motivation: Images as a composition of local parts Convolution example Credit: M. A. Ranzato

38 Credit: R. Fergus Motivation: Images as a composition of local parts Filtering example Why translation equivariance? Input translation leads to a translation of features Fewer filters needed: no translated replications But still need to cover orientation/frequency Patch-based Patch-based Convolutional

39 Convolutional Neural Networks

40 Multistage HubelWiesel Architecture: An Old Idea for Local Shift Invariance [Hubel & Wiesel 1962] Simple cells detect local features Complex cells pool the outputs of simple cells within a retinotopic neighborhood. Simple cells Complex cells Multiple convolutions pooling & subsampling Convolutional Networks [LeCun 1988-present] Retinotopic Feature Maps Credit: Y. LeCun

41 Convolutional Neural Networks Neural network with specialized connectivity structure After a few convolution and subsampling stages, spatial resolution is very small Use fully connected layers up to classification at output layer [LeCun et al. 1989] [LeCun et al. 1989]

42 Training large ConvNets on Paradigm shift enabled by Large annotated dataset Strong regularization (dropout) GPU(s) for fast processing ~ 150 images/sec days weeks of training Imagenet database: 1K classes ~ 1K training images per class! ~ 1M training images [Russakovsky et al. ICCV 13]

43 Top-5 error rate % ImageNet Classification 2012 Krizhevsky et al % error (top-5) Next best (non-convnet) 26.2% error SuperVision ISI Oxford INRIA Amsterdam Credit: R. Fergus

44 Architecture of [Krizhevsky et al. NIPS 12] ImageNet Classification % error (top-5) next best (variant of SIFT + Fisher Vectors) 26.2% error Same idea as in [LeCun 98] but on a larger scale: more training images (10 6 vs 10 3 ) more hidden layers Overall: 60,000,000 parameters which are trained on 2 GPUs for a week with several tricks to reduce overfitting Data augmentation DropOut (new regularization)

45 ConvNet Architecture Feed-forward: Convolve input Non-linearity (rectified linear) Pooling (local max) Supervised Train convolutional filters by back-propagating classification error Feature maps Pooling Non-linearity Convolution (Learned) Input Image Credit: R. Fergus

46 Components of Each Layer Pixels / Features Filter with Dictionary (convolutional or tiled) + Non-linearity Spatial/Feature (Sum or Max) [Optional] Normalization between feature responses Output Features Credit: R. Fergus

47 Convolutional filtering Why convolution? Statistics of images look similar at different locations Dependencies are very local Filtering is an opteration with translation equivariance... Input Feature Map Credit: R. Fergus

48 Non-Linearity Non-linearity applied to each response Per-feature independent Tanh Sigmoid: 1/(1+exp(-x)) Rectified linear : max(0,x) Simplifies backprop Makes learning faster Avoids saturation issues (much higher dynamic range) Preferred option Credit: A. Vedaldi

49 Pooling Spatial Pooling Non-overlapping / overlapping regions (-0.3% in error) Sum or max Provides invariance to local transformations Max Sum Credit: R. Fergus

50 Normalization Contrast normalization The same response for different contrasts is desired

51 Normalization Contrast normalization (between/across feature maps) Local mean = 0, local std. = 1, Local 7x7 Gaussian Equalizes the features maps Feature Maps Feature Maps After Contrast Normalization Credit: R. Fergus

52 Local feature normalisation Credit: A. Vedaldi

53 Recap: Components of a CNN CNN - multi-layer NN architecture Convolutional + Non-Linear Layer Sub-sampling Layer Convolutional +Non-Linear Layer Fully connected layers Supervised Feature Extraction Classification Credit: B. Ginzburg

54 Summary: Components of Each Layer Credit: A. Vedaldi

55 Architecture of Krizhevsky et al. Softmax Output Layer 7: Full Trained via backprop by maximizing the log-prob. of the correct class-label 8 layers total Trained on ImageNet dataset Layer 6: Full Layer 5: Conv + Pool Layer 4: Conv Layer 3: Conv Layer 2: Conv + Pool Layer 1: Conv + Pool Input Image Credit: R. Fergus

56 Motivation: Improving Generalization: DropOut Random Forests generalize well due to averaging of many models Decision Trees are fast - ConvNets are slow many models are not feasible Similar to random forest bagging [Breiman 94], but differs in that parameters are shared For fully connected layers only: In training: Independently set each hidden unit activity to zero with 0.5 probability In testing multiply neuron output by 0.5 [Hinton et al. NIPS 12] Corresponds to averaging over exponentially many samples from different model architectures

57 [Chatfield et al. BMVC 14] Mean removal Centered (0- mean) RGB values. Input representation Further pre-processing tricks Centered (0-mean) RGB values. Data augmentation Train on 224x224 patches extracted randomly from images, and also their horizontal reflections An input image (256x256) Minus sign The mean input image [Krizhevsky et al. NIPS 12]

58 Test error (top-5) Credit: R. Fergus ImageNet Classification 2013/2014 Results Pre-2012: 26.2% error 2012: 16.5% error 2013: 11.2% error

59 ImageNet Sample classifications [Krizhevsky et al. NIPS 12]

60 Validation ImageNet Sample classifications [Krizhevsky et al. NIPS 12]

61 ImageNet Classification Progress from Credit: Y. LeCun

62 Credit: K. Simonyan

63 Better ( deeper) architectures exist now ILSVRC14 Winners: ~7.3% Top-5 error VGG: 16 layers of stacked 3x3 convolution with stride 1 Credit: K. Simonyan

64 Better ( deeper) architectures exist now ILSVRC14 Winners: ~7.3% Top-5 error VGG: 16 layers of stacked 3x3 convolution with stride 1 Credit: K. Simonyan

65 Better ( deeper) architectures exist now ILSVRC14 Winners: ~6.6% Top-5 error - GoogLeNet: composition of multi-scale dimensionreduced modules: Zeiler-Fergus Architecture (1 tower) - 1x1 convolutions serve as dimensionality reduction Convolution Pooling Softmax Other wire together what fires together Credit: C. Szegedy

66 In images, correlations tend to be local Credit: C. Szegedy

67 Cover very local clusters by 1x1 convolutions number of filters 1x1 Credit: C. Szegedy

68 Less spread out correlations number of filters 1x1 Credit: C. Szegedy

69 Cover more spread out clusters by 3x3 convolutions number of filters 1x1 3x3 Credit: C. Szegedy

70 Cover more spread out clusters by 5x5 convolutions number of filters 1x1 3x3 Credit: C. Szegedy

71 Cover more spread out clusters by 5x5 convolutions number of filters 1x1 3x3 5x5 Credit: C. Szegedy

72 A heterogeneous set of convolutions number of filters 1x1 3x3 5x5 Credit: C. Szegedy

73 Schematic view (naive version) number of filters 1x1 Filter concatenation 3x3 1x1 convolutions 3x3 convolutions 5x5 convolutions 5x5 Previous layer Credit: C. Szegedy

74 Naive idea Filter concatenation 1x1 convolutions 3x3 convolutions 5x5 convolutions Previous layer Credit: C. Szegedy

75 Naive idea (does not work!) - Too many parameters Filter concatenation 1x1 convolutions 3x3 convolutions 5x5 convolutions 3x3 max pooling Previous layer Credit: C. Szegedy

76 Inception module Filter concatenation 1x1 convolutions 3x3 convolutions 1x1 convolutions 5x5 convolutions 1x1 convolutions 1x1 convolutions 3x3 max pooling Previous layer 1x1 Convolution Only operates across channels Performs a linear projections of a single pixel s features Can be used to reduce dimensionality Increases depth

77 Inception 9 Inception modules Network in a network in a network... Convolution Pooling Softmax Other Credit: C. Szegedy

78 Classification failure cases Groundtruth: coffee mug GoogLeNet: table lamp lamp shade printer projector desktop computer Credit: C. Szegedy

79 Classification failure cases Groundtruth: hay GoogLeNet: sorrel (horse) hartebeest Arabian camel warthog gaselle Credit: C. Szegedy

80 Classification failure cases Groundtruth: Police car GoogLeNet: laptop hair drier binocular ATM machine seat belt Credit: C. Szegedy

81 Credit: K. Simonyan

82 Credit: K. Simonyan

83 Credit: K. Simonyan

84 Better ( deeper) architectures exist now ILSVRC15 Winners: ~3.57% Top-5 error Credit: K. He

85 Faster R-CNN + ResNet Examples Credit: K. He

86 Faster R-CNN + ResNet Examples Credit: K. He

87 Faster R-CNN + ResNet Examples Credit: K. He

88 Better ( deeper) architectures exist now ILSVRC15 Winners: Object detection revolution Credit: K. He

89 Better ( deeper) architectures exist now Ultra deep all 3x3 conv filters (except first: 7x7) spatial size /2 # filters x2 Uses Batch Normalisation during training Standard hyperparameters & data augmentation No hidden fc layer No max pooling (except after first layer) No dropout 2014 Credit: K. He

90 Simply stacking layers does not work Credit: K. He

91 Simply stacking layers does not work Credit: K. He

92 Solution: Residual learning Kaiming He, Xiangyu Zhang, Shaoqing Ren, & Jian Sun. Deep Residual Learning for Image Recognition. arxiv Credit: K. He

93 Breaking CNNs Intriguing properties of neural networks [Szegedy ICLR 2014]

94 What is going on? x E x x x E x Adversarial examples: modify the image to increase classifier error Explaining and Harnessing Adversarial Examples [Goodfellow ICLR 2015]

95 What is learned? Visualizing CNNs M. Zeiler & R. Fergus, Visualizing and Understanding Convolutional Networks, ECCV, 2014 A. Mahendran and A. Vedaldi. Understanding Deep Image Representations by Inverting Them, CVPR 2015

96 Visualization using Deconvolutional Networks Provides way to map activations at high layers back to the input [Zeiler et al. CVPR 10, ICCV 11, ECCV 14] Feature maps Same operations as Convnet, but in reverse: Unpool feature maps Convolve unpooled maps Filters copied from Convnet Used here purely as a probe Originally proposed as unsupervised learning method No inference, no learning Unpooling Non-linearity Convolution (learned) Input Image Credit: R. Fergus

97 Credit: R. Fergus Unpooling Operation Switches record where the pooled activations came from to unpool the reconstructed layer above

98 Deconvnet Credit: R. Fergus Deconvnet Projection from Higher Layers [Zeiler and Fergus. ECCV 14] Filters Feature Map... Filters Layer 2 Reconstruction Layer 2: Feature maps Layer 1 Reconstruction Layer 1: Feature maps Convnet Visualization Input Image

99 Visualizations of Higher Layers Use ImageNet 2012 validation set (stack of images) Push each image through network and look for image with the strongest activation for each feature map Feature Map... Take max activation from feature map associated with each filter Lower Layers Filters Use Deconvnet to project back to pixel space Use pooling switches distinctive to that activation Input Image Validation Images Credit: R. Fergus

100 Layer 1 Filters Credit: R. Fergus

101 Layer 1: Top-9 Patches Credit: R. Fergus

102 Layer 2: Top-1 Credit: R. Fergus

103 Layer 2: Top-9 Parts of input image that give strong activation of this feature map

104 Layer 2: Top-9 Patches Patches from validation images that give maximal activation of a given feature map

105 Layer 3: Top-1

106 Layer 3: Top-9

107 Layer 3: Top-9 Layer 3: Top-9 Patches

108 Layer 4: Top-1

109 Layer 4: Top-9

110 Layer 4: Top-9 patches

111 Layer 5: Top-1

112 Layer 5: Top-9

113 Layer 5: Top-9 Patches

114 Evolution of Features During Training

115 Credit: R. Fergus Evolution of Features During Training Higher layers evolve later in later epochs during training.

116 Visualizations can be used to improve model Visualization of Krizhevsky et al. s architecture showed some problems with layers 1 and 2 Alter architecture: smaller stride & filter size Visualizations look better and Performance improves Blocking artifacts Smaller stride for convolution Layer 2 filters Credit: R. Fergus

117 Credit: R. Fergus Too specific for low-level + dead filters Restrict size (smaller) Layer 1 filters Visualizations can be used to improve model Visualization of Krizhevsky et al. s architecture showed some problems with layers 1 and 2 Alter architecture: smaller stride & filter size Visualizations look better and Performance improves 11x11 filters, stride 4 7x7 filters, stride 2

118 Credit: R. Fergus Occlusion Experiment Mask parts of input with occluding square Monitor output of classification network Perhaps network using scene context?

119 Input image p(true class) Most probable class Credit: R. Fergus

120 Input image p(true class) Most probable class Credit: R. Fergus

121 Input image p(true class) Most probable class Credit: R. Fergus

122 Finding a Pre-Image Credit: A. Vedaldi

123 Images are locally smooth Credit: A. Vedaldi

124 Finding a pre-image at intermediat layers Credit: A. Vedaldi

CS 1699: Intro to Computer Vision. Deep Learning. Prof. Adriana Kovashka University of Pittsburgh December 1, 2015

CS 1699: Intro to Computer Vision. Deep Learning. Prof. Adriana Kovashka University of Pittsburgh December 1, 2015 CS 1699: Intro to Computer Vision Deep Learning Prof. Adriana Kovashka University of Pittsburgh December 1, 2015 Today: Deep neural networks Background Architectures and basic operations Applications Visualizing

More information

Convolutional Feature Maps

Convolutional Feature Maps Convolutional Feature Maps Elements of efficient (and accurate) CNN-based object detection Kaiming He Microsoft Research Asia (MSRA) ICCV 2015 Tutorial on Tools for Efficient Object Detection Overview

More information

Lecture 6: Classification & Localization. boris. ginzburg@intel.com

Lecture 6: Classification & Localization. boris. ginzburg@intel.com Lecture 6: Classification & Localization boris. ginzburg@intel.com 1 Agenda ILSVRC 2014 Overfeat: integrated classification, localization, and detection Classification with Localization Detection. 2 ILSVRC-2014

More information

Image Classification for Dogs and Cats

Image Classification for Dogs and Cats Image Classification for Dogs and Cats Bang Liu, Yan Liu Department of Electrical and Computer Engineering {bang3,yan10}@ualberta.ca Kai Zhou Department of Computing Science kzhou3@ualberta.ca Abstract

More information

Administrivia. Traditional Recognition Approach. Overview. CMPSCI 370: Intro. to Computer Vision Deep learning

Administrivia. Traditional Recognition Approach. Overview. CMPSCI 370: Intro. to Computer Vision Deep learning : Intro. to Computer Vision Deep learning University of Massachusetts, Amherst April 19/21, 2016 Instructor: Subhransu Maji Finals (everyone) Thursday, May 5, 1-3pm, Hasbrouck 113 Final exam Tuesday, May

More information

Module 5. Deep Convnets for Local Recognition Joost van de Weijer 4 April 2016

Module 5. Deep Convnets for Local Recognition Joost van de Weijer 4 April 2016 Module 5 Deep Convnets for Local Recognition Joost van de Weijer 4 April 2016 Previously, end-to-end.. Dog Slide credit: Jose M 2 Previously, end-to-end.. Dog Learned Representation Slide credit: Jose

More information

Compacting ConvNets for end to end Learning

Compacting ConvNets for end to end Learning Compacting ConvNets for end to end Learning Jose M. Alvarez Joint work with Lars Pertersson, Hao Zhou, Fatih Porikli. Success of CNN Image Classification Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton,

More information

Steven C.H. Hoi School of Information Systems Singapore Management University Email: chhoi@smu.edu.sg

Steven C.H. Hoi School of Information Systems Singapore Management University Email: chhoi@smu.edu.sg Steven C.H. Hoi School of Information Systems Singapore Management University Email: chhoi@smu.edu.sg Introduction http://stevenhoi.org/ Finance Recommender Systems Cyber Security Machine Learning Visual

More information

Introduction to Machine Learning CMU-10701

Introduction to Machine Learning CMU-10701 Introduction to Machine Learning CMU-10701 Deep Learning Barnabás Póczos & Aarti Singh Credits Many of the pictures, results, and other materials are taken from: Ruslan Salakhutdinov Joshua Bengio Geoffrey

More information

Deformable Part Models with CNN Features

Deformable Part Models with CNN Features Deformable Part Models with CNN Features Pierre-André Savalle 1, Stavros Tsogkas 1,2, George Papandreou 3, Iasonas Kokkinos 1,2 1 Ecole Centrale Paris, 2 INRIA, 3 TTI-Chicago Abstract. In this work we

More information

Pedestrian Detection using R-CNN

Pedestrian Detection using R-CNN Pedestrian Detection using R-CNN CS676A: Computer Vision Project Report Advisor: Prof. Vinay P. Namboodiri Deepak Kumar Mohit Singh Solanki (12228) (12419) Group-17 April 15, 2016 Abstract Pedestrian detection

More information

Fast R-CNN. Author: Ross Girshick Speaker: Charlie Liu Date: Oct, 13 th. Girshick, R. (2015). Fast R-CNN. arxiv preprint arxiv:1504.08083.

Fast R-CNN. Author: Ross Girshick Speaker: Charlie Liu Date: Oct, 13 th. Girshick, R. (2015). Fast R-CNN. arxiv preprint arxiv:1504.08083. Fast R-CNN Author: Ross Girshick Speaker: Charlie Liu Date: Oct, 13 th Girshick, R. (2015). Fast R-CNN. arxiv preprint arxiv:1504.08083. ECS 289G 001 Paper Presentation, Prof. Lee Result 1 67% Accuracy

More information

Stochastic Pooling for Regularization of Deep Convolutional Neural Networks

Stochastic Pooling for Regularization of Deep Convolutional Neural Networks Stochastic Pooling for Regularization of Deep Convolutional Neural Networks Matthew D. Zeiler Department of Computer Science Courant Institute, New York University zeiler@cs.nyu.edu Rob Fergus Department

More information

arxiv:1312.6034v2 [cs.cv] 19 Apr 2014

arxiv:1312.6034v2 [cs.cv] 19 Apr 2014 Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps arxiv:1312.6034v2 [cs.cv] 19 Apr 2014 Karen Simonyan Andrea Vedaldi Andrew Zisserman Visual Geometry Group,

More information

Lecture 6: CNNs for Detection, Tracking, and Segmentation Object Detection

Lecture 6: CNNs for Detection, Tracking, and Segmentation Object Detection CSED703R: Deep Learning for Visual Recognition (206S) Lecture 6: CNNs for Detection, Tracking, and Segmentation Object Detection Bohyung Han Computer Vision Lab. bhhan@postech.ac.kr 2 3 Object detection

More information

Pedestrian Detection with RCNN

Pedestrian Detection with RCNN Pedestrian Detection with RCNN Matthew Chen Department of Computer Science Stanford University mcc17@stanford.edu Abstract In this paper we evaluate the effectiveness of using a Region-based Convolutional

More information

Fast R-CNN Object detection with Caffe

Fast R-CNN Object detection with Caffe Fast R-CNN Object detection with Caffe Ross Girshick Microsoft Research arxiv code Latest roasts Goals for this section Super quick intro to object detection Show one way to tackle obj. det. with ConvNets

More information

Bert Huang Department of Computer Science Virginia Tech

Bert Huang Department of Computer Science Virginia Tech This paper was submitted as a final project report for CS6424/ECE6424 Probabilistic Graphical Models and Structured Prediction in the spring semester of 2016. The work presented here is done by students

More information

Tattoo Detection for Soft Biometric De-Identification Based on Convolutional NeuralNetworks

Tattoo Detection for Soft Biometric De-Identification Based on Convolutional NeuralNetworks 1 Tattoo Detection for Soft Biometric De-Identification Based on Convolutional NeuralNetworks Tomislav Hrkać, Karla Brkić, Zoran Kalafatić Faculty of Electrical Engineering and Computing University of

More information

arxiv:1502.01852v1 [cs.cv] 6 Feb 2015

arxiv:1502.01852v1 [cs.cv] 6 Feb 2015 Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification Kaiming He Xiangyu Zhang Shaoqing Ren Jian Sun arxiv:1502.01852v1 [cs.cv] 6 Feb 2015 Abstract Rectified activation

More information

Taking Inverse Graphics Seriously

Taking Inverse Graphics Seriously CSC2535: 2013 Advanced Machine Learning Taking Inverse Graphics Seriously Geoffrey Hinton Department of Computer Science University of Toronto The representation used by the neural nets that work best

More information

arxiv:1409.1556v6 [cs.cv] 10 Apr 2015

arxiv:1409.1556v6 [cs.cv] 10 Apr 2015 VERY DEEP CONVOLUTIONAL NETWORKS FOR LARGE-SCALE IMAGE RECOGNITION Karen Simonyan & Andrew Zisserman + Visual Geometry Group, Department of Engineering Science, University of Oxford {karen,az}@robots.ox.ac.uk

More information

Learning and transferring mid-level image representions using convolutional neural networks

Learning and transferring mid-level image representions using convolutional neural networks Willow project-team Learning and transferring mid-level image representions using convolutional neural networks Maxime Oquab, Léon Bottou, Ivan Laptev, Josef Sivic 1 Image classification (easy) Is there

More information

SIGNAL INTERPRETATION

SIGNAL INTERPRETATION SIGNAL INTERPRETATION Lecture 6: ConvNets February 11, 2016 Heikki Huttunen heikki.huttunen@tut.fi Department of Signal Processing Tampere University of Technology CONVNETS Continued from previous slideset

More information

Unsupervised Learning of Invariant Feature Hierarchies with Applications to Object Recognition

Unsupervised Learning of Invariant Feature Hierarchies with Applications to Object Recognition Unsupervised Learning of Invariant Feature Hierarchies with Applications to Object Recognition Marc Aurelio Ranzato, Fu-Jie Huang, Y-Lan Boureau, Yann LeCun Courant Institute of Mathematical Sciences,

More information

Deep Residual Networks

Deep Residual Networks Deep Residual Networks Deep Learning Gets Way Deeper 8:30-10:30am, June 19 ICML 2016 tutorial Kaiming He Facebook AI Research* *as of July 2016. Formerly affiliated with Microsoft Research Asia 7x7 conv,

More information

Two-Stream Convolutional Networks for Action Recognition in Videos

Two-Stream Convolutional Networks for Action Recognition in Videos Two-Stream Convolutional Networks for Action Recognition in Videos Karen Simonyan Andrew Zisserman Visual Geometry Group, University of Oxford {karen,az}@robots.ox.ac.uk Abstract We investigate architectures

More information

Supporting Online Material for

Supporting Online Material for www.sciencemag.org/cgi/content/full/313/5786/504/dc1 Supporting Online Material for Reducing the Dimensionality of Data with Neural Networks G. E. Hinton* and R. R. Salakhutdinov *To whom correspondence

More information

HE Shuncheng hsc12@outlook.com. March 20, 2016

HE Shuncheng hsc12@outlook.com. March 20, 2016 Department of Automation Association of Science and Technology of Automation March 20, 2016 Contents Binary Figure 1: a cat? Figure 2: a dog? Binary : Given input data x (e.g. a picture), the output of

More information

Recognition. Sanja Fidler CSC420: Intro to Image Understanding 1 / 28

Recognition. Sanja Fidler CSC420: Intro to Image Understanding 1 / 28 Recognition Topics that we will try to cover: Indexing for fast retrieval (we still owe this one) History of recognition techniques Object classification Bag-of-words Spatial pyramids Neural Networks Object

More information

CAP 6412 Advanced Computer Vision

CAP 6412 Advanced Computer Vision CAP 6412 Advanced Computer Vision http://www.cs.ucf.edu/~bgong/cap6412.html Boqing Gong Jan 26, 2016 Today Administrivia A bigger picture and some common questions Object detection proposals, by Samer

More information

Recognizing Cats and Dogs with Shape and Appearance based Models. Group Member: Chu Wang, Landu Jiang

Recognizing Cats and Dogs with Shape and Appearance based Models. Group Member: Chu Wang, Landu Jiang Recognizing Cats and Dogs with Shape and Appearance based Models Group Member: Chu Wang, Landu Jiang Abstract Recognizing cats and dogs from images is a challenging competition raised by Kaggle platform

More information

Object Detection in Video using Faster R-CNN

Object Detection in Video using Faster R-CNN Object Detection in Video using Faster R-CNN Prajit Ramachandran University of Illinois at Urbana-Champaign prmchnd2@illinois.edu Abstract Convolutional neural networks (CNN) currently dominate the computer

More information

An Analysis of Single-Layer Networks in Unsupervised Feature Learning

An Analysis of Single-Layer Networks in Unsupervised Feature Learning An Analysis of Single-Layer Networks in Unsupervised Feature Learning Adam Coates 1, Honglak Lee 2, Andrew Y. Ng 1 1 Computer Science Department, Stanford University {acoates,ang}@cs.stanford.edu 2 Computer

More information

Latest Advances in Deep Learning. Yao Chou

Latest Advances in Deep Learning. Yao Chou Latest Advances in Deep Learning Yao Chou Outline Introduction Images Classification Object Detection R-CNN Traditional Feature Descriptor Selective Search Implementation Latest Application Deep Learning

More information

Automatic 3D Reconstruction via Object Detection and 3D Transformable Model Matching CS 269 Class Project Report

Automatic 3D Reconstruction via Object Detection and 3D Transformable Model Matching CS 269 Class Project Report Automatic 3D Reconstruction via Object Detection and 3D Transformable Model Matching CS 69 Class Project Report Junhua Mao and Lunbo Xu University of California, Los Angeles mjhustc@ucla.edu and lunbo

More information

Learning to Process Natural Language in Big Data Environment

Learning to Process Natural Language in Big Data Environment CCF ADL 2015 Nanchang Oct 11, 2015 Learning to Process Natural Language in Big Data Environment Hang Li Noah s Ark Lab Huawei Technologies Part 1: Deep Learning - Present and Future Talk Outline Overview

More information

Selecting Receptive Fields in Deep Networks

Selecting Receptive Fields in Deep Networks Selecting Receptive Fields in Deep Networks Adam Coates Department of Computer Science Stanford University Stanford, CA 94305 acoates@cs.stanford.edu Andrew Y. Ng Department of Computer Science Stanford

More information

TRAFFIC sign recognition has direct real-world applications

TRAFFIC sign recognition has direct real-world applications Traffic Sign Recognition with Multi-Scale Convolutional Networks Pierre Sermanet and Yann LeCun Courant Institute of Mathematical Sciences, New York University {sermanet,yann}@cs.nyu.edu Abstract We apply

More information

Novelty Detection in image recognition using IRF Neural Networks properties

Novelty Detection in image recognition using IRF Neural Networks properties Novelty Detection in image recognition using IRF Neural Networks properties Philippe Smagghe, Jean-Luc Buessler, Jean-Philippe Urban Université de Haute-Alsace MIPS 4, rue des Frères Lumière, 68093 Mulhouse,

More information

CNN Based Object Detection in Large Video Images. WangTao, wtao@qiyi.com IQIYI ltd. 2016.4

CNN Based Object Detection in Large Video Images. WangTao, wtao@qiyi.com IQIYI ltd. 2016.4 CNN Based Object Detection in Large Video Images WangTao, wtao@qiyi.com IQIYI ltd. 2016.4 Outline Introduction Background Challenge Our approach System framework Object detection Scene recognition Body

More information

Semantic Recognition: Object Detection and Scene Segmentation

Semantic Recognition: Object Detection and Scene Segmentation Semantic Recognition: Object Detection and Scene Segmentation Xuming He xuming.he@nicta.com.au Computer Vision Research Group NICTA Robotic Vision Summer School 2015 Acknowledgement: Slides from Fei-Fei

More information

Modelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches

Modelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches Modelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches PhD Thesis by Payam Birjandi Director: Prof. Mihai Datcu Problematic

More information

MulticoreWare. Global Company, 250+ employees HQ = Sunnyvale, CA Other locations: US, China, India, Taiwan

MulticoreWare. Global Company, 250+ employees HQ = Sunnyvale, CA Other locations: US, China, India, Taiwan 1 MulticoreWare Global Company, 250+ employees HQ = Sunnyvale, CA Other locations: US, China, India, Taiwan Focused on Heterogeneous Computing Multiple verticals spawned from core competency Machine Learning

More information

Applications of Deep Learning to the GEOINT mission. June 2015

Applications of Deep Learning to the GEOINT mission. June 2015 Applications of Deep Learning to the GEOINT mission June 2015 Overview Motivation Deep Learning Recap GEOINT applications: Imagery exploitation OSINT exploitation Geospatial and activity based analytics

More information

ImageNet Classification with Deep Convolutional Neural Networks

ImageNet Classification with Deep Convolutional Neural Networks ImageNet Classification with Deep Convolutional Neural Networks Alex Krizhevsky University of Toronto kriz@cs.utoronto.ca Ilya Sutskever University of Toronto ilya@cs.utoronto.ca Geoffrey E. Hinton University

More information

arxiv:1506.03365v2 [cs.cv] 19 Jun 2015

arxiv:1506.03365v2 [cs.cv] 19 Jun 2015 LSUN: Construction of a Large-scale Image Dataset using Deep Learning with Humans in the Loop Fisher Yu Yinda Zhang Shuran Song Ari Seff Jianxiong Xiao arxiv:1506.03365v2 [cs.cv] 19 Jun 2015 Princeton

More information

Lecture 6. Artificial Neural Networks

Lecture 6. Artificial Neural Networks Lecture 6 Artificial Neural Networks 1 1 Artificial Neural Networks In this note we provide an overview of the key concepts that have led to the emergence of Artificial Neural Networks as a major paradigm

More information

Multi-Column Deep Neural Network for Traffic Sign Classification

Multi-Column Deep Neural Network for Traffic Sign Classification Multi-Column Deep Neural Network for Traffic Sign Classification Dan Cireşan, Ueli Meier, Jonathan Masci and Jürgen Schmidhuber IDSIA - USI - SUPSI Galleria 2, Manno - Lugano 6928, Switzerland Abstract

More information

Deep learning applications and challenges in big data analytics

Deep learning applications and challenges in big data analytics Najafabadi et al. Journal of Big Data (2015) 2:1 DOI 10.1186/s40537-014-0007-7 RESEARCH Open Access Deep learning applications and challenges in big data analytics Maryam M Najafabadi 1, Flavio Villanustre

More information

An Introduction to Neural Networks

An Introduction to Neural Networks An Introduction to Vincent Cheung Kevin Cannons Signal & Data Compression Laboratory Electrical & Computer Engineering University of Manitoba Winnipeg, Manitoba, Canada Advisor: Dr. W. Kinsner May 27,

More information

The Delicate Art of Flower Classification

The Delicate Art of Flower Classification The Delicate Art of Flower Classification Paul Vicol Simon Fraser University University Burnaby, BC pvicol@sfu.ca Note: The following is my contribution to a group project for a graduate machine learning

More information

Applying Deep Learning to Car Data Logging (CDL) and Driver Assessor (DA) October 22-Oct-15

Applying Deep Learning to Car Data Logging (CDL) and Driver Assessor (DA) October 22-Oct-15 Applying Deep Learning to Car Data Logging (CDL) and Driver Assessor (DA) October 22-Oct-15 GENIVI is a registered trademark of the GENIVI Alliance in the USA and other countries Copyright GENIVI Alliance

More information

Università degli Studi di Bologna

Università degli Studi di Bologna Università degli Studi di Bologna DEIS Biometric System Laboratory Incremental Learning by Message Passing in Hierarchical Temporal Memory Davide Maltoni Biometric System Laboratory DEIS - University of

More information

A Dynamic Convolutional Layer for Short Range Weather Prediction

A Dynamic Convolutional Layer for Short Range Weather Prediction A Dynamic Convolutional Layer for Short Range Weather Prediction Benjamin Klein, Lior Wolf and Yehuda Afek The Blavatnik School of Computer Science Tel Aviv University beni.klein@gmail.com, wolf@cs.tau.ac.il,

More information

Semantic Image Segmentation and Web-Supervised Visual Learning

Semantic Image Segmentation and Web-Supervised Visual Learning Semantic Image Segmentation and Web-Supervised Visual Learning Florian Schroff Andrew Zisserman University of Oxford, UK Antonio Criminisi Microsoft Research Ltd, Cambridge, UK Outline Part I: Semantic

More information

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks 1 Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun arxiv:1506.01497v3 [cs.cv] 6 Jan 2016 Abstract State-of-the-art object

More information

Local features and matching. Image classification & object localization

Local features and matching. Image classification & object localization Overview Instance level search Local features and matching Efficient visual recognition Image classification & object localization Category recognition Image classification: assigning a class label to

More information

Sense Making in an IOT World: Sensor Data Analysis with Deep Learning

Sense Making in an IOT World: Sensor Data Analysis with Deep Learning Sense Making in an IOT World: Sensor Data Analysis with Deep Learning Natalia Vassilieva, PhD Senior Research Manager GTC 2016 Deep learning proof points as of today Vision Speech Text Other Search & information

More information

DeepFace: Closing the Gap to Human-Level Performance in Face Verification

DeepFace: Closing the Gap to Human-Level Performance in Face Verification DeepFace: Closing the Gap to Human-Level Performance in Face Verification Yaniv Taigman Ming Yang Marc Aurelio Ranzato Facebook AI Research Menlo Park, CA, USA {yaniv, mingyang, ranzato}@fb.com Lior Wolf

More information

Implementation of Neural Networks with Theano. http://deeplearning.net/tutorial/

Implementation of Neural Networks with Theano. http://deeplearning.net/tutorial/ Implementation of Neural Networks with Theano http://deeplearning.net/tutorial/ Feed Forward Neural Network (MLP) Hidden Layer Object Hidden Layer Object Hidden Layer Object Logistic Regression Object

More information

arxiv:1505.04597v1 [cs.cv] 18 May 2015

arxiv:1505.04597v1 [cs.cv] 18 May 2015 U-Net: Convolutional Networks for Biomedical Image Segmentation Olaf Ronneberger, Philipp Fischer, and Thomas Brox arxiv:1505.04597v1 [cs.cv] 18 May 2015 Computer Science Department and BIOSS Centre for

More information

Do Convnets Learn Correspondence?

Do Convnets Learn Correspondence? Do Convnets Learn Correspondence? Jonathan Long Ning Zhang Trevor Darrell University of California Berkeley {jonlong, nzhang, trevor}@cs.berkeley.edu Abstract Convolutional neural nets (convnets) trained

More information

Understanding Deep Image Representations by Inverting Them

Understanding Deep Image Representations by Inverting Them Understanding Deep Image Representations by Inverting Them Aravindh Mahendran University of Oxford aravindh@robots.ox.ac.uk Andrea Vedaldi University of Oxford vedaldi@robots.ox.ac.uk Abstract Image representations,

More information

A Hybrid Neural Network-Latent Topic Model

A Hybrid Neural Network-Latent Topic Model Li Wan Leo Zhu Rob Fergus Dept. of Computer Science, Courant institute, New York University {wanli,zhu,fergus}@cs.nyu.edu Abstract This paper introduces a hybrid model that combines a neural network with

More information

PANDA: Pose Aligned Networks for Deep Attribute Modeling

PANDA: Pose Aligned Networks for Deep Attribute Modeling PANDA: Pose Aligned Networks for Deep Attribute Modeling Ning Zhang 1,2, Manohar Paluri 1, Marc Aurelio Ranzato 1, Trevor Darrell 2, Lubomir Bourdev 1 1 Facebook AI Research 2 EECS, UC Berkeley {nzhang,

More information

DeepFace: Closing the Gap to Human-Level Performance in Face Verification

DeepFace: Closing the Gap to Human-Level Performance in Face Verification DeepFace: Closing the Gap to Human-Level Performance in Face Verification Yaniv Taigman Ming Yang Marc Aurelio Ranzato Facebook AI Group Menlo Park, CA, USA {yaniv, mingyang, ranzato}@fb.com Lior Wolf

More information

Naive-Deep Face Recognition: Touching the Limit of LFW Benchmark or Not?

Naive-Deep Face Recognition: Touching the Limit of LFW Benchmark or Not? Naive-Deep Face Recognition: Touching the Limit of LFW Benchmark or Not? Erjin Zhou zej@megvii.com Zhimin Cao czm@megvii.com Qi Yin yq@megvii.com Abstract Face recognition performance improves rapidly

More information

Methods and Applications for Distance Based ANN Training

Methods and Applications for Distance Based ANN Training Methods and Applications for Distance Based ANN Training Christoph Lassner, Rainer Lienhart Multimedia Computing and Computer Vision Lab Augsburg University, Universitätsstr. 6a, 86159 Augsburg, Germany

More information

Artificial Neural Networks and Support Vector Machines. CS 486/686: Introduction to Artificial Intelligence

Artificial Neural Networks and Support Vector Machines. CS 486/686: Introduction to Artificial Intelligence Artificial Neural Networks and Support Vector Machines CS 486/686: Introduction to Artificial Intelligence 1 Outline What is a Neural Network? - Perceptron learners - Multi-layer networks What is a Support

More information

arxiv:1504.08083v2 [cs.cv] 27 Sep 2015

arxiv:1504.08083v2 [cs.cv] 27 Sep 2015 Fast R-CNN Ross Girshick Microsoft Research rbg@microsoft.com arxiv:1504.08083v2 [cs.cv] 27 Sep 2015 Abstract This paper proposes a Fast Region-based Convolutional Network method (Fast R-CNN) for object

More information

R-CNN minus R. 1 Introduction. Karel Lenc http://www.robots.ox.ac.uk/~karel. Department of Engineering Science, University of Oxford, Oxford, UK.

R-CNN minus R. 1 Introduction. Karel Lenc http://www.robots.ox.ac.uk/~karel. Department of Engineering Science, University of Oxford, Oxford, UK. LENC, VEDALDI: R-CNN MINUS R 1 R-CNN minus R Karel Lenc http://www.robots.ox.ac.uk/~karel Andrea Vedaldi http://www.robots.ox.ac.uk/~vedaldi Department of Engineering Science, University of Oxford, Oxford,

More information

Chapter 4: Artificial Neural Networks

Chapter 4: Artificial Neural Networks Chapter 4: Artificial Neural Networks CS 536: Machine Learning Littman (Wu, TA) Administration icml-03: instructional Conference on Machine Learning http://www.cs.rutgers.edu/~mlittman/courses/ml03/icml03/

More information

Convolutional Neural Networks with Intra-layer Recurrent Connections for Scene Labeling

Convolutional Neural Networks with Intra-layer Recurrent Connections for Scene Labeling Convolutional Neural Networks with Intra-layer Recurrent Connections for Scene Labeling Ming Liang Xiaolin Hu Bo Zhang Tsinghua National Laboratory for Information Science and Technology (TNList) Department

More information

Deep Image: Scaling up Image Recognition

Deep Image: Scaling up Image Recognition Deep Image: Scaling up Image Recognition Ren Wu 1, Shengen Yan, Yi Shan, Qingqing Dang, Gang Sun Baidu Research Abstract We present a state-of-the-art image recognition system, Deep Image, developed using

More information

Deep Learning using Linear Support Vector Machines

Deep Learning using Linear Support Vector Machines Yichuan Tang tang@cs.toronto.edu Department of Computer Science, University of Toronto. Toronto, Ontario, Canada. Abstract Recently, fully-connected and convolutional neural networks have been trained

More information

Tracking in flussi video 3D. Ing. Samuele Salti

Tracking in flussi video 3D. Ing. Samuele Salti Seminari XXIII ciclo Tracking in flussi video 3D Ing. Tutors: Prof. Tullio Salmon Cinotti Prof. Luigi Di Stefano The Tracking problem Detection Object model, Track initiation, Track termination, Tracking

More information

Fast Accurate Fish Detection and Recognition of Underwater Images with Fast R-CNN

Fast Accurate Fish Detection and Recognition of Underwater Images with Fast R-CNN Fast Accurate Fish Detection and Recognition of Underwater Images with Fast R-CNN Xiu Li 1, 2, Min Shang 1, 2, Hongwei Qin 1, 2, Liansheng Chen 1, 2 1. Department of Automation, Tsinghua University, Beijing

More information

Image Super-Resolution Using Deep Convolutional Networks

Image Super-Resolution Using Deep Convolutional Networks 1 Image Super-Resolution Using Deep Convolutional Networks Chao Dong, Chen Change Loy, Member, IEEE, Kaiming He, Member, IEEE, and Xiaoou Tang, Fellow, IEEE arxiv:1501.00092v3 [cs.cv] 31 Jul 2015 Abstract

More information

Multi-view Face Detection Using Deep Convolutional Neural Networks

Multi-view Face Detection Using Deep Convolutional Neural Networks Multi-view Face Detection Using Deep Convolutional Neural Networks Sachin Sudhakar Farfade Yahoo fsachin@yahoo-inc.com Mohammad Saberian Yahoo saberian@yahooinc.com Li-Jia Li Yahoo lijiali@cs.stanford.edu

More information

Learning Invariant Features through Topographic Filter Maps

Learning Invariant Features through Topographic Filter Maps Learning Invariant Features through Topographic Filter Maps Koray Kavukcuoglu Marc Aurelio Ranzato Rob Fergus Yann LeCun Courant Institute of Mathematical Sciences New York University {koray,ranzato,fergus,yann}@cs.nyu.edu

More information

Denoising Convolutional Autoencoders for Noisy Speech Recognition

Denoising Convolutional Autoencoders for Noisy Speech Recognition Denoising Convolutional Autoencoders for Noisy Speech Recognition Mike Kayser Stanford University mkayser@stanford.edu Victor Zhong Stanford University vzhong@stanford.edu Abstract We propose the use of

More information

Predict Influencers in the Social Network

Predict Influencers in the Social Network Predict Influencers in the Social Network Ruishan Liu, Yang Zhao and Liuyu Zhou Email: rliu2, yzhao2, lyzhou@stanford.edu Department of Electrical Engineering, Stanford University Abstract Given two persons

More information

SUCCESSFUL PREDICTION OF HORSE RACING RESULTS USING A NEURAL NETWORK

SUCCESSFUL PREDICTION OF HORSE RACING RESULTS USING A NEURAL NETWORK SUCCESSFUL PREDICTION OF HORSE RACING RESULTS USING A NEURAL NETWORK N M Allinson and D Merritt 1 Introduction This contribution has two main sections. The first discusses some aspects of multilayer perceptrons,

More information

Learning Deep Face Representation

Learning Deep Face Representation Learning Deep Face Representation Haoqiang Fan Megvii Inc. fhq@megvii.com Zhimin Cao Megvii Inc. czm@megvii.com Yuning Jiang Megvii Inc. jyn@megvii.com Qi Yin Megvii Inc. yq@megvii.com Chinchilla Doudou

More information

Efficient online learning of a non-negative sparse autoencoder

Efficient online learning of a non-negative sparse autoencoder and Machine Learning. Bruges (Belgium), 28-30 April 2010, d-side publi., ISBN 2-93030-10-2. Efficient online learning of a non-negative sparse autoencoder Andre Lemme, R. Felix Reinhart and Jochen J. Steil

More information

The Importance of Encoding Versus Training with Sparse Coding and Vector Quantization

The Importance of Encoding Versus Training with Sparse Coding and Vector Quantization The Importance of Encoding Versus Training with Sparse Coding and Vector Quantization Adam Coates Andrew Y. Ng Stanford University, 353 Serra Mall, Stanford, CA 94305 acoates@cs.stanford.edu ang@cs.stanford.edu

More information

VideoStory A New Mul1media Embedding for Few- Example Recogni1on and Transla1on of Events

VideoStory A New Mul1media Embedding for Few- Example Recogni1on and Transla1on of Events VideoStory A New Mul1media Embedding for Few- Example Recogni1on and Transla1on of Events Amirhossein Habibian, Thomas Mensink, Cees Snoek ISLA, University of Amsterdam Problem statement Recognize and

More information

arxiv:1604.08893v1 [cs.cv] 29 Apr 2016

arxiv:1604.08893v1 [cs.cv] 29 Apr 2016 Faster R-CNN Features for Instance Search Amaia Salvador, Xavier Giró-i-Nieto, Ferran Marqués Universitat Politècnica de Catalunya (UPC) Barcelona, Spain {amaia.salvador,xavier.giro}@upc.edu Shin ichi

More information

An Early Attempt at Applying Deep Reinforcement Learning to the Game 2048

An Early Attempt at Applying Deep Reinforcement Learning to the Game 2048 An Early Attempt at Applying Deep Reinforcement Learning to the Game 2048 Hong Gui, Tinghan Wei, Ching-Bo Huang, I-Chen Wu 1 1 Department of Computer Science, National Chiao Tung University, Hsinchu, Taiwan

More information

Image denoising: Can plain Neural Networks compete with BM3D?

Image denoising: Can plain Neural Networks compete with BM3D? Image denoising: Can plain Neural Networks compete with BM3D? Harold C. Burger, Christian J. Schuler, and Stefan Harmeling Max Planck Institute for Intelligent Systems, Tübingen, Germany http://people.tuebingen.mpg.de/burger/neural_denoising/

More information

Generating more realistic images using gated MRF s

Generating more realistic images using gated MRF s Generating more realistic images using gated MRF s Marc Aurelio Ranzato Volodymyr Mnih Geoffrey E. Hinton Department of Computer Science University of Toronto {ranzato,vmnih,hinton}@cs.toronto.edu Abstract

More information

Modeling Pixel Means and Covariances Using Factorized Third-Order Boltzmann Machines

Modeling Pixel Means and Covariances Using Factorized Third-Order Boltzmann Machines Modeling Pixel Means and Covariances Using Factorized Third-Order Boltzmann Machines Marc Aurelio Ranzato Geoffrey E. Hinton Department of Computer Science - University of Toronto 10 King s College Road,

More information

Machine Learning. 01 - Introduction

Machine Learning. 01 - Introduction Machine Learning 01 - Introduction Machine learning course One lecture (Wednesday, 9:30, 346) and one exercise (Monday, 17:15, 203). Oral exam, 20 minutes, 5 credit points. Some basic mathematical knowledge

More information

Using Convolutional Neural Networks for Image Recognition

Using Convolutional Neural Networks for Image Recognition Using Convolutional Neural Networks for Image Recognition By Samer Hijazi, Rishi Kumar, and Chris Rowen, IP Group, Cadence Convolutional neural networks (CNNs) are widely used in pattern- and image-recognition

More information

Assessment. Presenter: Yupu Zhang, Guoliang Jin, Tuo Wang Computer Vision 2008 Fall

Assessment. Presenter: Yupu Zhang, Guoliang Jin, Tuo Wang Computer Vision 2008 Fall Automatic Photo Quality Assessment Presenter: Yupu Zhang, Guoliang Jin, Tuo Wang Computer Vision 2008 Fall Estimating i the photorealism of images: Distinguishing i i paintings from photographs h Florin

More information

Scalable Object Detection by Filter Compression with Regularized Sparse Coding

Scalable Object Detection by Filter Compression with Regularized Sparse Coding Scalable Object Detection by Filter Compression with Regularized Sparse Coding Ting-Hsuan Chao, Yen-Liang Lin, Yin-Hsi Kuo, and Winston H Hsu National Taiwan University, Taipei, Taiwan Abstract For practical

More information

Simplified Machine Learning for CUDA. Umar Arshad @arshad_umar Arrayfire @arrayfire

Simplified Machine Learning for CUDA. Umar Arshad @arshad_umar Arrayfire @arrayfire Simplified Machine Learning for CUDA Umar Arshad @arshad_umar Arrayfire @arrayfire ArrayFire CUDA and OpenCL experts since 2007 Headquartered in Atlanta, GA In search for the best and the brightest Expert

More information

Visualizing Higher-Layer Features of a Deep Network

Visualizing Higher-Layer Features of a Deep Network Visualizing Higher-Layer Features of a Deep Network Dumitru Erhan, Yoshua Bengio, Aaron Courville, and Pascal Vincent Dept. IRO, Université de Montréal P.O. Box 6128, Downtown Branch, Montreal, H3C 3J7,

More information

Transfer Learning for Latin and Chinese Characters with Deep Neural Networks

Transfer Learning for Latin and Chinese Characters with Deep Neural Networks Transfer Learning for Latin and Chinese Characters with Deep Neural Networks Dan C. Cireşan IDSIA USI-SUPSI Manno, Switzerland, 6928 Email: dan@idsia.ch Ueli Meier IDSIA USI-SUPSI Manno, Switzerland, 6928

More information