Deep Learning Methods for Vision (draft)


CVPR 2012 Tutorial: Deep Learning Methods for Vision (draft). Honglak Lee, Computer Science and Engineering Division, University of Michigan, Ann Arbor

Feature representations: raw pixels as input. Plotting images in input space (pixel 1 vs. pixel 2), Motorbikes and Non-Motorbikes are hard to separate, and the learning algorithm works directly on pixels.

Feature representations: with a feature representation between the input and the learning algorithm, images are mapped from input space (pixel 1, pixel 2) to a feature space (e.g., handle, wheel), where Motorbikes and Non-Motorbikes become separable.

How is computer perception done? State-of-the-art: hand-crafting. Pipeline: input data → feature representation → learning algorithm. Object detection: image → low-level vision features (SIFT, HOG, etc.) → object detection / classification. Audio classification: audio → low-level audio features (spectrogram, MFCC, etc.) → speaker identification.

Computer vision features: SIFT, Spin image, HoG, RIFT, Textons, GLOH

Computer vision features (SIFT, Spin image, HoG, RIFT, Textons, GLOH). Hand-crafted features: 1. need expert knowledge; 2. require time-consuming hand-tuning; 3. are (arguably) one of the limiting factors of computer vision systems.

Learning Feature Representations. Key idea: learn the statistical structure or correlation of the data from unlabeled data. The learned representations can be used as features in supervised and semi-supervised settings. Known as: unsupervised feature learning, feature learning, deep learning, representation learning, etc. Topics covered in this talk: Restricted Boltzmann Machines, Deep Belief Networks, Denoising Autoencoders, and applications to vision, audio, and multimodal learning.

Learning Feature Hierarchy / Deep Learning. Deep architectures can be representationally efficient; there is a natural progression from low-level to high-level structure; and the lower-level representations can be shared across multiple tasks. Hierarchy: input → 1st layer (edges) → 2nd layer (object parts) → 3rd layer (objects).

Outline: Restricted Boltzmann machines; Deep Belief Networks; Denoising Autoencoders; Applications to Vision; Applications to Audio and Multimodal Data

Learning Feature Hierarchy [Related work: Hinton, Bengio, LeCun, Ng, and others.] Higher layers: DBNs (combinations of edges). First layer: RBMs (edges). Input: image patch (pixels).

Restricted Boltzmann Machines with binary-valued input data. Representation: undirected bipartite graphical model. v ∈ {0,1}^D: observed (visible) binary variables; h ∈ {0,1}^K: hidden binary variables.

Conditional Probabilities (RBM with binary-valued input data). Given v, all the h_j are conditionally independent: P(h_j = 1 | v) = exp(Σ_i W_ij v_i + b_j) / (exp(Σ_i W_ij v_i + b_j) + 1) = sigmoid(Σ_i W_ij v_i + b_j) = sigmoid(w_j^T v + b_j). P(h | v) can be used as features. Given h, all the v_i are conditionally independent: P(v_i = 1 | h) = sigmoid(Σ_j W_ij h_j + c_i).

Restricted Boltzmann Machines with real-valued input data. Representation: undirected bipartite graphical model. V: observed (visible) real variables; H: hidden binary variables.

Conditional Probabilities (RBM with real-valued input data). Given v, all the h_j are conditionally independent: P(h_j = 1 | v) = exp((1/σ) Σ_i W_ij v_i + b_j) / (exp((1/σ) Σ_i W_ij v_i + b_j) + 1) = sigmoid((1/σ) Σ_i W_ij v_i + b_j) = sigmoid((1/σ) w_j^T v + b_j). P(h | v) can be used as features. Given h, all the v_i are conditionally independent: P(v_i | h) = N(σ Σ_j W_ij h_j + c_i, σ²), or P(v | h) = N(σWh + c, σ²I).

Inference. Conditional distribution P(v | h) or P(h | v): easy to compute (see previous slides); due to conditional independence, we can sample all hidden units given all visible units in parallel (and vice versa). Joint distribution P(v, h): requires Gibbs sampling (approximate; many iterations to converge). Initialize with v_0; sample h_0 from P(h | v_0); repeat until convergence (t = 1, …): sample v_t from P(v | h_{t−1}), then sample h_t from P(h | v_t).
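The alternating sampling steps above can be sketched in a few lines of NumPy; the RBM sizes and parameters below are toy values for illustration, not anything from the tutorial.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sample_h_given_v(v, W, b):
    # P(h_j = 1 | v) = sigmoid(w_j^T v + b_j); all h_j sampled in parallel
    p = sigmoid(v @ W + b)
    return (rng.random(p.shape) < p).astype(float), p

def sample_v_given_h(h, W, c):
    # P(v_i = 1 | h) = sigmoid(sum_j W_ij h_j + c_i); all v_i in parallel
    p = sigmoid(h @ W.T + c)
    return (rng.random(p.shape) < p).astype(float), p

def gibbs_chain(v0, W, b, c, n_steps=100):
    # Alternate block sampling of h and v to draw (approximately) from P(v, h)
    v = v0
    for _ in range(n_steps):
        h, _ = sample_h_given_v(v, W, b)
        v, _ = sample_v_given_h(h, W, c)
    return v

D, K = 6, 4  # visible / hidden sizes (toy example)
W = 0.1 * rng.standard_normal((D, K))
b, c = np.zeros(K), np.zeros(D)
v = gibbs_chain(rng.integers(0, 2, size=D).astype(float), W, b, c)
```

Because the hidden units are conditionally independent given v (and vice versa), each block update is a single matrix product rather than a loop over units.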

Training RBMs. Model: how can we find parameters θ that maximize P_θ(v)? The gradient splits into a data-distribution term (posterior of h given v) and a model-distribution term. We need P(h | v) and P(v, h), and the derivative of E with respect to the parameters {W, b, c}. P(h | v) is tractable; P(v, h) is intractable: it can be approximated with Gibbs sampling, but that requires many iterations.

Contrastive Divergence. An approximation of the log-likelihood gradient for RBMs: 1. replace the average over all possible inputs by samples; 2. run the MCMC chain (Gibbs sampling) for only k steps, starting from the observed example. Initialize with v_0 = v; sample h_0 from P(h | v_0); for t = 1, …, k: sample v_t from P(v | h_{t−1}), then sample h_t from P(h | v_t).

A picture of the maximum likelihood learning algorithm for an RBM: alternate sampling of h given v and v given h for t = 0, 1, 2, …, ∞; at t = ∞ the chain reaches the equilibrium distribution. The weight gradient is ΔW_ij ∝ ⟨v_i h_j⟩^0 − ⟨v_i h_j⟩^∞. Slide credit: Geoff Hinton

A quick way to learn an RBM: start with a training vector on the visible units (t = 0, data); update all the hidden units in parallel; update all the visible units in parallel to get a reconstruction (t = 1); update the hidden units again. ΔW_ij ∝ ⟨v_i h_j⟩^0 − ⟨v_i h_j⟩^1. Slide credit: Geoff Hinton

Update Rule: putting it together. Training via stochastic gradient. Note that ∂E/∂W_ij = −v_i h_j; therefore the weight update is the difference between the data and reconstruction correlations, ΔW_ij ∝ ⟨v_i h_j⟩_data − ⟨v_i h_j⟩_recon. Similar update rules can be derived for the biases b and c. Can be implemented in ~10 lines of Matlab code.
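A Python analogue of the "~10 lines of Matlab" claim above, a minimal CD-1 update for a binary RBM; the batch data, sizes, and learning rate are illustrative toy values.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

def cd1_update(V, W, b, c, lr=0.1):
    """One contrastive-divergence (k = 1) step on a batch V of binary rows."""
    ph0 = sigmoid(V @ W + b)                       # P(h | v) on the data
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    pv1 = sigmoid(h0 @ W.T + c)                    # reconstruction
    ph1 = sigmoid(pv1 @ W + b)                     # P(h | v) on the reconstruction
    W += lr * (V.T @ ph0 - pv1.T @ ph1) / len(V)   # <v h>_data - <v h>_recon
    b += lr * (ph0 - ph1).mean(axis=0)
    c += lr * (V - pv1).mean(axis=0)
    return W, b, c, np.mean((V - pv1) ** 2)        # cheap reconstruction error

V = rng.integers(0, 2, size=(20, 8)).astype(float)  # toy binary data
W = 0.01 * rng.standard_normal((8, 4))
b, c = np.zeros(4), np.zeros(8)
for _ in range(100):
    W, b, c, err = cd1_update(V, W, b, c)
```

As the slides note, the reconstruction error returned here is only a cheap proxy for the likelihood, not the quantity CD actually follows.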

Other ways of training RBMs. Persistent CD [Tieleman, ICML 2008; Tieleman & Hinton, ICML 2009]: keep a background MCMC chain to obtain the negative-phase samples; related to stochastic approximation (Robbins and Monro, Ann. Math. Stats., 1951; L. Younes, Probability Theory, 1989). Score matching [Swersky et al., ICML 2011; Hyvarinen, JMLR 2005]: use the score function to eliminate Z; match the model's and the empirical score functions.

Estimating the Log-Likelihood of RBMs. How do we measure the likelihood of the learned models? An RBM requires estimating the partition function Z. Reconstruction error provides a cheap proxy. Log Z is tractable analytically for fewer than ~25 binary inputs or hidden units, and can be approximated with Annealed Importance Sampling (AIS) (Salakhutdinov & Murray, ICML 2008). Open question: efficient ways to monitor progress.

Variants of RBMs

Sparse RBM / DBN [Lee et al., NIPS 2008]. Main idea: constrain the hidden layer nodes to have sparse average values (activations) [cf. sparse coding]. Optimization: trade off between likelihood and a sparsity penalty (log-likelihood plus a penalty comparing each unit's average activation to a target sparsity). Other penalty functions (e.g., KL divergence) can also be used.
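The sparsity penalty above can be sketched concretely; a squared-error penalty between each unit's average activation and the target is assumed here (the slide notes that other penalties, such as KL divergence, work too), and the numbers are toy values.

```python
import numpy as np

def sparsity_penalty(H, target=0.05, lam=1.0):
    """Squared-error sparsity penalty on mean hidden activations.

    H: (n_examples, n_hidden) array of P(h_j = 1 | v) values.
    Returns the penalty value and its gradient w.r.t. the mean activations.
    """
    q = H.mean(axis=0)                  # average activation per hidden unit
    penalty = lam * np.sum((target - q) ** 2)
    grad_q = -2.0 * lam * (target - q)  # pushes each q_j toward the target
    return penalty, grad_q

# Unit 0 fires far too often (mean 0.8); unit 1 is already near the target.
H = np.array([[0.9, 0.05], [0.8, 0.06], [0.7, 0.04]])
penalty, grad = sparsity_penalty(H)
```

During training this gradient is simply added (with a weight λ) to the CD log-likelihood gradient, which is what "trade off between likelihood and sparsity penalty" means in practice.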

Modeling handwritten digits. Sparse dictionary learning via sparse RBMs: weights W connect the input nodes (data) to first-layer bases ("pen strokes") learned from training examples. The learned sparse representations can be used as features. [Lee et al., NIPS 2008; Ranzato et al., NIPS 2007]

3-way factorized RBM [Ranzato et al., AISTATS 2010; Ranzato and Hinton, CVPR 2010]. Models the covariance structure of images using hidden variables: the 3-way factorized RBM / mean-covariance RBM, contrasted with a Gaussian MRF and a 3-way (covariance) RBM. [Slide credit: Marc'Aurelio Ranzato]

Generating natural image patches: natural images vs. samples from the mcRBM (Ranzato and Hinton, CVPR 2010), a GRBM (from Osindero and Hinton, NIPS 2008), and an S-RBM + DBN (from Osindero and Hinton, NIPS 2008). Slide credit: Marc'Aurelio Ranzato

Outline: Restricted Boltzmann machines; Deep Belief Networks; Denoising Autoencoders; Applications to Vision; Applications to Audio and Multimodal Data

Deep Belief Networks (DBNs) (Hinton et al., 2006). A probabilistic generative model with a deep architecture (multiple layers). Unsupervised pre-training provides a good initialization of the network, maximizing a lower bound on the log-likelihood of the data. Supervised fine-tuning: generative (up-down algorithm) or discriminative (backpropagation).

DBN structure: visible layer v and hidden layers h^1, …, h^l; the top two layers form an RBM, and the layers below form a directed belief net. Joint distribution: P(v, h^1, …, h^l) = P(v | h^1) P(h^1 | h^2) ⋯ P(h^{l−2} | h^{l−1}) P(h^{l−1}, h^l). (Hinton et al., 2006)

DBN structure (weights W^1, W^2, W^3): the generative process runs top-down, P(h^{k−1}_i = 1 | h^k) = sigm(Σ_j W^k_ij h^k_j + b_i); approximate inference runs bottom-up with the transposed weights, Q(h^k_j = 1 | h^{k−1}) = sigm(Σ_i W^k_ij h^{k−1}_i + b_j). Joint distribution: P(v, h^1, …, h^l) = P(v | h^1) ⋯ P(h^{l−2} | h^{l−1}) P(h^{l−1}, h^l). (Hinton et al., 2006)

DBN greedy training, first step: construct an RBM with an input layer v and a hidden layer h; train the RBM. (Hinton et al., 2006)

DBN greedy training, second step: stack another hidden layer on top of the RBM to form a new RBM. Fix W^1, sample h^1 from Q(h^1 | v) as input, and train W^2 as an RBM. (Hinton et al., 2006)

DBN greedy training, third step: continue to stack layers on top of the network; train each as in the previous step, with input sampled from Q(h^2 | h^1). And so on. (Hinton et al., 2006)
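The greedy stacking loop above can be sketched as follows; `train_rbm` stands in for any RBM trainer (e.g., CD-1) and is stubbed here with small random weights, so the sizes and initialization are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

def train_rbm(data, n_hidden):
    # Stand-in for a real RBM trainer (e.g., CD-1); returns weights and biases.
    W = 0.01 * rng.standard_normal((data.shape[1], n_hidden))
    b = np.zeros(n_hidden)
    return W, b

def greedy_train_dbn(v, layer_sizes):
    """Greedy layer-wise training: each new RBM is trained on samples
    from Q(h^k | h^{k-1}) produced by the frozen layer below."""
    layers, data = [], v
    for n_hidden in layer_sizes:
        W, b = train_rbm(data, n_hidden)
        layers.append((W, b))                        # freeze this layer
        q = sigmoid(data @ W + b)                    # Q(h = 1 | input)
        data = (rng.random(q.shape) < q).astype(float)  # input for next RBM
    return layers

v = rng.integers(0, 2, size=(50, 16)).astype(float)  # toy binary data
dbn = greedy_train_dbn(v, [12, 8, 4])
```

The key structural point is that once a layer is appended it is never revisited: only the sampled codes flow upward to train the next RBM.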

Why does greedy training work? An RBM specifies P(v, h) via P(v | h) and P(h | v), and implicitly defines P(v) and P(h). Key idea of stacking: keep P(v | h) from the 1st RBM, and replace P(h) by the distribution generated by the 2nd-level RBM. (Hinton et al., 2006)

Why does greedy training work? A variational lower bound justifies greedy layer-wise training of RBMs: the prior over h^1 (inferred through Q(h^1 | v)) is trained by the second-layer RBM. (Hinton et al., 2006)

Why does greedy training work? Note: an RBM and a 2-layer DBN are equivalent when W^2 = (W^1)^T. Therefore the lower bound is tight at initialization, and the log-likelihood improves as greedy training proceeds. (Hinton et al., 2006)

DBN and supervised fine-tuning. Discriminative fine-tuning: initialize a neural net with the DBN weights and run backpropagation; maximizes log P(Y | X) (X: data, Y: label). Generative fine-tuning: up-down algorithm; maximizes log P(Y, X) (the joint likelihood of data and labels).

A model for digit recognition. Architecture: 28 x 28 pixel image → 500 neurons → 500 neurons → 2000 top-level neurons, with 10 label neurons attached at the top. The top two layers form an associative memory whose energy landscape models the low-dimensional manifolds of the digits; the energy valleys have names (the digit labels). The model learns to generate combinations of labels and images. To perform recognition, we start with a neutral state of the label units and do an up-pass from the image, followed by a few iterations of the top-level associative memory. Slide credit: Geoff Hinton

Fine-tuning with a contrastive version of the wake-sleep algorithm. After learning many layers of features, we can fine-tune the features to improve generation: 1. do a stochastic bottom-up pass and adjust the top-down weights to be good at reconstructing the feature activities in the layer below; 2. do a few iterations of sampling in the top-level RBM and adjust the weights in the top-level RBM; 3. do a stochastic top-down pass and adjust the bottom-up weights to be good at reconstructing the feature activities in the layer above. Slide credit: Geoff Hinton

Generating a sample from a DBN. We want to sample from P(v, h^1, …, h^l) = P(v | h^1) ⋯ P(h^{l−2} | h^{l−1}) P(h^{l−1}, h^l): sample (h^{l−1}, h^l) using Gibbs sampling in the top-level RBM, then sample each lower layer from P(h^{i−1} | h^i), down to v.

Generating samples from a DBN (Hinton et al., "A Fast Learning Algorithm for Deep Belief Nets", 2006).

Results for supervised fine-tuning on MNIST (test error):
- Very carefully trained backprop net with one or two hidden layers (Platt; Hinton): 1.6%
- SVM (Decoste & Schoelkopf, 2002): 1.4%
- Generative model of the joint density of images and labels (+ generative fine-tuning): 1.25%
- Generative model of unlabelled digits followed by gentle backpropagation (Hinton & Salakhutdinov, Science 2006): 1.15%
Slide credit: Geoff Hinton

More details on the up-down algorithm: Hinton, G. E., Osindero, S., and Teh, Y. (2006), "A fast learning algorithm for deep belief nets", Neural Computation, 18. A handwritten digit demo is available online.

Outline: Restricted Boltzmann machines; Deep Belief Networks; Denoising Autoencoders; Applications to Vision; Applications to Audio and Multimodal Data

Denoising Autoencoder [Vincent et al., ICML 2008]. Perturb the input x to a corrupted version (e.g., randomly set some of the coordinates of the input to zero). Recover x from the encoding of the perturbed data: minimize a loss between x and its reconstruction.
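The corrupt-encode-decode loop above can be sketched as a single gradient step of a tiny tied-weight denoising autoencoder with masking noise; the sizes, learning rate, and squared-error loss are illustrative assumptions, not details fixed by the tutorial.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

def dae_step(x, W, b, c, corruption=0.3, lr=0.1):
    """One gradient step of a tied-weight denoising autoencoder
    with masking noise and squared-error loss."""
    mask = rng.random(x.shape) >= corruption
    x_tilde = x * mask                # corrupt: zero out random coordinates
    h = sigmoid(x_tilde @ W + b)      # encode the *corrupted* input
    x_hat = sigmoid(h @ W.T + c)      # decode
    # Backprop of L = ||x - x_hat||^2 through the sigmoid units
    d_out = (x_hat - x) * x_hat * (1 - x_hat)
    d_hid = (d_out @ W) * h * (1 - h)
    W -= lr * (x_tilde[:, None] * d_hid[None, :]
               + (h[:, None] * d_out[None, :]).T)
    b -= lr * d_hid
    c -= lr * d_out
    return np.sum((x - x_hat) ** 2)   # loss against the *clean* input

x = rng.integers(0, 2, size=12).astype(float)   # toy binary input
W = 0.1 * rng.standard_normal((12, 6))
b, c = np.zeros(6), np.zeros(12)
losses = [dae_step(x, W, b, c) for _ in range(200)]
```

The point the slide makes is visible in the loss signature: the network sees only x_tilde but is scored against the clean x, which is what forces it to denoise rather than merely copy.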

Denoising Autoencoder [Vincent et al., ICML 2008]. Learns a vector field pointing toward higher-probability regions; minimizes a variational lower bound on a generative model; corresponds to regularized score matching on an RBM. Slide credit: Yoshua Bengio

Stacked Denoising Autoencoders. Greedy layer-wise learning: start with the lowest level and stack upwards; train each layer of the autoencoder on the intermediate code (features) from the layer below. The top layer can have a different output (e.g., softmax nonlinearity) to provide an output for classification. Slide credit: Yoshua Bengio

Stacked Denoising Autoencoders. No partition function, so the training criterion can be measured directly; the encoder and decoder admit any parametrization; performs as well as or better than stacking RBMs for unsupervised pre-training (Infinite MNIST). Slide credit: Yoshua Bengio

Denoising Autoencoders: Benchmarks (Larochelle et al., 2009). Slide credit: Yoshua Bengio

Denoising Autoencoders: Results. Test errors on the benchmarks (Larochelle et al., 2009). Slide credit: Yoshua Bengio

Why Greedy Layer-Wise Training Works (Bengio, 2009; Erhan et al., 2009). Regularization hypothesis: pre-training constrains the parameters to a region relevant to the unsupervised dataset, giving better generalization (representations that better describe unlabeled data are more discriminative for labeled data). Optimization hypothesis: unsupervised training initializes the lower-level parameters near localities of better minima than random initialization can. Slide credit: Yoshua Bengio

Outline: Restricted Boltzmann machines; Deep Belief Networks; Denoising Autoencoders; Applications to Vision; Applications to Audio and Multimodal Data

Convolutional Neural Networks (LeCun et al., 1989): local receptive fields, weight sharing, pooling.

Deep Convolutional Architectures: state-of-the-art on MNIST digits, Caltech-101 objects, etc. Slide credit: Yann LeCun

Learning object representations: learning objects and parts in images. Large image patches contain interesting higher-level structures, e.g., object parts and full objects.

Illustration: learning an eye detector. Apply filters (filter1, …, filter4) to an example image, then shrink the filtering output (max over 2x2) to obtain an eye detector. Advantages of shrinking: 1. the filter size is kept small; 2. invariance.
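The "max over 2x2" shrinking step is ordinary non-overlapping max pooling, which can be sketched in one reshape; the filter-response values below are made up for illustration.

```python
import numpy as np

def max_pool_2x2(F):
    """Shrink a filter-response map by taking the max over
    non-overlapping 2x2 blocks (gives local translation invariance)."""
    H, W = F.shape
    assert H % 2 == 0 and W % 2 == 0
    # Group rows and columns into pairs, then reduce each 2x2 block.
    return F.reshape(H // 2, 2, W // 2, 2).max(axis=(1, 3))

F = np.array([[1, 2, 0, 0],
              [3, 4, 0, 1],
              [0, 0, 9, 2],
              [0, 5, 3, 1]])
P = max_pool_2x2(F)
# P == [[4, 1],
#       [5, 9]]
```

Shifting a strong response by one pixel inside its 2x2 block leaves P unchanged, which is exactly the invariance the slide advertises.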

Convolutional RBM (CRBM) [Lee et al., ICML 2009] [Related work: Norouzi et al., CVPR 2009; Desjardins and Bengio, 2008]. For each filter k, the input (visible layer V) is convolved with shared filter weights W^k to give a detection layer H (binary hidden nodes), which feeds a max-pooling layer P (binary max-pooling nodes). Constraint: within each pooling block, at most one hidden node is 1 (active). Key properties: convolutional structure; probabilistic max-pooling ("mutual exclusion").
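The "at most one active unit per block" constraint makes each pooling block a softmax over its detection units plus an extra all-off state; a sketch for a single block, with made-up bottom-up energies:

```python
import numpy as np

def prob_max_pool_block(energies):
    """Probabilistic max-pooling over one block of detection units.

    energies: bottom-up inputs I(h) to the detection units in the block.
    At most one unit may be on; an extra state (energy 0) represents
    'all units off', i.e., the pooling unit is off. Returns the per-unit
    on-probabilities and P(pooling unit = on).
    """
    e = np.exp(np.append(energies, 0.0))  # last entry: all-off state
    p = e / e.sum()
    return p[:-1], 1.0 - p[-1]

p_units, p_pool = prob_max_pool_block(np.array([1.0, 2.0, 0.5, 0.1]))
# p_units and the all-off probability sum to 1; p_pool = sum(p_units)
```

This is why the pooling is called probabilistic: unlike the hard max on the previous slide, the block assigns a proper distribution over which (if any) detection unit fires, so it still fits inside Gibbs sampling.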

Convolutional deep belief networks, illustration: input image → layer 1 (filters W^1; filter visualization; layer 1 activations/coefficients) → layer 2 (W^2; layer 2 activations) → layer 3 (W^3; layer 3 activations).

Unsupervised learning from natural images: first-layer bases are localized, oriented edges; second-layer bases capture contours, corners, arcs, and surface boundaries.

Unsupervised learning of object parts: faces, cars, elephants, chairs.

Image classification with Spatial Pyramids [Lazebnik et al., CVPR 2006; Yang et al., CVPR 2009]. Descriptor layer: detect and locate features, extract the corresponding descriptors (e.g., SIFT). Code layer: code the descriptors; with vector quantization (VQ) each code has only one non-zero element, while with soft-VQ a small group of elements can be non-zero. SPM layer: pool codes across subregions and average/normalize into a histogram. Slide credit: Kai Yu
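The hard-VQ coding and average-pooling steps above can be sketched directly; the 2-D codebook and random descriptors are toy stand-ins for SIFT descriptors and a learned dictionary.

```python
import numpy as np

def vq_code(descriptors, codebook):
    """Hard vector quantization: each descriptor becomes a one-hot code
    over the codebook (exactly one non-zero element per code)."""
    # Squared distance from every descriptor to every codeword
    d = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    codes = np.zeros((len(descriptors), len(codebook)))
    codes[np.arange(len(descriptors)), d.argmin(axis=1)] = 1.0
    return codes

def pool_histogram(codes):
    """Average-pool codes over a (sub)region into a normalized histogram."""
    h = codes.mean(axis=0)
    return h / h.sum()

rng = np.random.default_rng(0)
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]])  # toy dictionary
descriptors = rng.random((10, 2))                          # toy descriptors
codes = vq_code(descriptors, codebook)
hist = pool_histogram(codes)
```

Soft-VQ would replace the one-hot assignment with weights over a few nearby codewords; the pooling step is unchanged, which is why the coding step is the natural place to improve the pipeline.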

Improving the coding step. Classifiers using these features need nonlinear kernels (Lazebnik et al., CVPR 2006; Grauman and Darrell, JMLR 2007), which have high computational complexity. Idea: modify the coding step to produce feature representations that linear classifiers can use effectively: sparse coding [Olshausen & Field, Nature 1996; Lee et al., NIPS 2007; Yang et al., CVPR 2009; Boureau et al., CVPR 2010]; local coordinate coding [Yu et al., NIPS 2009; Wang et al., CVPR 2010]; RBMs [Sohn, Jung, Lee, Hero III, ICCV 2011]; and other feature learning algorithms. Slide credit: Kai Yu

Object Recognition Results: classification accuracy on Caltech 101, as a function of the number of training images, comparing Zhang et al. (CVPR), Griffin et al., ScSPM [Yang et al., CVPR 2009], LLC [Wang et al., CVPR 2010], Macrofeatures [Boureau et al., CVPR 2010], Boureau et al. (ICCV), sparse RBM [Sohn et al., ICCV 2011], and sparse CRBM [Sohn et al., ICCV 2011]. Competitive performance with other state-of-the-art methods using a single type of feature.

Object Recognition Results: classification accuracy on Caltech 256, as a function of the number of training images, comparing Griffin et al., van Gemert et al. (PAMI), ScSPM [Yang et al., CVPR 2009], LLC [Wang et al., CVPR 2010], and sparse CRBM [Sohn et al., ICCV 2011]. Competitive performance with other state-of-the-art methods using a single type of feature.

Local Convolutional RBM [Huang, Lee, Learned-Miller, CVPR 2012]: models convolutional structures in local regions jointly; statistically more effective for learning from nonstationary (roughly aligned) images.

Face Verification [Huang, Lee, Learned-Miller, CVPR 2012]. Feature learning: a convolutional deep belief network with 1 to 2 layers of CRBM/local CRBM, trained on the Kyoto image set (self-taught learning); input representation: whitened pixel intensity/LBP; output representation: the top-most pooling layer. Verification: metric learning + SVM on LFW face pairs, predicting matched vs. mismatched.

Face Verification [Huang, Lee, Learned-Miller, CVPR 2012]: LFW accuracy (± SE) compared across V1-like with MKL (Pinto et al., CVPR 2009), linear rectified units (Nair and Hinton, ICML 2010), CSML (Nguyen & Bai, ACCV 2010), learning-based descriptor (Cao et al., CVPR 2010), OSS/TSS full and OSS only (Wolf et al., ACCV 2009), and the combined LBP + deep learning features.

Tiled Convolutional models. Idea: apply one subset of filters to one set of locations, another subset to another set of locations, and so on; train all parameters jointly (Gregor & LeCun, arXiv 2010; Ranzato, Mnih, Hinton, NIPS 2010). No block artifacts; reduced redundancy.

Tiled Convolutional models, second stage: treat these units as data and train a similar model on top; a field of binary RBMs, where each hidden unit has a receptive field of 30x30 pixels in input space.

Facial Expression Recognition (Ranzato et al., CVPR 2011). Toronto Face Dataset (J. Susskind et al., 2010): ~100K unlabeled faces from different sources; ~4K labeled images; resolution 48x48 pixels; 7 facial expressions (e.g., neutral, sadness, surprise).

Facial Expression Recognition: drawing samples from the model (5th layer with 128 hidden units); the model is a stack with a gMRF first layer topped with RBM layers up to the 5th-layer RBM.

Facial Expression Recognition: samples drawn from the model (5th layer with 128 hidden units). (Ranzato et al., CVPR 2011)

Facial Expression Recognition with occlusions: 7 types of synthetic occlusion; use the generative model to fill in the missing pixels (conditional on the known pixels). (Ranzato et al., CVPR 2011)

Facial Expression Recognition, restoring occluded inputs (Ranzato et al., CVPR 2011):
- Type 1 occlusion (eyes): originals vs. restored images
- Type 2 occlusion (mouth): originals vs. restored images
- Type 3 occlusion (right half): originals vs. restored images
- Type 5 occlusion (top half): originals vs. restored images
- Type 7 occlusion (70% of pixels at random): originals vs. restored images

Facial Expression Recognition with occluded images for both training and test; compared against Dailey et al. (J. Cog. Neurosci.) and Wright et al. (PAMI). (Ranzato et al., CVPR 2011)

Outline: Restricted Boltzmann machines; Deep Belief Networks; Denoising Autoencoders; Applications to Vision; Applications to Audio and Multimodal Data

Motivation: multi-modal learning. Single learning algorithms that combine multiple input domains: images, audio & speech, video, text, robotic sensors, time-series data, and others.

Motivation: multi-modal learning. Benefits: more robust performance in multimedia processing, biomedical data mining (fMRI, PET scan, X-ray, EEG, ultrasound), and robot perception (visible-light images, audio, thermal infrared, camera arrays, 3D range scans).

Sparse dictionary learning on audio: a spectrogram patch x is approximated as a sparse linear combination of learned bases (e.g., x ≈ 0.9 x one basis plus small contributions from a few others). [Lee, Largman, Pham, Ng, NIPS 2009]

Convolutional DBN for audio: the spectrogram (frequency x time) is the input; detection nodes convolve filters over time, and each max-pooling node pools a group of detection nodes. [Lee, Largman, Pham, Ng, NIPS 2009]

Convolutional DBN for audio, stacked: one CDBN layer (detection nodes + max pooling) followed by a second CDBN layer (detection nodes + max pooling). [Lee, Largman, Pham, Ng, NIPS 2009]

CDBNs for speech: trained on the unlabeled TIMIT corpus; learned first-layer bases. [Lee, Largman, Pham, Ng, NIPS 2009]

Comparison of the learned first-layer bases to phonemes ("oy", "el", "s").

Experimental Results.
Speaker identification (TIMIT accuracy): prior art (Reynolds, 1995): 99.7%; convolutional DBN: 100.0%.
Phone classification (TIMIT accuracy): Clarkson et al. (1999): 77.6%; Petrov et al. (2007): 78.6%; Sha & Saul (2006): 78.9%; Yu et al. (2009): 79.2%; convolutional DBN: 80.3%; transformation-invariant RBM (Sohn et al., ICML 2012): 81.5%.
[Lee, Pham, Largman, Ng, NIPS 2009]

Phone recognition using the mcRBM: mean-covariance RBM + DBN. [Dahl, Ranzato, Mohamed, Hinton, NIPS 2010]

Speech Recognition on TIMIT (phone error rate):
- Stochastic segmental models: 36.0%
- Conditional random field: 34.8%
- Large-margin GMM: 33.0%
- CD-HMM: 27.3%
- Augmented conditional random fields: 26.6%
- Recurrent neural nets: 26.1%
- Bayesian triphone HMM: 25.6%
- Monophone HTMs: 24.8%
- Heterogeneous classifiers: 24.4%
- Deep Belief Networks (DBNs): 23.0%
- Triphone HMMs discriminatively trained w/ BMMI: 22.7%
- Deep Belief Networks with mcRBM feature extraction: 20.5% (Dahl et al., NIPS 2010)

Multimodal Feature Learning: lip reading via multimodal feature learning (audio/visual data). Slide credit: Jiquan Ngiam

Multimodal Feature Learning: lip reading via multimodal feature learning (audio/visual data). Q: Is concatenating the modalities the best option? Slide credit: Jiquan Ngiam

Multimodal Feature Learning: concatenating the modalities and learning features (via a single layer) doesn't work; mostly unimodal features are learned.

Multimodal Feature Learning: bimodal autoencoder. Idea: predict the unseen modality from the observed modality (Ngiam et al., ICML 2011).

Multimodal Feature Learning: visualization of learned filters; audio (spectrogram) and video features learned over 100 ms windows. Results on the AVLetters lip-reading dataset: prior art (Zhao et al., 2009): 58.9%; multimodal deep autoencoder (Ngiam et al., 2011): 65.8%.

Summary. Learning feature representations: Restricted Boltzmann Machines, Deep Belief Networks, Stacked Denoising Autoencoders. Deep learning and unsupervised feature learning algorithms show promising results in many applications: vision, audio, multimodal data, and others.

Thank you!


More information

Learning multiple layers of representation

Learning multiple layers of representation Review TRENDS in Cognitive Sciences Vol.11 No.10 Learning multiple layers of representation Geoffrey E. Hinton Department of Computer Science, University of Toronto, 10 King s College Road, Toronto, M5S

More information

Lecture 6: Classification & Localization. boris. [email protected]

Lecture 6: Classification & Localization. boris. ginzburg@intel.com Lecture 6: Classification & Localization boris. [email protected] 1 Agenda ILSVRC 2014 Overfeat: integrated classification, localization, and detection Classification with Localization Detection. 2 ILSVRC-2014

More information

Introduction to Deep Learning Variational Inference, Mean Field Theory

Introduction to Deep Learning Variational Inference, Mean Field Theory Introduction to Deep Learning Variational Inference, Mean Field Theory 1 Iasonas Kokkinos [email protected] Center for Visual Computing Ecole Centrale Paris Galen Group INRIA-Saclay Lecture 3: recap

More information

A fast learning algorithm for deep belief nets

A fast learning algorithm for deep belief nets A fast learning algorithm for deep belief nets Geoffrey E. Hinton and Simon Osindero Department of Computer Science University of Toronto 10 Kings College Road Toronto, Canada M5S 3G4 {hinton, osindero}@cs.toronto.edu

More information

Deep learning applications and challenges in big data analytics

Deep learning applications and challenges in big data analytics Najafabadi et al. Journal of Big Data (2015) 2:1 DOI 10.1186/s40537-014-0007-7 RESEARCH Open Access Deep learning applications and challenges in big data analytics Maryam M Najafabadi 1, Flavio Villanustre

More information

LEUKEMIA CLASSIFICATION USING DEEP BELIEF NETWORK

LEUKEMIA CLASSIFICATION USING DEEP BELIEF NETWORK Proceedings of the IASTED International Conference Artificial Intelligence and Applications (AIA 2013) February 11-13, 2013 Innsbruck, Austria LEUKEMIA CLASSIFICATION USING DEEP BELIEF NETWORK Wannipa

More information

Image Classification for Dogs and Cats

Image Classification for Dogs and Cats Image Classification for Dogs and Cats Bang Liu, Yan Liu Department of Electrical and Computer Engineering {bang3,yan10}@ualberta.ca Kai Zhou Department of Computing Science [email protected] Abstract

More information

Simple and efficient online algorithms for real world applications

Simple and efficient online algorithms for real world applications Simple and efficient online algorithms for real world applications Università degli Studi di Milano Milano, Italy Talk @ Centro de Visión por Computador Something about me PhD in Robotics at LIRA-Lab,

More information

To Recognize Shapes, First Learn to Generate Images

To Recognize Shapes, First Learn to Generate Images Department of Computer Science 6 King s College Rd, Toronto University of Toronto M5S 3G4, Canada http://learning.cs.toronto.edu fax: +1 416 978 1455 Copyright c Geoffrey Hinton 2006. October 26, 2006

More information

Recognition. Sanja Fidler CSC420: Intro to Image Understanding 1 / 28

Recognition. Sanja Fidler CSC420: Intro to Image Understanding 1 / 28 Recognition Topics that we will try to cover: Indexing for fast retrieval (we still owe this one) History of recognition techniques Object classification Bag-of-words Spatial pyramids Neural Networks Object

More information

called Restricted Boltzmann Machines for Collaborative Filtering

called Restricted Boltzmann Machines for Collaborative Filtering Restricted Boltzmann Machines for Collaborative Filtering Ruslan Salakhutdinov [email protected] Andriy Mnih [email protected] Geoffrey Hinton [email protected] University of Toronto, 6 King

More information

Why Does Unsupervised Pre-training Help Deep Learning?

Why Does Unsupervised Pre-training Help Deep Learning? Journal of Machine Learning Research 11 (2010) 625-660 Submitted 8/09; Published 2/10 Why Does Unsupervised Pre-training Help Deep Learning? Dumitru Erhan Yoshua Bengio Aaron Courville Pierre-Antoine Manzagol

More information

arxiv:1312.6062v2 [cs.lg] 9 Apr 2014

arxiv:1312.6062v2 [cs.lg] 9 Apr 2014 Stopping Criteria in Contrastive Divergence: Alternatives to the Reconstruction Error arxiv:1312.6062v2 [cs.lg] 9 Apr 2014 David Buchaca Prats Departament de Llenguatges i Sistemes Informàtics, Universitat

More information

A Practical Guide to Training Restricted Boltzmann Machines

A Practical Guide to Training Restricted Boltzmann Machines Department of Computer Science 6 King s College Rd, Toronto University of Toronto M5S 3G4, Canada http://learning.cs.toronto.edu fax: +1 416 978 1455 Copyright c Geoffrey Hinton 2010. August 2, 2010 UTML

More information

Efficient Learning of Sparse Representations with an Energy-Based Model

Efficient Learning of Sparse Representations with an Energy-Based Model Efficient Learning of Sparse Representations with an Energy-Based Model Marc Aurelio Ranzato Christopher Poultney Sumit Chopra Yann LeCun Courant Institute of Mathematical Sciences New York University,

More information

Probabilistic Latent Semantic Analysis (plsa)

Probabilistic Latent Semantic Analysis (plsa) Probabilistic Latent Semantic Analysis (plsa) SS 2008 Bayesian Networks Multimedia Computing, Universität Augsburg [email protected] www.multimedia-computing.{de,org} References

More information

Administrivia. Traditional Recognition Approach. Overview. CMPSCI 370: Intro. to Computer Vision Deep learning

Administrivia. Traditional Recognition Approach. Overview. CMPSCI 370: Intro. to Computer Vision Deep learning : Intro. to Computer Vision Deep learning University of Massachusetts, Amherst April 19/21, 2016 Instructor: Subhransu Maji Finals (everyone) Thursday, May 5, 1-3pm, Hasbrouck 113 Final exam Tuesday, May

More information

Machine Learning and AI via Brain simulations

Machine Learning and AI via Brain simulations Machine Learning and AI via Brain simulations Stanford University & Google Thanks to: Stanford: Adam Coates Quoc Le Honglak Lee Andrew Saxe Andrew Maas Chris Manning Jiquan Ngiam Richard Socher Will Zou

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! [email protected]! http://www.cs.toronto.edu/~rsalakhu/ Lecture 6 Three Approaches to Classification Construct

More information

CS 2750 Machine Learning. Lecture 1. Machine Learning. http://www.cs.pitt.edu/~milos/courses/cs2750/ CS 2750 Machine Learning.

CS 2750 Machine Learning. Lecture 1. Machine Learning. http://www.cs.pitt.edu/~milos/courses/cs2750/ CS 2750 Machine Learning. Lecture Machine Learning Milos Hauskrecht [email protected] 539 Sennott Square, x5 http://www.cs.pitt.edu/~milos/courses/cs75/ Administration Instructor: Milos Hauskrecht [email protected] 539 Sennott

More information

Transfer Learning for Latin and Chinese Characters with Deep Neural Networks

Transfer Learning for Latin and Chinese Characters with Deep Neural Networks Transfer Learning for Latin and Chinese Characters with Deep Neural Networks Dan C. Cireşan IDSIA USI-SUPSI Manno, Switzerland, 6928 Email: [email protected] Ueli Meier IDSIA USI-SUPSI Manno, Switzerland, 6928

More information

Deep Learning for Big Data

Deep Learning for Big Data Deep Learning for Big Data Yoshua Bengio Département d Informa0que et Recherche Opéra0onnelle, U. Montréal 30 May 2013, Journée de la recherche École Polytechnique, Montréal Big Data & Data Science Super-

More information

Naive-Deep Face Recognition: Touching the Limit of LFW Benchmark or Not?

Naive-Deep Face Recognition: Touching the Limit of LFW Benchmark or Not? Naive-Deep Face Recognition: Touching the Limit of LFW Benchmark or Not? Erjin Zhou [email protected] Zhimin Cao [email protected] Qi Yin [email protected] Abstract Face recognition performance improves rapidly

More information

Steven C.H. Hoi School of Information Systems Singapore Management University Email: [email protected]

Steven C.H. Hoi School of Information Systems Singapore Management University Email: chhoi@smu.edu.sg Steven C.H. Hoi School of Information Systems Singapore Management University Email: [email protected] Introduction http://stevenhoi.org/ Finance Recommender Systems Cyber Security Machine Learning Visual

More information

Applying Deep Learning to Enhance Momentum Trading Strategies in Stocks

Applying Deep Learning to Enhance Momentum Trading Strategies in Stocks This version: December 12, 2013 Applying Deep Learning to Enhance Momentum Trading Strategies in Stocks Lawrence Takeuchi * Yu-Ying (Albert) Lee [email protected] [email protected] Abstract We

More information

Sense Making in an IOT World: Sensor Data Analysis with Deep Learning

Sense Making in an IOT World: Sensor Data Analysis with Deep Learning Sense Making in an IOT World: Sensor Data Analysis with Deep Learning Natalia Vassilieva, PhD Senior Research Manager GTC 2016 Deep learning proof points as of today Vision Speech Text Other Search & information

More information

Research on the Anti-perspective Correction Algorithm of QR Barcode

Research on the Anti-perspective Correction Algorithm of QR Barcode Researc on te Anti-perspective Correction Algoritm of QR Barcode Jianua Li, Yi-Wen Wang, YiJun Wang,Yi Cen, Guoceng Wang Key Laboratory of Electronic Tin Films and Integrated Devices University of Electronic

More information

Local features and matching. Image classification & object localization

Local features and matching. Image classification & object localization Overview Instance level search Local features and matching Efficient visual recognition Image classification & object localization Category recognition Image classification: assigning a class label to

More information

Supervised Feature Selection & Unsupervised Dimensionality Reduction

Supervised Feature Selection & Unsupervised Dimensionality Reduction Supervised Feature Selection & Unsupervised Dimensionality Reduction Feature Subset Selection Supervised: class labels are given Select a subset of the problem features Why? Redundant features much or

More information

Deep Learning Face Representation by Joint Identification-Verification

Deep Learning Face Representation by Joint Identification-Verification Deep Learning Face Representation by Joint Identification-Verification Yi Sun 1 Yuheng Chen 2 Xiaogang Wang 3,4 Xiaoou Tang 1,4 1 Department of Information Engineering, The Chinese University of Hong Kong

More information

arxiv:1312.6034v2 [cs.cv] 19 Apr 2014

arxiv:1312.6034v2 [cs.cv] 19 Apr 2014 Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps arxiv:1312.6034v2 [cs.cv] 19 Apr 2014 Karen Simonyan Andrea Vedaldi Andrew Zisserman Visual Geometry Group,

More information

Lecture 2: The SVM classifier

Lecture 2: The SVM classifier Lecture 2: The SVM classifier C19 Machine Learning Hilary 2015 A. Zisserman Review of linear classifiers Linear separability Perceptron Support Vector Machine (SVM) classifier Wide margin Cost function

More information

Stochastic Pooling for Regularization of Deep Convolutional Neural Networks

Stochastic Pooling for Regularization of Deep Convolutional Neural Networks Stochastic Pooling for Regularization of Deep Convolutional Neural Networks Matthew D. Zeiler Department of Computer Science Courant Institute, New York University [email protected] Rob Fergus Department

More information

Learning to Process Natural Language in Big Data Environment

Learning to Process Natural Language in Big Data Environment CCF ADL 2015 Nanchang Oct 11, 2015 Learning to Process Natural Language in Big Data Environment Hang Li Noah s Ark Lab Huawei Technologies Part 1: Deep Learning - Present and Future Talk Outline Overview

More information

Modelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches

Modelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches Modelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches PhD Thesis by Payam Birjandi Director: Prof. Mihai Datcu Problematic

More information

Recognizing Cats and Dogs with Shape and Appearance based Models. Group Member: Chu Wang, Landu Jiang

Recognizing Cats and Dogs with Shape and Appearance based Models. Group Member: Chu Wang, Landu Jiang Recognizing Cats and Dogs with Shape and Appearance based Models Group Member: Chu Wang, Landu Jiang Abstract Recognizing cats and dogs from images is a challenging competition raised by Kaggle platform

More information

DeepFace: Closing the Gap to Human-Level Performance in Face Verification

DeepFace: Closing the Gap to Human-Level Performance in Face Verification DeepFace: Closing the Gap to Human-Level Performance in Face Verification Yaniv Taigman Ming Yang Marc Aurelio Ranzato Facebook AI Research Menlo Park, CA, USA {yaniv, mingyang, ranzato}@fb.com Lior Wolf

More information

Latent variable and deep modeling with Gaussian processes; application to system identification. Andreas Damianou

Latent variable and deep modeling with Gaussian processes; application to system identification. Andreas Damianou Latent variable and deep modeling with Gaussian processes; application to system identification Andreas Damianou Department of Computer Science, University of Sheffield, UK Brown University, 17 Feb. 2016

More information

Group Sparse Coding. Fernando Pereira Google Mountain View, CA [email protected]. Dennis Strelow Google Mountain View, CA strelow@google.

Group Sparse Coding. Fernando Pereira Google Mountain View, CA pereira@google.com. Dennis Strelow Google Mountain View, CA strelow@google. Group Sparse Coding Samy Bengio Google Mountain View, CA [email protected] Fernando Pereira Google Mountain View, CA [email protected] Yoram Singer Google Mountain View, CA [email protected] Dennis Strelow

More information

Module 5. Deep Convnets for Local Recognition Joost van de Weijer 4 April 2016

Module 5. Deep Convnets for Local Recognition Joost van de Weijer 4 April 2016 Module 5 Deep Convnets for Local Recognition Joost van de Weijer 4 April 2016 Previously, end-to-end.. Dog Slide credit: Jose M 2 Previously, end-to-end.. Dog Learned Representation Slide credit: Jose

More information

Introduction to Machine Learning and Data Mining. Prof. Dr. Igor Trajkovski [email protected]

Introduction to Machine Learning and Data Mining. Prof. Dr. Igor Trajkovski trajkovski@nyus.edu.mk Introduction to Machine Learning and Data Mining Prof. Dr. Igor Trakovski [email protected] Neural Networks 2 Neural Networks Analogy to biological neural systems, the most robust learning systems

More information

Contractive Auto-Encoders: Explicit Invariance During Feature Extraction

Contractive Auto-Encoders: Explicit Invariance During Feature Extraction : Explicit Invariance During Feature Extraction Salah Rifai (1) Pascal Vincent (1) Xavier Muller (1) Xavier Glorot (1) Yoshua Bengio (1) (1) Dept. IRO, Université de Montréal. Montréal (QC), H3C 3J7, Canada

More information

NAVIGATING SCIENTIFIC LITERATURE A HOLISTIC PERSPECTIVE. Venu Govindaraju

NAVIGATING SCIENTIFIC LITERATURE A HOLISTIC PERSPECTIVE. Venu Govindaraju NAVIGATING SCIENTIFIC LITERATURE A HOLISTIC PERSPECTIVE Venu Govindaraju BIOMETRICS DOCUMENT ANALYSIS PATTERN RECOGNITION 8/24/2015 ICDAR- 2015 2 Towards a Globally Optimal Approach for Learning Deep Unsupervised

More information

Unsupervised and Transfer Learning Challenge: a Deep Learning Approach

Unsupervised and Transfer Learning Challenge: a Deep Learning Approach JMLR: Workshop and Conference Proceedings 27:97 111, 2012 Workshop on Unsupervised and Transfer Learning Unsupervised and Transfer Learning Challenge: a Deep Learning Approach Grégoire Mesnil 1,2 [email protected]

More information

MVA ENS Cachan. Lecture 2: Logistic regression & intro to MIL Iasonas Kokkinos [email protected]

MVA ENS Cachan. Lecture 2: Logistic regression & intro to MIL Iasonas Kokkinos Iasonas.kokkinos@ecp.fr Machine Learning for Computer Vision 1 MVA ENS Cachan Lecture 2: Logistic regression & intro to MIL Iasonas Kokkinos [email protected] Department of Applied Mathematics Ecole Centrale Paris Galen

More information

Big Data Deep Learning: Challenges and Perspectives

Big Data Deep Learning: Challenges and Perspectives Big Data Deep Learning: Challenges and Perspectives D.saraswathy Department of computer science and engineering IFET college of engineering Villupuram [email protected] Abstract Deep

More information

Simplified Machine Learning for CUDA. Umar Arshad @arshad_umar Arrayfire @arrayfire

Simplified Machine Learning for CUDA. Umar Arshad @arshad_umar Arrayfire @arrayfire Simplified Machine Learning for CUDA Umar Arshad @arshad_umar Arrayfire @arrayfire ArrayFire CUDA and OpenCL experts since 2007 Headquartered in Atlanta, GA In search for the best and the brightest Expert

More information

Learning Invariant Features through Topographic Filter Maps

Learning Invariant Features through Topographic Filter Maps Learning Invariant Features through Topographic Filter Maps Koray Kavukcuoglu Marc Aurelio Ranzato Rob Fergus Yann LeCun Courant Institute of Mathematical Sciences New York University {koray,ranzato,fergus,yann}@cs.nyu.edu

More information

Automatic 3D Reconstruction via Object Detection and 3D Transformable Model Matching CS 269 Class Project Report

Automatic 3D Reconstruction via Object Detection and 3D Transformable Model Matching CS 269 Class Project Report Automatic 3D Reconstruction via Object Detection and 3D Transformable Model Matching CS 69 Class Project Report Junhua Mao and Lunbo Xu University of California, Los Angeles [email protected] and lunbo

More information

A Simple Introduction to Support Vector Machines

A Simple Introduction to Support Vector Machines A Simple Introduction to Support Vector Machines Martin Law Lecture for CSE 802 Department of Computer Science and Engineering Michigan State University Outline A brief history of SVM Large-margin linear

More information

Machine Learning in Computer Vision A Tutorial. Ajay Joshi, Anoop Cherian and Ravishankar Shivalingam Dept. of Computer Science, UMN

Machine Learning in Computer Vision A Tutorial. Ajay Joshi, Anoop Cherian and Ravishankar Shivalingam Dept. of Computer Science, UMN Machine Learning in Computer Vision A Tutorial Ajay Joshi, Anoop Cherian and Ravishankar Shivalingam Dept. of Computer Science, UMN Outline Introduction Supervised Learning Unsupervised Learning Semi-Supervised

More information

Non-negative Matrix Factorization (NMF) in Semi-supervised Learning Reducing Dimension and Maintaining Meaning

Non-negative Matrix Factorization (NMF) in Semi-supervised Learning Reducing Dimension and Maintaining Meaning Non-negative Matrix Factorization (NMF) in Semi-supervised Learning Reducing Dimension and Maintaining Meaning SAMSI 10 May 2013 Outline Introduction to NMF Applications Motivations NMF as a middle step

More information

Learning Deep Face Representation

Learning Deep Face Representation Learning Deep Face Representation Haoqiang Fan Megvii Inc. [email protected] Zhimin Cao Megvii Inc. [email protected] Yuning Jiang Megvii Inc. [email protected] Qi Yin Megvii Inc. [email protected] Chinchilla Doudou

More information

Deep Learning for Multivariate Financial Time Series. Gilberto Batres-Estrada

Deep Learning for Multivariate Financial Time Series. Gilberto Batres-Estrada Deep Learning for Multivariate Financial Time Series Gilberto Batres-Estrada June 4, 2015 Abstract Deep learning is a framework for training and modelling neural networks which recently have surpassed

More information

An Introduction to Deep Learning

An Introduction to Deep Learning Thought Leadership Paper Predictive Analytics An Introduction to Deep Learning Examining the Advantages of Hierarchical Learning Table of Contents 4 The Emergence of Deep Learning 7 Applying Deep-Learning

More information

Evaluating probabilities under high-dimensional latent variable models

Evaluating probabilities under high-dimensional latent variable models Evaluating probabilities under ig-dimensional latent variable models Iain Murray and Ruslan alakutdinov Department of Computer cience University of oronto oronto, ON. M5 3G4. Canada. {murray,rsalaku}@cs.toronto.edu

More information

Optical Flow. Shenlong Wang CSC2541 Course Presentation Feb 2, 2016

Optical Flow. Shenlong Wang CSC2541 Course Presentation Feb 2, 2016 Optical Flow Shenlong Wang CSC2541 Course Presentation Feb 2, 2016 Outline Introduction Variation Models Feature Matching Methods End-to-end Learning based Methods Discussion Optical Flow Goal: Pixel motion

More information

Recurrent Neural Networks

Recurrent Neural Networks Recurrent Neural Networks Neural Computation : Lecture 12 John A. Bullinaria, 2015 1. Recurrent Neural Network Architectures 2. State Space Models and Dynamical Systems 3. Backpropagation Through Time

More information

DeepFace: Closing the Gap to Human-Level Performance in Face Verification

DeepFace: Closing the Gap to Human-Level Performance in Face Verification DeepFace: Closing the Gap to Human-Level Performance in Face Verification Yaniv Taigman Ming Yang Marc Aurelio Ranzato Facebook AI Group Menlo Park, CA, USA {yaniv, mingyang, ranzato}@fb.com Lior Wolf

More information

Semi-Supervised Support Vector Machines and Application to Spam Filtering

Semi-Supervised Support Vector Machines and Application to Spam Filtering Semi-Supervised Support Vector Machines and Application to Spam Filtering Alexander Zien Empirical Inference Department, Bernhard Schölkopf Max Planck Institute for Biological Cybernetics ECML 2006 Discovery

More information

Learning Mid-Level Features For Recognition

Learning Mid-Level Features For Recognition Learning Mid-Level Features For Recognition Y-Lan Boureau 1,3,4 Francis Bach 1,4 Yann LeCun 3 Jean Ponce 2,4 1 INRIA 2 Ecole Normale Supérieure 3 Courant Institute, New York University Abstract Many successful

More information

Deep Learning of Invariant Features via Simulated Fixations in Video

Deep Learning of Invariant Features via Simulated Fixations in Video Deep Learning of Invariant Features via Simulated Fixations in Video Will Y. Zou 1, Shenghuo Zhu 3, Andrew Y. Ng 2, Kai Yu 3 1 Department of Electrical Engineering, Stanford University, CA 2 Department

More information

Research Article Distributed Data Mining Based on Deep Neural Network for Wireless Sensor Network

Research Article Distributed Data Mining Based on Deep Neural Network for Wireless Sensor Network Distributed Sensor Networks Volume 2015, Article ID 157453, 7 pages http://dx.doi.org/10.1155/2015/157453 Research Article Distributed Data Mining Based on Deep Neural Network for Wireless Sensor Network

More information

The Role of Size Normalization on the Recognition Rate of Handwritten Numerals

The Role of Size Normalization on the Recognition Rate of Handwritten Numerals The Role of Size Normalization on the Recognition Rate of Handwritten Numerals Chun Lei He, Ping Zhang, Jianxiong Dong, Ching Y. Suen, Tien D. Bui Centre for Pattern Recognition and Machine Intelligence,

More information

degrees of freedom and are able to adapt to the task they are supposed to do [Gupta].

degrees of freedom and are able to adapt to the task they are supposed to do [Gupta]. 1.3 Neural Networks 19 Neural Networks are large structured systems of equations. These systems have many degrees of freedom and are able to adapt to the task they are supposed to do [Gupta]. Two very

More information

Introduction to Machine Learning Lecture 1. Mehryar Mohri Courant Institute and Google Research [email protected]

Introduction to Machine Learning Lecture 1. Mehryar Mohri Courant Institute and Google Research mohri@cims.nyu.edu Introduction to Machine Learning Lecture 1 Mehryar Mohri Courant Institute and Google Research [email protected] Introduction Logistics Prerequisites: basics concepts needed in probability and statistics

More information

Università degli Studi di Bologna

Università degli Studi di Bologna Università degli Studi di Bologna DEIS Biometric System Laboratory Incremental Learning by Message Passing in Hierarchical Temporal Memory Davide Maltoni Biometric System Laboratory DEIS - University of

More information

Programming Exercise 3: Multi-class Classification and Neural Networks

Programming Exercise 3: Multi-class Classification and Neural Networks Programming Exercise 3: Multi-class Classification and Neural Networks Machine Learning November 4, 2011 Introduction In this exercise, you will implement one-vs-all logistic regression and neural networks

More information