Lecture 14: Convolutional neural networks for computer vision

Dr. Richard E. Turner (ret26@cam.ac.uk), November 20, 2014

Big picture

Goal: how to produce good internal representations of the visual world to support recognition.
- Detect and classify objects into categories (the slide figure shows an apple and an orange), independently of pose, scale, illumination, conformation, occlusion and clutter.
- How could an artificial vision system learn appropriate internal representations automatically, the way humans seem to by simply looking at the world?
- Previously in CV and the course: hand-crafted feature extractors.
- Now in CV and the course: learn suitable representations of images.

Why use hierarchical multi-layered models?

Argument 1: visual scenes are hierarchically organised. From top to bottom:
- object: trees
- object parts: bark, leaves, etc.
- primitive features: oriented edges
- input image: forest image

Why use hierarchical multi-layered models?

Argument 2: biological vision is hierarchically organised. Each level of the scene hierarchy is paired with a stage of the visual pathway:
- object (trees): inferotemporal cortex
- object parts (bark, leaves, etc.): V4, different textures
- primitive features (oriented edges): V1, simple and complex cells
- input image (forest image): photo-receptors, retina

Why use hierarchical multi-layered models?

Argument 3: shallow architectures are inefficient at representing deep functions.
- The networks we met last lecture map the inputs to an output through a single hidden layer.
- With a large enough single hidden layer, such a network can implement any function: it is a 'universal approximator'.
- However, if the function is 'deep', a very large hidden layer may be required, so shallow networks can be computationally inefficient.
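The formula on this slide did not survive transcription. As a hedged sketch only, a single hidden layer network of the kind described is usually written as below. The symbols are assumptions rather than the lecturer's notation, except that x (output) and z (inputs) follow the Training slide: K is the number of hidden units, D the number of inputs, \sigma a sigmoidal non-linearity, and W, w, b_k, b the weights and biases.

```latex
h_k = \sigma\Big(\sum_{d=1}^{D} W_{kd}\, z_d + b_k\Big), \qquad
x = \sigma\Big(\sum_{k=1}^{K} w_k\, h_k + b\Big)
```

In this notation the universal approximation statement concerns taking K large enough, and the inefficiency argument is that for a 'deep' target function the required K may be very large.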

What's wrong with standard neural networks?

- How many parameters does this neural network have? Even for a small 32 by 32 image, every fully connected hidden unit needs 32 × 32 = 1024 weights.
- Hard to train: over-fitting and local optima.
- Need to initialise carefully: layer-wise training, unsupervised schemes.
- Convolutional nets reduce the number of parameters.
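To make the parameter question concrete, here is a minimal counting sketch. The architecture (one hidden layer as wide as the input, a single output unit) is a hypothetical choice for illustration, not the network drawn on the slide.

```python
def dense_param_count(layer_sizes):
    """Weights plus biases of a fully connected network with the given layer widths."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]))

n_pixels = 32 * 32  # inputs for a greyscale 32 by 32 image

# Hypothetical architecture: inputs -> hidden layer of the same width -> one output.
hidden_sizes = [n_pixels, n_pixels, 1]
total_params = dense_param_count(hidden_sizes)
print(total_params)
```

The count grows with the product of adjacent layer widths, which is why even small images lead to very large fully connected networks.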

The key ideas behind convolutional neural networks

- Image statistics are translation invariant (objects and viewpoint translate). Build this translation invariance into the model (rather than learning it) by tying lots of the weights together in the network. This reduces the number of parameters.
- Expect learned low-level features to be local (e.g. edge detector). Build this into the model by allowing only local connectivity. This reduces the number of parameters further.
- Expect learned high-level features to be coarser (cf. biology). Build this into the model by subsampling more and more up the hierarchy. This reduces the number of parameters again.
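The first two ideas (tied weights and local connectivity) can be illustrated with a small NumPy sketch. It slides one shared 3 by 3 filter over an image and checks that shifting the input shifts the output by the same amount. The filter values and image size are arbitrary choices for illustration.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide one shared kernel over the image (cross-correlation, 'valid' region only)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Local connectivity: each output sees only a kh x kw patch.
            # Weight tying: the same kernel is used at every position.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

rng = np.random.default_rng(0)
image = rng.standard_normal((8, 8))
kernel = np.array([[-1.0, 0.0, 1.0]] * 3)  # crude vertical-edge detector (illustrative)

shifted = np.roll(image, shift=1, axis=1)  # move the image one pixel to the right
a = conv2d_valid(image, kernel)
b = conv2d_valid(shifted, kernel)

# Away from the wrapped-around border column, the feature map shifts with the input.
equivariant = np.allclose(a[:, :-1], b[:, 1:])
print(equivariant)
```

The filter has 9 weights regardless of the image size, which is the parameter saving the slide refers to.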

Building block of a convolutional neural network

From input to output:
- input image
- convolutional stage (the only stage with parameters)
- non-linear stage
- pooling stage (mean or subsample also used)
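A minimal sketch of one such building block follows. The slide's specific non-linearity and pooling formulas were lost in transcription, so the rectifier and 2 by 2 max pooling used here are assumptions, and the function names are hypothetical.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Convolutional stage: one shared kernel, 'valid' region only."""
    kh, kw = kernel.shape
    out = np.empty((image.shape[0] - kh + 1, image.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    """Non-linear stage (assumed rectifier; the slide's choice is not recoverable)."""
    return np.maximum(x, 0.0)

def max_pool(x, size=2):
    """Pooling stage: maximum over non-overlapping size x size blocks (assumed)."""
    h = (x.shape[0] // size) * size
    w = (x.shape[1] // size) * size
    blocks = x[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.max(axis=(1, 3))

def conv_block(image, kernel, bias=0.0):
    """Convolution -> non-linearity -> pooling; only kernel and bias are parameters."""
    return max_pool(relu(conv2d_valid(image, kernel) + bias))

image = np.arange(36.0).reshape(6, 6)
kernel = np.array([[-1.0, 0.0, 1.0]] * 3)
features = conv_block(image, kernel)
print(features.shape)
```

Replacing `blocks.max` with `blocks.mean` gives the mean-pooling variant mentioned on the slide.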

Full convolutional neural network

[Figure: layer 1 and layer 2 each consist of a convolutional stage followed by non-linear stages. Each unit in layer 2 connects to several feature maps, and the feature maps will have different filters. A 'normal' neural network sits on top.]

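How many parameters does a convolutional network have?

Ask the same question as for the standard neural network, again for a small 32 by 32 image. Because the filter weights are shared across image positions, the count for the convolutional stages is set by the filter sizes and the number of feature maps rather than by the number of pixels.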
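The following sketch compares the two counts for a single layer. The numbers (16 feature maps, 5 by 5 filters, greyscale input) are hypothetical, since the slide's own figures are not recoverable.

```python
def conv_param_count(n_filters, kernel_h, kernel_w, in_channels=1):
    """Each filter has kernel_h * kernel_w * in_channels weights plus one bias."""
    return n_filters * (kernel_h * kernel_w * in_channels + 1)

def dense_param_count(n_in, n_out):
    """Fully connected layer: one weight per input-output pair plus one bias per output."""
    return n_in * n_out + n_out

# Hypothetical layer: 16 feature maps from 5 x 5 filters on a 32 x 32 greyscale image.
n_filters, k = 16, 5
out_side = 32 - k + 1  # side length of each 'valid' feature map

conv_params = conv_param_count(n_filters, k, k)
# A dense layer producing the same number of outputs from the same input.
dense_params = dense_param_count(32 * 32, n_filters * out_side * out_side)
print(conv_params, dense_params)
```

Under these assumptions the convolutional layer needs a few hundred parameters where the equivalent dense layer needs millions.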

Training

- Back-propagation for training: stochastic gradient ascent, like last lecture.
- The output is interpreted as a class label probability, x = p(t = 1 | z). Now x is a more complex function of the inputs z.
- Can optimise the same objective function, computed over a mini-batch of datapoints.
- Data augmentation: always improves performance substantially (include shifted, rotated, mirrored and locally distorted versions of the training data).
- Typical numbers: 5 convolutional layers, 3 layers in the top neural network, 500,000 neurons, 50,000,000 parameters, 1 week to train (GPUs).
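As a hedged sketch of the training recipe, the code below runs mini-batch stochastic gradient ascent on the Bernoulli log-likelihood with simple augmentation. To stay short it uses a single logistic unit, x = sigmoid(w·z + b), in place of a full convolutional network. The toy data, learning rate and the helper names `augment` and `ascent_step` are all assumptions for illustration.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def mean_log_likelihood(w, b, Z, t):
    """Average of t*log(x) + (1-t)*log(1-x), with x = p(t = 1 | z)."""
    x = sigmoid(Z @ w + b)
    eps = 1e-12  # guards against log(0)
    return float(np.mean(t * np.log(x + eps) + (1 - t) * np.log(1 - x + eps)))

def ascent_step(w, b, Z_batch, t_batch, lr=0.1):
    """One gradient ascent step on the mini-batch log-likelihood."""
    err = t_batch - sigmoid(Z_batch @ w + b)  # gradient w.r.t. the pre-activation
    return w + lr * Z_batch.T @ err / len(t_batch), b + lr * float(err.mean())

def augment(image, rng, max_shift=2):
    """Random mirror plus a small shift (np.roll wraps around at the border)."""
    if rng.random() < 0.5:
        image = image[:, ::-1]
    dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
    return np.roll(image, (int(dy), int(dx)), axis=(0, 1))

# Toy data (assumption): 8 x 8 noise images whose mean brightness depends on the label.
rng = np.random.default_rng(0)
t = rng.integers(0, 2, size=200).astype(float)
images = rng.standard_normal((200, 8, 8)) + (t[:, None, None] - 0.5)
Z = images.reshape(200, -1)

w, b = np.zeros(64), 0.0
ll_before = mean_log_likelihood(w, b, Z, t)
for step in range(200):
    idx = rng.choice(200, size=20, replace=False)  # draw a mini-batch
    Z_batch = np.stack([augment(images[i], rng).ravel() for i in idx])
    w, b = ascent_step(w, b, Z_batch, t[idx])
ll_after = mean_log_likelihood(w, b, Z, t)
print(ll_before, ll_after)
```

In a real convolutional network the gradient of the same objective would be obtained by back-propagation through all layers; only the model inside `ascent_step` changes.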

Demo

CIFAR 10 dataset: 50,000 training images, 10,000 test images.
http://cs.stanford.edu/people/karpathy/convnetjs/demo/cifar10.html

Looking into a convolutional neural network's brain

[Figure: for a layer 2 unit, the top 9 image patches that cause maximal activation, shown alongside a reconstruction of the image patches from that unit (indicating the aspect of the patches to which the unit is sensitive).]
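The "top 9 patches" part of this visualisation can be sketched as a search over all patches for those that maximise a unit's activation. For brevity the unit here is a single linear filter applied directly to the image, which is an assumption. For a layer 2 unit the activation would come from the trained network and the patch would be that unit's receptive field. The function name `top_k_patches` is hypothetical.

```python
import numpy as np

def top_k_patches(images, kernel, k=9):
    """Return (activation, patch) pairs for the k patches that most excite the filter."""
    kh, kw = kernel.shape
    scored = []
    for n, img in enumerate(images):
        for i in range(img.shape[0] - kh + 1):
            for j in range(img.shape[1] - kw + 1):
                activation = float(np.sum(img[i:i + kh, j:j + kw] * kernel))
                scored.append((activation, n, i, j))
    scored.sort(key=lambda s: s[0], reverse=True)  # strongest activation first
    return [(a, images[n][i:i + kh, j:j + kw]) for a, n, i, j in scored[:k]]

rng = np.random.default_rng(0)
images = rng.standard_normal((5, 8, 8))    # stand-in for a set of real images
kernel = np.array([[-1.0, 0.0, 1.0]] * 3)  # illustrative edge filter
top9 = top_k_patches(images, kernel)
print([round(a, 2) for a, _ in top9])
```

The reconstruction shown on the slide is a separate step and is not covered by this sketch.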

Summary

- Higher level layers encode more abstract features.
- Higher level layers show more invariance to instantiation parameters: translation, rotation, lighting changes.
- A method for learning feature detectors: the first layer learns edge detectors, subsequent layers learn more complex features.
- Integrates training of the classifier with training of the featural representation.

Convolutional neural networks in the news

- Convolutional neural networks are the go-to model for many computer vision classification problems.
- They are a form of neural network with an architecture suited to vision problems.

Finally some cautionary words

- Hierarchical modelling is a very old idea.
- The deep learning revolution has come about mainly due to new methods for initialising learning of neural networks.
- Current methods aim at invariance, but this is far from all there is to computer and biological vision: e.g. instantiation parameters should also be represented.
- Classification can only go so far: "tell us a story about what happened in this picture".