Bored by Classification ConvNets? End-to-end Learning of other Computer Vision Tasks

Size: px

Start display at page:

Download "Bored by Classification ConvNets? End-to-end Learning of other Computer Vision Tasks"

Kerry Fields
9 years ago
Views:

1 Bored by Classification ConvNets? End-to-end Learning of other Computer Vision Tasks Thomas Brox University of Freiburg Germany Research funded by ERC Starting Grant VideoLearn, the German Research Foundation, and the Deutsche Telekom Stiftung Thomas Brox

2 Outline Generative networks U-Net: Multi-instance segmentation FlowNet: Estimating optical flow Thomas Brox 2

3 Typical ConvNet architecture cat Classification network Thomas Brox 3

4 Typical ConvNet architecture cat cat Classification network Thomas Brox 4

5 Up-convolutional network Alexey Dosovitskiy CVPR 2015 New: Expanding network architecture Small gray office chair, side view cat Image generation Related work: Eigen et al. NIPS 2014: Network for depth map prediction Long et al. CVPR 2015: Network for semantic segmentation Thomas Brox 5

6 Generating chair images with a network Dosovitskiy et al., CVPR 2015 Thomas Brox 6

7 Training set 3D chair dataset Aubry et al. CVPR 2014 Rendering 809 chair styles From 62 viewpoints Source: Some of the rendered chairs Thomas Brox 7

8 Generating images of unseen views Training set split into two subsets: Source set: 62 viewpoints available (90% of all chair models) Target set: fewer viewpoints available (10% of all models) Thomas Brox 8

9 Generating images of unseen views 8 azimuths available 4 azimuths available 2 azimuths available 1 azimuth available Thomas Brox 9

10 Comparison to baselines Alexey Dosovitskiy CVPR 2015 Thomas Brox 10

11 Interpolation of chair styles Alexey Dosovitskiy CVPR 2015 Thomas Brox 11

12 Correspondences between chair instances Alexey Dosovitskiy CVPR 2015 Thomas Brox 12

13 Correspondences between chair instances Generate intermediate images with the network Track points with optical flow (LDOF) along the sequence Deformable Spatial Pyramid Matching (Kim et al. 2013) all easy difficult SIFT Flow (Liu et al. 2008) Ours Human performance Thomas Brox 13

from its feature representation Related work: Mahendran & Vedaldi CVPR

14 Preview: Inverting ConvNets with ConvNets Up-convolutional network Image features e.g. from AlexNet Learn to re-generate the input image from its feature representation Related work: Mahendran & Vedaldi CVPR 2015 Zeiler & Fergus ECCV 2014 Alexey Dosovitskiy arxiv 2015 Thomas Brox 14

15 Reconstruction results Up-Conv. Mahendran & Vedaldi Autoencoder More reconstructions with up-convolutional network: Thomas Brox 15

16 Color and position are preserved in high layers input All FC8 Top 5 FC8 All but Top 5 FC8 Color experiment Position experiment Thomas Brox 16

17 Outline A generative network U-Net: Multi-instance segmentation FlowNet: Estimating optical flow Thomas Brox 17

18 U-Net: Image segmentation with a ConvNet Olaf Ronneberger MICCAI 2015 Philipp Fischer Similar to Fully Convolutional Network [Long et al., CVPR 2015] Original inspiration: Depth map prediction [Eigen et al., NIPS 2014] Thomas Brox 18

19 Binary segmentation Electron Microscopy ISBI 2012 Challenge Rank 1 Light microscopy cell tracking ISBI 2015 Challenge Rank 1 Thomas Brox 19

20 Multi-class semantic segmentation X-ray dental segmentation, ISBI 2015 Challenge, Rank 1 Intersection over union: 77.5% Second best: 46% Thomas Brox 20

21 Multi-instance segmentation Light microscopy, DIC-HeLa cell tracking ISBI 2015 Challenge: Rank 1 Thomas Brox 21

22 Outline A generative network U-Net: Multi-instance segmentation FlowNet: Estimating optical flow Thomas Brox 22

23 FlowNet: Estimating optical flow with a ConvNet Refinement: expanding architecture Thomas Brox 23

24 Helping the network with a correlation layer Joint work with the group of Daniel Cremers Philipp Fischer Alexey Dosovitskiy Eddy Ilg Philip Häusser Caner Hazirbas Vladimir Golkov Thomas Brox 24

25 Enough data to train such a network? Getting ground truth optical flow for realistic videos is hard Existing datasets are small: Frames with ground truth Middlebury 8 KITTI 194 Sintel 1041 Needed >10000 Thomas Brox 25

26 Realism is overrated: the flying chair dataset Rendered image Optical flow Thomas Brox 26

27 It works! Input images Ground truth FlowNetSimple FlowNetCorr Although the network has only seen flying chairs for training, it predicts good optical flow on Sintel Thomas Brox 27

28 Results on various datasets Middlebury KITTI Sintel Clean Sintel Final Flying Chairs EpicFlow DeepFlow LDOF FlowNetS FlowNetS+v FlowNetS+ft FlowNetS+ft+v FlowNetC FlowNetC+v FlowNetC+ft FlowNetC+ft+v Networks can compete with state-of-the-art conventional optical flow estimation methods Thomas Brox 28

29 Can handle large displacements Input images Ground truth FlowNetSimple FlowNetCorr DeepFlow (Weinzaepfel et al. ICCV 2013) EpicFlow (Revaud et al. CVPR 2015) Thomas Brox 29

30 Sometimes wrong direction Input images Ground truth FlowNetSimple FlowNetCorr DeepFlow (Weinzaepfel et al. ICCV 2013) EpicFlow (Revaud et al. CVPR 2015) Thomas Brox 30

31 Often captures fine details Input images Ground truth FlowNetSimple FlowNetCorr DeepFlow (Weinzaepfel et al. ICCV 2013) EpicFlow (Revaud et al. CVPR 2015) Thomas Brox 31

32 Results on Flying chairs test set Input images Ground truth EpicFlow (Revaud et al. CVPR 2015) FlowNetCorr Thomas Brox 32

33 Runs with 10fps on the GPU Thomas Brox 33

34 Summary A generative network U-Net: Multi-instance segmentation FlowNet: Estimating optical flow Thomas Brox 34

35 Tip of the day Thomas Brox 35

Optical Flow. Shenlong Wang CSC2541 Course Presentation Feb 2, 2016

Optical Flow. Shenlong Wang CSC2541 Course Presentation Feb 2, 2016 Optical Flow Shenlong Wang CSC2541 Course Presentation Feb 2, 2016 Outline Introduction Variation Models Feature Matching Methods End-to-end Learning based Methods Discussion Optical Flow Goal: Pixel motion