Semantic 3D Reconstruction: joint reconstruction, recognition and segmentation

Semantic 3D Reconstruction: joint reconstruction, recognition and segmentation
Marc Pollefeys, joint work with Christian Haene, Nikolay Savinov, Lubor Ladicky, Christopher Zach, Andrea Cohen

Outline
Some recent 3D reconstruction projects: autonomous cars and micro-aerial vehicles; mobile 3D modeling (Google Tango, mobile phones).
Semantic 3D reconstruction: joint reconstruction, recognition and segmentation; high-order ray potentials; joint classifier; modeling objects.

3D mapping for autonomous driving (Lee et al. CVPR13; Heng et al. ICRA14; Haene et al. IROS15)
Dense real-time temporal fisheye stereo (GPU). Omnidirectional visual simultaneous localization and mapping. Cameras as main sensors: no lasers, no GPS.

Fully autonomous vision-based navigation and mapping (Fraundorfer et al. IROS12, best paper finalist)
Full on-board processing with 2+1 cameras + IMU. Indoor and outdoor operation. Obstacle avoidance, mapping and exploration. No laser, no GPS, no network.
Our lab is also the home of PixHawk/PX4; more info at: http://pixhawk.ethz.ch

Research partner of Google's Project Tango (Camposeco ICRA15, Schoeps et al. 3DV2015, Lynen RSS15...)

3D selfies! (Tanskanen et al. ICCV13; Kolev CVPR14)

Outline
Highlights of some ongoing projects: autonomous cars and micro-aerial vehicles; mobile 3D modeling (Google Tango, mobile phones).
Semantic 3D reconstruction: joint reconstruction, recognition and segmentation; high-order ray potentials; joint classifier; modeling objects.

Classical 3D reconstruction from images
Classical 3D-from-images approach: (1) compute the relative pose between images (structure-from-motion); (2) estimate per-pixel depth (multi-view stereo matching); (3) compute the surface reconstruction (graph energy minimization).
Energy: $E(x) = \int_\Omega \big( \rho_s x_s + \phi(\nabla x_s) \big)\, ds$, where $x$ labels each point $s \in \Omega$ as free-space or occupied-space and the optimal solution minimizes $E$, $\rho_s$ is the unary depth evidence term from the line-of-sight model, and $\phi$ is the isotropic (spherical) shape prior.
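
To make the two-label energy on this slide concrete, here is a minimal one-dimensional sketch along a single viewing ray. It is not the authors' implementation: the function names, the sign convention (x = 1 means occupied, positive weights favor free space), the width of the occupied band behind the measured depth and the smoothness weight `lam` are all assumptions chosen for illustration, and the minimum is found by brute force rather than by graph or convex optimization.

```python
import itertools


def line_of_sight_unary(n_voxels, depth_index, band=2):
    """Hypothetical unary weights rho along one ray (camera at index 0).

    Voxels in front of the measured depth get +1 (evidence for free space),
    the `band` voxels starting at the depth get -1 (evidence for occupied),
    and voxels further behind are unobserved (weight 0).
    """
    rho = [0.0] * n_voxels
    for s in range(n_voxels):
        if s < depth_index:
            rho[s] = 1.0
        elif s < depth_index + band:
            rho[s] = -1.0
    return rho


def energy(x, rho, lam):
    """Discrete 1D analogue of E(x): unary term plus lam times the number of transitions."""
    unary = sum(r * xs for r, xs in zip(rho, x))
    smoothness = sum(abs(b - a) for a, b in zip(x, x[1:]))
    return unary + lam * smoothness


def brute_force_minimum(rho, lam):
    """Enumerate all binary labelings (only feasible for very short rays)."""
    candidates = itertools.product((0, 1), repeat=len(rho))
    return min(candidates, key=lambda x: energy(x, rho, lam))


rho = line_of_sight_unary(n_voxels=8, depth_index=3)
best = brute_force_minimum(rho, lam=0.5)
print(best, energy(best, rho, lam=0.5))
```

In this toy setting the regularizer decides what happens in the unobserved voxels behind the surface: keeping them occupied avoids paying for a second transition.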

Semantic 3D reconstruction
Classes: building, ground, vegetation, clutter.
Joint 3D reconstruction and class segmentation: obtain a separate surface for each class of object. This corresponds to a multi-label volumetric segmentation problem.

Discrete and Continuous Formulations
Discrete domain. Smoothness: transitions along edges. Formulation: linear program [Schlesinger 1976]. Arbitrary smoothness cost allowed.
Continuous domain. Smoothness: (anisotropic) boundary length. Formulation: convex program [Chambolle, Cremers, Pock 2008]. Smoothness needs to form a metric.

Convex, Continuous Multi-Label Formulation [Zach, Häne, Pollefeys, CVPR 2012, TPAMI 2014]
Metric smoothness fulfills the triangle inequality; truncated quadratic smoothness is non-metric.
Our continuously inspired formulation takes the best from both worlds: non-metric and anisotropic boundary length cost.
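
The metric versus non-metric distinction can be checked directly on a table of label-transition costs. The sketch below is only an illustration: the helper names, the integer label set and the truncation values are assumptions, not taken from the slides. It lists the label triples for which going through an intermediate label is cheaper than the direct transition, which is exactly a violation of the triangle inequality.

```python
def triangle_violations(cost):
    """Return label triples (i, j, k) with cost[i][k] > cost[i][j] + cost[j][k]."""
    n = len(cost)
    return [
        (i, j, k)
        for i in range(n)
        for j in range(n)
        for k in range(n)
        if cost[i][k] > cost[i][j] + cost[j][k]
    ]


def truncated_linear(n_labels, truncation):
    """Pairwise cost min(|i - j|, truncation): a metric."""
    return [[min(abs(i - j), truncation) for j in range(n_labels)]
            for i in range(n_labels)]


def truncated_quadratic(n_labels, truncation):
    """Pairwise cost min((i - j)^2, truncation): non-metric for large enough truncation."""
    return [[min((i - j) ** 2, truncation) for j in range(n_labels)]
            for i in range(n_labels)]


print(triangle_violations(truncated_linear(4, 2)))
print(triangle_violations(truncated_quadratic(4, 4))[:3])
```

With the truncated quadratic cost, jumping from label 0 to label 2 costs 4 while the detour through label 1 costs only 2, so a formulation that requires a metric cannot represent it.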

Cost for boundary: Energy Formulation

Wulff shapes
"The shape of an equilibrium crystal is obtained, according to the Gibbs thermodynamic principle, by minimizing the total surface free energy associated to the crystal-medium interface." (http://www.scholarpedia.org/)
Wulff's theorem: the minimum surface energy for a given volume of a polyhedron will be achieved if the distances of its faces from one given point are proportional to their surface tension.
Proposed use for anisotropic regularization [Esedoglu and Osher 2004, Zach et al. 2009, Haene et al. 2013/14/15].
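
One common way to turn a convex Wulff shape into an anisotropic smoothness cost is through its support function: the cost of a boundary element with normal n is the largest inner product between n and any point of the shape. The 2D sketch below assumes the Wulff shape is given as a convex polygon containing the origin; the function name and both example shapes are hypothetical and only meant to show how the shape encodes a preference for certain surface orientations.

```python
import math


def smoothness_cost(wulff_vertices, normal):
    """Support function of a convex polygon: max over vertices p of <p, normal>."""
    return max(px * normal[0] + py * normal[1] for px, py in wulff_vertices)


# Hypothetical anisotropic shape: wide in x, flat in y.
# Boundaries with normal (0, 1) are cheap, boundaries with normal (1, 0) are expensive.
flat_box = [(1.0, 0.25), (-1.0, 0.25), (-1.0, -0.25), (1.0, -0.25)]

# Isotropic reference: the unit disc, approximated by a regular 360-gon.
disc = [(math.cos(math.radians(a)), math.sin(math.radians(a))) for a in range(360)]

for name, shape in (("flat_box", flat_box), ("disc", disc)):
    print(name,
          smoothness_cost(shape, (1.0, 0.0)),
          smoothness_cost(shape, (0.0, 1.0)))
```

The disc gives (almost) the same cost to every direction, which corresponds to the isotropic prior of classical reconstruction, while the flat box penalizes boundaries with a horizontal normal four times more than boundaries with a vertical normal.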

Dense Semantic 3D Reconstruction [Häne, Zach, Cohen, Angst, Pollefeys, CVPR 2013]
Pipeline, first stage: input images; semantic classifier; sparse reconstruction; dense matching. Outputs: class likelihoods and depth maps.
Pipeline, second stage: class likelihoods and depth maps; joint fusion by convex optimization. Output: dense semantic 3D model.

Formulation: Labeling of a Voxel Space [Häne, Zach, Cohen, Angst, Pollefeys, CVPR 2013]
Data term: described as per-voxel unary potentials.
Regularization term: class-specific, direction-dependent surface area penalization, learned from training data.
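
As a rough discrete illustration of such a labeling energy (not the convex continuous formulation of the paper), the sketch below sums per-voxel unary costs and transition costs that depend on both the ordered pair of classes and the grid axis. All names and cost values are hypothetical; in particular the example table simply encodes that "ground below building" is a cheap vertical transition and "building below ground" an expensive one.

```python
AXES = ((1, 0, 0), (0, 1, 0), (0, 0, 1))  # x, y, z (z points up)


def labeling_energy(labels, unary, pair_cost):
    """Energy of a voxel labeling.

    labels:    dict mapping voxel (x, y, z) -> class name
    unary:     dict mapping (voxel, class name) -> data cost (missing entries cost 0)
    pair_cost: function (axis, label_here, label_at_next_voxel) -> transition cost
    """
    total = sum(unary.get((voxel, label), 0.0) for voxel, label in labels.items())
    for (x, y, z), label in labels.items():
        for axis, (dx, dy, dz) in enumerate(AXES):
            neighbor = (x + dx, y + dy, z + dz)
            if neighbor in labels:
                total += pair_cost(axis, label, labels[neighbor])
    return total


def example_pair_cost(axis, here, there):
    """Hypothetical class-specific, direction-dependent transition costs."""
    if here == there:
        return 0.0
    if axis == 2:  # vertical transitions, `here` is below `there`
        vertical = {("ground", "building"): 0.1, ("building", "ground"): 5.0}
        return vertical.get((here, there), 1.0)
    return 1.0


plausible = {(0, 0, 0): "ground", (0, 0, 1): "building", (0, 0, 2): "air"}
implausible = {(0, 0, 0): "building", (0, 0, 1): "ground", (0, 0, 2): "air"}
print(labeling_energy(plausible, {}, example_pair_cost))
print(labeling_energy(implausible, {}, example_pair_cost))
```

With equal data costs, the regularizer alone prefers the column in which the building stands on the ground, which is the kind of prior knowledge the learned class-specific term is meant to capture.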

Class-specific training/selection of the regularizer. [Figure: matrix of class pairs over building, ground, vegetation, stuff and air.]

Joint 3D reconstruction and class segmentation (Haene et al. CVPR13)

Weakly Observed Structures I: buildings standing on the ground

Weakly Observed Structures II: building separated from vegetation

Unobserved Surfaces: labels can be separated

More Results: a single reconstruction and segmentation per dataset. [Figure columns: input image, dense 3D reconstruction, semantic 3D reconstruction.]

Higher-order ray potentials to model visibility (Savinov et al., CVPR15/CVPR16)
Volumetric formulation: ray potentials plus a pairwise regularizer. The ray cost is based on the first occupied voxel along the ray. [Figure labels: freespace, depth, label.]

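Cost based on the first occupied voxel along the ray (Savinov et al., CVPR15/CVPR16)
Example: on ray $r$ the first occupied voxel is at position $K_r = 3$ and has label red, so $\psi_r(x_r) = \phi_r(3, \text{red})$. The cost has no dependency on the voxels behind it.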
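
The ray potential on this slide can be sketched in a few lines. The function below walks along the voxels of one ray, starting at the camera, and evaluates a cost only at the first voxel that is not free space. The names `ray_potential`, `FREE` and `free_cost`, the 1-based position index and the particular cost `phi` are assumptions made for this illustration.

```python
FREE = "free"


def ray_potential(ray_labels, phi, free_cost):
    """Cost of one ray, given the labels of its voxels ordered from the camera outward.

    phi(position, label) is evaluated at the first non-free voxel (1-based position);
    if the whole ray is free space, free_cost is returned instead.
    """
    for position, label in enumerate(ray_labels, start=1):
        if label != FREE:
            return phi(position, label)
    return free_cost


def phi(position, label):
    """Hypothetical cost: distance to the observed depth (3) plus a class mismatch penalty."""
    return abs(position - 3) + (0.0 if label == "red" else 1.0)


ray_a = [FREE, FREE, "red", "blue", FREE]
ray_b = [FREE, FREE, "red", "green", "green"]
print(ray_potential(ray_a, phi, free_cost=10.0))
print(ray_potential(ray_b, phi, free_cost=10.0))
```

Both rays give the same cost because they agree up to and including the first occupied voxel; whatever lies behind it is hidden from the camera and does not enter the potential.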

Visibility Consistency Constraint (Savinov et al, CVPR16) (non-convex)

Results (Savinov et al., CVPR15/16). [Figure labels: minimize Δ, inference, generative.]

2-label V-Constraint: state of the art on Middlebury.
V-Constraint: nicely handles thin objects.

Joint classification of depth and class?

Learning joint classifier for depth and class? (Ladicky, Shi and Pollefeys, CVPR 2014)
Key idea: avoid learning class appearance at different scales.
How well do we estimate depth? The proposed approach is both efficient and unbiased towards specific depths.

Can our approach also handle objects?
Extend the approach from Stuff to Things: introduce a location-dependent anisotropic smoothness prior.

Learning location-dependent anisotropic smoothness prior (Haene et al., CVPR 2014)
Download training data from Google 3D Warehouse. At every voxel in the 3D bounding box, estimate the distribution of observed shape normals. Determine the convex Wulff shape which best represents the observed statistics.
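
The first half of this procedure, collecting per-voxel statistics of surface normals, can be sketched as a simple histogram. The code below is an assumption-laden illustration: it presumes the training shapes are already aligned in a common bounding box and voxelized into (voxel, normal) samples, and it bins normals coarsely into the six axis directions. Fitting a convex Wulff shape to these statistics is not shown.

```python
from collections import Counter, defaultdict

DIRECTIONS = ("+x", "-x", "+y", "-y", "+z", "-z")


def dominant_direction(normal):
    """Coarse bin for a normal: the signed axis with the largest absolute component."""
    axis = max(range(3), key=lambda a: abs(normal[a]))
    sign = "+" if normal[axis] >= 0 else "-"
    return sign + "xyz"[axis]


def normal_histograms(samples):
    """Per-voxel relative frequencies of normal directions.

    samples: iterable of (voxel, normal) pairs pooled over all aligned training shapes.
    Returns a dict voxel -> {direction: frequency}; voxels without samples are absent.
    """
    counts = defaultdict(Counter)
    for voxel, normal in samples:
        counts[voxel][dominant_direction(normal)] += 1
    histograms = {}
    for voxel, counter in counts.items():
        total = sum(counter.values())
        histograms[voxel] = {d: counter[d] / total for d in DIRECTIONS}
    return histograms


samples = [
    ((0, 0, 0), (0.0, 0.0, 1.0)),
    ((0, 0, 0), (0.1, 0.0, 0.9)),
    ((0, 0, 0), (1.0, 0.0, 0.0)),
    ((1, 0, 0), (0.0, -1.0, 0.0)),
]
for voxel, histogram in normal_histograms(samples).items():
    print(voxel, histogram)
```

A voxel where upward-facing normals dominate, such as a car roof, would then receive a prior that makes horizontal surfaces cheap at that location.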

Cars

Car reconstructions

Advantages of multi-class segmentation: learn separate statistics for car-air and car-ground transition likelihoods.

Semantic multi-class 3D head reconstruction (Maninchedda et al.)
Align prior to data by minimizing the smoothness term towards position. [Figure labels: parametric shape model; implicit semantic shape model.]

Segment-based 3D object shape priors (Karimi, Haene, Pollefeys, CVPR15)
Represent non-convex object shape priors as a combination of convex part priors. Build the prior by performing an (approximate) convex decomposition of an example (and observe which transitions occur).

Result: Tables

Result: Trees

Result: Mug

Multi-Body Depth-Map Fusion with Non-Intersection Constraints (Jacquet et al., ECCV14)
Laptop with and without self-intersection.

Detect and regularize for symmetries (Speciale et al.)
A preference for symmetry can be introduced by adding non-local regularization terms.

Conclusion
Volumetric multi-view approach which performs joint reconstruction, recognition and segmentation.
Strong coupling between geometry and appearance via a class-dependent anisotropic smoothness term.
Clean energy formulation (with tight convex relaxation).
Significant qualitative improvement with respect to the state of the art (regularized solutions make more sense).
Semantic reconstruction is more useful.

Challenges and future research
Scale to many classes: leverage sparsity of semantic interactions, class hierarchies.
Scale to large volumes: adaptive space discretization and basis representation.
Dynamic scenes: spatio-temporal interactions, extensions to 4D volumes and Wulff shapes.
Exploration and robotics: enable real-time navigation and exploration, predict information gain of perception actions.

Thank you for your attention! Questions? Collaborators: Christian Haene, Nikolay Savinov, Lubor Ladicky, Christopher Zach, Andrea Cohen