3D Scanning for Virtual Shopping. Dr. Jürgen Sturm Group Leader RGB-D

Similar documents
GPU Programming in Computer Vision

Real-Time 3D Reconstruction Using a Kinect Sensor

High speed 3D capture for Configuration Management DOE SBIR Phase II Paul Banks

Point Cloud Simulation & Applications Maurice Fallon

Super-Resolution Keyframe Fusion for 3D Modeling with High-Quality Textures

PCL - SURFACE RECONSTRUCTION

MVTec Software GmbH.

Interactive Dense 3D Modeling of Indoor Environments

BUILDING TELEPRESENCE SYSTEMS: Translating Science Fiction Ideas into Reality

Video-Rate Stereo Vision on a Reconfigurable Hardware. Ahmad Darabiha Department of Electrical and Computer Engineering University of Toronto

Are we ready for Autonomous Driving? The KITTI Vision Benchmark Suite

MeshLab and Arc3D: Photo-Reconstruction and Processing of 3D meshes

Incremental Surface Extraction from Sparse Structure-from-Motion Point Clouds

Introduction. C 2009 John Wiley & Sons, Ltd

REALTIME 3D MAPPING, OPTIMIZATION, AND RENDERING BASED ON A DEPTH SENSOR CRAIG MOUSER. Bachelor of Science in Computer Engineering

Creating Smarter, More Interactive Apps and Systems with Computer Vision

KinectFusion: Real-time 3D Reconstruction and Interaction Using a Moving Depth Camera*

Real-time Visual Tracker by Stream Processing

Large-Scale and Drift-Free Surface Reconstruction Using Online Subvolume Registration

Removing Moving Objects from Point Cloud Scenes

How To Analyze Ball Blur On A Ball Image

A Short Introduction to Computer Graphics

Introduction to Computer Graphics

3D Vision An enabling Technology for Advanced Driver Assistance and Autonomous Offroad Driving

Pedestrian Detection with RCNN

Limits and Possibilities of Markerless Human Motion Estimation

Amit Moran, Gila Kamhi, Artyom Popov, Raphaël Groscot Perceptual Computing - Advanced Technologies Israel

Augmented Reality for Construction Site Monitoring and Documentation

Flow Separation for Fast and Robust Stereo Odometry

Situated Visualization with Augmented Reality. Augmented Reality

A Methodology for Obtaining Super-Resolution Images and Depth Maps from RGB-D Data

Matthias Greiner Augmented Reality by metaio

Sense Making in an IOT World: Sensor Data Analysis with Deep Learning

Digital Image Increase

Online Learning for Offroad Robots: Using Spatial Label Propagation to Learn Long Range Traversability

Interactive Segmentation, Tracking, and Kinematic Modeling of Unknown 3D Articulated Objects

Spatio-Temporally Coherent 3D Animation Reconstruction from Multi-view RGB-D Images using Landmark Sampling

Industrial Vision Days 2012 Making Cameras Smarter: FPGA Based Image Pre-processing Unleashed

Part-Based Recognition

Applications of Deep Learning to the GEOINT mission. June 2015

Tracking and integrated navigation Konrad Schindler

Mobile Robot FastSLAM with Xbox Kinect

Multi-Resolution Surfel Maps for Efficient Dense 3D Modeling and Tracking

Terrain Traversability Analysis using Organized Point Cloud, Superpixel Surface Normals-based segmentation and PCA-based Classification

IP-S3 HD1. Compact, High-Density 3D Mobile Mapping System

3D Face Modeling. Vuong Le. IFP group, Beckman Institute University of Illinois ECE417 Spring 2013

Virtual Environments - Basics -

RGB-D Mapping: Using Kinect-Style Depth Cameras for Dense 3D Modeling of Indoor Environments

VZ-C3D Visualizer. Bringing reality closer in 3D

A Study on SURF Algorithm and Real-Time Tracking Objects Using Optical Flow

VIRTUAL TRIAL ROOM USING AUGMENTED REALITY

Making Machines Understand Facial Motion & Expressions Like Humans Do

Visual and mobile Smart Data

Optical Digitizing by ATOS for Press Parts and Tools

INTRODUCTION TO RENDERING TECHNIQUES

3D U ser I t er aces and Augmented Reality

NVIDIA IndeX Enabling Interactive and Scalable Visualization for Large Data Marc Nienhaus, NVIDIA IndeX Engineering Manager and Chief Architect

A Prototype For Eye-Gaze Corrected

GIS DATA CAPTURE APPLICATIONS MANDLI COMMUNICATIONS

Colorado School of Mines Computer Vision Professor William Hoff

Self-Positioning Handheld 3D Scanner

Big Data: Image & Video Analytics

Efficient Organized Point Cloud Segmentation with Connected Components

Computer Graphics Hardware An Overview

CUBE-MAP DATA STRUCTURE FOR INTERACTIVE GLOBAL ILLUMINATION COMPUTATION IN DYNAMIC DIFFUSE ENVIRONMENTS

Characterizing Task Usage Shapes in Google s Compute Clusters

Service-Oriented Visualization of Virtual 3D City Models

Morphological analysis on structural MRI for the early diagnosis of neurodegenerative diseases. Marco Aiello On behalf of MAGIC-5 collaboration

Low power GPUs a view from the industry. Edvard Sørgård

Photography & Video Style Guide Standards for St. Edward s University

Visual and Inertial Data Fusion for Globally Consistent Point Cloud Registration

Segmentation of building models from dense 3D point-clouds

Intelligent Flexible Automation

Basler. Line Scan Cameras

Automatic Labeling of Lane Markings for Autonomous Vehicles

Computer Animation and Visualisation. Lecture 1. Introduction

Monash University Clayton s School of Information Technology CSE3313 Computer Graphics Sample Exam Questions 2007

The Scientific Data Mining Process

Interactive Level-Set Segmentation on the GPU

ENGN D Photography / Winter 2012 / SYLLABUS

A 3D OBJECT SCANNER An approach using Microsoft Kinect.

Blender 3D Animation

Solution Guide III-C. 3D Vision. Building Vision for Business. MVTec Software GmbH

Color Segmentation Based Depth Image Filtering

Augmented reality. Parco Regionale Appia Antica

Constrained Tetrahedral Mesh Generation of Human Organs on Segmented Volume *

Sensor Fusion Mobile Platform Challenges and Future Directions Jim Steele VP of Engineering, Sensor Platforms, Inc.

Circle Object Recognition Based on Monocular Vision for Home Security Robot

Rendering Microgeometry with Volumetric Precomputed Radiance Transfer

Immersive Medien und 3D-Video

CS231M Project Report - Automated Real-Time Face Tracking and Blending

Advanced Methods for Pedestrian and Bicyclist Sensing

Real-Time Camera Tracking and 3D Reconstruction Using Signed Distance Functions

ACCURACY ASSESSMENT OF BUILDING POINT CLOUDS AUTOMATICALLY GENERATED FROM IPHONE IMAGES

Privacy Preserving Automatic Fall Detection for Elderly Using RGBD Cameras

Digital 3D Animation

CHAPTER 1. Introduction to CAD/CAM/CAE Systems

Interactive Level-Set Deformation On the GPU

Static Environment Recognition Using Omni-camera from a Moving Vehicle

1. INTRODUCTION Graphics 2

Transcription:

3D Scanning for Virtual Shopping Dr. Jürgen Sturm Group Leader RGB-D metaio GmbH

Augmented Reality with the Metaio SDK 2

Metaio SDK SDK for developers to create augmented reality apps Supports ios, Android, Windows, Unity Provides efficient implementations for Marker-based 3D tracking Template-based 3D tracking Sparse visual odometry Sparse SLAM (feature-based, local+global bundle adjustment, relocalization, uses depth if available) Edge-based tracking (known CAD model) This talk: Cutting edge research at Metaio Dense RGBD-based camera tracking and 3D reconstruction Face detection, tracking, and shape alignment Visual inertial sparse odometry

My Research Background Visual navigation for mobile robots RoboCup Kinematic Learning Articulated Objects Quadrotors Camera tracking and 3D reconstruction RGB-D SLAM Visual Odometry Large-Scale Reconstruction 3D Scanning

My Research Background Visual navigation for mobile robots Camera-based Navigation of a Low-Cost Quadrotor RoboCup Kinematic Learning Articulated Objects Quadrotors Camera tracking and 3D reconstruction [IROS 12, RSS 13, UAV-g 13, RAS 14] TUM TeachInf Best Lecture Award 2012 and 2013 EdX Course AUTONAVx with 20k participants RGB-D SLAM Visual Odometry Large-Scale Reconstruction 3D Printing

My Research Background Visual navigation for mobile robots Camera-based Navigation of a Low-Cost Quadrotor RoboCup Kinematic Learning Articulated Objects Quadrotors Camera tracking and 3D reconstruction [IROS 12, RSS 13, UAV-g 13, RAS 14] TUM TeachInf Best Lecture Award 2012 and 2013 EdX Course AUTONAVx with 20k participants RGB-D SLAM Visual Odometry Large-Scale Reconstruction 3D Printing

My Research Background Visual navigation for mobile robots EdX Course Autonomous Navigation for Flying Robots RoboCup Kinematic Learning Articulated Objects Quadrotors Camera tracking and 3D reconstruction [IROS 12, RSS 13, UAV-g 13, RAS 14] TUM TeachInf Best Lecture Award 2012 and 2013 EdX Course AUTONAVx with 20k participants RGB-D SLAM Visual Odometry Large-Scale Reconstruction 3D Printing

My Research Background Visual navigation for mobile robots RoboCup Kinematic Learning Articulated Objects Quadrotors Camera tracking and 3D reconstruction RGB-D SLAM Visual Odometry Large-Scale Reconstruction 3D Scanning

Motivation Wouldn t it be cool to have a 3D photo booth? Questions: How to scan a person in 3D? How to prepare the model for 3D printing?

Problem Description Setup: Static RGB-D camera, person sitting on a swivel chair Given: A sequence of color and depth images Wanted: Accurate, watertight 3D model

Signed Distance Function (SDF) [Curless and Levoy, 96] D(x) < 0 D(x) = 0 D(x) > 0 Negative signed distance (=outside) Positive signed distance (=inside)

Signed Distance Function (SDF) [Curless and Levoy, 96] Compute SDF from a depth image Measure distance of each voxel to the observed surface Can be done in parallel for all voxels ( GPU) d obs = z I Z (¼(x; y; z)) camera

Signed Distance Function (SDF) [Curless and Levoy, 96] Calculate weighted average over all measurements Assume known camera poses (for now) c n c n 1. c 1 Several measurements of each voxel WD + wd D Ã W + w WC + wc C Ã W + w W Ã W + w

Mesh Extraction using Marching Cubes Find zero-crossings in the signed distance function by interpolation

Estimating the Camera Pose SDF built from the first k frames c k ::: c 1

Estimating the Camera Pose We seek the next camera pose (k+1) c k+ 1 c k ::: c 1

Estimating the Camera Pose KinectFusion generates a synthetic depth image from SDF and aligns it using ICP c k+ 1 c k ::: c 1

Estimating the Camera Pose [Bylow, Sturm, Kerl, Kahl, Cremers; RSS 2013] Our approach: Use SDF directly during minimization c k+ 1 c k ::: c 1

Estimating the Camera Pose [Bylow, Sturm, Kerl, Kahl, Cremers; RSS 2013] Our approach: Use SDF directly during minimization c k+ 1 c k ::: c 1

Estimating the Camera Pose [Bylow, Sturm, Kerl, Kahl, Cremers; RSS 2013] Our approach: Use SDF directly during minimization

Evaluation on Benchmark [Bylow, Sturm, Kerl, Kahl, Cremers; RSS 2013] Thorough evaluation on TUM RGB-D benchmark Comparison with KinFu and RGB-D SLAM Significantly more accurate and robust than ICP Algorithm Resolution Teddy (RMSE) Desk (RMSE) Plant (RMSE) KinFu 256 0.156 m 0.057m 0.598 m KinFu 512 0.337 m 0.068 m 0.281 m Our 256 0.086 m 0.038 m 0.047 m Our 512 0.080 m 0.035 m 0.043 m

Automatically Close Holes [Sturm, Bylow, Kahl, Cremers; GCPR 2013] Certain voxels are never observed in near range Regions with no data result in holes Idea: Truncate weights to positive values high visibility no/poor visibility

Hollowing Out [Sturm, Bylow, Kahl, Cremers; GCPR 2013] Printing cost is mostly dominated by volume Idea: Make the model hollow before after

Video (real-time) [Sturm, Bylow, Kahl, Cremers; GCPR 2013]

Examples of Printed Figures [Sturm, Bylow, Kahl, Cremers; GCPR 2013]

More Examples [Sturm, Bylow, Kahl, Cremers; GCPR 2013] Still need a Christmas present?

How Can Use This for Virtual Shopping? Virtual Try-On of Sunglasses Earrings Target devices: Kiosk applications Smartphones Productification challenges that we faced at Metaio: Dense SDF representation is not efficient High memory consumption Computationally intense (strong GPU needed for real-time processing) Make it so that it always works Initialization / recovery / re-localization Automatic object placement Smartphones do not have depth cameras (yet) Can we do the same with a monocular (user-facing) camera of a smartphone?

Efficient 3D Face Reconstruction How to reduce memory and CPU utilization? Specialized representation for faces Spherical base shape + bump map + 2D texture map Bump Image with Texture Information Bump Image Recording Local Geometry

3D Face Reconstruction using Bump Maps Bump map update takes 2ms on CPU Tracking is currently being ported from ICP to DVO, expected <5ms

3D Face Reconstruction using Bump Maps Reconstruct head while user turns left and right Use 3D geometry as an occlusion model

Physics Simulation Add physics / gravity to the models Unity engine

Initialization and Automatic Recovery How do we know what to reconstruct? 2D face detection on the image How do we initialize the first pose? 3D face pose estimation using random forests How can we recognize tracking loss? Evaluate inlier ratio after ICP step

3D Head Pose Estimation using Random Forests Train a random forest that predicts 3D head pose from depth images Pixel-wise classification head/no-head Head pixels vote for center + head orientation Estimate head center using mean shift and average inlier votes Recorded training dataset: 26 persons with marker on head

Re-initialization and non-frontal initialization

Demo

Can we do the same with a monocular camera? Shape alignment using random regression forests + ridge regression 68 fiducials (landmarks) per face Live demo

Face Alignment Demo

Virtual Shopping: Place in your room IKEA smartphone app Allows customers to place virtual furniture in their room Uses catalog as marker for SLAM & scale initialization How can we improve the shopping experience? Occlusion modelling Light estimation Re-texturing of surfaces 3D room scanning Automatic room measurements

Occlusion Modeling

Re-Texturing of Surfaces

3D Room Scanning

3D Room Scanning

DVO Demo

Sparse inertial visual odometry (SIVO) Kalman filter-based approach IMU is used in prediction step Feature tracks are used in correction step Very robust, efficient, metrically correct

Summary Exciting field Hot topics Dense methods for 3D tracking and reconstruction Deep learning for face detection, pose estimation and alignment of fiducials Approaches for tight IMU integration and data fusion Live demos Virtual try-on of earrings Face alignment 3D room scanning / occlusion demo / re-texturing We re always looking for interesting speakers at our weekly research colloqium We re hiring! Computer Vision Machine Learning Sensor/IMU data fusion http://www.metaio.com/careers/

We re hiring! Metaio Phone (EMEA): +49-89-5480-198-0 Phone (US): +1-415-814-3376 info@metaio.com www.metaio.com http://www.facebook.com/metaio @twitt_ar http://twitter.com/#!/twitt_ar http://augmentedblog.wordpress.com/ http://www.youtube.com/user/metaioar