Point Cloud & Applications
Maurice Fallon
Contributors:
- MIT: Hordur Johannsson and John Leonard
- U. of Salzburg: Michael Gschwandtner and Roland Kwitt
Overview:
- Simulating depth images: dense disparity information
- Efficient image simulation: parallelizing to large arrays of viewports
- Localization using only RGB-D
- Algorithm development and comparison: Kinect Fusion
- Segmentation, object recognition, next-best viewpoint
- Stochastic point cloud alignment
- An alternative: BlenSor, integrated within Blender
Simulating Depth Images
Kinect/RGB-D: dense disparity information; uncertainty is a function of disparity.
Overview:
- Create an OpenGL camera viewport with a Kinect-like calibration: 640x480, fx, fy, cx, cy
- Render a triangle-based model
- Optionally compare to a reference depth image: uses the GL Shading Language, with pyramid-based summation for efficiency
- Read the depth, color and score buffers back from the GPU
- Quantize corresponding to actual sensor disparity (e.g. XX levels)
- Add normally distributed noise (on disparity)
- Optionally convert to a range image or save to a PCD file
[Image: simulated scene, 20 m away]
Alternatives and other noise characteristics: more later.
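The quantize-and-noise steps above can be sketched as follows, assuming the standard disparity = baseline x focal-length / depth relation. The baseline, focal length, noise level and quantization step are illustrative values, not pcl::simulation's actual parameters.

```python
import numpy as np

def add_disparity_noise(depth_m, baseline_m=0.075, focal_px=575.8,
                        sigma_disp=0.5, step=1.0 / 8.0, rng=None):
    """Convert depth to disparity, add Gaussian noise on disparity,
    quantize to the given disparity step, and convert back to depth."""
    rng = np.random.default_rng(0) if rng is None else rng
    disp = baseline_m * focal_px / depth_m                 # disparity in pixels
    disp = disp + rng.normal(0.0, sigma_disp, disp.shape)  # noise on disparity
    disp = np.round(disp / step) * step                    # sensor quantization
    return baseline_m * focal_px / np.maximum(disp, 1e-6)
```

Note that because the noise is applied in disparity space, the resulting depth error grows quadratically with range, matching the sensor behaviour described above.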
Example Depth Image
Parallel Depth Image Simulation
State estimation: traditional vs. stochastic
Kinect localization: visual odometry + depth
Optimizations:
- Triangle-based model is stored on the GPU (8,000 triangles)
- Virtual camera array created using particle poses
- Each scene simultaneously rendered with OpenGL
- Likelihood function evaluated in parallel on the GPU using GLSL; the GL Shading Language can handle the basic arithmetic
- Measurements decimated to 20x15
Results: multi-robot demos; future work
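One way the per-particle viewports could be packed into a single render target is sketched below, using the 20x15 decimation mentioned above. This is an assumed layout for illustration; pcl::simulation's actual tiling may differ.

```python
def viewport_grid(num_particles, view_w=20, view_h=15, tex_w=640, tex_h=480):
    """Pack one small viewport per particle into a single render target.
    Returns an (x, y, w, h) tile for each particle, filling row by row."""
    cols = tex_w // view_w
    tiles = []
    for i in range(num_particles):
        row, col = divmod(i, cols)
        if (row + 1) * view_h > tex_h:
            raise ValueError("render target too small for all particles")
        tiles.append((col * view_w, row * view_h, view_w, view_h))
    return tiles
```

With these sizes a single 640x480 target holds exactly 32x32 = 1024 tiles, which is why 1000 particles can be rendered and scored in one pass.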
pcl::simulation
Input information:
- .ply (currently preferred), .obj, .vtk input, with or without color
- Initial pose
Development:
- Started in Summer 2011
- Targeted for pcl_1.6
Demo applications:
- pcl_sim_viewer: point-and-click simulation using the usual PCL viewer
- pcl_sim_terminal_demo: illustration of the programming interface
pcl::simulation Performance
Test model: 27,840 triangle faces, 13,670 vertices
Different configurations:
- 57.00 Hz: simulation only
- 30.61 Hz: simulation, getpointcloud
- 40.00 Hz: simulation, getpointcloud, writebinary
- 28.50 Hz: simulation, addnoise, getpointcloud, writebinary
Module comparison:
- 31% simulation
- 11% getpointcloud (from GPU)
- 41% addnoise
- 16% writebinary
Overview:
- Simulating depth images: dense disparity information
- Efficient image simulation: parallelizing to large arrays of viewports
- Localization using only RGB-D
- Algorithm development and comparison: Kinect Fusion
- Segmentation, object recognition, next-best viewpoint
- Stochastic point cloud alignment
- An alternative: BlenSor, a simulation add-on for Blender
RGB-D Vision SLAM
Kinect has fueled new interest in vision SLAM:
- 3D maps [Henry et al., ISER 2010]
- Kinect Fusion [Newcombe et al., ISMAR 2011]
A full model requires views of every room, in every direction, at less than 5 m range.
Storage requirements for a building: 100s of GBs (the mesh in the video: 300 MB).
[Images: Henry et al., ISER 2010 (courtesy Raphael Favier, TuE); Newcombe et al., ISMAR 2011]
The Case for Visual Localization
Extreme 3D motion, low lighting, low cost, dynamic scenes vs. current vSLAM capabilities
Real-time. 5 m/s. 1000 particles. Only a Kinect
Simple Input Map
Convert 3D vSLAM output to a planar map representation:
- Scales well: low maintenance, supports loop closures
- Sparse, small file size: 2 MB for the 9-floor MIT Stata Center
Convert points to planes:
- Large indoor planes don't often change
- Can simulate views not seen during SLAM
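Converting vSLAM points to planes can start from a least-squares plane fit like the sketch below (a hypothetical helper, not the system's actual code); in practice a robust method such as RANSAC would first segment the cloud into planar regions.

```python
import numpy as np

def fit_plane(points):
    """Least-squares plane through an (N, 3) point set.
    Returns (unit normal, centroid); the normal is the singular vector
    associated with the smallest singular value of the centered points."""
    centroid = points.mean(axis=0)
    _, _, vt = np.linalg.svd(points - centroid)
    return vt[-1], centroid
```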
Other Approaches to Localization
- Convert to a simulated LIDAR scan: use with LIDAR MCL; limited to 2D with wheel odometry
- 3D visual feature registration: requires a full SfM solution; many failure modes; potentially more accurate; complementary to our research [H. Johannsson]
Maintaining Particle Diversity
Localization must degrade smoothly when:
- The model is incomplete or has changed
- Sensor imagery is blurred or uninformative
- People are moving, occluding the sensor
Efficient belief propagation: 1000s of particles at 10-20 Hz
Particle Filter Overview Propagation Likelihood
Particle Propagation
Visual odometry state vector: propagate using FOVIS (Fast Odometry from VISion) [1]
- 0.08 m/s mean velocity error
When VO fails: add extra noise, drift
Future: IMU integration
[1] A. Huang, A. Bachrach, P. Henry, M. Krainin, D. Maturana, D. Fox, N. Roy. "Visual Odometry and Mapping for Autonomous Flight Using an RGB-D Camera", ISRR 2011. http://code.google.com/p/fovis/
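The propagation step can be sketched for a planar pose as below; the real system propagates a richer state, and the noise magnitudes here are illustrative. The key idea from the slide is preserved: the same VO increment drives every particle, and the noise is inflated when VO reports failure.

```python
import numpy as np

def propagate_particles(particles, vo_delta, vo_ok, sigma_xy=0.02,
                        sigma_theta=0.01, fail_scale=5.0, rng=None):
    """Apply a body-frame VO increment (dx, dy, dtheta) to (N, 3) particle
    poses (x, y, theta) with Gaussian noise; inflate noise when VO failed."""
    rng = np.random.default_rng(0) if rng is None else rng
    scale = 1.0 if vo_ok else fail_scale
    dx, dy, dth = vo_delta
    n = len(particles)
    c, s = np.cos(particles[:, 2]), np.sin(particles[:, 2])
    out = particles.copy()
    out[:, 0] += c * dx - s * dy + rng.normal(0, sigma_xy * scale, n)
    out[:, 1] += s * dx + c * dy + rng.normal(0, sigma_xy * scale, n)
    out[:, 2] += dth + rng.normal(0, sigma_theta * scale, n)
    return out
```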
Particle Propagation
[Images: VO failure modes vs. VO success; particle spread under failed VO]
Particle Filter Overview Propagation Likelihood
Likelihood Function Evaluation Likelihood for pixel i (evaluated on disparity):
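The equation itself appears only as an image on the slide. As a sketch, a common form for this kind of per-pixel disparity likelihood is a Gaussian around the simulated disparity mixed with a uniform floor for robustness to occlusion and model error; this shape and its parameter values are assumptions for illustration, not the paper's exact formula.

```python
import numpy as np

def pixel_likelihood(measured_disp, simulated_disp, sigma=0.5, floor=0.01):
    """Per-pixel likelihood on disparity: Gaussian around the simulated
    value plus a small uniform floor so outlier pixels are not fatal."""
    err = measured_disp - simulated_disp
    gauss = np.exp(-0.5 * (err / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
    return floor + (1.0 - floor) * gauss

def log_likelihood(measured, simulated, sigma=0.5, floor=0.01):
    """Sum of per-pixel log-likelihoods over the decimated 20x15 image."""
    return float(np.sum(np.log(pixel_likelihood(measured, simulated,
                                                sigma, floor))))
```

Working in log space keeps the product of hundreds of per-pixel terms numerically stable, and the uniform floor bounds how much any single bad pixel can penalize a particle.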
Human Portable
Extensive Testing
- 1.2 m/s, 0.48 m
- 1.05 m/s, 0.66 m
- 0.47 m/s, 0.3 m
Paper results are out of date: significant optimization since.
2D works with 10s of particles.
Fast, Cheap and In Control
[Video: 4x real-time]
Localization Summary
Contributions:
- Efficient simulation of 1000s of model views
- Robust localization using only RGB-D
- Extensive testing, including closed-loop
Open source code:
- FOVIS VO library
- Integrated within the Point Cloud Library: pcl::simulation
Future work:
- IMU
- Color [Mason et al., ICRA 2011]
- Bag-of-words visual recovery
- Barometry
- Integration with lifelong vSLAM
Simulating Range Images:
- Dense disparity information
- Efficient image simulation: parallelizing to large arrays of viewports
- Localization using only RGB-D
- Algorithm development and comparison: Kinect Fusion
- Segmentation, object recognition, next-best viewpoint
- Stochastic point cloud alignment
- An alternative: BlenSor, a simulation add-on for Blender
Simulated Data as Input to KinFu
Provide simulated input to PCL's Kinect Fusion algorithm:
- Decouples live sensor input from surface reconstruction
- Useful to new PCL users
- Better debugging with known model ground truth
- Can hypothesize performance limitations: speed of motion, noise, range
Simulated Table Top Newcombe et al. ISMAR 2011
Simulated Table Top
- Camera flies around the table in a halo
- 27,000 triangles
- Constructed by Michael Gschwandtner
Simulation of Color and Depth
KinFu Mesh Reconstruction
Simulating Range Images:
- Dense disparity information
- Efficient image simulation: parallelizing to large arrays of viewports
- Localization using only RGB-D
- Algorithm development and comparison: Kinect Fusion
- Segmentation, object recognition, next-best viewpoint
- Stochastic point cloud alignment
- An alternative: BlenSor, a simulation add-on for Blender
Sensor Simulation Based on Blender
- Additions to internal Blender functions provide efficient access to Blender's raycasting functions
- Python modules encapsulate the sensor-specific behaviour
Supported sensors:
- Rotating LIDAR: Velodyne HDL-64E
- Line LIDAR: Ibeo, SICK
- ToF cameras: SwissRanger, PMD
- Kinect
Features
Emphasis on algorithm verification and test-data creation; no real-time capabilities.
Provides:
- Per-point color, from material settings or from textures (UV-mapped or procedural)
- Per-point object ID, for verification of clustering and segmentation algorithms
- Clean data (ground truth)
- Noisy data (including all supported physical effects)
- Motion data of all, or only a subset of, objects in the scene
- Single scans, or a range of scans with possible animation of the scene
Features (continued)
- Physics simulation through recording of BGE (Blender Game Engine) simulations
- Scan visualization inside BlenSor, integrated with the animation system so scans are shown only with their corresponding frames
- Textures can influence the reflectivity of objects, simulating non-uniform reflectivity
- Scan export: EVD (custom BlenSor file format), PCD (Point Cloud Library format), PGM (16-bit depth maps, currently only for Kinect)
- Settings are stored with the .blend file, increasing reproducibility of research by simplifying the distribution of sample data (only the .blend file needs to be distributed)
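The 16-bit PGM depth-map export mentioned above could be implemented along these lines (a hypothetical helper, not BlenSor's actual writer); the binary P5 format stores 16-bit samples big-endian.

```python
import numpy as np

def write_pgm16(path, depth_mm):
    """Write a 16-bit binary PGM (P5) depth map.
    depth_mm: (H, W) uint16 array of depths, e.g. in millimeters."""
    h, w = depth_mm.shape
    with open(path, "wb") as f:
        f.write(f"P5\n{w} {h}\n65535\n".encode("ascii"))
        f.write(depth_mm.astype(">u2").tobytes())  # big-endian per PGM spec
```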
Advantages & Disadvantages
Cons:
- Simulation is very slow, especially for sensors with many points
- A certain familiarity with Blender is required
Pros:
- Integration of animation/modeling with the simulation; easy to iteratively adjust: model >> simulate >> remodel
- Easy to leverage Blender features for the simulations (physics simulation, mesh tools, procedural textures, ...)
- Dynamic animations can be done with or without custom code: through the animation system of Blender (camera/object movements, armature animation, etc.), or by changing the scene via Python scripts and/or calling the simulation directly from Python
Kinect
- Cast all rays from the projector onto the scene
- Cast rays from the camera to all intersections of the projector rays with the scene
- If a projector ray hits the scene at the same point as a camera ray, it is a valid measurement; otherwise the point is occluded
- Calculate the disparity between the sensor and the projector and quantize it to 1/8th of a pixel
- Recalculate the coordinates based on the quantized disparity
- For every valid measurement a 9x9 window is checked; if enough points are within a certain distance, a weighted average of them is calculated as the final result for this point
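The final windowed step might be sketched as below. For simplicity this uses a plain mean over neighbours within a distance threshold, where BlenSor computes a weighted average; the window size matches the slide, but the thresholds are illustrative.

```python
import numpy as np

def window_filter(depth, win=9, max_dev=0.05, min_support=10):
    """For each valid pixel, average the depths in a win x win window that
    lie within max_dev of the centre value; pixels with too little support
    (or marked invalid as <= 0) are left at zero."""
    h, w = depth.shape
    r = win // 2
    out = np.zeros_like(depth)
    for y in range(h):
        for x in range(w):
            centre = depth[y, x]
            if centre <= 0:  # invalid / occluded pixel
                continue
            patch = depth[max(0, y - r):y + r + 1, max(0, x - r):x + r + 1]
            near = patch[(patch > 0) & (np.abs(patch - centre) < max_dev)]
            if near.size >= min_support:
                out[y, x] = near.mean()
    return out
```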
Roadmap
- Multicore support to speed up simulations
- Improved Kinect simulation (more accurate depth interpolation)
- Generic rotating LIDAR (not only Velodyne)
- Mixed-pixel support
- Refraction
- Partial reflection (mirrors are already supported)
- Spin image export
Example Kinect Scan Model from dmi-3d.net Without advanced reflectivity settings
Example Kinect Scan
Model from dmi-3d.net
With advanced reflectivity settings:
- Reflectivity depends on distance
- The windshield has procedural noise on its reflectivity
Summary

                          BlenSor          pcl::simulation
Dependencies              Blender          None beyond OpenGL
Speed                     1/5 Hz           20s of Hz
Realism                   Higher           Lower
Sensor support            RGB-D & LIDAR    Only RGB-D
Simultaneous multi-view   Possible         Yes
Input models              Wide variety     PLY, VTK, OBJ
Real-time. 5 m/s. 1000 particles. Only a Kinect