COMPUTER VISION FOR ROBOT NAVIGATION
Sanketh Shetty, Computer Vision and Robotics Laboratory, University of Illinois Urbana-Champaign


MEET THE ROBOTS

WHAT DOES A ROBOT CARE ABOUT WHEN NAVIGATING?
- Current location and destination (possibly intermediate tasks and goals)
- Minimizing damage to self
- Minimizing damage to the environment
- Maximizing efficiency (energy spent vs. distance travelled)
- Knowing what objects it can interact with
- Knowing how it can interact with its environment
- Learning about its operating environment (e.g. map building)

WHERE CAN VISION HELP?
- Robot localization
- Obstacle avoidance
- Mapping (determining navigable terrain)
- Recognizing people and objects
- Learning how to interact with (e.g. grasp) objects

CASE STUDY: DARPA GRAND CHALLENGE
- Stanley, a VW Touareg.
- Learned discriminative machine learning models on laser range-finder data to separate navigable terrain from rough, obstacle-filled terrain.
- The laser-based terrain classifier's short sensing range made it a speed bottleneck.
- Used vision to extend the path-planning horizon. (video)

TODAY'S PAPERS
- Vision based Robot Localization and Mapping using Scale Invariant Features, S. Se et al. (ICRA 2001)
- High Speed Obstacle Avoidance Using Monocular Vision and Reinforcement Learning, Michels et al. (ICML 2005)
- Opportunistic Use of Vision to Push Back the Path Planning Horizon, Nabbe et al. (IROS 2006)

Vision based Robot Localization and Mapping using Scale Invariant Features
Goal: Simultaneous localization and map building using stable visual features, evaluated in an indoor environment.
Prior work used laser scanners and range finders for SLAM, which suffer from limited range and give an unsatisfactory description of the environment.
Why do we care about building maps?

SYSTEM DESCRIPTION
- Obtain 3 images from the Triclops camera.
- Detect SIFT key-points (an OpenCV sketch follows).
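As a minimal sketch of the detection step, the snippet below runs OpenCV's SIFT detector on a single grayscale frame. This stands in for the paper's Triclops-based pipeline rather than reproducing it; the image file name is hypothetical.

```python
# Minimal sketch: SIFT key-point detection with OpenCV (>= 4.4),
# standing in for the paper's Triclops pipeline. File name is hypothetical.
import cv2

img = cv2.imread("triclops_left.png", cv2.IMREAD_GRAYSCALE)
sift = cv2.SIFT_create()
keypoints, descriptors = sift.detectAndCompute(img, None)

# Each key-point carries position, scale, and orientation, the very
# attributes the stereo-matching stage below filters on.
print(len(keypoints), "key-points; descriptor array shape:", descriptors.shape)
```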

STEREO MATCHING OF KEY-POINTS
Match key-points across the 3 images using the following criteria:
- Epipolar constraints
- Disparity
- Scale and orientation
Prune ambiguous matches. For the stable points, calculate (X, Y, Z) world coordinates from disparity and camera parameters (see the triangulation sketch below).
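To make the last step concrete, here is the textbook disparity-to-depth relation for a rectified pinhole stereo pair. The focal length f, baseline B, and principal point (cx, cy) come from calibration; this is the standard geometry the slide relies on, not the authors' exact code.

```python
import numpy as np

def triangulate(u, v, d, f, B, cx, cy):
    """Rectified-stereo triangulation: pixel (u, v) with disparity d
    (pixels), focal length f (pixels), baseline B (metres), and
    principal point (cx, cy)."""
    Z = f * B / d          # depth is inversely proportional to disparity
    X = (u - cx) * Z / f   # back-project through the pinhole model
    Y = (v - cy) * Z / f
    return np.array([X, Y, Z])
```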

EGO-MOTION ESTIMATION
- Initialize the transformation between two frames from odometry data.
- Use least squares to find a correction to the transformation matrix that better projects points from one frame to the next (a stand-in sketch follows).
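The objective behind that correction is to minimize the sum of ||R p_i + t - q_i||^2 over matched 3D points across the two frames. A common closed-form solution is the SVD-based alignment sketched below; the paper instead linearizes around the odometry seed, so treat this as an illustrative stand-in.

```python
import numpy as np

def rigid_align(P, Q):
    """Least-squares rigid transform (R, t) mapping points P to Q, both
    (N, 3) arrays of matched 3D key-points. Standard SVD (Procrustes)
    solution; an illustrative stand-in for the paper's linearized
    correction to the odometry estimate."""
    cP, cQ = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cP).T @ (Q - cQ)
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T     # the diagonal D guards against reflections
    t = cQ - R @ cP
    return R, t
```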

LANDMARK TRACKING
- Each tracked SIFT landmark is indexed by (X, Y, Z, scale, orientation, L).
- 4 types of landmark points are identified.
- Details: track initiation, track termination, field of view.
- Only points with Z > 0 and within the Triclops' 60-degree viewing angle are considered.
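A sketch of what one database entry could look like, mirroring the slide's indexing. The miss counter for track termination and the 30-degree half-angle reading of the 60-degree cone are assumptions for illustration.

```python
import math
from dataclasses import dataclass

@dataclass
class Landmark:
    """One database entry mirroring the slide's
    (X, Y, Z, scale, orientation, L) indexing."""
    X: float
    Y: float
    Z: float
    scale: float
    orientation: float
    misses: int = 0  # assumed: consecutive unmatched frames, for track termination

def in_field_of_view(lm: Landmark, half_angle_deg: float = 30.0) -> bool:
    """Slide's criterion: Z > 0 and within the 60-degree viewing cone,
    interpreted here as a 30-degree half-angle off the optical axis."""
    off_axis = math.degrees(math.atan2(math.hypot(lm.X, lm.Y), lm.Z))
    return lm.Z > 0 and off_axis < half_angle_deg
```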

RESULTS & DISCUSSION
The authors add heuristics to make SLAM robust:
- A viewing-angle heuristic.
- Determining permanent landmarks.
- Improving the tracking of key-points' 3D positions using Kalman filtering (a minimal filter is sketched below).
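For the Kalman-filtering point, a minimal per-landmark filter is sketched below, assuming a static landmark (identity dynamics, identity measurement model) and isotropic noise; the noise magnitudes are placeholders, not the authors' values.

```python
import numpy as np

class PointKF:
    """Minimal Kalman filter for a static 3D landmark: the state is its
    (X, Y, Z) position and each new stereo measurement shrinks the
    covariance. Noise magnitudes here are illustrative placeholders."""
    def __init__(self, z0, p0=1.0, r=0.05):
        self.x = np.asarray(z0, dtype=float)  # state estimate
        self.P = np.eye(3) * p0               # state covariance
        self.R = np.eye(3) * r                # measurement noise (assumed)

    def update(self, z):
        S = self.P + self.R                   # innovation covariance (H = I)
        K = self.P @ np.linalg.inv(S)         # Kalman gain
        self.x = self.x + K @ (np.asarray(z, dtype=float) - self.x)
        self.P = (np.eye(3) - K) @ self.P
        return self.x
```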

DISCUSSION
"This is the bootstrapping problem, where two models each require the other to be known before either can proceed. In this case, they present the interplay between localization over time and obtaining a map of the environment." - Ian
"As noted by the authors, the processing time is dominated by SIFT feature extraction. In such an environment we probably don't need as strong a feature as the original SIFT; we could reduce the number of bins or cells to make it quicker." - Gang
"I think finding the interest points is the most time-consuming part. SIFT descriptor extraction is much faster, and we could even learn a quick approximate mapping for it." - Jianchao
"Processing time for SIFT feature extraction is always considerable. However, maintaining a database of SIFT landmarks before robot navigation could be really useful. Even in disaster-management scenarios, first responders use maps to get into a building. So why not facilitate navigation by precompiling SIFT landmark databases for critical infrastructure?" - Mani

High Speed Obstacle Avoidance Using Monocular Vision and Reinforcement Learning
Goal: Control a remote-controlled car driven at high speed through an unstructured environment. Visual features are used as inputs to the control algorithm (RL).
Novelty:
- Use monocular visual cues to estimate object depth (similar ideas to Make3D, Saxena et al. 2008).
- Use computer graphics to generate copious amounts of data to train the algorithm on.
Stereo gave them poorer results: limited range, and holes in the scene with no depth estimates.
"I am just wondering why researchers are interested in recovering 3D information from a single image. Even humans cannot estimate depth well using a single eye. Maybe use more calibrated cameras with SIFT mapping?" - Jianchao

POSSIBLE MONOCULAR CUES
- Texture & color (Gini & Marchi 2002, Shao et al. 1988)
- Texture gradient
- Linear perspective
- Occlusion
- Haze
- Defocus

SYSTEM DESCRIPTION
Linear regression on texture features predicts the depth of the nearest obstacle in each stripe.

COMPOSITION OF FEATURE VECTOR
- Divide each scene into 16 stripes.
- Each stripe is labeled with the (log) depth of its nearest obstacle.
- Divide each stripe into 11 overlapping windows, over which texture energies and texture gradients are computed (sketched below).
- Augment each stripe's features with the features of adjacent stripes.
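A sketch of this layout under stated assumptions: 16 vertical stripes, 11 half-overlapping windows per stripe, and a small Laws'-mask bank for the texture energies. The paper's full mask bank, its texture-gradient terms, and the adjacent-stripe augmentation are simplified away here.

```python
import numpy as np
from scipy.signal import convolve2d

# Laws' 1-D kernels; outer products give the classic 2-D texture masks.
L5 = np.array([1, 4, 6, 4, 1], float)     # level
E5 = np.array([-1, -2, 0, 2, 1], float)   # edge
S5 = np.array([-1, 0, 2, 0, -1], float)   # spot

def stripe_features(gray, n_stripes=16, n_windows=11):
    """Texture-energy features per stripe; gray is a float grayscale
    image. The 50%-overlap window geometry is an assumption."""
    H, W = gray.shape
    masks = [np.outer(a, b) for a in (L5, E5, S5) for b in (L5, E5, S5)]
    energies = [np.abs(convolve2d(gray, m, mode="same")) for m in masks]
    feats = []
    for s in range(n_stripes):
        x0, x1 = s * W // n_stripes, (s + 1) * W // n_stripes
        win_h = 2 * H // (n_windows + 1)  # half-overlapping windows
        row = []
        for w in range(n_windows):
            y0 = w * win_h // 2
            y1 = min(y0 + win_h, H)
            row.extend(e[y0:y1, x0:x1].sum() for e in energies)
        feats.append(row)
    return np.array(feats)  # (16 stripes, 11 windows x 9 masks = 99)
```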

TRAINING
- Synthetic graphics data is used to train the system on a number of plausible environments.
- Real images + laser scan data are also used.
- Linear regression, SVR, and robust regression models are learned (a scikit-learn sketch follows).
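A hedged sketch of the model-fitting step with scikit-learn, using random placeholder arrays where the slide's synthetic-graphics and laser-labelled data would go. HuberRegressor stands in for "robust regression" (the paper's exact robust loss may differ), and the 99-dimensional features match the stripe sketch above.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, HuberRegressor
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = rng.random((1000, 99))                # placeholder stripe feature vectors
y = np.log(rng.uniform(1.0, 50.0, 1000))  # placeholder log-depth labels

# Fit the three model families named on the slide and compare fits.
for model in (LinearRegression(), SVR(), HuberRegressor(max_iter=500)):
    model.fit(X, y)
    print(type(model).__name__, "train R^2 =", round(model.score(X, y), 3))
```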

CONTROL ALGORITHM (RL)
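The controller itself appears only as a figure in the talk. As a toy stand-in for the interface between depth estimates and steering, the rule below heads toward the stripe with the largest predicted clearance; the actual controller's parameters were learned with reinforcement learning (PEGASUS policy search) in simulation, so none of these numbers are the paper's.

```python
import numpy as np

def steer(predicted_log_depths, gain=0.5):
    """Toy steering rule: turn toward the stripe whose predicted
    obstacle distance is largest. Assumes at least two stripes; the
    gain is an arbitrary illustrative constant."""
    n = len(predicted_log_depths)
    best = int(np.argmax(predicted_log_depths))
    # Map the stripe index to a command in [-1, 1] (full left to full right).
    return gain * (2.0 * best / (n - 1) - 1.0)
```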

SAMPLE VIDEO

EXPERIMENTAL RESULTS
- Texture energies and texture gradients work better together.
- Harris and Radon features gave comparable performance.
- Performance improved with increasing complexity of the graphics, except when shadows and haze were added.

QUESTIONS & COMMENTS
Why do they use log(depth) here when they regress on depth in the Make3D paper? Why does a linear predictor on log(depth) work so well using texture features?
"The layout information recovered by Michels et al. (2005) seems like it would be incredibly useful for the map generation required by Se and colleagues (2001). By attaching SIFT features to approximated 3D locations, map creation can begin before ego-motion estimation." - Eamon

Opportunistic Use of Vision to Push Back the Path Planning Horizon
Goal: Overcome the myopic-planning effect through early detection of faraway obstacles and determination of navigable terrain. Rough geometric estimates of the environment from monocular data are used for path planning.
Applications: outdoor robotic navigation.

PLANNING VIEWPOINTS

SAMPLE RUN
ATRV-JR + FireWire camera + SICK laser range finder

COMPARISON OF DIFFERENT SENSING STRATEGIES

DISCUSSION
- Knowledge of the ground plane is important (laser scanning data can be used here).
- Performance should improve if the system is trained on images the robot is likely to see (e.g. the data used by Michels et al.).
- Training task-specific categories (e.g. road vs. rough vs. grass vs. trees) should improve navigation performance.

FINAL QUESTIONS
- Can stereo be totally ignored? How can stereo cues be integrated to improve planning? Hadsell et al. (IROS 2008) train deep belief networks on image data with stereo supervision.
- Do we have a satisfactory explanation for why linear predictors of depth based on texture features work?
- What are effective strategies for collecting data to train these robots?

DONE!