FACTS - A Computer Vision System for 3D Recovery and Semantic Mapping of Human Factors


FACTS - A Computer Vision System for 3D Recovery and Semantic Mapping of Human Factors Lucas Paletta, Katrin Santner, Gerald Fritz, Albert Hofmann, Gerald Lodron, Georg Thallinger, Heinz Mayer

2 Human Attention & Environment
Selectively attending to one aspect of the environment
Study of joint attention for communication on objects
Human factors in the context of environments: study of attention, workload, memory, stress, emotion and decision making
Study of wayfinding systems, marketing concepts, usability of user interfaces and products

3 Wearable Eye Tracking Glasses: HD camera

4 Suite of (Wearable) Sensors
Eye Tracking Glasses (SMI ETG): wearable, 30 Hz, binocular
Arousal (Affectiva Q) biosensor: pulse, acceleration, galvanic skin response, temperature, limb motion (6DOF)
Computational audition
Eye Tracker (SMI RED 500): static, 500 Hz, binocular

5 Human Factors Analysis, User Modeling, and Simulation
Wearable multimodal sensing; user interaction & human factors analysis
User model; attention model
Simulation in 3D model; statistical analysis

6 Motivation: 3D Gaze Estimation
Understanding behavior in task-specific environments
Localise real human gaze in the 3D environment
Saliency map on attended infrastructure
VRVis, JR, AIT, 2006

7 Previous Work on 3D Attention Mapping
Munn et al. [ETRA, 2008]: introduced monocular eye-tracking and triangulation of 2D gaze positions of subsequent key frames within the scene video of the eye-tracking system. Reconstructed only single 3D points, without reference to a complete 3D model, achieving an angular error of 3.8° (ours: 0.6°).
Voßkühler et al. [ECEM 2009], Pirri et al. [CVPR 2011]: require a special, not mass-marketed stereo rig in addition to a commercial eye-tracking device. The achieved indoor accuracy is 3.6 cm at 2 m distance to the target (ours: 0.9 cm at the same distance with the proposed workflow). No reference to a 3D model.

8 Workflow: Recovery of 3D Gaze & Semantics

9 3D Model Generation: RGB-D based Map Building
Depth association by means of stereo calibration
Point cloud
Pose trajectory on ground plane

10 3D Model Generation: Methodology
Fully automated 3D model generation
Grabbing RGB-D images of the environment with a Kinect
Performing depth-based visual SLAM using both image and depth information [*]
Reconstruction of a sparse point cloud consisting of 3D feature points
Each feature point is attached to a SIFT descriptor for robust data association during pose estimation
Pose estimation using sliding-window bundle adjustment, minimizing reprojection error and depth discrepancy over 2D-3D correspondences
[*] K. Pirker, G. Schweighofer, M. Rüther, H. Bischof: GPSlam: Marrying Sparse Geometric and Dense Probabilistic Visual Mapping, Proc. 22nd British Machine Vision Conference (BMVC), 2011.
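The reprojection-error residual minimized during pose estimation can be illustrated with a small numpy sketch (not the authors' implementation; the intrinsics `K`, the pose `(R, t)` and the points below are hypothetical toy values):

```python
import numpy as np

def reprojection_error(K, R, t, points_3d, points_2d):
    """Mean reprojection error (pixels): project the 3D feature points with
    pose (R, t) and intrinsics K, then compare to the observed 2D points."""
    cam = R @ points_3d.T + t.reshape(3, 1)   # world -> camera frame, 3 x N
    proj = K @ cam                            # homogeneous image coordinates
    proj = proj[:2] / proj[2]                 # perspective division
    return np.linalg.norm(proj.T - points_2d, axis=1).mean()

# Toy check: with the identity pose, points projected by K itself give zero error.
K = np.array([[500.0, 0, 320], [0, 500.0, 240], [0, 0, 1]])
pts3d = np.array([[0.1, -0.2, 2.0], [0.3, 0.1, 3.0], [-0.2, 0.2, 2.5]])
obs = K @ pts3d.T
obs = (obs[:2] / obs[2]).T
err = reprojection_error(K, np.eye(3), np.zeros(3), pts3d, obs)
```

Bundle adjustment sums this residual (plus the depth-discrepancy term mentioned above) over a sliding window of frames and optimizes the poses jointly.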

11 3D Model Generation: Loop Closing
Loop closure detection through vocabulary tree search: the query frame retrieves potential loop-closing candidates from the vocabulary tree
Returns a probability for each image in the map/tree
A geometric consistency check delivers the candidate frame
Low memory footprint and fast computation time
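A minimal flat bag-of-words scorer conveys the retrieval idea (the actual system uses a hierarchical vocabulary tree; the flat TF-IDF cosine scoring below is a simplification, and all frame ids and visual-word ids are made up):

```python
import math
from collections import Counter

def tfidf_vectors(frames):
    """frames: dict frame_id -> list of visual-word ids. Returns tf-idf dicts."""
    n = len(frames)
    df = Counter(w for words in frames.values() for w in set(words))
    vecs = {}
    for fid, words in frames.items():
        tf = Counter(words)
        vecs[fid] = {w: (c / len(words)) * math.log(n / df[w]) for w, c in tf.items()}
    return vecs

def cosine(a, b):
    dot = sum(v * b.get(w, 0.0) for w, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def loop_candidates(frames, query, top_k=2):
    """Rank map frames by visual-word similarity to the query frame."""
    vecs = tfidf_vectors({**frames, "query": query})
    q = vecs.pop("query")
    scores = sorted(((cosine(q, v), fid) for fid, v in vecs.items()), reverse=True)
    return [fid for _, fid in scores[:top_k]]
```

The top-ranked candidates would then pass through the geometric consistency check before a loop closure is accepted.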

12 3D Model Generation: Dense Model
For human attention analysis and realistic surface reconstruction, a dense environment model is constructed afterwards
Uses probabilistic occupancy grid mapping: every depth image is inserted into the voxel space, following the pyramidal approach presented in [*]
Real-time performance using a GPU implementation
Surface reconstruction is handled by the standard marching cubes algorithm [**]
[*] K. Pirker, G. Schweighofer, M. Rüther, H. Bischof: Fast and Accurate Environment Modeling using Three-Dimensional Occupancy Grids, Proc. 1st IEEE/ICCV Workshop on Consumer Depth Cameras for Computer Vision, 2011.
[**] W. E. Lorensen, H. E. Cline: Marching Cubes: A High Resolution 3D Surface Construction Algorithm, Computer Graphics, vol. 21, 1987, pp. 163-169.
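The per-frame insertion into the voxel space can be sketched as a log-odds hit update (a deliberate simplification: the paper's pyramidal GPU scheme and the free-space carving along each ray are omitted, and the grid/voxel sizes are arbitrary):

```python
import numpy as np

def insert_depth_points(grid, origin, voxel_size, points, hit=0.85):
    """Insert one depth frame's 3D points into a log-odds occupancy grid.
    grid: 3D float array of log-odds; origin: world position of grid[0, 0, 0].
    Only endpoint (hit) updates are shown; free-space updates along each
    viewing ray would decrement the traversed voxels in the same way."""
    idx = np.floor((points - origin) / voxel_size).astype(int)
    inside = np.all((idx >= 0) & (idx < np.array(grid.shape)), axis=1)
    for i, j, k in idx[inside]:
        grid[i, j, k] += hit                 # accumulate evidence for "occupied"
    return grid

# One depth point at (0.55, 0.25, 0.35) m lands in voxel (5, 2, 3).
grid = insert_depth_points(np.zeros((10, 10, 10)), np.zeros(3),
                           0.1, np.array([[0.55, 0.25, 0.35]]))
```

Once the grid has accumulated enough frames, marching cubes extracts the isosurface at a fixed occupancy threshold to obtain the dense mesh.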

13 Result: 3D Model

14 Image based Pose Estimation: Matching Process
Matching ETG frames against the point cloud results in a pose for every ETG frame

15 Image based Pose Estimation [**]
Estimate the user's pose within the previously reconstructed area
The sparse three-dimensional point cloud and its SIFT keypoints form the matching model
ETG 2D image descriptors are matched against those in the 3D point cloud (global/local)
Pose estimation through a perspective n-point algorithm [*]
RANSAC is used to eliminate matching outliers
[*] Lepetit V., Moreno-Noguer F., Fua P.: EPnP: An Accurate O(n) Solution to the PnP Problem, International Journal of Computer Vision, pp. 155-166, 2009.
[**] Santner, K., Paletta, L., Fritz, G., Mayer, H.: Visual Recovery of Saliency Maps from Human Attention in 3D Environments, Proc. ICRA 2013.
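The RANSAC outlier rejection follows the standard hypothesize-and-verify loop. A generic sketch with a pluggable solver is shown below; in the pipeline `fit` would be an EPnP solve on minimal sets of 2D-3D correspondences, while the 1D-offset toy model here merely demonstrates the loop:

```python
import random

def ransac(data, fit, residual, min_samples, threshold, iters=200, seed=0):
    """Generic RANSAC: repeatedly fit a model to a minimal random sample and
    keep the model with the largest inlier set, then refit on all inliers."""
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(iters):
        model = fit(rng.sample(data, min_samples))
        inliers = [d for d in data if residual(model, d) < threshold]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    return fit(best_inliers), best_inliers

# Toy demonstration: robustly estimate a 1D offset despite two gross outliers.
def fit_offset(pairs):
    return sum(y - x for x, y in pairs) / len(pairs)

def offset_residual(model, pair):
    x, y = pair
    return abs((y - x) - model)

data = [(x, x + 5.0) for x in range(20)] + [(0, 100.0), (1, -50.0)]
offset, inliers = ransac(data, fit_offset, offset_residual,
                         min_samples=2, threshold=0.5)
```

The same skeleton applies unchanged to PnP: the residual becomes the reprojection error of a 3D point under the hypothesized pose.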

16 Image based Pose Estimation: Issues
200 out of 2200 poses could not be estimated (~90% coverage)
Causes: few image feature points (textureless areas) and rapid head movements (motion blur)

17 6 DOF Reconstruction of Human Gaze
Given the estimated camera pose: intersection of the viewing ray with the dense environment model
Fast interference detection using an oriented bounding box (OBB) tree [*]
[*] Gottschalk S., Lin M. C., Manocha D.: OBB-Tree: A Hierarchical Structure for Rapid Interference Detection, Proc. 23rd Annual Conference on Computer Graphics and Interactive Techniques, 1996.
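After OBB-tree culling narrows the search to a few mesh triangles, the viewing ray is intersected with each of them. The standard Möller–Trumbore test (a textbook routine, not the authors' code) can be sketched as:

```python
import numpy as np

def ray_triangle_intersect(orig, direc, v0, v1, v2, eps=1e-9):
    """Möller–Trumbore ray/triangle test: returns the 3D hit point of the
    viewing ray with one mesh triangle, or None if there is no hit."""
    e1, e2 = v1 - v0, v2 - v0
    p = np.cross(direc, e2)
    det = e1 @ p
    if abs(det) < eps:
        return None                      # ray parallel to the triangle plane
    inv = 1.0 / det
    s = orig - v0
    u = (s @ p) * inv                    # first barycentric coordinate
    if u < 0 or u > 1:
        return None
    q = np.cross(s, e1)
    v = (direc @ q) * inv                # second barycentric coordinate
    if v < 0 or u + v > 1:
        return None
    t = (e2 @ q) * inv                   # distance along the ray
    return orig + t * direc if t > eps else None

# Gaze ray down the z-axis hitting a triangle lying in the z = 2 plane.
hit = ray_triangle_intersect(np.zeros(3), np.array([0.0, 0, 1]),
                             np.array([-1.0, -1, 2]), np.array([2.0, -1, 2]),
                             np.array([-1.0, 2, 2]))
```

The nearest hit point over all candidate triangles is the 3D point of regard.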

18 Reconstruction of Human Gaze

19 Reconstruction of Human Gaze

20 Precision of Gaze Mapping
Angular error: max. 0.6°
Euclidean error: max. 1.1 cm
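The two error measures can be computed as follows (a straightforward sketch; the metre/centimetre input units are assumptions, not stated in the slides):

```python
import numpy as np

def angular_error_deg(v_est, v_true):
    """Angle in degrees between an estimated and a reference gaze direction."""
    cos = np.dot(v_est, v_true) / (np.linalg.norm(v_est) * np.linalg.norm(v_true))
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))  # clip guards rounding

def euclidean_error_cm(p_est, p_true):
    """Distance in cm between estimated and reference 3D gaze points (inputs in metres)."""
    return 100.0 * np.linalg.norm(np.asarray(p_est, float) - np.asarray(p_true, float))
```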

21 Continuous Estimation of 3D Attention

22 Large 3D Model

23 Mapping of Gaze and Arousal in Large Environments
3D attention: shop

24 Attention Guided Behaviors: Exploration and Visual Search

25 ROIs for Visual Search
Detection of regions (= objects) of interest (ROIs)
Annotation in 2D
Annotation in 3D

26 Towards Cognition from Attention Mapping
A dwell is a series of consecutive gaze points of regard (PORs) within a region of interest (ROI)
Dwell times on a ROI indicate conscious processing of object information (e.g., ROI #1)
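Extracting dwell times from a POR stream reduces to a run-length pass over ROI membership (a sketch; the `min_fixations` threshold is an illustrative assumption, while 30 Hz is the ETG rate given earlier):

```python
def dwell_times(pors, in_roi, min_fixations=3, dt=1.0 / 30):
    """Extract ROI dwell durations (seconds) from a gaze sample stream.
    pors: sequence of points of regard; in_roi(por) -> bool tests membership;
    dt: sampling period (30 Hz for the SMI ETG). A dwell is a run of at least
    `min_fixations` consecutive samples inside the ROI."""
    dwells, run = [], 0
    for por in pors:
        if in_roi(por):
            run += 1
        else:
            if run >= min_fixations:
                dwells.append(run * dt)
            run = 0
    if run >= min_fixations:          # close a dwell still open at stream end
        dwells.append(run * dt)
    return dwells
```

With the gaze already mapped into 3D, `in_roi` can be a containment test against the annotated 3D ROI volume rather than a 2D image region.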

27 Related Work: Context of the FACTS System
Eye-tracking videos with computer vision / multisensor analysis applied to:
Driver analysis: driver distraction analysis
Usability engineering: mobile user behavior analysis
User modeling: eye contact behavior analysis

28 Related Work: Driver Distraction Analysis
Driver with Eye Tracking Glasses
Gaze tracked with optical flow analysis
Projection onto reference images
Collective saliency map onto the environment
Time analysis

29 Related Work: Mobile User Behavior Analysis
Localisation of the smartphone in eye-tracking videos
Attention on display vs. environment
Marker-free tracking of the smartphone
Saliency mapping on the display image capture (rectified)
Behavior analysis: smartphone eye-tracking, smartphone saliency mapping

30 Related Work: Eye Contact Behavior Analysis
Eyben, Schuller, Paletta, et al., submitted to IEEE Pervasive Computing 2013
UAR = unweighted average recall, AUC = area under the ROC curve

subject   A      B      C      D      mean
UAR       70 %   67 %   65 %   68 %   67.4 % ± .02
AUC       77 %   71 %   68 %   78 %   73.2 % ± .05

31 System Components

32 Summary & Conclusions
Summary:
Recovery of 3D gaze: automated reconstruction of a 3D model; automated mapping of gaze into the 3D model
Full recovery of semantic analysis (in the frame of ROIs)
System approach for various applications
Future work:
Multisensor positioning (accelerometer, vision)
Computational attention model using 3D information

Thank you for your attention
Dr. Lucas Paletta, +43 664 602 876 1769, lucas.paletta@joanneum.at
JOANNEUM RESEARCH Forschungsgesellschaft mbH, Institute for Information and Communication Technologies
www.joanneum.at/digital