The KITTI-ROAD Evaluation Benchmark for Road Detection Algorithms
08.06.2014
Jannik Fritsch, Honda Research Institute Europe, Offenbach, Germany
Presented material created together with Tobias Kuehnl, Research Institute for Cognition and Robotics, Bielefeld University, Bielefeld, Germany
Contents
- The KITTI-ROAD Dataset
- Classical Pixel/Cell-based Performance Measures
- Cell-based Evaluation in Metric Space
- Proposal for Behavior-level Evaluation Benchmark
- Behavior-based Evaluation
- Short Discussion of Current Results on Website
- Disadvantages of KITTI-ROAD Data Set
- Summary
8. June 2014, Jannik Fritsch, Honda Research Institute Europe
KITTI: Collection of Several Vision Benchmarks for the Automotive Domain
KITTI-ROAD: A New Benchmark for Road Detection Evaluation
KITTI-ROAD Dataset
- Derived from the KITTI dataset: stereo, Velodyne, and IMU data are available!
- Split into 289 training images (with annotation) and 290 test images
- 3 types of city roads, with annotations for road-area and ego-lane (ego-lane only for UM):
  - UU_ROAD: Urban Unmarked Road
  - UM_ROAD: Urban Marked Road
  - UM_LANE: Ego-lane on UM_ROAD
  - UMM_ROAD: Urban Multiple Marked Road
- Examples used in this talk: evaluation of road-area on the complete URBAN dataset and of ego-lane on UM_LANE
Evaluation Criteria for Road Terrain Detection
- Pixel-level correctness (classical measure): Kang et al. 2011, Alvarez et al. 2008, Wu et al. 2011
- Boundary position and deviation (μ, σ) at one or multiple distances: Zhao et al. 2012, Kuehnl et al. 2011, Serfling et al. 2008
- Unoccupied lane length (distance d up to obstacle): Gumpp et al. 2011
- Corridor width (continuous (μ, σ) or discrete classes): Kuehnl et al. 2012
- Model-based lane shape similarity (clothoid parameter deviations): Gopalan et al. 2012, Gackstaetter 2011
[Figure: classical evaluation metric (perspective) vs. proposed new evaluation metric (BEV, the application-relevant space; abstraction from the exact boundary)]
Classical Pixel/Cell-based Performance Measures
- Pixel-based evaluation measures typically employed: Precision, Recall, and the F-measure derived from them
- Ideal performance: set the threshold TH to the value that maximizes the F-measure (β = 1, i.e. the harmonic mean with equal weight on Precision and Recall)
- Average performance: Average Precision (AP), as used in the PASCAL VOC evaluations
- All measures are applicable to both 1) the perspective image and 2) the metric BEV space
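A minimal sketch of these measures, assuming per-pixel (or per-cell) confidences in [0, 1] and binary ground truth given as flat lists. The function names, the threshold grid, and the 11-point interpolated form of AP (the variant used in the early PASCAL VOC evaluations) are assumptions for illustration, not the benchmark's own evaluation code.

```python
def precision_recall(conf, gt, th):
    """Precision and recall of the decision `conf > th` against binary ground truth."""
    tp = sum(1 for c, g in zip(conf, gt) if c > th and g)
    fp = sum(1 for c, g in zip(conf, gt) if c > th and not g)
    fn = sum(1 for c, g in zip(conf, gt) if c <= th and g)
    precision = tp / (tp + fp) if tp + fp else 1.0  # convention: nothing detected
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def f_measure(precision, recall, beta=1.0):
    """F-measure; beta=1 gives the harmonic mean of precision and recall."""
    denom = beta ** 2 * precision + recall
    return (1 + beta ** 2) * precision * recall / denom if denom else 0.0


def evaluate(conf, gt, n_thresholds=101):
    """Return (F_max, threshold TH that reaches it, 11-point interpolated AP)."""
    thresholds = [i / (n_thresholds - 1) for i in range(n_thresholds)]
    curve = [precision_recall(conf, gt, th) for th in thresholds]
    scores = [f_measure(p, r) for p, r in curve]
    best = max(range(len(thresholds)), key=scores.__getitem__)
    # Interpolated AP: average, over recall levels 0, 0.1, ..., 1, of the best
    # precision reached at that recall level or higher.
    ap = 0.0
    for k in range(11):
        level = k / 10
        reachable = [p for p, r in curve if r >= level - 1e-12]
        ap += max(reachable) if reachable else 0.0
    return scores[best], thresholds[best], ap / 11


if __name__ == "__main__":
    conf = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]  # toy confidences, not benchmark data
    gt = [1, 1, 0, 1, 0, 0]
    print(evaluate(conf, gt))
```

The same functions apply unchanged to perspective pixels or BEV cells; only the input lists differ.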
Comparison Algorithms
1. Baseline (BL): accumulating the ground truth annotations provides a probabilistic spatial prior
   [Figure: BL priors for road-area and ego-lane in perspective and bird's-eye-view (BEV) space]
2. Here: Geometric Context (GC) [Hoiem2007]: segment the image into superpixels and calculate a probability distribution of surface orientations (somewhat unfair, as it detects any planar surface)
3. Online: SPatial RAY features (SPRAY) [Kuehnl2012]: separate classification of appearance and space
4. Online: Convolutional Neural Network (CNN) [Alvarez2012]: combination of offline and online methods
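The baseline in item 1 can be sketched as a per-cell average over binary ground truth masks. The function name and the toy masks are hypothetical; the benchmark's own BL may differ in detail.

```python
def spatial_prior(masks):
    """Per-cell fraction of training annotations that label the cell as road.

    `masks` is a list of equally sized 2D lists with entries 0 (not road)
    or 1 (road); the result can be used directly as a confidence map.
    """
    n = len(masks)
    rows, cols = len(masks[0]), len(masks[0][0])
    return [[sum(m[r][c] for m in masks) / n for c in range(cols)]
            for r in range(rows)]


if __name__ == "__main__":
    # Three toy 2x3 annotations (not benchmark data).
    masks = [
        [[0, 1, 0], [1, 1, 1]],
        [[0, 1, 1], [1, 1, 0]],
        [[0, 0, 0], [1, 1, 1]],
    ]
    for row in spatial_prior(masks):
        print(row)
```

Because the prior ignores the test image entirely, its score indicates how much of the benchmark can be solved by road position alone.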
Image Area for Pixel-based Evaluation
- Metric BEV area covers +/-10 m x 40 m
- Perspective evaluation covers only the image area matching the BEV area (red polygon)
- The same metric area is therefore also evaluated in perspective space, BUT with a different weighting due to the different number of pixels (see pink rectangle)
  - Perspective: near range has high influence, far range has low influence
  - BEV: near and far range have the same influence
[Figure: evaluated area in the perspective image and in BEV; BEV axis ticks at 6, 26, 46 and -10, 0, 10]
- Note: both metric and perspective results are provided here for comparison; the KITTI-ROAD benchmark on the webserver evaluates BEV only
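The different weighting can be illustrated with a flat-ground pinhole model. This is a rough sketch under stated assumptions: the focal length (721.5 px), camera height (1.65 m), and BEV cell size (0.05 m) are approximate, assumed values; the ground is taken as perfectly flat with the optical axis parallel to it, and clipping at the image border is ignored.

```python
def perspective_pixels(z_near, z_far, width, f=721.5, cam_height=1.65):
    """Approximate pixel count of a flat ground strip in the perspective image.

    The strip spans [z_near, z_far] metres ahead and `width` metres laterally.
    At distance z one metre of ground covers f/z pixels laterally and
    f*cam_height/z**2 image rows longitudinally; integrating their product
    over z gives the closed form below.
    """
    return 0.5 * f * f * width * cam_height * (1 / z_near ** 2 - 1 / z_far ** 2)


def bev_cells(z_near, z_far, width, cell=0.05):
    """Number of BEV cells covering the same strip (independent of distance)."""
    return round((z_far - z_near) / cell) * round(width / cell)


if __name__ == "__main__":
    for z_near, z_far in [(6, 16), (36, 46)]:
        print(z_near, z_far,
              round(perspective_pixels(z_near, z_far, 2.0)),
              bev_cells(z_near, z_far, 2.0))
```

Two strips of equal metric size contribute equally in BEV, while the near strip dominates the pixel count in the perspective image.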
Results for Road-area on URBAN Dataset
- BL is surprisingly good due to
  - low traffic density = road is usually free
  - strong lighting variations = hard for vision algorithms
- Differences between algorithms are more pronounced in BEV
- Results emphasize the importance of BEV evaluation
[Figure: results in perspective and bird's-eye-view (BEV) space]
Results for Ego-lane on UM Dataset
- Extremely high accuracy due to many TNs (the ego-lane covers only a small part of the BEV area), but precision values similar to those for road-area
- Again: BL is extremely good
- SPRAY approach is better only in the BEV evaluation
[Figure: results in perspective and bird's-eye-view (BEV) space]
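Why many TNs inflate accuracy but not precision can be seen from a small worked example. The confusion counts below are hypothetical and chosen only for illustration; they are not benchmark results.

```python
def accuracy(tp, fp, tn, fn):
    """Fraction of all cells classified correctly; true negatives count."""
    return (tp + tn) / (tp + fp + tn + fn)


def precision(tp, fp):
    """Fraction of detected cells that are correct; true negatives do not count."""
    return tp / (tp + fp)


if __name__ == "__main__":
    # Hypothetical grid of 320,000 cells in which the ego-lane covers 10 %.
    tp, fn = 24_000, 8_000    # ego-lane cells found / missed
    fp, tn = 8_000, 280_000   # background cells wrongly / correctly labelled
    print("accuracy :", accuracy(tp, fp, tn, fn))
    print("precision:", precision(tp, fp))
```

With these counts a quarter of the ego-lane is missed, yet accuracy stays high because the large background area is almost entirely true negatives.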
Basic Idea of Behavior-based Evaluation
- Single-track model to predict vehicle motion
- Use different steering angles to generate corridor hypotheses
- Obtain a fitness value for each hypothesis by integrating the ego-lane confidences covered by the corridor area (2.2 m width = vehicle width)
- Select the best corridor hypothesis to represent the driveable ego-lane
- Corridor representation abstracts from the pixel-level ego-lane area
[Figure: SPRAY ego-lane confidences, corridor hypotheses with corridor fitness values (0.08, 0.71, 0.21), and the best corridor hypothesis]
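The steps above can be sketched as follows, assuming a kinematic single-track model with constant steering angle and a BEV confidence grid stored as a 2D list (row = distance ahead, column = lateral position). The wheelbase, grid layout, cell size, corridor length, and all function names are hypothetical, and the corridor cross-section is taken along the lateral axis rather than perpendicular to the path, which is only reasonable for small curvatures.

```python
import math


def centerline(steering_angle, wheelbase=2.7, length=30.0, step=0.5):
    """Arc driven with a constant steering angle (radians).

    Returns (x, z) points in metres, x lateral and z forward, sampled at the
    centre of each `step`-long segment of the path.
    """
    kappa = math.tan(steering_angle) / wheelbase  # path curvature
    points = []
    for i in range(int(length / step)):
        s = (i + 0.5) * step  # arc length travelled
        if abs(kappa) < 1e-9:
            points.append((0.0, s))
        else:
            points.append(((1 - math.cos(kappa * s)) / kappa,
                           math.sin(kappa * s) / kappa))
    return points


def corridor_fitness(conf, steering_angle, cell=0.5, x_min=-10.0, width=2.2):
    """Mean ego-lane confidence over the cells covered by the corridor.

    Cells outside the grid count as confidence 0.
    """
    rows, cols = len(conf), len(conf[0])
    covered = set()
    for x, z in centerline(steering_angle, step=cell):
        r = int(z / cell)
        c_lo = math.floor((x - width / 2 - x_min) / cell)
        c_hi = math.floor((x + width / 2 - x_min) / cell)
        covered.update((r, c) for c in range(c_lo, c_hi + 1))
    total = sum(conf[r][c] for r, c in covered
                if 0 <= r < rows and 0 <= c < cols)
    return total / len(covered)


def best_corridor(conf, steering_angles):
    """Steering angle whose corridor hypothesis has the highest fitness."""
    return max(steering_angles, key=lambda a: corridor_fitness(conf, a))
```

A typical use would evaluate a small fan of steering angles, e.g. `best_corridor(conf, [-0.05, 0.0, 0.05])`, and keep the winning corridor as the driveable ego-lane.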
Examples of Corridor Hypotheses on KITTI-ROAD Benchmark
- Note: most of the time the road is straight (exception: rural roads)
- Only a small part of the recorded data/benchmark contains curvy scenes
Characteristics of Resulting Corridor Hypothesis
- Different applications of the parametric corridor are possible:
  - LKAS support on roads with no/bad lane markings
  - Detection of too narrow passages, independent of road boundary type
[Figure: perspective image, BEV, generated corridor hypothesis, evaluation]
- Evaluation issue: the corridor covers only a subset of the ego-lane ground truth area
  - Classical evaluation metric is not fully applicable (large number of irrelevant FNs)
Evaluation for Ego-lane Corridor Hypothesis
- Lateral quality based on precision at different distances
- Longitudinal quality obtained by shrinking the corridor to a single line (avoids a high number of FNs)
  - TP only if there is sufficient overlap in the original pixel representation (2.0 m out of the 2.2 m corridor)
- Evaluation of F1 measure and hit rate (successful match of hypothesis to ground truth)
- Proposal for a new performance measure for behavior-based performance
  (But: evaluation is on single images, whereas ADAS needs temporal integration / events)
Individual Results (UU, UM, UMM) Available Online
Some Thoughts on the Current Results
Choice of sensor data used:
- Mono: only few submissions, partially good results (SPRAY, but needs ext. calib.)
- Stereo: very powerful, but not better than mono (if ignoring color/texture)
  - Planar depth information is not discriminative (SP)
    - roads are not completely flat (small bending for rain drainage!)
    - flat grass next to road
    - gravel & tram rail area
    - low curbs
    - sidewalk at same height but from different material
    - large distances are generally challenging (especially for stereo)
  - Combination with BL helps to avoid some pitfalls (SP+BL)
  - Completely different use of stereo data leads to similar results (RES3D-Stereo)
- Stereo with color/texture
  - Top position, exceeding best mono (ProbBoost)
  - Diverse results at lower positions (HistonBoost, ANN)
- Velodyne: results much better than pure stereo (RES3D-Velo)
  - But not superior to mono, which emphasizes the importance of color/texture
  - Better depth information at large distances
- Note: results vary across individual data sets!
Disadvantages of KITTI-ROAD Benchmark
- Currently only a small amount of data (roughly 300/300 training/test images), and the total recordings available allow only limited expansion of the data set
- KITTI captures only German city roads and a few rural roads
  - very restricted form of roads (US: wider, JP: city roads narrower, ...)
- Exposure control is not perfect; in particular, the road area is sometimes over-exposed
  - Appearance changes are probably more drastic than with modern cameras
- Straight roads are common and enable high performance of a simple spatial prior (BL)
  - BUT: the interesting situations are curves
  - More curve situations are needed, which is difficult with KITTI since most roads are straight (not enough data to create a curve benchmark)
- Currently only polygonal area annotation, so only implicit boundaries
  - Boundary annotations could be added if a common understanding of boundary nature could be reached (see also the paper in the next session after the coffee break)
Summary
- A novel benchmark: the KITTI-ROAD dataset with 3 urban categories
- Online evaluation on webserver:
  - Pixel-based road-area in BEV space
  - Pixel-based ego-lane in BEV space
  - Behavior-based ego-lane corridor hypothesis
- All datasets, Python example code, and results are available at http://www.cvlibs.net/datasets/kitti/eval_road.php
- Although not perfect, it is at least a first attempt at benchmarking road/lane detection
- Please consider applying your road/lane detection algorithms to this benchmark!
- If this benchmark does not fit your needs (only urban roads, no adverse weather, ...), then let's discuss together today the requirements for a better benchmark!
Thank you for your attention!