A Genetic Algorithm-Evolved 3D Point Cloud Descriptor




Dominik Wȩgrzyn and Luís A. Alexandre
IT - Instituto de Telecomunicações, Dept. of Computer Science, Univ. Beira Interior, 6200-001 Covilhã, Portugal

Abstract. In this paper we propose a new descriptor for 3D point clouds that is fast when compared to others with similar performance, and whose parameters are set using a genetic algorithm. The idea is to obtain a descriptor that can be used on simple computational devices that have no GPUs or high computational capabilities, while also avoiding the usual time-consuming task of determining the optimal parameters for the descriptor. Our proposal is compared with similar algorithms from the publicly available Point Cloud Library (PCL [1]). We perform a comparative evaluation on 3D point clouds using both object and category recognition performance. Our proposal presents performance comparable to similar algorithms, but is much faster and requires less disk space.

1 Introduction

The current cheap depth+RGB cameras, such as the Kinect and the Xtion, have increased the interest in 3D point cloud acquisition and processing. One of the key steps when processing this type of data is the descriptor, which enables a compact representation of a region of a point cloud. Although several descriptors are already available [1, 2], the motivation for this work was two-fold: first, many of the available descriptors are computationally demanding, which makes them difficult to use on computationally restricted devices; second, all descriptors require the adjustment of one or more parameters, which is usually done with a grid search or a similar, potentially lengthy, process. The aim of this work is to design a descriptor that is computationally simple, and hence fast, and that has its parameters obtained using a genetic algorithm (GA), so as to address the two points raised above.
Section 2 presents the pipeline used in this work, from the input clouds to the matching stage. Section 3 explains the ideas behind the proposed descriptor. The following section describes the use of the GA with our descriptor. Section 5 presents the results of the new descriptor and compares them with those of similar available descriptors. The final section contains the conclusion and possible future work.

2 3D Object Recognition Pipeline

The proposed descriptor uses both shape and color information, and histograms are used to represent this information.

We acknowledge the financial support of project PEst-OE/EEI/LA0008/2013.

First, keypoints are extracted from the input clouds in order to reduce the cost of computing the histograms. The keypoint cloud represents the input cloud by containing only a subset of the original points, so that increased processing speed can be achieved. After computing the keypoints we find the normals of both the input and the keypoint clouds. The normals are used in the calculation of the shape histograms that will be part of the final descriptor, as described below. The keypoint cloud is obtained from the input cloud using a VoxelGrid [1] with a leaf size of 2 cm.

The second part of the descriptor consists of color information. For this purpose the RGB colors are transformed into HSV. This model is used because, by keeping only the H and S channels of the HSV color space, we obtain illumination invariance in the color representation. We create another histogram for the color component using the hue and saturation channels.

For the matching process the input cloud descriptor is compared against the stored descriptors of known objects, using a set distance function. The object recognition pipeline is presented in figure 1.

Fig. 1. Object recognition pipeline: input cloud, keypoint extraction, normals extraction, color extraction, descriptor extraction, and matching against the object database.
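As a rough illustration of the VoxelGrid downsampling step (our own sketch, not the PCL implementation; function and variable names are our own, and the 2 cm leaf size matches the one used here), each occupied voxel is replaced by the centroid of its points:

```python
import numpy as np

def voxel_grid_keypoints(points, leaf_size=0.02):
    """Downsample a cloud by replacing the points that fall into each
    cubic voxel of side leaf_size (2 cm in the paper) with their centroid."""
    points = np.asarray(points, dtype=float)
    voxel_idx = np.floor(points / leaf_size).astype(np.int64)
    # Group points by voxel; np.unique sorts the voxels lexicographically.
    _, inverse = np.unique(voxel_idx, axis=0, return_inverse=True)
    inverse = inverse.reshape(-1)
    n_voxels = inverse.max() + 1
    sums = np.zeros((n_voxels, 3))
    counts = np.zeros(n_voxels)
    np.add.at(sums, inverse, points)   # accumulate per-voxel coordinate sums
    np.add.at(counts, inverse, 1)      # and per-voxel point counts
    return sums / counts[:, None]

cloud = np.array([[0.0, 0.0, 0.0],
                  [0.005, 0.005, 0.0],  # same 2 cm voxel as the first point
                  [0.05, 0.0, 0.0]])    # a different voxel
keypoints = voxel_grid_keypoints(cloud)  # two keypoints remain
```

PCL's pcl::VoxelGrid filter behaves analogously: every occupied voxel of the grid contributes one output point, the centroid of the points it contains.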

3 Descriptor

3.1 Regions around the keypoints

The descriptor contains two parts: one to represent the shape and another to represent the color. The data are collected from two regions around each keypoint: the first is a disk with radius R1, and the second is a ring obtained by removing that disk from the disk of radius R2. This approach was proposed in [3]. The two regions are illustrated in figure 2.

Fig. 2. The two concentric regions used to collect data around a keypoint: a disk of radius R1 (region 1) and a ring between R1 and R2 (region 2).

The advantage of using disk and ring regions is that they make it possible to analyze two areas around the keypoint when creating the histograms. They separate the information between points that are very close to the keypoint and points that are further away from it, yielding a better representation of the region around the keypoint.

3.2 Shape information

After computing the keypoint clouds and all the normals, for each keypoint we search for all of its neighbors inside region 1. This search is done in the original input cloud, using the PCL utility KdTreeFLANN [1]. The next step consists in finding the angle between the normal of each neighbor and the normal of the keypoint. This gives us information about the object's shape in the keypoint's neighborhood. Equation 1 shows how the angle is calculated (the angle is used in degrees):

angle = arccos( (Normal_keypoint · Normal_neighbor) / (|Normal_keypoint| |Normal_neighbor|) )    (1)
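Equation 1 can be sketched in a few lines (our own illustrative code, not the authors'; the clamp is an addition that guards against floating-point round-off pushing the cosine slightly outside [−1, 1]):

```python
import numpy as np

def normal_angle_deg(n_keypoint, n_neighbor):
    """Angle in degrees between two normals, as in equation 1."""
    n1 = np.asarray(n_keypoint, dtype=float)
    n2 = np.asarray(n_neighbor, dtype=float)
    cos_a = np.dot(n1, n2) / (np.linalg.norm(n1) * np.linalg.norm(n2))
    # Guard against round-off before taking the arccos.
    return np.degrees(np.arccos(np.clip(cos_a, -1.0, 1.0)))

angle = normal_angle_deg([0.0, 0.0, 1.0], [0.0, 1.0, 0.0])  # perpendicular normals
```

Since only the direction of the normals matters, the division by the norms makes the result independent of whether the normal estimation returns unit-length vectors.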

We use a histogram to count the occurrences of the angles in the keypoint's neighborhood. The incremented bin is found using equation 2, where shape_bins is the total number of histogram bins:

bin = angle × (shape_bins − 1) / 360    (2)

After all the angles for the points in a keypoint's neighborhood have been found, the histogram is normalized to sum 1. The process just described is also done for region 2, yielding a second shape histogram at the same keypoint.

3.3 Color information

The color histogram contains the saturation and hue information. Again we look at the neighbors of a keypoint for their hue, H, and saturation, S, values and select a bin_h and a bin_s using equations 3 and 4, where the total number of bins is m²:

bin_h = H × m / 360    (3)

bin_s = S × m / 100    (4)

To find the color bin to be incremented in the color histogram, we use the coordinates (bin_h, bin_s) in equation 5:

color_bin = m × bin_h + bin_s    (5)

The color histogram is normalized to sum 1. As in the case of the shape information, the process is repeated for region 2, yielding a second normalized color histogram. The two histograms for regions 1 and 2 are concatenated, yielding a single color histogram with a total number of bins equal to 2m².

3.4 Cloud distance

To find the matching objects we need to compute the distance between two clouds. To get the distance between the shape histograms of two clouds, we first compute the centroid of the shape histograms of each cloud (c1 and c2); then, using the chi-squared distance [4] (equation 6), we get the distance between the two centroids:

d_cent = Σ_i (c1[i] − c2[i])² / (2 (c1[i] + c2[i]))    (6)

We then perform a similar step, but using the standard deviation of the histograms of each cloud instead of the centroids. Equation 7 shows how to find this value for cloud 1; a similar calculation is done for the second cloud to be compared.
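The binning rules of equations 2 to 5 might be implemented along these lines (a sketch with names of our own choosing; the min() clamps are our addition, since the boundary inputs H = 360 or S = 100 would otherwise produce an out-of-range bin):

```python
def shape_bin(angle_deg, shape_bins):
    """Equation 2: map an angle in [0, 360] degrees to a shape-histogram bin."""
    return int(angle_deg * (shape_bins - 1) / 360)

def color_bin(h, s, m):
    """Equations 3-5: map hue H in [0, 360] and saturation S in [0, 100]
    to one of the m*m bins of the color histogram."""
    bin_h = min(int(h * m / 360), m - 1)  # equation 3, clamped at the boundary
    bin_s = min(int(s * m / 100), m - 1)  # equation 4, clamped at the boundary
    return m * bin_h + bin_s              # equation 5: row-major (bin_h, bin_s)
```

With the best parameter values reported later (shape_bins = 60, m = 8), shape_bin covers bins 0 to 59 and color_bin covers bins 0 to 63 per region.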

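The cloud-distance computation of this subsection can be sketched as follows (our own code; it assumes the centroid and standard-deviation histograms discussed here have already been computed, and the empty-bin guard in the chi-squared distance is our addition to avoid division by zero):

```python
import numpy as np

def chi2(a, b):
    """Chi-squared distance between two histograms (equation 6)."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    mask = (a + b) > 0  # skip bins that are empty in both histograms
    return float(np.sum((a[mask] - b[mask]) ** 2 / (2 * (a[mask] + b[mask]))))

def shape_or_color_distance(c1, std1, c2, std2):
    """d_cent + d_std: chi-squared between the centroid histograms plus
    chi-squared between the standard-deviation histograms."""
    return chi2(c1, c2) + chi2(std1, std2)

def final_distance(d_sh, d_cl, w):
    """Equation 8: weighted blend of the shape and color distances."""
    return d_sh * w + d_cl * (1 - w)
```

The same two functions serve both the shape and the color parts; only the histograms fed to them differ.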
In equation 7, h1,i is the shape histogram at keypoint i of cloud 1, c1 is the centroid histogram and N1 is the number of keypoints in the cloud; the squared difference is taken per bin:

std1 = √( Σ_i (h1,i − c1)² / (N1 − 1) )    (7)

We use the same process (equation 6) to get the distance, d_std, between these standard-deviation histograms (std1 and std2) as we used for the centroid histograms (c1 and c2). Finally we sum the centroid distance d_cent and the standard deviation distance d_std [2] to obtain the final shape distance: d_sh = d_cent + d_std. The same process is used to compute the color histogram distance, d_cl.

Equation 8 shows how we compute the final distance d_final between two clouds, where d_sh is the shape distance and d_cl is the color distance between the same pair of clouds. The weight w represents the relative importance of the shape and color information in the final distance:

d_final = d_sh × w + d_cl × (1 − w)    (8)

The next step is to find, for each test cloud, the training cloud that fits it best, meaning the one with the smallest distance. The results of matching all test point clouds are used to produce the final result for the dataset.

4 Genetic Algorithm

In this work a genetic algorithm [5] tries to find the optimal parameters for the 3D cloud descriptor. The chromosomes encode the following five descriptor parameters: shape_bins, m, R1, R2 and w (the last is in fact used in the distance between descriptors, not in the descriptors themselves). The role of each of these parameters in the creation of the descriptor was explained in the previous section. The GA has some restrictions, namely the intervals in which the parameters lie: shape_bins is set between 8 and 64; m between 3 and 8; R1 between 0.5 and 2.0 cm; and R2 between 2.1 and 5.0 cm. The maximal possible value of R2 is set to 5.0 cm because the descriptors we compare against also use this value.
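The paper does not state how the binary chromosome is laid out, so the following decoding sketch is purely illustrative: it assumes a hypothetical 10-bit gene per parameter, scaled linearly into the intervals listed above.

```python
import random

# Parameter intervals from the text (radii in cm); shape_bins and m are integers.
BOUNDS = {
    "shape_bins": (8, 64),
    "m": (3, 8),
    "R1": (0.5, 2.0),
    "R2": (2.1, 5.0),
    "w": (0.0, 1.0),
}
BITS = 10  # assumed gene width; not specified in the paper

def decode(bitstring):
    """Decode a 50-bit binary chromosome into the five descriptor parameters."""
    params = {}
    for i, (name, (lo, hi)) in enumerate(BOUNDS.items()):
        gene = bitstring[i * BITS:(i + 1) * BITS]
        # Scale the gene's integer value linearly into [lo, hi].
        value = lo + int(gene, 2) / (2 ** BITS - 1) * (hi - lo)
        params[name] = round(value) if name in ("shape_bins", "m") else value
    return params

random.seed(0)
chromosome = "".join(random.choice("01") for _ in range(5 * BITS))
params = decode(chromosome)  # e.g. one member of the initial random population
```

Any encoding that respects the stated intervals would serve the same role; the scaling above simply guarantees that every chromosome decodes to a valid parameter set.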
The last parameter optimized by the GA is the weight w, which is allowed to vary between 0 and 1. The chromosomes are binary encoded. The initial population is set randomly and consists of 10 chromosomes; this small value was chosen to keep the time needed to compute each generation reasonable. The object error, the percentage of point clouds from training set 1 correctly matched among the point clouds of training set 2, is used as the fitness of the chromosome (training sets 1 and 2 are explained in the experiments section). Elitism selection [6] and roulette-wheel selection [6] were used for the selection step. The crossover technique used is uniform crossover [6]. Mutation makes it possible to search a wide area of the solution space in order to find the optimal solution. The mutation used

is the in-order mutation [6]. The mutation probability is set to 40% in the first generation and decreases exponentially with each successive generation.

The GA needs to be able to stop the evolution when the goal is achieved. The stopping criterion is met when the number of generations reaches 200, when no better solution is found in 40 consecutive generations, or when a solution with 100% correct object matches is found. After each generation we measure the validation error of the best chromosome. This way we avoid overfitting the descriptor, which could cause it to lose the ability to recognize point clouds. For this purpose we check, on a validation subset of point clouds (apart from the clouds used by the GA to determine the best parameters for the descriptor), the validation error of the best chromosome. When this error begins to rise, we stop the GA and consider that we have found the best descriptor.

5 Experiments

5.1 Dataset

A subset of the large dataset of 3D point clouds [7] (available at http://www.cs.washington.edu/node/4229/) was used to perform the experiments. The clouds were divided into four subsets: two training subsets, one validation subset and one test subset. The test subset is composed of 475 different views of 10 objects and is used to calculate the final performance of the descriptor, using 943 training clouds as possible matches. The validation subset has 239 clouds and is used to avoid overfitting the descriptor: we calculate the validation error of the best chromosome of each generation of the GA by checking how many of the validation clouds are correctly matched, using 704 training clouds as the matching clouds. These same 704 training clouds are divided into the training 1 and training 2 subsets, which were used by the GA to compute the object error of the chromosomes (each contains 352 clouds).
5.2 Results

The code used to implement our descriptor can be downloaded from http://www.di.ubi.pt/~lfbaa. The best chromosome optimized by the GA has 60 shape bins, m = 8, R1 = 1.3 cm, R2 = 3.6 cm and w = 0.67, which gives a descriptor of 248 bins per point cloud. After the matches are done, we check how many of the 475 test clouds were correctly matched to the 943 training clouds. This descriptor has an accuracy of 72.57% in matching the cloud to the correct object (one of the 475 test clouds was not used since it had fewer than 10 points; of the remaining 474 test clouds, 344 were correctly matched) and 89.66% in matching the cloud to the correct category. In [2] several available descriptors were tested to obtain their performance and computational time. Table 1 shows the performance of those descriptors, which were extracted using the same point clouds as the ones used to evaluate the descriptor proposed

in this paper. The time column refers to the time necessary to match the 475 test clouds against the 943 training clouds. The size column refers to the number of real values required to store one descriptor.

Table 1. Descriptors performance: test errors for the object and category recognition tasks along with time and size requirements.

Descriptor  | Object (%) | Category (%) | Time (s) | Size
PFHRGB      |   20.25    |     5.27     |   2992   |  250
SHOTCOLOR   |   26.58    |     9.28     |    178   | 1353
Our         |   27.43    |    10.34     |     72   |  248

As we can see, SHOTCOLOR takes 178 s to compute, while our descriptor takes only 72 s on the same machine. Although we have a slightly lower accuracy, the temporal advantage can be important in real-time applications. PFHRGB has the best accuracy, but it takes 2992 s to compute. In terms of size, our descriptor uses only 248 real values, which is significantly less than SHOTCOLOR's 1353 and still less than the 250 values per descriptor that PFHRGB uses. Figure 3 contains the recall versus (1−precision) curves for the object recognition experiments.

Fig. 3. Recall versus (1−precision) curves for the object recognition experiments (PFHRGB, SHOTCOLOR and our descriptor).

Although the PFHRGB curve is better than ours, our curve is close to the SHOTCOLOR curve, and for recall larger than 0.35 our curve is better than SHOTCOLOR's.

6 Conclusion

In this paper we presented a new descriptor for 3D point clouds that takes advantage of a genetic algorithm to find good parameters. It presents performance similar to other existing descriptors, but is faster to compute and uses less space to store the extracted descriptors. Compared to SHOTCOLOR, our descriptor presents a slightly higher error (27.43% versus 26.58% object recognition error) but is much faster (it uses 40% of the time needed by SHOTCOLOR) and occupies less space. Compared to PFHRGB, it is substantially faster (it uses only 2.5% of the time needed by PFHRGB) and uses about the same space, but has a higher error (27.43% versus 20.25%). Our proposal can thus replace these descriptors when extraction speed is important. A possible way to improve the quality of the descriptor is to let the GA optimize not only the values of the parameters, but also the entire structure of the descriptor (types of angles used, their number, and the types and shapes of the regions to consider).

References

1. Rusu, R.B., Cousins, S.: 3D is here: Point Cloud Library (PCL). In: IEEE International Conference on Robotics and Automation (ICRA) (2011)
2. Alexandre, L.A.: 3D descriptors for object and category recognition: a comparative evaluation. In: Workshop on Color-Depth Camera Fusion in Robotics at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Vilamoura, Portugal (October 2012)
3. Fiolka, T., Stückler, J., Klein, D.A., Schulz, D., Behnke, S.: Place recognition using surface entropy features. In: Proc. of the IEEE ICRA Workshop on Semantic Perception, Mapping, and Exploration, Saint Paul, MN, USA (2012)
4.
Puzicha, J., Hofmann, T., Buhmann, J.M.: Non-parametric similarity measures for unsupervised texture segmentation and image retrieval. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (1997) 267-272
5. Holland, J.H.: Adaptation in Natural and Artificial Systems. A Bradford Book (1975)
6. Engelbrecht, A.P.: Computational Intelligence: An Introduction. John Wiley & Sons (2002)
7. Lai, K., Bo, L., Ren, X., Fox, D.: A large-scale hierarchical multi-view RGB-D object dataset. In: Proc. of the IEEE International Conference on Robotics and Automation (ICRA) (2011)