Comparison of People Detection Techniques from 2D Laser Range Data

Hadi Kheyruri, Daniel Frey
Department of Computer Science, University of Freiburg

Abstract

For a mobile robot operating in human environments, detecting people is crucial for carrying out many tasks. This paper compares different approaches to people detection using geometric features. We compare a boosting system against the Inscribed Angle Variance (IAV) method, which is suited to arc and circle detection. A bounding box approach serves as our baseline. For segmentation, a simple jump distance algorithm is applied. We present experimental results based on real data recorded with a laser range finder.

1 Introduction

As robots slowly leave factories and other narrowly defined environments and enter people's daily lives, new problems must be solved. A crucial aspect of a robot operating in human environments is detecting humans. A mobile robot needs to interact with people and with its environment in order to receive its orders and carry them out. It therefore needs to detect humans for two reasons: first, to receive orders from them; second, to avoid colliding with them or getting in their way while it performs its tasks.

The detection and tracking of people is therefore one of the major issues concerning the robotics community. Depending on the type of sensor, and hence on the input data, there are two major approaches in this area of research: detecting and tracking people in images from a camera, or in range data from a laser range finder. Laser range data can be 2D or 3D, depending on the formulation of the problem at hand. This work addresses the problem using data gathered with a laser range finder. Mapping and localization have previously been studied with laser range finders. A further reason for using these sensors rather than vision sensors is that laser range finders are independent of ambient conditions. Figure 1 shows an example scan gathered by a laser range finder. The sensor data contains two-dimensional range information.

For the analysis of laser range data and the detection of people in it there are mainly two approaches. One is to study the motion of humans across consecutive scans and to detect them by their motion pattern. The other is to use a single scan and to detect people from the geometric features of the local minima blobs that humans produce. In this paper we address the problem of detecting people in 2D laser data by studying such geometric features. To this end we compare three approaches: the baseline bounding box approach, a circle fitting method, and boosting. These techniques are described in sections 4.1 through 4.3.

Figure 1: One scan of the testing environment

Before detection we run a preprocessing segmentation phase. The goal of segmentation is to simplify the representation of the data so that the points of interest are easier to analyse. The points of interest in this work are the laser beams that correspond to a human. We therefore prefer a segmentation algorithm that separates human from nonhuman points as well as possible; in other words, one that yields the fewest segments containing a mix of human and nonhuman points. We do not investigate this question in this paper and instead employ a simple jump distance algorithm, which is discussed in section 3.
In the following section we look at the related work and how it has affected this study.

2 Related Work

[Zivkovic and Kroese, 2007] address the problem of people detection using 2D range data and omnidirectional vision. They first develop a reliable leg detector for 2D range data using AdaBoost and then use a more reliable method that takes the arrangement of the detected legs into account. [Arras et al., 2008] also work on the problem of detecting people in two-dimensional range scans. In this work, too, AdaBoost is used to build a strong classifier out of several weak classifiers. According to the authors of these two papers, AdaBoost yielded promising results. We therefore chose AdaBoost as one of the main methods in our comparison of people detection techniques.

The other paper on which we based our study is [Xavier et al., 2003]. It proposes several methods for segmentation and feature extraction from laser range data. Its main contribution, however, is the Inscribed Angle Variance (IAV), a new method proposed by the authors for arc and circle fitting. Our aim is to compare the IAV circle fitting method with an AdaBoost system that uses different geometric features. The methods applied in this paper are described in detail in section 4.

3 Segmentation

To simplify the detection of humans in laser range data, we first divide each scan into segments. The goal of the segmentation phase is to separate the sets of points that belong to different targets detected by the laser range finder. The targets of interest in this paper are people. We use the simple jump distance algorithm mentioned in [Premebida and Nunes, 2005] for segmentation. Here we give a short description of the segmentation method used in our work.
A segment is represented as a set of points in polar coordinates,

\[ S_i = \{ p_1(r_1, \alpha_1), \ldots, p_i(r_i, \alpha_i), \ldots, p_n(r_n, \alpha_n) \} \tag{1} \]

where $p_1$ is the first point of the segment and $p_n$ the last. In the polar representation of a scanned point, $r_i$ is the radius of the $i$-th point and $\alpha_i$ is its angle.

The main idea of the jump distance algorithm is that two consecutive points are placed in different segments if the distance between them is larger than a threshold:

If $D(r_i, r_{i+1}) > D_{thd}$, the segments are separated; otherwise they are not separated,

where $D_{thd}$ is the threshold and $D(r_i, r_{i+1})$ is the Euclidean distance between two consecutive scanned points. The threshold is given by

\[ D_{thd} = C_0 + C_1 \min(r_i, r_{i+1}) \tag{2} \]

where $\min(r_i, r_{i+1})$ is the smaller radius of the two consecutive points, $C_1 = \sqrt{2(1 - \cos(\Delta\alpha))}$ with $\Delta\alpha$ the angle between consecutive beams, and $C_0$ is a constant parameter used for noise reduction. Better choices for the threshold have been studied elsewhere in the literature and are not addressed here. Figure 2 gives a schematic representation of the jump distance approach. The next section describes the detection methods that we used.

Figure 2: Schematic representation of hypothetical scan data and some of the parameters involved.

4 Detection

This section describes three approaches to detecting people in 2D laser range data: a baseline approach, a circle fitting approach, and boosting. The aim of this paper is to study the effect of combining several geometric features for people detection, compared with using a single geometric feature; boosting is the method that performs this combination.

4.1 Baseline Approach

Our baseline approach is to fit a bounding box to each segment.

Figure 3: The concept of a bounding box, with the diagonal $d$ as the parameter for detection
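To make the segmentation rule of section 3 and the bounding box diagonal of Figure 3 concrete, the following is a minimal Python sketch, not the implementation used in the experiments. It assumes that beam $i$ has angle $i \cdot \Delta\alpha$, that $C_1 = \sqrt{2(1 - \cos(\Delta\alpha))}$, and that the noise constant has the illustrative value 0.1 m. The function names `jump_distance_segments` and `bounding_box_diagonal` and the toy scan are hypothetical.

```python
import math

def jump_distance_segments(ranges, angle_step, c0=0.1):
    """Split one scan into segments (lists of beam indices).

    ranges     -- range readings r_i in meters, ordered by beam angle
    angle_step -- angle between consecutive beams in radians
    c0         -- constant noise term in meters (illustrative value only)
    """
    if not ranges:
        return []
    # C_1 = sqrt(2 (1 - cos(delta_alpha))): chord length on a unit circle.
    c1 = math.sqrt(2.0 * (1.0 - math.cos(angle_step)))
    segments = [[0]]
    for i in range(len(ranges) - 1):
        r_a, r_b = ranges[i], ranges[i + 1]
        # Euclidean distance between consecutive points (law of cosines).
        dist = math.sqrt(r_a ** 2 + r_b ** 2
                         - 2.0 * r_a * r_b * math.cos(angle_step))
        threshold = c0 + c1 * min(r_a, r_b)
        if dist > threshold:
            segments.append([i + 1])    # jump detected: start a new segment
        else:
            segments[-1].append(i + 1)  # the current segment continues
    return segments

def bounding_box_diagonal(ranges, angle_step, segment):
    """Diagonal of the axis-aligned box around the points of one segment."""
    xs = [ranges[i] * math.cos(i * angle_step) for i in segment]
    ys = [ranges[i] * math.sin(i * angle_step) for i in segment]
    return math.hypot(max(xs) - min(xs), max(ys) - min(ys))

# Toy scan: a wall at 4 m with a small object at 1 m in front of it.
step = math.radians(1.0)
scan = [4.0, 4.0, 4.0, 1.0, 1.0, 1.0, 1.0, 4.0, 4.0]
segments = jump_distance_segments(scan, step)
diagonals = [bounding_box_diagonal(scan, step, s) for s in segments]
```

On this toy scan the two range jumps between 4 m and 1 m split the beams into three segments; the diagonal of each segment is the quantity that the baseline compares against its thresholds.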
For any given segment $S_i$, if the diagonal $d_{S_i}$ of the bounding rectangle lies within a threshold interval, the algorithm regards the segment as a human; otherwise the segment is flagged as nonhuman. Although this approach is basic and does not capture crucial information about a human segment of points, it performs slightly better than random. The thresholds for the diagonal were learned from a set of true examples in the data set.

\[ H = \begin{cases} \text{accept} & \text{if } d_{lowerbound} < d_{S_i} < d_{upperbound} \\ \text{reject} & \text{otherwise} \end{cases} \tag{3} \]

4.2 Circle Fitting

Humans appear as curved shapes in laser range data. Although other objects in a scan may also be curved, we assume that the radius of the curves produced by humans normally lies between two specific thresholds. The main idea is to fit a circle to a segment of points and then check whether the radius of the fitted circle is within the learned thresholds. The thresholds are set according to true examples from the training data set. To exploit the curved appearance of the human body as a distinctive feature for detecting people, we use the circle fitting method of [Xavier et al., 2003], which the authors call Inscribed Angle Variance (IAV).

People detection through circle detection is performed in two steps. First we determine whether the segment could be a circle or not. For this purpose we compute, for each interior point of the segment, the inscribed angle formed at that point by the first and the last point of the segment, as shown in Figure 4. Positive detection of circles occurs when the standard deviation of these angles is between 8.6 and 3 and their average is between 8 and 155. These values were tuned empirically to detect the maximum number of circles while avoiding false positives.

Figure 4: The inscribed angles of an arc are congruent

In the second step, for a segment $S_i$ that could be a circle, we determine whether the corresponding circle is generated by the curvature of a human body. For this purpose we first compute the center of the circle. From analytic geometry we know that a unique circle passes through any three points that are not collinear. For these points we use the two extremes of the arc, $P_1$ and $P_4$, as shown in Figure 4, and the point whose inscribed angle is closest to the average, denoted $P_2$. Two secant lines can be formed through two pairs of the three points: the first line, denoted $a$, passes through $P_1$ and $P_2$, and the second line, denoted $b$, passes through $P_2$ and $P_4$. Their equations are

\[ y_a = m_a (x - x_1) + y_1 \quad \text{with} \quad m_a = \frac{y_2 - y_1}{x_2 - x_1} \tag{4} \]

\[ y_b = m_b (x - x_2) + y_2 \quad \text{with} \quad m_b = \frac{y_4 - y_2}{x_4 - x_2} \tag{5} \]

Note that $x_i$ and $y_i$ are the Cartesian coordinates of the point $p_i$. The center of the circle is the intersection of the two lines $a'$ and $b'$ that are perpendicular to $a$ and $b$ and pass through the midpoints of $P_1 P_2$ and $P_2 P_4$. Their equations are

\[ y_{a'} = -\frac{1}{m_a} \left( x - \frac{x_1 + x_2}{2} \right) + \frac{y_1 + y_2}{2} \tag{6} \]

\[ y_{b'} = -\frac{1}{m_b} \left( x - \frac{x_2 + x_4}{2} \right) + \frac{y_2 + y_4}{2} \tag{7} \]

Solving for $x$ gives

\[ x = \frac{m_a m_b (y_1 - y_4) + m_b (x_1 + x_2) - m_a (x_2 + x_4)}{2 (m_b - m_a)} \tag{8} \]

The $y$ value is obtained by substituting $x$ into $y_{a'}$ or $y_{b'}$. With the center of the circle determined, the radius is the distance between the center and one of the points of the segment, for instance $P_1$.

Figure 5: A segment and the corresponding fitted circle (legend: fitted circle, points of the segment). The figure annotates the segment with its feature values: sum of distances (.9877), standard deviation of the x coordinate, standard deviation of the y coordinate, width, jump distance to the previous segment, jump distance to the next segment, diagonal of the bounding box (.3893), number of points (18), and the radius and center (.3569, ) of the fitted circle.
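The two steps of section 4.2 can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the function names, the toy arc, and all interval values passed to `looks_like_person` are hypothetical, and the slope form of equations (4) to (8) is kept as is, so a vertical secant line is rejected rather than handled.

```python
import math
import statistics

def inscribed_angles(points):
    """Angle in degrees at each interior point between the rays to the
    first and to the last point of the segment."""
    (x1, y1), (xn, yn) = points[0], points[-1]
    angles = []
    for (x, y) in points[1:-1]:
        a_first = math.atan2(y1 - y, x1 - x)
        a_last = math.atan2(yn - y, xn - x)
        diff = abs(a_first - a_last)
        if diff > math.pi:
            diff = 2.0 * math.pi - diff
        angles.append(math.degrees(diff))
    return angles

def circle_through(p_first, p_mid, p_last):
    """Center and radius of the circle through three points, eqs. (4)-(8)."""
    (x1, y1), (x2, y2), (x4, y4) = p_first, p_mid, p_last
    if x2 == x1 or x4 == x2:
        raise ValueError("vertical secant: slopes of eqs. (4)-(5) undefined")
    m_a = (y2 - y1) / (x2 - x1)
    m_b = (y4 - y2) / (x4 - x2)
    if m_a == m_b:
        raise ValueError("points are collinear")
    cx = (m_a * m_b * (y1 - y4) + m_b * (x1 + x2)
          - m_a * (x2 + x4)) / (2.0 * (m_b - m_a))
    if m_a != 0.0:   # eq. (6); fall back to eq. (7) when line a is horizontal
        cy = -(cx - (x1 + x2) / 2.0) / m_a + (y1 + y2) / 2.0
    else:
        cy = -(cx - (x2 + x4) / 2.0) / m_b + (y2 + y4) / 2.0
    return (cx, cy), math.hypot(x1 - cx, y1 - cy)

def looks_like_person(points, std_range, mean_range, radius_range):
    """Step 1: inscribed angle test. Step 2: radius test on the fitted circle."""
    angles = inscribed_angles(points)
    if len(angles) < 2:
        return False
    mean, std = statistics.mean(angles), statistics.stdev(angles)
    if not (std_range[0] <= std <= std_range[1]):
        return False
    if not (mean_range[0] <= mean <= mean_range[1]):
        return False
    # Middle point: the one whose inscribed angle is closest to the average.
    k = min(range(len(angles)), key=lambda i: abs(angles[i] - mean)) + 1
    _, radius = circle_through(points[0], points[k], points[-1])
    return radius_range[0] <= radius <= radius_range[1]

# Toy arc: seven points on a circle of radius 0.3 m centered at (1, 2).
arc = [(1.0 + 0.3 * math.cos(math.radians(a)),
        2.0 + 0.3 * math.sin(math.radians(a))) for a in range(30, 151, 20)]
center, radius = circle_through(arc[0], arc[3], arc[-1])
is_person = looks_like_person(arc, (0.0, 5.0), (90.0, 135.0), (0.05, 0.6))
```

Because all points of the toy arc lie on one circle, their inscribed angles are equal (120 degrees here), so the first step passes and the fitted circle recovers the radius of 0.3 m.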
4.3 AdaBoost Approach

Our boosting approach is based on [Arras et al., 2008]. The points that belong to people exhibit certain features. The approach recognizes and quantifies these features and then creates a set of weak classifiers based on them. Each weak classifier only needs to perform slightly better than random guessing. To learn the parameters of the weak classifiers we used a set of true examples from the training data set. The main idea is to combine a set of weak classifiers into a strong classifier for people detection.

The boosting algorithm we used is AdaBoost, which selects and combines the most informative weak classifiers. Its input is labelled training data. In a series of rounds $t = 1, \ldots, T$ the algorithm selects a weak classifier and classifies the training set. At the end of each round it increases the importance of the examples that were misclassified by the weak classifier just selected, so that the next classifier focuses more on those examples. The final strong classifier is a weighted sum of the weak classifiers. Algorithm 1 shows the AdaBoost algorithm that we used.

We defined eight features to distinguish between human and nonhuman segments. Each weak classifier is based on one of these features; it takes a segment, consisting of a set of points, and returns whether the segment belongs to a human or not. The features are as follows:

1. Number of points: the size of the segment, in other words the number of points in the segment, $n = |S_i|$.

2. Sum of distances between the $i$-th and $(i+1)$-th point: the sum of the Euclidean distances between adjacent points in the segment, $\zeta = \sum_{i=1}^{n-1} \sqrt{(x_i - x_{i+1})^2 + (y_i - y_{i+1})^2}$ with $n = |S_i|$.

3. Standard deviation: this feature is calculated as $\sigma = \sqrt{\frac{1}{n-1} \sum_j \lVert \mathbf{x}_j - \bar{\mathbf{x}} \rVert^2}$, where $\bar{\mathbf{x}}$ is the center of gravity of the segment $S_i$.

4. Width: the Euclidean distance between the first and the last point of the segment.

5. Jump distance from the preceding segment: the Euclidean distance between the first point of $S_i$ and the last point of $S_{i-1}$.

6. Jump distance to the succeeding segment: the Euclidean distance between the last point of $S_i$ and the first point of $S_{i+1}$.

7. Bounding box: the diagonal of the rectangle enclosing all the points of the segment $S_i$.

8. Radius: the radius $r_c$ of the circle fitted to the points in the segment. This feature can serve as a measure of the circularity of the segment.

Each segment is assessed with regard to these features, and the corresponding weak classifier returns $+1$ if it detects a human and $-1$ if it does not recognize the segment as a human. The final decision is made through the weighted sum of all weak decisions; the sign of this sum is the detection result for the segment. Figure 5 depicts a positively detected segment together with the values of its features.

Algorithm 1: The AdaBoost algorithm

Input: set of examples $(e_1, l_1), \ldots, (e_N, l_N)$, where $l_n = +1$ for positive examples and $l_n = -1$ for negative ones
Initialize the weights $D_1(n) = \frac{1}{N}$
for $t = 1, \ldots, T$ do
    For each $h_j$ calculate $r_j = \sum_{n=1}^{N} D_t(n) \, l_n \, h_j(e_n)$, where $h_j(e_n) \in \{+1, -1\}$
    Choose the $h_j$ that maximizes $r_j$ and set $(h_t, r_t) = (h_j, r_j)$
    Update the weights: $D_{t+1}(n) = D_t(n) \exp(-\alpha_t \, l_n \, h_t(e_n))$, where $\alpha_t = \frac{1}{2} \log\left(\frac{1 + r_t}{1 - r_t}\right)$
    Normalize the weights: $D_{t+1}(n) = D_{t+1}(n) / \sum_{i=1}^{N} D_{t+1}(i)$
end
The final strong classifier is given by $H(e) = \mathrm{sign}(F(e))$, where $F(e) = \sum_{t=1}^{T} \alpha_t h_t(e)$

5 Experiments and Results

The laser range data was gathered using a 180-degree SICK sensor. Our aim was to compare different people detection techniques based on this data.
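Before the details of the experimental setup, the steps of Algorithm 1 can be made concrete with a short Python sketch. It is a minimal illustration under assumptions, not the implementation evaluated here: each weak classifier is assumed to be an interval test on one feature, the names `interval_classifier`, `adaboost` and `classify` are hypothetical, the toy feature vectors are invented, and two safeguards (stopping when no weak classifier has $r_t > 0$, and clipping $r_t$ just below 1) are additions that Algorithm 1 does not state.

```python
import math

def interval_classifier(feature_index, low, high):
    """Hypothetical weak classifier: +1 if the feature is in [low, high], else -1."""
    def h(example):
        return 1 if low <= example[feature_index] <= high else -1
    return h

def adaboost(examples, labels, weak_classifiers, rounds):
    """Return a list of (alpha_t, h_t) pairs following the steps of Algorithm 1."""
    n = len(examples)
    weights = [1.0 / n] * n                          # D_1(n) = 1/N
    outputs = [[h(e) for e in examples] for h in weak_classifiers]
    ensemble = []
    for _ in range(rounds):
        # r_j = sum_n D_t(n) * l_n * h_j(e_n) for every weak classifier
        r = [sum(w * l * o for w, l, o in zip(weights, labels, out))
             for out in outputs]
        j = max(range(len(r)), key=lambda k: r[k])
        r_t = r[j]
        if r_t <= 0.0:
            break                    # assumption: stop if nothing beats chance
        r_t = min(r_t, 1.0 - 1e-10)  # assumption: keep alpha_t finite
        alpha = 0.5 * math.log((1.0 + r_t) / (1.0 - r_t))
        ensemble.append((alpha, weak_classifiers[j]))
        # D_{t+1}(n) = D_t(n) * exp(-alpha_t * l_n * h_t(e_n)), then normalize
        weights = [w * math.exp(-alpha * l * o)
                   for w, l, o in zip(weights, labels, outputs[j])]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def classify(ensemble, example):
    """Strong classifier H(e) = sign(F(e)); F(e) = 0 is mapped to -1 here."""
    f = sum(alpha * h(example) for alpha, h in ensemble)
    return 1 if f > 0.0 else -1

# Invented toy data: three features per segment, three humans, three nonhumans.
examples = [(0.20, 10, 9.0), (0.25, 12, 0.5), (0.30, 30, 0.6),
            (0.80, 11, 5.0), (0.22, 40, 7.0), (1.00, 50, 3.0)]
labels = [1, 1, 1, -1, -1, -1]
weak = [interval_classifier(0, 0.1, 0.35),
        interval_classifier(1, 5, 35),
        interval_classifier(2, 0.0, 1.0)]
ensemble = adaboost(examples, labels, weak, rounds=3)
predictions = [classify(ensemble, e) for e in examples]
```

In the toy data each weak classifier misclassifies exactly one example, and a different one each time, so no single feature separates the classes while the weighted vote of all three does. This is the effect that the comparison in this section sets out to measure on real scans.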
Each scan was segmented with a basic jump distance algorithm, and each segment was then classified with the three approaches. For training, 5 randomly selected scans from the initial data were used; this includes learning the weights of the weak classifiers with the AdaBoost algorithm. For testing, 18 scans from the main data were used. For both learning and testing, all the points in all the scans were manually labelled as human or nonhuman.

Human detection was performed with three methods: the bounding box method, the circle fitting approach, and boosting. For the bounding box, the diagonal was required to lie between two thresholds, .1 and .3 meters. For circle fitting, the acceptable radius for a human was taken to be between .5 and .6 meters. These values were tuned empirically. For boosting we used an AdaBoost based classifier with eight weak classifiers. Every weak classifier is manually designed and uses one specific feature for people detection. The features and the corresponding thresholds are as follows. Note that these thresholds were adjusted experimentally:
- Jump distance between segments (features #5 and #6) measures how closely the distance from a segment to its neighbouring segments matches the most frequent distance between human segments and other segments. The interval for these features is [1., .3] meters.

- Circularity of a segment (feature #8) takes into account the curved shape of the human body. A circle is fitted to the points in the segment. If the radius is between .5 meter and .6 meter, the segment is detected as a human. The lower bound of this interval represents the radius of a typical thin human leg and the upper bound represents an above-average human waist.

- Compactness of the segments is measured with three features: the bounding box (feature #7), the standard deviation (feature #3) and the sum of distances between points in a segment (feature #2). A local minima blob whose bounding box has a diagonal greater than .1 meter and smaller than .3 meter is accepted as a human. Based on Cartesian coordinates, a standard deviation is calculated for the x and the y values of the points in the segment. The acceptance interval for the standard deviation was set to [.1, .4] for the x value and to [.5, .1] for the y value. For the sum of Euclidean distances between adjacent points in the segment, the acceptance interval was tuned to [, 1.5].

- Segment width (feature #4) is the Euclidean distance between the first and the last point in the segment. If this distance is between .7 meter and .5 meter, the segment is labelled as a person.

- Number of points (feature #1). Segments with fewer than 5 points or more than 4 points are discarded.

The experiments show that using a set of geometric features for people detection performs better than using a single feature. The confusion tables for the three approaches show a much higher percentage of true positives (65.97%) and a lower percentage of false negatives for AdaBoost.
These tables were obtained at the point on the precision/recall curves of the three approaches (Figure 6) where precision equals recall. The three approaches were almost identical in detecting true negatives, with AdaBoost being slightly better. The better performance of AdaBoost in comparison with the circle fitting method is well illustrated in the precision/recall graph. To generate these curves we varied the classification thresholds. In the case of AdaBoost we varied the decision threshold for accepting or rejecting a segment over the interval (, 1). In the case of circle fitting and bounding box we applied a parameter $\alpha$ to the two thresholds of these classifiers, enlarging and shrinking the interval in which a segment is accepted as a positive.

The reason for the better performance of the boosting approach is that boosting relies on more than a single feature, unlike the bounding box or circle fitting. Boosting incorporates different features that capture more aspects and properties of the human segments. For circle fitting we see a gradual and steady fall in precision. The curve of the bounding box method, on the other hand, shows an unexpected zigzag behaviour, which can be explained as follows. When the two thresholds of the bounding box classifier are close to each other, most of the true positive examples are rejected because of the distribution of the data. This leads to proportionally more false positives (relative to all positive detections), which in turn leads to poor precision. This is well explained by the histograms of the diagonals of human and nonhuman segments shown in Figure 10. As seen in histogram 10(a), the diagonals of human segments do not follow a normal distribution.
Therefore, changing the two thresholds of the bounding box classifier does not produce a steady rise or fall in the number of true positives. In other words, when the number of correctly detected human segments is small in proportion to all human segments (detected and not detected), the detection of a nonhuman as a human (a false positive) has a more severe effect on the precision. At the same time, because of the very narrow diagonal interval of the bounding box classifier, many human segments may go undetected, leading to a very low recall as well as a low precision.

6 Conclusion and Future Work

This paper addressed the problem of detecting people in 2D laser range data. Our aim was to compare three approaches to this problem with regard to geometric features. To this end three different approaches were implemented and compared. Boosting performed best in comparison with circle fitting and our baseline approach. This research shows that combining a number of weak classifiers into a strong classifier performs better than classifiers that take only one geometric feature into account.

Figure 7: Confusion matrix for bounding box

True label   | Detected: Person | Detected: No Person | Total
Person       | (57.44%)         | 7646 (42.56%)       |
No Person    | 5579 (4.37%)     | 5596 (95.63%)       |

Figure 8: Confusion matrix for circle fitting

True label   | Detected: Person | Detected: No Person | Total
Person       | 9437 (45.31%)    | 3554 (54.69%)       |
No Person    | 3636 (6.21%)     | (93.79%)            |

Figure 9: Confusion matrix for AdaBoost

True label   | Detected: Person | Detected: No Person | Total
Person       | 47 (65.97%)      | 754 (34.03%)        |
No Person    | 3739 (4.06%)     | 5611 (95.94%)       |
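For readers who want to reproduce curves like those in Figure 6 from confusion matrix entries such as those in Figures 7 to 9, the following Python sketch shows how precision and recall are computed and how an interval classifier can be swept. It is a hedged illustration: the text does not state how the parameter $\alpha$ is applied to the two thresholds, so scaling the interval about its midpoint is an assumption, and the function names and the toy diagonals are hypothetical.

```python
def precision_recall(true_labels, predicted_labels):
    """Precision and recall for labels in {+1, -1} (+1 = person)."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == -1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == -1)
    # Convention used here: a ratio with an empty denominator is 0.0.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def scaled_interval(low, high, alpha):
    """Assumed scheme: scale the acceptance interval about its midpoint."""
    mid = (low + high) / 2.0
    half = alpha * (high - low) / 2.0
    return mid - half, mid + half

def pr_curve(values, labels, low, high, alphas):
    """One (alpha, precision, recall) triple per setting of alpha."""
    curve = []
    for alpha in alphas:
        lo, hi = scaled_interval(low, high, alpha)
        predictions = [1 if lo <= v <= hi else -1 for v in values]
        curve.append((alpha, *precision_recall(labels, predictions)))
    return curve

# Invented bounding box diagonals (meters) with their true labels.
diagonals = [0.15, 0.20, 0.25, 0.45, 0.05, 0.22]
labels = [1, 1, 1, 1, -1, -1]
curve = pr_curve(diagonals, labels, 0.1, 0.3, alphas=[0.1, 1.0, 3.0])
```

A narrow interval accepts few segments, so recall drops and a single false positive weighs heavily on precision, while a wide interval pushes recall up at the cost of precision. This matches the qualitative behaviour discussed for the bounding box curve.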
Figure 6: The precision/recall graph (axes: recall, precision; curves: AdaBoost, circle fitting, bounding box). The figure shows AdaBoost outperforming circle fitting and the baseline approach along the main diagonal.

Figure 10: Histograms of the diagonals of the bounding boxes corresponding to (a) human segments and (b) nonhuman segments.

Prior to detection, a basic jump distance algorithm was used for segmentation. A better segmentation method would be an appropriate next step for improving the performance of the system. The reason is that jump distance cannot handle situations in which a person is very close to a wall: it regards the wall and the person as one segment. The main problem is that jump distance does not take the curved shape of the human body into account.

In this paper we addressed people detection using geometric features only. In our opinion, a detection layer on top of this system that incorporates motion features could improve the performance.

7 Acknowledgements

We would like to thank our supervisor, Gian Diego Tipaldi, for his help and support throughout the project.

References

[Arras et al., 2008] K. O. Arras, O. Martinez Mozos, and W. Burgard. Using boosted features for the detection of people in 2D range data. In Proc. of the IEEE Int. Conf. on Robotics and Automation, Rome, Italy, 2008.

[Premebida and Nunes, 2005] C. Premebida and U. Nunes. Segmentation and geometric primitives extraction from 2D laser range data for mobile robot applications. Robótica 2005, April 2005.

[Xavier et al., 2003] J. Xavier, M. Pacheco, D. Castro, A. Ruano, and U. Nunes. Fast line, arc/circle and leg detection from laser scan data in a Player driver. Technical report, Institute of Systems and Robotics (ISR), University of Coimbra, 2003.

[Zivkovic and Kroese, 2007] Z. Zivkovic and B. Kroese. Part based people detection using 2D range data and images. In Proc. IEEE/RSJ IROS, 2007.