3D Facial Image Comparison using Landmarks


3D Facial Image Comparison using Landmarks
A study of the discriminating value of the characteristics of 3D facial landmarks and their automated detection.

Alize Scheenstra
Master thesis: INF/SCR
Netherlands Forensic Institute
Institute of Information and Computing Sciences
Utrecht University
February 2005

Front Page Image
The front page image consists of three 3D facial scans. These scans were provided to me by the Netherlands Forensic Institute and are used in this thesis for illustration purposes only. The topmost face is an illustration of the landmarks that were included in the automated landmark detection. The other two faces show the distances between the landmarks that were included in the final landmark set. They are marked on two different scans to illustrate the small variations between subjects.

Abstract

Many researchers have been dealing with the face recognition challenge of the great variability in head rotation and tilt, lighting intensity and angle, facial expression, aging, etc. In the last few years more and more 2D face recognition algorithms have been improved and tested on less than perfect images. However, 3D models hold more information about the face, like surface information, that can be used for face recognition or subject discrimination. Another major advantage is that 3D face recognition can be made pose invariant. In this thesis a literature survey is presented on recent face recognition algorithms. One way of performing 3D face recognition and face comparison is to make use of facial landmarks. To study the possibilities of automated landmark detection, we first performed an analysis to find the landmarks that are best suited for automated facial comparison. We used 3D data from the facial area of 3D whole body scans, acquired in the Netherlands for the CAESAR-survey. At the time of 3D scanning, 8 facial landmarks were manually annotated and recorded in the scanning process. We measured the absolute distances between these landmarks in the 3D model. The analysis of the measurements was performed in two ways: first, we analyzed the variance and correlation of the distances between facial landmarks; second, we used the Fisher discriminant analysis to find the most significant landmark distances. The resulting sets were almost equal and can both be used for facial comparison. The most informative distances appeared to be the distances from and to the gonion (posterior point of the jawbone). Unfortunately, the gonion is found by palpation of the underlying jawbone, and therefore it cannot be found in a 3D model. The next most informative are the distances from and to the sellion (point of greatest indentation of the nasal root depression) or the supramenton (point of greatest indentation of the mandibular symphysis). To find a measure of the discriminating value of the distance measurements of the CAESAR data, we calculated the probability that the measurements of two subjects are not significantly different. We assumed that the measurements for all subjects are normally distributed. If the measurements of a subject are close to the mean (i.e. a common face), there is a probability that the same measurements are found in 1 of 2 subjects. If the measurements of a subject are in the tail of this distribution (i.e. a rare face), the probability that the same measurements are found on another subject is at most 1 in 19 subjects. We also present an algorithm for the automated detection of 11 facial landmarks: the 8 facial landmarks from the CAESAR-survey, the pronasale (nose tip) and the left and right alar curvatures (points indicating the facial insertion of the nasal wing base). For the 8 facial landmarks from the CAESAR-survey the local curvature is analyzed using bump hunting. The resulting classification rules are used for the detection of these landmarks in combination with the geometrical information of the face. The pronasale and the alar curvatures are detected using only the geometrical information of the face. We used a lower bound to indicate the detection performance of the algorithm: the pronasale and the sellion are detected best, with a lower bound of 66%; the gonion is detected worst, with a lower bound of 0.0%.
Based on the results we can conclude that this algorithm performs best on landmarks with a typical local surface (like the sellion or the left and right alar curvatures) rather than on landmarks with a flat local surface (like the gonion and the infraorbitale).

Acknowledgements

I would like to thank a few people who supported me during my project in many ways. First of all I thank my supervisors for their intellectual and mental support: Arnout Ruifrok and Jurrien Bijhold at the Netherlands Forensic Institute and Remco Veltkamp at Utrecht University. They have provided new ideas and information, but also critical remarks when needed. Secondly, I would like to thank Ivo Alberink for his support, patience and feedback on statistical issues. Also, I would like to thank Hein Daanen and Koen Tan from TNO for providing me with information and answering my questions concerning the CAESAR-survey and dataset. Furthermore, I want to thank all my colleagues at the Netherlands Forensic Institute for the pleasant time during working hours, but also for the interesting conversations on Wednesday evenings in the most attractive bar of Rijswijk. Finally, I would like to thank all my friends and family, particularly Sander Groenendijk, for supporting me mentally and for listening to all my remarks on the progress of my graduation.

Contents

Abstract
Acknowledgements
1 Introduction
  1.1 Introduction
  1.2 Overview of Thesis
2 Related Work
  2.1 Metrics and Performance Measures
  2.2 Statistical Approaches
    2.2.1 Eigenfaces
    2.2.2 Linear Discriminant Analysis
    2.2.3 Deformable Templates
    2.2.4 Other Methods
  2.3 Machine Learning Approaches
    2.3.1 Face Bunch Graph Matching
    2.3.2 Support Vector Machines
    2.3.3 Other Methods
  2.4 Other Approaches
    2.4.1 Intensity Based
    2.4.2 Infra-Red Images
  2.5 3D Models
    2.5.1 Surface Based Approaches
    2.5.2 Template Matching Approaches
    2.5.3 Other Approaches
  2.6 Discussion and Conclusion
3 Data and Materials
  3.1 Data Description
  3.2 Procedures of Acquiring 3D Whole Body Scans
    3.2.1 Materials
    3.2.2 Scanning Process
    3.2.3 Post processing of the Point Clouds
  3.3 Sanity Check
    3.3.1 Previous Study
    3.3.2 Errors in the Dataset
  3.4 Discussion and Conclusion
4 Landmark Analysis
  4.1 Information of a Landmark Set
  4.2 Information Measures
    4.2.1 Variance and Correlation based Information Measure
    4.2.2 Fisher Discriminant Analysis
  4.3 Experiments
    4.3.1 Variance and Correlation based Information Measure
    4.3.2 Fisher Discriminant Analysis
  4.4 Discrimination Value of a Landmark Set
  4.5 Error Analysis
  4.6 Discussion and Conclusion
5 Landmark Detection
  5.1 Automated Landmark Detection
    5.1.1 Surface Characterization
    5.1.2 Landmark Detection Rules
    5.1.3 Landmark Detection Algorithm
  5.2 Performance and Results
  5.3 Discussion and Conclusion
6 Conclusion
  6.1 Conclusion
  6.2 Further Research
References
A Landmarks of the CAESAR-survey
B Multiple Analysis of Variance
  B.1 MANOVA
  B.2 nested MANOVA
C Selection Order of the Landmark Set
  C.1 Variance Analysis
  C.2 Correlation Analysis
  C.3 Fisher Discriminant Analysis
D Bump Hunting
  D.1 Creation of Detection Rules
  D.2 Using the Detection Rules
E Detection Rules of Facial Landmarks
  E.1 sellion (se)
  E.2 infraorbitale (io)
  E.3 supramenton (sp)
  E.4 tragion (tr)
  E.5 gonion (go)
F Program Manual
  F.1 View options
  F.2 Landmark Detection
  F.3 Local curvature analysis for landmarks

1 Introduction

1.1 Introduction

Although a lot of research has been performed in the area of face recognition, only a few systems can be used for forensic application and even fewer are actually used. Forensic application here refers to the branch of science that uses scientific technology to assist the courts of law. This is due to the fact that a forensic scientist cannot afford a high false positive rate, which most face recognition systems have. Also, many systems are based on facial images taken under controlled circumstances (same background, facial expression, high resolution, etc.), whereas in forensic science the facial images are taken under uncontrolled circumstances. The surveillance systems that capture an image of an offender mostly provide images of bad quality, to which a suspect must be compared. Nowadays, facial comparison is performed by a reconstruction at the crime scene, where the suspect is placed in the exact position of the offender on the surveillance video. This is troublesome for the suspect but, in the case of a non-cooperative suspect, also for the forensic scientist. One can avoid the trouble of the reconstruction if a 3D scan of the suspect can be used for the facial comparison with the offender. An experimental study was performed on this subject by Yoshido et al. [1]. They compared 3D facial scans to 2D surveillance images, using 18 manually located landmarks combined with superimposition. This method reached an equal error rate (equal false positive and false negative rate) of 4.2 percent. Besides a comparison between 3D scans and 2D images, one can also compare 3D facial scans with other 3D facial scans, for example for access authorization. A commercial system providing 3D face verification was recently brought onto the market by A4vision [2]. In this thesis we examine the possibilities of automated landmark detection for 3D facial comparison. First, we try to find a set of absolute distances between facial landmarks that can be used for facial comparison and we determine their discriminating value. To find such a set of distances, we propose two measures: one based on the variance and correlation of those distances and one based on the Fisher discriminant analysis. Second, we present an algorithm for the automated detection of 11 landmarks. The algorithm is based on prior information about the local surface of the landmarks combined with geometrical information. The prior information is based on the manually located landmarks of the CAESAR-survey and analysed with bump hunting.

1.2 Overview of Thesis

This thesis is organized as follows. In section 2 an introduction is given on face recognition by describing the recent developments in 2D and 3D face recognition algorithms. The 3D whole body scans of the CAESAR-survey used for the experiments are described and tested for errors in section 3. In section 4 we propose a new measure to find a set of distances between landmarks that is suited for facial comparison, based on the absolute distances between those landmarks. Also, a discriminating value is calculated for the landmarks. The proposed landmark detection algorithm is described in section 5. In section 6 conclusions and recommendations for further research are given.
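To give an impression of the second of the two measures mentioned above, the sketch below computes a simple, one-dimensional Fisher-style criterion (between-subject variance divided by within-subject variance) for a single landmark distance. This is only an illustrative reading of the idea in Python; the function name, data layout and numbers are assumptions and do not come from the thesis or the CAESAR data.

```python
import numpy as np

def fisher_score(distances_per_subject):
    """Fisher-style criterion for one landmark distance.

    distances_per_subject: list of 1D arrays, one array of repeated
    measurements (e.g. from several scans of the same subject) per subject.
    Returns the ratio of between-subject to within-subject variation;
    larger values suggest a more discriminating distance.
    """
    means = np.array([np.mean(d) for d in distances_per_subject])
    grand_mean = means.mean()
    between = np.sum((means - grand_mean) ** 2)
    within = np.sum([np.sum((d - np.mean(d)) ** 2) for d in distances_per_subject])
    return between / within

# hypothetical example: 3 subjects with 3 repeated measurements each (in mm)
subjects = [np.array([52.1, 52.3, 51.9]),
            np.array([57.0, 56.8, 57.2]),
            np.array([49.5, 49.9, 49.7])]
print(fisher_score(subjects))
```

A distance that varies little within a subject but strongly between subjects gets a high score under this criterion; the multivariate analysis actually used in the thesis ranks whole sets of distances rather than single distances.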

2 Related Work

One of the earliest face recognition methods was presented in 1966 by Bledsoe [3]. In one of his papers [4], Bledsoe described the difficulties of the face recognition problem: "This recognition problem is made difficult by the great variability in head rotation and tilt, lighting intensity and angle, facial expression, aging, etc. Some other attempts at facial recognition by machine have allowed for little or no variability in these quantities. Yet the method of correlation (or pattern matching) of unprocessed optical data, which is often used by some researchers, is certain to fail in cases where the variability is great. In particular, the correlation is very low between two pictures of the same person with two different head rotations." Since that time many researchers have been dealing with this subject and have been trying to find an optimal face recognition method. In this section an overview is presented of the latest, most important face recognition methods. The main purpose of this overview is to describe the recent face recognition algorithms on still images. Previous face recognition surveys were presented by Samal and Iyengar [5], Chellappa et al. [6] and Zhao et al. [7]. In the Vendor Test 2002 the performance of different commercial face recognition methods was compared [8]. Most commercial face recognition systems use one or more algorithms as presented in the literature. However, all systems conceal which algorithm is used in their application, so the commercial systems are excluded from this survey.

2.1 Metrics and Performance Measures

In this section we adopt the nomenclature of the FERET Vendor Test 2002 [9]. The face recognition problem can be separated into two different subproblems:

face identification: If a facial image must be identified, an algorithm computes the identification scores for all images in the dataset. These scores are ranked from best match to worst match. The k-rank recognition rate is the fraction of correct matches that are returned between rank 1 and rank k. The rank one recognition rate (k = 1) is used most in the literature.

face authentication: For face authentication a threshold is set to which the identification scores are compared. The result of a face authentication is a set of images which are recognised to be the same as the requested image. These results can be presented by a Receiver Operating Characteristic curve or ROC (the false acceptance rate plotted against the false rejection rate). A good measure to report is the Equal Error Rate (EER), which returns the rate for which the false acceptance rate and the false rejection rate are equal. Face authentication is also sometimes called face verification.

The images from the dataset are called a target set. A target set can consist of one image for each subject or, as is the case for most target sets, multiple images for each subject. If the latter is the case, the set is called a gallery set. The images that are matched against the target set are called a query set.

The query set consists of probe images and impostor images. The first are the images that have a match in the target set; the impostor images are the images that do not have a match in the target set.

2.2 Statistical Approaches

In most face recognition methods some form of principal component analysis (PCA) is used to reduce the number of feature dimensions. In this section we will only describe the methods that use PCA or some variant to decode the facial images. The face recognition or face verification is then performed by some classification method, also described in this section.

2.2.1 Eigenfaces

Kirby and Sirovich were the first to represent face images by projecting them to a lower dimensionality using the Karhunen-Loeve Transform [10]. That this low-dimensional representation could be used for face recognition was first shown by Turk and Pentland [11]. Face recognition based on PCA, the so-called eigenface method, was one of the best performing methods [7]. A global outline of the eigenface method is given below. A good description of PCA is presented in the book on Pattern Classification by Duda et al. [12]. The eigenface method can be separated into two parts. The first part is called the face coding step. In this step the facial images are projected onto the face space by PCA, where each image is represented by an n-dimensional vector. The second step is used for face recognition and is called the classification step. In the classification step the images are compared to each other by some distance measure (for example the Euclidean distance) or by some classification method (for example support vector machines, described in section 2.3.2). A few of these distance measures were compared by Navarette and Ruiz-del-Solar [13]. Pima and Aladjem proposed a new classifier called regularized discriminant analysis (RDA) that decreased the misclassification error of PCA [14]. However, when RDA was used for the classification based on linear discriminant analysis, as described in section 2.2.2, the misclassification error remained the same. PCA-based face recognition is commonly used, although it is very sensitive to changes in illumination, facial expression or pose of the head. Also, many researchers use PCA as a reference method when presenting the recognition rate of their newly proposed method.

2.2.2 Linear Discriminant Analysis

Another way of coding faces is based on Linear Discriminant Analysis (LDA), introduced by Fisher [15]. Belhumeur et al. were the first to apply LDA to face recognition [16]. Their results showed a large improvement on the regular PCA-based methods. For images with large illumination differences the presented LDA method had an error rate of 5%, whereas the PCA method had an error rate of over 40%. In some articles this method is referred to under the name Multiple Discriminant Analysis (MDA) or fisherfaces. A good description of LDA is presented in the book on Pattern Classification by Duda et al. [12].
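As an illustration of the two-step eigenface pipeline described above (PCA face coding followed by a distance-based classification), the sketch below codes a gallery with scikit-learn's PCA and identifies probes with a Euclidean nearest-neighbour rule, reporting a rank-one recognition rate. The toy data, the number of components and the helper names are assumptions; none of the cited methods is reproduced here.

```python
import numpy as np
from sklearn.decomposition import PCA

def rank_one_recognition(gallery, gallery_ids, probes, probe_ids, n_components=10):
    """Eigenface-style identification: PCA face coding + Euclidean nearest neighbour.

    gallery, probes: arrays of shape (n_images, n_pixels) with flattened face images.
    Returns the rank-one recognition rate of the probe set against the gallery.
    """
    pca = PCA(n_components=n_components)          # face coding step
    gallery_codes = pca.fit_transform(gallery)
    probe_codes = pca.transform(probes)

    correct = 0
    for code, true_id in zip(probe_codes, probe_ids):   # classification step
        dists = np.linalg.norm(gallery_codes - code, axis=1)
        if gallery_ids[np.argmin(dists)] == true_id:
            correct += 1
    return correct / len(probe_ids)

# hypothetical toy data: 20 gallery images and 10 slightly perturbed probes (64x64 pixels)
rng = np.random.default_rng(0)
gallery = rng.random((20, 64 * 64))
probes = gallery[:10] + 0.01 * rng.random((10, 64 * 64))
print(rank_one_recognition(gallery, np.arange(20), probes, np.arange(10)))
```

The same skeleton applies to the LDA-based coding of the next subsection: only the projection fitted in the face coding step changes, while the classification step stays the same.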

Zhao et al. proposed a combination of PCA and LDA [17]. They found that this approach outperforms both single methods. Tested on a dataset of 298 images, the combination reached a recognition rate of 95%. A comparison between PCA and LDA was performed by Martinez et al. [18]. They claimed that PCA could outperform LDA in some cases. They performed several experiments, varying the dimensionality of the face space and the size of the training sample, and found that PCA is more robust to small training samples than LDA. The reason for PCA to outperform LDA was that the underlying classes couldn't be correctly represented when using a small training sample. Navarette and Ruiz-del-Solar [13] also compared LDA and PCA to each other. Their results supported the conclusions of Martinez. Some studies were performed to improve the results of LDA when only a few training images for each subject are available [19, 20, 21].

2.2.3 Deformable Templates

One of the first articles on template methods was published by Brunelli and Poggio [22]. They used template images of the eyes, nose and mouth for face recognition and reached a recognition rate of 100% on a dataset of neutral frontal faces of 47 subjects. Cootes and Taylor presented a point distribution model [23]. This point distribution model was later presented by Lanitis et al. as the Active Shape Model (ASM) [24]. The Active Shape Model was used to find points on a shape or contour by placing constraints on that shape. In the same article the Active Appearance Model (AAM) was presented. The benefit of an AAM was that it used the intensity values instead of only a point distribution model. On a dataset of 200 images the ASM only had a recognition rate of 70%, the AAM reached a recognition rate of 84% and a combination of both reached a recognition rate of 95.5%. But when the combined method was tested on a difficult dataset with occlusion images of 90 subjects, a recognition rate of 48% was reached. The AAM and the ASM are extensively described in a recent technical report of Cootes and Taylor [25]. Jones et al. used a variation of the AAM presented as the morphable model [26]. They showed that the morphable model is more efficient than the AAM. Their conclusions were supported by Xu et al. [27]. This morphable model was recently extended to 3D by Blanz and Vetter [28], as described in section 2.5.2. Xue et al. proposed the Bayesian Shape Model (BSM), which adds prior information to the Active Shape Model [29]. Their results showed that the BSM extracted the facial features more accurately than the ASM. Zhang et al. proposed to use a constrained shape model (CSM) instead of the active shape model [30]. They proposed to use Gabor wavelets to fit the shape model more accurately. With this adjustment the average error rate decreased from 3.10% (normal ASM) to 1.94% (CSM). The same idea was also proposed by Jiao et al. under the name W-PCA and was only applied to decode the faces [31].

2.2.4 Other Methods

Many variations on PCA and LDA have been proposed in the literature. All methods were presented with one purpose: making the existing methods more robust to illumination and pose changes.

One way of dealing with illumination was to exclude the first, largest eigenvectors from the PCA method, since they coded most of the illumination variation in a face [16]. An improvement on removing the largest eigenvectors was proposed by Kim et al. [32]. They proposed to apply independent component analysis (ICA) to the subspace remaining after the largest eigenvectors were removed. Their results showed that the error rate of their proposed method was lower than the baseline PCA method, but higher than the baseline LDA method or second-order PCA [33]. Hwang et al. compared four variations of PCA to each other (PCA, correlation, PCA without the first three components and LDA) on a database of 300 subjects [34]. Their database consisted of only Asian persons. In their results they described how these algorithms performed under different facial expressions and different occlusions of the face. LDA performed best, with an average recognition rate of 90%. Hwang et al. also showed that correlation outperformed baseline PCA and PCA without the three most important eigenvectors. Moghaddam et al. extended linear discriminant analysis by using a probabilistic distance measure instead of the Euclidean distance [35]. To this end they combined two low-dimensional subspaces, one for intra-personal changes and one for inter-personal changes, in a Bayesian similarity measure. An experiment from the FERET evaluation [36] showed that this method achieved a recognition rate of 95% for frontal images, where the standard PCA method had a recognition rate of 83% for frontal images. Wang and Tang combined PCA, LDA and the Bayesian similarity measure of Moghaddam into one face recognition method [37]. The first results showed that the combined face recognition method had a better performance than the three separate methods. Yang applied kernel PCA (KPCA) and kernel LDA (KLDA) to face recognition [38]. His newly proposed method had an error rate 2% lower than baseline PCA or baseline LDA. Zhou et al. proposed to combine the intra-personal space of Moghaddam [35] with the KPCA method [39]. Their results showed that the intra-personal subspace performed better than the PCA, LDA, KPCA and KLDA methods. The intra-personal method combined with KPCA had the same recognition rate as the intra-personal subspace method. Other variations in subspace methods of eigenfaces are a PCA-mixture method introduced by Kim et al. [40], second-order eigenfaces introduced by Wang and Tan [33] and a combination of both, the second-order mixture-of-eigenfaces method introduced by Kim et al. [41]. Zhao and Chellappa proposed in [42] a shape-from-shading (SFS) method for preprocessing. This SFS-based method used a depth map for generating synthetic frontal images. The LDA coding was applied to the synthetic images instead of the original images. The recognition rate increased by 4% when the synthetic images were used for LDA coding instead of the original images. Hu et al. proposed to use one neutral frontal image to create a 3D model, which is then used to create synthetic images under different poses, illuminations and expressions [43]. Using this 3D model instead of the normal faces to apply PCA or LDA, the recognition rate increased by an average of 10% for the half-profile images. A similar idea was proposed earlier by Lee and Ranganath, who presented a combination of an edge model and a color region model for face recognition after the synthetic images were created by a deformable 3D model [44].
Their method was tested on a dataset with 15 subjects and reached a recognition rate of 92.3% when 10 synthetic images per subject were used and 26.2% when only one image for each subject was used.

Guntruk et al. used super-resolution before projecting the faces onto the eigenspace [45]; they called their method eigenface-domain super-resolution face recognition. Compared with pixel-domain super-resolution face recognition methods [46, 47, 48], the eigenface-domain super-resolution improved the recognition rate by 30%. Jin et al. proposed to combine the Foley-Sammon transformation and LDA to find optimal discriminant vectors that are uncorrelated with each other [49]. Tested on the ORL and NUST603 databases, their method reached error rates of 2.5% and 1.5%, respectively.

2.3 Machine Learning Approaches

Different machine learning approaches have been presented during the past years. These approaches were not only used for face recognition, but also for gender classification, expression recognition, et cetera. In this section we will discuss the approaches that were presented for face recognition.

2.3.1 Face Bunch Graph Matching

Wiskott et al. introduced the Gabor wavelets for face recognition. In their first article they presented how Face Bunch Graphs were able to describe a face [50]. A few years later a full face recognition method was presented based on these Gabor wavelets [51]. With this method Wiskott et al. reached a recognition rate of 98% on frontal images of the FERET database. For profile images a recognition rate of 84% was reached, but for half-profile images the recognition rate was 57% [51]. The method was tested on a dataset of 250 subjects.

2.3.2 Support Vector Machines

As mentioned before in section 2.2, support vector machines have been used in face recognition as a classification method after the face coding step. Mostly, principal component analysis was used for the face coding step. Support Vector Machines (SVM) are trained to distinguish between different classes or subjects by placing a hyperplane between the different classes to minimize the risk of misclassification. Detailed descriptions of SVMs were published in [52, 53, 12]. Jonsson et al. compared the support vector machines with other common classification systems used after decoding the faces by PCA [54]. They found that when PCA is used as a face decoder, the support vector machines outperform nearest neighbor and correlation methods, because the SVM could find the vectors for discriminating between classes. However, when LDA was used for the decoding of the faces, the relevant discriminating vectors were already extracted by LDA and the support vector machines performed equally to the nearest neighbor or correlation methods. Their conclusions were supported by Saedghi et al., who performed a similar test on the European BANCA database [55]. Lee et al. proposed to improve the performance of SVM by combining the SVM classifier with the nearest neighbor rule [44]. Their approach reached the recognition performance of the SVMs but had the low computational costs of the nearest neighbor rule. Costen and Brown used an appearance model as face decoder and proposed to use a sparse SVM as classifier [56]. On a database with 22 subjects their approach had a maximal error rate of 4.5%.
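To make the face coding plus SVM classification pipeline from this subsection concrete, the sketch below chains scikit-learn's PCA and SVC into one identification model. It is a minimal illustration on hypothetical data; the kernel, the number of components and the toy images are assumptions, not choices taken from the cited papers.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# hypothetical toy data: 40 flattened face images of 4 subjects (10 images each)
rng = np.random.default_rng(1)
faces = rng.random((40, 32 * 32))
labels = np.repeat(np.arange(4), 10)

# face coding by PCA followed by an SVM classifier that separates the subjects
# with hyperplanes in the reduced face space
model = make_pipeline(PCA(n_components=15), SVC(kernel="linear"))
model.fit(faces, labels)

# identify a new (here: slightly perturbed) probe image
probe = faces[0] + 0.01 * rng.random(32 * 32)
print(model.predict(probe.reshape(1, -1)))
```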

Heisele et al. compared one component-based approach and two global approaches of SVM [57]. The first global approach did not take intra-personal changes into account. In the other global approach clustering was performed to model the intra-personal changes of each subject. For the component-based approach 10 parts of the face were selected for recognition. Their results showed that the component-based approach performs best, with a recognition rate of 90%, and that the clustered global approach outperformed the non-clustered approach. In one of his latest publications Heisele described how the components for the component-based approach can be found automatically [58].

2.3.3 Other Methods

Marcel proposed a multi-layer perceptron (MLP) for face verification instead of the support vector machines [59]. He extracted a feature space from each image by both principal component analysis and linear discriminant analysis. Although no recognition rate was given, Marcel stated that the MLP methods performed better than the SVM method on the same samples. Yang et al. proposed to extend the Gabor wavelet method with AdaBoost trained for intra- and extra-personal changes [60]. This approach reduced the number of features needed to obtain the same recognition rate, which reduced the computational costs. Wang et al. presented the recognition performance of landmark-based face recognition for 2D images and for 3D models [61]. To detect the landmarks in 2D images Gabor wavelets were used, and for the landmark detection in 3D models point signatures were used, as described in section 2.5.1. After the landmarks were detected, PCA was used as decoder method and the classification was performed by SVMs. Their results showed that 3D landmarks alone reached a recognition rate of 85%, whereas 2D landmarks alone reached a recognition rate of 87%. If both 2D and 3D landmarks were combined, they reached a recognition rate of 89%. The authors remarked that these results could also be influenced by the number of landmarks used for face recognition, since for the point signatures 4 landmarks were used, for the Gabor wavelets 6 landmarks and for the combination 12 landmarks. Nefian and Heyes proposed to use Hidden Markov Models (HMM) for face recognition [62]. They reached a recognition rate of 85% on a dataset of 40 subjects. In their second attempt they proposed an embedded HMM model and reached a recognition rate of 98% on the same dataset [63].

2.4 Other Approaches

2.4.1 Intensity Based

Takács and Wechsler [64] proposed a method based on the edges in a face. The edges of a face were detected by a Sobel filter. The resulting images were matched based on the Hausdorff distance. Their method was tested on 320 frontal neutral images from the FERET database and achieved a recognition percentage of 92%. Chen et al. proposed to use a probability map to indicate the probability that a landmark is placed on a particular spot in the face [65]. This probability map is combined with a simplified 3D model which holds the geometric information of the face. They claimed that their landmark finding method is more accurate than the ASM model, although no results are given.
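As a small illustration of the edge-based matching idea of Takács and Wechsler, the sketch below extracts Sobel edge points with SciPy and compares two edge maps with a symmetric Hausdorff distance. The threshold, the random toy images and the function names are assumptions; the preprocessing and matching details of the original method are not reproduced.

```python
import numpy as np
from scipy.ndimage import sobel
from scipy.spatial.distance import directed_hausdorff

def edge_points(image, threshold=0.5):
    """Return the (row, col) coordinates of Sobel edge pixels."""
    gx = sobel(image, axis=1)
    gy = sobel(image, axis=0)
    magnitude = np.hypot(gx, gy)
    return np.argwhere(magnitude > threshold * magnitude.max()).astype(float)

def edge_distance(image_a, image_b):
    """Symmetric Hausdorff distance between the edge maps of two images."""
    a, b = edge_points(image_a), edge_points(image_b)
    return max(directed_hausdorff(a, b)[0], directed_hausdorff(b, a)[0])

# hypothetical toy images; smaller distances indicate more similar edge patterns
rng = np.random.default_rng(2)
face_1 = rng.random((64, 64))
face_2 = face_1 + 0.05 * rng.random((64, 64))
print(edge_distance(face_1, face_2))
```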

2.4.2 Infra-Red Images

A first face recognition method on infra-red images was presented by Wilder et al. [66]. They concluded that visible images reached the same recognition rate as infra-red images and that a fusion of both image types performs best. In another study, Gyaourova et al. showed that face recognition with infra-red light is sensitive to glasses [67]. They proposed a fusion of visible images and infra-red images to create a face recognition method that is invariant to subjects wearing glasses. Pan et al. showed that for near infra-red light (wavelength between 700 and 1000 nm) the reflectance of the human skin has large differences between subjects but small differences between different places on the skin of the same subject. Their face recognition was tested on hyperspectral images of 200 subjects with different poses of the head. Their best result achieves a face recognition rate of 90%. Recent studies on face recognition using long-wave infra-red light (LWIR) were performed by Socolinsky et al. [68, 69]. Their main conclusion was that the LWIR methods outperform statistical approaches. However, they also mentioned the problem that the acquisition of LWIR images is more expensive than that of visible images.

2.5 3D Models

In the last few years 3D facial models have become easier to acquire, since the acquisition techniques have improved. Therefore, some face recognition methods have been extended for 3-dimensional purposes. Using 3D models one can deal with one main problem in 2D face recognition: the pose of the head. Also, the surface curvature of the head can now be used to describe a face. A recent survey of 3D face recognition was presented by Bowyer [70]. In this section we will give an overview of the recent work done in the area of 3D face recognition.

2.5.1 Surface Based Approaches

Local Methods
Gordon proposed to use the Gaussian and mean curvature combined with depth maps to extract the regions of the eyes and the nose. He matched these regions to each other and reached a recognition rate of 97% on a dataset of 24 subjects [71]. Moreno et al. used both median and Gaussian curvature for the selection of 35 features in the face describing the nose and eye region [72]. The best recognition rate was reached on neutral faces, with a recognition rate of 78%. Xu et al. proposed to use Gaussian-Hermite moments as local descriptors combined with a global mesh [73]. Their approach reached a recognition rate of 92% when tested on a dataset of 30 subjects. When the dataset was increased to 120 subjects, the recognition rate decreased to 69%. Chua et al. [74, 75] introduced point signatures to describe 3D landmarks. They used point signatures to describe the forehead, nose and eyes. Their method reached a recognition rate of 100% when tested on a dataset with 6 subjects. Wang et al. used the point signatures to describe local points on a face (landmarks). They tested their method on a dataset of 50 subjects and compared their results with the Gabor wavelet approach [61]. Their results showed that point signatures alone reached a recognition rate of 85%, whereas the Gabor wavelets reached a recognition rate of 87%.

If both 2D and 3D landmarks were combined, they reached a recognition rate of 89%. The authors remarked that these results could also be influenced by the number of landmarks used for face recognition, since for the point signatures 4 landmarks were used, for the Gabor wavelets 6 landmarks and for the combination of both 12 landmarks were used. Another approach to 3D face recognition with landmarks was proposed by Irfanoğlu et al. [76]. They automatically find landmarks based on a point cloud template combined with the mean and Gaussian curvature. The matching is done based on the Euclidean distance between the landmarks of the different faces. On a dataset of 30 subjects with 3 scans per subject, they reach a recognition rate of 96.7%.

Global Methods
One global method based on curvature was recently presented by Wong et al. [77]. The surface of a facial model was represented by an Extended Gaussian Image (EGI) to reduce the 3D face recognition problem to a 2D histogram comparison. The proposed measure was the multiple conditional probability mass function classifier (MCPMFC). Tested on a dataset of 5 subjects with 50 scans per subject, the MCPMFC had a recognition rate of 80%, whereas a minimum distance classifier (MDC) reached a recognition rate of 68%. However, a test on synthetic data showed that for both methods the recognition rate decreased by 10% when the dataset was increased from 6 subjects to 21 subjects. Papatheodorou and Rueckert proposed to use a combination of a 3D model and the texture of a face [78]. They also proposed some similarity measures for the rigid alignment of two faces, for 3D models alone and for 3D models combined with the texture. Their results showed an increase for frontal images when adding a texture to the model. Beumier and Acheroy proposed to use vertical profiles of 3D models for face recognition. Their first attempt was based on three profiles of one face and had an error rate of 9.0% when it was tested on a dataset of 30 subjects [79]. In their second attempt they added grey value information to the matching process [80]. This attempt reduced the error rate to 2.5% when it was tested on the same database. Wu et al. proposed to perform 3D face recognition by extracting multiple horizontal profiles from the 3D model [81]. By matching these profiles to each other they reached an error rate between 1% and 5.5%, tested on a database with 30 subjects.

2.5.2 Template Matching Approaches

Blanz, Vetter and Romdhani proposed to use a 3D morphable model for face recognition on 2D images [82, 28, 83]. With this method, tested on a dataset of 68 subjects, they reached a recognition rate of 99.8% for neutral frontal images and a recognition rate of 89% for profile images. Huang et al. added a component-based approach to the morphable model [84], based on the approach of Heisele [58]. However, the recognition rate for all approaches of the morphable model was between 75% and 99%. Naftel et al. presented a method for automatically detecting landmarks in 3D models by using a stereo camera [85]. The landmarks were found on the 2D images by an ASM model. These landmark points were transformed to the 3D model by the stereo camera algorithm.

This algorithm was correct in 80% of all cases when tested on a dataset of 25 subjects. A similar idea was proposed by Ansari and Abdel-Mottaleb [86]. They used the CANDIDE-3 model [87] for face recognition. Based on stereo images, landmark points around the eyes, nose and mouth were extracted from the 2D images and converted to 3D landmark points. A 3D model was created by transforming the CANDIDE-3 generic face to match the landmark points. The eyes, nose and mouth of the 3D model were matched separately during the face recognition. Their method achieved a recognition rate of 96.2% using a database of 26 subjects. Lu et al. used the generic head from Terzopoulos and Waters [88], which they adapted for each subject based on manually placed feature points in the facial image [89]. Afterwards the models were matched based on PCA. This method was tested on frontal images and returned the correct face within the best 5 matches in 97% of all cases.

2.5.3 Other Approaches

The original principal component method for 3D facial models was implemented by Mavridis et al. for the European project HiScore [90]. Chang et al. compared the performance of 3D eigenfaces and 2D eigenfaces of neutral frontal faces on a dataset of 166 subjects [91]. They found no real difference in performance between the 2D eigenfaces and the 3D eigenfaces. However, a combination of both dimensionalities scored best of all, with a recognition rate of 82.8%. Xu et al. proposed to use 3D eigenfaces with nearest neighbor and k-nearest neighbor as classifiers [92]. Their approach reached a recognition rate of around 70% when tested on a dataset of 120 subjects. Bronstein et al. proposed to transform the 3D models to a canonical form before applying the eigenface method [93]. They claimed that their method could discriminate between identical twins and was insensitive to facial expressions, although no recognition rates were given. Tsalakanidou et al. proposed to combine depth maps with intensity images. In their first attempt they used eigenfaces for the face recognition and their results showed a recognition rate of 99% for a combination of both on a database of 40 subjects [94]. In a second attempt embedded hidden Markov models were used instead of eigenfaces to combine the depth images and intensity images [95]. This approach had an error rate between 7% and 9%.

2.6 Discussion and Conclusion

It is hard to compare the results of the different methods with each other, since the experiments presented in the literature are mostly based on differently sized datasets. For example, one method was tested on neutral frontal images and had a high recognition rate, while another method was tested on noisy images with different facial expressions or head poses and had a low error rate. Besides the performance of the algorithm, the computational time is also important. Unfortunately, the computational costs are only mentioned in a few articles, so no fair comparison of the computational costs can be made. Some authors presented combinations of different approaches for a face recognition method, and these all performed a little better than the separate methods. If the error rate decreases significantly while the recognition rate increases only a little, the combined method is still preferred.

But if combining two methods greatly increases the computational costs, the combined method would not be preferable at all. Most interesting for this survey were the studies that presented a comparison, like [13, 54, 55]. Phillips et al. present comparison studies performed on the FERET database [8]. The latest FERET test was performed in 2002 on commercial systems [9]. The FERET test performed on different algorithms is presented in [36]. In the latter survey the computational costs were taken into account to some extent. An important conclusion from this survey was that the recognition rates of all methods have improved over the years. The dynamic graph matching approach of Wiskott et al. [51] had the best overall performance on identification. For face verification the combination of PCA and LDA presented by Zhao et al. [17] performed best. Although published in 2000, the test itself was performed in 1997, which makes a new, more recent test preferable. An implementation of face recognition methods is published by Beveridge and Draper [96]. In their application, four different algorithms are implemented and downloadable: normal PCA, a combination of PCA and LDA [17], the Bayesian approach of Moghaddam [35] and an implementation of the Elastic Bunch Graph Matching [51]. So, if one likes, one can compare these algorithms on one's own dataset. In table 1 a summary is given of the most important and successful face recognition methods. One can see that the 3D face recognition approaches are still tested on very small datasets. However, the datasets are growing over the years, since better acquisition equipment becomes available. By increasing a dataset, however, the recognition rate will decrease. So the algorithms must be adjusted and improved before they will be able to handle large datasets with the same recognition performance. Another disadvantage of most presented 3D face recognition methods is that most algorithms still treat the human face as a rigid object. This means that the methods aren't capable of handling facial expressions. In contrast to 3D face recognition algorithms, most 2D face recognition algorithms have already been tested on large datasets and are able to handle the size of the data tolerably well. In the last few years more and more 2D face recognition algorithms have been improved and tested on less perfect images, like noisy images, half-profile images, occlusion images, images with different facial expressions, et cetera. Although not one algorithm can be assumed to handle the difficult images well enough, an increasing trend in performance can be found. Although the 2D face recognition methods seem to outperform the 3D face recognition methods now, in the future it may be the other way around. The 3D models hold more information about the face, like surface information, that can be used for face recognition or subject discrimination. Compared to 2D face recognition, only a little research has been performed on 3D face recognition. Therefore, 3D face recognition is still a very challenging but very promising research area.

baseline PCA              2D  [36]  ?            no
baseline LDA              2D  [54]  200 at least yes
baseline Correlation      2D  [36]  ?            no
PCA-LDA                   2D  [7]   ?            yes
Bayesian PCA              2D  [35]  ?            no
ASM-AAM                   2D  [24]  ?            no
ASM-AAM                   2D  [24]  ?            yes
Face Bunch Graph          2D  [51]  ?            no
Face Bunch Graph          2D  [51]  ?            yes
Infra-Red images          2D  [97]  ?            no
Gaussian images           3D  [71]  ?            no
Point Signatures          3D  [75]  ?            no
Extended Gaussian Images  3D  [77]  ?            no
Profiles                  3D  [80]  26 1? 2.5    no
Morphable model           3D  [82]  ?            no
Morphable model           3D  [82]  ?            yes
3D eigenfaces             3D  [91]  ?            no
3D eigenfaces             3D  [92]               yes
Canonical forms           3D  [93]  157 1??      yes

Table 1: A summary of the most important presented 2D and 3D face recognition methods. The columns are: method, modality, reference, number of subjects, images per subject, rank one recognition performance in %, error rate in % and variation in images. The variation in images column shows whether the images in the dataset were taken under different conditions, like facial expression, illumination, head pose et cetera.
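The rank-one recognition and error-rate columns of table 1 follow the definitions from section 2.1. The sketch below shows, under assumed conventions (similarity scores where a higher value means a better match), how a rank-k recognition rate and an equal error rate could be computed; the data and function names are hypothetical and not tied to any of the methods in the table.

```python
import numpy as np

def rank_k_recognition_rate(score_matrix, true_match, k=1):
    """Fraction of probes whose correct gallery entry appears in the top k matches.

    score_matrix: (n_probes, n_gallery) similarity scores, higher = better match.
    true_match:   index of the correct gallery entry for each probe.
    """
    order = np.argsort(-score_matrix, axis=1)            # best match first
    hits = [true_match[i] in order[i, :k] for i in range(len(true_match))]
    return float(np.mean(hits))

def equal_error_rate(genuine_scores, impostor_scores):
    """Approximate EER: the threshold where false acceptance equals false rejection."""
    thresholds = np.sort(np.concatenate([genuine_scores, impostor_scores]))
    far = np.array([(impostor_scores >= t).mean() for t in thresholds])  # false acceptance
    frr = np.array([(genuine_scores < t).mean() for t in thresholds])    # false rejection
    i = int(np.argmin(np.abs(far - frr)))
    return (far[i] + frr[i]) / 2

# hypothetical scores
scores = np.array([[0.9, 0.1], [0.2, 0.8]])
print(rank_k_recognition_rate(scores, [0, 1]))
rng = np.random.default_rng(3)
print(equal_error_rate(rng.normal(2.0, 1.0, 500), rng.normal(0.0, 1.0, 500)))
```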

3 Data and Materials

3.1 Data Description

For this thesis we used the whole body scans of the Dutch and Italian parts of the CAESAR-survey [98]. The main goal of the CAESAR-survey was to acquire 3D whole body scans of subjects. It took place from December 1997 until December 2001 in America, Italy and the Netherlands. Each subject was scanned in three different poses: standing, relaxed sitting, and sitting in such a way that the whole body was visible to the cameras. Also, demographic information was collected for each subject, like age, data collection location, date, education level, ethnic group, family income in the past year, gender, present occupation, marital status, etc. Furthermore, 73 landmarks were manually placed on each subject before scanning. Besides the 3D whole body scan, a separate file was available with the 3D coordinates of these landmarks. The landmarks that were used in the CAESAR-survey are described in detail in Appendix A.

For the data analysis and the 3D face comparison problem we extracted the facial region from each whole body scan using bounding boxes, as described in [99]. In this facial region 8 landmarks were defined by the CAESAR-survey:

sellion (se): Point of greatest indentation of the nasal root depression.
right infraorbitale (r. io): Lowest point on the inferior margin of the right orbit, marked directly inferior to the pupil.
left infraorbitale (l. io): Lowest point on the inferior margin of the left orbit, marked directly inferior to the pupil.
supramenton (sp): Point of greatest indentation of the mandibular symphysis, marked in the midsagittal plane.
right tragion (r. tr): Notch just above the right tragus (the small cartilaginous flap in front of the ear hole).
left tragion (l. tr): Notch just above the left tragus (the small cartilaginous flap in front of the ear hole).
right gonion (r. go): Inferior posterior right tip of the gonial angle (the posterior point of the angle of the mandible, or jawbone). This point is difficult to find when covered with a lot of tissue.
left gonion (l. go): Inferior posterior left tip of the gonial angle (the posterior point of the angle of the mandible, or jawbone). This point is difficult to find when covered with a lot of tissue.

The locations of these landmarks are shown in figure 1.

3.2 Procedures of Acquiring 3D Whole Body Scans

In this section the acquisition of the 3D whole body scans and the corresponding landmark coordinates for the CAESAR-survey is described. Since the data collection was done in America, the Netherlands and Italy, differences in scanning procedure occurred during the survey [100]. Some of these differences had an impact on the quality of the 3D whole body scans. These differences and their impact are also described here.
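The analyses later in this thesis are built on the absolute distances between the facial landmarks listed above. The sketch below shows one way such distances could be computed from per-scan landmark coordinates; the dictionary layout, the abbreviations used as keys and the coordinate values are illustrative assumptions, not the CAESAR file format.

```python
import itertools
import numpy as np

# hypothetical landmark coordinates (in mm) for one facial scan,
# keyed by the abbreviations used in the thesis
landmarks = {
    "se":   np.array([0.0, 48.2, 95.1]),
    "sp":   np.array([1.1, -52.7, 88.4]),
    "r_tr": np.array([-68.3, 10.5, 0.0]),
    "l_tr": np.array([67.9, 11.0, 0.0]),
}

def landmark_distances(points):
    """Absolute Euclidean distance between every pair of landmarks."""
    return {
        (a, b): float(np.linalg.norm(points[a] - points[b]))
        for a, b in itertools.combinations(sorted(points), 2)
    }

for pair, dist in landmark_distances(landmarks).items():
    print(pair, round(dist, 1), "mm")
```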

Figure 1: The positions of the facial landmarks of the CAESAR-survey. These landmarks were manually located. (This scan was provided by the Netherlands Forensic Institute and is not part of the CAESAR-survey.)

3.2.1 Materials

Scanning Devices
Two different scanning devices were used to scan the bodies. In Italy and America the 3D whole body scans were acquired using a Cyberware 3D body scanner [101]. In the Netherlands a Vitronic 3D body scanner was used [102]. Both scanning devices have a square platform with a camera block on each corner. A camera block consists of four depth cameras and one texture camera. So, the scanning process for one subject resulted in 16 point clouds and 4 texture files that had to be merged into one 3D whole body scan, as described in section 3.2.3. In figure 2 the four camera blocks around the platform are shown, together with the angles between the camera blocks. For the Vitronic scanning device it is known that α = γ and β = δ, but that α = γ > β = δ. For the Cyberware scanner these ratios weren't known. The Cyberware scanning device was extensively tested and calibrated before the survey started. However, the Vitronic scanning device wasn't tested as well as the Cyberware scanner. As a consequence, it wasn't quite clear what the differences between both scanning devices were when the survey started [100]. These differences were revealed and dealt with during the survey. The Vitronic scanning device had a higher scanning resolution than the Cyberware scanning device, but the scanning volume of the Vitronic was smaller. Because of the small scanning volume of the Vitronic scanner, some body parts weren't visible to the cameras. To solve this problem, the orientation of some scanning poses was changed for this scanning device, as described in section 3.2.2. Another problem was found in the software used for the landmark picking. This software was developed for the Cyberware scanner and later on adjusted for the Vitronic scanner. The software could be applied to scans acquired by the Vitronic scanner, but only to scans that were aligned by hardware calibration. The landmark picking software couldn't handle the software alignment of the scans, since there was too much loss of resolution. For the Cyberware scanner the software alignment could be performed without loss of resolution in the landmark picking software.

This means that the landmarks were picked more accurately for the scans acquired by the Cyberware scanner than for the scans acquired by the Vitronic scanner.

Landmarks
To indicate the landmarks on the body, white stickers and bumpers were used. The stickers were flat, round and had a diameter of 1 cm. The bumpers were little cubes with a width, height and depth of 0.5 cm and were only used at places difficult to see in the texture file, for example the landmarks on the shoulder. In this thesis, both stickers and bumpers are referred to as markers. In the Netherlands 6 observers were hired to place the markers. These observers were given all the information needed for placing the markers at the right spot on the body. The landmark picking was done semi-automatically. A program identified the locations of the landmarks based on the white color of the markers. These locations were presented to the observer, who picked the accurate location of the landmark. It could occur that a landmark was visible on two texture files. In that case the program selected one texture file to detect the landmark location. This landmark location was presented to the observer. After the observer had picked the landmarks from the texture file, the location and landmark name were automatically written to a landmark file. To limit the amount of work, the observers picked the landmarks for pose a and pose b only.

3.2.2 Scanning Process

Each subject to be scanned was dressed in tight shorts. The women wore a tight top, too. With this outfit most of the body is uncovered and visible to the scanner. One of the observers placed all 73 landmarks on the body before the actual scanning. For the scanning itself each subject had to take three different poses. First, one was asked to stand still with the arms stretched beside the body (pose a). For the second pose (pose b) each subject was asked to sit down with his arms on his knees. For the last pose (pose c) each subject, still seated, was asked to lift his arms above his head. The three poses of the CAESAR-survey are shown in figure 4. The resolution problems described in section 3.2.1 led to a change in the orientation of the scanning poses of the Dutch part of the survey. As described before, the scanning volume of the Vitronic was small, and to make most of the body parts visible to the depth cameras, pose b had to be rotated by 90 degrees. Poses a and c of the Dutch survey kept the same orientation. In fact, poses a and c had the same orientation as all three poses of the American and Italian surveys. The orientations of the three poses of the Dutch survey are given in figure 3.

3.2.3 Post processing of the Point Clouds

After the scanning and landmark picking, the 16 point clouds of the depth cameras and the 4 2D textures of the color cameras had to be aligned and merged into one object in one coordinate system with a resolution of 2 mm. For more detailed information on the aligning and merging process the reader is referred to the master thesis of Suikerbuik [103]. The landmark coordinates were also transformed from 2D coordinates on the texture to 3D coordinates in the same coordinate system as the whole body scan. After the transformation each landmark was snapped to the closest point in the point cloud of the whole body scan.

Figure 2: The angles between the four camera blocks of the 3D scanning devices, measured from the middle of the platform.

Figure 3: The orientations of the three scanning positions of the Vitronic scanner, where (a) shows pose a, (b) shows pose b and (c) shows pose c. Poses a and c are both directed towards the angle β as shown in figure 2, and pose b is oriented in the direction of the angle γ.

3.3 Sanity Check

In earlier research it was found that in some cases the location of the landmarks wasn't accurate [103, 104]. Since this dataset will be used as the ground truth during this study, one needs to know the frequency and the causes of these incorrect landmark locations.

3.3.1 Previous Study

Verbaan did a study on measurement errors introduced by the observers [105]. He found an average variation of 5 mm for placing the markers on the body for all observers. In his work Verbaan also described other errors that could occur during the scanning process. He mentioned as possible errors the incorrect placement of the markers on the body, the variation in the posture of the subjects and the movement of the subject during scanning (as one can see in figure 4(c), where the subject had moved his left arm). Verbaan further concluded that the landmark picking in one texture could only cause a minimal error, due to an excellent zoom function.
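As an illustration of the snapping step described in section 3.2.3 and of the worst-case snap error discussed in the paragraph below, the sketch assumes a SciPy KD-tree for the nearest-point lookup and a regular 2 mm point grid; the grid, the landmark coordinates and the function name are hypothetical and this is not the software used in the CAESAR-survey.

```python
import numpy as np
from scipy.spatial import cKDTree

def snap_landmarks(landmarks_xyz, point_cloud):
    """Snap each landmark to its closest point in the scan's point cloud."""
    tree = cKDTree(point_cloud)
    dists, idx = tree.query(landmarks_xyz)
    return point_cloud[idx], dists

# hypothetical data: a point cloud on a 2 mm grid and two landmark coordinates (in mm)
grid = np.stack(np.meshgrid(*[np.arange(0.0, 20.0, 2.0)] * 3), axis=-1).reshape(-1, 3)
landmarks = np.array([[3.0, 5.2, 7.9], [10.4, 2.2, 0.6]])
snapped, dists = snap_landmarks(landmarks, grid)
print(snapped, dists)

# worst case for a 2 mm grid: the landmark sits in the middle of a grid cell,
# at most 1 mm away along each axis, giving sqrt(3) ~ 1.73 mm; this is the
# bound used in the error discussion below
print(np.sqrt(3 * 1.0 ** 2))
```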

Suikerbuik rejected this theory of Verbaan on the latter issue and concluded that observers couldn't correctly pick the exact centre of landmarks with a diameter of 10 mm from a texture file [103]. Even if the observer had picked a landmark correctly, the landmark locating software snapped the landmark coordinate to the closest point of the body scan, introducing a maximal error of √3 ≈ 1.73 mm for each landmark. Also, Suikerbuik added the height of the bumpers to the possible causes of errors.

3.3.2 Errors in the Dataset

In this section we describe how we tried to find other errors than those described in the previous section, and how we tried to identify the causes of the newly found errors in the dataset. Since we only used the facial area of a whole body scan, we restricted ourselves to validating the facial landmarks. For our studies we used a dataset of 20 subjects. One observer placed the landmarks on each subject before the scanning process. Next, each subject was scanned in the three poses of the CAESAR-survey (pose a, pose b and pose c), as shown in figure 4. For each pose the subjects were scanned three times under the same circumstances (same landmarks, same pose, etc.). We refer to these three scans as scan 1, scan 2 and scan 3. The scanning of all subjects was done on the same day by the Vitronic scanner used in the Netherlands and by the Cyberware scanner used in America and Italy. Summarizing, the dataset consisted of 20 subjects with 18 whole body scans for each subject: 9 scans from the Cyberware scanner and 9 scans from the Vitronic scanner.

Empirical Approach
As described in section 3.3.1, Verbaan had mostly focused his research on the variation of landmarks caused by the observers. In this section we study the influence of the scanning procedure, as described in section 3.2. Since we were especially interested in the distances between landmarks, we set up an experiment to compare the distances between the landmarks for the different poses of the subjects. We split up the dataset for the scans made by the Cyberware and Vitronic scanning devices. Table 2 shows the results of this experiment. For each pose we calculated the differences between the distances between the landmarks. For example, the table entry (a - b, se - sp) for the column Vitronic denotes that the average distance sellion - supramenton for all 20 subjects scanned by the Vitronic is 0.06 mm larger in pose a than in pose b. One would expect that the average difference between two poses for a certain distance is 0.00 mm, since the subjects were scanned with the same landmarks on the body in each pose. For some table entries, however, the average lies over 4.0 mm (see table entry (a - c, sp - r go)). These results suggest that there could be a systematic error in the dataset. Suikerbuik noted in his report that the landmarks were snapped to points of the point cloud, introducing an error of at most 1.73 mm [103]. We took this error into account and stated that average differences above 2.0 mm couldn't be explained by the previous study. These distances could be influenced by systematic errors (bold figures in the table). One can see in table 2 that the distances which have more than one average difference above 2.0 mm are in six cases distances on the left side of the face, in four cases distances on the right side of the face and in three cases distances at the frontal part of the face. It is interesting that most of the affected distances are from and to the left and right gonion and are found in the whole body scans of the Vitronic scanning device.

It is interesting that most of the affected distances are distances from and to the left and right gonion, and that they are found in the whole body scans of the Vitronic scanning device.

Table 2: Absolute differences in mm between two poses for the separate distance variables, split up per scanning device (Vitronic and Cyberware) and per pose pair (a-b, a-c and b-c). The distance variables are: se - sp, se - l tr, se - r tr, se - l io, se - r io, se - l go, se - r go, r io - l io, r io - r tr, r io - sp, r io - r go, l io - l tr, l io - l go, l io - sp, sp - l tr, sp - r tr, sp - l go, sp - r go, r tr - r go, l tr - r tr, l tr - l go and r go - l go.

Statistical Approach

We showed in the previous section that there could be some systematic errors in the dataset. We would have liked to test all factors for systematic errors, but in the dataset we used only one observer placed and picked the landmarks, so we could not test the landmark placing and landmark picking for systematic errors. However, Verbaan performed an analysis of the variance caused by observers and could not find any systematic errors introduced by them. We therefore assume that if there is a systematic error in the dataset, it must have been introduced by the other scanning factors. Those other factors, the scanning pose and the scanning device, are tested here. In the perfect case, the factors have no significant influence on any distance and the measured distances are the same for each scanning device and scanning pose.

As statistical test we used the multivariate analysis of variance, the so-called MANOVA test [106], which is explained in more detail in Appendix B. In this case we used a two-level nested MANOVA with the scanning devices {Vitronic, Cyberware} and the poses {a, b, c} as the factors. For each scan we measured the same distances in the face as used for the empirical approach; these distances can be found in table 2.

These distances were the variables of the MANOVA test. We performed the nested MANOVA test separately for the 20 subjects. For each subject we tested on the first level the null hypothesis (H0) that there is no significant influence between the means of the Cyberware and the Vitronic scanner. On the next level (H1) we tested whether there is a significant influence of the scanning poses for the different scanning devices. Both hypotheses are summarized below:

H0: $\mu_{Cyberware} = \mu_{Vitronic}$ — there is no significant influence of the scanning device on the measurements.
Ha: $\mu_{Cyberware} \neq \mu_{Vitronic}$ — there is a significant influence of the scanning device on the measurements.

H1: $\mu_{a,Cyberware} = \mu_{b,Cyberware} = \mu_{c,Cyberware}$ and $\mu_{a,Vitronic} = \mu_{b,Vitronic} = \mu_{c,Vitronic}$ — there is no significant influence of the scanning poses on the measurements for either scanning device.
Ha: there is a significant influence of the scanning poses on the measurements for at least one scanning device.

Using Wilks' lambda as the MANOVA statistic and a significance level of 0.05, we looked at the number of subjects for which a significant influence was found. Under H0 and H1, the number of significant influences is binomially distributed with parameters 20 (number of subjects) and 0.05 (significance level). We can calculate the critical value using the t-distribution ($t_{(20),0.05}$), and can then say with 95% probability that, in case of no significant influence, there are at most 3 rejections of our hypotheses.

Table 3: The number of subjects for which H0 and/or H1 were rejected.

                    H0    H1
    all distances    1     6

The results are shown in table 3. One can see that there is no significant influence of the scanning devices, since the null hypothesis is rejected for only one subject. On the other hand, a significant influence is found for the scanning poses, since H1 is rejected for more than 3 subjects.

For a second test, we partitioned all distances into three groups: the first group consists of distances on the left part of the face, the second group of distances on the right part of the face and the last group of distances at the front of the face (se - sp, l io - r io, l tr - r tr and l go - r go). We performed this test based on our observation in the empirical approach that there appeared to be a difference between distances at the left, right and frontal parts of the face; this effect can be seen in table 2. For each group we performed the same tests as described above. The results of these tests can be found in table 4. One can see in table 4 that there is a significant influence of the scanning devices and the scanning poses for the left, the right and the frontal distances.
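The nested test itself can be reproduced with standard statistical software. The following is only an illustrative sketch using Python's statsmodels package, under assumptions that are not taken from the thesis: the distances of one subject are assumed to be available in a pandas DataFrame with one row per scan, columns d1, d2, ... for the distances and columns device and pose for the factors, and the model formula is one plausible way to express the nested design.

    import pandas as pd
    from statsmodels.multivariate.manova import MANOVA

    def nested_manova(df, distance_cols):
        """MANOVA with scanning device and pose nested within device."""
        lhs = " + ".join(distance_cols)               # all distance variables
        formula = f"{lhs} ~ device + device:pose"     # pose nested within device
        return MANOVA.from_formula(formula, data=df).mv_test()

    # Hypothetical usage for one subject (18 scans: 2 devices x 3 poses x 3 repeats):
    # res = nested_manova(df_subject, ["d1", "d2", "d3"])
    # print(res.summary())   # inspect the Wilks' lambda p-values at alpha = 0.05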

In contrast with the observation in the empirical approach, where it seemed that only the right distances were affected by errors, a significant influence is found for the left as well as the right distances. Another remarkable fact is that the scanning device has a significant effect for the three separate groups (left, right and frontal distances). This is in contrast with our first test, where all distances were placed in one group and no significant influence of the scanning device was found. If one compares the number of rejections from the first experiment with the second experiment, one can see that the number of rejections for the second experiment is much larger.

Table 4: The number of subjects for which H0 and/or H1 were rejected, for the left, right and frontal distance groups.

3.4 Discussion and Conclusion

In this section we have discussed the dataset that will be used for the analysis and later on for the automated landmark detection. We have described the scanning procedure, the differences between the Cyberware scanning device and the Vitronic scanning device, and the differences in scanning procedure that these cause. We performed an empirical test as well as a statistical test to show that the measurements of the landmarks are significantly influenced by the scanning devices and the scanning poses. The empirical test showed that there could be a systematic error of 5 mm for some distances occurring under different scanning poses and devices. It also showed that slightly more distances on the left part of the face could be affected by this error. With the statistical MANOVA test we showed that the scanning pose had a significant influence on the measurements for some distances; for the scanning device, however, no significant influence was found. Since the empirical test showed a difference between the left and the right part of the face, we split up the distances and performed the MANOVA test again on the separate groups. It was then shown that the distances at the left part of the face as well as those at the right part were significantly influenced by the scanning pose and also by the scanning device. Besides the significant influences found for both the left and the right distances, we noticed that the number of rejections for the separate groups was much larger than the number of rejections for all distances together. It is possible that the distances at one part of the face neutralize the systematic errors at the other part of the face; more research is needed on this topic.

So, we have shown that the dataset of the CAESAR-survey contains systematic errors for some distances in the face. These systematic errors result in a displacement of a landmark of at most 5 mm. In the following research we handle these systematic errors as if they are part of the intra-variance of a measurement. More details on this topic are given in the next section, in which we use the dataset of the CAESAR-survey for our landmark analysis.

We have shown in this section that there are significant influences caused by some scanning factors, but we have not yet discussed what the causes of these differences are. In this research we found two problems that could be the cause of the systematic errors.

First, the errors could be caused by the scanning position. As one can see in figure 3(b) and can read in section 3.2.2, the head of the subject is not placed in the middle of the scanning area, due to the non-optimal scanning volume of the Vitronic scanner. Since the texture cameras could not get a clear view of the side (depth) of the face, no optimal texture file could be produced for the head of each subject. The observers who picked the landmarks from the texture files were therefore not capable of accurately picking the landmarks placed in the depth of the face, like the tragion and the gonion. This could introduce an error and could explain why the left and right gonion appeared to be most influenced by the scanning factors. This could also be the case for the Cyberware scanner, since it is not exactly known how the subjects were placed on the scanning platform; we only know the orientation of the subjects for each pose.

Second, we know that the landmark picking software could only be used for the Vitronic scanner before the software alignment was done, as described in more detail earlier. So, only a hardware calibration was performed on the 3D model before the landmark picking. After the landmark picking, the software alignment of the 3D model is performed as described in [103]. When the software alignment is finished, the texture file is wrapped over the 3D model and the landmarks are snapped to model points. However, the transformation of the model caused by the software alignment was not applied to the landmark points, so these landmark points could be snapped to the wrong spot in the model.

Especially the second error cause could explain why the Vitronic scanning device shows more errors than the Cyberware scanning device in our empirical approach. The first error cause could have occurred for both scanning devices and could explain why both scanning devices show systematic errors. More research on this topic should point out what the exact cause of the systematic errors is.
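The second error cause suggests a straightforward remedy: apply the same rigid software-alignment transformation to the landmark coordinates before snapping them to the aligned model. The sketch below illustrates that idea; it is not part of the CAESAR processing pipeline, and the 4x4 homogeneous matrix T is assumed to be the transformation computed by the software alignment.

    import numpy as np

    def apply_alignment(points, T):
        """Apply a 4x4 homogeneous transformation T to an (n, 3) array of points."""
        homo = np.hstack([points, np.ones((points.shape[0], 1))])  # homogeneous coords
        return (homo @ T.T)[:, :3]                                 # back to 3D

    # Hypothetical usage: transform the picked landmarks with the same alignment
    # matrix as the 3D model, and only then snap them to the aligned point cloud.
    # aligned_landmarks = apply_alignment(picked_landmarks, T_alignment)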

Figure 4: The three scanning poses: pose a (a), pose b: relaxed seating (b), and pose c: best scan coverage seating (c).

4 Landmark Analysis

4.1 Information of a Landmark Set

When using landmarks for face recognition, a lot of the facial information is not used. The locations of the landmarks should therefore be chosen carefully, so that as much facial information as possible is used. Besides the locations of the landmarks, the number of landmarks must be picked carefully as well. More landmarks give more information about a face, so the face comparison method will perform better. But when using too many landmarks, there is a risk of creating a landmark set that uses noise in the classification process, in which case there will be a high false reject rate.

This chapter describes different approaches for selecting a set of distances between landmarks that gives us as much information as possible, but uses no information that can be considered noise. To measure the amount of information of a set, we propose to use a formula for each analysis method. Each set S consists of distances between landmark points, from here on referred to as a landmark set. The number of elements in the set is denoted by N. Each landmark set gives a certain amount of information about a face; the information of a landmark set is therefore denoted by I(S). In this section two different measures are used to find a landmark set. The first method is a measure based on variance and correlation and is described in section 4.2.1. The second method is an existing method based on the Fisher discriminant analysis and is described in section 4.2.2.

The user must decide which set is selected as the final landmark set. This can be done by using a threshold T. This threshold can be a maximal acceptable number of elements $T_N$ in the final landmark set or a minimal acceptable increase in information $T_{I(S)}$. In this thesis the final landmark set is found using a bottom-up approach: we start with an empty set and repeatedly add the element that gives the highest I(S); this continues until the manually set threshold T is reached. The algorithm used is described in the pseudo code below.

Algorithm for selecting a set of landmarks

    LIST candidateLandmarks <- all distances between all landmarks
    LIST landmarkSet <- empty
    while ( I(landmarkSet) < T_I(S) and Length(landmarkSet) < T_N ) do {
        if ( candidateLandmarks.empty() ) break;
        bestElement <- candidateLandmarks.first();
        bestInformation <- I(landmarkSet + bestElement);
        for ( element e in candidateLandmarks ) do {
            if ( I(landmarkSet + e) > bestInformation ) {
                bestInformation <- I(landmarkSet + e);
                bestElement <- e;
            }
        }
        candidateLandmarks <- candidateLandmarks - bestElement;
        landmarkSet <- landmarkSet + bestElement;
    }
    return landmarkSet;

4.2 Information Measures

Many different methods can be used as information measures. For our analysis we have chosen two: first, we propose an information measure based on the variance and correlation between the landmark distances, and second, we use the Fisher discriminant analysis as a reference method for our proposed approach. Both information measures are briefly described in this section.

4.2.1 Variance and Correlation based Information Measure

In our proposed information measure, the distances are selected based on their variance and correlation. First, the variance analysis is used to add distances between landmarks to the set. After the variance analysis, we test the resulting set for correlation between the elements. If two elements are highly correlated, they add very little information to the set, so such elements should be excluded. We chose to test on variance first, because the correlation approach would otherwise add noise to the set first (noise is uncorrelated); the variance analysis thus filters the noise from the landmark set.

Variance Analysis

When one attempts to measure a ground truth, there is always some variance in the measurements. One can divide this variance into inter-variance and intra-variance. The inter-variance of distances between landmarks is the variation of the measurements over different subjects; the discriminating value of a landmark distance grows when the inter-variance grows. The intra-variance is the variation of different measurements on the same subject; it is sometimes called the measurement error or the uncertainty margin of a measurement. If the intra-variance of a distance is large relative to the inter-variance, the discriminating value of that landmark distance will be low. This is due to the large overlap of the intra-variance with the inter-variance, which makes it difficult to tell whether two measurements belong to the same subject or to two different subjects.

For the selection of distances between landmarks we are looking for distances having a small intra-variance and a large inter-variance. In other words, we are looking for landmarks at places where human faces differ a lot and which are easy to find in a face. To capture this, we define the information I(S) of a landmark set S with N elements as the ratio of the inter- and intra-variance, as described in (1):

$$I(S) = \sum_{d=1}^{N} \frac{I_{inter}(S_d)}{I_{intra}(S_d)} \qquad (1)$$

For each element $S_d$ in S the intra-variance $I_{intra}$ and the inter-variance $I_{inter}$ are measured. If $I_{intra}$ is large and $I_{inter}$ is small, I(S) will be low; on the other hand, if the inter-variance is large and the intra-variance is small, I(S) will be high. We propose to calculate the information of a landmark set using two different measures, the sample mean deviation and the coefficient of variation. Both variance measures are described in detail in [106].

Sample mean deviation. The sample mean deviation gives information about the concentration of data samples around the sample mean. For the intra-variance of a distance d we look at the concentration of the different measurements of d for one subject around that subject's mean for distance d. The $I_{intra}$ for the sample mean deviation of the element $S_d$ is given by (2), with m the number of different subjects (classes), $n_j$ the number of elements in class j, $x_{ij}$ element i in class j and $\bar{x}_j$ the mean of class j:

$$I_{intra}(S_d) = \frac{1}{m} \sum_{j=1}^{m} \frac{\sum_{i=1}^{n_j} |x_{ij} - \bar{x}_j|}{n_j} \qquad (2)$$

For the inter-variance we are interested in the concentration of all classes around the total sample mean $\bar{x}$ for each distance $S_d$. As a representative value for each class j we use the class mean $\bar{x}_j$:

$$I_{inter}(S_d) = \frac{\sum_{j=1}^{m} |\bar{x}_j - \bar{x}|}{m} \qquad (3)$$

Coefficient of variation. The great advantage of the coefficient of variation over the sample mean deviation is the normalization of each distribution of the element $S_d$: a large distance has a larger variance than a small distance, which may influence the results when comparing a large distance with smaller distances. The intra-variance $I_{intra}(S_d)$ is described in (4) and the inter-variance $I_{inter}(S_d)$ in (6), with $s_j$ the standard deviation of class j, $\bar{x}_j$ the mean of class j, $x_{ij}$ sample i in class j, $\bar{x}$ the total mean, s the standard deviation of the total sample, $n_j$ the number of samples in class j and m the number of different classes:

$$I_{intra}(S_d) = \frac{1}{m} \sum_{j=1}^{m} \frac{s_j}{\bar{x}_j} \qquad (4)$$

with

$$s_j = \sqrt{\frac{\sum_{i=1}^{n_j} x_{ij}^2 - n_j \bar{x}_j^2}{n_j - 1}} \qquad (5)$$

and

$$I_{inter}(S_d) = \frac{s}{\bar{x}} \qquad (6)$$

with

$$s = \sqrt{\frac{\sum_{j=1}^{m} \bar{x}_j^2 - m \bar{x}^2}{m - 1}} \qquad (7)$$

Correlation Analysis

If a distance is highly correlated with at least one element in the set, it only adds a little information about the face. Therefore, we cannot include elements that have a high correlation with at least one other element in the set. To exclude these elements, we start again with an empty set; as candidate items we use the set resulting from the variance analysis. When an element d is added to the set, it must comply with two demands:

1. Of all candidate elements, d has the lowest average correlation with all elements in the set.
2. There is no element already in the set to which d is correlated more strongly than a certain threshold $\rho_t$.

The first demand ensures that the elements with the lowest correlation to the other elements are added to the set first. The second demand prevents two highly correlated distances from both being selected merely because their average correlation with all other elements in the set is very low. The average correlation of a set, R(S), is described in formula (8):

$$R(S) = \frac{1}{N(N-1)} \sum_{i=1}^{N} \sum_{j=i+1}^{N} |\rho(S_i, S_j)| \qquad (8)$$

with N the number of elements in the set S and $\rho(S_i, S_j)$ the correlation between elements $S_i$ and $S_j$. We use the absolute correlation, since we are only interested in the amount of correlation and not in whether two distances are positively or negatively correlated.
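To make the measures above concrete, the sketch below computes the two variance-based information ratios and the average absolute correlation of a set in plain numpy. It is an illustration only, not the analysis code used for the thesis; the measurements for one distance are assumed to be given as a list of per-subject arrays (one array of repeated measurements per subject).

    import numpy as np

    def info_sample_mean_deviation(classes):
        """I_inter / I_intra for one distance, using the sample mean deviation.
        classes: list of 1D arrays, one per subject, with repeated measurements."""
        class_means = np.array([c.mean() for c in classes])
        total_mean = np.concatenate(classes).mean()
        i_intra = np.mean([np.abs(c - c.mean()).mean() for c in classes])   # eq. (2)
        i_inter = np.abs(class_means - total_mean).mean()                   # eq. (3)
        return i_inter / i_intra

    def info_coefficient_of_variation(classes):
        """I_inter / I_intra for one distance, using the coefficient of variation."""
        class_means = np.array([c.mean() for c in classes])
        i_intra = np.mean([c.std(ddof=1) / c.mean() for c in classes])      # eq. (4)-(5)
        i_inter = class_means.std(ddof=1) / class_means.mean()              # eq. (6)-(7)
        return i_inter / i_intra

    def average_abs_correlation(samples):
        """Average absolute correlation R(S) of a set, eq. (8).
        samples: (n_observations, N) matrix, one column per distance in the set."""
        n = samples.shape[1]
        corr = np.abs(np.corrcoef(samples, rowvar=False))
        return np.triu(corr, k=1).sum() / (n * (n - 1))   # sum over pairs i < j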

4.2.2 Fisher Discriminant Analysis

As a reference for our proposed measure, we use a selection measure based on the Fisher discriminant analysis. The Fisher discriminant analysis finds principal components that best separate the different classes (subjects). Each principal component can be seen as a function of all variables in a set (the distances between landmarks). One can use stepwise discriminant analysis to find the variables that are significant for the composition of the principal components. To test the significance of the variables we use Wilks' lambda. Wilks' lambda, or Λ, measures the amount of variance that is not explained by the model: its value is 0.0 if the model explains all variance and 1.0 if the model does not explain any variance. Most computer programs calculate the significance based on an F-statistic or tolerance factor. The F-statistic is a one-tailed transformation of Wilks' lambda. It indicates for each variable the proportion of the intra-variance of the model that is explained by this variable and not already explained by the other variables in the model. The F-statistic is 1.0 if there is no correlation with the variables entered before, and 0.0 if the variable does not explain any intra-variance of the model that is not already explained by other variables in the model.

For each step in the stepwise approach, the significance is calculated for each variable that could be added to the model, and the variable with the largest F-statistic is added. The algorithm ends when no variable is left that increases the explained variance by more than the tolerance α. The tolerance is based on the F-statistic and indicates the minimal proportion of intra-variance explained by the variable and not by the other variables in the model.

The discriminant analysis makes some assumptions about the dataset and its variables. These assumptions can also be made for our experiment and dataset, so we can perform the discriminant analysis on the CAESAR data. The assumptions are:

1. The measurements on the subjects are normally distributed. In this case we assume that the distances between two landmarks over multiple subjects are normally distributed. This can be assumed in almost every case.

2. The variance of the variables is equal for all different subjects. In this case we assume that the intra-variance of the distances between two landmarks is equal for all subjects. This assumption does not hold exactly; for example, the landmarks of someone with little fat in the face can be located more accurately than those of someone with a lot of fat in the face. Since the intra-variance depends on more than the variability of the face (accuracy of the observer, scanning device, etc.; see section 3.3 for more detail), one can assume that the variation of the intra-variance between different subjects is small.

3. The covariance between the variables is equal for the different subjects. This means that for all subjects the distances in the face must have the same covariance. This assumption does not hold, since faces differ in shape, e.g. a long narrow face versus a small wide face. Also, as described in section 3.3, systematic errors occurred in measuring the face, which disturb the normal covariance of the measurements.

Although these three assumptions do not hold exactly for this dataset, the Fisher discriminant analysis has proven to be quite robust against violations of the normality assumption. This means that even if the measurements are not exactly normally distributed or do not have equal (co)variances, the discriminant analysis still gives reasonable results. However, one has to be careful when interpreting the results.

Figure 5: The 22 defined distances between landmarks that were used for the complete analysis (a) and the set of 13 distances between landmarks without the gonion included (b). The positions of the landmarks in the face are manually located. (This scan is provided by the Netherlands Forensic Institute and is not a part of the CAESAR-survey.)

4.3 Experiments

In this section we describe how we used the information measures to obtain a final set of landmarks that can be used for face comparison. We defined two candidate sets and analysed each candidate set with both information measures. The first candidate set is shown in figure 5(a) and consists of 22 distances in the face. Although we found in section 3.3 that the landmark dataset contains systematic errors, especially for distances from and to the left and right gonion, we still include all distances in our starting set. We can assume that every set of landmarks has some errors in its measurements. These errors will, of course, not be the same, and can be placed in the uncertainty margin of a measurement; this uncertainty margin can be seen as a part of the intra-variance. If there are errors in the dataset for some distances, their intra-variance will be higher than that of other distances and they will not be added to the landmark set.

The second candidate set is shown in figure 5(b) and consists of 13 distances in the face. It is a subset of the first candidate set, without the distances from and to the gonion. The gonion is located based on the underlying bone structure; this information is not available in a 3D model and the gonion can therefore not be found by automated landmark detection. Our main goal is to find a landmark set that can discriminate between two subjects and whose landmarks can be found by automated landmark detection. So, the second candidate set can give us a realistic picture of the discriminating value that can be reached, whereas the first candidate set gives us an upper bound for the discriminating value of a landmark set for these 8 landmarks.

4.3.1 Variance and Correlation based Information Measure

Variance Analysis

For this analysis we used the same dataset as described in section 3.3, consisting of 20 subjects with 18 scans for each subject. Since we were interested in the total intra-variance, we used all 18 samples taken for each subject. We performed our analysis twice on each candidate set: the first time the variance was analysed with the sample mean deviation and the second time with the coefficient of variation. Both methods are described in section 4.2.1.

For the candidate set with 22 distances, the increase in information is displayed in figure 6(a); the absolute increase for each element added to the landmark set is shown in figure 6(b). The results for the candidate set without the distances from and to the gonion are shown in figures 7(a) and (b). Appendix C.1 describes in what order the distances between landmarks were selected for the landmark set.

As discussed in section 4.1, we need to set a threshold for the selection of landmarks. First, we require that each element added to the set has a larger inter-variance than intra-variance; since the information measure is a ratio of the inter- and intra-variance, each element must have a ratio of at least 1.0. For a more exact definition of the threshold for both candidate sets, we looked at the absolute increase of information shown in figures 6(b) and 7(b). For the candidate set with all distances, we can see that for the coefficient of variation there is a large decrease of information when the 19th element is added to the set, and the elements added after this 19th element are the same for the coefficient of variation as for the sample mean deviation. We therefore decided to place the threshold at that point and exclude the last three elements from the set, which is shown in figure 6(b) as the dotted line. The resulting landmark set is visualized in figure 8(a). A similar approach is used for the candidate set without the gonion: here a large decrease in information occurs when the 10th element is added to the landmark set, so we place the threshold there, as shown in figure 7(b) by the dotted line. The resulting landmark set without distances from and to the gonion is visualized in figure 8(b). One can see in figure 8 that both resulting landmark sets exclude the same three landmark distances: the distance between the right infraorbitale and the left infraorbitale, the distance between the left infraorbitale and the left tragion, and the distance between the sellion and the left infraorbitale.

Correlation Analysis

For this analysis we used the landmark files of scanning pose a (figure 4(a)) of the CAESAR-survey for the Dutch population (1127 subjects) and the Italian population (973 subjects) to calculate the correlation between the landmark distances. We could not use the same landmark files as in section 4.3.1, since 20 subjects is not enough to calculate an accurate correlation. As candidate sets we used the landmark sets that resulted from the variance analysis. We could do so because the population used for the previous analysis is a subset of the population used here; otherwise we could not assume that the resulting landmark sets would be the same for this population without testing it.

Figure 6: The increase in information (a) and the absolute increase in information (b) when adding one element to the set, shown for the sample mean deviation and the coefficient of variation using the dataset with all distances.

Figure 7: The increase in information (a) and the absolute increase in information (b) when adding one element to the set, shown for the sample mean deviation and the coefficient of variation using the dataset without the distances from and to the gonion.

Figure 8: The landmark sets resulting from the variance analysis when starting with the complete candidate set (a) and with the candidate set without the distances from and to the gonion (b). (This scan is provided by the Netherlands Forensic Institute and is not a part of the CAESAR-survey.)

The analysis was performed according to the information measure described in section 4.2.1. The threshold $\rho_t$, above which a distance is excluded when its correlation with another distance in the set is too high, was set at a fixed value. The resulting landmark sets from the correlation analysis are shown in figure 9. If one looks at the distances that appear in both landmark sets, one can see in figure 9 that the resulting distances are the same for both sets. The order of selection is also almost the same for both landmark sets, as one can see in Appendix C.2.

Figure 9: The landmark sets resulting from the correlation analysis for the candidate sets with (a) and without (b) the distances from and to the gonion. (This scan is provided by the Netherlands Forensic Institute and is not a part of the CAESAR-survey.)
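A sketch of the correlation-based filtering is given below. It is illustrative only and makes assumptions not stated in the thesis: the candidate distances are given as columns of a numpy matrix in the order produced by the variance analysis, the first candidate is used to start the set, and the threshold rho_t is a placeholder value.

    import numpy as np

    def correlation_filter(samples, rho_t):
        """Greedy selection under the two demands of the correlation analysis.

        samples : (n_observations, N) matrix, one column per candidate distance
        rho_t   : maximum allowed absolute correlation with any selected element
        Returns the indices of the selected distances, in order of selection.
        """
        corr = np.abs(np.corrcoef(samples, rowvar=False))
        candidates = list(range(samples.shape[1]))
        selected = [candidates.pop(0)]                 # illustrative starting choice
        while candidates:
            # demand 1: lowest average correlation with the elements already selected
            avg = [corr[c, selected].mean() for c in candidates]
            best = candidates[int(np.argmin(avg))]
            candidates.remove(best)
            # demand 2: no correlation above rho_t with any selected element
            if corr[best, selected].max() <= rho_t:
                selected.append(best)
        return selected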

4.3.2 Fisher Discriminant Analysis

As described in section 4.2.2, the Fisher discriminant analysis is used as a reference method for the variance and correlation based information measure. For this analysis we used the same dataset as for the variance analysis in section 4.3.1, which consists of 20 subjects with 18 scans for each subject. As tolerance factor we use α = 0.025, which means that a variable can only be added to the model if it explains at least 2.5% of the intra-variance that is not already explained by other variables in the model. Just as with the variance and correlation based analysis, we use two sets of landmarks: one with the gonion and one without the gonion (see figure 5). The results of the analysis are displayed in figure 10(a) for the landmark set with the gonion and in figure 10(b) for the landmark set without the gonion. The order of selection is shown in Appendix C.3 for both landmark sets. In the resulting landmark set with the gonion, four distances were found not significant enough and were excluded from the model. In the resulting landmark set without distances from and to the gonion, all distances were included, even the distances that were found not significant enough in the landmark set with the gonion. This could mean that the landmark set without the gonion does not have enough elements to build an accurate model that explains all variance, so that all elements were added to the set, even the least significant ones.

Figure 10: The landmark sets resulting from the Fisher discriminant analysis when starting with the complete candidate set (a) and with the candidate set without the distances from and to the gonion (b). (This scan is provided by the Netherlands Forensic Institute and is not a part of the CAESAR-survey.)

4.4 Discrimination Value of a Landmark Set

We stated in section 4.1 that more elements in a set hold more information about a face, but only if the elements are well chosen. To test which landmark set holds more information to discriminate between different faces, we calculate the discriminating value of the resulting landmark sets. Helmer presented a study of the discriminating value of a set of measurements on human skulls [107]. He defined a probability p(s) that another skull can be found with the same measurements.

He also defined the individuality of a skull as I(s) = 1/p(s). This individuality is interpreted as the number of skulls that are needed to find a skull with the exact same measurements, like a likelihood ratio; in other words, the chance of finding a skull with the exact same measurements is one in I(s). The probability p(s) is calculated using formula (9):

$$p(s) = \Phi\!\left(\frac{u_i \cdot (s - \mu) + \Delta}{\sqrt{\lambda_i}}\right) - \Phi\!\left(\frac{u_i \cdot (s - \mu) - \Delta}{\sqrt{\lambda_i}}\right) \qquad (9)$$

The eigenvalues λ and the eigenvectors u are calculated from the covariance matrix of the different measurements. Helmer stated that the individuality of a skull decreases if it lies close to the mean measurements, because the chance of another skull with the same measurements is larger if the measurements lie around the sample mean. Therefore, he projected the difference between each skull s and the sample mean µ onto the subspace. To deal with the measurement errors or intra-variance of each skull he defined a tolerance factor Δ: all measurements that lie less than Δ from a skull s in the subspace are assumed to be the same as s; otherwise the skulls are interpreted as different skulls. The resulting probability was transformed to a number between 0 and 1 by applying the standard normal cumulative distribution function Φ.

We used formula (9) to calculate the discriminating value of our four resulting landmark sets (see figures 9 and 10). The measurement error x was set to 5 mm for each distance and we assumed that all distances are independent (upper bound). The tolerance factor for a set with n distances is calculated as $\Delta = \frac{x}{2}\sqrt{n}$.

Table 5: The results of the individuality calculation for the four resulting landmark sets (shown in figures 9 and 10) with a measurement error of 5 mm. The columns are the variance and correlation measure and the Fisher discriminant measure, each including the gonion (yes/no); the rows give the number of distances n, the resulting tolerance $\Delta = \frac{5}{2}\sqrt{n}$, and the individuality for modelled measurements (mean µ; µ − σ and µ + σ; 95% interval; 25% interval) based on the distribution of the measurements of 1920 subjects.

As dataset to calculate the distribution of each distance measurement, we used the 1920 landmark files of scanning pose a (figure 4(a)) of the CAESAR-survey for the Dutch population (1127 subjects) and the Italian population (793 subjects). From the calculated distributions we extracted measurements at particular places of these distributions (mean, 95% interval, 25% interval) to calculate the individuality; these measurements are referred to as modelled measurements.

The discriminating values for our landmark sets are shown in table 5. So, when a subject has a typical face (95% of the distribution) and one uses the landmark set resulting from the variance and correlation measure with the gonion, there is a probability of 1 in 19 that the same measurements are found on another subject. If one observes the results, one can see that subjects with a typical face (95% of the distribution) have a smaller probability than subjects with measurements around the mean (probability of 1 in 2). Another observation is that the individualities of the resulting landmark sets lie close to each other, especially for average faces; they only differ for typical faces.

Helmer measured 8 landmarks on 52 skulls with a tolerance factor of 2.0 mm. Using formula (9) he found that the probability of finding two skulls with the exact same measurements is one in 60,000 for a common (mean) skull and one in 1.5 billion (1.5 × 10^9) for a typical skull. These probabilities are much lower than the ones we found for our landmark sets.

We performed three equal experiments in which we varied the number of subjects and the tolerance factor to measure the influence of these two parameters on the individuality. For these experiments we used the landmark set without the gonion resulting from the variance and correlation based measure; this landmark set has 7 elements and therefore lies closest to the landmark set of 8 elements that Helmer used. The results of these experiments are displayed in table 6. Unlike the size of the landmark set (as shown in table 5), the size of the population does influence the calculated probabilities. This is because the distribution cannot be estimated accurately when the population is small, which introduces an uncertainty. The tolerance factor influences the calculated individuality most, especially when it is set lower than the observed measurement error; in that case the probability of finding the same measurements on another subject becomes very small. If one compares the table entries (N = 52, Δ = 2.0) with the results from Helmer, one can see that our results approach the results from Helmer.

4.5 Error Analysis

In this section we describe some of the errors or uncertainties that could have occurred during our analysis. First, for the variance calculation we used a dataset of 20 subjects. The landmark placing and picking for this dataset was performed by one observer, so we could not measure the observer error. Verbaan showed that the intra- and inter-variance of the landmark placing and picking is around 5 mm [105]. It is very likely that our observed error for the scanning device and scanning pose includes the intra- and inter-observer error, but we could not measure it. In the worst case, the real variance could therefore lie around 5 mm higher than the measured variance. Second, we used two different datasets for our two information measures. Since the dataset of 20 subjects is a subset of the large dataset of the CAESAR-survey, we can assume that it represents the same population, but the distributions and variance cannot be calculated accurately on such a small population. Third, in section 3.3 we showed that the CAESAR-dataset contains one or more systematic errors introduced by the scanning device and the scanning pose.

Table 6: Three equal experiments in which the number of subjects and the tolerance factor were varied to measure their influence on the individuality calculation. The individuality is calculated for a common face (mean) and a typical face (95% of the distribution). Δ is calculated as $\frac{x}{2}\sqrt{n}$, where x is the measurement error and n the number of elements in the landmark set; in this experiment n = 7.

    Experiment 1
    Δ (mm)   N=1920 mean   N=1920 95%   N=500 mean   N=500 95%   N=52 mean   N=52 95%
    2        47,676        7,312,890    40,819       4,328,880   34,485      8,487,...

    Experiment 2
    Δ (mm)   N=1920 mean   N=1920 95%   N=500 mean   N=500 95%   N=52 mean   N=52 95%
    2        47,676        7,312,890    47,643       3,509,760   40,541      17,257,...

    Experiment 3
    Δ (mm)   N=1920 mean   N=1920 95%   N=500 mean   N=500 95%   N=52 mean   N=52 95%
    2        47,676        7,312,890    45,301       4,622,560   34,643      9,892,...

In this section we handled these systematic errors as part of the intra-variance. Still, it would be best to use a dataset without such errors to obtain more trustworthy results. If a dataset without systematic errors could be used, it might result in different landmark sets than the ones we found in this section. Also, the intra-variance of the distances could be lower, resulting in a lower tolerance factor for the individuality calculation (section 4.4); a lower tolerance factor results in a higher discriminating value for the landmark sets.

4.6 Discussion and Conclusion

In this section an analysis was performed to find a set of distances between landmarks that contains as much facial information as possible, but uses no information that can be considered noise. Two different information measures were proposed to measure the amount of information in the face: the first is based on the variance and correlation of the distance measurements in a face, and the second on the Fisher discriminant analysis. For both information measures we started with two candidate sets, one with all distances and one without the distances to and from the gonion; the latter set is used because the gonion cannot be accurately found in a 3D model. This resulted in four landmark sets: two for the variance and correlation measure and two for the Fisher discriminant analysis.

The resulting landmark sets for the variance and correlation based measure are shown in figure 9. This figure shows that the same distances are found significant, and that the only difference between the two sets is the distances from and to the gonion. This is not the case for the landmark sets resulting from the Fisher discriminant analysis, which are visualised in figure 10. This is probably because the landmark set without the gonion does not have enough distances to explain all variance in a population. One has to be careful when interpreting these results, since the assumptions of the Fisher discriminant analysis are not entirely met; although the Fisher discriminant analysis is quite robust against violations of its assumptions, the results could contain some uncertainties.

When one looks at the order of selection for both information measures, one can see that the most informative distances are the distances from and to the gonion (they were added to the landmark set first). The least informative distances are the distances from and to the infraorbitales, which were added last, also in the set without the gonion. Besides the gonion, the distances from and to the sellion and the supramenton can be considered most informative.

To decide which information measure selects the best landmark set, we performed experiments to test the discriminating value of each landmark set. The discriminating value was represented as the probability, one in n, that the measurements of a subject are also found on another subject. We found that the probabilities are influenced by the size of the population and by the intra-variance of the measurements, and not by the size of the landmark set.

One would expect that the size of the landmark set increases the amount of information about the face and therefore gives a higher discriminating value. But when there are more dimensions (as in a larger landmark set), the accuracy of the set decreases. This effect is called the curse of dimensionality and is compensated by the tolerance factor. The results in table 5 show that a subject with a common face has a probability of 1 in 2 that the same measurements are found on another subject, and a subject with a typical face a probability of 1 in 19. When interpreting these results, one must realise that we assumed all measurements in the face to be independent of each other in order to find an upper bound; since the measurements on the face are all correlated with each other, the real probability is probably worse. From our experiments, as shown in table 6, we can conclude that it is important to do research into decreasing the intra-variance or measurement error, since the discriminating value increases considerably if the intra-variance is decreased only a little.

Based on these results we propose to use the landmark set without the gonion resulting from the variance and correlation measure. All landmark sets have the same discriminating value, but this landmark set contains fewer elements. Although we have performed a theoretical performance test for the landmark sets based on the individuality, only a practical test can confirm our conclusions. The final decision on which landmark set discriminates best between faces must depend on the performance of the landmark sets in 3D face comparison.
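As a closing illustration of the individuality calculation in section 4.4, the sketch below evaluates a probability of the form of formula (9) with numpy and scipy. It is an illustration only, under assumptions stated in the comments (it treats the formula as a product over the principal components and standardizes by the square root of the eigenvalues); it is not Helmer's original implementation.

    import numpy as np
    from scipy.stats import norm

    def individuality(measurements, subject, tolerance):
        """Estimate p(s) and I(s) = 1/p(s) for one subject.

        measurements : (n_subjects, n_distances) matrix of landmark distances
        subject      : (n_distances,) measurement vector s of the subject
        tolerance    : tolerance factor Delta of section 4.4, in mm
        """
        mu = measurements.mean(axis=0)
        cov = np.cov(measurements, rowvar=False)
        eigvals, eigvecs = np.linalg.eigh(cov)      # lambda_i and u_i
        proj = eigvecs.T @ (subject - mu)           # u_i . (s - mu)
        std = np.sqrt(eigvals)
        # probability that another subject falls within +/- tolerance along every
        # principal direction (components treated as independent: upper bound)
        per_axis = norm.cdf((proj + tolerance) / std) - norm.cdf((proj - tolerance) / std)
        p = per_axis.prod()
        return p, 1.0 / p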

5 Landmark Detection

Different surface descriptors for 3D human body scans have been proposed over the years. For example, Douros and Buxton [108] proposed the Gaussian curvature to define quadratic patches for extracting significant areas of the body. Another local shape descriptor for human body scans is the Paquet shape descriptor introduced by Robinette [109]. Some shape descriptors have been used for the automated detection of landmarks: Suikerbuik tested two descriptors, Gaussian curvatures and function fitting [99], and achieved a maximal error of 5 and 8 mm, respectively. In previous work [104] we automatically detected the pronasale, sellion and subnasale with a detection performance of 99.5%, 80.6% and 81.5%, respectively. Chua et al. [75] proposed point signatures as a local descriptor to find landmarks in the face, as described in an earlier section. Moccozet et al. [110] used the multi-scale bubbles introduced by Mortara et al. [111] to find landmarks as part of their research on the construction of animatable human body models. Chua et al. and Moccozet et al. do not report the accuracy of their landmark detection.

In this section we present a new automated landmark detection algorithm. For this algorithm we use the 3D whole body scans and the manually located landmarks of the CAESAR-survey, as described in the previous sections. We extracted the facial area using the bounding box presented by Suikerbuik [99]. These extracted facial scans are aligned in a coordinate system with the midpoint of the scan at the origin and the nose pointing in the direction of the x-axis. The algorithm uses the local surface around the facial landmarks combined with geometrical information; for the local surface we use the local surface descriptor of Mortara et al. The landmark detection algorithm is described in section 5.1, its performance is presented in section 5.2 and discussed in section 5.3.

5.1 Automated Landmark Detection

The presented detection algorithm is used to detect 11 landmarks: the sellion¹ (se), supramenton¹ (sp), left infraorbitale¹ (l. io), right infraorbitale¹ (r. io), left tragion¹ (l. tr), right tragion¹ (r. tr), left gonion¹ (l. go), right gonion¹ (r. go), pronasale² (prn), left alar curvature³ (l. ac) and right alar curvature³ (r. ac). These landmarks are shown in figure 11. The first eight landmarks are defined by the CAESAR-survey and the last three are defined by Farkas [112]. The following subsections describe, in turn, how the curvature is characterized, how this characterization is analyzed to find which curvature is typical for a certain landmark, and how the different landmarks are detected by combining this information with the geometrical information of the face.

¹ The definitions of these landmarks are given in an earlier section.
² The most anterior point of the nose (nose tip).
³ The most lateral point on the curved base line of the nose, indicating the facial insertion of the nasal wing base.

Figure 11: The positions of the automatically detected landmarks in the face; the landmarks on this scan are manually located. (This scan is provided by the Netherlands Forensic Institute and is not a part of the CAESAR-survey.)

Surface Characterization

Mortara et al. use spheres, the so-called bubbles, to describe the local surface of a model M at a vertex v. A sphere S(v, R) with radius R and its center at v defines an intersection γ with the surface: γ = M ∩ S(v, R). The length of this intersection is divided by the radius to obtain a radius-invariant surface description:

$$L_{\gamma,R} = \frac{Length(\gamma)}{R} = \frac{Length(M \cap S(v, R))}{R} \qquad (10)$$

For the characterization of the surface, Mortara et al. use thresholds on $L_{\gamma,R}$ to divide the surface into three categories:

- $0 \leq L_{\gamma,R} < 2\pi$: v is sharp (see figure 12(a)).
- $L_{\gamma,R} \approx 2\pi$: v is round (see figure 12(b)).
- $L_{\gamma,R} > 2\pi$: v is blend (see figure 12(c)).

Figure 12: Three different surfaces (sharp, round and blend) characterized by the intersection of a sphere and the model surface.
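To make the descriptor concrete, the sketch below classifies a vertex from the normalized intersection length of formula (10) and also evaluates the convexity test of formula (11) given just below. It is only an illustration: the intersection curve γ is assumed to be given as an ordered list of 3D points (computing it from an actual mesh is not shown), and the width of the band around 2π that counts as "round" is an assumption.

    import numpy as np

    def characterize(gamma, R, eps=0.1):
        """Classify the local surface at a vertex from its bubble intersection.

        gamma : (k, 3) array with the ordered points of the closed curve M ∩ S(v, R)
        R     : sphere radius
        eps   : assumed width of the band around 2*pi that counts as 'round'
        """
        segments = np.diff(np.vstack([gamma, gamma[:1]]), axis=0)   # close the loop
        L = np.linalg.norm(segments, axis=1).sum() / R              # eq. (10)
        if L < 2 * np.pi - eps:
            return L, "sharp"
        if L > 2 * np.pi + eps:
            return L, "blend"
        return L, "round"

    def is_convex(v, normal, gamma):
        """Convexity test of formula (11): v is convex if N . (b - v) < 0,
        with b the center of mass of the intersection curve gamma."""
        b = gamma.mean(axis=0)
        return float(np.dot(normal, b - v)) < 0.0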

Besides the surface characterization, Mortara et al. calculate the status (concave or convex) of each vertex v. A vertex is considered convex if v lies above γ. This means that the angle θ between the average normal vector N at v and the vector from v to b is larger than 90 degrees, where b is the center of mass of γ, or:

$$N \cdot (b - v) < 0 \qquad (11)$$

For each vertex of the model the intersection is calculated for multiple spheres with different radii $\{R_1, \ldots, R_n\}$, creating a sequence of n characterizations for each vertex v. The spheres with small radii characterize the details of the curvature at v, while the spheres with large radii characterize the global curvature at v.

For our surface characterization we did not use the discrete characterization (sharp, round, blend). Instead, we decided to use the length of γ, since it holds more information about the surface. For each vertex we thus have a sequence of n numerical values indicating the local curvature and a sequence of n discrete values indicating the status of the surface (convex or concave). Another difference in approach is that Mortara et al. state that if γ consists of two or more elements, the surface has a tubular shape. In our approach, the models can have holes or lie close to the border of the bounding box, which causes γ to be smaller; these holes can also cause γ to have multiple elements. In both cases the local surface characterization fails: in the first case it would indicate that a point is sharp because it has a small γ, while it is actually a round local surface that lies too close to a border or a hole; in the second case the surface is indicated as a tubular shape. Since there are no tubular shapes in the human face, we excluded the tubular shape characterization from our algorithm. We also have to take extra care with the fact that holes influence the curvature characterization. To avoid these kinds of errors, we manually checked every landmark of our training and test set and excluded landmarks that lie too close to a hole or to the border of the bounding box.

Landmark Detection Rules

Based on the surface characterization described in the previous section, we analyzed which curvature is typical for a certain landmark. We took a subset of the Dutch part of the CAESAR-survey consisting of 303 scans and characterized the local curvature of the 8 facial landmarks in these scans, excluding the landmarks that were incorrectly located or located close to a hole. For this analysis we used for each vertex spheres with 11 different radii: the smallest radius was equal to 1.5 times the smallest edge in the scan and the largest radius to 3.5 times the average edge in the scan. Compared with the radius settings of Mortara et al. we use relatively small radii, because we are mainly interested in the local curvature of a landmark. Table 7 shows an example of the characterization of the facial landmarks for a single facial scan. This data is used to analyze the local curvature of the different facial landmarks. For the analysis we made use of bump hunting. Bump hunting creates rules that can distinguish between the facial landmarks based on the local surface; with these rules, one can determine for each point in a new facial scan a probability of being a certain landmark. The details of the bump-hunting algorithm are explained in Appendix D.

Table 7: The result of the surface characterization based on multi-scale bubbles for the facial landmarks of facial scan nl-6038a. For each landmark (lnd.id) the table lists the curvature values for the 11 radii (r.1 up to r.11) and the corresponding convex (1) or concave (0) status. The landmarks (lnd.id) 1, 2, 3, 4, 6 and 8 are respectively the sellion, right infraorbitale, left infraorbitale, supramenton, right gonion and left gonion. The left and right tragion are missing for this scan, since they were located too close to the border of the bounding box.

Since bump hunting uses the differences between the local surfaces of different landmarks to create the rules, we had to assign the landmarks placed on the left and on the right of the face to the same class; for example, the left and right tragion were both marked as tragion. Otherwise, bump hunting would create rules to distinguish between a left and a right landmark, using the exceptional cases (asymmetry in the face or inaccurately located landmarks) instead of the common characteristics of the landmark. In other words, bump hunting would focus on rules that distinguish between the left and the right landmark instead of rules that distinguish between the landmark and the others. An example is shown in figure 13, where the probabilities for the infraorbitales together have a more accurate distribution than the probabilities for the left infraorbitale only; the latter marks most points in the face with a high probability of being a left infraorbitale and there is still no clear division between left and right. Bump hunting created rules to distinguish between 5 classes or landmarks: the sellion, the supramenton, the infraorbitale, the tragion and the gonion. The classification into left and right landmarks is done at a later stage of the algorithm, based on geometrical information of the face. The resulting detection rules with their accompanying probabilities can be found in Appendix E.

Landmark Detection Algorithm

To find landmarks in a new facial scan, we calculate for each point in the scan the probability that the point is a certain landmark. We combine these probabilities with geometrical information of the face to find the locations of the landmarks. In this section we describe for each landmark how it is detected.
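Before the individual landmarks are described, the following sketch illustrates how a box-shaped rule produced by bump hunting could be turned into such a per-vertex probability. The rule format (lower and upper bounds per radius, plus a probability) is an assumption made for this illustration; the actual rules and probabilities are listed in Appendix E.

    def rule_probability(curvature, convex, rules):
        """Return the probability that a vertex is a given landmark.

        curvature : list of 11 curvature values (r.1 ... r.11) for the vertex
        convex    : list of 11 booleans with the convex/concave status
        rules     : list of dicts, each with 'bounds' (index -> (lo, hi)),
                    'convex' (index -> required status) and 'probability'
        The first rule whose box contains the vertex determines the probability.
        """
        for rule in rules:
            in_box = all(lo <= curvature[i] <= hi
                         for i, (lo, hi) in rule["bounds"].items())
            status_ok = all(convex[i] == want
                            for i, want in rule.get("convex", {}).items())
            if in_box and status_ok:
                return rule["probability"]
        return 0.0   # the vertex falls outside every rule box

    # Hypothetical rule: "sellion-like" if the curvature at the two smallest radii
    # is low and the vertex is concave at the smallest radius.
    sellion_rules = [{"bounds": {0: (0.0, 5.5), 1: (0.0, 6.0)},
                      "convex": {0: False},
                      "probability": 0.8}]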

Figure 13: The distribution of the probabilities for the infraorbitale: (a) is based on the detection rules in which the left and right infraorbitale are placed in the same class and (b) is based on the detection rules for the left infraorbitale only.

pronasale — We start by detecting the pronasale, based on the property that it is the point that sticks out most from the face. The detection of the other landmarks is based on the location of the pronasale; therefore the pronasale must be found correctly. When the pronasale is found, two segmentation planes are defined to divide the face into four areas. The first segmentation plane is a horizontal plane that divides the face into the parts above and below the pronasale and is from now on called the nose-plane. The second segmentation plane divides the face into a left and a right part and is called the center-plane. This division is shown in figure 14.

sellion — For the sellion, we found detection rules with bump hunting. We use these rules to calculate for each point in the scan the probability of being a sellion. An example of the probability distribution for a scan is shown in figure 15. As one can see, there are a few points with a high probability of being a sellion (red); these points are found in the region where the sellion is located. The points with a low probability of being a sellion (pink) are distributed over the whole face. The sellion is located within 10 mm of the center-plane and above the nose-plane. Of the points that satisfy this geometrical information, the point with the highest probability is denoted as the sellion. When the sellion is found, a new horizontal segmentation plane, called the sellion-plane, is created. This plane is shown in figure 16.

supramenton — The supramenton is, just like the sellion, found based on the calculated probabilities, the center-plane and the nose-plane. The point denoted as the supramenton is the point that lies below the nose-plane, within 10 mm of the center-plane, and has the highest probability of all points that satisfy these geometrical conditions.

tragion — The tragion is found based on the fact that it lies below the sellion-plane and above the nose-plane. To find a left and a right tragion, we use the center-plane.

Figure 14: The segmentation planes based on the pronasale: the vertical plane (center-plane) divides the face into left and right and the horizontal plane (nose-plane) divides the face into an upper and a lower part.

We know that the tragion lies at the side of the face; therefore we introduce two other segmentation planes, the left-plane and the right-plane. These planes lie exactly halfway between the center-plane and the most lateral point of the scan. This is shown in figure 17. Together the segmentation planes define two regions: one region includes all points below the sellion-plane, above the nose-plane and to the left of the left-plane, and the other region includes all points below the sellion-plane, above the nose-plane and to the right of the right-plane. For both regions the point with the highest probability of being a tragion is selected as the tragion.

infraorbitale — Just like the tragion, the left and right infraorbitale are found in two regions enclosed by the sellion-plane, the nose-plane, the center-plane and the left- or right-plane. Of all points in these regions, the point with the highest probability of being an infraorbitale is chosen.

gonion — For the gonion we make use of the nose-plane and the left- and right-plane. From all points that are located below the nose-plane and to the left of the left-plane, the point with the highest probability is selected as the left gonion; the right gonion is selected from the points that lie to the right of the right-plane and below the nose-plane. As mentioned in section 4, the location of the gonion is based on the underlying bone structure and therefore cannot be determined from the surface alone. This conclusion is confirmed if one looks at figure 18, which shows the calculated probabilities for the gonion: almost the entire face has a high probability of being the gonion, so the algorithm is easily tricked into locating an incorrect point.

alar curvature — The alar curvature is not defined in the CAESAR-survey, so it has to be detected from geometrical information only. For this we use the intersection of the nose-plane with the scan; an example of such an intersection is shown in figure 19.

Figure 15: The probabilities of being a sellion for all points in a scan. Red points have a high probability and white points have a zero probability.

Figure 16: The horizontal segmentation plane based on the sellion (sellion-plane). The detected pronasale and sellion are both marked in blue.

Figure 17: The segmentation planes left-plane, center-plane and right-plane for the detection of the tragion, the infraorbitale and the gonion.

Figure 18: The probabilities of being a gonion. Magenta points have a high probability and white points have a zero probability.

For the detection of the alar curvature, we start at the pronasale and walk across the nose to the left and to the right until the value along the x-axis becomes relatively small compared to the previous point.

5.2 Performance and Results

To test the performance of the landmark detection, we used a test set of 50 previously unused whole-body scans from the CAESAR-survey. For each scan we automatically detected the 11 landmarks as described in the previous section. In section 3.3 we described some systematic errors in the manual placement of the landmarks in the CAESAR-survey, so we could not compare these manually placed landmarks with the detected landmarks to test the performance of the detection algorithm. Therefore, we manually examined each scan and labelled every detected landmark as correct or incorrect.

Figure 19: An intersection of the nose-plane with a scan, which is used to detect the alar curvature (ac). The pronasale (prn) is also shown.

The result of the experiment is a k × n matrix M with k = 11 the number of detected landmarks and n = 50 the number of scans in the test set. An entry M(i, j) gives the result for landmark $k_i$ on scan $n_j$: 1 (correct) or 0 (incorrect). Based on the test set, we can estimate the detection performance $\hat{p}_{k_i}$ for each landmark $k_i$ by averaging over all n scans:

$$\hat{p}_{k_i} = \frac{1}{n} \sum_{j=1}^{n} M(i, j) \qquad (12)$$

The accuracy of $\hat{p}_{k_i}$ depends on the size of the test set. If the test set is large enough, $\hat{p}_{k_i}$ will approach the real detection performance $p_{k_i}$. Since we have used a relatively small test set of 50 scans, the observed performance $\hat{p}_{k_i}$ is likely to differ from the real performance $p_{k_i}$.
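As a small illustration of equation (12), the snippet below computes the observed detection performance for each landmark from a result matrix of the same shape; the matrix contents here are random placeholders, not the experimental results.

    import numpy as np

    rng = np.random.default_rng(0)
    # Hypothetical result matrix: M[i, j] = 1 if landmark i was detected
    # correctly on scan j, 0 otherwise (11 landmarks, 50 scans).
    M = rng.integers(0, 2, size=(11, 50))

    # Equation (12): observed detection performance per landmark.
    p_hat = M.mean(axis=1)
    print(p_hat)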

Therefore we estimate the real performance by determining a lower and an upper bound for it from the 90% confidence interval:

The 90% confidence interval for the detection performance $p_{k_i}$

For a certain landmark $k_i$, the sum $\sum_{j=1}^{n} M(i, j)$ is binomially distributed with n samples and a probability $p_{k_i}$ of correct detection. This means that:

$$\frac{\hat{p}_{k_i} - p_{k_i}}{\sqrt{\frac{1}{n} S_i^2}} \sim t(n-1) \qquad (13)$$

where $S_i^2$ is the variance of all n observations for landmark $k_i$. With equation (13) we can calculate the 90% confidence interval for $p_{k_i}$:

$$\left| \hat{p}_{k_i} - p_{k_i} \right| \leq t_{(n-1),0.1} \sqrt{\hat{p}_{k_i}\,(1 - \hat{p}_{k_i})\,\frac{1}{n}} \qquad (14)$$

where $t_{(n-1),0.1}$ is the critical value of a one-sided t-test at the 90% confidence level for a sample of size n. This means that the lower and upper bound for $p_{k_i}$ are calculated by:

$$\hat{p}_{k_i} - t_{(n-1),0.1} \sqrt{\hat{p}_{k_i}\,(1 - \hat{p}_{k_i})\,\frac{1}{n}} \;\leq\; p_{k_i} \;\leq\; \hat{p}_{k_i} + t_{(n-1),0.1} \sqrt{\hat{p}_{k_i}\,(1 - \hat{p}_{k_i})\,\frac{1}{n}} \qquad (15)$$

One can see from formula (14) how the sample size affects the confidence interval. If the sample size increases, the 90% confidence interval becomes smaller, so the upper and lower bound lie closer to $\hat{p}_{k_i}$. This means that the estimated detection performance $\hat{p}_{k_i}$ approaches the underlying detection performance $p_{k_i}$. But since the square root is taken over the sample size, $\hat{p}_{k_i}$ converges slowly to $p_{k_i}$.
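The bounds of equation (15) can be computed directly from the entries of M. The sketch below does this with SciPy; the result matrix is again hypothetical, and the one-sided 90% critical value is taken from the t-distribution with n − 1 degrees of freedom, as in the text.

    import numpy as np
    from scipy import stats

    def performance_bounds(M, alpha=0.1):
        """Observed performance and 90% confidence bounds per landmark (eqs. 12-15)."""
        n = M.shape[1]
        p_hat = M.mean(axis=1)                       # equation (12)
        t_crit = stats.t.ppf(1.0 - alpha, df=n - 1)  # one-sided critical value
        half_width = t_crit * np.sqrt(p_hat * (1.0 - p_hat) / n)
        lower = np.clip(p_hat - half_width, 0.0, 1.0)
        upper = np.clip(p_hat + half_width, 0.0, 1.0)
        return p_hat, lower, upper

    # Hypothetical usage: M is the 11 x 50 result matrix from the test set.
    # p_hat, lower, upper = performance_bounds(M)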

In table 8 we display the results of the automated landmark detection for the 50 scans. For each landmark we display in how many cases the landmark was correctly detected, the observed detection performance $\hat{p}_{k_i}$, and the lower and upper bound within which the underlying detection performance can be found with a probability of 90%.

Table 8: The performance results for the 11 landmarks based on a set of 50 scans. The column $\hat{p}_{k_i}$ displays the estimated detection performance. As an indication of the real detection performance $p_{k_i}$, the column "lower bound" shows, with a confidence of 90%, the smallest value that the performance $p_{k_i}$ could be.

name | correctly detected | performance $\hat{p}_{k_i}$ | lower bound
pronasale
sellion
right infraorbitale
left infraorbitale
supramenton
right tragion
left tragion
right gonion
left gonion
right alar curvature
left alar curvature

5.3 Discussion and Conclusion

In this section we have proposed an automated landmark detection algorithm. The detection algorithm is based on the local curvature of the surface, which is calculated with multi-scale bubbles. From the results of the local curvature characterization, we have created detection rules to calculate the probability that a point in a new scan is a certain landmark. These probabilities are combined with geometrical information for the automated detection of 11 landmarks, as shown in figure 11. We have used a test set of 50 scans to measure the performance of the algorithm. Since this test set is small, it only gives an indication of the detection performance. For a more careful estimate, we calculated, with 90% confidence, a lower bound on the real detection performance.

The results of the automated landmark detection are shown in table 8. One can see that the pronasale is correctly detected in all cases. This is important, since the detection of the other landmarks depends on the location of the pronasale. The sellion detection is based on the local curvature and geometrical information, and the sellion is found correctly in 78% of the cases in the test set. The real detection performance is, with a confidence of 90%, at least 66%. This is in line with our previous work on pronasale and sellion detection, based on the Gaussian curvature and geometrical information, where the pronasale was correctly found in 99.5% and the sellion in 80.6% of 350 scans [104].

If one takes a look at the detection performance of the gonion, one can see that the gonion is correctly detected in only 3 (right) and 4 (left) of the 50 scans. This was expected, since the location of the gonion is based on the underlying bone structure and not on the local surface. Therefore, the bump hunting algorithm had difficulties creating accurate detection rules that distinguish the local surface of the gonion from the local surface of other landmarks. The infraorbitale has a higher detection performance than the gonion, even though it is also localized based on the underlying bone structure. The reason for this is that there is less fat under the eyes, so the skin follows the bone structure more closely, which results in a more pronounced local surface. However, the results show that the local surface of the infraorbitale is still not distinctive enough to be distinguished from the local surface of other landmarks.

The left and right tragion have a detection performance of at least 20% (with a confidence of 90%), which is low considering that the tragion is located on a distinctive surface. The reason the tragion is detected poorly is that most facial scans of the CAESAR-survey have holes in or around the ears. The curvature characterization fails in this case, as mentioned in section 5.1.1, so the detection of the tragion fails as well. The detection performance for the tragion will probably increase when scans without holes are used for the classification and detection of the tragion.

The alar curvature is found from geometrical characteristics only, which results in a detection performance similar to that of the supramenton, although the supramenton has less distinctive geometrical characteristics. Since the supramenton is found from geometrical characteristics in combination with the detection rules, we can conclude that the detection rules improve the detection performance. This increase in detection performance is highest for landmarks with a distinctive (not flat) local surface. This is confirmed by the distribution of the probabilities for the sellion compared with the probabilities for the infraorbitale, as shown in figure 20. Since the sellion has a more distinctive surface, there are only a few points with a high probability of being a sellion, and these points are found in the region where the sellion is located. The infraorbitale has a less distinctive local curvature and, as one can see, there are many points with a high probability, distributed over the whole face.

Figure 20: The distribution of the probabilities for the sellion (a) compared to the probabilities for the infraorbitale (b). White points have a zero probability, pink (sellion) and light-blue (infraorbitale) points have a low probability of being the landmark, and red (sellion) or blue (infraorbitale) points indicate a high probability of being the landmark.

The overall detection performance is not good enough for further usage, such as face comparison. However, the detection performance can be improved by using different landmarks. The landmarks we have used in this thesis were manually picked for the CAESAR-survey. Most of these landmarks (gonion, supramenton and infraorbitale) were hard to detect since their local curvatures occur at multiple places in the face, such as the forehead and the cheeks. If we had a landmark set with more distinctive local surfaces (like the sellion or alar curvature), the detection rules would have a higher distinguishing value.

For the creation of detection rules, we have used the bump hunting algorithm since it produces simple and intuitive classification rules including probabilities. The disadvantage of bump hunting is that it cannot handle outliers very well: bump hunting will try to create rules to include them in the classification. This can only be prevented by tuning the bump hunting parameters manually. The detection performance of the algorithm could be improved by using another classification method for the detection rules, for example logistic regression, which can handle outliers by making use of statistical information.
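As an illustration of that alternative, the sketch below trains a logistic regression classifier on the same kind of per-point curvature features (the 11 radii plus a convexity flag) and returns a landmark probability for each point of a new scan. The feature layout and the data are invented for the example; this is only one possible substitute for the bump-hunting rules, not the method used in this thesis.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    # Hypothetical training data: one row per sample point, 12 features
    # (11 radii from the multi-scale analysis plus a 0/1 convexity flag),
    # label 1 for points annotated as the landmark, 0 otherwise.
    rng = np.random.default_rng(0)
    X_train = rng.normal(size=(500, 12))
    y_train = rng.integers(0, 2, size=500)

    clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

    # For a new scan, the per-point landmark probabilities would replace the
    # probabilities produced by the bump-hunting rules.
    X_scan = rng.normal(size=(2000, 12))
    p_landmark = clf.predict_proba(X_scan)[:, 1]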

6 Conclusion

6.1 Conclusion

We started this thesis with an examination of the landmarks that are best suited for face comparison. We used two measures: a newly proposed measure based on the variance and correlation of the landmark distances, and a reference measure based on the Fisher discriminant analysis. We used two datasets for our analysis: the first consisted of the absolute distances between the eight facial landmarks of the CAESAR-survey (the sellion, the left and right infraorbitale, the supramenton, the left and right tragion and the left and right gonion); the second dataset is a subset of the first, excluding the distances from and to the gonion. We found that the most informative distances are the absolute distances from and to the gonion. The least informative distances are the distances from and to the infraorbitales. Next to the gonion distances, the distances from and to the sellion and the supramenton can be considered most informative.

We statistically determined the discriminating values, in the shape of likelihoods, for the datasets resulting from the analysis. Based on these likelihoods we found that all resulting landmark sets have an equal discriminating value and can all be used for facial comparison. The results show that for a subject with a common face there is a probability of 1 in 2 that the same measurements are found on another subject, and for a subject with a rare face there is at most a probability of 1 in 19 that the same measurements are found on another subject. Until a practical test has proven otherwise, we propose to use the landmark set without the gonion resulting from the variance- and correlation-based measure, since it contains the fewest elements of all resulting landmark sets.

We also presented a landmark detection algorithm for 11 facial landmarks. Eight of these landmarks are manually located landmarks from the CAESAR-survey; the other landmarks are the pronasale and the left and right alar curvature. For the eight landmarks we used bump hunting to analyze the local curvature and to create detection rules. The other three are located based on geometrical information only. Based on a test set of 50 subjects we estimated the detection performance and its lower bound with 90% confidence. The pronasale and the sellion were detected best, with detection performances of 100% and 78%, where the lower bound of the detection performance of the sellion is 66%. The landmarks located on the underlying bone, like the gonion and the infraorbitale, were detected worst, with lower bounds of 0% and 6% respectively.

In the section about the landmark analysis (section 4), we started with two landmark sets: one with all landmarks included and one without the gonion, since the location of the gonion is based on the underlying bone and therefore cannot be automatically detected. This assumption is confirmed by the landmark detection, where the algorithm fails to locate the gonion. So, when using automated landmark detection for facial comparison, the landmark set without the gonion must be used. The results of the landmark detection (section 5) also show that the left and right infraorbitale cannot be detected accurately by this algorithm, since the infraorbitale is also located on a less distinctive surface. This means that for facial comparison with automatically detected landmarks, the infraorbitale cannot be used either. In the landmark analysis (section 4), the absolute distances from and to

the infraorbitale were found to be least informative. In further research, several things could be done to handle the poor detection performance of the infraorbitale: one could improve the detection performance by using more geometrical information, exclude the infraorbitale from the landmark set, or use another landmark around the eyes with a more distinctive surface, like the eye corner.

The detection performance can be increased by decreasing the intra-variance of the individual landmarks. This can be done in the ground truth we used for the local curvature analysis (the manually located landmarks of the CAESAR-survey): the facial landmarks that are used have an intra-variance, or error, of around 5 mm, as discussed in section 3. This introduces variation in the local surface of a landmark, which causes uncertainty in the classification rules created by the bump hunting algorithm. If one can decrease the intra-variance of the manually located landmarks by using different scans or better located landmarks, bump hunting can create more accurate detection rules, which increases the detection performance.

Besides decreasing the intra-variance of the manually located landmarks, the detection performance of the automatically detected landmarks can also be increased by using more landmarks with a distinctive surface, especially in combination with prior knowledge based on local surface analysis. The presented detection algorithm performs best on landmarks with a distinctive surface, like the point between the lips, the alar curvature or the eye corner. If more of those landmarks can be included in the detection algorithm, the overall detection performance of the algorithm increases. Locating the landmarks more accurately, whether automated or not, will decrease their intra-variance, which causes a large increase in the discriminating value of the landmark set, as described earlier in this thesis.

6.2 Further Research

For the creation of detection rules based on the training set of local curvatures, we have used the bump hunting algorithm since it produces simple and intuitive classification rules including probabilities. The disadvantage of bump hunting is that it cannot handle outliers well. The detection performance of the algorithm could be improved by using another classification method for the detection rules.

As stated in the section on the landmark analysis, we have performed a theoretical test to determine the discriminating value of the landmark sets. Based on that test we have selected a landmark set that would be preferred for face comparison. It would be interesting to perform a practical test in which all resulting landmark sets are used for facial comparison, to compare their performance and calculate their real discriminating value.

As mentioned in the literature survey (section 2), most 3D face recognition algorithms treat 3D facial scans as rigid objects. This will also be the case when landmarks are used for 3D face comparison. To avoid this, one has to pick the location of the landmarks carefully, for example by excluding landmarks around the mouth. To investigate which landmarks are affected by different facial expressions and should be excluded, one could perform a study tracking landmarks over time and during different facial expressions.

For face comparison based on landmarks, only the automatically detected landmarks will be used, while the presented automated landmark detection is itself based on information from the local surface. This means that information about the local surface is currently not used in the face comparison itself, even though surface information is the great advantage of 3D scans over 2D images. Perhaps the local surface information around the landmarks can be used in the face comparison process in addition to the distances between the landmarks, since the local surface around a landmark holds more information than the landmark itself.

58 References [1] M. Yoshido, H. Matsuda, S. Kubota, and K. Imaizumi. Computer-assisted facial image identification system. Forensic Science Communications, 3(1), [2] A4 3D Facial Camera Hardware. The A4 vision homepage. URL: last checked: February [3] W.W. Bledsoe. The model method in facial recognition. Technical report, PRI-15, Panoramic Recsearch, Inc, Palo Alto, Califormia, [4] W.W. Bledsoe. Man-machine facial recognition: Report on a large-scale experiment. Technical report, PRI-22, Panoramic Recsearch, Inc, Palo Alto, Califormia, [5] A. Samal and P.A. Iyengar. Automatic recognition and analysis of human faces and facial expressions. Pattern Recognition, 25(1):65 77, [6] R. Chellappa, C.L. Wilson, and S. Sirohey. Human and machine recognition of faces: A survey. Proceedings of the IEEE, 83(5): , [7] W.Y. Zhao, R. Chellappa, P. J. Phillips, and A. Rosenfeld. Face recognition: A literature survey. ACM Computing Surveys, 35(4): , [8] P.J. Phillips, R. Grother, R.J. Micheals, D.M. Blackburn, E. Tabassi, and M. Bone. Face recognition vendor test URL: last checked: February [9] P. Grother, R.J. Micheals, and P.J. Phillips. Face recognition vendor test In Proceedings of the Audio- and Video-Based Biometric Person Authentication, pages , [10] M. Kirby and L. Sirovich. Application of the Karhunen-Loeve procedure for the characterization of human faces. IEEE Transactions on Pattern Analysis and Machine Intelligence, 12(1): , [11] M. Turk and A. Pentland. Eigenfaces for recognition. Journal of Cognitive Neuroscience, 3(1):72 86, [12] R.O. Duda, P.E. Hart, and D.G. Stork. Pattern Classification. John Wiley and Sons, second edition, [13] P. Navarrete and J. Ruiz del Solar. Comparative study between different eigenspace-based approaches for face recognition. In AFSS International Conference on Fuzzy Systems, pages , [14] I. Pima and M. Aladjem. Regularized discriminant analysis for face recognition. Pattern Recognition, 37(9): , [15] R.A. Fisher. The use of multiple measurements in taxonomic problems. Annals of Eugenics, 7: , [16] P.N. Belhumeur, J.P. Hespana, and D.J. Kriegman. Eigenfaces vs Fisherfaces: Recognition using class specific linear projection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(7):45 58,

59 [17] W.Y. Zhao, A. Krishnaswamy, and R. Chellappa. Discriminant analysis of principal components for face recognition. In Proceedings of the International Conference on Automatic Face and Gesture Recognition, pages , [18] A.M. Martinez and A.C. Kak. PCA versus LDA. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(2): , [19] S.C. Chen, J. Liu, and Z.H. Zhou. Making FLDA applicable to face recognition with one sample per person. Pattern Recognition, 37(7): , [20] S.C. Chen, J. Liu, and Z.H. Zhou. Enhanced (P C) 2 A for face recognition with one training image per person. Pattern Recognition Letters, 25(10): , [21] J. Wu and Z.H. Zhou. Face recognition with one training image per person. Pattern Recognition Letters, 23(14): , [22] R. Brunelli and T. Poggio. Face recognition: Features versus templates. IEEE Transactions on Pattern Analysis and Machine Intelligence, 15(10): , [23] T.F. Cootes and C.J. Taylor. Training models of shape from sets of examples. In Proceedings of the British Machine Vision Conference, pages 9 18, [24] A. Lanitis, C.J. Taylor, and T.F. Cootes. An automatic face identification system using flexible appearance models. Image and Vision Computing, 13(5): , [25] T.F. Cootes and C.J. Taylor. Statistical models of appearance for computer vision. Technical report, Image Science and Biomedical Engeneering, University of Manchester, [26] M.J. Jones and T. Poggio. Multidimensional morphable models. In Proceedings of the International Conference on Computer Vision, pages , [27] X. Xu, C. Zhang, and T.S. Huang. Active morphable model: An efficient method for face analysis. In Proceedings of the IEEE International Conference on Automatic Face and Gesture Recognition, pages , [28] V. Blanz and T. Vetter. Face recognition based on fitting a 3D morphable model. IEEE Transactions on Pattern Analysis and Machine Intelligence, 25(9), [29] Z. Xue, S.Z. Li, and E.K. Teoh. Bayesian shape model for facial feature extraction and recognition. Pattern Recognition, 36(12): , [30] B. Zhang, W. Gao, S. Shan, and W. Wang. Constraint shape model using edge constraint and Gabor wavelet based search. In Proceedings of the Audio- and Video-Based Biometric Person Authentication, pages 52 61,

60 [31] F. Jiao, S. Li, H.Y. Shum, and D. Schuurmans. Face alignment using statistical models and wavelet features. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, volume 1, pages , [32] T.K. Kim, H. Kim, W. Hwang, and J. Kittler. Independent component analysis in a local facial residue space for face recognition. Pattern Recognition, 37(9): , [33] L. Wang and T.K. Tan. Experimental results of face recognition using the 2nd-order eigenface method. ISO/MPEG, M6001, Geneva, May [34] B.W. Hwang, H. Byun, M.C. Roh, and S.W. Lee. Perform evaluation of face recognition algorithms on the asian face database KFDB. In Proceedings of the Audio- and Video-based-Based Biometric Person Authentication, pages , [35] B. Moghaddam, T. Jebera, and A.P. Pentland. Bayesian face recognition. Pattern Recognition, 33(11): , [36] P.J. Phillips, H. Moon, S.A. Rizvi, and P.J. Rauss. The FERET evaluation methodology for face-recognition algorithms. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(10): , [37] X. Wang and X. Tang. A unified framework for subspace face recognition. IEEE Transactions on pattern analysis and machine intelligence, 26(9): , [38] M.H. Yang. Face recognition using kernel methods. In Proceedings of the Advances in Neural Information Processing Systems, pages , [39] S.K. Zhou, R. Chellappa, and B. Moghaddam. Intra-personal kernel space for face recognition. In Proceedings of the IEEE International Automatic Face and Gesture Recognition, pages , [40] H.C. Kim, D. Kim, and S.Y. Bang. Face recognition using the mixture-ofeigenfaces method. Pattern Recognition Letters, 23(13): , [41] H.C. Kim, S.Y. Bang, and S.Y. Lee. Face recognition using the secondorder mixture-of-eigenfaces method. Pattern Recognition, 37(2): , [42] W.Y. Zhao and R. Chellappa. SFS based view synthesis for robust face recognition. In Proceedings of the IEEE International Automatic Face and Gesture Recognition, pages , [43] Y. Hu, D. Jiang, S. Yan, L. Zhang, and H. Zhang. Automatic 3D reconstruction for face recognition. In Proceedings of the IEEE International Conference on Automatic Face and Gesture Recognition, pages , [44] C.H. Lee, S.W. Park, W. Chang, and J.W. Park. Improving the performance of multi-class SVMs in face recognition with nearest neighbour rule. In Proceedings of the IEEE International Conference on Tools with Artificial Intelligence, pages ,

61 [45] B.K. Gunturk, A.U. Batur, Y. Altunbasak, M.H. Hayes, and T.M. Mersereau. Eigenface-domain super-resolution for face recognition. IEEE Transactions on Image Processing, 12(5): , [46] T.E. Boult, M.C. Chiang, and R.J Micheals. Super-resolution via image warping. In E.S. Chaudhuri, editor, Super-Resolution imaging, chapter 6, pages The Kluwer international series in engineering and computer science 632, [47] C. Liu, H.Y. Shum, and C.S. Zhang. A two-step approach to hallucinating faces: Global parametric model and local nonparametric model. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, volume 1, pages , [48] X. Wang and X. Tang. Face hallucinatio and recognition. In Proceedings of the Audio- and Video-Based Biometric Person Authentication, pages , [49] Z. Jin, J.Y. Yang, Z.S. Hu, and Z. Lou. Face recognition based on the uncorrelated discriminant transformation. Pattern Recognition, 34(7): , [50] L. Wiskott. Phantom faces for face analysis. In Proceedings of the Joint Symposium on Neural Computation, pages 46 52, [51] L. Wiskott, J.M. Fellous, N. kruger, and C. van der Malsburg. Face recognition by elastic bunch graph matching. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(7): , [52] C.J.C. Burges. A tutorial on support vector machines for pattern recognition. Data Mining and Knowledge Discovery, 2(2): , [53] N. Christianini and J. Shawe-Taylor. Support Vector Machines. Cambridge University Press, [54] K. Jonsson, J. Kittler, and J.Matas. Support vector machines for face authentication. Image and Vision Computing, 20(5-6): , [55] M. Sadeghi, J. Kittler, A. Kostin, and K. Messer. A comparative study of automatic face verification algorithms on the banca database. In Proceedings of the Audio- and Video-Based Biometric Person Authentication, pages 35 43, [56] N. P. Costen and M. Brown. Exploratory sparse models for face recognition. In Proceedings of the British Machine Vision Conference, [57] B. Heisele, P. Ho, J. Wu, and T. Poggio. Face recognition: componentbased versus global approaches. Computer Vision and Image Understanding, 91(1-2):6 21, July [58] T. Koshizen B. Heisele. Components for face recognition. In Proceedings of the Audio- and Video-Based Biometric Person Authentication, pages ,

62 [59] S. Marcel. A symmetric transformation for LDA-based face verification. In Proceedings of the IEEE International Conference on Automatic Face and Gesture Recognition, pages , [60] P. Yang, S. Shan, W. Gao, S.Z. Li, and D. Zhang. Face recognition using ada-boosted Gabor features. In Proceedings of the IEEE International Conference on Automatic Face and Gesture Recognition, pages , [61] Y. Wang, C.S. Chua, and Y.K. Ho. Facial feature detection and face recognition from 2D and 3D images. Pattern Recognition Letters, 23(10): , [62] A. Nefian and M.H. Hayes III. A hidden Markov model for face recognition. In Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, pages , [63] A. Nefian and M.H. Hayes III. Face recognition using an embedded HMM. In Proceedings of the IEEE Conference on Audio and Video-Based Biometric Person Authentication, pages 19 24, [64] B. takács and H. Wechsler. Face recognition using binary image metrics. In Proceedings of the IEEE International Conference in Automatic Face and Gesture Recognition, pages , [65] L. Chen, L. Zhang, H. Zhang, and M. Abdel-Mottaleb. 3D shape constraint for facial feature localization using probalilistic-like output. In Proceedings of the IEEE International Conference on Automatic Face and Gesture Recognition, pages , [66] J. Wilder, P.J. Phillips, J. Cunhong, and S. Wiener. Comparison of visible and infra-red imagery for face recognition. In Proceedings of the IEEE International Conference on Automatic Face and Gesture Recognition, pages , [67] A. gyaourova, G. Bebis, and I. Pavlidis. Fusion of infrared and visible images for face recognition. In Proceedings of the European Conference on Computer Vision, pages Vol IV: , [68] D.A. Socolinsky and A. Selinger. A comparative analysis of face recognition performance with visible and thermal infrared imagery. In Proceedings of the International Conference on Pattern Recognition, pages IV: , [69] D.A. Socolinsky, L.B. Wolff, J.D. Neuheisel, and C.K. Eveland. Illumination invariant face recognition using thermal infrared imagery. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages I: , [70] K.W. Bowyer, K. Chang, and P.J. Flynn. A survey of approaches to three-dimensional face recognition. In International Conference of Pattern Recognition, volume I, pages ,

63 [71] G.G. Gordon. Face recognition based on depth maps and surface curvature. In Proceedings of the SPIE, Geometric Methods in Computer Vision, volume 1570, pages , [72] A.B. Moreno, A. Sánchez, J.F. Vélez, and F.J. Díaz. Face recognition using 3D surface-extracted descriptors. In Proceedings of the Irish Machine Vision and Image Processing Conference, [73] C. Xu, Y. Wang, T. Tan, and L. Quan. Automatic 3D face recognition combining global geometric features with local shape variation information. In Proceedings of the IEEE International Conference on Automatic Face and Gesture Recognition, pages , [74] C.S. Chua and R. Jarvis. Point signatures: A new representation for 3D object recognition. International Journal on Cumputer Vision, 25(1):63 85, [75] C.S. Chua, F. Han, and Y.K. Ho. 3D human face recognition using point signature. In Proceedings of the IEEE International Conference on Automatic Face and Gesture Recognition, pages , [76] M.O. İrfanoglu, B. Gökberk, and L. Akarun. 3d shape-based face recognition using automatically registered facial surfaces. In Proceedings of the International Conference on Pattern Recognition, pages , [77] H.S. Wong, K.K.T. Chueng, and H.H.S. Ip. 3D head model classification by evolutionary optimization of the extended Gaussian image representation. Pattern Recognition, 37: , [78] T. Papatheodorou and D. Rueckert. Evaluation of automatic 4D face recognition using surface and texture registration. In Proceedings of the IEEE International Conference in Automatic Face and Gesture Recognition, pages , [79] C. Beumier and M. Acheroy. Automatic 3D face authentication. Image and Vision Computing, 18(4): , [80] C. Beumier and M. Acheroy. Face verification from 3d and grey level clues. Pattern Recognition Letters, 22(12): , [81] Y. Wu, G. Pan, and Z. Wu. Face authentication based on multiple profiles extracted from range data. In Proceedings of the Audio- and Video-Based Biometric Person Authentication, pages , [82] V. Blanz, S. Romdhani, and T. Vetter. Face identification across different poses and illuminations with a 3D morphable model. In Proceedings of the IEEE International Automatic Face and Gesture Recognition, [83] S. Romdhani and T. Vetter. Efficient, robust and accurate fitting of a 3D morphable model. In Proceedings of the European Conference on Computer Vision, [84] J. Huang, B. Heisele, and V. Blanz. Component-based face recognition with 3d morphable models. In Proceedings of the Audio- and Video-Based Biometric Person Authentication, pages 27 34,

64 [85] A.J. Naftal, Z. Mao, and M.J. Trenouth. Stereo-assisted landmark detection for the analysis of 3D facial shape changes. Technical report, TRS , Deprtment of Computation UMIST, Manchester, [86] A. Ansari and M. Abdel-Mottaleb. 3D face modeling using two views and a generic face model with application to 3D face recognition. In Proceedings of the IEEE Conference on Advanced Video and Signal Based Surveillance, pages 37 44, [87] J. Ahlberg. CANDIDE-3 - an updated parameterized face. Technical report, LiTH-ISY-R-2326, Dept. of Electrical Engineering, Linkping University, Sweden, [88] D. Terzopoulos and K. Waters. Analysis and synthesis of facial image sequences using physical and anatomical models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 15(6): , [89] X. Lu, R. Hsu, A. Jain, B. Kamgar-Parsi, and B. Kamgar-Parsi. Face recognition with 3D model-based synthesis. In Proceedings of the International Conference on Biometric Authentication, pages , [90] N. Mavridis, F. Tsalakanidou, D. Pantazis, S. Malasiotis, and M. Strintzis. The HISCORE face recognition application: Affordable desktop face recognition based on a novel 3D camera. In Proceedings of the International Conference on Augmented Virtual Environments and 3D Images, URL: [91] K.I. Chang, K.W. Bowyer, and P.J. Flynn. Multi-modal 2D and 3D biometrics for face recognition. In Proceedings of the IEEE International Workshop on Analysis and Modeling Journal on Computer Vision, [92] C. Xu, Y. Wang, T. Tan, and L. Quan. A new attempt to face recognition using eigenfaces. In Proceedings of the Asian Conference on Computer Vision, volume 2, pages , [93] A.M. Bronstein, M.M. Bronstein, and R. Kimmel. Expression-invariant 3D face recognition. In Audio- and Video-Based Biometric Person Authentification, pages 62 70, [94] F. Tsalakanidou, D. Tzovaras, and M. G. Strintzis. Use of depth and colour eigenfaces for face recognition. Pattern Recognition Letters, 24(9-10): , [95] F. Tsalakanidou, S. Malassiotis, and M. G. Strintzis. Integration of 2D and 3D images for enhanced face authentication. In Proceedings of the IEEE International Conference in Automatic Face and Gesture Recognition, pages , [96] R. Beveridge and B. Draper. Evaluation of face recognition algorithms. URL: last checked: February [97] Z. Pan, G. Healey, M. Prasad, and B. Tromberg. Face recognition in hyperspectral images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 25(12): ,

65 [98] CAESAR-survey. The civilian american and european surface anthropometry resource. URL: last checked: February [99] C.A.M. Suikerbuik, J.W.H. Tangelder, H.A.M. Daanen, and A.J.K. Oudenhuijzen. Automatic feature detection in 3D human body scans. In Proceedings of the Digital Human Modelling Conference, [100] K.M. Robinette and H.A.M. Daanen. Lessons learned from CAESAR: A 3- D anthropometric survey. In Proceedings of the International Ergonomics Association Conference, [101] Whole Body Color 3D Scanner. The cyberware homepage. URL: last checked: February [102] The Vitus 3D Body Scanner. The vitronic homepage. URL: last checked: February [103] C.A.M. Suikerbuik. Automatic feature detection in 3D human body scans. Master Thesis, INF/SCR/ Informatics and Computing Sciences Institute, University of Utrecht, July [104] A.E.H. Scheenstra. Automatic landmark detection and analysis on 3D facial models. Experimentation Project at the Informatics and Computing Sciences Institute, University of Utrecht, April [105] J.W. Verbaan. Reproducibility in both traditional anthropometric measurements and a 3-dimensional Vitronic body scanner. Second Probation of the fourth year of Bewegingstechnologie, Haagse Hogeschool in The Hague, February [106] J.H. Zar. Biostatistical Analysis. Prentice-Hall, Inc, fourth edition, [107] R.P. Helmer, J.N. Schimmler, and J. Riegler. On the conclusiveness of skull information via the video superimposition technique. Canadian Society of Forensic Science Journal, 22(2): , [108] I. Douros and B.F. Buxton. Three-dimensional surface curvature estimation using quadric surface patches. In Proceedings of the Scanning 2002 Conference, [109] K.M. Robinette. An alternative 3D descriptor for database mining. In Proceedings of the Digital Human Modelling Conference, [110] L. Moccozet, F. Dellas, N. Magnenat-Thalmann, S. Biasotti, M. Mortara, B. Falcidieno, P. Min, and R. Veltkamp. Animatable human body model reconstruction from 3D scan data using templates. In Proceedings of the CapTech Workshop on Modelling and Motion Capture Techniques for Virtual Environments, [111] M. Mortara, G. Patanè, M. Spagnuolo, B. Falcidieno, and J. Rossignac. Blowing bubbles for the multi-scale analysis and decomposition of trianglemeshes. Algorithmica, 38(1): ,

66 [112] L.G. Farkas. Anthropometry of the Head and Face. Raven Press, second edition, [113] J.H. Friedman and N.I. Fisher. Bump hunting in high-dimensional data. Statistics and Computing, 9(2): , [114] Ad Feelders. Rule induction by bump hunting. URL: last checked: February

A Landmarks of the CAESAR-survey

A summary of all landmarks and their descriptions as used in the CAESAR survey [98]. The landmarks are automatically extracted from the 3D scan. Most of the descriptions of the landmarks are copied literally from the CAESAR survey to avoid errors in the definitions. The abbreviations are only used for the annotation in figure 21 to show the positions of the landmarks on the body; they are not the abbreviations used in the CAESAR survey.

head (9 landmarks)

Sellion (se) Point of greatest indentation of the nasal root depression.

Right Infraorbitale (r io) Lowest point on the inferior margin of the right orbit, marked directly inferior to the pupil.

Left Infraorbitale (l io) Lowest point on the inferior margin of the left orbit, marked directly inferior to the pupil.

Supramenton (sp) Point of greatest indentation of the mandibular symphysus, marked in the midsagittal plane.

Right Tragion (r t) Notch just above the right tragus (the small cartilaginous flap in front of the ear hole).

Left Tragion (l t) Notch just above the left tragus (the small cartilaginous flap in front of the ear hole).

Right Gonion (r go) Inferior posterior right tip of the gonial angle (the posterior point of the angle of the mandible, or jawbone). This point is difficult to find when covered with a lot of tissue.

Left Gonion (l go) Inferior posterior left tip of the gonial angle (the posterior point of the angle of the mandible, or jawbone). This point is difficult to find when covered with a lot of tissue.

Nuchale (n) Lowest point of the occiput that can be palpated among the nuchal muscles. This point is often obscured by hair. The marker is placed in the midsagittal plane.

torso (19 landmarks)

Right Clavicale (r cl) Most prominent point of the superior aspect of the medial end of the right clavicle at the sternoclavicular junction.

Left Clavicale (l cl) Most prominent point of the superior aspect of the medial end of the left clavicle at the sternoclavicular junction.

Suprasternale (spn) Highest palpable point on the sternum (breastbone).

Figure 21: The landmarks used by the CAESAR survey. The image in its original form is adapted from the work of Suikerbuik [99].

69 Substernale (sbn) Lowest palpable point on the sternum (breastbone) Right Thelion (r th) Also called right Bustpoint. Most anterior protrusion of the right bra cup on woman. Center of the right nipple on men. Left Thelion (l th) Also called left Bustpoint. Most anterior protrusion of the left bra cup on woman. Center of the left nipple on men. Right Tenth Rib (r tr) Lowest palpable point on the inferior border of the right 10th Rib at the bottom of the rib cage Left Tenth Rib (l tr) Lowest palpable point on the inferior border of the left 10th Rib at the bottom of the rib cage Tenth Rib Midspine (m tr) Level of the landmark of the right 10th rib marked on the midspine. Right Anterior Superior Iliac Spine (r asis) Prominent, anterior point on the anterior rim of the ilia (hip joint) on the right side. Left Anterior Superior Iliac Spine (l asis) Prominent, anterior point on the anterior rim of the ilia (hip joint) on the left side. Right Posterior Superior Iliac Spine (r psis) Prominent point on the posterior superior spine of the ilium (hip joint) on the right side. A dimple often overlies this point. Left Posterior Superior Iliac Spine (l psis) Prominent point on the posterior superior spine of the ilium (hip joint) on the left side. A dimple often overlies this point. Right Iliocristale (r ic) Highest palpable point of the superior rim of the ilium (hip joint) in the mid-lateral line on the right side. Left Iliocristale (l ic) Highest palpable point of the superior rim of the ilium (hip joint) in the mid-lateral line on the left side. Right Trochanterion (r tro) Top of the bony lateral protrusion of the proximal end of the femur (around the hip joint) of the right leg. Left Trochanterion (l tro) Top of the bony lateral protrusion of the proximal end of the femur (around the hip joint) of the left leg. Cervicale (c) Most prominent point of the spinous process of the seventh cervical vertebra (neck). Waist Preferred Posterior (wpp) Level of the waist as marked on the subject right side in the midposterior line. 69

70 arms (24 landmarks) Right Acromion (r acr) Most lateral point of the lateral edge of the acromial process of the scapula (right shoulder) Left Acromion (l acr) Most lateral point of the lateral edge of the acromial process of the scapula (left shoulder) Right Anterior Axilla Point (r aax) Lowest point on the anterior axillary fold (right armpit). For scans is the mark placed on the right arm on the level of the lowest point on the axillary Left Anterior Axilla Point (l aax) Lowest point on the anterior axillary fold (left armpit). For scans is the mark placed on the left arm on the level of the lowest point on the axillary Right Posterior Axilla Point (r pax) Lowest point on the posterior axillary fold (right armpit). For scans is the mark placed on the right arm on the level of the lowest point on the axillary Left Posterior Axilla Point (l pax) Lowest point on the posterior axillary fold (left armpit). For scans is the mark placed on the left arm on the level of the lowest point on the axillary Right Radial Styloid (r rst) Distal tip of the radius of the right wrist Left Radial Styloid (l rst) Distal tip of the radius of the left wrist Right Olecranon (r ol) Prominent point on the olecranon process of the unlna, marked with the right elbow bent 90 degrees Left Olecranon (l ol) Prominent point on the olecranon process of the unlna, marked with the left elbow bent 90 degrees Right Lateral Humeral Epicondyle (r lhe) Lateral point on the lateral epicondyle of the humerus (around the elbow) of the right arm, when the palm is facing the side of the body. Left Lateral Humeral Epicondyle (l lhe) Lateral point on the lateral epicondyle of the humerus (around the elbow) of the left arm, when the palm is facing the side of the body. Right Medial Humeral Epicondyle (r mhe) Lateral point on the medial epicondyle of the humerus (around the elbow) of the right arm, when the palm is facing the side of the body. Left Medial Humeral Epicondyle (l mhe) Lateral point on the medial epicondyle of the humerus (around the elbow) of the left arm, when the palm is facing the side of the body. Right Radiale (r rad) Proximal point on the head of the radius, near the midpoint of the elbow on the lateral aspect of the right arm 70

71 Left Radiale (l rad) Proximal point on the head of the radius, near the midpoint of the elbow on the lateral aspect of the left arm Right Dactylion (r dact) Tip of the middle finger of the right hand. In scans is the marker placed on the fingernail with the center of the dot corresponding to the tip of the finger. Left Dactylion (l dact) Tip of the middle finger of the left hand. In scans is the marker placed on the fingernail with the center of the dot corresponding to the tip of the finger. Right Ulnar Styloid (r us) Distal point of the ulna (right wrist) Left Ulnar Styloid (l us) Distal point of the ulna (left wrist) Right Metacarpal-Phalangeal V (r mcpv) Prominent point on the lateral surface of the fifth metacarpal-phalangeal joint on the right hand Left Metacarpal-Phalangeal V (l mcpv) Prominent point on the lateral surface of the fifth metacarpal-phalangeal joint on the left hand Right Metacarpal-Phalangeal II (r mcpii) Prominent point on the lateral surface of the second metacarpal-phalangeal joint on the right hand Left Metacarpal-Phalangeal II (l mcpii) Prominent point on the lateral surface of the second metacarpal-phalangeal joint on the left hand legs (20 landmarks) Right Knee Crease (r kc) Midpoint of the crease that runs medial to lateral on the posterior side of the right knee. Left Knee Crease (l kc) Midpoint of the crease that runs medial to lateral on the posterior side of the left knee. Right Lateral Femoral Epicondyle (r lfe) Lateral point on the lateral epicondyle of the femur (around the knee) of the right leg. Left Lateral Femoral Epicondyle (l lfe) Lateral point on the lateral epicondyle of the femur (around the knee) of the left leg. Right Medial Femoral Epicondyle (r mfe) Lateral point on the medial epicondyle of the femur (around the knee) of the right leg. Left Medial Femoral Epicondyle (l mfe) Lateral point on the medial epicondyle of the femur (around the knee) of the left leg. Right Metatarsal-Phalangeal V (r mtpv) Maximum protrusion of the outside of the right foot at the head of the Metatarsus V. 71

72 Left Metatarsal-Phalangeal V (l mtpv) Maximum protrusion of the outside of the left foot at the head of the Metatarsus V. Right Metatarsal-Phalangeal I (r mtpi) Maximum protrusion of the inside of the right foot at the head of the Metatarsus I. Lef Metatarsal-Phalangeal I (l mtpi) Maximum protrusion of the inside of the left foot at the head of the Metatarsus I. Right Lateral Malleolus (r lma) Lateral point on the distal fibular protrusion of the right ankle Left Lateral Malleolus (l lma) Lateral point on the distal fibular protrusion of the left ankle Right Medial Malleolus (r mma) Medial point on the distal tibial protrusion of the right ankle Left Medial Malleolus (l mma) Medial point on the distal tibial protrusion of the left ankle Right Sphyrion (r sph) Distal point on the medial side of the tibia of the right ankle. Left Sphyrion (l sph) Distal point on the medial side of the tibia of the left ankle. Right Posterior Calcaneus (r pca) Most prominent posterior point of the right heel, this can be tissue instead of the calcaneus bone. Left Posterior Calcaneus (l pca) Most prominent posterior point of the left heel, this can be tissue instead of the calcaneus bone. Right Digit II (r diii) Tip of the second toe of the right foot For scans the marker is placed on the tip of the toe, not on the toenail. Left Digit II (l diii) Tip of the second toe of the left foot For scans the marker is placed on the tip of the toe, not on the toenail. 72

B Multiple Analysis of Variance

B.1 MANOVA

A t-test is used to test whether it is likely that two factors 1 and 2 come from the same class, or have the same influence on the measurements in a dataset. This is measured by stating the null hypothesis $H_0$ that their means are equal: $\mu_1 = \mu_2$. If $H_0$ is rejected and the alternative hypothesis $H_a: \mu_1 \neq \mu_2$ is accepted, the stochastic variables are assumed to come from different classes. One can reject $H_0$ if its test statistic passes the critical value in the tails of the probability distribution of t. The critical value is calculated from a chosen significance level $\alpha$, which is usually set to 5%.

Sometimes one wants to test whether it is likely that multiple variables or factors $X_1, \ldots, X_k$ belong to the same class; in other words, whether the multiple factors have the same influence on the measurements of the data. The null hypothesis is now $H_0: \mu_1 = \mu_2 = \ldots = \mu_k$, and $H_a$ is that at least one of the means belongs to a different class, i.e. at least one of the factors influences the measurements significantly differently from the other factors. If a normal t-test were used to test this $H_0$, it would have to be split up into the pairwise hypotheses $\mu_1 = \mu_2, \ldots, \mu_1 = \mu_k, \mu_2 = \mu_3, \ldots, \mu_{k-1} = \mu_k$. Every individual test has a probability $\alpha$ of incorrectly rejecting its null hypothesis, so if the t-test is repeated n times, the probability of at least one incorrect rejection becomes $1 - (1-\alpha)^n$, which increases as n increases. For a test with k factors, n will be $\binom{k}{2}$, which results in a large probability of an incorrect rejection.

To avoid this large chance of an incorrect rejection, an analysis of variance (ANOVA) test can be performed. One main assumption of the ANOVA test is that $\sigma_1^2 = \sigma_2^2 = \ldots = \sigma_k^2$. Where ANOVA is used to test k different factors with one variable, the multivariate ANOVA (MANOVA) is used to test k different factors with j variables. The reason to use the MANOVA test is that MANOVA can include the correlation between the different variables in the test, which the ANOVA test cannot. The null hypothesis of the MANOVA test is: $\mu_{11} = \mu_{12} = \ldots = \mu_{1k}$, $\mu_{21} = \mu_{22} = \ldots = \mu_{2k}$, ..., $\mu_{j1} = \mu_{j2} = \ldots = \mu_{jk}$, and the alternative hypothesis is that at least one of the means is different from the rest and belongs to a different class, i.e. at least one of the factors or variables influences the measurements significantly differently from the other factors.

For the comparison of means, different MANOVA statistics can be used, for example Wilks' lambda, Pillai's trace and the Hotelling-Lawley trace. Each of these statistics is a function of the eigenvalues of the data matrices and uses different characteristics of the differences between the means, so none of them can be called best. Wilks' lambda is the most commonly used statistic and measures the amount of variance that is not explained by the experimental factor(s). If the factor(s) explain all variance its value is 0.0, and if the factor(s) do not explain any variance its value is 1.0.
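As an illustration of how such a test could be run in practice, the sketch below applies a MANOVA with Wilks' lambda to a small table of landmark distances grouped by a factor; the column names, group labels and data are invented for the example, and statsmodels is used purely as one possible tool, not as the software used in this thesis.

    import numpy as np
    import pandas as pd
    from statsmodels.multivariate.manova import MANOVA

    # Hypothetical data: two distance measurements per subject and a grouping factor.
    rng = np.random.default_rng(0)
    df = pd.DataFrame({
        "group": np.repeat(["A", "B"], 30),
        "dist_se_sp": rng.normal(110.0, 5.0, 60),    # e.g. sellion-supramenton distance (mm)
        "dist_ltr_rtr": rng.normal(140.0, 6.0, 60),  # e.g. left-right tragion distance (mm)
    })

    # MANOVA tests whether the group factor influences the distances jointly;
    # the output includes Wilks' lambda among other statistics.
    fit = MANOVA.from_formula("dist_se_sp + dist_ltr_rtr ~ group", data=df)
    print(fit.mv_test())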

B.2 nested MANOVA

Nested MANOVA is used when there are several levels in the dataset that can be used for testing. For example, suppose there are two main factors A and B and for each factor two variables x and y are measured. This leaves four combinations of classes: Ax, Ay, Bx and By. To apply the nested MANOVA to this set, a two-level test can be performed; the different levels are visualised in figure 22.

First level. The null hypothesis is used to test whether all means belong to the same class, i.e. whether there is a difference between the factors:
$H_0: \mu_A = \mu_B$
$H_a: \mu_A \neq \mu_B$: there is a significant difference in effect between A and B.

Second level. A split is made between the different factors. This level tests whether there is a significant difference in the measurements when the factors A and B are separated over the variables:
$H_1: \mu_{Ax} = \mu_{Ay}$ and $\mu_{Bx} = \mu_{By}$
$H_a$: there is, for at least one factor (A and/or B), a significant difference between the variables x and y.

Figure 22: The two levels of nested MANOVA as given in the example in section B.2.

C Selection Order of the Landmark Set

In this appendix we use the following abbreviations for the facial landmarks; the definitions of the landmarks can be found in Appendix A:

se: sellion
sp: supramenton
l io: left infraorbitale
r io: right infraorbitale
l tr: left tragion
r tr: right tragion
l go: left gonion
r go: right gonion

C.1 Variance Analysis

In this section we give an overview of the order in which the landmark distances were added to the final set of landmarks. In tables 9, 10, 11 and 12 the order of selection of the distances is given for the variance analysis. In these tables, the bold distances were excluded from the landmark set. Tables 9 and 10 display the results for the landmark set with the gonion and tables 11 and 12 display the results for the landmark set without the gonion.

76 order distance information difference 1 r go - l go se - r go se - l go r tr - r go se - sp l tr - r tr r io - r go sp - l go sp - l tr sp - r tr sp - r go l tr - l go se - r tr l io - l go se - l tr r io - sp r io - r tr l io - sp se - r io r io - l io l io - l tr se - l io Table 9: The order of the selection for the variance analysis using the sample mean deviation for the landmark set with the Gonion 76

77 order distance information difference 1 r go - l go l tr - r tr se - r go se - sp r tr - r go r io - r go se - l go sp - r go sp - r tr l tr - l go r io - sp l io - sp sp - l go l io - l go se - r tr sp - l tr se - r io r io - r tr se - l tr r io - l io l io - l tr se - l io Table 10: The order of the selection for the variance analysis using the coefficient of variation for the landmark set with the Gonion 77

78 order distance information difference 1 se - sp l tr - r tr sp - l tr sp - r tr se - r tr se - l tr r io - sp r io - r tr l io - sp se - r io r io - l io l io - l tr se - l io Table 11: The order of the selection for the variance analysis using the sample mean deviation for the landmark set without the Gonion order distance information difference 1 l tr - r tr se - sp sp - r tr r io - sp l io - sp se - r tr sp - l tr se - r io r io - r tr se - l tr r io - l io l io - l tr se - l io Table 12: The order of the selection for the variance analysis using the coefficient of variation for the landmark set without the Gonion 78

79 C.2 Correlation Analysis The correlation analysis is performed on the landmark sets resulting from the variance analysis. The order of the selection of the landmarks are shown in table 13 for the landmark set with the gonion and in table 15 for the landmark set without the gonion. All distances displayed in these table were selected for the final landmark sets. The bold distances in this table have a high correlation with another distances which was excluded from the set. The distances with a high correlation are displayed in table 14 for the landmark set with the gonion and in table 16 for the landmark set without the gonion. order distance mean correlation 1 sp - r go - 2 r tr - r go se - r io l io - sp r io - r tr l tr - l go r go - l go se - sp sp - l go se - l tr r io - r go l io - l go r tr - l tr sp - r tr Table 13: The order of the selection for the correlation analysis for the landmark set with the Gonion distance 1 distance 2 correlation r io - r go se - r go l io - l go se - l go l io - sp r io - sp se - l go se - r go r io - r tr se - r tr sp - r tr sp - l tr Table 14: The distances with a high correlation to another distances for the landmark set with the Gonion 79

80 order distance mean correlation 1 se - r Inf - 2 l Inf - sp r Inf - r tr se - l tr se - sp r tr - l tr sp - r tr Table 15: The order of the selection for the correlation analysis for the landmark set without the Gonion distance 1 distance 2 correlation l Inf - sp r Inf - sp r Inf - r tr se - r tr sp - r tr sp - l tr Table 16: The distances with a high correlation to another distances for the landmark set without the Gonion 80

81 C.3 Fisher Discriminant Analysis The discriminant analysis is performed on two landmark sets; one with distances from and to the Gonion (table 17) and one without distances from and to the Gonion (table 18). The column minimal tolerance shows the percentage of variance that is explained by this variable and not yet by the model. The Wilks lamba is the overall explanation of the variance of the model. order distance minimal tolerance Wilks Lambda 1 se - r go se - sp r go - l go x r tr - r go x l tr - r tr x r io - r go x se - l go x se - r tr x sp - r go x l tr - l go x r io - r tr x sp - l tr x se - r io x sp - l go x se - l io x r io - l io x l io - sp x l io - l tr x 10 8 not selected - se - l tr x r io - sp x l io - gogo x sp - r tr x 10 7 Table 17: The results of the stepwise discriminant analysis for the landmark set with the Gonion 81

82 order distance minimal tolerance Wilks Lambda 1 se - sp l tr - r tr sp - r tr se - r tr x r io - r tr x se - r io x sp - l tr x r io - sp x se - l io x r io - l io x l io - sp x se - l tr x l io - l tr x 10 6 Table 18: The results of the stepwise discriminant analysis for the landmark set without the Gonion 82

D Bump Hunting

Bump hunting is a classification method. It creates a list of detection rules for each separate class; based on these rules, a new point can be classified. The most commonly used bump-hunting algorithm is PRIM (Patient Rule Induction Method) [113]. In this section we describe the main principles of bump hunting. For more detail, the reader is referred to [114].

D.1 Creation of Detection Rules

Bump hunting searches the n-dimensional data and places boxes around the points that belong to the same class. These boxes are constructed by creating detection rules. During the construction of these boxes, two parameters are used to influence the process: the mean of a box $M_b$ and the support of a box $S_b$. The mean of a box is used to test whether a box holds sufficient information for the classification. The support of a box is used to test whether the box is large enough to be included in the detection rules:

$$M_b = \frac{\text{number of correctly classified points in box } b}{\text{number of points in box } b} \qquad (16)$$

$$S_b = \frac{\text{number of points in box } b}{\text{total number of points in the data}} \qquad (17)$$

Figure 23: A 2-dimensional example of the creation of bounding boxes. In (a) the data set is shown; (b), (c) and (d) show the created bounding boxes $B_1$, $B_2$ and $B_3$, respectively.
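To make definitions (16) and (17) concrete, the sketch below computes the support and the mean of one candidate box on a small 2D example; the point coordinates, labels and box bounds are invented for illustration, and the two quantities follow the example-consistent reading of the formulas above.

    import numpy as np

    def box_support_and_mean(X, y, lo, hi, target=1):
        """Support S_b and mean M_b of the axis-aligned box lo <= x <= hi.
        X: (N, d) points, y: class labels, target: the class the rules describe."""
        inside = np.all((X >= lo) & (X <= hi), axis=1)
        n_in_box = inside.sum()
        support = n_in_box / len(X)                                 # equation (17)
        mean = (y[inside] == target).mean() if n_in_box else 0.0    # equation (16)
        return support, mean

    # Hypothetical 2D example resembling figure 23.
    rng = np.random.default_rng(0)
    X = rng.uniform(0.0, 1.5, size=(21, 2))
    y = rng.integers(0, 2, size=21)
    print(box_support_and_mean(X, y, lo=np.array([0.5, 0.1]), hi=np.array([0.8, 0.5])))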

We will explain the PRIM algorithm using a 2-class example in 2 dimensions. This example is shown in figure 23(a). We try to find detection rules that classify points as the blue squares. As a stopping rule, we set a minimum support for each box. The first box B_1 is found with the rule 0.5 < x < 0.8 and 0.1 < y < 0.5 and is shown in figure 23(b). It has a support of 6/21 and a mean of 6/6 = 1. Before the second box can be created, the points in B_1 have to be removed. The second box B_2 is found with the rule 0.2 < x < 0.5 and 0.9 < y < 1.1 and is shown in figure 23(c). It has a support of 3/21 = 1/7 and a mean of 2/3. Again, the points that fall into B_2 are removed before one can continue. The third box B_3 is found with the rule 0.7 < x < 0.8 and 1.1 < y < ... and is shown in figure 23(d). It has a support of 1/21 and a mean of 1/1 = 1. Although this box has a high mean, its support is below the minimum support, so this rule is not accepted and the classification of the blue squares is based on 2 detection rules.

Figure 24: The test data that is used to test the bounding boxes created in section D.1.

D.2 Using the Detection Rules

Before the detection rules can be used for classification, they must first be tested. The test data ought to be different from the training data on which the boxes were created. A new test set for the example is shown in figure 24. The box mean for boxes B_1 and B_2 is calculated again, giving M_B1 = 5/7 and M_B2 = 3/5. These values can be used as probabilities for classification. If no rules are satisfied, the probability of a blue square is 2/12 = 1/6. Using these rules, we can calculate for a new point p the probability P(p = blue square) with the following pseudo code:

Algorithm for blue square classification

    for each new point p do {
        if (0.5 < p.x < 0.8 && 0.1 < p.y < 0.5)
            return P(p = blue square) = 5/7
        if (0.2 < p.x < 0.5 && 0.9 < p.y < 1.1)
            return P(p = blue square) = 3/5
        return P(p = blue square) = 1/6
    }

When using bump hunting, one must be aware that the collection of rules is an ordered set: the rules are not independent, since the target mean of box B_k is calculated with the data points in boxes B_1, ..., B_{k-1} removed. If box B_k does not overlap any previous box and is used as an independent rule, its box mean will probably remain about the same. But if B_k overlaps one of the previous boxes (which is usually the case in practice), its box mean would probably be much lower.
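The pseudo code above translates directly into a small runnable function. The Python sketch below uses the probabilities estimated on the test set of figure 24; the function name is chosen for this illustration.

    def p_blue_square(x, y):
        """Return P(point = blue square) using the ordered detection rules.

        The rules must be tried in the order in which the boxes were created,
        because each box mean was estimated with the points of earlier boxes removed.
        """
        if 0.5 < x < 0.8 and 0.1 < y < 0.5:    # box B1
            return 5 / 7
        if 0.2 < x < 0.5 and 0.9 < y < 1.1:    # box B2
            return 3 / 5
        return 1 / 6                           # no rule satisfied

    # Example: a point inside B1
    print(p_blue_square(0.6, 0.3))   # 0.714...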

86 E Detection Rules of Facial Landmarks In this section are the detection rules shown, as found by bump hunting. 11 radii, min-radius, max-radius. etc E.1 sellion (se) 1. p(se) = IF r1 > AND r6 > AND r8 = convex AND r11 = convex 2. p(se) = IF r1 < AND r4 < AND r7 > AND r11 > AND r11 = convex 3. p(se) = IF r3 > AND r8 < AND r9 < AND r10 > AND r10 = convex AND r11 > p(se) = IF r6 > AND r7 > AND r8 > AND r9 > AND r11 > r11 = convex 5. p(se) = IF r1 < AND r1 = convex AND r2 > AND r3 > AND r4 < AND r6 > AND r8 < AND r10 > AND r11 > AND r11 = convex 6. p(se) = IF r2 > AND r2 < AND r3 > AND r3 < AND r4 > AND r5 > AND r6 > AND r8 < AND r9 > AND r9 < AND r10 > AND r11 > AND r4 = convex 7. p(se) = IF r4 > AND r5 > AND r7 > AND r9 > AND r10 > AND r11 > AND r11 < p(se) = IF r1 > AND r2 > AND r2 < AND r3 < AND r4 > AND r6 > AND r6 < AND r8 > AND r9 > AND r9 < AND r10 > AND r11 > AND r2 = convex AND r4 = convex AND r11 = convex 9. p(se) = IF r2 > AND r3 > AND r3 < AND r5 > AND r6 > AND r9 < AND r10 > AND r11 > AND r4 = convex AND r10 = convex 10. p(se) = IF r2 > AND r2 < AND r3 > AND r3 < AND r5 < AND r7 < AND r8 < AND r9 > AND r9 < AND r10 < AND r1 = convex AND r11 = convex 11. p(se) = IF r3 < AND r4 > AND r6 > AND r7 > AND r7 < AND r8 > AND r9 > AND r10 > p(se) = IF r1 > AND r2 > AND r3 > AND r4 > AND r5 > AND r6 > AND r7 > AND r8 > AND r9 > AND r10 > AND r11 > p(se) = IF r4 = convex AND r5 = convex AND r6 = convex AND r7 = convex AND r8 = convex AND r9 = convex AND r10 = convex AND r11 = convex 86

87 E.2 infraorbitale (io) 1. p(io) = IF r3 < AND r6 < AND r8 > AND r8 < AND r9 > AND r10 > AND r11 < p(io) = IF r3 > AND r4 > AND r5 > AND r5 < AND r6 > AND r7 < AND r8 > AND r9 > AND r10 < AND r11 < AND r11 = convex 3. p(io) = IF r4 > AND r6 > AND r6 < AND r7 < AND r9 > AND r10 > AND r10 < AND r4 = convex AND r5 = convex AND r6 = convex AND r7 = convex AND r8 = convex AND r9 = convex AND r10 = convex AND r11 = convex 4. p(io) = IF r4 > AND r4 < AND r5 > AND r5 < AND r6 < AND r7 > AND r7 < AND r8 < AND r9 > AND r10 > AND r11 < AND r8 = convex 5. p(io) = IF r1 > AND r3 < AND r4 > AND r6 > AND r7 > AND < AND r8 < AND r11 > AND < AND r11 = convex 6. p(io) = IF r3 > AND e3 < AND r4 > AND r6 > AND r6 < AND r7 < AND r8 > AND r9 > AND r10 > AND r3 = convex AND r11 = convex 7. p(io) = IF r2 < AND r3 > AND r4 > AND r5 > AND r5 < AND r6 > AND r6 < AND r7 < AND r8 < AND r9 > AND r10 > AND r11 < AND r6 = convex 8. p(io) = IF r2 > AND r5 > AND r5 < AND r6 > AND r6 < AND r7 < AND r8 > AND r8 < AND r9 > AND r10 > AND r10 < AND r11 < AND r3 = convex AND r5 = convex 9. p(io) = IF r1 > AND r2 > AND r3 < AND r4 < AND r5 < AND r6 > AND r6 < AND r8 > AND r9 > AND r10 > AND r11 > AND r11 < AND r11 = convex 10. p(io) = IF r2 > AND r2 < AND r3 < AND r6 < AND r7 > AND r7 < AND r8 < AND r9 < AND r11 < AND r10 = convex 11. p(io) = IF r1 > AND r1 < AND r2 > AND r4 > AND r5 > AND r5 < AND r7 < AND r8 > AND r8 < AND r9 > AND r10 > AND r11 < p(io) = IF r5 > AND r5 < AND r6 >

88 AND r9 > AND r10 > AND r11 < AND r11 = convex 13. p(io) = IF r1 > AND r4 > AND r6 > AND r7 > AND r7 < AND r8 > AND r9 > AND r11 < p(io) = IF r1 > AND r3 > AND r3 < AND r5 < AND r7 > AND r7 < AND r8 > AND r8 < AND r9 > AND r11 < AND r5 = convex AND r9 = convex 15. p(io) = IF r3 < AND r4 > AND r6 > AND r8 < AND r9 > AND r10 > AND r10 < AND r11 < AND r1 = convex AND r9 = concave 16. p(io) = IF r2 > AND r3 < AND r4 > AND r5 > AND r5 < AND r6 > AND r8 < AND r9 < AND r10 < AND r11 < AND r11 = concave 17. p(io) = IF r1 > AND r3 < AND r4 > AND r5 > AND r6 < AND r7 < AND r8 < AND r10 > AND r8 < AND r11 > p(io) = IF r2 > AND r3 > AND r4 < AND r5 > AND r5 < AND r6 > AND r7 > AND r9 > AND r10 < AND r11 < AND r11 = convex 19. p(io) = IF r2 > AND r2 < AND r3 < AND r4 < AND r6 > AND r7 < AND r8 < AND r11 < AND r3 = convex AND r11 = convex 20. p(io) = IF r1 > AND r2 > AND r4 > AND r5 > AND r6 > AND r7 > AND r8 > AND r11 < AND r6 = convex 21. p(io) = IF r1 > AND r2 < AND r3 > AND r4 > AND r5 < AND r7 < AND r8 < AND r11 > AND < AND r9 = convex 22. p(io) = IF r1 > AND r2 > AND r3 < AND r4 > AND r5 > AND r6 > AND r7 < AND r8 > AND r9 < AND r10 < AND r11 < p(io) = IF r1 > AND r2 > AND r3 > AND r4 < AND r6 > AND r10 < AND r11 < AND r11 = concave 24. p(io) = IF r1 < AND r2 < AND r3 < AND r5 > AND r7 > AND r8 > AND r9 > AND r9 < AND r10 > AND r10 < p(io) = IF r2 > AND r3 < AND r4 > AND < AND r5 > AND < AND r6 > AND < AND r7 <

89 AND r9 < AND r10 > AND r11 < p(io) = IF r3 > AND r4 > AND R4 < AND r5 > AND r6 > AND R6 < AND r7 > AND r8 > AND r9 < AND r10 < AND r11 > p(io) = IF r1 > AND < AND r4 > AND r5 > AND r6 < AND r7 < AND r11 < p(io) = IF r1 > AND r2 < AND r3 > AND r4 > AND r4 < AND r5 > AND r6 > AND r7 > AND r8 > AND r8 < AND r9 > AND < AND r10 < AND r3 = convex 29. p(io) = IF r1 < AND r4 < AND r6 > AND r8 < AND r10 < AND r11 < p(io) = IF r5 > AND r5 < AND r6 > AND r7 > AND r8 > AND r8 < AND r9 > AND r9 < AND r10 > AND r11 > AND r2 = convex AND r8 = convex 32. p(io) = IF r4 > AND r5 > AND r6 > AND r7 > AND r8 > AND r8 < AND r9 > AND r9 < AND r10 > AND r10 < AND r11 > AND r11 < p(io) = IF r5 > AND r7 < AND r9 < AND r10 > AND r10 < AND r11 > AND r11 < E.3 supramenton (sp) 1. p(sp) = IF r1 > AND r1 < AND r3 > AND r3 < AND r5 > AND r5 < AND r7 > AND r8 > AND r10 > AND r10 < AND r11 > AND r11 < AND r11 = convex 2. p(sp) = IF r1 > AND r2 > AND r6 > AND r7 > AND r8 > AND r9 > AND r10 > AND r11 > AND r11 < AND r8 = concave AND r11 = convex 3. p(sp) = IF r1 > AND r2 > AND r3 > AND r3 < AND r5 > AND r7 > AND r8 > AND r9 < AND r10 > AND r10 < AND r11 > AND r11 < p(sp) = IF r1 > AND r2 > AND r3 > AND r5 > AND r7 > AND r8 > AND r9 > AND r10 > AND r10 < AND r11 > AND r11 < AND r11 = convex 5. p(sp) = IF r1 > AND r4 > AND r5 > AND r6 > AND r7 > AND r8 > AND r9 < AND r11 > AND r5 = concave 89

90 6. p(sp) = IF r1 > AND r2 > AND r2 < AND r3 > AND r5 > AND r7 > AND r8 > AND r10 > AND r11 > AND r11 < p(sp) = IF r1 > AND r2 > AND r3 < AND r4 > AND r4 < AND r5 > AND r5 < AND r10 > AND r11 < p(sp) = IF r1 > AND r4 > AND r5 > AND r7 > AND r8 > AND r9 < AND r5 = concave 9. p(sp) = IF r1 > AND r2 > AND r2 < AND r3 > AND r3 < AND r4 > AND r5 > AND r6 >.253 AND r7 > AND r10 > AND r11 > AND r11 < AND r5 = convex AND r11 = convex 10. p(sp) = IF r1 > AND r4 > AND r9 = concave AND r11 = convex 11. p(sp) = IF r1 > AND r2 > AND r2 < AND r3 > AND r3 < AND r4 > AND r5 > AND r6 < AND r7 > AND r7 < AND r9 < AND r10 > AND r10 < AND r11 < p(sp) = IF r1 > AND r1 < AND r2 > AND r2 < AND r3 > AND r4 > AND r4 < AND r5 > AND r6 > AND r7 > AND r8 > AND r9 < AND r10 > AND r10 < AND r11 > AND r11 < p(sp) = IF r1 > AND r1 < AND r2 > AND r2 < AND r3 > AND r4 > AND r4 < AND r5 > AND r5 < AND r6 > AND r8 > AND r10 < AND r11 > AND r11 < AND r11 = convex 14. p(sp) = IF r1 > AND r2 > AND r2 < AND r3 > AND r4 > AND r5 > AND r8 > AND r9 > AND r10 > AND r11 > AND r11 < AND r4 = concave AND r6 = convex AND r10 = convex 15. p(sp) = IF r1 > AND r4 > AND r5 > AND r8 > AND r9 < AND r11 < p(sp) = IF r1 > AND r2 > AND r5 > AND r7 < AND r11 < AND r11 = convex 17. p(sp) = IF r1 > AND r4 > AND r5 < AND r6 < AND r7 > AND r7 < AND r8 > AND r8 < AND r9 > AND r9 < AND r10 > AND r11 < p(sp) = IF r2 > AND r2 < AND r3 < AND r4 > AND r6 > AND r7 > AND r8 > AND r10 < AND r11 > AND r1 = convex AND r11 = convex 90

91 19. p(sp) = IF r1 > AND r1 < AND r2 < AND r3 > AND r4 > AND r5 > AND r6 > AND r6 < AND r7 > AND r9 > AND r9 < AND r10 > AND r10 < AND r11 > AND r11 < AND r11 = convex 20. p(sp) = IF r4 > AND r5 > AND r6 < AND r7 > AND r8 > AND r9 > AND r10 > AND r10 < AND r11 < AND r2 = convex AND r11 = convex 21. p(sp) = IF r2 < AND r3 > AND r4 > AND r5 > AND r6 > AND r7 > AND r8 > AND r9 > E.4 tragion (tr) 1. p(tr) = IF r2 > AND r3 < AND r4 < AND r8 < AND r9 < AND r10 < AND r11 < AND r11 = concave 2. p(tr) = IF r1 < AND r2 > AND r3 > AND r5 > AND r6 < AND r11 = concave 3. p(tr) = IF r1 > AND r2 > AND < AND r3 > AND < AND r4 < AND r7 < AND r11 > AND r10 = concave 4. p(tr) = IF r2 > AND r4 > AND r5 > AND r6 > AND r7 > AND r10 < AND r11 < AND r8 = concave 5. p(tr) = IF r3 > AND r7 > AND r11 < p(tr) = IF r1 > AND < AND r2 > AND < AND r3 > AND < AND r4 > AND r8 < AND r9 < AND r11 > AND r11 = concave 7. p(tr) = IF r4 < AND r5 < AND r6 > AND r10 < AND r9 = concave 8. p(tr) = IF r1 > AND r6 < AND r7 > AND r8 > AND r9 > AND r10 > AND r11 > AND r10 = concave 9. p(tr) = IF r4 > AND r5 > AND r6 < AND r7 < p(tr) = IF r3 < AND r4 < AND r6 > AND r9 < AND r10 < AND r9 = concave 11. p(tr) = IF r2 < AND r3 > AND < AND r5 < AND r1 = convex AND r11 = concave 12. p(tr) = IF r4 < AND r7 > AND r9 > AND r10 < p(tr) = IF r1 > AND r3 > AND r4 > AND r6 > AND r7 < AND r8 > AND r10 > AND r10 = concave 14. p(tr) = IF r4 > AND r6 > AND r7 > AND r9 > AND r10 < AND r11 <

92 15. p(tr) = IF r1 < AND r2 > AND < AND r3 > AND < AND r4 < AND r7 < AND r10 < AND r4 = convex AND r11 = convex 16. p(tr) = IF r1 > AND r2 < AND r3 < AND r4 < AND r5 < AND r7 < AND r8 > AND r9 > AND r11 < AND r1 = concave AND r8 = concave 17. p(tr) = IF r2 < AND r4 < AND r7 < AND r8 < p(tr) = IF r2 > AND r3 > AND r4 < AND r8 > AND r10 > AND r11 < p(tr) = IF r2 > AND < AND r3 > AND r4 < AND r5 > p(tr) = IF r1 > AND r3 > AND < AND r4 > AND r6 < AND r7 > AND r9 > AND r11 > p(tr) = IF r3 > AND r4 < AND r6 > AND r9 < p(tr) = IF r3 > AND r4 < AND r7 < AND r9 > p(tr) = IF r4 > AND r6 > AND r8 < AND r11 = concave 24. p(tr) = IF r11 = concave E.5 gonion (go) 1. p(go) = IF r5 < AND r7 < AND r8 < AND r9 > AND r10 < AND r3 = concave AND r9 = convex 2. p(go) = IF r1 < AND r2 < AND r3 > AND r5 < AND r6 < AND r7 > AND r8 > AND r9 > AND r10 < AND r11 < p(go) = IF r1 < AND r3 < AND r5 < AND r6 < AND r7 < AND r8 > AND r9 > AND r10 > AND r10 < AND r11 < p(go) = IF r1 < AND r3 > AND r7 > AND r7 < AND r10 < p(go) = IF r3 > AND r4 > AND r4 < AND r5 < AND r8 > AND r8 < AND r10 < AND r11 < AND r11 = convex 6. p(go) = IF r1 > AND r3 > AND r5 > AND r8 < AND r9 < AND r10 > AND r11 > AND r11 < AND r3 = concave 7. p(go) = IF r7 < AND r8 > AND r8 < AND r10 < AND r3 = concave AND r9 = convex 8. p(go) = IF r1 < AND r5 > AND r5 < AND r7 < AND r8 < AND r9 >

93 AND r9 < AND r11 < AND r11 = convex 9. p(go) = IF r2 > AND r4 > AND r5 < AND r8 < AND r10 > AND r10 < AND r11 < AND r8 = concave 10. p(go) = IF r1 > AND r1 < AND r3 > AND r4 > AND r5 < AND r8 > AND r9 < AND r10 < AND r11 < p(go) = IF r3 > AND r4 < AND r6 > AND r8 > AND r8 < AND r9 < AND r10 < AND r11 < AND r2 = convex 12. p(go) = IF r1 < AND r3 > AND r3 < AND r5 < AND r6 < AND r7 > AND r10 > AND r11 < p(go) = IF r3 > AND r3 < AND r5 < AND r6 > AND r7 > AND r7 < AND r8 < AND r9 > AND r10 < AND r11 < p(go) = IF r7 < AND r8 > AND r8 < AND r9 > AND r10 > AND r10 < AND r11 < AND r6 = concave 15. p(go) = IF r3 > AND r5 < AND r6 > AND r8 < AND r9 > AND r10 < AND r11 < AND r2 = convex 16. p(go) = IF r1 < AND r3 < AND r6 < AND r8 > AND r11 < p(go) = IF r1 < AND r2 > AND r4 > AND r5 < AND r8 > AND r8 < AND r10 < AND r11 < p(go) = IF r3 > AND r7 < AND r8 < AND r9 > AND r9 < AND r10 > AND r10 < AND r11 < AND r9 = concave 19. p(go) = IF r1 < AND r3 > AND r3 < AND r4 > AND r4 < AND r5 < AND r6 < AND r7 < AND r8 < AND r9 < AND r10 > AND r10 < AND r11 < AND r11 = convex 20. p(go) = IF r1 < AND r2 < AND r4 > AND r6 > AND r6 < AND r9 < AND r10 < AND r11 < AND r11 = convex 21. p(go) = IF r1 < AND r2 < AND r4 < AND r6 < AND r9 > AND r10 > AND r10 < AND r11 > AND < p(go) = IF r1 < AND r2 < AND r3 > AND r6 < AND r10 > AND < AND r11 < p(go) = IF r1 < AND r3 > AND r5 < AND r6 > AND r9 < AND r11 < AND r2 = concave 24. p(go) = IF r6 > AND r8 > AND r10 < AND r11 >

94 25. p(go) = IF r1 < AND r2 < AND r3 < AND r4 < AND r6 > AND r7 > AND r8 < AND r9 > AND r11 > AND r11 < p(go) = IF r7 > AND r8 < AND r9 > AND r10 > AND r10 < AND r11 < p(go) = IF r5 < AND r6 >

F Program Manual

In this section, the program that was written for this thesis is described. We will discuss its main functions.

F.1 View options

For most viewing actions, the user first has to open a single scan. This is done by selecting open scan in the File-menu. If a 3D whole body scan is opened, the user is asked to locate the boundingbox-file to extract the facial region from the scan. The opened scan can be viewed in 3D as a point cloud or as a surface; both functions can be selected from the View-menu. One can also see the manually located landmarks from the CAESAR-survey by selecting landmarks in the View-menu. This option is enabled once the landmarks have been read into memory, by selecting read landmarks in the Landmarks-menu.

The user can also segment the curvature of the scan by selecting the option segmentation in the View-menu. When segmentation is pressed, a window opens where the user can adjust the settings for the curvature characterization. The segmentation is based on the multi-scale bubbles curvature characterization described earlier in this thesis. This window is shown in figure 25. When the settings are adjusted, the user can segment the scan by pressing select. The segmentation or characterization for each separate radius is then shown, as in figure 26. The color code for the segmentation is:

color    description             name
red      convex sharp point      tip (T)
green    concave sharp point     pit (P)
blue     convex round point      mount (M)
yellow   concave round point     dip (D)
white    convex blend point      blend (B)

When the user presses the next step button, a query can be performed to find points that fulfil a certain surface characterization. Figure 27 shows a query for all points which are classified as blend point (B) for the (fourth) radius of 5.01 pixels; the surface characterizations for the other radii do not matter (*).

F.2 Landmark Detection

The automated landmark detection is performed on multiple scans. When the user selects detect landmarks from the Landmark-menu, a window pops up where the user has to select the locations of the needed files: a folder with the scans (if these scans are whole body scans, the boundingbox-file must be selected as well), a folder with the detection rules, and a folder in which to place the result files. When the user has confirmed the locations, the actual detection begins with the first scan in the folder. When the local curvature of all points in the scan has been analysed, the user can view, for each point, its probability of being a certain landmark. This is shown in figures 28 and 29. With the radio buttons the user can view the highest probability for all points (figure 28), but also select a landmark for which the probability distribution must be shown (figure 29).

Figure 25: The parameter settings for the curvature characterization

Figure 26: The curvature characterization for each landmark

In the latter case, the probabilities are shown from no probability (white) up to a high probability (landmark color). For the automated landmark detection, the user presses next step. When the landmarks have been located, they are shown to the user as in figure 30. The landmarks that are correctly found must be selected in the window. When the user presses next scan, the results are saved in a file and the next scan is processed.

F.3 Local curvature analysis for landmarks

In the File-menu, there is an option run batch. This option is used to determine the local surface characterization of the manually located landmarks of the CAESAR-survey. Before running the batch, the user has to specify the locations of the needed files: a folder with the scans (if these scans are whole body scans, the boundingbox-file must be selected as well), a folder with the landmark files, and a folder in which to place the result files. When the user has confirmed the locations, the actual analysis begins with the first scan in the folder.

Figure 27: A query for points with a certain local curvature

Before the analysis of the local curvature can be performed, the user must validate the position of the landmarks by selecting the option view result in the File-menu. The user has to specify the locations of the needed files: a folder with the scans (if these scans are whole body scans, the boundingbox-file must be selected as well) and a folder with the result files. When the user has confirmed the locations, a window opens which shows the first scan in the folder, including the manually located landmarks of the CAESAR-survey. The user can delete the landmarks that cannot be used because they are not accurately located, or because they lie too close to the border of the scan or to a hole in the scan.

Figure 28: The points labelled in the color of the landmark with the highest probability. When a point has no probability to be a certain landmark that is higher than the threshold (0.75), it is labelled black.

Figure 29: The probabilities of being a sellion for all points in a scan. Red points have a high probability and white points have a zero probability.

Figure 30: The locations of the detected landmarks. The user can select the landmarks that are correctly found.
