Face Recognition Software using the LDA Algorithm


Sumit Sampat, Mohammad Murtuza, and Mohammad Anwar

Abstract: The essential task in a face recognition system is dimensionality reduction. We implement linear discriminant analysis (LDA), also known as the Fisherfaces method, to reduce the dimensionality and recognize facial images. The Fisherfaces method addresses the small sample size problem of LDA by first applying a PCA step, as in the Eigenfaces approach. Advantages of alternative approaches such as D-LDA and DF-LDA are discussed. We use only frontal images in the FERET dataset. We also discuss software we developed that implements LDA. Finally, we analyze the results for a varying set of images.

Index Terms: Eigenfaces, Fisherfaces, linear discriminant analysis (LDA), Direct LDA, Direct Fractional LDA, principal component analysis (PCA), FERET dataset, face recognition, biometrics.

1. Introduction

A number of biometrics have been proposed for making human recognition an automated task. The face is an important biometric, and a large amount of research is being done in face recognition. First, let us distinguish between face recognition and face verification. The latter deals with confirming a person's identity and is a one-to-one problem, whereas the former deals with establishing a person's identity and is a one-to-many problem [3]. A number of algorithms have been proposed for face recognition in the appearance-based approach to avoid difficulties with three-dimensional (3-D) modeling. Principal component analysis (PCA), or Eigenfaces, and linear discriminant analysis (LDA) are powerful tools used for data reduction and feature extraction [1]. The Eigenface method is based on linearly projecting the image space to a low-dimensional feature space. However, the Eigenface approach uses PCA for dimensionality reduction, which yields projection directions that maximize the total scatter across all classes, i.e., across all images of all faces.
Eigenfaces focus on the most expressive features (MEF), which mainly serve object reconstruction. Thus, in PCA significant discriminatory information is lost [2]. Fisherfaces, or linear discriminant analysis (LDA), focuses on the most discriminant features (MDF) to achieve better classification accuracy. LDA maximizes the between-class scatter and minimizes the within-class scatter to separate classes and to keep images of the same class close to each other [1], [2].

Degenerate scatter matrices arise from the so-called small sample size (SSS) problem. The SSS problem is widespread in face recognition tasks, where the number of training samples is smaller than the dimensionality of the samples. The traditional solution to the SSS problem incorporates a PCA step into the LDA framework. In this approach, PCA is used as a preprocessing step for dimensionality reduction to discard the null space of the training data set. LDA is then performed on the lower-dimensional PCA subspace. Some information is lost in the PCA step. To overcome this problem, an alternative algorithm, direct LDA (D-LDA), has been proposed which eliminates the PCA step, so that significant discriminatory information is preserved. A fractional LDA (F-LDA) step is further applied to reduce the dimension in a few fractional steps, allowing the relevant distances to be accurately weighted. The combination of these two algorithms gives rise to the direct fractional LDA (DF-LDA) algorithm. However, we concentrate on the traditional LDA or Fisherfaces approach and analyze its performance [1].

2. Fisherfaces Approach

A set of L training images $x_1, x_2, \ldots, x_L$ of size 150 x 130 (height x width) is the input to the face recognition task. Each image matrix is represented as a vector of length N (height x width), giving a dimensionality of N = 19,500 for the images in the FERET dataset. A set of c classes is arbitrarily chosen; in our case c is 20, 40 or 60 to analyze the performance of the LDA algorithm. $L_i$ is the number of images in class i and varies from class to class. Thus, there are L images $x_j$, each belonging to one of the c classes $X_i$. The idea is to use the PCA step to reduce the dimensionality from N to L-c, because the within-class scatter matrix of L images in c classes has rank at most L-c, and then to perform LDA to reduce the dimensionality of the subspace to c-1. Initially, it is necessary to calculate the mean of each class $\mu_i$ and the overall mean $\mu$ of all images.

Mean of each class $\mu_i$, where the sum runs over the $L_i$ images $x_j$ of class $X_i$:

$$\mu_i = \frac{1}{L_i} \sum_{j=1}^{L_i} x_j \tag{1}$$

Overall mean $\mu$:

$$\mu = \frac{1}{c} \sum_{i=1}^{c} \mu_i \tag{2}$$

The mean-centered images $\Phi$ are calculated by subtracting the overall mean $\mu$ from the images:

$$\Phi = X - \mu \tag{3}$$

The PCA step is now applied to reduce the dimensionality to L-c. The eigenvectors E of the scatter matrix of the mean-centered images are calculated, and the projection onto the eigenspace is obtained by multiplying the eigenvectors E with the mean-centered images $\Phi$.
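The steps of this section can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the code of our tool: the function name `fisherfaces`, the row-per-image layout of `X`, and the use of the small L x L matrix to obtain the PCA eigenvectors are assumptions made for the example. The LDA part uses the between-class and within-class scatter matrices that the text defines next in Eqs. (4) and (5), computed in the PCA subspace.

```python
import numpy as np


def fisherfaces(X, labels):
    """Return (mu, W): the overall mean and an (N, c-1) projection matrix.

    X      : (L, N) array, one flattened training image per row (assumed layout).
    labels : length-L sequence of class identifiers.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    L, c = X.shape[0], classes.size

    # Eq. (1): mean of each class. Eq. (2): overall mean as the mean of class means.
    class_means = np.stack([X[labels == k].mean(axis=0) for k in classes])
    mu = class_means.mean(axis=0)

    # Eq. (3): mean-centered images.
    Phi = X - mu

    # PCA step. The N x N scatter matrix is large, so decompose the L x L
    # matrix Phi Phi^T instead; it has the same nonzero eigenvalues.
    evals, evecs = np.linalg.eigh(Phi @ Phi.T)
    keep = np.argsort(evals)[::-1][: L - c]
    E = Phi.T @ evecs[:, keep]          # (N, L-c) eigenvectors in image space
    E /= np.linalg.norm(E, axis=0)      # scale each column to unit length
    Y = Phi @ E                         # images in the PCA subspace, (L, L-c)

    # LDA step in the PCA subspace. The overall mean maps to the origin there,
    # so each projected class mean already equals the projection of (mu_i - mu).
    m = Y.shape[1]
    S_B = np.zeros((m, m))
    S_W = np.zeros((m, m))
    for k in classes:
        Yk = Y[labels == k]
        mk = Yk.mean(axis=0)
        S_B += len(Yk) * np.outer(mk, mk)      # between-class scatter
        S_W += (Yk - mk).T @ (Yk - mk)         # within-class scatter

    # Eigenvectors of S_W^{-1} S_B for the c-1 largest eigenvalues.
    w, U = np.linalg.eig(np.linalg.solve(S_W, S_B))
    top = np.argsort(w.real)[::-1][: c - 1]
    W = E @ U[:, top].real
    return mu, W


# Synthetic stand-in for image data: 3 classes, 4 "images" each, N = 50.
rng = np.random.default_rng(0)
centers = rng.normal(scale=10.0, size=(3, 50))
labels = np.repeat([0, 1, 2], 4)
X = centers[labels] + rng.normal(size=(12, 50))
mu, W = fisherfaces(X, labels)
Z = (X - mu) @ W                        # each image reduced to c-1 = 2 features
print(W.shape, Z.shape)
```

A new image `x` would be projected the same way, as `(x - mu) @ W`, and compared with the projected training images by Euclidean distance.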

LDA maximizes the ratio of the between-class scatter matrix $S_B$ to the within-class scatter matrix $S_W$ for better classification. The between-class scatter matrix $S_B$ is computed as

$$S_B = \sum_{i=1}^{c} L_i (\mu_i - \mu)(\mu_i - \mu)^T \tag{4}$$

The within-class scatter matrix $S_W$ is computed as

$$S_W = \sum_{i=1}^{c} \sum_{x_k \in X_i} (x_k - \mu_i)(x_k - \mu_i)^T \tag{5}$$

Both scatter matrices are square, with one row and one column per dimension of the space in which they are computed. Because they are computed in the PCA subspace, $S_W$ is nonsingular and its inverse exists. Thus, we need to calculate the eigenvectors of $S_W^{-1} S_B$. The eigenvectors U of this matrix corresponding to the c-1 largest eigenvalues are computed. The Fisherfaces $\Omega$ are calculated by multiplying the eigenvectors U with the mean-centered images $\Phi$. The images in the PCA subspace are thus projected onto the reduced Fisherspace [1], [2], [4].

3. LDA Analysis Tool

The FERET dataset is an image database that consists of over 14,000 images of 1,200 individuals. Pictures of these individuals were taken at various angles and with various expressions. Our analysis of the linear discriminant analysis (LDA) algorithm focused on the frontal-position subset. As a reference, we used Colorado State University's source code to implement the standard LDA algorithm [5]. We ported some of the source to the .NET Framework for rapid development of our LDA analysis tool. Our program features a visual interface for training, testing, image distance graphing, 2D and 3D plotting, and a webcam interface.

Each image needs to be normalized before it can be used for training. Figure 1 shows an image before and after normalization. Normalization removes background noise and unnecessary detail and equalizes the image's histogram.

Figure 1. Results of normalizing an image.

After images are normalized, they are ready for training and testing. Our tool allows the user to visually pick which images to use for training and for testing. The first step is to create an SRT file, which is a text file with a list of images.
Each line in the training SRT file is a class, and a class can consist of multiple images of the same person. Figure A1 in the appendix shows the user interface. After training, a training file is saved which stores the eigenvalues, the mean of the training images, and the basis vectors [5]. The next step is to create a testing SRT file, which is the union of the test images and the trained images from the previous step. After testing, a distance file is generated for each image, listing its Euclidean distance to all other images in the set.

To visualize the distance from each image to each class in the set, we created a concentric circle graph in which each circle's radius is the distance from the current image to a class. This is done by calculating the average distance from the current image to the images of a class and using it as the radius of a circle centered on the current image, which is plotted at the origin (0,0). Figure A2 in the appendix displays this feature in detail.

To view how images are clustered together in the subspace, each image must be plotted while maintaining the distances between points. This is done using principal coordinate analysis. We used distpcoa [6] to read in the distance matrix of a testing session, which is formed by concatenating the contents of all the distance files. The output is a listing of two- and three-dimensional coordinates for each image. A simple proof of the method is shown in Figure 2.

Distance matrix:

      a   b   c
  a   0   3   4
  b   3   0   5
  c   4   5   0

Verifying the original distances from the 2D coordinates (x, y) generated by distpcoa for a, b and c, with d(p,q) = sqrt((x_p - x_q)^2 + (y_p - y_q)^2):

  d(a,b) = 3
  d(b,c) = 5
  d(a,c) = 4

Figure 2. An example of using the distance matrix of a triangle with distpcoa to find a set of points in the subspace while maintaining the original distances between points.

After the two- and three-dimensional coordinates are generated, we plot them on 2D and 3D plots, as shown in Figure 3.
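The triangle check of Figure 2 can be reproduced with a short sketch of principal coordinate analysis (classical multidimensional scaling). This is a stand-in written for illustration, not the distpcoa program itself; the function name `principal_coordinates` is an assumption of the example. It double-centers the squared distances and reads the coordinates off the leading eigenvectors.

```python
import numpy as np


def principal_coordinates(D, k=2):
    """Embed n points in k dimensions from an (n, n) distance matrix D."""
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n       # centering matrix
    B = -0.5 * J @ (D ** 2) @ J               # double-centered squared distances
    evals, evecs = np.linalg.eigh(B)
    order = np.argsort(evals)[::-1][:k]       # k largest eigenvalues
    lam = np.clip(evals[order], 0.0, None)    # guard against tiny negatives
    return evecs[:, order] * np.sqrt(lam)     # (n, k) coordinates


# Distance matrix of the 3-4-5 triangle a, b, c from Figure 2.
D = np.array([[0.0, 3.0, 4.0],
              [3.0, 0.0, 5.0],
              [4.0, 5.0, 0.0]])
P = principal_coordinates(D, k=2)

# Pairwise distances between the recovered points should match D.
recovered = np.linalg.norm(P[:, None, :] - P[None, :, :], axis=-1)
print(np.round(recovered, 6))
```

The recovered coordinates are only determined up to rotation and reflection, so they need not match the values printed by distpcoa, but the pairwise distances are preserved.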

Figure 3. 2D plot of 400 images grouped into 60 classes. Some clusters are clearly separated, while others overlap.

Figure A3 in the appendix shows the user interface for this graph and a 3D graph. We decided to include a webcam interface to allow anyone to test their own face with the LDA analysis system. Any Windows-compatible webcam can be used. After capturing a picture, the user manually sets the eye coordinates on the face and adds them to an eye coordinates list. Figure A4 in the appendix shows the webcam user interface. The user then normalizes the image so that it can be accepted for training and testing (Figure A5 in the appendix).

4. Experimental Results

We trained and tested 20, 40, and 60 classes from the FERET database. Each image is 130x150 pixels, making N = 19500, which is the total number of pixels in the image and the dimensionality of the image space. Table 1 shows our analysis of the LDA algorithm with the frontal images from the FERET dataset. As the number of classes grows, the classification of test images in the trained subspace improves, as shown in Figure 4.

[Figure 4: chart titled "Correct Test Image Classifying Success Rate"; vertical axis from 70% to 84%, horizontal axis "Classes".]

Figure 4. Increasing the number of classes for training and testing results in better recognition.

5. Conclusion

The feature extraction method utilized here is the well-known linear discriminant analysis or Fisherfaces approach. This method is applied to the FERET dataset and analyzed on 20, 40 and 60 classes. The results, discussed in the experimental section, show that the accuracy increases as the number of classes increases. Only frontal images were used; thus, the experiment can be extended to different poses. The presence of glasses and various expressions are considered.
The performance is analyzed on the dataset. The algorithm has the limitation that significant discriminatory information is lost in the PCA step. Alternative methods such as DF-LDA can be applied to avoid this problem and achieve better classification accuracy.

Table 1. The number of classes and their corresponding number of training/test images and the success rate (%).

classes | trained images | test images | correct | wrong | total | % success
        | 201            | 41          | 32      | 9     | 41    | 78%
60      | 305            | 90          |         |       |       |

6. References

[1] Juwei Lu, Konstantinos N. Plataniotis, and Anastasios N. Venetsanopoulos, "Face Recognition Using LDA-Based Algorithms," IEEE Transactions on Neural Networks, vol. 14, no. 1, pp , January
[2] Peter N. Belhumeur, Joao P. Hespanha, and David J. Kriegman, "Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 19, no. 7, pp , July
[3] Anil K. Jain, Ruud Bolle, and Sharath Pankanti, Biometrics: Personal Identification in Networked Society,
[4] Dr. Pavlidis, COSC 6397 Lectures, University of Houston, Spring
[5] Evaluation of Face Recognition Algorithms, Colorado State University.
[6] Pierre Legendre, DistPCoA Algorithm. o/distpcoa.html

Screenshot Appendix

Figure A1. In the training step, the user can visually select the files to train on and the files to leave out for testing. Above, the highlighted files are the ones selected for training; the others are left for testing.

Figure A2. Image-to-class distance graph. The current image is represented by the letter x on the graph. Each concentric circle is the normalized class distance from the current image. In this example, image 90009ahmed_2 is represented by x and it is correctly closest to class 9. The last class listed in the second list box confirms this, as its distance is the minimum of all classes listed. The highlighted images (test images) are found by taking the symmetric difference of the training SRT file listing and the testing SRT file listing. The training file lists the user-selected files for training, and the testing file lists all files in the image set. Taking the symmetric difference therefore leaves the test images.
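The two computations described in this caption can be illustrated with a small sketch. The image names, class identifiers, distances, and the assumed SRT layout (one class per line, image names separated by whitespace) are all made up for the example, and the helper names `read_listing` and `class_distances` are hypothetical rather than taken from our tool.

```python
def read_listing(text):
    """Return the set of image names in an SRT-style listing.

    Assumed format: one class per line, image names separated by whitespace.
    """
    return {name for line in text.splitlines() for name in line.split()}


def class_distances(distances, classes):
    """Average distance from the current image to the images of each class.

    distances : dict mapping image name -> distance from the current image.
    classes   : dict mapping class id -> list of image names in that class.
    """
    return {cid: sum(distances[n] for n in names) / len(names)
            for cid, names in classes.items()}


# Hypothetical listings: the testing file contains every image in the set.
training_srt = "img_a1 img_a2\nimg_b1 img_b2\n"
testing_srt = "img_a1 img_a2 img_a3\nimg_b1 img_b2 img_b3\n"

# Symmetric difference: names that appear in exactly one of the two listings.
test_images = read_listing(training_srt) ^ read_listing(testing_srt)
print(sorted(test_images))        # ['img_a3', 'img_b3']

# Hypothetical distances from the current image img_a3 to the trained images.
distances = {"img_a1": 1.0, "img_a2": 3.0, "img_b1": 6.0, "img_b2": 8.0}
classes = {"a": ["img_a1", "img_a2"], "b": ["img_b1", "img_b2"]}
radii = class_distances(distances, classes)   # one circle radius per class
closest = min(radii, key=radii.get)
print(closest, radii)             # a {'a': 2.0, 'b': 7.0}
```

Because every training image also appears in the testing listing, the symmetric difference here is the same as the set difference "testing minus training".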

Figure A3. 2D and 3D plots for a 60-class training/testing session. These allow the user to see the clustering of images into classes and to visualize how well the algorithm maximizes the between-class scatter and minimizes the within-class scatter.

Figure A4. The webcam feature allows anyone with a webcam to capture their face and create their own eye-coordinate files, which are later used for normalizing. The eye coordinates are selected by clicking on the right and left eye in the picture.

Figure A5. After capturing pictures, it is necessary to normalize them to remove the background and noise and to focus only on the face, as shown above.