Open World Face Recognition with Credibility and Confidence Measures

Fayin Li and Harry Wechsler
Department of Computer Science
George Mason University
Fairfax, VA 22030
{fli, wechsler}@cs.gmu.edu

Abstract. This paper describes a novel framework for the Open World face recognition problem, where one has to provide for the Reject option. Based upon algorithmic randomness and transduction, a particular form of induction, we describe the TCM-kNN (Transduction Confidence Machine k-Nearest Neighbor) algorithm for Open World face recognition. The proposed algorithm performs much better than PCA and is comparable with Fisherfaces. In addition to recognition and rejection, the algorithm can assign credibility ("likelihood") and confidence ("lack of ambiguity") measures to the identification decisions taken.

1. Introduction

The choices facing face recognition systems should include: ACCEPT, REJECT ("is not here"), and AMBIGUITY ("need more information"). The inclusion of the REJECT option, which corresponds to an open world of (face recognition) hypotheses, adds complexity to the whole process and makes face recognition much harder compared to the more traditional closed world biometric systems available today. In addition to seeking how similar or close some probe face image is to each subject in the face gallery set, one needs some measure of confidence when making any identification decision. This paper describes a novel methodology for handling an open world of hypotheses, including the REJECT option, and provides the means to associate credibility and confidence measures with each of the decisions made regarding HumanID. The proposed methodology, based upon randomness concepts and transductive learning, is formally validated on challenging (varying illumination) and large overlapping FERET data sets.

2. Randomness and p-values

Confidence measures can be based upon universal tests for randomness, or their approximation. A Martin-Löf randomness deficiency (Li and Vitanyi, 1997) based on such tests is a universal version of the standard statistical notion of p-values.
Universal tests for randomness are not computable and hence one has to approximate the p-values using non-universal tests. We use the p-value construction in Proedrou et al. (2001) to define the quality of information. The assumption used is that data items are independent and are produced by the same stochastic mechanism. Given a sequence of proximities (distances) between the given training (gallery) set and an unknown sample (test) probe, one quantifies to what extent the (classification) decision taken is reliable, i.e., non-random. Towards that end one defines the strangeness of the unknown sample probe i with putative label y in relation to the rest of the training set exemplars as:

$$\alpha_i = \frac{\sum_{j=1}^{k} D_{ij}^{y}}{\sum_{j=1}^{k} D_{ij}^{\neg y}}$$

The strangeness measure $\alpha_i$ is the ratio of the sum of the k nearest distances D from the same class (y) to the sum of the k nearest distances from all other classes ($\neg y$). The strangeness of an exemplar increases when the distance from the exemplars of the same class becomes larger and when the distance from the other classes becomes smaller. A valid randomness test (Nouretdinov et al., 2001) then defines the p-value measure of a test exemplar with a possible classification assigned to it as

$$p = \frac{f(\alpha_1) + f(\alpha_2) + \cdots + f(\alpha_m) + f(\alpha_{new})}{(m+1)\, f(\alpha_{new})}$$

where f is some monotonic non-decreasing function with f(0) = 0, e.g., $f(\alpha) = \alpha$, m is the number of training examples, and $\alpha_{new}$ is the strangeness measure of a new potential test probe exemplar $c_{new}$. An alternative definition available for the p-value is

$$p(c_{new}) = \frac{\#\{i : \alpha_i \ge \alpha_{new}\}}{m+1}$$

Using the p-value one can now predict the class membership as the one that yields the largest p-value, which is defined as the credibility of the assignment made. The associated confidence measure, which is one minus the 2nd largest p-value, indicates how close the first two assignments are.
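As a minimal sketch of the quantities defined above (not the authors' implementation), the following Python fragment computes the strangeness ratio, the counting form of the p-value, and the credibility and confidence derived from per-class p-values. The function names are hypothetical. The count is assumed to include the new exemplar itself, and the recalculation of the training strangeness values that a full transductive procedure performs is omitted.

```python
def strangeness(same_class_dists, other_class_dists, k=1):
    """Sum of the k smallest same-class distances divided by the
    sum of the k smallest distances to all other classes.
    Assumes the denominator is non-zero."""
    numerator = sum(sorted(same_class_dists)[:k])
    denominator = sum(sorted(other_class_dists)[:k])
    return numerator / denominator


def p_value(train_alphas, alpha_new):
    """Counting form: share of the m + 1 strangeness values that are
    at least as large as alpha_new.
    Assumption: the new exemplar is counted among them (the "+ 1")."""
    m = len(train_alphas)
    at_least_as_strange = sum(1 for a in train_alphas if a >= alpha_new) + 1
    return at_least_as_strange / (m + 1)


def credibility_and_confidence(p_values):
    """p_values maps each putative label to its p-value.
    Returns (predicted label, credibility, confidence), where
    credibility is the largest p-value and confidence is one minus
    the second largest p-value."""
    ranked = sorted(p_values.items(), key=lambda item: item[1], reverse=True)
    best_label, best_p = ranked[0]
    second_p = ranked[1][1] if len(ranked) > 1 else 0.0
    return best_label, best_p, 1.0 - second_p
```

For instance, `credibility_and_confidence({"a": 0.8, "b": 0.2, "c": 0.4})` predicts label "a" with credibility 0.8; the confidence is one minus 0.4, the second largest p-value.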
The confidence value indicates how improbable the classifications other than the predicted classification are, and the credibility value shows how suitable the training set is for the classification of that testing example. One can compare the top ranked assignments, rather than only the first two assignments, and define additional confidence criteria. Both the credibility and confidence measures allow the face recognition module to adapt to existing conditions and act accordingly.

3. Transduction Confidence Machine (TCM)-kNN

Another form of learning, beyond induction, is transduction. Given an unlabeled validation set, in addition to the training set, the task now is to estimate the class for each unlabeled pattern in order to construct the best classifier rule for both the training and validation sets.

TCM-kNN Algorithm

    for i = 1 to m
        Find and store $D_i^{y}$ and $D_i^{\neg y}$
    Calculate the alpha strangeness values for all the training exemplars
    Calculate the similarity dist vector as the distances of the new exemplar from all the training exemplars
    for j = 1 to c do
        for every training exemplar t classified as j do
            if $D_{ti}^{j}$ > dist(t), i = 1, ..., k, recalculate the alpha value of exemplar t
        for every training exemplar t classified as non-j do
            if $D_{ti}^{\neg j}$ > dist(t), i = 1, ..., k, recalculate the alpha value of exemplar t
        Calculate alpha value for the new exemplar classified as j
        Calculate p-value for the new exemplar classified as j
    Predict the class with the largest p-value
    Output as confidence one minus the 2nd largest p-value
    Output as credibility the largest p-value

The constraints on the (geometric) layout of the learning space and the search for improved classification margins are addressed in this paper using algorithmic randomness (Vovk et al., 1999), universal measures of confidence randomness (Vovk et al., 1999), and transductive confidence (learning) machines (TCM) (Proedrou et al., 2001). The experimental data presented later on, which validates our approach, is based on TCM-kNN, which is an augmented TCM using locality-based evidence, e.g., the k-nearest neighbors (kNN) concept.

The similarity distances used are shown next. Given two n-dimensional vectors $X, Y \in \mathbb{R}^n$, the distance measures used are defined as follows:

$$d_{L_1}(X, Y) = \sum_{i=1}^{n} |X_i - Y_i|$$

$$d_{L_2}(X, Y) = (X - Y)^T (X - Y)$$

$$d_{cos}(X, Y) = -\frac{X^T Y}{\|X\| \, \|Y\|}$$

$$d_{Dice}(X, Y) = -\frac{2 X^T Y}{\|X\|^2 + \|Y\|^2}$$

$$d_{Jaccard}(X, Y) = -\frac{X^T Y}{\|X\|^2 + \|Y\|^2 - X^T Y}$$

$$d_{Mah+L_2}(X, Y) = (X - Y)^T \Sigma^{-1} (X - Y)$$

$$d_{Mah+cos}(X, Y) = -\frac{X^T \Sigma^{-1} Y}{\|X\| \, \|Y\|}$$

where $\Sigma$ is the scatter matrix of the training data. For PCA, $\Sigma$ is diagonal and the diagonal elements are the (eigenvalues) variances of the corresponding components. The Mahalanobis + L1 distance, defined only for PCA, is

$$d_{Mah+L_1}(X, Y) = \sum_{i=1}^{n} \frac{|X_i - Y_i|}{\sqrt{\lambda_i}}$$

where $\lambda_i$ is the i-th diagonal element of $\Sigma$.

4. Data Collection

Figure 1. Face Images

Our data set is drawn from the FERET database, which has become a de facto standard for evaluating face recognition technologies (Phillips et al., 1998). The data set consists of 600 FERET frontal face images corresponding to 200 subjects, which were acquired under variable illumination and facial expressions. Each subject has three images of size 256x384 with 256 gray scale levels. Face image normalization is carried out as follows: first, the centers of the eyes of an image are manually detected, then rotation and scaling transformations align the centers of the eyes to predefined locations, and finally, the face image is cropped to the size of 128x128 to extract the facial region. The extracted facial region is further normalized to zero mean and unit variance. Fig. 1 shows some exemplar images used in our experiments that are already cropped to the size of 128x128. Each column in Fig. 1 corresponds to one subject. Note that for each subject, two images are randomly chosen for training, while the remaining image (unseen during training) is used for testing. The normalized face images are processed to yield 400 PCA coefficients, according to eqs. 7-9 from Liu and Wechsler (2002), and 200 Fisherfaces using FLD (Fisher Linear Discriminant), according to eqs. 10-12 from Liu and Wechsler (2002), on a reduced 200-dimensional PCA space.

5. Open World Face Recognition Algorithms

Open World TCM-kNN Algorithm

    Calculate the alpha values for all the training exemplars
    for i = 1 to c do
        for every training exemplar t classified as i do
            for j = 1 to c and j != i do
                Assume t is classified as j, which should be rejected
                Recalculate the alpha value for all the training exemplars classified as non-j
                Calculate alpha value for the exemplar t classified as j
                Calculate p-value for the exemplar t classified as j
            Calculate P_max, P_mean and P_stdev (standard deviation) for the p-values of exemplar t
            Calculate the PSR value for exemplar t: PSR = (P_max - P_mean) / P_stdev
    Calculate the mean and stdev (standard deviation) for all the PSR values
    Calculate mean + 3*stdev as the threshold for rejection
    Calculate the distances of the probe exemplar from all the training exemplars
    for i = 1 to c do
        Calculate alpha value for the probe exemplar classified as i
        Calculate p-value for the probe exemplar classified as i
    Calculate the largest p-value max for the probe exemplar
    Calculate the mean and stdev for the probe p-values without max
    Calculate the PSR value for the probe exemplar: PSR = (max - mean) / stdev
    Reject the probe exemplar if its PSR is less than or equal to the threshold
    Otherwise predict the class with the largest p-value
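The PSR-based rejection rule in the algorithm above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' code: the function names are hypothetical, the sample standard deviation is used, the mean and standard deviation are taken over the p-values with the largest one excluded, and at least three p-values with non-zero spread are assumed.

```python
import statistics


def psr(p_values):
    """PSR = (max - mean) / stdev.
    Assumption: mean and stdev are computed over the p-values that
    remain after the largest one is set aside."""
    ordered = sorted(p_values, reverse=True)
    peak, rest = ordered[0], ordered[1:]
    if len(rest) < 2:
        raise ValueError("need at least three p-values")
    return (peak - statistics.mean(rest)) / statistics.stdev(rest)


def rejection_threshold(training_psrs):
    """Threshold = mean + 3 * stdev of the PSR values obtained from
    the training exemplars."""
    return statistics.mean(training_psrs) + 3 * statistics.stdev(training_psrs)


def classify_open_world(probe_p_values, threshold):
    """probe_p_values maps each class label to the probe's p-value.
    Returns None (reject) when the probe PSR is less than or equal to
    the threshold, otherwise the label with the largest p-value."""
    score = psr(list(probe_p_values.values()))
    if score <= threshold:
        return None
    return max(probe_p_values, key=probe_p_values.get)
```

A probe whose largest p-value stands well above the remaining ones yields a large PSR and is assigned a class; a probe with a flat p-value profile falls at or below the threshold and is rejected.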

Open World {PCA, Fisherfaces} Algorithm

    for i = 1 to m
        Find the maximum intra (within-class) distance and the minimum inter (between-class) distance
    Calculate the mean and standard deviation for all maximum intra-distances and minimum inter-distances: mean_intra, mean_inter, stdev_intra and stdev_inter
    Calculate mean_intra + 3*stdev_intra as the lower bound of the threshold
    Calculate mean_inter - 3*stdev_inter as the upper bound of the threshold
    Choose the threshold based on the lower and upper bound
    Calculate the distances of the probe exemplar from all the training exemplars
    Find the minimum distance dist_min of the probe exemplar
    If dist_min >= threshold, then reject the probe exemplar
    Else predict the class with the minimum distance dist_min

6. Experimental Results

We found that the best similarity distances for PCA and Fisherfaces are {Mahalanobis + (L1, L2 or cos)} and {cosine, Dice, Jaccard, (Mahalanobis + cos)}, respectively. Those distances are used in our experiments. The experiments reported were carried out on the data described in Sect. 4. Both the gallery and the probe sets consist of 100 subjects, and the overlap between the two sets is on average 50 subjects. The recognition rate is the percentage of the subjects whose probe is correctly recognized or rejected.

Figure 2. The Recognition Rate vs Threshold: PCA (Left) and Fisherfaces (Right)

Open World (PCA and Fisherfaces) Face Recognition

The data was randomly chosen, the same experiment was run 100 times, and Fig. 2 shows the mean recognition rates for different thresholds. The distance measurements for PCA and Fisherfaces, which yield the best results, are Mahalanobis + L2 and cosine, respectively. Fig. 2 shows that the best recognition rate for PCA is 77% if the threshold can be chosen correctly, while for Fisherfaces it is 91% if the threshold is chosen as its upper bound. The standard deviations of the best recognition rates for PCA and Fisherfaces are 3.7% and 2.4%, respectively.

Figure 3. Test p-value distribution of rejection, correct and false recognition using PCA with (Mahalanobis + L2) distance (Left) and Fisherfaces with cosine distance (Right)

Figure 4. The PSR value histogram: PCA with Mahalanobis + L1 distance (Left) and Fisherfaces with cos distance (Right)

TCM-kNN

The data used are either the (400) PCA or (200) Fisherface components, and k = 1. The threshold is computed according to the algorithm described in Sect. 5 based on the training exemplars. The p-value distributions shown in Fig. 3 indicate that the test PSR values are useful for rejection and recognition. Recognition is driven by large PSR values. The best recognition rate using PCA components is 87.87% using the Mahalanobis + L1 distance, and its standard deviation is 3.0%. The threshold is 6.57, computed from the PSR histogram shown in Fig. 4 (left). The best recognition rate using Fisherface components is 90% using the cosine distance, and its standard deviation is 2.7%. The threshold is 9.20, computed from the PSR histogram shown in Fig. 4 (right). TCM-kNN provides additional information regarding the credibility and confidence in the recognition decision taken. The corresponding 2D distribution for correct and false recognition is shown in Fig. 5, where one can see that false recognition, for both the PCA and Fisherfaces components, shows up at low values.

Figure 5. Distribution of Confidence and Credibility: PCA (Left) and Fisherfaces (Right)

7. Conclusions

We introduced in this paper a new face recognition algorithm suitable for open world face recognition. The feasibility and usefulness of the algorithm have been shown on varying illumination and facial expression images drawn from FERET. Furthermore, both credibility and confidence measures are provided for both the recognition and rejection decisions. We plan to use those measures for optimal training of the face recognition system, such that the composition and size of the training set are determined using active rather than random selection.

8. References

1. A. Gammerman, V. Vovk, and V. Vapnik (1998), Learning by Transduction, in Uncertainty in Artificial Intelligence, 148-155.
2. M. Li and P. Vitanyi (1997), An Introduction to Kolmogorov Complexity and Its Applications, 2nd ed., Springer-Verlag.
3. C. Liu and H. Wechsler (2002), Gabor Feature Based Classification Using the Enhanced Fisher Linear Discriminant Model for Face Recognition, IEEE Trans. on Image Processing, Vol. 11, No. 4, 467-476.
4. I. Nouretdinov, T. Melluish, and V. Vovk (2001), Ridge Regression Confidence Machine, Proc. 17th Int. Conf. on Machine Learning.
5. P. J. Phillips, H. Wechsler, J. Huang, and P. Rauss (1998), The FERET Database and Evaluation Procedure for Face Recognition Algorithms, Image and Vision Computing, Vol. 16, No. 5, 295-306.
6. K. Proedrou, I. Nouretdinov, V. Vovk, and A. Gammerman (2001), Transductive Confidence Machines for Pattern Recognition, TR CLRC-TR-01-02, Royal Holloway University of London.
7. V. Vovk, A. Gammerman, and C. Saunders (1999), Machine Learning Applications of Algorithmic Randomness, Proc. 16th Int. Conf. on Machine Learning.