Challenges in Face Recognition Biometrics
Sujeewa Alwis, Cybula Ltd
Background; Techniques and issues; Demo; Questions
Why use face?
Everyone has a fairly unique face.
It can be captured without user cooperation (passive).
Application Modes
Verification ("Are you who you say you are?"): the system captures a new biometric sample and the person submits an ID. A yes/no answer indicates the authentication result.
Identification ("Who are you?"): the system captures a new biometric sample, searches the database and presents the top n most similar matches. A human operator may be needed to make the final decision.
Watch-list ("Are we looking for you?"): the system captures a new biometric sample and triggers an alarm only if that person is in the database. This is similar to identification, but uses an additional threshold to identify a hit.
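The three modes differ mainly in what the new sample is compared against and how the result is reported. A minimal sketch follows, assuming each face has already been reduced to a numeric feature vector and that a smaller Euclidean distance means a better match. The function names, the gallery layout and the threshold values are illustrative assumptions, not part of the original slides.

```python
import math

def distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def verify(sample, claimed_id, gallery, threshold):
    """Verification: compare the sample with the claimed identity only."""
    return distance(sample, gallery[claimed_id]) <= threshold

def identify(sample, gallery, n):
    """Identification: rank every enrolled identity and return the top n."""
    ranked = sorted(gallery, key=lambda pid: distance(sample, gallery[pid]))
    return ranked[:n]

def watch_list(sample, gallery, threshold):
    """Watch-list: report the best match only if it passes the threshold."""
    best = identify(sample, gallery, 1)
    if best and distance(sample, gallery[best[0]]) <= threshold:
        return best[0]
    return None  # no alarm

# Hypothetical enrolled templates (2-D vectors purely for illustration).
gallery = {"alice": [0.0, 0.0], "bob": [3.0, 4.0], "carol": [10.0, 0.0]}
probe = [0.5, 0.0]

print(verify(probe, "alice", gallery, threshold=1.0))
print(identify(probe, gallery, n=2))
print(watch_list([50.0, 50.0], gallery, threshold=1.0))
```

Note that the watch-list function is identification with n = 1 plus the extra threshold, which mirrors the description on the slide.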
Iris
Advantages: highly unique (five different patterns even in two identical twins); stable after the first year of life.
Disadvantages: needs user cooperation; difficulties during enrolment.
The most successful technique is based on projecting the iris pattern onto Gabor wavelets (Daugman, 1993). The Gabor coefficients represent the biometric template. Commercialised by Iridian Technologies.
Fingerprints
Advantages: availability of large fingerprint databases.
Disadvantages: associated with crime control/investigation; needs user cooperation; the capture surface must be kept clean and germ-free, so it is not suitable for high-throughput applications.
The template represents minutiae points in a map.
Cross Match Technologies is one of the companies that sell fingerprint recognition systems.
Gait recognition; Palm print recognition; Voice recognition
Combinations
Face + Iris (Wang, 2003)
Face + Ear (Chang, 2003)
Face + Gait (Shakhnarovich, 2002)
Face + Palm print + Fingerprint (Ross, 2001)
Face + Voice + Lip movement (Frischholz, 2000)
Face + Voice (Kittler, 1997)
Face Representation: 2D vs. 3D
2D advantages: availability of large 2D image collections; capture devices are currently cheaper.
3D advantages: can deal with pose variations if the cameras can capture the full face; less sensitive to lighting variations; better accuracy during recognition (experimental results from Notre Dame University, Chang et al. 2003).
Face Representation: 2D vs. 3D (contd.)
2D disadvantages: cannot handle pose variations; sensitive to lighting variations, shadows etc.
3D disadvantages: cameras are still expensive; it takes time to reconstruct models; unavailability of large collections of 3D data (UofY/Cybula data set, U of Notre Dame data set).
Techniques
Appearance based techniques: Eigen faces and Fisher faces.
Feature based techniques: distances between landmark points such as eyes, nose and mouth; graph matching techniques.
Model based techniques: active appearance/shape models; fitting morphable models.
Eigen Analysis
One of the most popular methods for face recognition.
The central argument is that faces contain many features: some are common to all faces, while others carry highly discriminatory information. Faces therefore have to be mapped to a different feature space that retains the discriminatory information, which calls for a dimensionality reduction method.
Eigen analysis identifies the dimensions along which the data shows high variance, so it can be used to extract principal components.
A simple example
$y = P x$
where $y$ is the coordinates in the new space, $x$ is the coordinates in the previous space, and $P$ is the projection matrix.
[Figure: a face]
Eigen Faces
Projections of a face template along different principal components.
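The projection y = P x can be illustrated with a small sketch. It assumes each face is a flat vector of pixel values, that the data are mean-centred before projection (so the code computes y = P (x - mean)), and that only the leading principal component is kept. The power-iteration routine and the two-pixel toy "images" are illustrative choices, not the method used in the talk; a real system would use a numerical library and many components.

```python
import math

def mean_vector(samples):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(samples)
    return [sum(column) / n for column in zip(*samples)]

def covariance_matrix(samples, mean):
    """Sample covariance matrix of the mean-centred data."""
    d, n = len(mean), len(samples)
    cov = [[0.0] * d for _ in range(d)]
    for sample in samples:
        centred = [v - m for v, m in zip(sample, mean)]
        for i in range(d):
            for j in range(d):
                cov[i][j] += centred[i] * centred[j] / (n - 1)
    return cov

def leading_component(cov, iterations=200):
    """Power iteration: converges to the eigenvector with the largest eigenvalue.

    Assumes the all-ones start vector is not orthogonal to that eigenvector.
    """
    d = len(cov)
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def project(P, x, mean):
    """y = P (x - mean): coordinates of x along the rows of P."""
    centred = [v - m for v, m in zip(x, mean)]
    return [sum(p * c for p, c in zip(row, centred)) for row in P]

# Toy "faces": 2-pixel images that vary along a single direction.
samples = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
mu = mean_vector(samples)
P = [leading_component(covariance_matrix(samples, mu))]  # one principal component
y = project(P, [3.5, 7.0], mu)
print(P, y)
```

Each row of P plays the role of one eigenface, and y is the low-dimensional template that would be stored and compared.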
Previous Work
Using 2D images: Sirovich and Kirby (1987), Turk and Pentland (1991).
Using 3D images: Heseltine, Pears and Austin (2003), Chang, Bowyer and Flynn (2003).
Linear Discriminant Analysis
[Figure: samples from Subject A, Subject B and Subject C]
The aim is to minimise within-class scatter and maximise between-class separation. In other words, maximise the ratio of between-class variance to within-class variance:
Maximise $S_B S_w^{-1}$
where $S_B$ is the between-class scatter matrix and $S_w$ is the within-class scatter matrix.
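For intuition, the criterion can be worked through in its simplest form. The sketch below is a simplification of the slide: it assumes only two subjects and two-dimensional features, in which case the direction that maximises the between-class to within-class ratio is proportional to S_w^{-1} (m_a - m_b). The function names and sample points are hypothetical.

```python
def mean(points):
    """Mean of a list of 2-D points."""
    n = len(points)
    return [sum(p[0] for p in points) / n, sum(p[1] for p in points) / n]

def scatter(points, m):
    """2x2 scatter matrix: sum of outer products of deviations from m."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        d = [p[0] - m[0], p[1] - m[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def within_class_scatter(class_a, class_b):
    """S_w: sum of the per-class scatter matrices."""
    s_a = scatter(class_a, mean(class_a))
    s_b = scatter(class_b, mean(class_b))
    return [[s_a[i][j] + s_b[i][j] for j in range(2)] for i in range(2)]

def fisher_direction(class_a, class_b):
    """w = S_w^{-1} (m_a - m_b), the two-class Fisher discriminant direction."""
    s_w = within_class_scatter(class_a, class_b)
    det = s_w[0][0] * s_w[1][1] - s_w[0][1] * s_w[1][0]  # assumes S_w is invertible
    inv = [[s_w[1][1] / det, -s_w[0][1] / det],
           [-s_w[1][0] / det, s_w[0][0] / det]]
    m_a, m_b = mean(class_a), mean(class_b)
    diff = [m_a[0] - m_b[0], m_a[1] - m_b[1]]
    return [inv[0][0] * diff[0] + inv[0][1] * diff[1],
            inv[1][0] * diff[0] + inv[1][1] * diff[1]]

def fisher_ratio(w, class_a, class_b):
    """Between-class over within-class scatter along w (up to a constant factor)."""
    m_a, m_b = mean(class_a), mean(class_b)
    between = (w[0] * (m_a[0] - m_b[0]) + w[1] * (m_a[1] - m_b[1])) ** 2
    s_w = within_class_scatter(class_a, class_b)
    within = sum(w[i] * s_w[i][j] * w[j] for i in range(2) for j in range(2))
    return between / within

# Hypothetical feature points for two subjects, separated along the first axis.
subject_a = [(1.0, 1.0), (2.0, 1.0), (1.0, 2.0), (2.0, 2.0)]
subject_b = [(5.0, 1.0), (6.0, 1.0), (5.0, 2.0), (6.0, 2.0)]

w = fisher_direction(subject_a, subject_b)
print(w, fisher_ratio(w, subject_a, subject_b))
print(fisher_ratio([0.0, 1.0], subject_a, subject_b))  # a poor direction for comparison
```

With more than two subjects, the usual approach is to take the leading eigenvectors of S_w^{-1} S_B instead of a single direction.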
Previous Work
Using 2D images: Belhumeur, Hespanha and Kriegman (1997), Etemad and Chellappa (1996), Liu and Wechsler (1998), Kittler (1999).
Using 3D images: Heseltine, Pears and Austin (2004).
Is LDA always better than PCA?
[Figure: PCA vs. LDA, labelled D_PCA and D_LDA]
Martinez and Kak (IEEE PAMI, 2001) present experimental data showing that LDA does not always outperform PCA, particularly when the number of samples per class is small.
Feature based matching techniques
One of the earliest techniques uses the distances between landmarks such as the eyes, nose and mouth.
This technique may not be robust to pose variations, and it may be difficult to identify the required feature points accurately.
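A landmark-distance template can be sketched in a few lines. The example assumes the landmark coordinates are already available as 2D pixel positions and normalises every distance by the inter-eye distance, which is an illustrative choice rather than something stated in the slides. The landmark names and coordinates are hypothetical.

```python
import math
from itertools import combinations

def landmark_features(landmarks):
    """Pairwise landmark distances, normalised by the inter-eye distance.

    landmarks maps a landmark name to an (x, y) position. The normalisation
    removes the effect of image scale only; it does not compensate for pose.
    """
    eye_distance = math.dist(landmarks["left_eye"], landmarks["right_eye"])
    names = sorted(landmarks)  # fixed order so feature vectors are comparable
    return [math.dist(landmarks[a], landmarks[b]) / eye_distance
            for a, b in combinations(names, 2)]

# Hypothetical landmark positions in pixels.
face = {"left_eye": (30.0, 40.0), "right_eye": (70.0, 40.0),
        "nose": (50.0, 60.0), "mouth": (50.0, 80.0)}
# The same face captured at twice the resolution.
face_scaled = {name: (2 * x, 2 * y) for name, (x, y) in face.items()}

features = landmark_features(face)
features_scaled = landmark_features(face_scaled)
print(features)
```

The two feature vectors agree because only scale changed. Rotating the head out of the image plane would change the projected distances, which is the pose sensitivity noted above.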
Cybula approach: 3D graph matching
A 3D mesh is used to identify a set of significant points: we identify high-curvature points on face profiles.
These points and the relationships between them are represented in a graph.
Matching uses a graph matching framework called Relaxation by Elimination (RBE), developed at York.
Elastic Bunch Graph Matching
But we are not the only people who have applied graph matching to faces! Wiskott, Fellous, Kruger and Malsburg (1999) used graph matching for 2D face recognition.
Each landmark point (eyes, mouth etc.) is represented by a stack of wavelet responses; these become the nodes of the graph. Distances are represented on the edges.
The graph for a new image can be fitted by scaling, rotating and translating a standard model graph.
The dissimilarity measure is a straightforward comparison between graphs.
Model based recognition
Active appearance models (Cootes, Edwards and Taylor, 2001)
A statistical appearance model is constructed by combining a shape model and a texture model.
The shape model is constructed by identifying the positions of landmark points; the texture model represents grey-level intensities.
Model parameters are identified by applying Eigen analysis.
Recognition is an iterative process in which model parameters are adjusted to obtain the best match.
3D morphable model (Blanz and Vetter, 2003)
A set of laser-scanned 3D image models (100 males and 100 females) is used to construct the morphable 3D model.
Shape is represented by 3D co-ordinates, while texture is represented by colour. Model parameters are calculated by applying Eigen analysis.
The 3D model is deformed to obtain the best fit between its 2D projection and the new 2D image. The new model parameters are then used to describe the new image, so this can be seen as a 2D to 3D mapping.
The optimisation process involves finding optimum values for the model parameters as well as the scene parameters (pose, focal length of the camera, light intensity, colour and direction).
One remaining issue: how to keep the data collections updated?
Faces change as people age, and the change can depend on both internal and external factors.
Lanitis, Taylor and Cootes (2002) have extended their work on active appearance models to predict the age of an unseen subject and then to simulate or eliminate age effects.
Using training data, they build a weighted, person-specific aging function that predicts a person's age from appearance as well as external factors such as lifestyle.
Age simulation can be done by changing the model parameters.
Evaluation
False acceptance rate (FAR): the rate at which a wrong person is accepted.
False rejection rate (FRR): the rate at which the correct person is rejected.
Equal error rate (EER): the value at which FAR and FRR become equal.
Time to verify.
Time to capture/enrol.
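These error rates can be computed directly from match scores. The sketch below assumes similarity scores (higher means more alike), accepts a comparison when the score reaches the threshold, and approximates the equal error rate by scanning the observed scores as candidate thresholds. The score lists are made-up illustrative values, not results from any system.

```python
def far_frr(genuine, impostor, threshold):
    """Return (FAR, FRR) at a threshold; a score >= threshold is accepted.

    genuine:  scores from comparisons of the same person
    impostor: scores from comparisons of different people
    """
    far = sum(s >= threshold for s in impostor) / len(impostor)
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr

def equal_error_rate(genuine, impostor):
    """Approximate EER: the threshold where |FAR - FRR| is smallest."""
    def gap(t):
        far, frr = far_frr(genuine, impostor, t)
        return abs(far - frr)
    candidates = sorted(set(genuine) | set(impostor))
    best = min(candidates, key=gap)
    far, frr = far_frr(genuine, impostor, best)
    return best, (far + frr) / 2

# Made-up similarity scores for illustration only.
genuine_scores = [0.9, 0.8, 0.7, 0.4]
impostor_scores = [0.1, 0.2, 0.3, 0.6]

print(far_frr(genuine_scores, impostor_scores, threshold=0.5))
print(equal_error_rate(genuine_scores, impostor_scores))
```

Raising the threshold lowers FAR and raises FRR, so the operating point depends on the application: a watch-list system and a door-entry system would typically choose different thresholds.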
Benchmark Assessments
FRVT (Face Recognition Vendor Test) has been replaced by the Grand Challenge Experiment led by NIST.
The first round finished this month; submission of the second-round results is due next year.