An Experimental Comparison of Face Detection Algorithms



Shivesh Bajpai, Amarjot Singh, and K. V. Karthik

Abstract: Human face detection and tracking is an important research area with wide application in human-machine interfaces, content-based image retrieval, video coding, gesture recognition, crowd surveillance and face recognition. Human face detection is an extremely important and at the same time difficult problem in computer vision, mainly due to the dynamics and high degree of variability of the head. A large number of effective algorithms have been proposed for face detection in grey-scale images, ranging from simple edge-based methods to composite high-level approaches using modern pattern recognition techniques. The aim of this paper is to compare gradient vector flow and silhouettes, two of the most widely used algorithms in the area of face detection. Both algorithms were applied to a common database and the results were compared. This is the first paper that evaluates the runtime of the gradient vector field methodology and compares it with the silhouette segmentation technique. The paper also explains the factors affecting the performance of both algorithms and the error they incur. Finally, results are presented which show the superiority of the silhouette segmentation method over the gradient vector flow method.

Index Terms: Face detection, gradient vector flow (GVF), active contour flow, silhouette.

I. INTRODUCTION

Current advances in computer technology point towards a sophisticated, vision-based mechanical world augmented by artificial intelligence. This trend has motivated the development of machine intelligence, especially in the field of computer vision, which aims to replicate human vision. Computer vision systems focus on performing laborious and repetitive jobs such as surveillance and assembly-line work. The main applications are moving towards more general uses of vision in the broader fields of face recognition and video coding.
Face detection is a useful and extremely efficient paradigm for a number of applications, such as object detection, and is itself a fascinating problem to explore. Human beings detect faces without much effort, but for machines it is one of the most complicated tasks to perform, due to the complex 3-D shape of the face and the variation in its appearance under different lighting environments. Many of the current technologies used for face detection work only on frontal faces of similar size. Another major challenge is the generality of detecting faces in grayscale images, due to the lack of standard methods for determining illumination data or scene structure without performing extensive preliminary operations on the image. This encouraged the development of several successful techniques capable of detecting faces in difficult experimental conditions. To extract a face from an unknown scene, vision systems have to perform a number of operations involving segmentation, extraction, and verification of faces, and possibly facial features, against an uncontrolled background.

A large number of techniques have been proposed for the purpose. In the early 1970s, heuristic and anthropometric techniques [1] were used for face detection, but because they relied on strong assumptions, a small change in the image would require tuning or redesigning the system. Despite several obstructions and the slow pace of growth in research, video coding and face recognition have started to become a reality. There have been several developments in the important aspects of face detection over the past decade. More robust segmentation techniques using color and motion have been developed.

Manuscript received June 13, 2012; revised August 10, 2012. Amarjot Singh and K. V. Karthik are with National Institute of Technology, Warangal, 506004, India (e-mail: amarjotsingh@ieee.org). Shivesh Bajpai is with Indian School of Mines, Dhanbad, Jharkhand, India.
In addition to the segmentation techniques, there have been advances in contours and templates, which can track features more accurately. Feature-based techniques used for detecting faces can be classified into three broad categories: low-level analysis, feature analysis and active shape models. As the most primitive feature, edge representation was applied to face detection by Sakai et al. [1]. Besides edges, gray-level information can also be exploited to locate various parts of the face. A number of algorithms search for local gray minima in the extracted face region [2]. Another convenient way of extracting the face from the scene is motion. Motion segmentation using silhouettes and active contours can also be used effectively for face detection. Both of these algorithms are the focus of this paper and are therefore discussed in detail below.

Motion segmentation is an extremely efficient approach, capable of recognizing a moving foreground regardless of the background content. It is commonly achieved directly by frame difference using silhouettes. Moving silhouettes, including face and body parts, are extracted on the basis of a threshold on the accumulated frame difference. The method works on the principle of silhouettes generated by the timed motion history image (tMHI). If video is available, extracting the face from the scene with the help of silhouettes is straightforward. Almost any silhouette generation method can be used to generate silhouettes of the object of interest, including stereo disparity (also known as stereo depth subtraction) [3], infrared back-lighting [4], frame differencing [5], color histogram back-projection [6], texture blob segmentation and range imagery foreground segmentation. We choose a simple background subtraction method [7] for the silhouette generation.

DOI: 10.7763/IJCTE.2013.V5.644

Another approach to spotting the features is the active shape model. Unlike the previous models, this model focuses on the actual physical, and hence higher-level, appearance of features. The algorithm interacts with local image features (edges, brightness), gradually deforming to the shape of the facial feature. The first type uses a generic active contour called a snake, introduced by Kass et al. in 1987 [8]. Further need for accuracy motivated the development of deformable templates [9], which take into account a priori knowledge of facial features. The third version [10], termed smart snakes, is a generic flexible model which provides the most efficient interpretation of the human face possible by this method.

The remainder of the paper is organized as follows. The next section explains the algorithms compared in this paper in detail, with their formulation. Section III presents the results obtained by both algorithms on various databases, with the factors affecting their performance. Section IV describes the applications of both algorithms to face detection systems. The last section, Section V, concludes the results.

Fig. 1. Face extracted by silhouette segmentation technique

Fig. 2. Face extracted by gradient vector field method

II. ALGORITHMS

The algorithms used are described below.

A. Active Contour Model

Unlike the previously described face detection techniques, this method aims to depict the actual high-level appearance of features. Active contours are commonly used to locate the head boundary or edges [11]. The task is achieved by initializing the snake in the proximity of the head. Released from approximate boundaries, the snake recovers the actual ones: it converges onto the edges and subsequently assumes the shape of the head. The algorithm locks onto the features of interest, which are mainly lines, edges or boundaries. The traditional snake is a curve $X(s) = [x(s), y(s)]$, $s \in [0, 1]$, that moves through the spatial domain of an image $I(x, y)$ and is obtained by minimizing the function given in equation 1,

$$E = \int_0^1 \left[ E_{int}(X(s)) + E_{ext}(X(s)) \right] ds \qquad (1)$$

where $E_{int}$ and $E_{ext}$ are the internal and external energies respectively. The internal energy $E_{int}$ is a combination of elasticity and strain, defined as

$$E_{int} = \frac{1}{2}\left( \alpha \left| \frac{dX(s)}{ds} \right|^2 + \beta \left| \frac{d^2X(s)}{ds^2} \right|^2 \right) \qquad (2)$$

It is used to control the snake tension and rigidity: $\alpha$ represents the elasticity and $\beta$ the stiffness of the snake. The external energies which force the snake towards the edge are

$$E_{ext}(x, y) = -\left| \nabla I(x, y) \right|^2 \qquad (3)$$

$$E_{ext}(x, y) = -\left| \nabla \left( G_\sigma(x, y) * I(x, y) \right) \right|^2 \qquad (4)$$

where $G_\sigma(x, y)$ is a two-dimensional Gaussian function with standard deviation $\sigma$ and $\nabla$ is the gradient operator. Minimization of $E$ provides the force required for the snake to shrink to the succeeding position. This implies that the internal force is balanced by the force due to the image gradient, which is expressed as

$$\alpha X''(s) - \beta X''''(s) - \nabla E_{ext} = 0 \qquad (5)$$

$F_{int} = \alpha X''(s) - \beta X''''(s)$ is defined as the internal force, which resists stretching and bending, while $F_{ext} = -\nabla E_{ext}$ is the external force, which pulls the snake towards the desired image edge. The force balance is then

$$F_{int} + F_{ext} = 0 \qquad (6)$$
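The force balance in equations (5) and (6) can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: it assumes a closed contour sampled at n points, periodic finite differences for the derivatives, the unsmoothed external energy of equation (3), nearest-pixel sampling of the force field, and a plain explicit update. The function names and the default values of `alpha`, `beta`, `kappa` and `step` are hypothetical and chosen only for illustration. It is a traditional snake, not a gradient vector flow field.

```python
import numpy as np

def internal_force(X, alpha, beta):
    """F_int = alpha * X'' - beta * X'''' for a closed contour X of shape (n, 2)."""
    # Periodic second difference approximates X''(s).
    d2 = np.roll(X, -1, axis=0) - 2.0 * X + np.roll(X, 1, axis=0)
    # Second difference of the second difference approximates X''''(s).
    d4 = np.roll(d2, -1, axis=0) - 2.0 * d2 + np.roll(d2, 1, axis=0)
    return alpha * d2 - beta * d4

def external_force_field(image):
    """F_ext = -grad(E_ext), with E_ext = -|grad I|^2 as in equation (3)."""
    gy, gx = np.gradient(np.asarray(image, dtype=float))
    e_ext = -(gx ** 2 + gy ** 2)
    ey, ex = np.gradient(e_ext)
    return -ex, -ey  # force components along x (columns) and y (rows)

def snake_step(X, image_force, alpha=0.05, beta=0.0, kappa=0.6, step=1.0):
    """One explicit update X <- X + step * (F_int + kappa * F_ext)."""
    fx, fy = image_force
    # Sample the force field at the nearest pixel (assumption: no interpolation).
    rows = np.clip(np.rint(X[:, 1]).astype(int), 0, fx.shape[0] - 1)
    cols = np.clip(np.rint(X[:, 0]).astype(int), 0, fx.shape[1] - 1)
    f_ext = np.stack([fx[rows, cols], fy[rows, cols]], axis=1)
    return X + step * (internal_force(X, alpha, beta) + kappa * f_ext)
```

Repeating `snake_step` until the contour stops moving approximates the equilibrium of equation (6). The explicit update is only stable for small `step * alpha` and `step * beta`, which is one reason practical snake implementations often use a semi-implicit scheme instead.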

Fig. 3. Error variation with respect to the number of iterations in the GVF method

Finally, the snake takes the shape of the object and the energy reaches its minimum. Initializing the snake is difficult, because a suitable set of parameters is essential for the initialization process and it is hard to generate these parameters automatically for the objects of interest. Hence these constants are chosen by the user. Once the parameters are chosen correctly and the snake is released in close proximity to the object, the face can be extracted successfully. It is an efficient method used in a number of applications requiring face detection.

B. Segmentation Using Silhouettes

In the case of video, motion is the easiest way to extract moving objects from the scene. Although there is recent work on more sophisticated techniques of background subtraction, we use a fast and simple method for the face extraction. In this method, the pixels which lie more than a set number of standard deviations from the mean RGB background are labeled as foreground. In the next step, noise is removed by pixel dilation and region growing, which finally leads to the extraction of the silhouette. One of the simplest background models assumes that the brightness of every background pixel varies independently according to the normal distribution. The background statistics can be calculated by accumulating several dozen frames, as well as their squares. With $t(x, y)$ the accumulated sum of the pixel values and $tq(x, y)$ the accumulated sum of their squares, the mean and standard deviation can be calculated as

$$m(x, y) = \frac{t(x, y)}{N} \qquad (7)$$

$$W(x, y) = \sqrt{\frac{tq(x, y)}{N} - \left( \frac{t(x, y)}{N} \right)^2} \qquad (8)$$

where $N$ is the total number of frames collected. A pixel $p(x, y)$ in a given frame is assigned to the moving object if

$$\left| m(x, y) - p(x, y) \right| > C\, W(x, y) \qquad (9)$$

where $C$ is equal to 3 according to the well-known three-sigma rule. This algorithm is extremely efficient and extracts the pixels which vary or move with respect to a still background. As we are concerned here only with the face, the remaining parts of the body are not of interest.
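Equations (7) to (9) translate almost directly into array operations. The sketch below is an illustration under stated assumptions rather than the authors' code: frames are NumPy arrays of identical shape, the function names are hypothetical, and the dilation and region-growing clean-up described above is left out.

```python
import numpy as np

def background_statistics(frames):
    """Per-pixel mean m and standard deviation W from N background frames."""
    frames = np.asarray(frames, dtype=np.float64)
    n = frames.shape[0]
    t = frames.sum(axis=0)          # accumulated sum of pixel values
    tq = (frames ** 2).sum(axis=0)  # accumulated sum of squared pixel values
    m = t / n                       # equation (7)
    # Equation (8); clamp at zero to absorb floating-point round-off.
    w = np.sqrt(np.maximum(tq / n - m ** 2, 0.0))
    return m, w

def foreground_mask(frame, m, w, c=3.0):
    """Equation (9): True where a pixel deviates from the mean by more than c * W."""
    return np.abs(m - np.asarray(frame, dtype=np.float64)) > c * w
```

For colour frames of shape (height, width, 3) the mask is computed per channel; one possible choice is to mark a pixel as foreground when any channel exceeds the threshold, for example with `foreground_mask(frame, m, w).any(axis=-1)`.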
In the case of human extraction from scenes, however, the portions connected to the moving object can be extracted by creating a mask to segment the moving region.

III. RESULTS

The results obtained from the simulations enable us to investigate the capability of both methodologies applied to synthetic and real-time datasets. This section elaborates and compares the results obtained for face detection and extraction using the active contour and silhouette techniques, tested on a Pentium Core Duo 1.83 GHz machine. The algorithms are compared on the basis of runtime and of the accuracy of the detected face, measured against a common ground truth. This section further emphasizes the factors affecting the performance of the algorithms, followed by their advantages and disadvantages.

The results obtained for face detection using the silhouette segmentation technique are shown in Fig. 1. The tests are performed on the face of the author, as shown in that figure. A video sequence was given as input to the system. The method successfully detected and finally extracted the face of the author according to the algorithm described above. The whole process takes less than 1 second. The algorithm works well for all kinds of inputs. Silhouettes perform the task of recognition with high accuracy, hence we do not compute the error for this methodology.

The simulation results for the active contour method are shown in Fig. 2. The algorithm performed extremely well on the A.R. database and the ORL database, with accuracies of 97.88% and 96.50% respectively. Fig. 2 shows the step-by-step evolution of the extraction on the input image. The method effectively extracts the author's face in 30 iterations and takes 31 seconds for the whole process.

The section also examines the other parameters responsible for variations in the results, on the basis of the error incurred by each parameter. As the parameters have to be adjusted manually, the accuracy mainly depends upon their selection. Error is defined as the total number of pixels unclassified in the silhouette over the total number of pixels in the silhouette. Fig. 3 shows the variation in error with respect to normalization iterations. The minimum error is 0.13 for 140 iterations and it increases to 59.06 as the iterations drop to 10. Fig. 4 shows the variation in error with respect to alpha, the elasticity parameter. The error varies from a maximum of 17.61 for alpha equal to 0.06 to 17.03 for alpha equal to 0.05. Another important parameter which affects the performance of the snake is the initialization distance between two points. The maximum error in the result is 0.4 for a maximum distance of 0 between the initialization points specified for the snake, while the error is 18.9 for a minimum distance of 1.5. The error with respect to the rigidity parameter beta varies from 17.03 to 17.71 for beta equal to zero and 0. respectively. Similarly, the maximum and minimum errors incurred by varying the viscosity parameter (γ) and the external force weight (κ) are 44.45 and 17.03 for the viscosity parameter, and 7.9 and 17.03 for the external force weight, respectively.

The accuracy also depends on the number of iterations. The error variations with respect to the number of iterations are shown in Fig. 3. The error variations with respect to the other parameters mentioned above are shown in Fig. 4. The parameters which give the best results are as follows. Alpha should be set to 1.75, beta to 1.1 and gamma to 1. DMAX should be set to 1.75, DMIN to 1.1 and kappa to 0.6.

Fig. 4. Error variation with respect to the parameters (a) Dmax, (b) Dmin, (c) α, (d) β, (e) γ, (f) κ

IV. APPLICATION TO FACE DETECTION

Face detection has huge potential, as it is used in a wide range of applications including biometric identification, video conferencing, video coding and indexing of image and video databases. With the increasing availability of images on the world wide web, the face has become an important part of content-based image retrieval. A neural network face detection system has been used as part of an image search engine, and has been used in WebSeer [12] to mine the required content from terabytes of video libraries. Face detection systems are also widely used in video conferencing systems, where it is necessary to focus on the face of the speaker [13]. Human faces are mainly extracted using features such as skin, motion, contour geometry and facial analysis.

V. CONCLUSION

This paper presents two of the most popular algorithms used for face detection, together with a brief presentation of some of the application areas. The results obtained by applying the active contour method and silhouette segmentation to real-time data were presented. The following is a concise summary, with conclusions related to the performance of the algorithms. Silhouettes perform much faster than gradient vector field, with higher accuracy. The silhouettes take less than 1 second to compute the result, while gradient vector field takes 31 seconds. The parameters of GVF have to be controlled manually, which is a time-consuming process and may lead to errors in the results, while the silhouette method is totally automatic, leading to higher accuracy. GVF gives better results when applied to images with a controlled background. Another disadvantage of GVF is that it yields good results only on frontal images, while the silhouette technique gives equally good results on any kind of image. The gradient vector field method fails on faces which are very close to the image frame, which again shows the superiority of the silhouette technique over GVF. On initialization, the GVF snake is released in close proximity to the face. When the face is very near to the frame, the snake touches the frame of the image during expansion and compression, leading to an error. Silhouettes are much better in terms of accuracy as well as speed.
Face detection is an important research area, and the field has advanced to more complex and sophisticated methods which are still expensive to implement on a real platform, although this is likely to change with upcoming advances in computer hardware. Face recognition is still a very crucial area when it comes to face detection. Due to the expense of the technology, it is commonly used for CBIR systems, web search engines and digital video indexing. In spite of the advances, accurate detection of facial features such as the corners of the eyes and mouth is still a hard problem to solve.

REFERENCES

[1] T. Sakai, M. Nagao, and T. Kanade, "Computer analysis and classification of photographs of human faces," Proc. First USA-Japan Computer Conference, 1972, pp. 2-7.
[2] P. J. L. Van Beek, M. J. T. Reinders, B. Sankur, and J. C. A. Van Der Lubbe, "Semantic segmentation of videophone image sequences," Proc. of SPIE Int. Conf. on Visual Communications and Image Processing.
[3] D. Beymer and K. Konolige, "Real-time tracking of multiple people using stereo," IEEE FRAME-RATE Workshop, 1999.
[4] J. Davis and A. Bobick, "A robust human-silhouette extraction technique for interactive virtual environments," Proceedings Modelling, 1998.
[5] G. Bradski, "Computer vision face tracking for use in a perceptual user interface," Intel Technology Journal. [Online]. Available: http://www.developer.intel.com/technology/itj/q21998/articles/art_2.htm
[6] C. Bregler, "Learning and recognizing human dynamics in video sequences," Proceedings Computer Vision and Pattern Recognition, June 1997, pp. 568-574.
[7] A. Elgammal, D. Harwood, and L. Davis, "Non-parametric model for background subtraction," IEEE FRAME-RATE Workshop, 1999.
[8] M. Kass, A. Witkin, and D. Terzopoulos, "Snakes: Active contour models," International Journal of Computer Vision, vol. 1, no. 4, pp. 321-331, 1987.
[9] A. L. Yuille, P. W. Hallinan, and D. S. Cohen, "Feature extraction from faces using deformable templates," International Journal of Computer Vision, vol. 8, pp. 99-111, 1992.
[10] T. F. Cootes and C. J. Taylor, "Active shape models - smart snakes," in Proceedings of British Machine Vision Conference, 1992, pp. 266-275.
[11] C. Xu and J. L. Prince, "Snakes, shapes, and gradient vector flow," IEEE Transactions on Image Processing, vol. 7, no. 3, pp. 359-369, March 1998.
[12] C. Frankel, M. J. Swain, and V. Athitsos, "WebSeer: An image search engine for the World Wide Web," Technical Report 96-14, Computer Science Department, Univ. of Chicago, 1996.
[13] C. Wang, S. M. Griebel, and M. S. Brandstein, "Robust automatic video-conferencing with multiple cameras and microphones," Proc. IEEE International Conference on Multimedia and Expo, 2000.

Amarjot Singh is a Research Engineer with the Tropical Marine Science Institute at the National University of Singapore (NUS). He completed his Bachelors in Electrical and Electronics Engineering at National Institute of Technology Warangal. He is the recipient of the Gold Medal for Excellence in Research for the 2007-2011 batch of the Electrical Department at National Institute of Technology Warangal. He has authored and coauthored 48 international journal and conference publications. He holds the record in the Asia Book of Records (India Book of Records chapter) for having the "Maximum Number (18) of International Research Publications by an Undergraduate Student". He has been awarded multiple prestigious fellowships over the years, including the Gfar Research Scholarship for Excellence in Research from Gfar Research, Germany, and a Travel Fellowship from the Center for International Co-operation in Science (CICS), India. He has also been recognized for his research at multiple international platforms: he was awarded 3rd position in the IEEE Region 10 Paper Contest across the Asia-Pacific Region and was shortlisted as a world finalist (Top 15) in the IEEE President's Change the World Competition. He is the founder and chairman of Illuminati, a research group of students at National Institute of Technology Warangal that is known in a number of countries in Europe and Asia.
He has worked with a number of research organizations including INRIA-Sophia Antipolis (France), University of Bonn (Germany), Gfar Research (Germany), Twtbuck (India), Indian Institute of Technology Kanpur (India), Indian Institute of Science Bangalore (India) and Defense Research and Development Organization (DRDO), Hyderabad (India).