Gender Classification for Real-Time Audience Analysis System

Vladimir Khryashchev, Lev Shmaglit, Andrey Shemyakov, Anton Lebedev
Yaroslavl State University
Yaroslavl, Russia
vhr@yandex.ru, shmaglit_lev@yahoo.com, andrey.shemakov@gmail.com, lebedevdes@gmail.com

Abstract

A system that extracts all available information about depicted people from an input video stream is discussed. As reported previously, the proposed system consists of five consecutive stages: face detection, face tracking, gender recognition, age classification and statistics analysis. The crucial part of the system is the construction of a gender classifier on the basis of machine learning methods. We propose a novel algorithm consisting of two stages: adaptive feature extraction and support vector machine classification. Both the training technique of the proposed algorithm and the experimental results acquired on a large image dataset are presented. More than 90% accuracy of viewer gender recognition is achieved.

I. INTRODUCTION

Automatic video data analysis is a very challenging problem. To find a particular object in a video stream and automatically decide whether it belongs to a particular class, one has to combine a number of different machine learning techniques and algorithms that solve object detection, tracking and recognition tasks [1-3]. Many algorithms based on such popular techniques as principal component analysis, histogram analysis, artificial neural networks, Bayesian classification, adaptive boosting learning, various statistical methods, and many others have been proposed in the field of computer vision and object recognition in recent years. Some of these techniques are invariant to the type of analyzed object; others, on the contrary, exploit a priori knowledge about a particular object type, such as its shape, typical color distribution, relative positioning of parts, etc. [4]. Although the real world contains a huge number of various objects, considerable interest is being shown in the development of algorithms for the analysis of one particular object type: human faces.
Promising practical applications of face recognition algorithms include automatic visitor counting systems, throughput control at the entrances of office buildings, airports and subways, automatic accident prevention systems, intelligent human-computer interfaces, etc. Gender recognition, for example, can be used to collect and estimate demographic indicators [5-8]. Besides, it can be an important preprocessing step in person identification: gender recognition halves the number of candidates for analysis (when the database contains equal numbers of men and women) and thus makes the identification process about twice as fast.

To build a completely automatic system, classification algorithms are used in combination with a face detection algorithm, which selects candidates for further analysis [9-14]. In paper [15] we proposed a system which extracts all available information about depicted people from the input video stream, then aggregates and analyzes it in order to measure different statistical parameters (Fig. 1).

Fig. 1. A block diagram of the proposed application for video analysis

ISSN 2305-7254

The quality of the face detection step is critical to the final result of the whole system, as inaccuracies in face position determination can lead to wrong decisions at the recognition stage. To solve the face detection task, the AdaBoost classifier described in paper [16] is used. Detected fragments are preprocessed to align their luminance characteristics and to transform them to a uniform scale. At the next stage, the detected and preprocessed image fragments are passed to the input of the gender recognition classifier, which assigns each of them to one of two classes («Male», «Female»). The same fragments are also analyzed by the age estimation algorithm, which divides them into several age groups [17]. The age classifier shares its learning technique with the gender classifier.

As a result, the following metrics are calculated:

- Count: the number of viewers who have paid attention to a particular product or have watched the advertisement;
- Opportunity to See (OTS): the number of potential viewers who were close to the presented product or advertising media;
- Dwell Time: the average time during which potential viewers have been within the visibility range of the presented product or advertising media;
- Attention Time: the average time during which the viewer was watching the object of interest;
- Gender: viewer gender (man/woman);
- Age: viewer age group (child / youth / adult / senior).

This paper focuses on the construction of the gender recognition classifier. There are comprehensive surveys on face detection [14, 8], face recognition [18] and facial expression analysis [19]. Gender recognition has been studied less. A comparative analysis of recently proposed gender classification algorithms is presented in paper [3]. Researchers have proposed algorithms based on artificial neural networks [4], on a combination of Gabor wavelets and principal component analysis [20], and on independent component analysis and linear discriminant analysis (LDA) [21, 22]. In paper [23] the authors use a genetic algorithm for feature selection and a support vector machine (SVM) [24] for classification. In paper [25] an algorithm based on local binary patterns in combination with an AdaBoost classifier was proposed.
Experiments with radial basis function (RBF) networks and inductive decision trees are described in paper [26].

The novel gender recognition algorithm proposed in this paper is based on a non-linear SVM classifier with an RBF kernel. To extract information from an image fragment and to move to a lower-dimensional feature space, we propose an adaptive feature generation algorithm which is trained by means of an optimization procedure according to the LDA principle.

Fig. 2. The scheme of the proposed gender classification algorithm

The rest of the paper is organized as follows. The scheme of the proposed algorithm is described in Section 2. Section 3 considers the training methodology and experimental setup. In Section 4, the results of the proposed algorithm's
comparison with state-of-the-art gender classification methods are presented. Section 5 concludes the paper.

II. THE PROPOSED ALGORITHM

The classifier is based on Adaptive Features and SVM (AF-SVM). Its operation includes several stages, as shown in Fig. 2. The AF-SVM algorithm consists of the following steps: color space transform, image scaling, adaptive feature set calculation and SVM classification with preliminary kernel transformation. The input image $A^{RGB}_{Y \times Y}$ is converted from RGB to the target color space and is scaled to the fixed image resolution $N \times N$. After that we calculate a set of features $\{AF_i\}$, where each feature is the sum over all rows and columns of the element-by-element product of the input image matrix and a coefficient matrix $C_i$ of resolution $N \times N$, which is generated by the training procedure:

$$AF_i = \sum_{r=1}^{N} \sum_{c=1}^{N} \left( A_{N \times N} \circ C_i^{N \times N} \right)_{r,c}.$$

The obtained feature vector is transformed using a Gaussian radial basis function kernel:

$$k(z_1, z_2) = \exp\left( -\gamma \, \| z_1 - z_2 \|^2 \right).$$

The kernel function parameters $C$ and $\gamma$ are defined during training. The resulting feature vector serves as the input to a linear SVM classifier whose decision rule is:

$$f(AF) = \operatorname{sgn}\left( \sum_{i=1}^{m} y_i \alpha_i \, k(X_i, AF) + b \right).$$

The set of support vectors $\{X_i\}$, the sets of coefficients $\{y_i\}$, $\{\alpha_i\}$ and the bias $b$ are obtained at the classifier training stage.

III. TRAINING AND TESTING SETUP

Both training and testing of the gender recognition algorithm require a sufficiently large color image database. The most commonly used image database for human face recognition tasks is the FERET database [27], but it contains an insufficient number of faces of different individuals; that is why we collected our own image database, gathered from different sources (Table I). Faces in the images of the proposed database were detected automatically by the AdaBoost face detection algorithm. After that, false detections were manually removed, and the resulting dataset consisting of 10 500 image fragments (5 250 for each class) was obtained.

TABLE I. THE PROPOSED TRAINING AND TESTING IMAGE DATABASE PARAMETERS

| Parameter | Value |
|---|---|
| The total number of images | 8 654 |
| The number of male faces | 5 250 |
| The number of female faces | 5 250 |
| Minimum image resolution | 640 × 480 |
| Color space format | RGB |
| Face position | Frontal |
| People's age | From 18 to 65 years old |
| Race | Caucasian |
| Lighting conditions, background and facial expression | No restrictions |

This dataset was split into three independent image sets: training, validation and testing. The training set was used for feature generation and SVM classifier construction. The validation set was required to avoid overtraining during the selection of optimal kernel function parameters. Performance evaluation of the trained classifier was carried out on the testing set.

The training procedure of the proposed AF-SVM classifier can be split into two independent parts: feature generation, and SVM construction and optimization. Let us consider the feature generation procedure. It consists of the following basic steps:

- color space transform of the training images from RGB to the target color space (all further operations are carried out for each color component independently);
- scaling of the training images to the fixed image resolution $N \times N$;
- random generation of the coefficient matrix $C_i$;
- calculation of the feature value $AF_i$ for each training fragment;
- calculation of the utility function $F$ as the squared difference between the feature averages calculated for the male and female training image datasets, divided by the sum of the feature variances [28]:

$$F = \frac{\left( \langle AF_i^{\mathrm{M}} \rangle - \langle AF_i^{\mathrm{F}} \rangle \right)^2}{\sigma^2_{AF_i^{\mathrm{M}}} + \sigma^2_{AF_i^{\mathrm{F}}}};$$

- iteratively, in a cycle (until the number of iterations exceeds a preliminarily fixed maximum value): random generation of a coefficient matrix $\tilde{C}_i$ inside a fixed neighborhood of the matrix $C_i$, calculation of the feature value $\widetilde{AF}_i$ for each training fragment, calculation of the utility function $\tilde{F}$, and transition to the new point ($F = \tilde{F}$, $C_i = \tilde{C}_i$) if $\tilde{F} > F$;
- saving the matrix $C_i$ after the maximum number of iterations is exceeded;
- return to the beginning in order to start the generation of the next ($i+1$) feature.

The optimization procedure described above extracts from an image only the information which is necessary for class separation. Besides, features with a higher utility function value have a higher separation ability. The feature generation procedure has the following adjustable parameters: training fragment resolution ($N$), the number of training images for each class ($M$), and the maximum number of iterations ($T$). The following values were chosen empirically as a compromise between the achieved separation ability and the training speed: $N = 65$; $M = 400$; $T = 10^5$. 1000 features were generated for each color component.

At the second stage of training, these features were extracted from the training images and then used to train an SVM classifier. The SVM construction and optimization procedure included the following steps:

- calculation of the feature set generated at the first stage of training for each training fragment;
- feature normalization;
- training an SVM classifier with different kernel function parameters;
- recognition rate (RR) calculation on the validation image dataset;
- determination of the optimal kernel function parameters (maximizing RR);
- training the final SVM classifier with the optimal kernel function parameters found.

The goal of the SVM optimization procedure is to find the solution with the best generalization ability, and thus with the minimum classification error. The adjustable parameters are: the number of features in a feature vector ($N$), the number of training images for each class ($M$), and the kernel function parameters $\gamma$ and $C$. Grid search was applied to determine the optimal kernel parameters: the SVM classifier was constructed for $C = 10^{k_1}$ and $\gamma = 10^{k_2}$, where $k_1$ and $k_2$ run over all combinations of integers from the range [-15, 15]; during the search the recognition rate was measured on the validation image dataset. The results of this procedure are presented in Fig. 3. The maximum recognition rate (about 80%) was obtained for $C = 10^{6}$ and $\gamma = 10^{8}$.

Fig. 3. The dependence of RR on the kernel function parameters $C = 10^{k_1}$ and $\gamma = 10^{k_2}$

Besides, we investigated the dependence of classifier performance on the number of features extracted from each color component, $N$, and on the number of training images for each class, $M$ (Fig. 4).

Fig. 4. The dependence of the recognition rate on the training procedure parameters
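The feature generation procedure of this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: images are small nested lists rather than $N \times N$ color components, the "fixed neighborhood" of the coefficient matrix is taken to be a uniform perturbation of width `step` (the paper does not specify it), population variance is used in the utility function, and the names `feature_value`, `fisher_utility`, `generate_feature` and `step` are hypothetical.

```python
import random
from statistics import mean, pvariance


def feature_value(image, coeffs):
    """AF: sum over all rows and columns of the element-wise product."""
    return sum(a * c for row_a, row_c in zip(image, coeffs)
               for a, c in zip(row_a, row_c))


def fisher_utility(male_values, female_values):
    """Squared difference of class means divided by the sum of class variances."""
    spread = pvariance(male_values) + pvariance(female_values)
    if spread == 0.0:
        return 0.0  # degenerate case, treated as "no separation" (assumption)
    return (mean(male_values) - mean(female_values)) ** 2 / spread


def utility_of(coeffs, males, females):
    """Utility F of one coefficient matrix on the two training sets."""
    return fisher_utility([feature_value(img, coeffs) for img in males],
                          [feature_value(img, coeffs) for img in females])


def generate_feature(males, females, size, max_iter, step=0.1, rng=None):
    """Random local search for one coefficient matrix that maximizes F."""
    rng = rng or random.Random()
    # Random initial coefficient matrix.
    coeffs = [[rng.uniform(-1.0, 1.0) for _ in range(size)] for _ in range(size)]
    best = utility_of(coeffs, males, females)
    for _ in range(max_iter):
        # Candidate matrix drawn inside a fixed neighborhood of the current one.
        cand = [[c + rng.uniform(-step, step) for c in row] for row in coeffs]
        value = utility_of(cand, males, females)
        if value > best:  # move to the new point only if the utility improves
            coeffs, best = cand, value
    return coeffs, best


if __name__ == "__main__":
    rng = random.Random(0)

    def toy(bias):
        # Toy 4x4 "image"; the classes differ only in one corner pixel.
        img = [[rng.random() for _ in range(4)] for _ in range(4)]
        img[0][0] += bias
        return img

    males = [toy(1.0) for _ in range(40)]
    females = [toy(0.0) for _ in range(40)]
    coeffs, utility = generate_feature(males, females, size=4, max_iter=500, rng=rng)
    print(f"utility after search: {utility:.3f}")
```

Calling `generate_feature` repeatedly, once per feature and per color component, would yield the feature set that is later passed to the SVM; because a move is accepted only when the utility grows, the returned utility is never lower than that of the initial random matrix.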
The analysis shows that each feature has substantial separation ability, and at $N = 30$ the recognition rate reaches 79.5%. At the same time, RR grows with both $N$ and $M$, due to the accumulation of information about the considered classes inside the classifier. Thus, as a compromise between quality and speed, the following parameters were chosen: $N = 30$; $M = 400$.

IV. EXPERIMENTAL RESULTS

In this section we present the results of the comparison of the proposed AF-SVM algorithm with state-of-the-art classifiers: SVM [24] and KDDA (Kernel Direct Discriminant Analysis) [21]. The AF-SVM classifier was trained according to the technique given above. The SVM and KDDA classifiers have far fewer adjustable parameters, as they work directly with image pixel values instead of feature vectors. To construct these classifiers, the same training base as for the AF-SVM classifier was used. The following conditions were also identical for all three considered classifiers: the number of training images for each class, the training fragment resolution and the image preprocessing procedure. Optimization of the SVM and KDDA kernel function parameters was carried out using the same technique and the same validation image dataset as for the AF-SVM classifier. Thus, equal conditions were provided for an independent comparison of the considered classification algorithms on the testing image dataset.

To represent the classification results we used the Receiver Operating Characteristic (ROC curve) [29]. As there are two classes, one of them is considered to be a positive decision and the other a negative one. The ROC curve is created by plotting the fraction of true positives out of the positives (TPR = true positive rate) vs. the fraction of false positives out of the negatives (FPR = false positive rate) at various discrimination threshold settings. The advantage of the ROC curve representation lies in its invariance to the relation between the costs of the first and the second error types.

The results of AF-SVM, SVM and KDDA testing are presented in Fig. 5 and in Table II. The computations were carried out on a personal computer with the following configuration: operating system: Microsoft Windows 7; CPU type: Intel Core i7 ( GHz), 4 cores; memory size: 6 GB.

TABLE II. COMPARATIVE ANALYSIS OF TESTED ALGORITHMS PERFORMANCE

| Parameter | SVM, True | SVM, False | KDDA, True | KDDA, False | AF-SVM, True | AF-SVM, False |
|---|---|---|---|---|---|---|
| Classified as male, % | 80 | 20 | 75.8 | 24.2 | 80 | 20 |
| Classified as female, % | 75.5 | 24.5 | 65.5 | 34.5 | 79.3 | 20.7 |
| Total classification rate, % | 77.7 | 22.3 | 69.7 | 30.3 | 79.6 | 20.4 |
| Operation speed, faces/sec | 44 | | 45 | | 65 | |

Fig. 5. ROC-curves of tested gender recognition algorithms

The analysis of the testing results shows that AF-SVM is the
most effective algorithm considering both recognition rate and operational complexity. AF-SVM has the highest RR among all tested classifiers (79.6%) and is approximately 50% faster than SVM and KDDA. This advantage is explained by the fact that the AF-SVM algorithm uses a small number of adaptive features, each of which carries a lot of information and is capable of separating the classes, while the SVM and KDDA classifiers work directly with a huge matrix of image pixel values.

Let us consider the possibility of improving classifier performance by increasing the total number of training images per class from 400 to 5000. Experiments showed that the SVM and KDDA recognition rates cannot be significantly improved in that case. Besides, their computational complexity increases dramatically with the growth of the training dataset. This is explained by the fact that, as the number of pixels which the SVM and KDDA classifiers use to find an optimal solution in a high-dimensional space increases, it becomes harder and even impossible to find an acceptable solution in reasonable time.

In the case of the AF-SVM classifier, the decrease of SVM classifier efficiency with the growth of the training database can be overcome by using a small number of adaptive features that hold information about many training images at once. For this purpose we suggest that each feature should be trained using a random subset (containing 400 training images per class) of the whole training database (containing 5000 images per class). Thus each generated feature will hold the maximum possible amount of information required to separate the classes, and the set of features will include information from each of the 10 000 training images.

TABLE III. RECOGNITION RATE OF AF-SVM ALGORITHM TRAINED ON DATASETS OF DIFFERENT SIZE

| Parameter | AF-SVM (M=5000), True | AF-SVM (M=5000), False | AF-SVM (M=400), True | AF-SVM (M=400), False |
|---|---|---|---|---|
| Classified as male, % | 90.6 | 9.4 | 80 | 20 |
| Classified as female, % | 91 | 9 | 79.3 | 20.7 |
| Total classification rate, % | 90.8 | 9.2 | 79.6 | 20.4 |
At the feature generation stage, 300 features were trained according to the technique described above. After that, an SVM classifier using these features was trained in the same way as before. Besides, we kept the number of training images for SVM construction equal to 400, and thus the operation speed of the final classifier remained the same as in the previous experiments: 65 faces processed per second.

Fig. 6. ROC-curves for AF-SVM algorithm trained on datasets of different size
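The statement that a set of features trained on random subsets covers the whole training database can be checked with a short calculation. The sketch below assumes, purely for illustration, that every feature draws its 400 images per class uniformly and independently from the 5000 available, and that 300 such draws are made per class; the paper does not state the sampling scheme, and `prob_image_never_used` and `simulate_unused` are hypothetical helper names.

```python
import random


def prob_image_never_used(pool_size, subset_size, n_features):
    """Probability that one fixed image is in none of the random subsets."""
    return (1.0 - subset_size / pool_size) ** n_features


def simulate_unused(pool_size, subset_size, n_features, rng):
    """Draw one subset per feature and count images that were never drawn."""
    used = set()
    for _ in range(n_features):
        # Sampling without replacement inside a subset, independently per feature.
        used.update(rng.sample(range(pool_size), subset_size))
    return pool_size - len(used)


if __name__ == "__main__":
    p = prob_image_never_used(pool_size=5000, subset_size=400, n_features=300)
    print(f"P(image never used) = {p:.2e}")
    unused = simulate_unused(5000, 400, 300, random.Random(0))
    print(f"unused images per class in one simulated run: {unused}")
```

Under these assumptions the probability that a given image is never used is $0.92^{300}$, which is on the order of $10^{-11}$, so practically every training image contributes to at least one feature.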
A comparison of the AF-SVM algorithm trained on the expanded dataset ($M = 5000$) with the initial AF-SVM classifier ($M = 400$) is presented in Table III and in Fig. 6. The results show that the AF-SVM algorithm together with the proposed training setup significantly improves classifier performance when the training database size is increased to 5000 images per class. A recognition rate of nearly 91% is achieved.

A visual example of the proposed gender classification system is presented in Fig. 7. The recognized classes, male and female, are designated by the symbols M and F respectively.

Fig. 7. Visual example of the proposed automatic gender classifier

V. CONCLUSION

A new classifier based on adaptive features and support vector machines, which solves the problem of automatic gender recognition via face area analysis, is proposed. It shows a recognition rate of 79.6%, which is 1.9 and 9.9 percentage points higher than that of SVM and KDDA respectively. The possibility of improving the AF-SVM recognition rate up to 91% by increasing the training database size to 5000 images per class is shown. The proposed algorithm processes 65 faces per second, which is enough for real-time video sequence analysis applications.

The adaptive nature of the feature generation procedure allows the proposed AF-SVM classifier to be used for the recognition of other objects in an image (in addition to faces). For this purpose a new training database for the considered class should be constructed, and the classifier should be retrained according to the technique described in this paper.

The proposed gender recognition classifier is used as a part of a video data analysis system, which collects and processes information about the audience in real time. The system is fully automatic and does not require a human operator. No personal information is saved during operation.
These features allow the proposed system to be applied in various spheres of life: places of mass gathering (stadiums, movie theaters and shopping centers), transport hubs (airports, railway and bus stations), border passport and visa control checkpoints, etc.

It should be noted that this report covers mostly theoretical aspects of gender classifier construction, while the performance of the proposed audience analysis system will be demonstrated at the FRUCT conference demo section.

This work was supported by the P.G. Demidov Yaroslavl State University research framework (project #1060).

REFERENCES

[1] E. Alpaydin, Introduction to Machine Learning. The MIT Press, 2010.
[2] R. Szeliski, Computer Vision: Algorithms and Applications. Springer, 2010.
[3] E. Makinen, R. Raisamo, An experimental comparison of gender classification methods, Pattern Recognition Letters, vol. 29, No 10, pp. 1544-1556, 2008.
[4] S. Tamura, H. Kawai, H. Mitsumoto, Male/female identification from 8 × 6 very low resolution face images by neural network, Pattern Recognition Letters, vol. 29, No 2, pp. 331-335, 1996.
[5] V. Khryashchev, A. Priorov, L. Shmaglit, M. Golubev, Gender Recognition via Face Area Analysis, Proc. World Congress on Engineering and Computer Science, Berkeley, USA, pp. 645-649, 2012.
[6] V. Khryashchev, A. Ganin, M. Golubev, L. Shmaglit, Audience analysis system on the basis of face detection, tracking and classification techniques, Lecture Notes in Engineering and Computer Science: Proceedings of The International MultiConference of Engineers and Computer Scientists 2013, 13-15 March, 2013, Hong Kong, pp. 446-450.
[7] Y. Fu, T.S. Huang, Age Synthesis and Estimation via Faces: A Survey, IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 32, No 11, pp. 1955-1976, 2010.
[8] E.C. Sung, J.L. Youn, J.L. Sung, R.P. Kang and K. Jaihie, A comparative study of local feature extraction for age estimation, IEEE Int. Conf. on Control Automation Robotics & Vision (ICARCV), pp. 1280-1284, 2010.
[9] G. Guodong, M. Guowang, Human age estimation: What is the influence across race and gender? IEEE Computer Society Conf. on Computer Vision and Pattern Recognition Workshops (CVPRW), pp. 71-78, 2010.
[10] L. Zhen, F. Yun, T.S. Huang, A robust framework for multiview age estimation, IEEE Computer Society Conf. on Computer Vision and Pattern Recognition Workshops (CVPRW), pp. 9-16, 2010.
[11] G. Guodong, W. Xiaolong, A study on human age estimation under facial expression changes, IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), pp. 2547-2553, 2012.
[12] L.W. Hee, W. Jian-Gang, Y. Wei-Yun, L.C. Xing, P.T. Yap, Effects of facial alignment for age estimation, IEEE Int. Conf. on Control Automation Robotics & Vision (ICARCV), pp. 644-647, 2010.
[13] D. Kriegman, M.H. Yang, N. Ahuja, Detecting faces in images: A survey, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, No 1, pp. 34-58, 2002.
[14] E. Hjelmas, Face detection: A Survey, Computer Vision and Image Understanding, vol. 83, No 3, pp. 236-274, 2001.
[15] V. Pavlov, V. Khryashchev, E. Pavlov, L. Shmaglit, Application for Video Analysis Based on Machine Learning and Computer Vision Algorithms, Proceedings of the 14th Conference of Open Innovation Association FRUCT, pp. 90-100, 2013.
[16] P. Viola, M. Jones, Rapid object detection using a boosted cascade of simple features, Proc. Int. Conf. on Computer Vision and Pattern Recognition, vol. 1, pp. 511-518, 2001.
[17] P. Thukral, K. Mitra, R. Chellappa, A hierarchical approach for human age estimation, IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), pp. 1529-1532, 2012.
[18] W. Zhao, R. Chellappa, P. Phillips, A. Rosenfeld, Face recognition: A literature survey, ACM Computing Surveys (CSUR), vol. 35, No 4, pp. 399-458, 2003.
[19] B. Fasel, J. Luettin, Automatic Facial Expression Analysis: A Survey, Pattern Recognition Letters, vol. 36, No 1, pp. 259-275, 2003.
[20] M. Lyons, J. Budynek, A. Plante, S. Akamatsu, Classifying Facial Attributes Using a 2-d Gabor Wavelet Representation and Discriminant Analysis, Proc. Int. Conf. on Automatic Face and Gesture Recognition, pp. 202-207, 2000.
[21] A. Jain, J. Huang, Integrating Independent Components and Linear Discriminant Analysis for Gender Classification, Proc. Int. Conf. on Automatic Face and Gesture Recognition, pp. 159-163, 2004.
[22] Y. Saatci, C. Town, Cascaded Classification of Gender and Facial Expression Using Active Appearance Models, Proc. Int. Conf. on Automatic Face and Gesture Recognition, pp. 393-400, 2006.
[23] Z. Sun, G. Bebis, X. Yuan, S.J. Louis, Genetic Feature Subset Selection for Gender Classification: A Comparison Study, Proc. IEEE Workshop on Applications of Computer Vision, pp. 165-170, 2002.
[24] C. Burges, A Tutorial on Support Vector Machines for Pattern Recognition, Data Mining and Knowledge Discovery, vol. 2, pp. 121-167, 1998.
[25] N. Sun et al, Gender Classification Based on Boosting Local Binary Pattern, Proc. Int. Symposium on Neural Networks, vol. 2, pp. 194-201, 2006.
[26] S. Gutta, H. Wechsler, P.J. Phillips, Gender and Ethnic Classification of Face Images, Proc. Int. Conf. on Automatic Face and Gesture Recognition, pp. 194-199, 1998.
[27] P.J. Phillips et al, The FERET Evaluation Methodology for Face Recognition Algorithms, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, No 10, pp. 1090-1104, 2000.
[28] H. Gao, J. Davis, Why Direct LDA is not Equivalent to LDA, Pattern Recognition Letters, vol. 39, No 5, pp. 1002-1006, 2006.
[29] T. Fawcett, ROC Graphs: Notes and Practical Considerations for Researchers, Pattern Recognition Letters, vol. 27, No 8, pp. 882-891, 2004.