3D Head Tracking Using Motion Adaptive Texture-Mapping

Lisa M. Brown
IBM T.J. Watson Research Center
lisab@us.ibm.com

Abstract

We have developed a fast, robust 3D head tracking system based on rendering a texture-mapped cylinder. In order to handle the variable frame-to-frame motion changes, the system uses motion templates, which adapt to the current size of the motion increment. The relationship between measurable pixel energy and tracking error is used to design the parameters of the adaptive algorithm. To speed up processing and decouple rotational and translational motion, 2D positional information from the neckline is utilized. The confidence at each point is computed based on the amount of information used in creating the texture map and re-rendering the face. The system can also handle large out-of-plane rotations via additional templates. If the tracker fails, it can recover using an independent face detection routine. We compare the results of our approach with the extensive results of a closely related technique.

Keywords: 3D head tracking, real-time motion estimation, robust face modeling.

1. Introduction

Several applications in computer vision will benefit from an accurate and fast 3D head tracking system. These include pose-invariant face recognition [1-8], facial expression analysis [9-10], input routines for head and facial animation [10-13], facial compression systems for teleconferencing [14], and human-computer interfaces [15-16]. Although several investigators have recently developed methods to perform 3D head tracking [17-19], there is still a need to improve the accuracy and reliability of such systems. This is the task we have undertaken. In particular, we are interested in 3D head tracking as part of a larger, multi-scale human tracking system being developed at IBM, called PeopleVision [20]. This multi-camera system is being designed to determine a range of human activities, from how many people are in a particular space, to who is there and what they are doing.
From the viewpoint of this project, head tracking needs to be designed so that, on the one hand, it is useful for pose-invariant (or dynamic) face recognition, and on the other hand, it can exploit larger-scale information such as the absolute position of the head or the relative position of the head to the shoulders.

From the point of view of face recognition, commercial systems are, by and large, still pose dependent. Visionics can recognize people for poses of less than 20 degrees. Recent research has made significant strides towards improved pose invariance. Wen-Yi Zhao and Chellappa [1], Chung et al. [2], and Senior [3] have improved the robustness of face recognition to small changes in pose or lighting. Demir et al. [4] achieve pose invariance based on matching the pose of the input with a similar pose recorded in the database. Feng et al. [5] and Kouzani [6] achieve pose invariance by determining the pose of the input image, normalizing this image to a frontal pose, and testing against the original database of frontal images. All of these systems are limited, either by a small range of pose, an extensive database, or, in the latter case, by the accuracy of the normalized pose. Two very recent works have pushed the state of the art forward. Georghiades et al. [7] generate synthetic poses and lighting conditions with large variations from a small number of images of each subject. These are used to build a representation of the statistical space spanned by each subject under all lighting conditions and poses. Okada et al. [8] create a PCMAP of each face in the database from a large number of images of each subject. The pose of the input test image is determined by finding the optimal PCMAP using a classification method. The work of both Georghiades and Okada is appearance- or view-based and extends the current statistical foundation of face recognition. Their emphasis is on stand-alone face recognition systems. These approaches rely on statistical information and ignore the geometric reality, yet they are effective for the current state of the art in face recognition.
Our approach to head tracking is an attempt to bridge the gap between the view-based statistical techniques and the geometric model- and feature-based methods. We would like to ultimately integrate our system into multi-scale human tracking and at the same time achieve dynamic face recognition.

2. Background

An excellent survey of face tracking systems can be found in [21]. Systems are categorized by their algorithm characteristics, ranging over: color blob, motion blob, depth blob, edge, feature, template, and optic flow. The first three types refer to systems which track color, motion, or depth clusters, respectively. Systems are also differentiated according to their recovery algorithms, which are similarly classified. Toyama makes two interesting conclusions. First, certain cues, like color and motion, are useful for positional tracking and recovery since they can be computed quickly, but they lack the precision and robustness necessary for full 3D tracking. Second, 3D tracking is predominantly based on either feature or template algorithms. Furthermore, template techniques tend to be more robust, while feature-based methods can be more efficiently implemented.

The method we have developed primarily uses templates. As a template approach, the dense comparison of corresponding pixels between image and template makes the method robust against individual variation and small non-rigid motion. It can also be made robust against illumination variations by the use of subject-independent illumination templates. On the other hand, we are able to track efficiently by exploiting widely available texture-mapping hardware and a straightforward least-squares computation. In addition, we use an independent estimate of 2D position to speed up the processing. Our three-dimensional head tracker is an extension of the method designed and thoroughly tested by La Cascia et al. [22]. This technique is based on mapping the face onto a cylindrical model and estimating the change in pose with respect to the incremental difference in the re-rendered texture maps.
We have studied the cases in which the accuracy of the system degenerates and have designed our system to handle these problems. The primary sources of error in the original system are due to (1) the inability to distinguish rotations around the horizontal axis from vertical translations, or similarly rotations around the vertical axis from horizontal translations, (2) very large rotations around the vertical axis, and (3) large frame-to-frame pose changes. Error sources of the first and third kind are reduced via adaptive templates. Errors of the first kind are also reduced by using an independent measure of two-dimensional pose based on feature tracking along the neckline. The information to distinguish small rotations from their counterpart translations is not sufficient using frame-to-frame differences, but these can be roughly estimated using neckline features and then refined using the full 3D tracker. When the source of error is due to large rotations around the vertical axis, the original system fails because the face is not truly cylindrical. This is corrected by updating the motion templates when rotation around the vertical axis is sufficiently large. We show several examples where the original method fails but is robust using adaptive/additional templates and 2D positional information. The data used was created by La Cascia et al. and is publicly available at [23].

3. Motion Adaptive Templates

Our system is based on the creation of a set of motion templates. Initially, a frontal image of the person to be tracked is acquired using a face detection algorithm. This image is used to create a texture map for a cylindrical model of the head. This texture map is then used to render the face at small perturbations in pose, in particular along each of the six degrees of freedom of the motion: three translations and three rotations. Each motion template is the difference image between the original frontal image and the re-rendered texture map of the head at a slightly different pose, based on a small change in one of the motion parameters.
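The template construction described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the `render` function below uses a sub-pixel 2D image shift as a stand-in for re-rendering the texture-mapped cylinder, so it only covers the two translational parameters, and the perturbation sizes are arbitrary.

```python
import numpy as np

def render(texture, params):
    """Stand-in for re-rendering the texture-mapped cylinder at a given
    pose.  A bilinear sub-pixel shift approximates small x/y translations;
    the real system renders all six degrees of freedom in 3D."""
    dx, dy = params[0], params[1]   # this sketch handles x/y translation only
    h, w = texture.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(float)
    xs = np.clip(xs - dx, 0, w - 1)
    ys = np.clip(ys - dy, 0, h - 1)
    x0, y0 = xs.astype(int), ys.astype(int)
    x1, y1 = np.minimum(x0 + 1, w - 1), np.minimum(y0 + 1, h - 1)
    fx, fy = xs - x0, ys - y0
    return ((1 - fx) * (1 - fy) * texture[y0, x0] +
            fx * (1 - fy) * texture[y0, x1] +
            (1 - fx) * fy * texture[y1, x0] +
            fx * fy * texture[y1, x1])

def make_motion_templates(frontal, deltas):
    """One difference template per perturbation: render the texture at a
    slightly different pose and subtract the unperturbed rendering.
    Four templates per parameter, at -2d, -d, +d, +2d."""
    base = render(frontal, np.zeros(len(deltas)))
    templates = []
    for k, dk in enumerate(deltas):
        for step in (-2 * dk, -dk, dk, 2 * dk):
            p = np.zeros(len(deltas))
            p[k] = step
            templates.append(render(frontal, p) - base)
    return np.stack(templates)
```

With two parameters this yields eight difference images; the full system builds 24 (four per degree of freedom).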
Each subsequent frame of the video sequence or camera input is used to create a texture map for a cylindrical model of the head, based on the previous estimate of the pose. The difference between the frontal view of this texture map and the frontal view of the original (frontal) texture map is used as a measure of the motion that occurred between the previous frame and the current one. The system determines the best linear combination of templates that accounts for this motion. This linear combination is an accurate estimate of the motion as long as the motion is small, the motions are not too tightly coupled, the head moves rigidly, and the head and face behave approximately like a cylinder.

We compute the best linear combination using least squares, based on the normal equations solved by a straightforward Gauss-Jordan elimination. Originally, we performed the fit using SVD, which, although in general more robust, was in this case unnecessary and significantly slower. The majority of the time required to compute the motion parameters was the time required by the least-squares fit calculation.

We found our system to be highly sensitive to the size of the perturbations in the motion templates. Following the notation used by La Cascia et al., we create 4 motion templates for each of the six motion parameters, changing
the k-th parameter by ±δ_k and ±2δ_k. In La Cascia et al., the δ's were set so that corresponding difference images for different motion parameters would have the same energy. This was based on the work on view-based active appearance models performed by Cootes et al. [24]. It was also verified in our own analysis. However, even if this relative perturbation size is maintained, the absolute value of these δ's can alter which video sequences can be tracked. In Figure 1, the video sequence vam2.avi is tracked using two sets of deltas. For one set (the larger) the head is tracked successfully. For the smaller set, the track is lost by frame 150. For a different sequence, the smaller set of deltas is required to successfully track the head, while the larger set causes it to fail.

Figure 1. Results of the 3D head tracker using two sets of motion-template perturbation deltas (panels: x, y, and z translation; pitch, yaw, and roll). The light solid line is the ground truth. The dark solid line represents the results with the smaller deltas, which lose the track by frame 150. The dashed line shows the results with the larger set, which successfully tracks the head.

Figure 2. Tracking error (solid line, left axis) compared with pixel-difference energy (dotted line, right axis) over 100 frames of head tracking.

Based on the relationship between template energy and the relative perturbation size, we decided to investigate the relationship between tracking error and pixel-difference energy (between the current frame mapped frontally and the original frontal frame). We define the tracking error as the sum of the rotational and translational errors. These are each defined as a function of the Mahalanobis distance between the estimated and measured position and orientation, respectively [22].
For example, the error in translation is given by:

e_t = [ (x_t − x̃_t)^T Σ⁻¹ (x_t − x̃_t) ]^(1/2)

where x_t and x̃_t are the estimate of the 3D position and the ground truth, and Σ is the covariance computed over the entire set of ground-truth data. In Figure 2, this relationship is plotted over 100 frames for the original head tracking system. The solid line represents the tracking error, whose units are shown on the left-hand side. The dotted line represents the pixel energy; units are on the right. The two values appear linearly related, in particular for the smaller tracking errors where the system is successful. We conclude from this
relationship that the perturbation size needs to adapt to the tracking error, so that the system can correct for motion changes of different sizes. If the motion is too large, or the system has failed to track correctly, templates created with larger perturbations are necessary to bring the system back to the smaller tracking-error regime.

4. Confidence Maps

Each frame is used to create a texture map and is then projected frontally. The frame is inversely projected onto the texture map based on the current estimate of the pose, given by the 6 motion parameters. If the pose is frontal, then the creation of the texture map is simply the reverse of the projection, and the information given in the image has uniform confidence. However, as the pose in the current frame varies, the confidence map is defined as the product of the two processes: the inverse projection onto the texture map, and the forward projection onto the image. In both processes, we define the confidence to be the ratio of the area of an element in the input over the area of the corresponding element of the output. In other words, our confidence is greatest if the input used to compute the output has the greatest relative area. For forward projection, image points derived from texture-map points which are viewed obliquely have the highest confidence, because a large area of the texture map is evaluated. For inverse projection, texture-map points derived from image points which represent points viewed frontally have the highest confidence.

For the moment, let us consider this ratio for the frontal (forward) perspective projection, as seen in Figure 3. We define the image plane by P · x = d, where P is the unit normal to the plane, x is an arbitrary point on the plane, and d is a constant. Let t be a point on the texture-mapped cylinder, and let R be the unit radius vector from the axis of the cylinder to t; R is therefore normal to the surface of the cylinder. Construct the unit vector U = (i − v) / |i − v| that points from the viewpoint, v, to the image point, i, and consequently to the texture-mapped cylinder point, t. The line L(k) = v + Uk runs from the viewpoint through the image point, intersecting the cylinder. We find the point of intersection, t = v + U ℓ_t, by solving for ℓ_t such that t lies on the cylinder, t = [r cos θ, h, r sin θ], where r is the radius of the cylinder and h is in the range (0, H), H being the height of the cylinder. The cylinder point, t, is at a distance ℓ_t = |t − v| from the viewpoint. Similarly, the image point is located at a distance ℓ_i = |i − v|. The ratio of the area element on the texture-mapped cylinder to the projection of that element on the image plane is then given by the following formula:

A_t / A_i = (ℓ_t / ℓ_i)² (U · P) / (U · R)    (1)

The area grows with the square of the distance and inversely with the projection of the surface normal onto the view direction.

The result of this computation is shown in Figure 4b. The intensity is scaled to improve the visualization. The largest confidence values occur where the face is viewed most obliquely. We also mask this image with an ellipse with major axis equal to half the height of the cylinder and minor axis equal to the radius. In this way, we limit our measurements to facial pixels.
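Equation (1) can be evaluated per pixel to produce the forward-projection confidence map. The sketch below assumes an illustrative geometry that is not taken from the paper: viewpoint at the origin, image plane z = f with normal P = (0, 0, 1), and a cylinder of radius r whose vertical axis passes through (0, 0, z0); the elliptical mask is likewise approximated in normalized image coordinates.

```python
import numpy as np

def forward_confidence(n=33, f=1.0, r=1.0, z0=2.0):
    """Per-pixel confidence of the frontal (forward) projection from
    Equation (1): (l_t / l_i)^2 * (U.P) / (U.R).  All dimensions are
    illustrative."""
    xs = np.linspace(-0.5, 0.5, n)
    X, Y = np.meshgrid(xs, xs)
    i_pts = np.stack([X, Y, np.full_like(X, f)], axis=-1)  # image points i
    l_i = np.linalg.norm(i_pts, axis=-1)                   # |i - v|, v = 0
    U = i_pts / l_i[..., None]                             # unit view rays
    # Ray/cylinder intersection in the xz-plane (axis along y):
    # |v + U*k - axis|^2 = r^2 with v = 0 gives a quadratic in k.
    a = U[..., 0]**2 + U[..., 2]**2
    b = -2.0 * z0 * U[..., 2]
    c = z0**2 - r**2
    disc = b**2 - 4.0 * a * c
    hit = disc > 0
    l_t = np.where(hit, (-b - np.sqrt(np.where(hit, disc, 0.0))) / (2 * a), 1.0)
    t = U * l_t[..., None]                                 # cylinder point
    R = np.stack([t[..., 0], np.zeros_like(X), t[..., 2] - z0], axis=-1) / r
    cosP = U[..., 2]                                       # U . P
    cosR = np.abs((U * R).sum(axis=-1))                    # |U . R|
    conf = (l_t / l_i)**2 * cosP / np.maximum(cosR, 1e-6)
    # Elliptical face mask (minor axis ~ radius, major axis ~ half height),
    # expressed here in normalized image coordinates.
    mask = (X / 0.5)**2 + (Y / 0.5)**2 <= 1.0
    return np.where(hit & mask, conf, 0.0)
```

As the text describes, the values grow toward the silhouette of the cylinder, where U · R approaches zero and a large area of the texture map projects onto a single image pixel.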
In addition to the confidence values due to the frontal projection, we need to compute the confidence due to the inverse projection. This is simply the inverse of Equation 1, with two additional details. First, the cylinder is rigidly transformed, since the inverse projection depends on the current estimate of the pose. This can be implemented by performing the transformation with respect to the viewpoint and the image plane, thereby allowing us to use the same equations as before. The second detail is that the confidence measures need to be computed with respect to each point in the final frontal image, so that the cumulative confidence can be computed as the product. In particular, for each point in the frontal image, we determine the point on the cylinder to which it corresponds (when viewed frontally). We then compute the distances between the viewpoint and the image point, and between the viewpoint and the cylinder, and the viewing directions with respect to the current pose. An example of this computation is given in Figure 4a. In this case, the current pose has rotated 30 degrees around the horizontal axis. Notice how the greatest confidence values occur near the top, where the face is viewed most perpendicularly to the camera. The final confidence map is shown in Figure 4c. This map is used as the weighting in the least-squares computation.

5. Additional Templates

One of the primary sources of error in the original system was very large rotations around the vertical axis. These errors cause a problem for the system because the face is not truly cylindrical. In particular, when a person turns their head sharply left or right, their nose and its three-dimensional shape can be seen more clearly, and at the same time it obscures other parts of their face. The motion templates created from the original frontal image do not provide this information: the difference image cannot be accounted for using the original templates. In addition, the amount of useful information with high confidence in both images is low, since what is frontal in one is along the side of the other.
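The confidence map of Section 4 is used as the per-pixel weighting in the least-squares fit of Section 3. A minimal sketch of that weighted fit is below; `np.linalg.solve` stands in for the Gauss-Jordan elimination of the normal equations used in the paper, which is an equivalent direct solve.

```python
import numpy as np

def estimate_motion(templates, diff, confidence):
    """Confidence-weighted least squares: find the linear combination of
    motion templates that best explains the difference image, weighting
    each pixel by its confidence.  Solves the normal equations
    (T' W T) a = T' W d directly."""
    T = templates.reshape(len(templates), -1).T   # pixels x templates
    d = diff.ravel()                              # observed difference image
    w = confidence.ravel()                        # per-pixel weights
    A = T.T @ (w[:, None] * T)                    # T' W T
    b = T.T @ (w * d)                             # T' W d
    return np.linalg.solve(A, b)                  # motion coefficients
```

When the difference image truly is a linear combination of the templates, the fit recovers the coefficients exactly; pixels with zero confidence simply drop out of the normal equations.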
We have found that we can reduce the tracking error due to large rotation around the vertical axis by creating new motion templates, either on the fly when the rotation grows too large, or beforehand using additional footage processed offline. For the sequence jim1.avi, the original tracker failed by frame 60, as shown in Figure 5 by the dark dotted line. The ground truth, based on a magnetic tracker, is shown as a solid noisy line. Using additional motion templates when the rotation exceeded 10 degrees allowed the system, depicted as a dashed line, to continue to track. We also tested the utility of adding additional templates when the rotation exceeded 5 degrees, shown as the dotted line. As can be seen in the results, the difference is negligible.

6. 2D Positional Information

The remaining source of error is due to the inability of the original system to differentiate small rotations from their visually similar translations. However, since we envision our system as part of a larger human tracking project, we have available information regarding the 2D position of the neckline. This is computed using a standard Lucas-Kanade feature tracker. This information is used in conjunction with the 3D head tracker estimates to derive a more robust 3D positional estimate. The x and y estimates computed by the neckline feature detector are averaged with the current estimates computed by the tracker prior to the least-squares fit. In Figure 6, we show the results of the head tracker algorithm on the challenging vam4.avi sequence, a sequence which includes rotation along two axes, coupled rotation about the vertical axis and translation along x, and large, fast rotational motion. Our system is able to cope with the coupled rotation/translation seen in frames 30-100. It also handles simultaneous rotations around y and z in frames 50-100. But it fails to track when the rotational motion around the y-axis is too fast and is coupled with horizontal translation and rotation around the z-axis.

7. Recovery

A face detection scheme based on a template search across an image pyramid [24] was integrated into the system in two ways. It is initially used to detect the face in the sequence, in order to initialize the size and location of the cylinder. It is also used to recover the pose when the tracker fails. An image pyramid is constructed representing the image at different resolutions, with neighboring scales differing
by a factor of 1.2. The size of the pyramid can be limited by domain constraints on the sizes of faces to be expected.

Figure 5. Results of the 3D head tracking system augmented with additional templates (panels: x, y, and z translation; pitch, yaw, and roll). Dark dots represent the original system, which lost the track by frame 60. The solid noisy line is the ground truth. The dashed line represents the system with additional templates every 10 degrees of yaw; the nearly identical dotted line represents the system with additional templates every 5 degrees of yaw.

The face detector is applied at each location in the image pyramid to determine whether the surrounding square represents a face. The detector applied at each such candidate region consists of a hierarchy of binary classifiers, each seeking to filter out a proportion of the non-faces while retaining the true faces. The first detector is a skin-tone filter that determines whether the pixels of the candidate region have coloring consistent with skin. Each pixel is independently classified as skin tone or not, and the candidate region is rejected if it contains an insufficient proportion of skin-tone pixels (typically 60-80%). Subsequently, a linear discriminant trained on face and non-face exemplars is applied to the gray-level pixels. The linear discriminant is fast and removes a significant proportion of the non-face images. Next, a Distance From Face Space [25] measure is used to further filter the exemplars, and finally a combination of the two scores is used to rank the overlapping candidates that have been retained, with only the highest-scoring candidate being kept. The pose parameters of the face detections are refined by a local search to maximize the combined score over the space of small perturbations in scale, location, and rotation about the z-axis. Some rotation about the x- and y-axes is accommodated, but not estimated, by extending the search to small perturbations in aspect ratio.
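The filter hierarchy described above can be sketched as follows. This is a structural illustration only: the skin-tone rule, the discriminant vector, the face-space basis, and all thresholds are hypothetical placeholders, not the trained models used in the actual detector.

```python
import numpy as np

def skin_fraction(region_rgb):
    """Fraction of pixels passing a crude per-pixel skin-tone test
    (a stand-in for the paper's skin classifier)."""
    r, g, b = region_rgb[..., 0], region_rgb[..., 1], region_rgb[..., 2]
    skin = (r > 95) & (g > 40) & (b > 20) & (r > g) & (r > b)
    return skin.mean()

def detect_faces(candidates, w_lda, mean_face, basis,
                 skin_thresh=0.6, lda_thresh=0.0):
    """Hierarchy of binary filters: skin tone, then a linear discriminant
    on gray levels, then Distance From Face Space; surviving candidates
    are ranked by a combined score and the best index is returned."""
    best, best_score = None, -np.inf
    for idx, (rgb, gray) in enumerate(candidates):
        if skin_fraction(rgb) < skin_thresh:        # filter 1: skin tone
            continue
        x = gray.ravel().astype(float)
        lda = w_lda @ x                             # filter 2: linear discriminant
        if lda < lda_thresh:
            continue
        centered = x - mean_face                    # filter 3: distance from
        resid = centered - basis.T @ (basis @ centered)   # face space (DFFS)
        dffs = np.linalg.norm(resid)
        score = lda - dffs                          # combined ranking score
        if score > best_score:
            best, best_score = idx, score
    return best
```

Each stage is cheaper than the next would be on the candidates it rejects, which is the point of ordering the cascade from fast, coarse filters to slower, more discriminative ones.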
Tracking is carried out by a similar local search, after predicting the face's new location with a constant-velocity model and the current frame rate.

8. Conclusions

We have demonstrated the efficacy of improving 3D head tracking based on texture mapping onto a cylinder
Figure 6. Results of the 3D head tracker integrated with 2D pose information from a face detection algorithm (panels: x, y, and z translation; pitch, yaw, and roll). The light solid line represents ground truth; the dark dotted line represents the results of the tracker.

using confidence maps derived from the geometry, adaptive motion templates based on the pixel energy, additional templates for large rotations, and external 2D positional information. Confidence maps derived from the geometry of the mapping enable the system to measure only the overlapping and related portions of two frames which may have a very large pose difference. Adaptive motion templates based on the pixel-difference energy improve the system's ability to tolerate large motions and small tracking errors. Additional templates for large rotations around the vertical axis compensate for the 3D structural properties of the head. Two-dimensional position information enables the system to decouple small rotations from their visually similar translations and simultaneously speeds up the computation. We would like to evaluate the benefits of this methodology more extensively through a more thorough analysis of a large data set. Ultimately, we would like to integrate this system into our multi-scale people tracking project, using information about a person as they approach the camera and determining the identity of the person as they travel through the perceptual environment.

9. Bibliography

[1] Wen-Yi Zhao, Chellappa, R., SFS Based View Synthesis for Robust Face Recognition, Proc. 4th IEEE Conf. on Automatic Face and Gesture Recognition, Grenoble, France, March 2000, p. 277-284.

[2] Chung, K-C, Kee, S.C., Kim, S.R., Face Recognition Using Principal Component Analysis of Gabor Filter Responses, Proc. of the Int'l Workshop on Recognition, Analysis and Tracking of Faces and Gestures in Real-time Systems, Corfu, Greece, Sept. 1999, p. 53-57.
[3] Senior, A.W., Recognizing Faces in Broadcast Video, Proc. of the Int'l Workshop on Recognition, Analysis and Tracking of Faces and Gestures in Real-time Systems, Corfu, Greece, Sept. 1999, p. 105-110.

[4] Demir, E., Akarun, L., Alpaydin, E., Two Stage Approach for Pose Invariant Face Recognition, 2000 IEEE Int'l Conf. on Acoustics, Speech and Signal Processing, Vol. 4, Istanbul, Turkey, 5-9 June 2000, p. 2343-4.

[5] Feng, G.C., Yuen, P.C., Recognition of Head and Shoulder Face Image Using Virtual Frontal View Image, IEEE Trans. Syst. Man Cybern. A: Syst. Humans, Vol. 30, No. 6, Nov. 2000, p. 871-82.
[6] Kouzani, A.Z., He, F., Sammut, K., Towards Invariant Face Recognition, Information Sciences, Vol. 123, 2000, p. 75-101.

[7] Georghiades, A.S., Belhumeur, P.N., and Kriegman, D.J., From Few to Many: Generative Models for Recognition Under Variable Pose and Illumination, Proc. 4th IEEE Conf. on Automatic Face and Gesture Recognition, Grenoble, France, March 2000, p. 277-284.

[8] Okada, K., Akamatsu, S., von der Malsburg, C., Analysis and Synthesis of Pose Variations of Human Faces by a Linear PCMAP Model and its Application for Pose Invariant Face Recognition System, Proc. 4th IEEE Conf. on Automatic Face and Gesture Recognition, Grenoble, France, March 2000, p. 142-149.

[9] Tian, Y-L, Facial Expression Analysis, PAMI.

[10] Chen, T., Audiovisual Speech Processing, IEEE Signal Processing, Vol. 18, No. 3, May 2001, p. 9-21.

[11] Vetter, T., Synthesis of Novel Views from a Single Face Image, Int'l J. Computer Vision, Vol. 28, No. 2, 1998, p. 103-116.

[12] Takacs, B., Fromherz, T., Tice, S., Metaxas, D., Digital Clones and Virtual Celebrities: Facial Tracking, Gesture Recognition and Animation for the Movie Industry, Proc. of the Int'l Workshop on Recognition, Analysis and Tracking of Faces and Gestures in Real-time Systems, Corfu, Greece, Sept. 1999, p. 169-176.

[13] Goto, T., Kshirsagar, Magnenat-Thalmann, N., Automatic Face Cloning and Animation, IEEE Signal Processing, Vol. 18, No. 3, May 2001, p. 17-25.

[14] Shin, M.C., Dmitry G. Kim, Carlos, Estimation of the MPEG-4 FAPs Using Point and Curve Features, IEEE Workshop on Human Modeling Analysis and Synthesis, Hilton Head Island, SC, June 2000, p. 59-64.

[15] Darrell, T., Gordon, G., Woodfill, J., Harville, M., A Virtual Mirror Interface using Real-time Robust Face Tracking.

[16] Sherrah, J., and Gong, S., Fusion of 2D Face Alignment and 3D Head Pose Estimation for Robust and Real-time Performance, Proc. of the Int'l Workshop on Recognition, Analysis and Tracking of Faces and Gestures in Real-time Systems, Corfu, Greece, Sept. 1999, p. 24-30.
[17] Jebara, T.S., and Pentland, A., Parameterized Structure from Motion for 3D Adaptive Feedback Tracking of Faces, Proc. Conf. Computer Vision and Pattern Recognition, 1997.

[18] Schödl, A., Haro, A., and Essa, I., Head Tracking Using a Textured Polygonal Model, Proc. 1998 Workshop on Perceptual User Interfaces, 1998.

[19] Dellaert, F., Thorpe, C., and Thrun, S., Jacobian Images of Super-Resolved Texture Maps for Model-Based Motion Estimation and Tracking, Proc. IEEE Workshop on Applications of Computer Vision, 1998.

[20] Hampapur, A., Senior, A., Brown, L., Tian, Y-L, Pankanti, S., PeopleVision: A Multiscale Human Perception System, http://arunh.userv.ibm.com/arpresentation.doc, 2001.

[21] Toyama, K., Prolegomena for Robust Face Tracking, Microsoft Research Technical Report MSR-TR-98-65, Nov. 13, 1998.

[22] La Cascia, M., Sclaroff, S., and Athitsos, V., Fast, Reliable Head Tracking under Varying Illumination: An Approach Based on Registration of Texture-Mapped 3D Models, IEEE PAMI, Vol. 22, No. 4, April 2000.

[23] http://www.cs.bu.edu/groups/ivc/headtracking

[24] Cootes, T.F., Walker, K., Taylor, C.J., View-Based Active Appearance Models, Proc. 5th European Conf. on Computer Vision, 1998.

[25] Senior, A.W., Recognizing Faces in Broadcast Video, Proc. Int'l Workshop on Recognition, Analysis and Tracking of Faces and Gestures in Real-Time Systems, 1999.

[26] Turk, M.A. and Pentland, A.P., Face Recognition Using Eigenfaces, Proc. IEEE Conf. on Computer Vision and Pattern Recognition, Hawaii, June 1992.