Face Alignment through Subspace Constrained Mean-Shifts


Jason M. Saragih, Simon Lucey, Jeffrey F. Cohn
The Robotics Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA

Abstract

Deformable model fitting has been actively pursued in the computer vision community for over a decade. As a result, numerous approaches have been proposed with varying degrees of success. A class of approaches that has shown substantial promise is one that makes independent predictions regarding locations of the model's landmarks, which are combined by enforcing a prior over their joint motion. A common theme in innovations to this approach is the replacement of the distribution of probable landmark locations, obtained from each local detector, with simpler parametric forms. This simplification substitutes the true objective with a smoothed version of itself, reducing sensitivity to local minima and outlying detections. In this work, a principled optimization strategy is proposed where a nonparametric representation of the landmark distributions is maximized within a hierarchy of smoothed estimates. The resulting update equations are reminiscent of mean-shift but with a subspace constraint placed on the shape's variability. This approach is shown to outperform other existing methods on the task of generic face fitting.

1. Introduction

Deformable model fitting is the problem of registering a parametrized shape model to an image such that its landmarks correspond to consistent locations on the object of interest. It is a difficult problem as it involves an optimization in high dimensions, where appearance can vary greatly between instances of the object due to lighting conditions, image noise, resolution and intrinsic sources of variability. Many approaches have been proposed for this with varying degrees of success. Of these, one of the most promising is one that uses a patch-based representation and assumes image observations made for each landmark are conditionally independent [2, 3, 4, 5, 16].
This leads to better generalization with limited data compared to holistic representations [10, 11, 14, 15], since it needs only account for local correlations between pixel values. However, it suffers from detection ambiguities as a direct result of its local representation. As such, care should be taken in combining detection results from the various local detectors in order to steer optimization towards the desired solution.

Our key contribution in this paper lies in the realization that a number of popular optimization strategies are all, in some way, simplifying the distribution of landmark locations obtained from each local detector using a parametric representation. The motivation of this simplification is to ensure that the approximate objective function: (i) exhibits properties that make optimization efficient and numerically stable, and (ii) still approximately preserves the true certainty/uncertainty associated with each local detector. The question then remains: how should one simplify these local distributions in order to satisfy (i) and (ii)? We address this by using a nonparametric representation that leads to an optimization in the form of subspace constrained mean-shifts.

2. Background

2.1. Constrained Local Models

Most fitting methods employ a linear approximation to how the shape of a non-rigid object deforms, coined the point distribution model (PDM) [2]. It models non-rigid shape variations linearly and composes them with a global rigid transformation, placing the shape in the image frame:

$$x_i = sR(\bar{x}_i + \Phi_i q) + t, \tag{1}$$

where $x_i$ denotes the 2D location of the PDM's $i$-th landmark and $p = \{s, R, t, q\}$ denotes the parameters of the PDM, which consist of a global scaling $s$, a rotation $R$, a translation $t$ and a set of non-rigid parameters $q$. Here $\bar{x}_i$ is the mean location of the $i$-th landmark and $\Phi_i$ is the part of the linear basis of non-rigid variation that acts on it.

In recent years, an approach that utilizes an ensemble of local detectors (see [2, 3, 4, 5, 16]) has attracted some interest as it circumvents many of the drawbacks of holistic approaches, such as modeling complexity and sensitivity to lighting changes. In this work, we will refer to these methods collectively as constrained local models (CLM)^1.

^1 This term should not be confused with the work in [5], which is a particular instance of CLM in our nomenclature.
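To make Equation (1) concrete, the following is a minimal NumPy sketch of how a PDM places its landmarks in the image frame. The function name `pdm_landmarks`, the array layout, and the use of a single in-plane angle `theta` to build $R$ are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def pdm_landmarks(mean_shape, basis, s, theta, t, q):
    """Evaluate Equation (1) for all landmarks at once.

    mean_shape : (n, 2) mean landmark locations (x_bar_i)
    basis      : (n, 2, m) non-rigid basis; basis[i] plays the role of Phi_i
    s, theta   : global scale and in-plane rotation angle in radians (assumed 2D)
    t          : (2,) translation
    q          : (m,) non-rigid parameters
    """
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    shape = mean_shape + basis @ q      # (n, 2): x_bar_i + Phi_i q
    return s * shape @ R.T + t          # (n, 2): s R (.) + t, one landmark per row

# Hypothetical three-landmark model with one deformation mode
# that moves landmark 1 along the x axis.
mean_shape = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
basis = np.zeros((3, 2, 1))
basis[1, 0, 0] = 1.0
x = pdm_landmarks(mean_shape, basis, s=2.0, theta=np.pi / 2,
                  t=np.array([10.0, 5.0]), q=np.array([0.5]))
```

Storing the basis per landmark keeps the correspondence with $\Phi_i$ explicit; a flattened $(2n \times m)$ matrix would work equally well.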
Figure 1. Illustration of CLM fitting and its two components: (i) an exhaustive local search for feature locations to get the response maps $\{p(l_i = \text{aligned} \mid I, x)\}_{i=1}^n$, and (ii) an optimization strategy to maximize the responses of the PDM constrained landmarks.

All instantiations of CLMs can be considered to be pursuing the same two goals: (i) perform an exhaustive local search for each PDM landmark around their current estimate using some kind of feature detector, and (ii) optimize the PDM parameters such that the detection responses over all of its landmarks are jointly maximized. Figure 1 illustrates the components of CLM fitting.

Exhaustive Local Search: In the first step of CLM fitting, a likelihood map is generated for each landmark position by applying local detectors to constrained regions around the current estimate. A number of feature detectors have been proposed for this purpose. One of the simplest, proposed in [16], is the linear logistic regressor, which gives the following response map for the $i$-th landmark^2:

$$p(l_i = \text{aligned} \mid I, x) = \frac{1}{1 + \exp\{\alpha C_i(I; x) + \beta\}}, \tag{2}$$

where $l_i$ is a discrete random variable denoting whether the $i$-th landmark is correctly aligned or not, $I$ is the image, $x$ is a 2D location in the image, and $C_i$ is a linear classifier:

$$C_i(I; x) = w_i^T [I(y_1); \ldots; I(y_m)] + b_i, \tag{3}$$

with $\{y_j\}_{j=1}^m \in \Omega_x$ (i.e. an image patch). An advantage of using this classifier is that the map can be computed using efficient convolution operations. Other feature detectors have also been used to great effect, such as the Gaussian likelihood [2] and the Haar-based boosted classifier [3].

Optimization: Once the response maps for each landmark have been found, by assuming conditional independence, optimization proceeds by maximizing:

$$p(\{l_i = \text{aligned}\}_{i=1}^n \mid p) = \prod_{i=1}^n p(l_i = \text{aligned} \mid x_i) \tag{4}$$

with respect to the PDM parameters $p$, where $x_i$ is parameterized as in Equation (1) and dependence on the image $I$ is dropped for succinctness. It should be noted that some forms of CLMs pose Equation (4) as minimizing the summation of local energy responses (see Section 2.2).

The main difficulty in this optimization is how to avoid local optima whilst affording an efficient evaluation. Treating Equation (4) as a generic optimization problem, one may be tempted to utilize general purpose optimization strategies here. However, as the responses are typically noisy, these optimization strategies have a tendency to be unstable. The simplex based method used in [4] has been shown to perform reasonably for this task since it is a gradient-free generic optimizer, which renders it somewhat insensitive to measurement noise. However, convergence may be slow when using this method, especially for a complex PDM with a large number of parameters.

^2 Not all CLM instances require a probabilistic output from the local detectors. Some, for example [2, 5], only require a similarity measure or a match score. However, these matching scores can be interpreted as the result of applying a monotonic function to the generating probability. For example, the Mahalanobis distance used in [2] is the negative log of the Gaussian likelihood. In the interest of clarity and succinctness, discussions in this work assume that responses are probabilities.

2.2. Optimization Strategies

In this section, a review of current methods for CLM optimization is presented. These methods entail replacing the true response maps, $\{p(l_i \mid x)\}_{i=1}^n$, with simpler parametric forms and performing optimization over these instead of the original response maps. As these parametric density estimates are a kind of smoothed version of the original responses, sensitivity to local minima is generally reduced.

Active Shape Models: The simplest optimization strategy for CLM fitting is that used in the Active Shape Model (ASM) [2]. The method entails first finding the location within each response map for which the maximum was attained: $\mu = [\mu_1; \ldots; \mu_n]$.
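The exhaustive local search of Equations (2) and (3), followed by the peak extraction that the ASM starts from, can be sketched as below. This is an illustrative NumPy version under several assumptions: the helper names are hypothetical, the search window is assumed to stay inside the image, $\alpha$ is taken to be negative so that a larger classifier score yields a larger response, and a plain double loop replaces the convolution an efficient implementation would use.

```python
import numpy as np

def response_map(image, center, w, b, alpha, beta, half_window):
    """Evaluate Equation (2) on a square search window around `center` (row, col).

    w is a (k, k) patch template with k odd; the linear classifier of
    Equation (3) is its inner product with the image patch plus the bias b.
    """
    r = w.shape[0] // 2
    size = 2 * half_window + 1
    resp = np.empty((size, size))
    for dy in range(-half_window, half_window + 1):
        for dx in range(-half_window, half_window + 1):
            y, x = center[0] + dy, center[1] + dx
            patch = image[y - r:y + r + 1, x - r:x + r + 1]
            score = np.sum(w * patch) + b                      # C_i(I; x)
            resp[dy + half_window, dx + half_window] = (
                1.0 / (1.0 + np.exp(alpha * score + beta)))    # Equation (2)
    return resp

def peak_location(resp, center, half_window):
    """Image coordinates (row, col) of the maximum response, i.e. mu_i."""
    iy, ix = np.unravel_index(np.argmax(resp), resp.shape)
    return np.array([center[0] + iy - half_window,
                     center[1] + ix - half_window])

# Toy image: a bright 3x3 blob centred at (12, 9), searched from (10, 10).
image = np.zeros((21, 21))
image[11:14, 8:11] = 1.0
resp = response_map(image, (10, 10), w=np.ones((3, 3)), b=0.0,
                    alpha=-1.0, beta=0.0, half_window=3)
mu = peak_location(resp, (10, 10), half_window=3)
```

In this toy setup the template matches the blob exactly at one location, so the peak is unique; real response maps are rarely this clean, which is what motivates the smoother approximations reviewed in this section.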
The objective of the optimization procedure is then to minimize the weighted least squares difference between the PDM and the coordinates of the peak responses:

$$Q(p) = \sum_{i=1}^n w_i \|x_i - \mu_i\|^2, \tag{5}$$

where the weights $\{w_i\}_{i=1}^n$ reflect the confidence over peak response coordinates and are typically set to some function of the responses at $\{\mu_i\}_{i=1}^n$, making it more resistant towards such things as partial occlusion, where occluded landmarks will be more weakly weighted. Equation (5) is iteratively minimized by taking a first order Taylor expansion of the PDM's landmarks:

$$x_i \approx x_i^c + J_i \Delta p, \tag{6}$$

and solving for the parameter update:

$$\Delta p = \left( \sum_{i=1}^n w_i J_i^T J_i \right)^{-1} \sum_{i=1}^n w_i J_i^T (\mu_i - x_i^c), \tag{7}$$

which is then applied additively to the current parameters: $p \leftarrow p + \Delta p$. Here, $J = [J_1; \ldots; J_n]$ is the Jacobian and $x^c = [x_1^c; \ldots; x_n^c]$ is the current shape estimate. From the probabilistic perspective introduced in Section 2.1, the ASM's optimization procedure is equivalent to approximating the response maps with an isotropic Gaussian estimator:

$$p(l_i = \text{aligned} \mid x) \approx \mathcal{N}(x; \mu_i, \sigma_i^2 I), \tag{8}$$

where $w_i = \sigma_i^{-2}$. With this approximation, taking the negative log of the likelihood in Equation (4) results in the objective in Equation (5).

Convex Quadratic Fitting: Although the approximation described above is simple and efficient, in some cases it may be a poor estimate of the true response map. Firstly, the landmark detectors, such as the linear classifier described in Section 2.1, are usually imperfect in the sense that the maximum of the response may not always coincide with the correct landmark location. Secondly, as the features used in detection consist of small image patches they often contain limited structure, leading to detection ambiguities. The simplest example of this is the aperture problem, where detection confidence across the edge is better than along it (see example response maps for the nose bridge and chin in Figure 2). To account for these problems, a method coined convex quadratic fitting (CQF) has been proposed recently [16]. The method fits a convex quadratic function to the negative log of the response map. This is equivalent to approximating the response map with a full covariance Gaussian:

$$p(l_i = \text{aligned} \mid x) \approx \mathcal{N}(x; \mu_i, \Sigma_i). \tag{9}$$

The mean and covariance are maximum likelihood estimates given the response map:

$$\Sigma_i = \sum_{x \in \Psi_{x_i^c}} \alpha^i_x (x - \mu_i)(x - \mu_i)^T; \quad \mu_i = \sum_{x \in \Psi_{x_i^c}} \alpha^i_x x, \tag{10}$$

where $\Psi_{x_i^c}$ is a 2D rectangular grid centered at the current landmark estimate $x_i^c$ (i.e. the search window), and:

$$\alpha^i_x = \frac{p(l_i = \text{aligned} \mid x)}{\sum_{y \in \Psi_{x_i^c}} p(l_i = \text{aligned} \mid y)}. \tag{11}$$

With this approximation, the objective can be written as the minimization of:

$$Q(\Delta p) = \sum_{i=1}^n \|x_i^c + J_i \Delta p - \mu_i\|^2_{\Sigma_i^{-1}}, \tag{12}$$

the solution of which is given by:

$$\Delta p = \left( \sum_{i=1}^n J_i^T \Sigma_i^{-1} J_i \right)^{-1} \sum_{i=1}^n J_i^T \Sigma_i^{-1} (\mu_i - x_i^c). \tag{13}$$

Figure 2. Response maps, $p(l_i = \text{aligned} \mid x)$, and their approximations used in various methods, for the outer left eye corner, the nose bridge and chin. Red crosses on the response maps denote the true landmark locations. The GMM approximation has five cluster centers. The KDE approximations are shown for $\sigma^2 \in \{20, 5, 1\}$.

A Gaussian Mixture Model Estimate: Although the response map approximation in CQF may overcome some of the drawbacks of ASM, its process of estimation can be poor in some cases. In particular, when the response map is strongly multimodal, such an approximation smoothes over the various modes (see the example response map for the eye corner in Figure 2). To account for this, in [8] a Gaussian mixture model (GMM) was used to approximate the response maps:

$$p(l_i = \text{aligned} \mid x) \approx \sum_{k=1}^{K_i} \pi_{ik} \, \mathcal{N}(x; \mu_{ik}, \Sigma_{ik}), \tag{14}$$

where $K_i$ denotes the number of modes and $\{\pi_{ik}\}_{k=1}^{K_i}$ are the mixing coefficients for the GMM of the $i$-th PDM landmark. Treating the mode membership for each landmark, $\{z_i\}_{i=1}^n$, as hidden variables, the maximum likelihood solution can be found using the expectation-maximization (EM) algorithm, which maximizes:

$$p(\{l_i\}_{i=1}^n \mid p) = \prod_{i=1}^n \sum_{k=1}^{K_i} p_i(z_i = k, l_i \mid x_i). \tag{15}$$

The E-step of the EM algorithm involves computing the posterior distribution over the latent variables $\{z_i\}_{i=1}^n$:

$$p(z_i = k \mid l_i, x_i) = \frac{p(z_i = k) \, p(l_i \mid z_i = k, x_i)}{\sum_{j=1}^{K_i} p(z_i = j) \, p(l_i \mid z_i = j, x_i)}, \tag{16}$$

where $p(z_i = k) = \pi_{ik}$ and:

$$p(l_i = \text{aligned} \mid z_i = k, x_i) = \mathcal{N}(x_i; \mu_{ik}, \Sigma_{ik}). \tag{17}$$
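As an illustration of the E-step in Equations (16) and (17), the sketch below computes the posterior over mode membership for a single landmark. The function names and the two-mode example are assumptions made for illustration only.

```python
import numpy as np

def gaussian_pdf(x, mean, cov):
    """Density of a 2D Gaussian N(x; mean, cov), as in Equation (17)."""
    d = x - mean
    norm = 2.0 * np.pi * np.sqrt(np.linalg.det(cov))
    return np.exp(-0.5 * d @ np.linalg.solve(cov, d)) / norm

def responsibilities(x, weights, means, covs):
    """Posterior over the mode membership z_i of one landmark (Equation 16)."""
    joint = np.array([pi_k * gaussian_pdf(x, mu_k, cov_k)
                      for pi_k, mu_k, cov_k in zip(weights, means, covs)])
    return joint / joint.sum()

# Hypothetical landmark with two equally weighted isotropic modes.
weights = np.array([0.5, 0.5])
means = np.array([[0.0, 0.0], [2.0, 0.0]])
covs = np.array([np.eye(2), np.eye(2)])
w_mid = responsibilities(np.array([1.0, 0.0]), weights, means, covs)
w_at_first = responsibilities(np.array([0.0, 0.0]), weights, means, covs)
```

A landmark estimate halfway between the two modes is assigned to each with equal probability, while an estimate sitting on the first mode is dominated by it. These posteriors are the quantities that weight each mode in the subsequent M-step.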
In the M-step, the expectation of the negative log of the complete data is minimized:

$$Q(p) = E_{q(z)} \left[ -\log \left\{ \prod_{i=1}^n p(l_i = \text{aligned}, z_i \mid x_i) \right\} \right], \tag{18}$$

where $q(z) = \prod_{i=1}^n p_i(z_i \mid l_i = \text{aligned}, x_i)$. Linearizing the shape model as in Equation (6), this Q-function takes the form:

$$Q(\Delta p) \propto \sum_{i=1}^n \sum_{k=1}^{K_i} w_{ik} \|J_i \Delta p - y_{ik}\|^2_{\Sigma_{ik}^{-1}} + \text{const}, \tag{19}$$

where $w_{ik} = p_i(z_i = k \mid l_i = \text{aligned}, x_i)$ and $y_{ik} = \mu_{ik} - x_i^c$, the solution of which is given by:

$$\Delta p = \left( \sum_{i=1}^n \sum_{k=1}^{K_i} w_{ik} J_i^T \Sigma_{ik}^{-1} J_i \right)^{-1} \sum_{i=1}^n \sum_{k=1}^{K_i} w_{ik} J_i^T \Sigma_{ik}^{-1} y_{ik}. \tag{20}$$

Although the GMM is a better approximation of the response map compared to the Gaussian approximation in CQF, it exhibits two major drawbacks. Firstly, the process of estimating the GMM parameters from the response maps is a nonlinear optimization in itself. It is only locally convergent and requires the number of modes to be chosen a priori. As GMM fitting is required for each PDM landmark, it constitutes a large computational overhead. Although some approximations can be made, they are generally suboptimal. For example, in [8], the modes are chosen as the $K_i$ largest responses in the map. The covariances are parametrized isotropically, with their variance heuristically set as the scaled distance to the closest mode in the previous iteration of the CLM fitting algorithm. Such an approximation allows an efficient estimate of the GMM parameters without the need for a costly EM procedure, at the cost of a poorer approximation of the true response map.

The second drawback of the GMM response map approximation is that the approximated objective in Equation (15) is multimodal. As such, CLM fitting with the GMM simplification is prone to terminating in local optima. Although good results were reported in [8], in that work the PDM was parameterized using a mixture model as opposed to the more typical Gaussian parameterization, which places a stronger prior on the way the shape can vary.

3. Subspace Constrained Mean-Shifts

Rather than approximating the response maps for each PDM landmark using parametric models, we consider here the use of a nonparametric representation.
In particular, we propose the use of a homoscedastic kernel density estimate (KDE) with an isotropic Gaussian kernel:

$$p(l_i = \text{aligned} \mid x) \approx \sum_{\mu_i \in \Psi_{x_i^c}} \alpha^i_{\mu_i} \, \mathcal{N}(x; \mu_i, \sigma^2 I), \tag{21}$$

where $\alpha^i_{\mu_i}$ is the normalized true detector response defined in Equation (11). With this representation the kernel centers are fixed as defined through $\Psi_{x_i^c}$ (i.e. the grid nodes of the search window). The mixing weights, $\alpha^i_{\mu_i}$, can be obtained directly from the true response map. Since the response is an estimate of the probability that a particular location in the image is the aligned landmark location, such a choice for the mixing coefficients is reasonable.

Compared to parametric representations, KDE has the advantage that no nonlinear optimization is required to learn the parameters of its representation. The only remaining free parameter is the variance of the Gaussian kernel, $\sigma^2$, which regulates the smoothness of the approximation. Since one of the main problems with a GMM based representation is the computational complexity and suboptimal nature of fitting a mixture model to the response maps, if $\sigma^2$ is set a priori, then optimizing over the KDE can be expected to be more stable and efficient.

Maximizing the objective in Equation (4) with a KDE representation is nontrivial as the objective is nonlinear and typically multimodal. However, in the case where no shape prior is placed on the way the PDM's landmarks can vary, the problem reverts to independent maximizations of the KDE for each landmark location separately. This is because the landmark detections are assumed to be independent, conditioned on the PDM's parameterization. A common approach for maximization over a KDE is to use the well known mean-shift algorithm [1]. It consists of fixed point iterations of the form:

$$x_i^{(\tau+1)} \leftarrow \sum_{\mu_i \in \Psi_{x_i^c}} \frac{\alpha^i_{\mu_i} \, \mathcal{N}(x_i^{(\tau)}; \mu_i, \sigma^2 I)}{\sum_{y \in \Psi_{x_i^c}} \alpha^i_y \, \mathcal{N}(x_i^{(\tau)}; y, \sigma^2 I)} \, \mu_i, \tag{22}$$

where $\tau$ denotes the time-step in the iterative process. This fixed point iteration scheme finds a mode of the KDE, where an improvement is guaranteed at each step by virtue of its interpretation as a lower bound maximization [6].
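The fixed point iteration of Equation (22) for a single, unconstrained landmark can be sketched as follows. The function names, the stopping rule and the synthetic response map are illustrative assumptions; the update itself follows Equation (22), with the Gaussian normalizing constant omitted because it cancels in the ratio.

```python
import numpy as np

def mean_shift_step(x, nodes, alpha, sigma2):
    """One update of Equation (22).

    x      : (2,) current landmark estimate
    nodes  : (N, 2) grid node coordinates of the search window
    alpha  : (N,) normalized detector responses at the nodes (Equation 11)
    sigma2 : variance of the isotropic Gaussian kernel
    """
    d2 = np.sum((nodes - x) ** 2, axis=1)
    k = alpha * np.exp(-0.5 * d2 / sigma2)   # unnormalized kernel weights
    return k @ nodes / k.sum()

def mean_shift(x0, nodes, alpha, sigma2, tol=1e-6, max_iter=100):
    """Iterate Equation (22) until the update is smaller than `tol`."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        x_new = mean_shift_step(x, nodes, alpha, sigma2)
        if np.linalg.norm(x_new - x) < tol:
            return x_new
        x = x_new
    return x

# Synthetic 15x15 search window whose response peaks at offset (3, -2).
offsets = np.arange(-7, 8)
gx, gy = np.meshgrid(offsets, offsets)
nodes = np.column_stack([gx.ravel(), gy.ravel()]).astype(float)
resp = np.exp(-0.5 * np.sum((nodes - np.array([3.0, -2.0])) ** 2, axis=1))
alpha = resp / resp.sum()
mode = mean_shift([0.0, 0.0], nodes, alpha, sigma2=2.0)
```

Note that no step size appears anywhere: each iterate is simply a weighted average of the grid nodes.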
Compared to other optimization strategies, mean-shift is an attractive choice as it does not use a step size parameter or a line search. Equation (22) is simply applied iteratively until some convergence criterion is met.

To incorporate the shape model constraint into the optimization procedure, one might consider a two step strategy: (i) compute the mean-shift update for each landmark, and (ii) constrain the mean-shifted landmarks to adhere to the PDM's parameterization using a least-squares fit:

$$Q(p) = \sum_{i=1}^n \|x_i - x_i^{(\tau+1)}\|^2. \tag{23}$$

This is reminiscent of the ASM optimization strategy, where the location of the response map's peak is replaced with the mean-shifted estimate. Although such a strategy is attractive in its simplicity, it is unclear how it relates to the global objective in Equation (4).

Given the form of the KDE representation in Equation (21), one can treat it simply as a GMM. As such, the discussions in Section 2.2 on GMMs are directly applicable here, replacing the number of candidates $K_i$ with the number of grid nodes in the search window $\Psi_{x_i^c}$, the mixture weights $\pi_{ik}$ with $\alpha^i_{\mu_i}$, and the covariances $\Sigma_{ik}$ with the scaled identity $\sigma^2 I$. When using the linearized shape model in Equation (6) and maximizing the global objective in Equation (4) using the EM algorithm, the solution for the so called Q-function of the M-step takes the form:

$$\Delta p = J^{\dagger} \left[ x_1^{(\tau+1)} - x_1^c; \ldots; x_n^{(\tau+1)} - x_n^c \right], \tag{24}$$

where $J^{\dagger}$ denotes the pseudo-inverse of $J$, and $x_i^{(\tau+1)}$ is the mean-shifted update for the $i$-th landmark given in Equation (22). This is simply the Gauss-Newton update for the least squares PDM constraint in Equation (23). As such, under a linearized shape model, the two step strategy for maximizing the objective in Equation (4) with a KDE representation shares the properties of a general EM optimization, namely: provably improving and convergent. The complete fitting procedure, which we will refer to as subspace constrained mean-shifts (SCMS), is outlined in Algorithm 1. In the following, two further innovations are proposed, which address difficulties regarding local optima and the computational expense of kernel evaluations.

Figure 3. Illustration of the use of a precomputed grid for efficient mean-shift. Kernel evaluations are precomputed between c and all other nodes in the grid. To approximate the true kernel evaluation, $x_i$ is assumed to coincide with c and the likelihood of any response map grid location can be attained by a table lookup.

Algorithm 1 Subspace Constrained Mean-Shifts
Require: $I$ and $p$.
 1: while not converged($p$) do
 2:   Compute responses {Eqn. (2)}
 3:   Linearize shape model {Eqn. (6)}
 4:   Precompute pseudo-inverse of Jacobian ($J^{\dagger}$)
 5:   Initialize parameter updates: $\Delta p \leftarrow 0$
 6:   while not converged($\Delta p$) do
 7:     Compute mean-shifted landmarks {Eqn. (22)}
 8:     Apply subspace constraint {Eqn. (24)}
 9:   end while
10:   Update parameters: $p \leftarrow p + \Delta p$
11: end while
12: return $p$
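The inner loop of Algorithm 1 (lines 4 to 9) can be sketched as below for a fixed linearization. This is a simplified illustration rather than the authors' implementation: the function names are hypothetical, a fixed iteration count stands in for the convergence test, and the example uses a translation-only shape model, for which the Jacobian is constant and the linearization in Equation (6) is exact.

```python
import numpy as np

def mean_shift_step(x, nodes, alpha, sigma2):
    """One mean-shift update (Equation 22) for a single landmark."""
    k = alpha * np.exp(-0.5 * np.sum((nodes - x) ** 2, axis=1) / sigma2)
    return k @ nodes / k.sum()

def scms_inner_loop(x_c, J, nodes, alphas, sigma2, n_iter=20):
    """Alternate Equations (22) and (24) around the linearization point x_c.

    x_c    : (n, 2) current landmark estimates
    J      : (2n, m) Jacobian of the stacked landmarks w.r.t. the parameters
    nodes  : list of n arrays (N, 2), search window nodes per landmark
    alphas : list of n arrays (N,), normalized responses per landmark
    """
    J_pinv = np.linalg.pinv(J)                       # line 4 of Algorithm 1
    dp = np.zeros(J.shape[1])                        # line 5
    for _ in range(n_iter):                          # line 6 (fixed count here)
        x = x_c + (J @ dp).reshape(-1, 2)            # landmarks under current dp
        x_ms = np.array([mean_shift_step(x[i], nodes[i], alphas[i], sigma2)
                         for i in range(len(x))])    # line 7
        dp = J_pinv @ (x_ms - x_c).ravel()           # line 8, Equation (24)
    return dp

# Toy problem: three landmarks whose responses all peak at an offset of (2, 1)
# from the current estimate; the model can only translate.
x_c = np.array([[20.0, 20.0], [30.0, 20.0], [25.0, 30.0]])
J = np.tile(np.eye(2), (3, 1))
offsets = np.arange(-7, 8)
gx, gy = np.meshgrid(offsets, offsets)
window = np.column_stack([gx.ravel(), gy.ravel()]).astype(float)
nodes, alphas = [], []
for xi in x_c:
    grid = xi + window
    r = np.exp(-0.5 * np.sum((grid - (xi + np.array([2.0, 1.0]))) ** 2, axis=1))
    nodes.append(grid)
    alphas.append(r / r.sum())
dp = scms_inner_loop(x_c, J, nodes, alphas, sigma2=2.0)
```

With a translation-only model the pseudo-inverse simply averages the per-landmark displacements, so the sketch shows how the subspace constraint pools evidence from all landmarks before the next mean-shift step.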
Kernel Width Relaxation: The response map approximations discussed in Section 2.2 can be thought of as a form of smoothing. This explains the relative performance of the various methods. The Gaussian approximations smooth the most but approximate the true response map the poorest, whereas smoothing effected by the GMM is not as aggressive but exhibits a degree of sensitivity towards local optima. One might consider using the Gaussian and GMM approximations in tandem, where the Gaussian approximation is used to get within the convergence basin of the GMM approximation. However, such an approach is inelegant and affords no guarantee that the mode of the Gaussian approximation lies within the convergence basin of the GMM's.

With the KDE approximation in SCMS a more elegant approach can be devised, whereby the complexity of the response map estimate is directly controlled by the variance of the Gaussian kernel (see Figure 2). The guiding principle here is similar to that of optimizing on a Gaussian pyramid. It can be shown that when using Gaussian kernels, there exists a $\sigma^2 < \infty$ such that the KDE is unimodal, regardless of the distribution of samples [13]. As $\sigma^2$ is reduced, modes divide and the smoothness of the objective's terrain decreases. However, it is likely that the optimum of the objective at a larger $\sigma^2$ is closest to the desired mode of the objective with a smaller $\sigma^2$, promoting its convergence to the correct mode. As such, the policy under which $\sigma^2$ is reduced acts to guide optimization towards the global optimum of the true objective.

Drawing parallels with existing methods, as $\sigma^2 \to \infty$ the SCMS update approaches the solution of a homoscedastic Gaussian approximated objective function. As $\sigma^2$ is reduced, the KDE approximation resembles a GMM approximation, where the approximation for smaller $\sigma^2$ settings is similar to a GMM approximation with more modes.

Precomputed Grid: In the KDE representation of the response maps, the kernel centers are placed at the grid nodes defined by the search window. From the perspective of GMM fitting, these kernels represent candidates for the true landmark locations.
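The effect of the kernel width relaxation described above can be illustrated on a single, unconstrained landmark with a bimodal response map. The sketch below is a toy demonstration under assumptions: the response map is synthetic, the function names are hypothetical, and the schedule $\sigma^2 = \{20, 10, 5, 1\}$ is borrowed from the experiments in Section 4. In SCMS proper the relaxation is applied to the joint, shape-constrained update rather than per landmark.

```python
import numpy as np

def mean_shift(x0, nodes, alpha, sigma2, tol=1e-6, max_iter=100):
    """Mean-shift (Equation 22) iterated to convergence for one landmark."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        k = alpha * np.exp(-0.5 * np.sum((nodes - x) ** 2, axis=1) / sigma2)
        x_new = k @ nodes / k.sum()
        if np.linalg.norm(x_new - x) < tol:
            return x_new
        x = x_new
    return x

def relaxed_mean_shift(x0, nodes, alpha, schedule=(20.0, 10.0, 5.0, 1.0)):
    """Coarse-to-fine search: the mode found at each width seeds the next."""
    x = np.asarray(x0, dtype=float)
    for sigma2 in schedule:
        x = mean_shift(x, nodes, alpha, sigma2)
    return x

# Synthetic response: a strong mode at (4, 0) and a weak spurious one at (-4, 0).
offsets = np.arange(-7, 8)
gx, gy = np.meshgrid(offsets, offsets)
nodes = np.column_stack([gx.ravel(), gy.ravel()]).astype(float)

def bump(center):
    return np.exp(-0.5 * np.sum((nodes - np.array(center)) ** 2, axis=1))

resp = 0.8 * bump([4.0, 0.0]) + 0.2 * bump([-4.0, 0.0])
alpha = resp / resp.sum()

start = [-5.0, 0.0]                                   # starts next to the weak mode
direct = mean_shift(start, nodes, alpha, sigma2=1.0)  # narrow kernel only
relaxed = relaxed_mean_shift(start, nodes, alpha)     # sigma^2 = 20, 10, 5, 1
```

Starting beside the spurious mode, the narrow kernel alone stays trapped there, whereas the relaxed schedule first locates the single mode of the heavily smoothed estimate and then tracks it towards the dominant peak as the kernel narrows.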
Although no optimization is required for determining the number of modes, their centers and mixing coefficients, the number of candidates used here is much larger than what would typically be used in a general GMM estimate (i.e. GMM based representations typically use $K_i < 10$, whereas the search window typically has more than 100 nodes). As such, the computation of the posterior in Equation (16) will be more costly. However, if the variance $\sigma^2$ is known a priori, then some approximations can be made to significantly reduce computational complexity. The main overhead when computing the mean-shift update is in evaluating the Gaussian kernel between the current landmark estimate and every grid node in the response map. Since the grid locations are fixed and $\sigma^2$ is assumed to be known, one might choose to precompute the kernel for various settings of $x_i$. In particular, a simple choice would be to precompute these values along a grid sampled at or above the resolution of the response map grid $\Psi_{x_i^c}$. During fitting one simply finds the location in this grid closest to the current estimate of a PDM landmark and estimates the kernel evaluations by assuming the landmark is actually placed at that node (see Figure 3). This only involves a table lookup and can be performed efficiently. The higher the granularity of the grid the better the approximation will be, at the cost of greater storage requirements but without a significant increase in computational complexity. Although such an approximation ruins the strictly improving properties of EM, we empirically show in Section 4 that accurate fitting can still be achieved with this approximation. In our implementation, we found that such an approximation reduced the average fitting time by one half.

Figure 4. Fitting curves for the ASM, CQF, GMM and KDE optimization strategies on the (a) Multi-Pie and (b) XM2VTS databases. [Plot axes: proportion of images against shape RMS error. Legend, average fitting times: (a) ASM 88 ms, CQF 98 ms, GMM 2410 ms, KDE 121 ms; (b) ASM 84 ms, CQF 93 ms, GMM 2313 ms, KDE 117 ms.]

4. Experiments

Database Specific Experiments: We compared the various CLM optimization strategies discussed above on the problem of generic frontal face fitting on two databases: (i) the CMU Pose, Illumination and Expression Database (Multi-Pie) [7], and (ii) the XM2VTS database [12]. The Multi-Pie database is annotated with a 68-point markup used as ground truth landmarks. We used 762 frontal face images of 339 subjects.
The XM2VTS database consists of 2360 frontal face images of 295 subjects for which ground truth annotations are publicly available but different from the 68-point markup we have for Multi-Pie. XM2VTS contains neutral expression only whereas Multi-Pie contains significant expression variations. A 4-fold cross validation was performed on both Multi-Pie and XM2VTS, separately, where the images were partitioned into four sets of non-overlapping subject identities. In each trial, three partitions were used for training and the remainder for testing.

On these databases we compared four types of optimization strategies: (i) ASM [2], (ii) CQF [16], (iii) GMM [8], and (iv) the KDE method proposed in Section 3. For GMM, we empirically set $K = 5$ and used the EM algorithm to estimate the parameters of the mixture model. For KDE, we used a variance relaxation policy of $\sigma^2 = \{20, 10, 5, 1\}$ and a grid spacing of 0.1 pixels in its efficient approximation. In all cases the linear logistic regressor described in Section 2.1 was used. The local experts were (11 × 11) pixels in size and the exhaustive local search was performed over a (15 × 15) pixel window. As such, the only difference between the various methods compared here is their optimization strategy. In all cases, the scale and location of the model were initialized by an off-the-shelf face detector, the rotation and non-rigid parameters in Equation (1) were set to zero (i.e. the mean shape), and the model was fit until the optimization converged.

Results of these experiments can be found in Figure 4, where the graphs (fitting curves) show the proportion of images at which various levels of maximum perturbation were exhibited, measured as the root-mean-squared (RMS) error between the ground truth landmarks and the resulting fit. The average fitting times for the various methods on a 2.5 GHz Intel Core 2 Duo processor are shown in the legend.

The results show a consistent trend in the relative performance of the four methods. Firstly, CQF has the capacity to significantly outperform ASM. As discussed in Section 2.2, this is due to CQF's ability to account for directional uncertainty in the response maps as well as being more robust towards outlying responses. However, CQF has a tendency to over-smooth the response maps, leading to limited convergence accuracy. GMM shows an improvement in accuracy over CQF, as shown by the larger number of samples that converged to smaller shape RMS errors. However, it has the tendency to terminate in local optima due to its multimodal objective. This can be seen by its poorer performance than CQF for reconstruction errors above 4.2 pixels RMS in Multi-Pie and 5 pixels RMS in XM2VTS. In contrast, KDE is capable of attaining even better accuracies than GMM but still retains a degree of robustness towards local optima, where its performance over grossly misplaced initializations is comparable to CQF. Finally, despite the significant improvement in performance, KDE exhibits only a modest increase in computational complexity compared to ASM and CQF. This is in contrast to GMM, which requires much longer fitting times, mainly due to the complexity of fitting a mixture model to the response maps.

Figure 5. Top row: Tracking results on the FGNet Talking Face database for frames {0, 1230, 4200}. Clockwise from top left are fitting results for ASM, CQF, KDE and GMM. Bottom: Plot of shape RMS error from ground truth annotations throughout the sequence. [Plot axes: shape RMS error against frame; curves for ASM, GMM, CQF and KDE.]

Out-of-Database Experiments: Testing the performance of fitting algorithms on images outside of a particular database is more meaningful as it gives a better indication of how well the method generalizes. However, this is rarely conducted as it requires the tedious process of annotating new images with the PDM configuration of the training set. Here, we utilize the freely available FGNet talking face sequence^3. Quantitative analysis on this sequence is possible since ground truth annotations are available in the same format as that in XM2VTS.
We initialize the model using a face detector in the first frame and fit consecutive frames using the PDM's configuration in the previous frame as an initial estimate. The same model used in the database-specific experiments was used here, except that it was trained on all images in XM2VTS. In Figure 5, the shape RMS error for each frame is plotted for the four optimization strategies being compared. The relative performance of the various strategies is similar to that in the database-specific experiments, with KDE yielding the best performance. ASM and GMM are particularly unstable on this sequence, with GMM losing track at around frame 4200 and failing to recover until the end of the sequence.

Finally, we performed a qualitative analysis of KDE's performance on the Faces in the Wild database [9]. It contains images taken under varying lighting, resolution, image noise and partial occlusion. As before, the model was initialized using a face detector and fit using the XM2VTS trained model. Some fitting results are shown in Figure 6. Results suggest that KDE exhibits a degree of robustness towards variations typically encountered in real images.

^3 TalkingFace/talking_face.html

Figure 6. Example fitting results on the Faces in the Wild database using a model trained using the XM2VTS database. Top row: male subjects. Middle row: female subjects. Bottom row: partially occluded faces.

5. Conclusion

The optimization strategy for deformable model fitting was investigated in this work. Various existing methods were posed within a consistent probabilistic framework where they were shown to make different parametric approximations to the true likelihood maps of landmark locations. A new approximation was then proposed that uses a nonparametric representation. Two further innovations were proposed in order to reduce computational complexity and avoid local optima. The proposed method was shown to outperform three other optimization strategies on the task of generic face fitting. Future work will involve investigations into the effects of different local detector types and shape priors on the optimization strategies.

References

[1] Y. Cheng. Mean Shift, Mode Seeking, and Clustering. PAMI, 17(8).
[2] T. F. Cootes and C. J. Taylor. Active Shape Models - 'Smart Snakes'. In BMVC.
[3] D. Cristinacce and T. Cootes. Boosted Active Shape Models. In BMVC, volume 2.
[4] D. Cristinacce and T. F. Cootes. A Comparison of Shape Constrained Facial Feature Detectors. In FG.
[5] D. Cristinacce and T. F. Cootes. Feature Detection and Tracking with Constrained Local Models. In EMCV.
[6] M. Fashing and C. Tomasi. Mean Shift as a Bound Optimization. PAMI, 27(3).
[7] R. Gross, I. Matthews, S. Baker, and T. Kanade. The CMU Multiple Pose, Illumination and Expression (MultiPIE) Database. Technical report, Robotics Institute, Carnegie Mellon University.
[8] L. Gu and T. Kanade. A Generative Shape Regularization Model for Robust Face Alignment. In ECCV'08.
[9] G. B. Huang, M. Ramesh, T. Berg, and E. Learned-Miller. Labeled Faces in the Wild: A Database for Studying Face Recognition in Unconstrained Environments. Technical Report 07-49, University of Massachusetts, Amherst.
[10] X. Liu. Generic Face Alignment using Boosted Appearance Model. In CVPR, pages 1-8.
[11] I. Matthews and S. Baker. Active Appearance Models Revisited. IJCV, 60.
[12] K. Messer, J. Matas, J. Kittler, J. Lüttin, and G. Maitre. XM2VTSDB: The Extended M2VTS Database. In AVBPA, pages 72-77.
[13] M. Á. Carreira-Perpiñán and C. K. I. Williams. On the Number of Modes of a Gaussian Mixture. Lecture Notes in Computer Science, 2695.
[14] M. H. Nguyen and F. De la Torre Frade. Local Minima Free Parameterized Appearance Models. In CVPR.
[15] J. Saragih and R. Goecke. A Nonlinear Discriminative Approach to AAM Fitting. In ICCV.
[16] Y. Wang, S. Lucey, and J. Cohn. Enforcing Convexity for Improved Alignment with Constrained Local Models. In CVPR, 2008.
More informationThe Relationship between Exchange Rates and Stock Prices: Studied in a Multivariate Model Desislava Dimitrova, The College of Wooster
Issues n Poltcal Economy, Vol. 4, August 005 The Relatonshp between Exchange Rates and Stock Prces: Studed n a Multvarate Model Desslava Dmtrova, The College of Wooster In the perod November 00 to February
More informationThe Global Macroeconomic Costs of Raising Bank Capital Adequacy Requirements
W/1/44 The Global Macroeconomc Costs of Rasng Bank Captal Adequacy Requrements Scott Roger and Francs Vtek 01 Internatonal Monetary Fund W/1/44 IMF Workng aper IMF Offces n Europe Monetary and Captal Markets
More informationThe Developing World Is Poorer Than We Thought, But No Less Successful in the Fight against Poverty
Publc Dsclosure Authorzed Pol c y Re s e a rc h Wo r k n g Pa p e r 4703 WPS4703 Publc Dsclosure Authorzed Publc Dsclosure Authorzed The Developng World Is Poorer Than We Thought, But No Less Successful
More informationDISCUSSION PAPER. Is There a Rationale for OutputBased Rebating of Environmental Levies? Alain L. Bernard, Carolyn Fischer, and Alan Fox
DISCUSSION PAPER October 00; revsed October 006 RFF DP 03 REV Is There a Ratonale for OutputBased Rebatng of Envronmental Leves? Alan L. Bernard, Carolyn Fscher, and Alan Fox 66 P St. NW Washngton, DC
More informationDISCUSSION PAPER. Should Urban Transit Subsidies Be Reduced? Ian W.H. Parry and Kenneth A. Small
DISCUSSION PAPER JULY 2007 RFF DP 0738 Should Urban Transt Subsdes Be Reduced? Ian W.H. Parry and Kenneth A. Small 1616 P St. NW Washngton, DC 20036 2023285000 www.rff.org Should Urban Transt Subsdes
More informationcan basic entrepreneurship transform the economic lives of the poor?
can basc entrepreneurshp transform the economc lves of the poor? Orana Bandera, Robn Burgess, Narayan Das, Selm Gulesc, Imran Rasul, Munsh Sulaman Aprl 2013 Abstract The world s poorest people lack captal
More informationShould marginal abatement costs differ across sectors? The effect of lowcarbon capital accumulation
Should margnal abatement costs dffer across sectors? The effect of lowcarbon captal accumulaton Adren VogtSchlb 1,, Guy Meuner 2, Stéphane Hallegatte 3 1 CIRED, NogentsurMarne, France. 2 INRA UR133
More information4.3.3 Some Studies in Machine Learning Using the Game of Checkers
4.3.3 Some Studes n Machne Learnng Usng the Game of Checkers 535 Some Studes n Machne Learnng Usng the Game of Checkers Arthur L. Samuel Abstract: Two machnelearnng procedures have been nvestgated n some
More informationCiphers with Arbitrary Finite Domains
Cphers wth Arbtrary Fnte Domans John Black 1 and Phllp Rogaway 2 1 Dept. of Computer Scence, Unversty of Nevada, Reno NV 89557, USA, jrb@cs.unr.edu, WWW home page: http://www.cs.unr.edu/~jrb 2 Dept. of
More informationEVERY GOOD REGULATOR OF A SYSTEM MUST BE A MODEL OF THAT SYSTEM 1
Int. J. Systems Sc., 1970, vol. 1, No. 2, 8997 EVERY GOOD REGULATOR OF A SYSTEM MUST BE A MODEL OF THAT SYSTEM 1 Roger C. Conant Department of Informaton Engneerng, Unversty of Illnos, Box 4348, Chcago,
More informationWhat to Maximize if You Must
What to Maxmze f You Must Avad Hefetz Chrs Shannon Yoss Spegel Ths verson: July 2004 Abstract The assumpton that decson makers choose actons to maxmze ther preferences s a central tenet n economcs. Ths
More informationIncome per natural: Measuring development as if people mattered more than places
Income per natural: Measurng development as f people mattered more than places Mchael A. Clemens Center for Global Development Lant Prtchett Kennedy School of Government Harvard Unversty, and Center for
More informationFinance and Economics Discussion Series Divisions of Research & Statistics and Monetary Affairs Federal Reserve Board, Washington, D.C.
Fnance and Economcs Dscusson Seres Dvsons of Research & Statstcs and Monetary Affars Federal Reserve Board, Washngton, D.C. Banks as Patent Fxed Income Investors Samuel G. Hanson, Andre Shlefer, Jeremy
More informationMULTIPLE VALUED FUNCTIONS AND INTEGRAL CURRENTS
ULTIPLE VALUED FUNCTIONS AND INTEGRAL CURRENTS CAILLO DE LELLIS AND EANUELE SPADARO Abstract. We prove several results on Almgren s multple valued functons and ther lnks to ntegral currents. In partcular,
More informationUPGRADE YOUR PHYSICS
Correctons March 7 UPGRADE YOUR PHYSICS NOTES FOR BRITISH SIXTH FORM STUDENTS WHO ARE PREPARING FOR THE INTERNATIONAL PHYSICS OLYMPIAD, OR WISH TO TAKE THEIR KNOWLEDGE OF PHYSICS BEYOND THE ALEVEL SYLLABI.
More information