, pp.273-282
http://dx.doi.org/10.14257/ijsip.2015.8.11.25

Online Mean Kernel Learning for Object Tracking

Lei Li 1,2, Ruiting Zhang 3, Jiangming Kan 1 and Wenbin Li 1
1 School of Technology, Beijing Forestry University, 100083, Beijing, China
2 Institute of Atmospheric Physics, Chinese Academy of Sciences, 100029, Beijing, China
3 Canvard College, Beijing Technology and Business University, 101118, Beijing, China
Corresponding Author: kanj@bjfu.edu.cn

Abstract

Features for representing the target are the fundamental ingredient when constructing the appearance model in the tracking problem. Most current algorithms use only one type of feature to represent the target. However, the limited representation of a single feature might not resist all appearance changes of the target during the tracking process. To cope with this problem, we propose a novel tracking algorithm, the Mean Kernel Tracker (MKT), to robustly locate the object. The MKT combines three complementary features, Color, HOG (Histogram of Oriented Gradient) and LBP (Local Binary Pattern), to represent the target. Extensive experiments on public benchmark sequences show that MKT performs favorably against several state-of-the-art algorithms.

Keywords: Sparse representation, object tracking, online mean kernel learning

1. Introduction

Visual object tracking is an important problem in computer vision and has many applications including traffic monitoring, vehicle navigation and human computer interface (HCI), just to name a few. Although it has been investigated in the past decades [1, 2, 9-18], designing a robust tracker to cope with different objects in various scenarios is still a great challenge. A very common difficulty is to resist visual appearance changes frame by frame due to pose, sudden illumination changes and partial occlusion [3]. Such changes may make a tracker drift away from the target object.

To deal with appearance changes of the target, Collins et al. [4] proposed an online mechanism to pick out the top-ranked color features from a set of seed features based on the object/background discrimination. Hare et al. [19] applied 6 different types of Haar-like features arranged on a grid at 2 scales and a structured SVM to construct the object appearance. Kalal et
al. [20] proposed a robust detector/tracker, TLD (tracking, learning, detection), which combined a random forest [8] based detector and the KLT tracker [5, 21]. The detector adopted a kind of feature named 2-bit Binary Patterns to learn the appearance of the target. Many of the aforementioned approaches could not deal with all the various appearance changes simultaneously due to the limited representation of the single feature used in the systems.

One strategy is to design a strong feature which is robust to any appearance change of the target. Unfortunately, it is not an easy task [22, 23], especially for situations where no prior knowledge of the target is available except for its initial location. Another strategy is to adaptively combine a few complementary image representations (e.g., image features based on color, edge and texture information), of which each feature descriptor maps the target into one feature space and encodes one channel of information of the target. Complementary features can compensate for one another when an appearance change, e.g., an illumination change, degrades one of them, which contributes to a robust system. Du and Piater [24] integrated multiple features, such as edge and color, in a probabilistic framework. The target is tracked in each feature by a particle filter, and the different particle filters interact with each other via a message passing scheme. More recently, Kwon and Lee [25] proposed a tracking framework, Visual Tracking Decomposition (VTD). VTD constructed the appearance model using the initial frame and the latest four frames by means of sparse principal component analysis (SPCA) of a set of complementary feature templates which deal with changes of pose and lighting. Nevertheless, due to the generative representation scheme, VTD might not distinguish the target from the background and might update the appearance model wrongly.

In this paper, to represent the target well enough, we use three complementary image representations: Color, histogram of oriented gradients (HOG) [26] and local binary pattern (LBP) [6]. Multiple kernel learning (MKL) [27, 7] has been widely used for feature combination in object detection [28] and classification [23]. However, due to the sparsity in MKL, only one feature might be selected or have a major weight, which would not attain the goal of feature combination. Therefore, to combine features effectively, we utilized a special edition of MKL, the Mean Kernel Learning, in which the weights of different features were set manually rather than learned. This simplification also allows the optimization to be solved efficiently. Extensive experiments on many benchmark videos have shown that our Mean Kernel Tracker can outperform state-of-the-art tracking algorithms.

The rest of this paper is organized as follows. In Section 2, every part of the novel Mean Kernel Tracker (MKT) is described in detail. Then the novel tracking algorithm and implementation details are given in Section 3. Experimental results and comparison with other state-of-the-art approaches are presented in Section 4. Section 5 summarizes our work.

ISSN: 2005-4254 IJSIP
Copyright © 2015 SERSC

2. Mean Kernel Learning

In this section, we detail how to combine multiple features via the Mean Kernel Learning framework. Multiple kernel learning learns a weighted combination of different kernels.
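In this work, each type of feature corresponds to exactly one kernel, so the terms kernel and feature are used interchangeably unless stated otherwise. Given an image patch $x$, one can extract more than one type of feature. The different features are combined by the following formulation:

$$k(x_i, x_j) = \sum_{m=1}^{M} d_m\, k_m\big(f_m(x_i), f_m(x_j)\big), \qquad \sum_{m=1}^{M} d_m = 1,\; d_m \ge 0,\; 1 \le m \le M; \qquad (1)$$

where $f_m(x)$ is the $m$-th feature of image patch $x$, the weight $d_m$ stands for the importance of the $m$-th kernel $k_m(f_m(x_i), f_m(x_j))$, and $M$ is the total number of features.

Among many existing MKL algorithms, Simple MKL [7] is able to learn the weights $d_m$ very efficiently. The weights are learnt by solving the following optimization problem:

$$\min_{d, w, b, \xi}\; \frac{1}{2}\sum_{m=1}^{M} \frac{1}{d_m}\lVert w_m \rVert^2 + C\sum_{i=1}^{N}\xi_i$$
$$\text{s.t.}\quad y_i\Big(\sum_{m=1}^{M} w_m^{T}\phi_m\big(f_m(x_i)\big) + b\Big) \ge 1 - \xi_i,\quad \xi_i \ge 0,\quad 1 \le i \le N; \qquad (2)$$
$$\sum_{m=1}^{M} d_m = 1,\quad d_m \ge 0,\quad 1 \le m \le M;$$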
where $y_i$ is the label of $x_i$, $\phi_m(f_m(x_i))$ is the implicit mapping of $f_m(x_i)$, $C$ is the penalty parameter, $\xi_i$ is the slack factor, $b$ is the bias and $N$ is the total number of training examples. How to solve Simple MKL efficiently can be found in detail in [7].

Due to the sparse penalty on $d$, Simple MKL often assigns a major weight to one feature. This would not attain the goal of combining features to resist the appearance changes of the target. Therefore, we adopted a simple edition of Simple MKL, Mean Kernel Learning, to learn the appearance model. In Mean Kernel Learning, the weights of the different kernels are manually fixed to $d_m = 1/M$ rather than learned automatically. Therefore, Problem (2) can be simplified as follows:

$$\min_{w, b, \xi}\; \frac{1}{2}\sum_{m=1}^{M} M\,\lVert w_m \rVert^2 + C\sum_{i=1}^{N}\xi_i$$
$$\text{s.t.}\quad y_i\Big(\sum_{m=1}^{M} w_m^{T}\phi_m\big(f_m(x_i)\big) + b\Big) \ge 1 - \xi_i,\quad \xi_i \ge 0,\quad 1 \le i \le N; \qquad (3)$$

By introducing Lagrange multipliers, we get the dual of the Mean Kernel SVM:

$$\min_{\alpha}\; \frac{1}{2}\,\alpha^{T} Y K Y \alpha - \sum_{i=1}^{N}\alpha_i \qquad \text{s.t.}\quad \sum_{i=1}^{N}\alpha_i y_i = 0,\quad 0 \le \alpha_i \le C,\quad 1 \le i \le N; \qquad (4)$$

where $Y = \mathrm{diag}(y_1, y_2, \ldots, y_N)$ and $K_{ij} = \frac{1}{M}\sum_{m=1}^{M} K_m\big(f_m(x_i), f_m(x_j)\big)$. This optimization problem can be solved easily using the tool LIBSVM [29]. The final decision function of Mean Kernel Learning can be expressed as:

$$F(x) = \sum_{i=1}^{N} \alpha_i y_i\, \frac{1}{M}\sum_{m=1}^{M} K_m\big(f_m(x_i), f_m(x)\big) + b \qquad (5)$$

The optimization problem in Problem (4) has the same form as the classical SVM. The difference is that the affinity matrix $K$ in the Mean Kernel SVM is an average of different kernel functions. This combines the different features seamlessly, resists changes in any single feature channel and makes the system more robust.

3. Details of the Implementation

In this section, we give the implementation details of our tracking algorithm and then summarize the tracking algorithm.

3.1. Preparation of Training Sets

Let $l_t$ denote the center position of the target in the $t$-th frame. Let $x$ be an image patch and $l_t(x)$ the column vector of its center position in frame $t$. About $n$ examples are randomly sampled from $Z = \{x : l_i < \lVert l_t(x) - l_t \rVert < l_o\}$ to form the negative training set $X^n$. In our tracking system, we set the positive training set to $X^p = \{x : \lVert l_t(x) - l_t \rVert < a\}$, which means that all image patches around the target in the current frame are sampled as positive ones. After constructing the training sets, all the patches are normalized to the same size (32 × 32 in our experiments) for efficiency.
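The sampling rule of Section 3.1 can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' C++ implementation: the function names, the restriction to integer pixel centers, the strict inequalities, the reading of 16 and 40 as the inner and outer radii, the fixed random seed and the target position are all hypothetical choices made for the example.

```python
import math
import random


def positive_centers(target, a):
    """Return every integer patch center closer than `a` pixels to the target."""
    tx, ty = target
    r = int(math.ceil(a))
    return [(tx + dx, ty + dy)
            for dx in range(-r, r + 1)
            for dy in range(-r, r + 1)
            if math.hypot(dx, dy) < a]


def negative_centers(target, l_in, l_out, n, rng):
    """Randomly draw up to `n` centers from the ring l_in < distance < l_out."""
    tx, ty = target
    r = int(math.ceil(l_out))
    ring = [(tx + dx, ty + dy)
            for dx in range(-r, r + 1)
            for dy in range(-r, r + 1)
            if l_in < math.hypot(dx, dy) < l_out]
    return rng.sample(ring, min(n, len(ring)))


rng = random.Random(0)   # fixed seed only to make the sketch repeatable
target = (100, 80)       # hypothetical target center (x, y) in pixels
pos = positive_centers(target, a=3)
neg = negative_centers(target, l_in=16, l_out=40, n=65, rng=rng)
print(len(pos), len(neg))
```

Drawing the negatives from a ring rather than from the whole frame keeps them close enough to the target to be informative while avoiding patches that still overlap it heavily.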
3.2. The Features for Tracking

Color Feature contains rich visual information, but is easily affected by large illumination changes. The image patch is down-sampled to 16 × 16, and the pixel values are concatenated to form the corresponding feature vector.

Edge Feature is more robust than the color feature to illumination changes but may not adapt well to scenes with background clutter. In our system, the HOG feature [26] is extracted on the normalized patches, and we use the implementation of HOG in OpenCV.

Texture Feature can capture small details of the target. The LBP feature [6] is used for its efficiency and resistance to lighting effects. The normalized image patch is divided into 2 × 2 blocks. The LBP feature of each block is extracted and concatenated into a longer feature vector.

3.3. Parameter Settings

In our tracking system, all the parameter settings are fixed in our experiments: $a = 3$, $l_i = 16$, $l_o = 40$, the penalty parameter $C = 0.5$, and about $n = 65$ negative examples are sampled. The Gaussian kernel is used for measuring the similarity of image patches in the feature space. The bandwidth is 0.05, 0.1 and 0.2 for the Color, HOG and LBP features, respectively.

The entire procedure of the proposed tracking algorithm, Mean Kernel Tracker (MKT), is summarized in Algorithm 1.

Algorithm 1 Mean Kernel Tracker
Input: Frames 1, 2, 3, ..., and the initial location, $l_1$, of the object in frame 1;
Output: object locations, $l_2, l_3, \ldots$, in the subsequent frames.
1: $t = 1$;
2: Construct the training sets, $X^p$ and $X^n$ (Section 3.1);
3: Extract the Color, HOG and LBP features (Section 3.2);
4: Learn the appearance model $F(x)$ (Problem (4) and Eq. (5));
5: Randomly sample 500 candidates with scale change around $l_t$;
6: Extract features and find the patch $x$ with the maximum score $F(x)$ using the classifier function $F(x)$ (Eq. (5)); its center gives the new location $l_{t+1}$;
7: $t = t + 1$, go to 2.

4. Experiments

We implement our Mean Kernel Tracker (MKT) in C++ and evaluate its performance on challenging public image sequences. MKT is also quantitatively compared with some state-of-the-art tracking algorithms: FragTrack (Frag) [9], MILTracker (MIL) [15], VTD [25], VTS [17] and Struck [19]. Their codes or executable programs are publicly available and the parameters are finely tuned.
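The scoring step of Algorithm 1 rests on the averaged kernel of Eq. (1) with $d_m = 1/M$ and on the decision function of Eq. (5). The following Python sketch illustrates both under stated assumptions; it is not the authors' C++ code. In particular, interpreting the reported bandwidths as the coefficient `gamma` in `exp(-gamma * ||u - v||^2)` is an assumption, and the function names and the toy feature vectors are hypothetical.

```python
import math


def gaussian_kernel(u, v, gamma):
    """Gaussian (RBF) kernel exp(-gamma * ||u - v||^2) between two feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * sq_dist)


def mean_kernel(feats_a, feats_b, gammas):
    """Average the per-feature kernels, i.e. Eq. (1) with every weight fixed to 1/M."""
    values = [gaussian_kernel(fa, fb, g)
              for fa, fb, g in zip(feats_a, feats_b, gammas)]
    return sum(values) / len(values)


def decision_value(feats_x, support_feats, alphas, labels, bias, gammas):
    """Evaluate F(x) of Eq. (5) for one candidate patch."""
    total = sum(alpha * y * mean_kernel(sv, feats_x, gammas)
                for sv, alpha, y in zip(support_feats, alphas, labels))
    return total + bias


# Bandwidths reported for Color, HOG and LBP; reading them as gamma is an assumption.
GAMMAS = (0.05, 0.1, 0.2)

# Toy (hypothetical) feature triples: (color, hog, lbp) vectors of tiny length.
patch_a = ([0.2, 0.4], [1.0, 0.0], [0.5])
patch_b = ([0.2, 0.4], [0.0, 1.0], [0.5])
print(mean_kernel(patch_a, patch_a, GAMMAS))  # identical patches
print(mean_kernel(patch_a, patch_b, GAMMAS))  # only the HOG channel differs
```

Because the three kernels are averaged with equal weights, a large change in one channel (HOG in the toy example) lowers the similarity only partially, which is the robustness argument made in Section 2.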
All algorithms are compared using the same initial positions in the first frame. The tracking results might differ slightly from those stated in the original papers, because the initial positions in some sequences were set as in MILTracker [15]. We also implement the Simple MKL Tracker (SMKT). The weights of the different features in SMKT are learnt automatically, which is the only difference from MKT. To evaluate the performance of the above algorithms and ours, the center errors between the tracking results and the ground truth are calculated and reported on the image sequences. Due to the randomness in our tracking algorithm, we run our tracker five times on every sequence and average the results.
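The evaluation protocol described above can be sketched as follows. This is a minimal illustration, assuming that the center error is the Euclidean distance in pixels between predicted and ground-truth centers and that the per-run averages are themselves averaged; the function names and the toy trajectories are hypothetical, and only two runs are shown instead of five to keep the example short.

```python
import math


def center_error(pred, truth):
    """Euclidean distance in pixels between a predicted and a true center."""
    return math.hypot(pred[0] - truth[0], pred[1] - truth[1])


def mean_center_error(pred_centers, truth_centers):
    """Average center error over all frames of one run."""
    errors = [center_error(p, t) for p, t in zip(pred_centers, truth_centers)]
    return sum(errors) / len(errors)


def average_over_runs(runs, truth_centers):
    """Average the per-run mean center errors of a randomized tracker."""
    per_run = [mean_center_error(run, truth_centers) for run in runs]
    return sum(per_run) / len(per_run)


# Hypothetical ground truth and two hypothetical runs over three frames.
truth = [(10, 10), (12, 11), (15, 13)]
runs = [
    [(10, 10), (12, 14), (15, 13)],   # per-frame errors 0, 3, 0
    [(13, 14), (12, 11), (15, 13)],   # per-frame errors 5, 0, 0
]
print(average_over_runs(runs, truth))
```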

Throughout the experiments, we evaluated all approaches on the following sequences: Sylvester, David, Girl, FaceOcc, FaceOcc2, Tiger1, Tiger2, CokeCan and Shaking. The first eight sequences are from [15] and the last one is from [25].

Illumination Change is a common change in the tracking problem, which might make the target in the new frame dramatically dissimilar from before. This dissimilarity can affect features based on color intensity (e.g., Haar-like features) much more than features based on edges. Therefore, the combination of complementary features may resist illumination change better than single features. In comparison with MILTracker, VTS and Struck, our tracker is more robust against illumination change. In Sequence David (Figure 1(a)), MILTracker introduced tracking errors from frame 46, when the target moved from a darker place to a brighter one. This phenomenon is much more obvious at frame 394 due to the sudden change of illumination, but our tracker tracks the target very well. The same can be found in Sequence CokeCan (Figure 1(e)) from frame 117 and Sequence Shaking (Figure 1(f)) from frame 117. MILTracker and Struck performed poorly while our tracker still held the target tightly. VTS worked better on Sequence Shaking than our tracker but totally lost the target on Sequence CokeCan.

Occlusion is a very challenging problem for robust tracking. Struck worked slightly better than ours on Sequence FaceOcc and FaceOcc2 (Figure 1(c)) because the target shows little appearance change except for occlusion. However, when the target appearance also changed, our tracker worked much better, as on Sequence Tiger1 (Figure 1(d)) and Tiger2. When occlusion happens, a tracker might fail if only a few recent frames are used to update the appearance model. This can be seen for VTS on Sequence Tiger1 (Figure 1(d)) and CokeCan (Figure 1(e)). Our tracking algorithm, in contrast, exploits the tracked results, which keeps the most valuable information obtained so far. Even if occlusion happened, the training set still had the target information for modeling the appearance. When the target reappeared, the tracker could still catch up with the target.
This can be seen on Sequence Girl (Figure 1(b)) and CokeCan (Figure 1(e)).

Background Clutter might hijack the tracker, for a complex environment might contain a blob that is similar to the target in one single feature space. This can be seen for Struck in Sequence David (Figure 1(a)). The tracker made a slightly wrong update at frame 206 and totally lost the target at frame 295. Struck also lost the target in Sequence Tiger1 (Figure 1(d)) at frame 245. MILTracker and VTS also performed poorly on Tiger1 (Figure 1(d)) and CokeCan (Figure 1(e)). On these background clutter sequences, our tracker worked better than the others, because the complementary features represented the target well enough.

Table 1. Average Center Errors (in pixels) between the Tracking Results and their Ground Truth. The bold represents the best result, and the underlined the second best

Sequence    Frag    MIL    VTD      VTS    Struck  SMKT  MKT
Sylvester   11      11     20       18     17      7     6
David       46      23     14       13     54      7     5
Girl        27      32     13       12     10      23    12
FaceOcc     6       27     11       11     9       14    10
FaceOcc2    45      20     7        9      7       13    9
Tiger1      40      15     30       12     20      5     4
Tiger2      38      17     20       20     9       9     5
CokeCan     63      21     46       48     9       8     7
Shaking     84      38     8        7      52      55    17
fps         0.5-1   6-8    0.3-0.5  0.3-1  2-4     1-2   1-2
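As a worked check of the entries in Table 1, the short Python sketch below recomputes two simple summaries from the reported center errors: the mean error of a tracker over the nine sequences and the number of sequences on which each tracker attains the lowest error (ties counted for every tied tracker). The numbers are copied from Table 1; the variable and function names are hypothetical, and these summaries are an illustration rather than a statistic reported by the authors.

```python
TRACKERS = ["Frag", "MIL", "VTD", "VTS", "Struck", "SMKT", "MKT"]

# Average center errors in pixels, one row per sequence, copied from Table 1.
TABLE1 = {
    "Sylvester": [11, 11, 20, 18, 17, 7, 6],
    "David":     [46, 23, 14, 13, 54, 7, 5],
    "Girl":      [27, 32, 13, 12, 10, 23, 12],
    "FaceOcc":   [6, 27, 11, 11, 9, 14, 10],
    "FaceOcc2":  [45, 20, 7, 9, 7, 13, 9],
    "Tiger1":    [40, 15, 30, 12, 20, 5, 4],
    "Tiger2":    [38, 17, 20, 20, 9, 9, 5],
    "CokeCan":   [63, 21, 46, 48, 9, 8, 7],
    "Shaking":   [84, 38, 8, 7, 52, 55, 17],
}


def mean_error(tracker):
    """Mean of a tracker's center errors over all sequences in TABLE1."""
    col = TRACKERS.index(tracker)
    values = [row[col] for row in TABLE1.values()]
    return sum(values) / len(values)


def best_counts():
    """Count, per tracker, the sequences on which it has the lowest error."""
    counts = {name: 0 for name in TRACKERS}
    for row in TABLE1.values():
        best = min(row)
        for name, value in zip(TRACKERS, row):
            if value == best:
                counts[name] += 1
    return counts


for name in TRACKERS:
    print(name, round(mean_error(name), 2), best_counts()[name])
```

On these numbers MKT attains the lowest error on five of the nine sequences (Sylvester, David, Tiger1, Tiger2 and CokeCan), while Struck, Frag and VTS are ahead on the remaining ones, which matches the qualitative discussion of FaceOcc, FaceOcc2 and Shaking above.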

The complete quantitative comparisons are shown in Table 1 and Figure 2. It can be concluded that our tracker was almost always more robust than the others.

[Figure 1 panels: (a) David, (b) Girl, (c) FaceOcc2, (d) Tiger1, (e) CokeCan, (f) Shaking]

Figure 1. Representative Frames of 6 Sequences: cyan, blue, green and red object bounding boxes generated by MILTracker [15], VTS [17], Struck Tracker [19] and our MKT, respectively. This figure is best viewed in color

Figure 2. The Pixel Errors for Every Sequence Tested in the Experiments. This Figure is Best Viewed in Color

5. Conclusion

In this paper, a novel tracking algorithm, the Mean Kernel Tracker (MKT), has been proposed. MKT constructs the appearance model by combining three complementary features, Color, HOG and LBP, to resist different appearance changes. Experiments on public benchmark videos show that our tracking algorithm can track objects very well under illumination change, heavy occlusion and background clutter, can even recover from tracking failures, and performs favorably against several state-of-the-art algorithms.

Acknowledgment

This work was supported by the Beijing Higher Education Young Elite Teacher Project (NO. YETP1949) and the National Natural Science Foundation of China (Grant No. 30901164).

References

[1] M. Yang, Z. Fan, J. Fan, and Y. Wu, Tracking nonstationary visual appearances by data-driven adaptation, IEEE TIP, vol. 18, (2009), pp. 1633-1644.
[2] M. Tang and X. Peng, Robust tracking with discriminative ranking lists, IEEE TIP, vol. 21, no. 7, (2012), pp. 3273-3281.
[3] A. Yilmaz, O. Javed, and M. Shah, Object tracking: A survey, ACM Comput. Surv., vol. 38, (2006).
[4] R. Collins, Y. Liu, and M. Leordeanu, Online selection of discriminative tracking features, IEEE PAMI, vol. 27, no. 10, (2005), pp. 1631-1643.
[5] S. Baker and I. Matthews, Lucas-Kanade 20 years on: A unifying framework, IJCV, vol. 56, (2004), pp. 221-255.
[6] T. Ojala, M. Pietikainen, and T. Maenpaa, Multiresolution gray-scale and rotation invariant texture classification with local binary patterns, IEEE PAMI, vol. 24, no. 7, (2002), pp. 971-987.
[7] A. Rakotomamonjy, F. Bach, S. Canu, and Y. Grandvalet, SimpleMKL, JMLR, vol. 9, (2008), pp. 2491-2521.
[8] L. Breiman, Random forests, Machine Learning, vol. 45, no. 1, (2001), pp. 5-32.
[9] A. Adam, E. Rivlin, and I. Shimshoni, Robust fragments-based tracking using the integral histogram, in CVPR, (2006).
[10] H. Grabner and H. Bischof, On-line boosting and vision, in CVPR, (2006).
[11] Y. Wu and T. Huang, A co-inference approach to robust visual tracking, in ICCV, (2001).
[12] H. Grabner, C. Leistner, and H.
Bischof, Semi-supervised on-line boosting for robust tracking, in ECCV, (2008).
[13] D. Chen, J. Zhang, and M. Tang, Random patch based video tracking via boosting the relative spaces, in ICASSP, (2009).
[14] J. Zhang, D. Chen, and M. Tang, Combining discriminative and descriptive models for tracking, in ACCV, (2009).
[15] B. Babenko, M.-H. Yang, and S. Belongie, Visual tracking with online multiple instance learning, in CVPR, (2009).
[16] Y. Bai and M. Tang, Robust visual tracking via ranking SVM, in ICIP, (2011).
[17] J. Kwon and K. M. Lee, Tracking by sampling trackers, in ICCV, (2011).
[18] Y. Bai and M. Tang, Robust tracking via weakly supervised ranking SVM, in CVPR, (2012).
[19] S. Hare, A. Saffari, and P. H. S. Torr, Struck: Structured output tracking with kernels, in ICCV, (2011).
[20] Z. Kalal, J. Matas, and K. Mikolajczyk, P-N learning: Bootstrapping binary classifiers by structural constraints, in CVPR, (2010).
[21] J. Shi and C. Tomasi, Good features to track, in CVPR, (1994).
[22] M. Varma and D. Ray, Learning the discriminative power-invariance trade-off, in ICCV, (2007).
[23] P. Gehler and S. Nowozin, On feature combination for multiclass object classification, in ICCV, (2009).
[24] W. Du and J. Piater, A probabilistic approach to integrating multiple cues in visual tracking, in ECCV, (2008).
[25] J. Kwon and K. M. Lee, Visual tracking decomposition, in CVPR, (2010).
[26] N. Dalal and B. Triggs, Histograms of oriented gradients for human detection, in CVPR, (2005).
[27] F. R. Bach, G. R. G. Lanckriet, and M. I. Jordan, Multiple kernel learning, conic duality, and the SMO algorithm, in ICML, (2004).
[28] A. Vedaldi, V. Gulshan, M. Varma, and A. Zisserman, Multiple kernels for object detection, in ICCV, (2009).
[29] C.-C. Chang and C.-J. Lin, LIBSVM: A library for support vector machines, ACM TIST, (2011).

Authors

Lei Li was born in Jilin Province, China, in 1980. He studied at the School of Technology, Beijing Forestry University, where he gained a bachelor's degree in 2005 and a master's degree in 2008. He is now a doctoral candidate at the same university. His research interests mainly include information security, image processing and image compression.

Ruiting Zhang was born in Hebei Province, China, in 1981. He studied Applied Mathematics at China Agriculture University and gained a master's degree in 2006. He then joined Canvard College, Beijing Technology and Business University. His research interests mainly include machine learning and support vector machines.

Jiangming Kan, corresponding author of this paper, received his PhD degree in forestry engineering from Beijing Forestry University, P.R. China, in 2009. Currently, he is an associate professor at Beijing Forestry University.
His research interests include computer vision and intelligent control.

Wenbin Li received M.S. and Ph.D. degrees from Shizuoka University and Ehime University, Japan, in 1987 and 1990, respectively. Starting in 1992, he was a faculty member of the School of Technology, Beijing Forestry University, and was promoted to professor and Ph.D. supervisor in 1996. His current research interests include forest machinery automation and intelligence.