Online Multi-Class LPBoost

Amir Saffari, Martin Godec, Thomas Pock, Christian Leistner, Horst Bischof
Institute for Computer Graphics and Vision, Graz University of Technology, Austria
{saffari,godec,pock,leistner,bischof}@icg.tugraz.at

Abstract

Online boosting is one of the most successful online learning algorithms in computer vision. While many challenging online learning problems are inherently multi-class, online boosting and its variants are only able to solve binary tasks. In this paper, we present Online Multi-Class LPBoost (OMCLP), which is directly applicable to multi-class problems. From a theoretical point of view, our algorithm tries to maximize the multi-class soft-margin of the samples. In order to solve the LP problem in online settings, we perform an efficient variant of online convex programming, which is based on primal-dual gradient descent-ascent update strategies. We conduct an extensive set of experiments over machine learning benchmark datasets, as well as on the Caltech101 category recognition dataset. We show that our method is able to outperform other online multi-class methods. We also apply our method to tracking, where we present an intuitive way to convert the binary tracking-by-detection problem into a multi-class problem in which background patterns that are similar to the target class become virtual classes. Applying our novel model, we outperform or achieve the state-of-the-art results on benchmark tracking videos.

1. Introduction

Online learning is an area of machine learning concerned with estimation problems with limited access to the entire data domain. It is a sequential decision making task where the objectives for the learner are revealed over time. (This work has been supported by the Austrian FFG projects EVis (813399) and Outlier (820923) under the FIT-IT program, and the Austrian Science Fund (FWF) under the doctoral program Confluence of Vision and Graphics W1209.) Classical online learning problems can be formulated as a game between the learner and an adversarial environment (or teacher).
In this repeated game, at any time t the following steps happen: 1) The environment chooses a new sample x_t ∈ R^d. 2) The learner responds with its prediction ŷ_t. 3) The environment reveals the label of the sample, y_t. 4) The learner suffers the loss l_t and updates its model. The goal of the learner is to achieve a low cumulative loss over time by updating its internal representation of the problem. Online learning is an essential tool for learning from dynamic environments, from very large scale datasets, or from streaming data sources. It has been studied extensively in the machine learning community (for a comprehensive overview we refer to [5, 25] and references therein). In computer vision, online learning has been used in applications such as object recognition [14, 4], object detection [22, 30], and tracking [8, 16, 13]. Historically, Oza and Russell [21] were the first to extend AdaBoost [11] to operate in an online learning scenario. Their formulation and many variants have been used in various computer vision applications [16, 13, 22, 30, 18]. Almost all recent work on online boosting algorithms focuses on binary decision problems, while many interesting problems are inherently multi-class. These algorithms tackle multi-class problems by using a set of decomposed binary tasks, usually obtained by typical approaches like 1-vs.-all, 1-vs.-1, and error-correcting output codes [2]. However, such approaches have major drawbacks. First, by considering only the binary sub-problems, the algorithms often fail to completely capture the true structures and relations between the classes in the feature space. In online learning tasks, this problem is even more severe because the learner only has access to a limited amount of data. Second, one has to train at least
a number of classifiers equivalent to the number of classes. For problems with a large number of classes, such approaches have computational disadvantages. For an online learning scenario, because of these constraints, such approaches might not be applicable. Third, the commonly used 1-vs.-all approach introduces additional problems, such as producing unbalanced datasets or uncalibrated classifiers. Boosting with convex loss functions is proven to be sensitive to outliers and label noise [19]. This inherent problem of boosting is even more important in online learning problems, where the label given by the environment might be quite noisy. Hence, training such a sensitive algorithm in noisy environments usually leads to inferior classifiers. Recently, there has been a great effort to remedy this weakness, e.g., by introducing more robust loss functions [19, 27, 18]. There exists theoretical evidence that many boosting algorithms are only able to maximize the hard margin [23] or average margin [26] of data samples. Such problems are addressed in other learning methods, especially in support vector machines, by introducing soft margins. Fortunately, for offline boosting, there exist a few methods which are able to use soft margins, notably Linear Programming Boosting (LPBoost) [9] and its variants [28, 29, 12]. Therefore, by formulating LPBoost for online multi-class learning problems, we are able to directly address these inherent weaknesses of online boosting methods. Experimentally, we show that our algorithm in fact holds to these promises and is able to outperform, or at least achieve state-of-the-art results compared to, other online multi-class learning methods on various pattern recognition and computer vision tasks. We conduct a set of experiments to compare our method with other online and offline multi-class learning methods on standard machine learning benchmarks. Additionally, we apply our method to the object category classification problem on the Caltech101 dataset.
As a side effect of having an online multi-class classifier, we are able to perform multi-target tracking efficiently and robustly. For single object tracking with a complex background, we propose to formulate the problem as multi-target tracking by assigning separate virtual classes to non-target objects with high similarity to the target, and hence improve the tracking results to achieve state-of-the-art performance over benchmark videos.

2. Online Multi-Class LPBoost

In this section, we formulate online multi-class boosting as a linear programming optimization problem. We first state the problem and then present our learning method, which is based on a primal-dual gradient descent-ascent strategy. In online learning scenarios, the data samples are presented sequentially. Following the repeated game analogy of online learning, the goal of the learner is to achieve a low cumulative loss over time by updating its model incrementally. Let the loss at iteration t be l_t, which measures how bad the prediction ŷ_t of the learner was with respect to the true class label y_t of the newest sample. In our formulation, we assume that the number of classes is not known in advance, and the learner should be able to incorporate new classes on-the-fly. Since the classes are presented over time, we do not penalize the learner when a new class is introduced. Let C_t ⊆ C be the set of known classes up to time t and C be the total label set (unknown to the learner). Also let K_t = |C_t| be the number of known classes at time t. In our formulation, the learner maintains a model f_t : R^d → R^{K_t}, which is a mapping from the input feature space to the multi-class hypothesis domain. We represent the confidence of the learner for the k-th class, f_{t,k}(x_t), as the k-th element of the output vector f_t(x_t) = [f_{t,1}(x_t), ..., f_{t,K_t}(x_t)]^T. The following decision rule is applied in order to obtain the classification

ŷ_t = arg max_{k ∈ C_t} f_{t,k}(x_t).   (1)

We define the hard margin of a sample x_t as

m_t(x_t) = f_{t,y_t}(x_t) − max_{k ∈ C_t, k ≠ y_t} f_{t,k}(x_t),   (2)

which measures the difference between the classification confidence of the true class and that of the closest non-target class y'_t = arg max_{k ∈ C_t, k ≠ y_t} f_{t,k}(x_t). Note that based on the decision rule of Eq. (1), m_t(x_t) < 0 means a wrong prediction, while a positive margin means a correct classification. In this work, we use the hinge loss function

l_t(m_t) = I(y_t ∈ C_{t−1}) max(0, 1 − m_t(x_t)),   (3)

where I(·) is an indicator function, which is introduced so that we do not penalize the model if there
is a novel class introduced. Note that the hinge loss is an upper bound on the misclassification error

Σ_{t=1}^{T} I(y_t ≠ ŷ_t) ≤ Σ_{t=1}^{T} l_t(m_t),   (4)

and hence its minimization results in minimizing the misclassification error rate.

2.1. Multi-Class LPBoost Model

Our learner is a boosting model, i.e., a linear combination of weak learners (or bases)

f_t(x) = Σ_{m=1}^{M} w_{t,m} g_{t,m}(x),   (5)

where g_{t,m} : R^d → R^{K_t} is the m-th weak learner, M represents the number of weak learners, and w_{t,m} is the weight of the m-th base. It is convenient to write this formulation in a more compact form as

f_t(x) = G_t(x) w_t,   (6)

where w_t = [w_{t,1}, ..., w_{t,M}]^T ∈ R^M is the weight vector of all bases and

G_t(x) = [g_{t,1}(x), ..., g_{t,M}(x)] ∈ R^{K_t × M}   (7)

is the response matrix of all weak learners for all the known classes. We denote G_t(y, ·) to be the y-th row of this matrix, and G_t(y, m) to be the element in the y-th row and the m-th column. Offline boosting sequentially adds base learners to the whole model. However, in our online boosting formulation, the model utilizes a fixed set of online base learners and updates them sequentially by adjusting the weight of a sample. Let B_T be a cache with size T. A cache of size T = 1 corresponds to the case where the learner discards the sample after updating on it. Considering our boosting model and the loss function presented in Eq. (3), we propose the following regularized multi-class LPBoost problem to be optimized online

min_{w_t, ξ}  C Σ_{t ∈ B_T} Σ_{k ≠ y_t} ξ_{t,k} + ||w_t||_1   (8)
s.t.  ∀t ∈ B_T, ∀k ≠ y_t: (G_t(y_t, ·) − G_t(k, ·)) w_t + ξ_{t,k} ≥ 1,
      ∀m: w_{t,m} ≥ 0,   ∀t ∈ B_T, ∀k ≠ y_t: ξ_{t,k} ≥ 0,

where C is the capacity parameter, and the slack variables ξ_{t,k} are added to create soft margins for boosting. Note that this formulation is a direct generalization of the original formulation of LPBoost [9] to the multi-class case. Furthermore, if we shared the slack between all the classes, it would be closely related to the multi-class variant of ν-LPBoost proposed in [12]. In an offline scenario, such problems can be easily solved by standard optimization techniques.
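As an illustration, in the offline case Eq. (8) is an ordinary linear program. The following sketch (a toy setup of our own with a random response matrix and SciPy's linprog, not the paper's implementation) shows how the objective and the margin constraints map to standard solver form:

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
K, M, T = 3, 4, 5                      # classes, weak learners, cache size
y = rng.integers(0, K, size=T)         # true labels of the cached samples
G = rng.normal(size=(T, K, M))         # G_t(k, m): weak-learner responses
C = 5.0                                # capacity parameter

# Variables: [w_1..w_M, xi_1..xi_{T*(K-1)}], all constrained to be >= 0.
n_xi = T * (K - 1)
c = np.concatenate([np.ones(M), C * np.ones(n_xi)])   # ||w||_1 + C * sum(xi)

# Margin constraints: (G_t(y_t,.) - G_t(k,.)) w + xi_{t,k} >= 1,
# written as A_ub x <= b_ub for linprog.
rows, rhs, row_id = [], [], 0
for t in range(T):
    for k in range(K):
        if k == y[t]:
            continue
        row = np.zeros(M + n_xi)
        row[:M] = -(G[t, y[t]] - G[t, k])
        row[M + row_id] = -1.0
        rows.append(row)
        rhs.append(-1.0)
        row_id += 1

A = np.array(rows)
b = np.array(rhs)
res = linprog(c, A_ub=A, b_ub=b, bounds=[(0, None)] * (M + n_xi))
w = res.x[:M]                          # soft-margin boosting weights
```

Since the slacks can absorb any margin violation, the LP is always feasible, and the capacity parameter C trades off margin violations against the l1-regularized weights.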
However, in the online setting, it is usually infeasible to solve this problem from scratch for every new sample added to the system. Therefore, an incremental solution is desired. Fortunately, due to the convexity of the problem, one can benefit from previously proposed online convex programming approaches [33].

2.2. Online Learning

Our online learning method performs primal-dual gradient descent-ascent iteratively. In detail, we first convert the problem to its augmented Lagrangian form [20]. For each new sample, we first perform a dual ascent, which is equivalent to finding the sample weights for each iteration of training the weak learners. After finishing that step, we do a primal descent over the weights of the weak learners. The Lagrange dual function of the optimization problem of Eq. (8) is

D(α, β, d) = Σ_{t ∈ B_T} Σ_{k ≠ y_t} d_{t,k}   (9)
  + inf_{w_t, ξ} ( Σ_{m=1}^{M} (1 − α_m) w_{t,m}
  + Σ_{t ∈ B_T} Σ_{k ≠ y_t} (C − d_{t,k} − β_{t,k}) ξ_{t,k}
  − Σ_{t ∈ B_T} Σ_{k ≠ y_t} d_{t,k} (G_t(y_t, ·) − G_t(k, ·)) w_t ),

where α, β, d are the Lagrange multipliers of the constraints. Due to the linearity of the inner problem of Eq. (9), for a set of finite solutions the following conditions must hold:

∀t ∈ B_T, ∀k ≠ y_t: C − d_{t,k} − β_{t,k} = 0,
∀m: 1 − α_m − Σ_{t ∈ B_T} Σ_{k ≠ y_t} d_{t,k} ΔG_{t,k}(m) = 0,

where ΔG_{t,k}(m) = G_t(y_t, m) − G_t(k, m). Using the positivity conditions on the Lagrange multipliers, we
can derive the dual formulation of Eq. (8) as

max_d  Σ_{t ∈ B_T} Σ_{k ≠ y_t} d_{t,k}   (10)
s.t.  ∀m: Σ_{t ∈ B_T} Σ_{k ≠ y_t} d_{t,k} ΔG_{t,k}(m) ≤ 1,
      ∀t ∈ B_T, ∀k ≠ y_t: 0 ≤ d_{t,k} ≤ C.

The vector d, which corresponds to the sample weights, is the dual variable of the weights on the weak learners. The first set of constraints is equivalent to the edge constraints of binary LPBoost. As can be seen, there are K_t − 1 weights per sample. However, the weak learners usually accept only one weight per instance. Therefore, we only consider the most violated edge constraint for each example. This corresponds to finding the non-target class for which the margin is the smallest: y'_t. (Note that this is a limitation imposed by the weak learners; the following derivations can easily be generalized to all the weights.) Therefore, we will concentrate on the following problem

max_d  Σ_{t ∈ B_T} d_{t,y'_t}   (11)
s.t.  ∀m: Σ_{t ∈ B_T} d_{t,y'_t} ΔG_{t,y'_t}(m) ≤ 1,
      ∀t ∈ B_T: 0 ≤ d_{t,y'_t} ≤ C.

Optimizing the problem in Eq. (11) with an online convex programming technique [33] requires a projection step for finding solutions which are consistent with the constraints. However, in our case such a step is expensive to compute; therefore, we formulate its augmented Lagrangian [20] as

max_d  Σ_{t ∈ B_T} d_{t,y'_t}   (12)
  + Σ_{m=1}^{M} w_{t,m} (1 − Σ_{t ∈ B_T} d_{t,y'_t} ΔG_{t,y'_t}(m) − ζ_m)
  − (1/2θ) Σ_{m=1}^{M} (1 − Σ_{t ∈ B_T} d_{t,y'_t} ΔG_{t,y'_t}(m) − ζ_m)^2
s.t.  ∀m: ζ_m ≥ 0, w_{t,m} ≥ 0,   ∀t ∈ B_T: 0 ≤ d_{t,y'_t} ≤ C,

by introducing a new set of slack variables ζ_m and using θ > 0 as a constant. Note that the value of the slacks can easily be found by computing the derivatives of the objective with respect to them and setting these to zero. This leads to ζ_m = max(0, q_m), where q_m = 1 − Σ_{t ∈ B_T} d_{t,y'_t} ΔG_{t,y'_t}(m) − θ w_{t,m}. Now we follow this procedure over time: when a new sample arrives, we set its weight to C and update the cache by removing the oldest sample and inserting the newest. Then, for training the m-th weak learner, we compute the sample weights by the dual gradient ascent update

∀t ∈ B_T: e_t = d_{t,y'_t} + ν_d (1 + (1/θ) Σ_{j=1, q_j<0}^{m−1} q_j ΔG_{t,y'_t}(j)),
           d_{t,y'_t} ← max(0, min(C, e_t)),   (13)

where ν_d is the dual learning rate.
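One dual ascent step of Eq. (13) for the m-th weak learner can be sketched in plain NumPy as follows (variable names are our own; dG[t, j] stands for ΔG_{t,y'_t}(j), and q collects the q_j of the weak learners trained so far):

```python
import numpy as np

def dual_ascent_step(d, dG, q, nu_d=2.0, theta=2.0, C=5.0):
    """One update of the cached sample weights d (sketch of Eq. 13).

    d:  (T,)     current dual weights d_{t,y'_t} of the cached samples
    dG: (T, m-1) margin responses dG[t, j] = G_t(y_t, j) - G_t(y'_t, j)
    q:  (m-1,)   slack terms q_j of the weak learners trained so far
    """
    active = q < 0                        # only negative q_j contribute
    grad = 1.0 + (dG[:, active] @ q[active]) / theta
    e = d + nu_d * grad                   # gradient ascent on the dual
    return np.clip(e, 0.0, C)             # project back onto [0, C]

d = np.array([1.0, 2.0, 4.9])
dG = np.array([[0.8, -0.2], [0.1, 0.4], [-0.6, 0.3]])
q = np.array([-0.5, 0.2])
d_new = dual_ascent_step(d, dG, q)
```

Note how the clipping implements the box constraint 0 ≤ d_{t,y'_t} ≤ C, which is cheap here, in contrast to the expensive projection a direct treatment of Eq. (11) would require.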
After updating the sample weights and training the m-th weak learner according to them, we compute an update for the weight of this weak learner by the primal gradient descent update

∀m: z_m = w_{t,m} − ν_p (1 − Σ_{t ∈ B_T} d_{t,y'_t} ΔG_{t,y'_t}(m)),
     w_{t,m} ← max(0, z_m),   (14)

where ν_p is the learning rate for the primal. This alternating primal-dual descent-ascent is continued for all the weak learners.

Discussion. Although the update rules presented in Eq. (13) and Eq. (14) look complicated, they in fact represent intuitive learning strategies which are closely related to the boosting way of learning from data. In Eq. (13), the inner sum shows the total confidence of the weak learners trained so far with respect to the classification margin of the current sample. Note that since q_j < 0, for a sample on which many of the weak learners obtain a positive margin, this sum will be a large negative value. Hence, for such a sample with a large positive margin, the weight will decrease for the training of the next weak learner. Similarly, for the update in Eq. (14), if a weak learner has a high weighted average margin over all the samples in the cache, the inner sum will be high, which leads to an increase in its weight. Therefore, the weight of successful weak learners will increase.

3. Experiments

We evaluate the proposed Online Multi-Class LPBoost (OMCLP) algorithm by comparing its performance to other online learning algorithms. In the first
two sets of experiments, we mainly compare with other multi-class online and offline algorithms, while in the last section we conduct tracking experiments.

3.1. Machine Learning Benchmark

Since there is no other online multi-class boosting algorithm available in the literature for comparison, we convert the recently proposed offline multi-class boosting algorithm of Zou et al. [34] to an online formulation. Based on their formulation, we define a margin vector based on the current classifier as

∀x: Σ_{i=1}^{K_t} f_{t,i}(x) = 0.   (15)

We then use a Fisher-consistent convex loss function [34], which guarantees that by training over a large number of samples the boosting model is able to recover the unknown Bayes decision rule. For this work, we experiment with two different loss functions: the exponential loss e^{−f_{t,y}(x)} and the logit loss log(1 + e^{−f_{t,y}(x)}). For updating the m-th weak learner, we perform a functional gradient descent as

g_{t,m}(x) = arg max_g  −∇l(f^{m−1}_{t,y_t}(x_t)) g_{y_t}(x_t),   (16)

where ∇l(f^{m−1}_{t,y_t}(x_t)) is the gradient of the loss function at the m-th stage of boosting. As will be shown later, in principle this is also a novel and successful online multi-class boosting algorithm. We call this algorithm Online Multi-Class Gradient Boost (OMCGB). Additionally, we compare with Online Random Forests (ORF) [24] and the highly successful online multi-class support vector machine algorithm of Bordes et al. [6], named Linear LaRank. Note that both of these algorithms are inherently multi-class, so they provide a fair comparison. We also performed experiments with the online AdaBoost formulation of Oza et al. [21] by training 1-vs.-all classifiers. However, its performance was not comparable to these baseline methods; therefore, due to lack of space, we omit reporting it. We also compare our method with the following offline trained multi-class classifiers: Random Forests [7], and three multi-class formulations of AdaBoost, namely SAMME [32], AdaBoost.ECC [15], and the recent algorithm AdaBoost.SIP [31].
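In practice, the functional gradient step of Eq. (16) amounts to training the next weak learner on samples weighted by the negative derivative of the loss at the current score. A minimal sketch of these sample weights for the two losses (the function name and interface are our own):

```python
import numpy as np

def omcgb_sample_weight(f_y, loss="exp"):
    """Weight -l'(f_y) used when training the next weak learner.

    f_y: current boosting score for the true class of the sample.
    """
    if loss == "exp":        # l(f) = exp(-f)          ->  -l'(f) = exp(-f)
        return np.exp(-f_y)
    elif loss == "logit":    # l(f) = log(1 + exp(-f)) ->  -l'(f) = 1/(1+exp(f))
        return 1.0 / (1.0 + np.exp(f_y))
    raise ValueError(loss)

# Misclassified samples get a large weight, confident ones a small weight:
w_hard = omcgb_sample_weight(-1.0)
w_easy = omcgb_sample_weight(3.0)
```

Both weightings decrease monotonically in the score of the true class, so the next weak learner concentrates on samples with small or negative margins; the logit weights are bounded by 1, which is one reason logit-style losses are considered more robust to label noise.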
We also compare with the multi-class support vector machine algorithm of Keerthi et al. [17] with a linear kernel and the multi-class SVM from LibSVM with an RBF kernel. (For SAMME, AdaBoost.ECC, and AdaBoost.SIP, we report the results presented in [31].) OMCLP and OMCGB use small ORFs with 10 trees as their weak learners, and we set M = 10. ORF, when used as a single model, uses 100 trees trained online. For our algorithm, we fix the cache size to 1 and set ν_d = θ = 2, ν_p = 1e−6, and C = 5. Note that this set of parameters is used for all the datasets in this section and the next section. For the offline methods, we use 5-fold cross-validation to obtain their hyperparameters. Table 1 shows the classification error on 4 benchmark datasets chosen from the UCI repository. All the experiments are run for 5 independent runs, and the results are the average classification error on the held-out test set. In order to simulate large scale datasets, we conduct the experiments over different numbers of epochs: each epoch corresponds to seeing all the data points once in random order. As can be seen, our algorithm outperforms the other online learning methods in 6 out of 8 cases and comes very close to the performance of the best offline methods. Another interesting observation is the fact that the performance of OMCLP at the first epoch is very close to the performance of ORF after 10 epochs. This shows that, given a slowly converging algorithm like ORF, we are able to speed up its convergence rate as well. Our C++ implementation of OMCLP and OMCGB is freely available from the following link (3).

3.2. Object Category Recognition

Online multi-class learning is essential when dealing with large-scale image databases. For example, typical image or video search engines often need to update their internal model when a set of new data is available. However, rebuilding the entire model is infeasible in practice. Considering that the problem of object category recognition is inherently multi-class, such systems can benefit from an online multi-class learner.
We evaluate on the Caltech101 object category recognition task, which is a challenging task for an online learner, since the number of classes is large and the number of training samples per class is small. For these experiments, we use Linear LaRank as the weak learner of the online boosting methods, due to the fact that ORFs were performing poorly on this task. We convert the SVM scores of LaRank to probabilities via a multinomial logistic regression. All other settings are the same as in the experiments presented in Section 3.1. We present the results using the standard Caltech101 settings of training on 15 and 30, and testing on 50 images per class. For feature extraction, we use the precomputed 360-degree Level2-PHOG features from Gehler and Nowozin [12]. Table 2 shows the results obtained by the various online methods, averaged over 5 independent runs on 5 different train and test splits (in total 25 runs per algorithm) (4). As can be seen, our algorithm performs best compared to the other methods on this difficult task. Figure 1(a) shows how the performance varies with the number of training images per category. As expected, more training samples help the algorithms to improve over time. However, it is notable that our method consistently obtains lower errors compared to the other algorithms over different amounts of labeled data.

(3) http://www.ymer.org/amir/software/online-multiclass-lpboost/

Table 1. Classification error on machine learning benchmark datasets for 1 and 10 epochs. Bold face shows the best performing online method, while italic font shows the best offline method.

Methods / Dataset  | DNA             | Letter          | Pendigit        | USPS
# Epochs           | 1      | 10     | 1      | 10     | 1      | 10     | 1      | 10
OMCLP              | 0.0983 | 0.0565 | 0.1202 | 0.0362 | 0.0747 | 0.0241 | 0.1185 | 0.0809
OMCGB-Log          | 0.2648 | 0.0777 | 0.3033 | 0.1202 | 0.1666 | 0.0599 | 0.2418 | 0.1241
OMCGB-Exp          | 0.1395 | 0.0616 | 0.2484 | 0.0853 | 0.1282 | 0.0501 | 0.1926 | 0.1103
ORF                | 0.2243 | 0.0786 | 0.2696 | 0.0871 | 0.1343 | 0.0464 | 0.2066 | 0.1085
LaRank             | 0.0944 | 0.0818 | 0.5656 | 0.5128 | 0.1712 | 0.2109 | 0.0964 | 0.1004
RF                 | 0.0683          | 0.0468          | 0.0387          | 0.0610
MCSVM-Lin          | 0.0727          | 0.2575          | 0.1266          | 0.0863
MCSVM-RBF          | 0.0559          | 0.0298          | 0.0360          | 0.0424
SAMME [31]         | 0.1071          | 0.4938          | 0.3391          | N/A
AdaBoost.ECC [31]  | 0.0506          | 0.2367          | 0.1029          | N/A
AdaBoost.SIP [31]  | 0.0548          | 0.1945          | 0.0602          | N/A

Table 2. Classification error on Caltech101 for 1 and 10 epochs. Bold face shows the best performing online method.

# Train   | 15              | 30
# Epochs  | 1      | 10     | 1      | 10
OMCLP     | 0.7437 | 0.6093 | 0.6672 | 0.5406
OMCGB     | 0.7520 | 0.6226 | 0.6860 | 0.5693
ORF       | 0.8969 | 0.8265 | 0.8880 | 0.8142
LaRank    | 0.7856 | 0.6353 | 0.7205 | 0.5803
Figure 1(b) shows the dynamics of the learners when the same training data is reshuffled and presented as new samples. We can see that although all methods benefit from revisiting the samples, our algorithm makes the most out of the epochs, and as can be seen towards the end of the 10-th epoch, it has the highest gap to the second best algorithm, OMCGB.

(4) We only report the performance of OMCGB-Exp, as with the logit loss the results were similar.

Figure 1. (a) Classification error on Caltech101 for 1 epoch (solid) and 10 epoch (dashed) runs when the number of training images per class is varied. (b) Classification error over different numbers of epochs when using 30 images per category for training.
Figure 2. Tracking with virtual background classes: (a) addition of a virtual class, (b) no negative update, (c) updating a virtual class.

3.3. Tracking with Virtual Classes

Object tracking is a common application for online learning algorithms in computer vision. In this section, we show the performance of our algorithm in a tracking-by-detection scenario. When training a discriminative object detector, the problem is usually formulated as binary classification. In tracking, we usually have a fast-changing, cluttered, complex background, which has to be described with a single background model. Our approach, however, is to break this binary task into a multi-class problem and utilize our robust online learner to discriminate between these classes. From the classification margin point of view, the background might contain samples which are very close to the decision boundaries of the target object. These samples are usually potential false positive detections during tracking, especially when there are fast appearance changes or occlusions. Since we know that our classifier can maximize the soft margin of the data instances, we sample densely from the decision boundaries of the target class in the feature space for potential false positive background regions. Then, each of these background regions is assigned to a separate virtual class. Hence, our online multi-class classifier will maximize its margin with respect to all these classes, while in the image domain it will keep tracking them as if they were indeed target objects. Since our learner is able to accommodate new classes on-the-fly, we can keep track of any new object entering the scene which might cause possible confusion. Figure 2 shows this procedure in action: Figure 2(a) depicts the addition of a virtual background class, and Figure 2(c) indicates the update of an existing virtual class (5). We conduct an extensive evaluation of our tracking method.
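The bookkeeping behind the virtual classes can be sketched as follows. This is a simplified illustration under our own assumptions (the class name, interface, and margin threshold are all invented, not the paper's implementation): a background region whose score falls within a small positive margin of the target's decision boundary is promoted to a new virtual class.

```python
# Sketch of virtual-class bookkeeping for tracking-by-detection.
# All names and the margin threshold are illustrative assumptions.

class VirtualClassBook:
    def __init__(self, margin_threshold=0.2):
        self.margin_threshold = margin_threshold
        self.num_classes = 1            # class 0 is the tracked target

    def process_region(self, target_conf, best_other_conf):
        """Possibly promote a confusing region to a new virtual class.

        target_conf:     classifier confidence for the target class
        best_other_conf: highest confidence among all other classes
        """
        margin = target_conf - best_other_conf
        if 0 < margin < self.margin_threshold:
            # Region lies close to the target's decision boundary:
            # assign it a brand-new virtual class and track it as well.
            new_label = self.num_classes
            self.num_classes += 1
            return new_label
        return None                     # no new virtual class needed

book = VirtualClassBook()
label = book.process_region(target_conf=0.55, best_other_conf=0.45)
```

In the full system, each virtual class would additionally be tracked in the image domain and used to update the online multi-class learner, so that its margin to the target class keeps growing.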
Since we want to show that the performance gain comes from our robust multi-class classifier and from the addition of the novel virtual background classes, we only use simple Haar-like features without any post-processing. For the evaluation of our tracker we use the detection criterion of the VOC Challenge [10], which is defined as |R_T ∩ R_GT| / |R_T ∪ R_GT|, where R_T is the tracking rectangle and R_GT the ground truth. The advantage of this score is that it truly shows how accurate the detection of the model is, rather than computing a raw distance measure between the target and the background. We measure the accuracy of a tracker by computing the average detection score over the entire video. We run each tracker 5 times and report the median average score. Table 3 lists the results for several publicly available benchmark sequences in comparison to other state-of-the-art tracking methods: MILTracker [3], FragTracker [1], and AdaBoostTracker [13]. In 5 out of 8 videos we outperform the other methods, while for the remaining 3 videos we are the second best method. Our unoptimized C++ implementation of the OMCLP algorithm reaches near real-time performance (around 10 to 15 frames/second on average).

(5) Please refer to the supplementary material for videos describing our tracking method in detail and its results.

Table 3. Average detection score: bold face shows the best method, while italic font indicates the second best.

Sequence  | OMCLP | MIL [3] | Frag [1] | OAB [13]
Sylvester | 0.67  | 0.60    | 0.62     | 0.52
Face 1    | 0.80  | 0.60    | 0.88     | 0.48
Face 2    | 0.78  | 0.68    | 0.44     | 0.68
Girl      | 0.64  | 0.53    | 0.60     | 0.40
Tiger 1   | 0.53  | 0.52    | 0.19     | 0.23
Tiger 2   | 0.44  | 0.53    | 0.15     | 0.28
David     | 0.61  | 0.57    | 0.43     | 0.26
Coke      | 0.24  | 0.33    | 0.08     | 0.17

4. Conclusions

In this paper, we presented the Online Multi-Class LPBoost algorithm, which is able to build, in an online setting, a robust multi-class boosting model that maximizes the soft margin of the data samples. We solved the optimization problem by performing a variant of an online convex programming technique, based on a primal-dual gradient descent-ascent strategy.
Based on an extensive set of experiments, we showed that our method outperforms the state-of-the-art on a wide range of applications, such as pattern recognition tasks, object category recognition tasks, and object tracking. Our C++ implementation is freely available online. Our optimization technique was built on the well-known online convex programming technique, which has efficient regret bounds. Therefore, we expect that similar bounds hold for our method as well. We will
present these results in an extended version of this paper.

References

[1] A. Adam, E. Rivlin, and I. Shimshoni. Robust fragments-based tracking using the integral histogram. In CVPR, 2006.
[2] E. L. Allwein, R. E. Schapire, and Y. Singer. Reducing multiclass to binary: A unifying approach for margin classifiers. Journal of Machine Learning Research, 1:113-141, December 2000.
[3] B. Babenko, M. H. Yang, and S. Belongie. Visual tracking with online multiple instance learning. In CVPR, 2009.
[4] H. Bekel, I. Bax, G. Heidemann, and H. Ritter. Adaptive computer vision: Online learning for object recognition. In DAGM, pages 447-454, 2004.
[5] A. Blum. On-line algorithms in machine learning. In Online Algorithms, pages 306-325, 1996.
[6] A. Bordes, L. Bottou, P. Gallinari, and J. Weston. Solving multiclass support vector machines with LaRank. In ICML, 2007.
[7] L. Breiman. Random forests. Machine Learning, 45(1):5-32, October 2001.
[8] R. T. Collins, Y. Liu, and M. Leordeanu. Online selection of discriminative tracking features. PAMI, 27:1631-1643, 2005.
[9] A. Demiriz, K. P. Bennett, and J. Shawe-Taylor. Linear programming boosting via column generation. Machine Learning, 46(1-3):225-254, 2002.
[10] M. Everingham, L. Van Gool, C. K. I. Williams, J. Winn, and A. Zisserman. The PASCAL visual object classes challenge 2007 (VOC2007) results.
[11] Y. Freund and R. Schapire. Experiments with a new boosting algorithm. In Proceedings of the Thirteenth International Conference on Machine Learning (ICML), pages 148-156, 1996.
[12] P. Gehler and S. Nowozin. On feature combination for multiclass object classification. In ICCV, 2009.
[13] H. Grabner and H. Bischof. On-line boosting and vision. In CVPR, pages 260-267, 2006.
[14] E. Granger, Y. Savaria, and P. Lavoie. A pattern reordering approach based on ambiguity detection for online category learning. PAMI, 25:525-529, 2003.
[15] V. Guruswami and A. Sahai. Multiclass learning, boosting, and error-correcting codes. In COLT, 1999.
[16] O. Javed, S. Ali, and M. Shah.
Online detection and classification of moving objects using progressively improving detectors. In CVPR, pages 695-700, 2005.
[17] S. S. Keerthi, S. Sundararajan, K. W. Chang, C. J. Hsieh, and C. J. Lin. A sequential dual coordinate method for large-scale multi-class linear SVMs. In KDD, 2008.
[18] C. Leistner, A. Saffari, P. M. Roth, and H. Bischof. On robustness of on-line boosting - a competitive study. In 3rd IEEE ICCV Workshop on On-line Computer Vision, 2009.
[19] P. M. Long and R. A. Servedio. Random classification noise defeats all convex potential boosters. In ICML, volume 307, pages 608-615, 2008.
[20] J. Nocedal and S. Wright. Numerical Optimization. Springer, April 2000.
[21] N. Oza and S. Russell. Online bagging and boosting. In AISTATS, pages 105-112, 2001.
[22] M. T. Pham and T. J. Cham. Online asymmetric boosted classifiers for object detection. In CVPR, 2007.
[23] G. Rätsch, T. Onoda, and K. R. Müller. Soft margins for AdaBoost. Machine Learning, 42(3):287-320, March 2001.
[24] A. Saffari, C. Leistner, J. Santner, M. Godec, and H. Bischof. On-line random forests. In 3rd IEEE ICCV Workshop on On-line Computer Vision, 2009.
[25] S. Shalev-Shwartz. Online Learning: Theory, Algorithms, and Applications. PhD thesis, The Hebrew University of Jerusalem, July 2007.
[26] C. Shen and H. Li. On the dual formulation of boosting algorithms. IEEE Transactions on Pattern Analysis and Machine Intelligence, December 2010.
[27] H. M. Shirazi and N. Vasconcelos. On the design of loss functions for classification: theory, robustness to outliers, and SavageBoost. In NIPS, pages 1049-1056, 2008.
[28] M. K. Warmuth, K. Glocer, and G. Rätsch. Boosting algorithms for maximizing the soft margin. In NIPS, 2007.
[29] M. K. Warmuth, K. A. Glocer, and S. V. Vishwanathan. Entropy regularized LPBoost. In ALT, pages 256-271, Berlin, Heidelberg, 2008. Springer-Verlag.
[30] B. Wu and R. Nevatia. Improving part based object detection by unsupervised, online boosting. In CVPR, 2007.
[31] B. Zhang, G. Ye, Y. Wang, J. Xu, and G.
Herman. Finding shareable informative patterns and optimal coding matrix for multiclass boosting. In ICCV, 2009.
[32] J. Zhu, S. Rosset, H. Zou, and T. Hastie. Multi-class AdaBoost. Technical report, 2006.
[33] M. Zinkevich. Online convex programming and generalized infinitesimal gradient ascent. In ICML, 2003.
[34] H. Zou, J. Zhu, and T. Hastie. New multi-category boosting algorithms based on multi-category Fisher-consistent losses. Annals of Applied Statistics, 2008.