Real-time Particle Filters

Cody Kwok, Dieter Fox, Marina Meilă
Dept. of Computer Science & Engineering, Dept. of Statistics
University of Washington, Seattle, WA 98195

Abstract

Particle filters estimate the state of dynamical systems from sensor information. In many real-time applications of particle filters, however, sensor information arrives at a significantly higher rate than the update rate of the filter. The prevalent approach to dealing with such situations is to update the particle filter as often as possible and to discard sensor information that cannot be processed in time. In this paper we present real-time particle filters, which make use of all sensor information even when the filter update rate is below the update rate of the sensors. This is achieved by representing posteriors as mixtures of sample sets, where each mixture component integrates one observation arriving during a filter update. The weights of the mixture components are set so as to minimize the approximation error introduced by the mixture representation. Thereby, our approach focuses computational resources (samples) on valuable sensor information. Experiments using data collected with a mobile robot show that our approach yields strong improvements over other approaches.

1 Introduction

Due to their sample-based representation, particle filters are well suited to estimate the state of non-linear dynamic systems. Over the last years, particle filters have been applied with great success to a variety of state estimation problems including visual tracking, speech recognition, and mobile robotics [1]. The increased representational power of particle filters, however, comes at the cost of higher computational complexity. The application of particle filters to online, real-time estimation raises new research questions. The key question in this context is: How can we deal with situations in which the rate of incoming sensor data is higher than the update rate of the particle filter? To the best of our knowledge, this problem has not been addressed in the literature so far.
The prevalent approach in real-time applications is to update the filter as often as possible and to discard sensor information that arrives during the update process. Obviously, this approach is prone to losing valuable sensor information. At first sight, the sample-based representation of particle filters suggests an alternative approach similar to an any-time implementation: Whenever a new observation arrives, sampling is interrupted and the next observation is processed. Unfortunately, such an approach can result in sample sets that are too small, causing the filter to diverge [1, 2].

In this paper we introduce real-time particle filters (RTPF) to deal with constraints imposed by limited computational resources. Instead of discarding sensor readings, we distribute the samples among the different observations arriving during a filter update. Hence RTPF represents densities over the state space by mixtures of sample sets, thereby avoiding the problem of filter divergence due to an insufficient number of independent samples. The weights of the mixture components are computed so as to minimize the approximation error introduced by the mixture representation. The resulting approach naturally focuses computational resources (samples) on valuable sensor information.

Figure 1: Different strategies for dealing with limited computational power. All approaches process the same number of samples per estimation interval (window size three). (a) Skip observations, i.e. integrate only every third observation. (b) Aggregate observations within a window and integrate them in one step. (c) Reduce sample set size so that each observation can be considered.

The remainder of this paper is organized as follows: In the next section we outline the basics of particle filters in the context of real-time constraints. Then, in Section 3, we introduce our novel technique for real-time particle filters. Finally, we present experimental results followed by a discussion of the properties of RTPF.

2 Particle filters

Particle filters are a sample-based variant of Bayes filters, which recursively estimate posterior densities, or beliefs $Bel$, over the state $x_t$ of a dynamical system (see [1, 3] for details):

$$Bel(x_t) \propto p(z_t \mid x_t) \int p(x_t \mid x_{t-1}, u_{t-1})\, Bel(x_{t-1})\, dx_{t-1} \tag{1}$$

Here $z_t$ is a sensor measurement and $u_{t-1}$ is control information measuring the dynamics of the system. Particle filters represent beliefs by sets $S_t$ of $N$ weighted samples $\{\langle x_t^{(i)}, w_t^{(i)} \rangle \mid i = 1, \ldots, N\}$. Each $x_t^{(i)}$ is a state, and the $w_t^{(i)}$ are non-negative numerical factors called importance weights, which sum up to one. The basic form of the particle filter realizes the recursive Bayes filter according to a sampling procedure, often referred to as sequential importance sampling with resampling (SISR):

1. Resampling: Draw with replacement a random state $x_{t-1}$ from the set $S_{t-1}$ according to the (discrete) distribution defined through the importance weights $w_{t-1}^{(i)}$.
2. Sampling: Use $x_{t-1}$ and the control information $u_{t-1}$ to sample $x_t$ according to the distribution $p(x_t \mid x_{t-1}, u_{t-1})$, which describes the dynamics of the system.
3. Importance sampling: Weight the sample $x_t$ by the observation likelihood $w_t = p(z_t \mid x_t)$.

Each iteration of these three steps generates a sample $\langle x_t, w_t \rangle$ representing the posterior. After $N$ iterations, the importance weights of the samples are normalized so that they sum up to one. Particle filters can be shown to converge to the true posterior even in non-Gaussian, non-linear dynamic systems [4].

A typical assumption underlying particle filters is that all samples can be updated whenever new sensor information arrives. Under real-time conditions, however, it is possible that the update cannot be completed before the next sensor measurement arrives. This can be the case for computationally complex sensor models or whenever the underlying posterior requires large sample sets [2]. The majority of filtering approaches deal with this problem by skipping sensor information that arrives during the update of the filter. While this approach works reasonably well in many situations, it is prone to missing valuable sensor information.
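The three SISR steps listed above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the 1-D motion model, the Gaussian likelihood, the noise levels, and the names `sisr_update`, `motion_sample`, and `likelihood` are all assumptions introduced here.

```python
import math
import random

def sisr_update(samples, weights, control, observation, motion_sample, likelihood, rng):
    """One SISR update over all N samples; returns (new_samples, normalized_weights)."""
    n = len(samples)
    # 1. Resampling: draw N states with replacement according to the importance weights.
    resampled = rng.choices(samples, weights=weights, k=n)
    # 2. Sampling: move each state through the dynamics p(x_t | x_{t-1}, u_{t-1}).
    predicted = [motion_sample(x, control, rng) for x in resampled]
    # 3. Importance sampling: weight each state by the observation likelihood p(z_t | x_t).
    raw = [likelihood(observation, x) for x in predicted]
    total = sum(raw)
    if total == 0.0:
        # Degenerate case (all likelihoods underflow): fall back to uniform weights.
        return predicted, [1.0 / n] * n
    return predicted, [w / total for w in raw]

# Hypothetical 1-D model, chosen only for illustration.
def motion_sample(x, u, rng):
    return x + u + rng.gauss(0.0, 0.1)      # additive motion noise (assumed)

def likelihood(z, x):
    return math.exp(-0.5 * ((z - x) / 0.5) ** 2)   # unnormalized Gaussian (assumed)

rng = random.Random(0)
samples = [rng.uniform(-5.0, 5.0) for _ in range(2000)]
weights = [1.0 / len(samples)] * len(samples)
true_x = 0.0
for _ in range(5):
    true_x += 1.0                            # the system moves one unit per step
    z = true_x                               # noise-free observation, for simplicity
    samples, weights = sisr_update(samples, weights, 1.0, z,
                                   motion_sample, likelihood, rng)
estimate = sum(w * x for x, w in zip(samples, weights))
```

The cost of one call grows linearly with the number of samples, which is why a large $N$ or an expensive likelihood can make the update slower than the sensor rate.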
Figure 2: Real-time particle filters. The samples are distributed among the observations within one estimation interval (window size three in this example). The belief is a mixture of the individual sample sets. Each arrow additionally represents the system dynamics.

Before we discuss ways of dealing with such situations, let us introduce some notation. We assume that observations arrive at time intervals $\Delta$, which we will call observation intervals. Let $N$ be the number of samples required by the particle filter. Assume that the resulting update cycle of the particle filter takes $k\Delta$ and is called the estimation interval or estimation window. Accordingly, $k$ observations arrive during one estimation interval. We call this number the window size of the filter, i.e. the number of observations obtained during a filter update. The $i$-th observation and state within window $t$ are denoted $z_{t_i}$ and $x_{t_i}$, respectively.

Fig. 1 illustrates different approaches to dealing with window sizes larger than one. The simplest and most common approach is shown in Fig. 1(a). Here, observations arriving during the update of the sample set are discarded, which has the obvious disadvantage that valuable sensor information might get lost. The approach in Fig. 1(b) overcomes this problem by aggregating multiple observations into one. While this technique avoids the loss of information, it is not applicable to arbitrary dynamical systems. For example, it assumes that observations can be aggregated optimally, and that the integration of an aggregated observation can be performed as efficiently as the integration of individual observations, which is often not the case. The third approach, shown in Fig. 1(c), simply stops generating new samples whenever an observation is made (hence each sample set contains only $N/k$ samples). While this approach takes advantage of the any-time capabilities of particle filters, it is susceptible to filter divergence due to an insufficient number of samples [1, 2].
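The notation above reduces to simple arithmetic, sketched below. The function names and the timing values (an 800 ms update cycle and a 200 ms observation interval) are hypothetical and chosen only to make the numbers concrete; the sample count of 20,000 is the value reported in the experiments of this paper.

```python
def window_size(update_ms, observation_interval_ms):
    """Window size k: number of observations arriving during one filter update.

    Uses integer ceiling division so that a partially used interval still counts.
    """
    return max(1, -(-update_ms // observation_interval_ms))

def samples_per_set(n_samples, k):
    """Samples left for each observation when N samples are split over k sets."""
    return n_samples // k

N = 20000                                                     # sample set size N
k = window_size(update_ms=800, observation_interval_ms=200)   # hypothetical timings
per_set = samples_per_set(N, k)

# Strategy (a) integrates 1 of every k observations with all N samples;
# strategy (c) integrates every observation with only N/k samples.
print(k, per_set)
```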
3 Real-time particle filters

In this paper we propose real-time particle filters (RTPFs), a novel approach to dealing with limited computational resources. The key idea of RTPFs is to consider all sensor measurements by distributing the samples among the observations within an update window. Additionally, by weighting the different sample sets within a window, our approach focuses the computational resources (samples) on the most valuable observations. Fig. 2 illustrates the approach. As can be seen, instead of one sample set at time $t$, we maintain $k$ smaller sample sets at $t_1, t_2, \ldots, t_k$. We treat such a virtual sample set, or belief, as a mixture of the distributions represented in it. The mixture components represent the state of the system at different points in time. If needed, however, the complete belief can be generated by considering the dynamics between the individual mixture components.

Compared to the first approach discussed in the previous section, this method has the advantage of not skipping any observations. In contrast to the approach shown in Fig. 1(b), RTPFs do not make any assumptions about the nature of the sensor data, i.e. whether it can be aggregated or not. The difference to the third approach (Fig. 1(c)) is more subtle. In both approaches, each of the sample sets can only contain $N/k$ samples. The belief state that is propagated by RTPF to the next estimation interval is a mixture distribution where each mixture component is represented by one of the $k$ sample sets, all generated independently from the previous window. Thus, the belief state propagation is simulated by $N$ sample trajectories that, for computational convenience, are represented at the points in time where the observations are integrated. In approach (c), however, the belief propagation is simulated with only $N/k$ independent samples.
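Treating the belief as a mixture of sample sets suggests a two-stage draw: first pick a sample set according to the mixture weights, then pick a state within that set according to its importance weights. The sketch below illustrates this under simplifying assumptions; the function name `draw_from_mixture` and the toy sample sets are hypothetical, and the dynamics that relate sets taken at different points in time are deliberately left out.

```python
import random

def draw_from_mixture(sample_sets, alphas, n_draws, rng):
    """Draw states from a mixture belief.

    sample_sets[i] is a pair (states, weights) for mixture component i;
    alphas[i] is the mixture weight of that component.
    """
    drawn = []
    for _ in range(n_draws):
        # Stage 1: choose a mixture component with probability alpha_i.
        states, weights = rng.choices(sample_sets, weights=alphas, k=1)[0]
        # Stage 2: choose a state inside it according to its importance weights.
        drawn.append(rng.choices(states, weights=weights, k=1)[0])
    return drawn

rng = random.Random(1)
set_a = ([0.0, 0.1, 0.2], [0.2, 0.5, 0.3])      # toy component around 0
set_b = ([10.0, 10.1], [0.5, 0.5])              # toy component around 10
draws = draw_from_mixture([set_a, set_b], alphas=[0.9, 0.1], n_draws=1000, rng=rng)
fraction_from_a = sum(1 for x in draws if x < 5.0) / len(draws)
```

With these weights roughly nine in ten draws come from the first component, which is how larger mixture weights translate into more samples for the corresponding observation.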
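We will now show how RTPF determines the weights of the mixture belief. The key idea is to choose the weights that minimize the KL-divergence between the mixture belief and the optimal belief. The optimal belief is the belief we would get if there was enough time to compute the full posterior within the update window.

3.1 Mixture representation

Let us restrict our attention to one estimation interval consisting of $k$ observations. The optimal belief $Bel_{opt}(x_{t_k})$ at the end of an estimation window results from iterative application of the Bayes filter update on each observation:

$$Bel_{opt}(x_{t_k}) \propto \int \cdots \int \prod_{i=1}^{k} p(z_{t_i} \mid x_{t_i})\, p(x_{t_i} \mid x_{t_{i-1}}, u_{t_{i-1}})\; Bel(x_{t_0})\; dx_{t_0} \cdots dx_{t_{k-1}} \tag{2}$$

Here $Bel(x_{t_0})$ denotes the belief generated in the previous estimation window. In essence, (2) computes the belief by integrating over all trajectories through the estimation interval, where the start position of the trajectories is drawn from the previous belief. The probability of each trajectory is determined using the control information $u_{t_0}, \ldots, u_{t_{k-1}}$, and the likelihoods of the observations $z_{t_1}, \ldots, z_{t_k}$ along the trajectory.

Now let $Bel_i(x_{t_k})$ denote the belief resulting from integrating only the $i$-th observation within the estimation window. RTPF computes a mixture of $k$ such beliefs, one for each observation. The mixture, denoted $Bel_{mix}(x_{t_k} \mid \alpha)$, is the weighted sum of the mixture components $Bel_i(x_{t_k})$, where $\alpha$ denotes the mixture weights:

$$Bel_{mix}(x_{t_k} \mid \alpha) = \sum_{i=1}^{k} \alpha_i\, Bel_i(x_{t_k}),$$

where $\alpha_i \ge 0$, $\sum_{i=1}^{k} \alpha_i = 1$, and

$$Bel_i(x_{t_k}) \propto \int \cdots \int p(z_{t_i} \mid x_{t_i}) \prod_{j=1}^{k} p(x_{t_j} \mid x_{t_{j-1}}, u_{t_{j-1}})\; Bel(x_{t_0})\; dx_{t_0} \cdots dx_{t_{k-1}} \tag{3}$$

Here, too, we integrate over all trajectories. In contrast to (2), however, each trajectory selectively integrates only one of the $k$ observations within the estimation interval.

3.2 Optimizing the mixture weights

We will now turn to the problem of finding the weights of the mixture. These weights reflect the importance of the respective observations for describing the optimal belief. The idea is to set them so as to minimize the approximation error introduced by the mixture distribution. More formally, we determine the mixing weights $\alpha^*$ by minimizing the KL-divergence [5] between $Bel_{mix}$ and $Bel_{opt}$:

$$\alpha^* = \operatorname*{argmin}_{\alpha \in \mathcal{A}} \; KL\big(Bel_{mix}(\cdot \mid \alpha) \,\|\, Bel_{opt}\big) \tag{4}$$

$$\phantom{\alpha^*} = \operatorname*{argmin}_{\alpha \in \mathcal{A}} \int Bel_{mix}(x_{t_k} \mid \alpha) \log \frac{Bel_{mix}(x_{t_k} \mid \alpha)}{Bel_{opt}(x_{t_k})}\, dx_{t_k} \tag{5}$$

In the above, $\mathcal{A} = \{\alpha \mid \alpha_i \ge 0, \sum_{i=1}^{k} \alpha_i = 1\}$ is the weight domain. Optimizing the weights of mixture approximations can be done using EM [6] or (constrained) gradient descent [7]. Here, we perform a small number of gradient descent steps to find the mixture weights.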
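To make the criterion in (5) concrete, the following sketch evaluates it on a tiny discrete state space. It is a toy illustration under stated assumptions, not part of the method itself: the state is static (identity dynamics), the prior is uniform over three cells, and the two likelihood vectors are invented so that the first observation is informative and the second is not.

```python
import math

def normalize(values):
    total = sum(values)
    return [v / total for v in values]

def mixture(alphas, components):
    """Bel_mix: weighted sum of the per-observation beliefs Bel_i."""
    n = len(components[0])
    return [sum(a * c[j] for a, c in zip(alphas, components)) for j in range(n)]

def kl_divergence(p, q):
    """KL(p || q) for discrete distributions with q > 0 wherever p > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

prior = [1 / 3, 1 / 3, 1 / 3]
lik_1 = [0.90, 0.05, 0.05]     # informative observation (assumed values)
lik_2 = [0.40, 0.30, 0.30]     # nearly uninformative observation (assumed values)

# With identity dynamics, Bel_i is prior times one likelihood,
# and Bel_opt is prior times the product of all likelihoods.
bel_1 = normalize([p * l for p, l in zip(prior, lik_1)])
bel_2 = normalize([p * l for p, l in zip(prior, lik_2)])
bel_opt = normalize([p * a * b for p, a, b in zip(prior, lik_1, lik_2)])

j_uniform = kl_divergence(mixture([0.5, 0.5], [bel_1, bel_2]), bel_opt)
j_weighted = kl_divergence(mixture([0.9, 0.1], [bel_1, bel_2]), bel_opt)
```

In this example putting most of the weight on the informative observation gives a markedly smaller divergence than uniform weights, which is the effect the weight optimization is designed to exploit.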
Denote by $J(\alpha)$ the criterion to be minimized in (5). The gradient of $J$ is given by

$$\frac{\partial J}{\partial \alpha_i} = 1 + \int Bel_i(x_{t_k}) \log \frac{Bel_{mix}(x_{t_k} \mid \alpha)}{Bel_{opt}(x_{t_k})}\, dx_{t_k}, \qquad i = 1, \ldots, k \tag{6}$$

The start point for the gradient descent is chosen to be the center of the weight domain $\mathcal{A}$, that is $\alpha = (1/k, \ldots, 1/k)$.

Footnote: Note that typically the individual predictions $p(x_{t_j} \mid x_{t_{j-1}}, u_{t_{j-1}})$ can be concatenated so that only two predictions for each trajectory have to be performed, one before and one after the corresponding observation.

3.3 Monte Carlo gradient estimation

The exact computation of the gradients in (6) requires the computation of the different beliefs, each in turn requiring several particle filter updates (see (2), (3)), and integration over all states $x_{t_k}$. This is clearly not feasible in our case. We solve this problem by Monte Carlo approximation. The approach is based on the observation that the beliefs in (6) share the same trajectories through space and differ only in the observations they integrate. Therefore, we first generate sample trajectories through the estimation window without considering the observations, and then use importance sampling to generate the beliefs needed for the gradient estimation.

Trajectory generation is done as follows: we draw a sample $x_{t_0}$ from a sample set of the previous mixture belief, where the probability of choosing a set $S_{t_i}$ is given by the mixture weights $\alpha_i$. This sample is then moved forward in time by consecutively drawing samples from the distributions $p(x_{t_j} \mid x_{t_{j-1}}, u_{t_{j-1}})$ at each time step $j = 1, \ldots, k$. The resulting trajectories are drawn from the following proposal distribution $q$:

$$q(x_{t_0}, \ldots, x_{t_k}) = \prod_{j=1}^{k} p(x_{t_j} \mid x_{t_{j-1}}, u_{t_{j-1}})\; Bel(x_{t_0}) \tag{7}$$

Using importance sampling, we obtain sample-based estimates of $Bel_i$ and $Bel_{opt}$ by simply weighting each trajectory with $p(z_{t_i} \mid x_{t_i})$ or $\prod_{i=1}^{k} p(z_{t_i} \mid x_{t_i})$, respectively (compare (2) and (3)). $Bel_{mix}$ is generated with minimal computational overhead by averaging the weights computed for the individual $Bel_i$ distributions. The use of the same trajectories for all distributions has the advantage that it is highly efficient and that it reduces the variance of the gradient estimate. This variance reduction is due to using the same random bits in evaluating the diverse scenarios of incorporating one or another of the observations [8].

Further variance reduction is achieved by using stratified sampling on trajectories. The trajectories are grouped by determining connected regions in a grid over the state space (at time $t_k$). Neighboring cells are considered connected if both contain samples.
To compute the gradients by formula (6), we then perform summation and normalization over the grouped trajectories. Empirical evaluations showed that this grouping greatly reduces the number of trajectories needed to get smooth gradient estimates. An additional, very important benefit of grouping is the reduction of the bias due to different dynamics applied to the different sample sets in the estimation window. In our experiments the number of trajectories is less than […] of the total number of samples, resulting in a computational overhead of about […]% of the total estimation time.

To summarize, the RTPF algorithm works as follows. The number $N$ of independent samples needed to represent the belief, the update rate of incoming sensor data, and the available processing power determine the size $k$ of the estimation window and hence the number of mixture components. RTPF computes the optimal weights of the mixture distribution at the end of each estimation window. This is done by gradient descent using the Monte Carlo estimates of the gradients. The resulting weights are used to generate samples for the individual sample sets of the next estimation window. To do so, we keep track of the control information (dynamics) between the different sample sets of two consecutive windows.
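The weight computation summarized above can be sketched as follows. This is a simplified illustration, not the authors' implementation: trajectories are assumed to be already generated and already assigned to groups (here plain integer labels stand in for the connected grid regions), all likelihood values are invented and strictly positive, and the constraint handling (centering the gradient, clipping at a small floor, renormalizing) is one simple choice among several for constrained gradient descent.

```python
import math

def normalize(values):
    total = sum(values)
    return [v / total for v in values]

def grouped_beliefs(groups, liks):
    """Sample-based Bel_i and Bel_opt over trajectory groups.

    groups[j]: group label of trajectory j (stand-in for a connected grid region).
    liks[j][i]: likelihood p(z_{t_i} | x_{t_i}) along trajectory j.
    """
    k = len(liks[0])
    index = {g: n for n, g in enumerate(sorted(set(groups)))}
    bel = [[0.0] * len(index) for _ in range(k)]
    opt = [0.0] * len(index)
    for g, row in zip(groups, liks):
        n = index[g]
        for i, lik in enumerate(row):
            bel[i][n] += lik              # Bel_i: weight by observation i only
        opt[n] += math.prod(row)          # Bel_opt: weight by all observations
    return [normalize(b) for b in bel], normalize(opt)

def gradient(alphas, bel, opt):
    """Discrete analogue of (6): 1 + sum over groups of Bel_i * log(Bel_mix / Bel_opt)."""
    n = len(opt)
    mix = [sum(a * b[g] for a, b in zip(alphas, bel)) for g in range(n)]
    return [1.0 + sum(b[g] * math.log(mix[g] / opt[g]) for g in range(n) if b[g] > 0.0)
            for b in bel]

def optimize_weights(bel, opt, steps=50, rate=0.1, floor=1e-3):
    k = len(bel)
    alphas = [1.0 / k] * k                # start at the center of the weight domain
    for _ in range(steps):
        g = gradient(alphas, bel, opt)
        mean_g = sum(g) / k               # centering keeps the weights summing to one
        alphas = normalize([max(floor, a - rate * (gi - mean_g))
                            for a, gi in zip(alphas, g)])
    return alphas

# Toy data: observation 1 separates the two groups, observation 2 does not.
groups = [0, 0, 1, 1]
liks = [[0.9, 0.5], [0.9, 0.5], [0.1, 0.5], [0.1, 0.5]]
bel, opt = grouped_beliefs(groups, liks)
alphas = optimize_weights(bel, opt)
```

In this toy case the optimal belief coincides with the belief of the first observation, so the descent steadily shifts weight toward it.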
Fig. 3: Map of the environment used for the experiment (scale labels: 18m, 54m). The robot was moved around the symmetric loop on the left. The task of the robot was to determine its position using data collected by two distance measuring devices, one pointing to its left, the other pointing to its right.

4 Experiments

In this section we evaluate the effectiveness of RTPF against the alternatives, using data collected from a mobile robot in a real-world environment. Figure 3 shows the setup of the experiment: The robot was placed in the office floor and moved around the loop on the left. The task of the robot was to determine its position within the map, using data collected by two laser beams, one pointing to its left, the other pointing to its right. The two laser beams were extracted from a planar laser range-finder, allowing the robot only to determine the distance to the walls on its left and right. Between each observation the robot moved approximately 50cm (see [3] for details on robot localization and sensor models). Note that the loop in the environment is symmetric except for a few landmarks along the walls of the corridor. Localization performance was measured by the average distance between the samples and the reference robot positions, which were computed offline.

In the experiments, our real-time algorithm, RTPF, is compared to particle filters that skip observations, called Skip data (Figure 1(a)), and particle filters with insufficient samples, called Naive (Figure 1(c)). Furthermore, to gauge the efficiency of our mixture weighting, we also obtained results for our real-time algorithm without weighting, i.e. we used mixture distributions and fixed the weights to $1/k$. We denote this variant Uniform. Finally, we also include as reference the Baseline approach, which is allowed to generate $N$ samples for each observation, thereby not considering real-time constraints.

The experiment is set up as follows. First, we fix the sample set size $N$ which is sufficient for the robot to localize itself. In our experiment $N$ is set empirically to 20,000 (the particle filters may fail at lower $N$, see also [2]).
We then vary the computational resources, resulting in different window sizes $k$. A larger window size means lower computational power, and the number of samples that can be generated for each observation decreases to $N/k$.

Figure 4 shows the evolution of the average localization error over time, using different window sizes. Each graph is obtained by averaging over 30 runs with different random seeds and start positions. The error bars indicate 95% confidence intervals. As the figures show, Naive gives the worst results, which is due to insufficient numbers of samples, resulting in divergence of the filter. While Uniform performs slightly better than Skip data, RTPF is the most effective of all algorithms, localizing the robot in the least amount of time. Furthermore, RTPF shows the least degradation with limited computational power (larger window sizes). The key advantage of RTPF over Uniform lies in the mixture weighting, which allows our approach to focus computational resources on valuable sensor information, for example when the robot passes an informative feature in one of the hallways. For short window sizes (Fig. 4(a)), this advantage is not very strong since in this environment, most features can be detected in several consecutive sensor measurements. Note that because the Baseline approach was allowed to integrate all observations with all of the 20,000 samples, it converges to a lower error level than all the other approaches.
[Figure 4: panels (a)-(c) plot average localization error [cm] against time [sec] for Baseline, Skip data, RTPF, Naive, and Uniform; panel (d) plots localization speedup against window size.]

Fig. 4(a)-(c): Performance of the different algorithms for window sizes of 4, 8, and 12 respectively. The x-axis represents time elapsed since the beginning of the localization experiment. The y-axis plots the localization error measured in average distance from the reference position. Each figure includes the performance achieved with unlimited computational power as the Baseline graph. Each point is averaged over 30 runs, and error bars indicate 95% confidence intervals. Fig. 4(d) represents the localization speedup of RTPF over Skip data for various window sizes. The advantage of RTPF increases with the difficulty of the task, i.e. with increasing window size. Between window size 6 and 12, RTPF localizes at least twice as fast as Skip data.

Without the mixture weighting of RTPF, we did not expect Uniform to outperform Skip data significantly. To see this, consider one estimation window of length $k$. Suppose only one of the $k$ observations detects a landmark, or very informative feature in the hallway. In such a situation, Uniform considers this landmark every time the robot passes it. However, it only assigns $N/k$ samples to this landmark detection. Skip data, on the other hand, detects the landmark only every $k$-th time, but assigns all $N$ samples to it. Therefore, averaged over many different runs, the mean performance of Uniform and Skip data is very similar. However, the variance of the error is significantly lower for Uniform since it considers the detection in every run. In contrast to both approaches, RTPF detects all landmarks and generates more samples for the landmark detections, thereby gaining the best of both worlds, and Figures 4(a)-(c) show this is indeed the case.

In Figure 4(d) we summarize the performance gain of RTPF over Skip data for different window sizes in terms of localization time.
We considered the robot to be localized if the average localization error remains below 200 cm over a period of 10 seconds. If the run never reaches this level, the localization time is set to the length of the entire run, which is 574 seconds. The x-axis represents the window size and the y-axis the localization speedup. For each window size, speedups were determined using t-tests on the localization times for the 30 pairs of data runs. All results are significant at the 95% level. The graph shows that with increasing window size (i.e. decreasing processing power), the localization speedup increases. At small window sizes the speedup is 20-50%, but it goes up to 2.7 times for larger windows, demonstrating the benefits of the RTPF approach over traditional particle filters. Ultimately, for very large window sizes, the speedup decreases again, which is due to the fact that none of the approaches is able to reduce the error below 200cm within the run time of an experiment.
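The localization criterion described above can be sketched as a small function over an error time series. The sketch rests on assumptions not stated in the text: the function name `localization_time` is hypothetical, the hold period is passed explicitly as a parameter, the reported localization time is taken to be the start of the first period during which the error stays below the threshold, and the example error values are invented.

```python
def localization_time(times, errors, hold_s, threshold_cm=200.0, run_length_s=574.0):
    """Start of the first period of at least hold_s seconds with error below threshold.

    Returns run_length_s if the run never stays below the threshold that long.
    """
    start = None
    for t, e in zip(times, errors):
        if e < threshold_cm:
            if start is None:
                start = t                 # error just dropped below the threshold
            if t - start >= hold_s:
                return float(start)       # it stayed below for the whole hold period
        else:
            start = None                  # error rose again: restart the search
    return run_length_s

# Invented example traces, sampled every 5 seconds.
times = list(range(0, 60, 5))
errors_fast = [900, 700, 400, 250, 180, 150, 120, 100, 90, 80, 80, 80]
errors_slow = [900, 850, 800, 700, 600, 500, 400, 300, 190, 180, 170, 160]

t_fast = localization_time(times, errors_fast, hold_s=10.0)
t_slow = localization_time(times, errors_slow, hold_s=10.0)
speedup = t_slow / t_fast
never = localization_time(times, [500] * len(times), hold_s=10.0)
```

A speedup computed this way is simply the ratio of the two localization times for a pair of runs; the significance testing over many pairs is not shown here.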
5 Conclusions

In this paper we tackled the problem of particle filtering under the constraint of limited computing resources. Our approach makes near-optimal use of sensor information by dividing sample sets between all available observations and then representing the state as a mixture of sample sets. Next we optimize the mixing weights in order to be as close to the true posterior distribution as possible. Optimization is performed efficiently by gradient descent using a Monte Carlo approximation of the gradients.

We showed that RTPF produces significant performance improvements in a robot localization task. The results indicate that our approach outperforms all alternative methods for dealing with limited computation. Furthermore, RTPF localized the robot up to 2.7 times faster than the original particle filter approach, which skips sensor data. Based on these results, we expect our method to be highly valuable in a wide range of real-time applications of particle filters. RTPF yields maximal performance gain for data streams containing highly valuable sensor data occurring at unpredictable time points.

The idea of approximating belief states by mixtures has also been used in the context of dynamic Bayesian networks [9]. However, Boyen and Koller use mixtures to represent belief states at a specific point in time, not over multiple time steps. Our work is motivated by real-time constraints that are not present in [9].

So far RTPF uses fixed sample sizes and fixed window sizes. The next natural step is to adapt these two structural parameters to further speed up the computation. For example, by the method of [2] we can change the sample size on-the-fly, which in turn allows us to change the window size. Ongoing experiments suggest that this combination yields further performance improvements: When the state uncertainty is high, many samples are used and these samples are spread out over multiple observations. On the other hand, when the uncertainty is low, the number of samples is very small and RTPF becomes identical to the vanilla particle filter with one update (sample set) per observation.
6 Acknowledgements

This research is sponsored in part by the National Science Foundation (CAREER grant number …) and by DARPA (MICA program).

References

[1] A. Doucet, N. de Freitas, and N. Gordon, editors. Sequential Monte Carlo in Practice. Springer-Verlag, New York, 2001.
[2] D. Fox. KLD-sampling: Adaptive particle filters and mobile robot localization. In Advances in Neural Information Processing Systems (NIPS), 2001.
[3] D. Fox, S. Thrun, F. Dellaert, and W. Burgard. Particle filters for mobile robot localization. In Doucet et al. [1].
[4] P. Del Moral and L. Miclo. Branching and interacting particle systems approximations of Feynman-Kac formulae with applications to non linear filtering. In Séminaire de Probabilités XXXIV, number 1729 in Lecture Notes in Mathematics. Springer-Verlag, 2000.
[5] T. M. Cover and J. A. Thomas. Elements of Information Theory. Wiley Series in Telecommunications. Wiley, New York, 1991.
[6] W. Poland and R. Shachter. Mixtures of Gaussians and minimum relative entropy techniques for modeling continuous uncertainties. In Proc. of the Conference on Uncertainty in Artificial Intelligence (UAI), 1993.
[7] T. Jaakkola and M. Jordan. Improving the mean field approximation via the use of mixture distributions. In Learning in Graphical Models. Kluwer, 1997.
[8] P. R. Cohen. Empirical methods for artificial intelligence. MIT Press, 1995.
[9] X. Boyen and D. Koller. Tractable inference for complex stochastic processes. In Proc. of the Conference on Uncertainty in Artificial Intelligence (UAI), 1998.