Vehicle Detection, Classification and Position Estimation based on Monocular Video Data during Night-time

Jonas Firl, Marko H. Hoerter, Martin Lauer and Christoph Stiller

Keywords: Automotive Lighting, Light-based Driver Assistance, Detection, Tracking, Classification, Light Sources.

1 Abstract

This study describes an effective method for detecting, tracking and classifying vehicles during night-time in order to support automotive adaptive illumination applications. The software framework described here, which computes the relative position, velocity and estimated class of all detected vehicles, integrates multiple processing stages. Firstly, an image segmentation using a threshold method detects all light sources in the image. Secondly, possible pairs of head- and taillights are clustered using geometrical information. Thirdly, all detected objects are tracked using a Kalman filter to increase the resolution and robustness of the algorithm. Lastly, a method for computing distance and velocity for all classified objects (e.g. cars, trucks, bikes) is presented. The system has been tested to run in real time, and some results and conclusions are offered at the end.

2 Introduction

Driving a vehicle during night-time is more hazardous than during day-time, as was recently reviewed in detail by the German Federal Office of Statistics (DESTATIS, [1]). Technically inadequate illumination of the road scenery as well as wrongly utilized light functions (e.g. poor usage of, or alternation between, low and high beam) are just some of the causes. For a lighting-based driving assistance system to provide the best possible illumination without dazzling other road users, it is most important to estimate the spatial position, the velocity and the class of each detected light source. With such data reliably acquired from the sensor side, different lighting-based driving assistance systems can be realized, such as Adaptive Cut-Off Line, Predictive Illumination Distance Control or Masked High Beam derivatives [2].

2.1 Objective of the research project

The intention of the research project described here is to detect, track and finally classify light sources based on real-world monocular image sequences. To implement a suitable and efficient detection algorithm, different detection methods have been evaluated against each other with respect to their performance. For tracking purposes a linear, discretized Kalman filter has been utilized, which is an iterative, stochastic state estimator that minimizes the mean squared error. The applied classifier calculates the probability of error for each detected light source based on real-world image sequences.

2.2 Utilized hard- and software

The above-mentioned image sequences were acquired with the dedicated test vehicle of the Department of Measurement and Control (see Figure 1 and Figure 2). The visual sensor used (Point Grey, type Firefly MV) was mounted on a camera platform close to the rear-view mirror and was operated at 30 frames per second. To guarantee time-synchronous data (e.g. vehicle-immanent plus video data), a real-time database (RTDB) framework, introduced in [4], has been used.

Figure 1 - AUDI Q7 as a testing vehicle at the Department of Measurement and Control. [3]

Figure 2 - Dedicated camera platform mounted behind the front window. [3]

3 Light source detection

Within this section the light source detection algorithm used will be described in detail. As a requirement, the detection algorithm should be capable of detecting not only vehicle front lights (e.g. cars, trucks, motorcycles), but also other light

sources with sufficient light intensity, like street lamps. In [3], different detection methods, like SURF, the Hough transform or dedicated (multi-level) threshold operations, have been evaluated with respect to performance and applicability. As a result of this evaluation, a picture-row-based threshold operation, described in [5], has been used in the further processing (see Figure 3).

Figure 3 - Found parameter curve of the picture-row-based threshold method; here applied to a 640x480 pixel picture. [3]

Figure 4 - Outcome of the picture-row-based threshold filtering method. [3]

3.1 Light source detection with threshold filter

As described above, a picture-row-based threshold operation is used, which sets the threshold value according to the vertical pixel position. As pointed out in [5], one of the most important benefits of this method compared to methods with a constant threshold is that even weakly emitting light sources (e.g. tail lights) can be detected at far distances. From the threshold-filtered pictures, connected blobs, which may contain two or more detected light sources, can be extracted. For this purpose, connected component analysis, detailed in [6], has been used in this work. Each computed blob carries the following values: midpoint, standard deviation and number of pixels. Owing to their appearance, detected light sources above the horizon are completely described by these values (see Figure 4), whereas the blobs underneath the horizon still need further processing to extract single light sources from them.
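The two steps of section 3.1 can be illustrated with a minimal sketch. It makes several assumptions that are not taken from the paper: the names segment_rows, extract_blobs and row_thresholds are hypothetical, the per-row thresholds are supplied by the caller because the parameter curve of Figure 3 is not given in the text, the standard deviation is computed per image axis, and a plain 8-connected flood fill stands in for the contour-tracing labeling algorithm of [6].

```python
from collections import deque
from math import sqrt


def segment_rows(image, row_thresholds):
    """Binary mask: a pixel is kept if it exceeds the threshold of its row.

    image is a list of rows of grey values; row_thresholds holds one
    (assumed, caller-supplied) threshold per image row.
    """
    return [
        [value > row_thresholds[r] for value in line]
        for r, line in enumerate(image)
    ]


def extract_blobs(mask):
    """Label 8-connected regions by flood fill and summarise each blob.

    Returns one dict per blob with midpoint (row, column), per-axis
    population standard deviation and number of pixels.
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    blobs = []
    for r0 in range(height):
        for c0 in range(width):
            if not mask[r0][c0] or seen[r0][c0]:
                continue
            # Breadth-first flood fill starting from an unvisited bright pixel.
            seen[r0][c0] = True
            queue = deque([(r0, c0)])
            pixels = []
            while queue:
                r, c = queue.popleft()
                pixels.append((r, c))
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        rr, cc = r + dr, c + dc
                        if (0 <= rr < height and 0 <= cc < width
                                and mask[rr][cc] and not seen[rr][cc]):
                            seen[rr][cc] = True
                            queue.append((rr, cc))
            n = len(pixels)
            mean_r = sum(p[0] for p in pixels) / n
            mean_c = sum(p[1] for p in pixels) / n
            std_r = sqrt(sum((p[0] - mean_r) ** 2 for p in pixels) / n)
            std_c = sqrt(sum((p[1] - mean_c) ** 2 for p in pixels) / n)
            blobs.append({
                "midpoint": (mean_r, mean_c),
                "std": (std_r, std_c),
                "pixels": n,
            })
    return blobs
```

Calling extract_blobs(segment_rows(image, row_thresholds)) yields the blob list for one frame. With a lower threshold in the rows where distant vehicles are expected, a dim tail light can survive the segmentation even though a single constant threshold tuned for head lights would suppress it.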

3.2 Detection of reflections

If single light sources are sufficiently far apart from each other, they are separated by the threshold filter operation itself. If not, two or more light sources appear as a single light source. This happens most likely within overexposed picture regions, for instance caused by reflections or dense traffic situations. The challenge is to decide for each detected blob of the given picture whether it consists of one or multiple light sources (e.g. two head lights plus two corresponding reflections, see Figure 5).

Figure 5 - Consolidated blobs resulting from threshold filtering. [3]

Figure 6 - Pair of head lights with corresponding reflections. [3]

Figure 7 - Separated blobs (red) plus corresponding reflections (green). [3]

Figure 8 - Pairs of light sources grouped together by clustering algorithm. [3]

Within this work, a proper separation between single blobs and their reflections has been accomplished by summing up the pixel values column by column as well as row by row. A blob detected at its position within a picture thus yields a corresponding row sum vector (the pixel values summed along each row) as well as a column sum vector (the pixel values summed along each column). In the next step, these vectors are inspected for local maxima to obtain the number of separate light sources. In the example of Figure 6, two local maxima are found in each pixel direction, and the four corresponding light sources are located at their intersection points. This makes it easy to distinguish between light sources and reflections using simple heuristic knowledge (a reflection usually lies underneath its corresponding light source) and by considering a maximum gap between those two groups (see Figure 7).

3.3 Clustering pairs of front and tail lights

As a result of the detection stage, a list of detected light sources per picture can be extracted. To be able to track obstacles within the traffic context, a clustering method has to pool potential pairs of front and tail lights together. In [7] a method is presented which clusters light sources within a given picture based on geometrical properties. In our approach, this method has been adapted such that the input data of two light sources $i$ and $j$ with their geometric dimensions are given as follows: the maximum and minimum pixel positions $(u_{\max}, v_{\max})$ and $(u_{\min}, v_{\min})$, the corresponding height $H$ and width $W$, as well as the number of pixels $A$ belonging to the light source. The two light sources are pooled together as soon as the following four properties hold:

(1) Vertical projections have adequate intersection areas:

$$|v_{\min,i} - v_{\min,j}| < c_{vp} \max(H_i, H_j),$$
$$|v_{\max,i} - v_{\max,j}| < c_{vp} \max(H_i, H_j).$$

(2) Heights approximately in the same range:

$$|H_i - H_j| < c_h \max(H_i, H_j).$$

(3) Numbers of pixels approximately in the same range:

$$\min(A_i, A_j) > c_a \cdot \max(A_i, A_j).$$

(4) Horizontal distance as follows:

$$D_h = \frac{\left| (u_{\max,i} + u_{\min,i}) - (u_{\max,j} + u_{\min,j}) \right|}{2} < c_d \max(H_i, H_j, W_i, W_j).$$

The parameters have been determined manually by evaluating a large amount of test data. Compared to [7], the calculation of the horizontal distance has been adapted in two ways. Firstly, not only the maximum height but also the maximum width is taken into consideration, because light sources do not always appear with an ideal round shape in reality. Secondly, the threshold has been raised so that very small tail lights can also be detected. As a result, a list of potential pairs of front and tail lights can be established and applied to the given sequence of pictures (see Figure 8).

Figure 9 - Camera and world coordinate system. [3]

Figure 10 - Results of the position estimation of an object at different distances (axis: distance [meters]; curves: real distance, measured distance). [3]

4 Position and velocity estimation

The aim of this section is to present a method to compute 3D positions in a fixed world coordinate system from given 2D image coordinates of detected light sources. These data are essential to obtain parameters like distance or velocity of

detected vehicles in the road scenery in front of the own car. At the same time they are the input data of the tracking algorithm described in section 5. In most cases the problem of the under-determined transformation is solved by using a stereo vision approach. Owing to the hardware setup of this project (only one camera is used in order to keep cost and packaging space down) and to the general problems of stereo vision systems in night-time scenarios, a restriction of the world has to be made: we assume a flat (2D) world. Hence the horizon can be assumed to be a constant horizontal line in the image. With the world coordinate system given in Figure 9, all objects have to lie in a plane parallel to the $(X, Z)$-plane and therefore have a constant height $d$. This plane can be described using e.g. the Hessian normal form

$$\langle (X, Y, Z)^T, (0, 1, 0)^T \rangle - d = 0.$$

With the known parameters from the camera calibration (i.e. rotation matrix $R$, translation vector $T$, camera matrix $A$) we can describe the transformation from a given image point $(u, v)$:

$$(X, Y, Z)^T = \lambda R^{-1} A^{-1} (u, v, 1)^T - R^{-1} T$$

with:

$$\lambda = \frac{d + \langle R^{-1} T, (0, 1, 0)^T \rangle}{\langle R^{-1} A^{-1} (u, v, 1)^T, (0, 1, 0)^T \rangle}.$$

In Figure 10 the results of the transformation are shown (distance $d = \sqrt{X^2 + Z^2}$). We can easily see that the error caused by small pixel errors grows quadratically with increasing object distance (see also [8]). After the positions are estimated, the next step is to compute the velocity relative to the own car. With the available CAN data (velocity and steering angle) it is easy to compute the velocity vector from given positions $p_t = (X_t, Z_t)$ at time $t$, e.g. using the formula presented in [9]:

$$v_t = \frac{\sum_{k=1}^{n} p_{t-k} - \sum_{k=m+1}^{m+n} p_{t-k}}{m \, n}.$$

5 Clustering and Kalman tracking

In this section an algorithm to track possible vehicle-like objects in a sequence of images is presented. Prior to the tracking it is required to extract possible vehicles like cars or bikes from the blob list and to generate a list of all objects in

an image. For this we have to recognise that vehicles can appear in the image as objects consisting of one blob (e.g. bikes, far-away cars) or of two blobs. Another important fact is that, with the flat world assumption of section 4 and the resulting constant skyline, vehicles can only appear below the constant image row of the horizon. One-blob objects can be detected using information about their size and location in the image. Two-blob objects have to be clustered using the technique described in section 3.3. After creating a complete list of objects in the image, a tracking algorithm can be applied, which re-identifies all vehicles over a sequence of images and computes their relative physical values. For this purpose, objects of the last image have to be associated with the current image data, taking into account the information gained during the past tracking process. One of the difficulties of this process is that an object does not have to consist of the same number of blobs over a set of frames. After this data association a linear, discretized Kalman filter is used to increase the robustness of the results and to improve the spatial resolution of the distance estimation. The state vector $x$ and the measurement vector $y$ are defined as follows, considering the flat world assumption:

$$x = (Z, X, v_Z, v_X)^T, \qquad y = (Z, X)^T,$$

with the velocities $v_X$ and $v_Z$ in X- and Z-direction. Assuming the model of approximately constant velocity (between two frames), the system matrix is given by:

$$A = \begin{pmatrix} 1 & 0 & T & 0 \\ 0 & 1 & 0 & T \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix},$$

with the camera frequency $1/T$. The error which occurs using this model is handled by the error matrix $Q$, which can be computed using the model of piecewise constant acceleration, resulting, in simplified form, in:

$$Q = \begin{pmatrix} \frac{T^4}{4} a_Z^2 & 0 & \frac{T^3}{2} a_Z^2 & 0 \\ 0 & \frac{T^4}{4} a_X^2 & 0 & \frac{T^3}{2} a_X^2 \\ \frac{T^3}{2} a_Z^2 & 0 & T^2 a_Z^2 & 0 \\ 0 & \frac{T^3}{2} a_X^2 & 0 & T^2 a_X^2 \end{pmatrix}.$$

Here, $a_Z$ denotes the maximal acceleration in Z-direction, and $a_X$ correspondingly in X-direction. In Figure 11 the results of the Kalman filter are shown. The estimated and Kalman-filtered distances are plotted over some frames from a scenario of one departing vehicle to point out the advantages of this tracking algorithm.

Figure 11 - Results of the Kalman filter (axes: frame number, distance [meter]; curves: estimated distance, Kalman-filtered distance). [3]

6 Object classification

The final stage of this project is to apply a classification process to all detected and tracked objects. We sort the detected objects into the following classes: Bike, Car, Truck and Unknown. Firstly, a feature vector $\vec{m}$ is defined for every object $O$, which has to be computed before starting the classification:

$$\vec{m} = (n_{blobs}, lt, d, sq, v, a)$$

with:
n_blobs : Number of blobs the object consists of [-]
lt : Lifetime of the object (number of tracked images) [-]
d : Distance of the object [meters]
sq : Squareness of the object (width [pixel] / height [pixel])

v : Velocity of the object [meters/sec]
a : Area of the object [pixel²]

After interpreting a large amount of training data, it is possible to formulate a probability function $p(c \mid m)$ for every class $c$ and for every element $m$ of the feature vector $\vec{m}$. The resulting probabilities are weighted and summed up to obtain the overall probability $p(c \mid \vec{m})$ for every class, which allows a simple maximum a posteriori classifier to be used. Because of the difficulty in distinguishing between the lights of a truck and those of a passenger car, the process shown in Figure 12, which detects clearance lights for all possible trucks as a distinguishing feature, has delivered good classification results. As a result we obtain the corresponding class for each object, including its probability (class Unknown indicates an uncertain classification or any kind of noise).

Figure 12 - Classification process; the constant threshold $c$ is set experimentally. [3] (Flowchart: compute the probabilities $P_{car/truck}$ and $P_{bike}$. If $P_{car/truck} > P_{bike}$ and $P_{car/truck} > c$, distinguish between the classes car and truck by detecting clearance lights. If $P_{car/truck} > P_{bike}$ does not hold and $P_{bike} > c$, the class is bike. In all other cases the class is unknown.)

7 Summary and Outlook

With this paper we have presented a novel software framework to detect, track and classify light sources within a traffic context. By evaluating the computation time of the different software modules, it can be seen that the overall algorithm

is capable of running under real-time conditions owing to its highly efficient implementation [3]. In the future, light-based driving assistance systems will be strongly supported by advanced sensor systems, such as video, radar, laser or lidar units, which estimate physical values of the driver's environment to guarantee optimum illumination while driving at night-time. The approach presented in this paper shows first promising results; nevertheless, further enhancements in detection range and reliability have to be made in the future.

8 References

[1] Statistisches Bundesamt Deutschland, Entwicklung der Zahl der im Straßenverkehr Getöteten 1953 bis 2008, Section Verkehr/Verkehrsunfälle, Germany: Wiesbaden, 2009.
[2] Marko H. Hoerter et al., A Hardware and Software Framework for Automotive Intelligent Lighting, Proceedings of IEEE Intelligent Vehicles Symposium 2009, China: Xi'an, 2009.
[3] Jonas Firl, Lichtquellendetektion, -klassifikation und Positionsbestimmung in nächtlichen monokularen Bildsequenzen, University of Karlsruhe (TH), Department of Measurement and Control, Germany: Karlsruhe, 2009.
[4] Matthias Goebl et al., A Real-Time-capable Hard- and Software Architecture for Joint Image and Knowledge Processing in Cognitive Automobiles, In Proc. IEEE Intelligent Vehicles Symposium, pages 734-740, Turkey: Istanbul, 2007.
[5] M.Y. Chern et al., The lane recognition and vehicle detection at night for a camera-assisted car on highway, In Proc. of IEEE International Conference on Robotics and Automation 2003, volume 2, 2003.
[6] F. Chang et al., A linear-time component-labeling algorithm using contour tracing technique, Computer Vision and Image Understanding, 93(2):206-220, 2004.
[7] Yen-Lin Chen et al., Nighttime vehicle detection for driver assistance and autonomous vehicles, Proceedings of the 18th International Conference on Pattern Recognition, pages 687-690, USA: Washington, DC, 2006.

[8] Gideon P. Stein et al., Vision-based ACC with a Single Camera: Bounds on Range and Range Rate Accuracy, Proceedings of the IEEE Intelligent Vehicles Symposium 2003, June 2003.
[9] Kaiqi Huang et al., A real-time object detecting and tracking system for outdoor night surveillance, Pattern Recognition, 41(1):432-444, 2008.
[10] Yen-Lin Chen et al., Night-time Vehicle Detection for Driver Assistance and Autonomous Vehicles, In ICPR 06: Proceedings of the 18th International Conference on Pattern Recognition, pages 687-690, Washington, DC, USA, 2006. IEEE Computer Society.