Video Surveillance of High Security Facilities

S. Kang*, A. Koschan, B. Abidi and M. Abidi
Imaging, Robotics, and Intelligent Systems Laboratory, The University of Tennessee, Knoxville, TN, USA
*sangkyu@utk.edu

Abstract

An integrated system for automatic detection, handover and tracking of an intruder using one fixed camera and one Pan/Tilt/Zoom (PTZ) camera is presented. This system includes automatic detection using an overhead static camera of an intruder who walks into a crowded secure area, intruder handover from the static to the nearest PTZ camera, and tracking using the PTZ camera. The intruder is detected by applying optical flow algorithms to video from the overhead camera. A PTZ camera extracts the intruder detected by the overhead camera and builds a color histogram model of the target. A novel region segmentation based on the color histogram and the algorithm for tracking are presented with real experimental results.

I. INTRODUCTION

The desire for safety in both public locations where many people gather and high security facilities, such as nuclear plants and research laboratories, has been increasing since September 11, 2001. For security purposes, such locations utilize biometric-based recognition tools for access control as well as video camera-based surveillance systems for monitoring activities. Fingerprinting, voice, and retina identifications are examples of biometric-based access controls, and these methods usually show good performance for identification. These methods, however, require a person to perform certain actions, such as touching a sensor for fingerprinting, speaking for voice recognition, or getting close to a camera for retina identification; therefore these methods are not adequate for monitoring unusual activities, such as security breaches and other types of violations.

Video-based surveillance can be used to monitor unusual activities. These systems usually consist of a number of cameras, monitors, and recorders which function with the involvement of human operators. Human operators usually monitor activities, and the recorded video is examined after intrusions or incidents have occurred if the human operator did not notice incidents at the time of occurrence.
In this context the system does not provide a real-time warning, which is very important to prevent accidents. For detecting an intruder or tracking suspects in a crowded area using current systems, the video surveillance system must be monitored by human operators, which is not a simple endeavor even for skillful operators, given the tedious task of watching video for several hours at a time. For safety reasons, it is also necessary to detect irregular motion patterns of workers or moving robots in hazardous environments. In addition, when monitoring a large hazardous area, many operators are needed at the same time.

Different tracking algorithms can be used for automatic target acquisition and tracking. These include adaptive background generation and subtraction [1][2], tracking using shape information [3][4][5], color-based tracking [6][7], and multiple feature-based tracking using a probabilistic data association method [8].

Adaptive background generation is used to build static backgrounds from video containing moving objects. The generated background image is then used to extract moving objects. Color-based background generation can also be used to remove shadows [2]. An evaluation of different shadow extraction methods can be found in [9]. Adaptive background subtraction, however, requires video from stationary cameras to build static pixels. To cover a large area, either multiple fixed cameras or a PTZ camera is required. L. Lee et al. [10] proposed a method to align the ground plane across multiple views to build a common coordinate frame for multiple cameras. F. Dellaert et al. [11] proposed fast image matching to reuse a background in the database captured before tracking. An omni-directional camera was also used to extend the field of view to 360° [12], but detected objects, such as a moving person or intruder, are seen at very low resolution since a single camera is used to capture a very large area.

Tracking using shape information is very helpful when used to track known objects or objects in a database. Contour-based methods [3][4] and active shape models [5] have been proposed.
These methods usually require initial manual placement of an original shape, which needs to be placed as near as possible to the target, to yield accurate tracking results. These methods are usually not adequate for automatic tracking. Color-based tracking [6][7] or object recognition [13] also show good tracking results, but these methods also require initialization steps on targets to build color distribution models or templates. The latest research for automatic target acquisition using multiple cameras with color information can be found in [14], but this concept will not be economical for large areas since it requires a large number of cameras to extract 3D information.

In this paper, we present an integrated video surveillance system using one fixed camera and one Pan/Tilt/Zoom (PTZ) camera. This system includes automatic detection (using an overhead static camera) of an intruder who walks into a restricted area, intruder handover from the static to the nearest PTZ camera, and tracking the intruder inside the secure area using the PTZ camera. Section 2 presents target detection using the one fixed camera and motion-based segmentation. Section 3 describes target tracking including camera handover, target acquisition, and color-based tracking for a PTZ camera. Experimental results are presented in Section 4. We tested our system in a local airport where test subjects were easily accessible, but the same concept can be easily applied to other high security facilities, such as nuclear plants, military facilities, and research laboratories. Conclusions are presented in Section 5.

II. TARGET DETECTION

Entering restricted areas, for any reason, usually requires permission and security procedures, such as walking through a metal detector. One serious security violation is walking in a restricted direction, such as entering through an exit lane in an airport to gain access to the passenger concourse and bypassing the checkpoint. To detect an intruder who walks in the wrong direction through a crowded exit lane, an overhead camera was used to avoid occlusions. The video from the overhead camera is used to detect intrusions by computing motion direction for every pixel using an optical-flow based approach [15][16].
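As a rough illustration of a per-pixel direction test, the sketch below takes an already computed flow field and flags pixels that move in a restricted direction. The function names `dominant_direction` and `wrong_way_mask`, the four direction labels, and the speed threshold `min_speed` are assumptions made for this example only; they are not values or interfaces taken from the system described here.

```python
def dominant_direction(vx, vy, min_speed=0.5):
    """Label a flow vector as 'left', 'right', 'up' or 'down'.

    Returns None when the vector is shorter than min_speed (hypothetical
    threshold). Image convention: x grows to the right, y grows downward.
    """
    if (vx * vx + vy * vy) ** 0.5 < min_speed:
        return None
    if abs(vx) >= abs(vy):
        return "right" if vx > 0 else "left"
    return "down" if vy > 0 else "up"


def wrong_way_mask(flow, restricted="up", min_speed=0.5):
    """Binary mask marking pixels whose motion points in `restricted`.

    `flow` is a 2D list of (vx, vy) tuples, one per pixel.
    """
    return [
        [1 if dominant_direction(vx, vy, min_speed) == restricted else 0
         for (vx, vy) in row]
        for row in flow
    ]


# Tiny hand-made flow field: two pixels move upward, the rest do not.
flow = [
    [(0.0, 0.0), (0.1, -2.0), (1.5, 0.2)],
    [(0.0, 1.8), (-0.3, -1.1), (0.0, 0.1)],
]
mask = wrong_way_mask(flow, restricted="up")
print(mask)
```

The resulting binary mask would then feed the segmentation and labeling stages; which image direction counts as restricted depends on how the camera is mounted over the lane.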
When the system detects intruders, it sounds an alarm to alert a nearby human operator and sends this information to another system connected to a PTZ camera for target handover and tracking, which will be discussed in Section 3.

The optical flow method is a motion computation method based on the assumption of a constant intensity of objects along a motion trajectory during a very short time. If $s(x, y, t)$ is the continuous space-time intensity distribution, the following equation can be used:

$$\frac{ds(x, y, t)}{dt} = 0. \qquad (1)$$

Equation (1) can be formulated with the chain rule of differentiation as

$$\frac{\partial s}{\partial x} v_1 + \frac{\partial s}{\partial y} v_2 + \frac{\partial s}{\partial t} = 0. \qquad (2)$$

The optical flow equation can then be expressed as

$$\varepsilon_{of}(\mathbf{v}(\mathbf{x}, t)) = \langle \nabla s(\mathbf{x}, t), \mathbf{v}(\mathbf{x}, t) \rangle + \frac{\partial s(\mathbf{x}, t)}{\partial t}, \qquad (3)$$

with $\mathbf{x} = (x, y)$ and $\mathbf{v} = (v_1, v_2)$. The displacement at $\mathbf{x}$ is the quantity found by minimizing Equation (3). This equation, however, uses the derivatives of neighboring pixels, and is therefore sensitive to noise. Two different approaches widely used to reduce noise are Horn and Schunck's method [15] and Lucas and Kanade's approach [16]. Lucas and Kanade proposed a block motion model, where the assumption is that the motion vector remains unchanged over a particular block of pixels. The error can then be defined as

$$E = \sum_{\mathbf{x} \in \text{Neighbor}} \left( \varepsilon_{of} \right)^2, \qquad (4)$$

and the solution can be computed by minimizing Equation (4). Once the motion vectors are computed, the regions are segmented and labeled to depict the wrong-way motion. The overall system for breach detection is shown in Figure 1.

Figure 1: The overall system for breach detection.

Since Lucas and Kanade's method does not require any iterations to compute the optical flow, this method is more adequate for real-time operations. We use this algorithm to compute motion vectors for each two consecutive images, and perform thresholding in four different directions. A morphological operation is used to remove holes from small regions, which are less than k pixels. Finally labeling is performed. After these steps, the image is denoised. We found that homogeneous regions with a low spatial image gradient can be detected as false positives, due to noise. To determine whether a detected region was caused by noise, the following criterion was used:

$$C_i = \frac{\sum_{i\text{-th segment}} \left| I_t - I_{t-1} \right|}{\sum_{i\text{-th segment}} 1} < t_n. \qquad (5)$$

Equation (5) represents the normalized sum of the absolute differences between two consecutive frames, $I_t$ and $I_{t-1}$. Homogeneous regions with no motion usually have small values for $C_i$. Typically $t_n = 0.6$ is used. Histogram intersection and tracking for each segment is then performed to find corresponding blobs between frames. This will be discussed in Section 3. Information about intrusions will be sent to a client connected to a PTZ camera.

III. TARGET TRACKING USING A PTZ CAMERA

The overhead camera has a fixed view and the system will lose the breach event after the person who caused the breach walks outside the area of current view. PTZ cameras will then acquire and track the target using a color-based tracking algorithm after camera handover.

III.A. Target handover from static to PTZ camera

When a breach occurrence is detected, the fixed camera in charge of monitoring the direction of motion triggers an alarm and provides the position of the target in the world coordinate system to the PTZ camera. The PTZ camera then uses that position information to determine its pan and tilt angles and lock on the target for subsequent tracking. The geometry of our system is shown in Figure 2, and the angles $\theta$ and $\delta$ can be computed by:

$$\theta = \cos^{-1}\left( \frac{x}{\sqrt{x^2 + y^2}} \right), \qquad \delta = \cos^{-1}\left( \frac{\sqrt{x^2 + y^2}}{\sqrt{x^2 + y^2 + (h_c - h_t)^2}} \right), \qquad (6)$$

where $x$ and $y$ are given by

$$x = x_i - x_d = (W - i)\,a - x_d, \qquad y = y_i + y_d = (H - j)\,b + y_d, \qquad (7)$$

where $W$ is the width, $H$ the height of images, $(i, j)$ the location of the target in the image coordinate system, $a$ and $b$ are the lengths per pixel for horizontal and vertical directions respectively, and the rest of the parameters are shown in Figure 2. These values can be computed by measuring two spots in the scene and finding the corresponding points in the image.

Figure 2: The geometry of the dual-camera system. The top view and the side view of camera configurations are shown with parameters.

III.B. Target acquisition during handover

Our target detection system sends information about a target which violates the correct direction of flow to the client; the client manipulates the PTZ camera using Equation (6). The client then extracts the top-down motion to lock on the target approaching the PTZ camera. Two assumptions are made: 1) the PTZ camera is located inside of the secure area and views the exit lane where the violation may occur, 2) the height of the PTZ camera is much higher than the potential target height. For instance, if the target in Figure 2 walks from left to right, the target appears to be moving from the top of the PTZ camera's view toward the bottom of the view. In a real situation, this method extracts not only the target to track, but also top-down motion caused by people walking out of the exit lane or shadows.

Since we need to build a color model of the target, it is important to determine the correct target region. If $B(i, j)$ is the result of thresholding on computed optical flow, then $B(i, j) = 1$ for moving regions and $B(i, j) = 0$ for static pixels. We also define a mask $M$ as an ellipse with semiminor axis, $x_r$, and semimajor axis, $y_r$. These values are determined from experiments. This mask is normalized, so the sum of pixel values inside the mask is 1. We define $R$ as the convolution of $B$ and $M$. We have

$$R = B * M. \qquad (8)$$

If a segment in $B$ has a shape similar to the mask $M$, then $R(i, j)$ will be close to 1. We can then determine whether segments in $B$ are closer to the shape of the mask by using:
$$P(i, j) = \begin{cases} 1 & R(i, j) > t_e \\ 0 & \text{otherwise.} \end{cases} \qquad (9)$$

An example illustrating this concept is shown in Figure 3. One input image obtained from optical flow computation is shown in Figure 3(a) and an extracted segment using top-down motion is shown in Figure 3(b) with noise present in the segment. $R$ is shown in Figure 3(c), where the scale is 0 to 255 instead of 0 to 1 for display purposes. We can find the best location of the ellipse by calculating $P$ and imposing that the center of the detected segment belongs to $P$, as shown in Figure 3(d).

Figure 3: Finding a target from moving pixels. (a) One input image, (b) the extracted segment using top-down motion, (c) the result of applying Equation (8), and (d) the detected ellipse.

An example of the malfunction of optical flow computation is shown in Figure 4. Two input images are shown in Figures 4(a) and (b), and the frame difference is shown in Figure 4(c). The sum of frame differences is smaller than the threshold, but the detected moving area, which is shown in Figure 4(d), includes areas from the background. It is possible to use a small threshold value, but moving objects with complex textures may be considered as global camera motions when frame differencing with a small threshold. We check the size of detected moving blobs and reject a blob when the size of the blob is larger than our criterion. We perform this process before detecting the ellipsoidal region.

If the passage where the overhead camera is installed is narrow, camera handover by Equation (6) is not necessary if the field of view of the PTZ camera covers the entire passage. If the passage is wide and long, panning and tilting by Equation (6) will be required more than one time. This means that the PTZ camera needs to change its angle several times, and optical flow computation may not work correctly due to large movements during camera handover. Two solutions exist to detect whether the camera changes angles: 1) querying the current angles if the PTZ camera supports this function, and 2) computing the frame difference between consecutive frames during panning and tilting.
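The second option, comparing consecutive frames, can be sketched in a few lines. This is a minimal illustration that assumes grayscale frames stored as nested lists; the function names and the threshold value of 20 are hypothetical and would have to be tuned for a real camera and image size.

```python
def frame_difference_sum(prev, curr):
    """Sum of absolute gray-level differences between two equally sized frames."""
    return sum(
        abs(c - p)
        for prev_row, curr_row in zip(prev, curr)
        for p, c in zip(prev_row, curr_row)
    )


def camera_is_static(prev, curr, threshold):
    """True when the summed frame difference does not exceed `threshold`."""
    return frame_difference_sum(prev, curr) <= threshold


frame_a = [[10, 10, 10], [10, 10, 10]]
frame_b = [[10, 12, 10], [10, 10, 9]]   # small local change
frame_c = [[90, 80, 70], [60, 50, 40]]  # large change across the whole frame

print(camera_is_static(frame_a, frame_b, threshold=20))  # small difference
print(camera_is_static(frame_a, frame_c, threshold=20))  # large difference
```

A check of this kind needs no support from the camera itself, which is why it applies to any PTZ model, at the cost of having to choose a suitable threshold.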
The first solution is good when the camera can provide current angles quickly, but it usually takes time with a regular PTZ camera. The second method can be used with any type of PTZ camera. When the camera changes angles, two consecutive frames will show global motion, and the sum of frame differences will have a high value. We compare this sum with a predefined threshold to determine if the camera is static. We, however, found that the optical flow method will not work well even though two images have a very small displacement.

Figure 4: An example of the malfunction of optical flow computation. (a) and (b) represent two input images, and (c) the frame difference between (a) and (b). (d) shows detected ellipses rejected by the size constraint.

III.C. Color-based tracking

When an ellipse is superimposed on the detected segment, the corresponding color input segment is divided into achromatic and chromatic pixels using the HSI color space. K-means segmentation is then applied to divide the chromatic pixels into multiple color pixels using the hue component. After color segmentation, the similarity between the divided segments and the surrounding pixels is computed by

$$S_i = \sum_{k} \min\left( \frac{H_i(k)}{\sum_{k} H_i(k)}, \frac{H_{bg}(k)}{\sum_{k} H_{bg}(k)} \right), \quad \text{for } k = 0, 1, \ldots, L-1 \text{ and } i = 0, 1, \ldots, N-1, \qquad (10)$$

where $H_i$ represents the i-th segment from the detected ellipse, $H_{bg}$ is the histogram of the surrounding pixels, $L$ the number of bins for each histogram model, and $N$ the number of divided segments. This equation is a normalized version of the histogram intersection described in [17].
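A normalized histogram intersection of this kind can be sketched as follows, assuming each histogram is scaled by its own sum before the bin-wise minimum is taken. The function name `normalized_intersection` and the toy bin counts are made up for illustration and are not data from the experiments.

```python
def normalized_intersection(h_seg, h_bg):
    """Histogram intersection after scaling each histogram to unit sum.

    Returns a value in [0, 1]; values near 1 mean the segment looks like
    the background, values near 0 mean it is distinctive.
    """
    if len(h_seg) != len(h_bg):
        raise ValueError("histograms must have the same number of bins")
    sum_seg = sum(h_seg)
    sum_bg = sum(h_bg)
    if sum_seg == 0 or sum_bg == 0:
        return 0.0
    return sum(min(a / sum_seg, b / sum_bg) for a, b in zip(h_seg, h_bg))


# Hypothetical 4-bin histograms: one background and three candidate segments.
background = [40, 30, 20, 10]
segments = [[4, 3, 2, 1], [0, 0, 0, 10], [10, 0, 0, 0]]

scores = [normalized_intersection(h, background) for h in segments]
least_similar = min(range(len(scores)), key=scores.__getitem__)
print(scores, least_similar)
```

Picking the segment with the lowest score mirrors the idea of choosing a target color that stands out from its surroundings.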
For achromatic pixels, saturation and intensity channels are used for computing the similarity using Equation (10), and hue and saturation channels are used for chromatic pixels. An example is shown in Figure 5. A detected ellipse is shown in Figure 5(a) and its segmentation result is shown in Figure 5(b). After applying Equation (10) to each segment in Figure 5(b) and the background rectangle defined in Figure 5(c), we can get a major color that is different from the background. In this case, two segments were detected: one is skin color, which is similar to the background color, and the other consists of achromatic pixels, which are black. The similarity computed by Equation (10) is 0.51 for the skin color and 0.07 for achromatic pixels. Finally, the achromatic value was chosen to represent the target's unique color for tracking.

Figure 5: An example of computing similarity between each segment and surrounding pixels. (a) Detected ellipse on an input frame, (b) classification result to divide up achromatic and chromatic pixels; white pixels represent achromatic pixels, (c) definition of surrounding pixels outside the ellipse region.

After we select the target's color, a 1D smoothing operation on each color histogram is performed to deal with small illumination changes. This histogram is then normalized by the maximum value of the histograms for each channel. For example, a histogram $H_i$ can be normalized by

$$H_{in}(k) = \frac{H_i(k)}{\max(H_i)}. \qquad (11)$$

After normalization, the similarity $P$ between the model histogram and input image $I$ can be formulated by

$$P(i, j) = H_{1n}(I(i, j)) \, H_{2n}(I(i, j)), \qquad (12)$$

where $H_{1n}$ and $H_{2n}$ are the normalized histograms for hue and saturation for chromatic segments, and saturation and intensity for achromatic segments. The mean shift algorithm [18] is used to find the center of a distribution in a search window. This method is iterative and can be formulated as

$$i(t+1) = \frac{\sum_{i \in S} \sum_{j \in S} i \, P(i, j)}{\sum_{i \in S} \sum_{j \in S} P(i, j)}, \qquad j(t+1) = \frac{\sum_{i \in S} \sum_{j \in S} j \, P(i, j)}{\sum_{i \in S} \sum_{j \in S} P(i, j)}, \qquad (13)$$

where $S$ is a search window placed at $(i(t), j(t))$ and $t$ is the number of iterations. At each iteration, the search window will move to the center; the center is usually the location of the target.
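The iteration can be illustrated with a small sketch that moves a fixed-size window to the weighted centroid of a similarity map until the position stops changing. The map is indexed here as (row, column), and the window half-sizes, the iteration cap, and the synthetic map are assumptions chosen only to keep the example short.

```python
def mean_shift_step(prob, center, half_h, half_w):
    """Move to the weighted centroid of `prob` inside the current window."""
    ci, cj = center
    rows, cols = len(prob), len(prob[0])
    total = sum_i = sum_j = 0.0
    for i in range(max(0, ci - half_h), min(rows, ci + half_h + 1)):
        for j in range(max(0, cj - half_w), min(cols, cj + half_w + 1)):
            p = prob[i][j]
            total += p
            sum_i += i * p
            sum_j += j * p
    if total == 0:
        return center  # no support inside the window: stay in place
    # Round the centroid to the nearest pixel (indices are non-negative).
    return (int(sum_i / total + 0.5), int(sum_j / total + 0.5))


def mean_shift(prob, start, half_h=2, half_w=2, max_iter=20):
    """Repeat the centroid update until the window center no longer moves."""
    center = start
    for _ in range(max_iter):
        new_center = mean_shift_step(prob, center, half_h, half_w)
        if new_center == center:
            break
        center = new_center
    return center


# Synthetic 8x8 similarity map with a 3x3 blob centered at row 4, column 5.
prob = [[0.0] * 8 for _ in range(8)]
for i in range(3, 6):
    for j in range(4, 7):
        prob[i][j] = 1.0

center = mean_shift(prob, start=(2, 3))
print(center)
```

The search only converges on the target if the starting window overlaps part of it, which is one reason the initial placement from the handover step matters.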
The PTZ camera is then controlled to keep the search window in the center of the frame. We also change the size of the search window by ±2 pixels at each iteration by calculating the mean of $P$ in the search window, $M_s$, and comparing it to the mean of the 4 boundaries to deal with size variations of the target. If the mean of a boundary is smaller than $0.2 M_s$, we shrink the search box. If the mean is larger than $0.8 M_s$, then we enlarge the search window in the direction of the boundary. We also change the zooming ratio during tracking based on tilt angles and a predefined zooming ratio at distance k. If the object approaches the camera, the tilt angle will be increased, and we can approximate the distance between the target and the PTZ camera under the assumption that the tracker successfully detects the target. The overall system for target tracking is shown in Figure 6.

Figure 6: The overall system for target tracking using a PTZ camera.
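One possible reading of the window resizing rule is sketched below: each of the four boundaries is compared with the window mean and moved by 2 pixels independently. Treating the boundaries independently, the inclusive window convention, and the names `resize_window`, `low`, and `high` are assumptions made for this example.

```python
def mean(values):
    values = list(values)
    return sum(values) / len(values) if values else 0.0


def resize_window(prob, top, bottom, left, right, step=2, low=0.2, high=0.8):
    """Grow or shrink an inclusive window [top..bottom] x [left..right].

    Each boundary whose mean exceeds high * M_s moves outward by `step`;
    each boundary whose mean is below low * M_s moves inward by `step`.
    """
    rows, cols = len(prob), len(prob[0])
    m_s = mean(prob[i][j] for i in range(top, bottom + 1)
               for j in range(left, right + 1))

    def delta(edge_mean):
        if edge_mean > high * m_s:
            return step    # target reaches this edge: move it outward
        if edge_mean < low * m_s:
            return -step   # edge is mostly background: move it inward
        return 0

    d_top = delta(mean(prob[top][j] for j in range(left, right + 1)))
    d_bottom = delta(mean(prob[bottom][j] for j in range(left, right + 1)))
    d_left = delta(mean(prob[i][left] for i in range(top, bottom + 1)))
    d_right = delta(mean(prob[i][right] for i in range(top, bottom + 1)))

    new_top = max(0, top - d_top)
    new_bottom = min(rows - 1, bottom + d_bottom)
    new_left = max(0, left - d_left)
    new_right = min(cols - 1, right + d_right)

    # Keep the old extent along an axis if the update would collapse it.
    if new_top > new_bottom:
        new_top, new_bottom = top, bottom
    if new_left > new_right:
        new_left, new_right = left, right
    return new_top, new_bottom, new_left, new_right


# Synthetic 12x12 similarity map with a blob covering rows 2..9, columns 2..9.
prob = [[1.0 if 2 <= i <= 9 and 2 <= j <= 9 else 0.0 for j in range(12)]
        for i in range(12)]

grown = resize_window(prob, 4, 7, 4, 7)     # window fully inside the blob
shrunk = resize_window(prob, 0, 11, 0, 11)  # window much larger than the blob
print(grown, shrunk)
```

Because each call moves a boundary by at most one step, the window follows gradual changes in target size over successive frames rather than jumping to a new size.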
IV. EXPERIMENT RESULTS

The first experiment performed was to test if the system can detect breaches in a crowded area. We installed our system at the exit lane of a local airport using a Sony SSC-DC393 camera. The height of this overhead camera was about 15 ft. A Matrox Meteor2 frame grabber was used with a Pentium4 1.4 GHz PC. Results obtained are shown in Figure 7. The system successfully detected in real-time a person who walked in the wrong direction through the exit lane.

Figure 7: Detecting intrusions. Bounding boxes are used to indicate an intruder, and the frame numbers in the live sequence of the experiment are shown with each figure. Panels (a) to (f) show frames 3046, 3059, 3071, 3085, 3098, and 3111.

The second experiment illustrated tracking using a single Pelco Spectra 3 SE PTZ camera. A Matrox Meteor2 frame grabber was used with a Pentium4 2.4 GHz PC, and the two computers were connected via an 802.11b wireless network. We assumed that the initial view of the PTZ camera pointed to the entrance an intruder could walk through. The height of the PTZ camera was approximately 13 ft. from the ground and the camera was placed 74 ft. from the entrance. When a breach occurs, the PTZ camera changes its angles to obtain a front view of the target. The detected object moving toward the PTZ camera and ellipses are shown in Figure 8. After the system successfully detects a major color in the detected ellipsoidal region, tracking starts. The PTZ camera changes its angles to place the target inside its view by responding to commands, such as pan left/right and tilt up/down. Tracking results are shown in Figure 9. The system shows reasonably good performance on tracking the target. Images of 320 × 240 pixel size were captured for each experiment, and resized to 160 × 120 pixels for the optical flow computation.

Figure 8: Examples of detecting moving objects and ellipses for color modeling using the PTZ camera.

Figure 9: Tracking results using the PTZ camera. Bounding boxes are used to indicate a target, and the frame numbers in the live sequence are shown with the figures.
V. CONCLUSIONS

We have presented a system utilizing automatic detection, handover, and tracking of intruders using one static camera and a PTZ camera for high security facilities. Experimental results show that the performance for intrusion detection using an overhead camera is very good, and automatic target acquisition and tracking using a PTZ camera under controlled conditions gives acceptable performance. We have installed this tracking system at a local airport for testing, but this system can be installed at any location with similar applications. Future work will address: 1) combining color with shape information, 2) tracking under partial occlusion, 3) robust tracking under varying illumination, and 4) extending the algorithm for multiple PTZ camera-based tracking. Shape information will remove ambiguity when a target's color is not unique in a scene or the color changes under different illumination. This will also help the tracking system to determine whether the target is occluded. Camera handover between PTZ cameras is another topic to be addressed to implement the tracking system in large areas.

ACKNOWLEDGMENTS

This work was supported by the DOE University Research Program in Robotics under grant DOE-DE-FG02-86NE37968, by the DOD/TACOM/NAC/ARC Program, R01-1344-18, and by FAA/NSSA Program, R01-1344-48/49.

REFERENCES

1. I. HARITAOGLU, D. HARWOOD, and L. S. DAVIS, "W4: Real-Time Surveillance of People and Their Activities," IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. 22, No. 8, pp. 809-830, August 2000.
2. T. HORPRASERT, D. HARWOOD, and L. S. DAVIS, "A Robust Background Subtraction and Shadow Detection," Proc. ACCV'2000, Taipei, Taiwan, January 2000.
3. A. BLAKE and M. ISARD, Active Contours, Springer, London, England, 1998.
4. H. JIANG and M. S. DREW, "A predictive contour inertia snake model for general video tracking," Proc. IEEE ICIP, Vol. 3, pp. 413-416, September 2002.
5. A. KOSCHAN, S. K. KANG, J. K. PAIK, B. R. ABIDI, and M. A. ABIDI, "Video object tracking based on extended active shape models with color information," Proc. 1st European Conf. on Color in Graphics, Imaging, Vision, pp. 126-131, University of Poitiers, France, April 2002.
6. S. MCKENNA, Y. RAJA, and S. GONG, "Tracking colour objects using adaptive mixture models," Image and Vision Computing, Vol. 17, pp. 225-231, 1999.
7. D. COMANICIU, V. RAMESH, and P. MEER, "Kernel-based object tracking," IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. 25, No. 5, pp. 564-577, May 2003.
8. C. RASMUSSEN and G. D. HAGER, "Probabilistic data association methods for tracking complex visual objects," IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. 23, No. 6, pp. 560-576, June 2001.
9. A. PRATI, I. MIKIC, M. M. TRIVEDI, and R. CUCCHIARA, "Detecting moving shadows: algorithms and evaluation," IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. 25, No. 7, pp. 918-923, July 2003.
10. L. LEE, R. ROMANO, and G. STEIN, "Monitoring activities from multiple video streams: establishing a common coordinate frame," IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. 22, No. 8, pp. 758-767, August 2000.
11. F. DELLAERT and R. COLLINS, "Fast Image-Based Tracking by Selective Pixel Integration," ICCV 99 Workshop on Frame-Rate Vision, September 1999.
12. M. NICOLESCU, G. MEDIONI, and M. LEE, "Segmentation, tracking and interpretation using panoramic video," IEEE Workshop on Omnidirectional Vision, pp. 169-174, 2000.
13. P. CHANG and J. KRUMM, "Object Recognition with Color Cooccurrence Histograms," IEEE Conf. on Computer Vision and Pattern Recognition, June 1999.
14. A. MITTAL and L. S. DAVIS, "M2Tracker: A multi-view approach to segmenting and tracking people in a cluttered scene," International Journal of Computer Vision, Vol. 51, No. 3, pp. 189-203, 2003.
15. B. K. P. HORN and B. G. SCHUNCK, "Determining optical flow," Artificial Intelligence, Vol. 17, pp. 185-203, August 1981.
16. B. D. LUCAS and T. KANADE, "An iterative image registration technique with an application to stereo vision," Proc. DARPA Image Understanding Workshop, pp. 121-130, 1981.
17. M. J. SWAIN and D. H. BALLARD, "Color indexing," International Journal of Computer Vision, Vol. 7, No. 1, pp. 11-32, November 1991.
18. Y. CHENG, "Mean shift, mode seeking, and clustering," IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. 17, No. 8, August 1995.