Human behavior analysis from videos using optical flow
Yassine Benabbas
Thesis advisor: Chabane Djeraba
Multitel Workshop 2011
Laboratoire d'Informatique Fondamentale de Lille (LIFL, UMR 8022)
Université des Sciences et Technologies de Lille
Bât. M3, 59655 Villeneuve d'Ascq cedex
Tel.: (33) 3 28 77 85 41, Fax: (33) 3 28 77 85 39, e-mail: @lifl.fr
Outline
- Introduction
- State of the art
- General approach
- Human action recognition
- Crowd event detection
- Motion pattern extraction
- Conclusion
Introduction
- Automatic behavior analysis is a very active field in both research and industry.
- It consists of extracting information from videos using computer vision algorithms.
- The extracted information is used, among other things, to:
  - assist surveillance operators
  - provide statistics for marketing agents
  - perform video retrieval
  - allow more natural and immersive human-machine interaction
State of the art
- Many approaches have been proposed for behavior analysis:
  - Human activity recognition [Le et al., CVPR 2011]
  - Crowd event detection [Adam et al., TPAMI 2008]
  - Motion pattern extraction [Rodriguez et al., ICCV 2009]
- However, they focus on a single aspect of behavior analysis or are very complex.
  - Example: dynamic textures [Ma and Cisar, CVPR 2009]
- Privacy issues are not addressed.
- Intelligent cameras with embedded software require fast and reusable algorithms.
Our approach
- We propose a generic approach for behavior analysis.
- It is based on three levels of features.
  - Easier to understand
- Each level can be designed separately.
  - More control
- Each level can be reused for other purposes.
  - Saves processing power
- The lowest level relies on motion information.
  - Preserves privacy out of the box
General approach
- [Diagram: Video stream -> Low-level features -> Mid-level descriptors -> High-level information -> Applications (human action recognition, crowd event detection, motion pattern extraction)]
General approach: low-level features
Interest point detection
- Identify good points that can be tracked efficiently and easily.
- We use the "good features to track" algorithm.
  - Fast and efficient OpenCV implementation
- Reference: J. Shi and C. Tomasi, "Good features to track," Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '94), pp. 593-600, 21-23 June 1994. doi: 10.1109/CVPR.1994.323794
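
The "good features to track" criterion scores a point by the smallest eigenvalue of the 2x2 gradient matrix summed over a small window. A minimal pure-Python sketch of that score follows; the synthetic image, the window size and the function name `min_eigenvalue` are illustrative assumptions, not the OpenCV implementation used in the talk.

```python
import math

def min_eigenvalue(image, cx, cy, half=2):
    """Smallest eigenvalue of the 2x2 gradient matrix in a window around (cx, cy)."""
    sxx = sxy = syy = 0.0
    for y in range(cy - half, cy + half + 1):
        for x in range(cx - half, cx + half + 1):
            # Central differences approximate the image gradient.
            ix = (image[y][x + 1] - image[y][x - 1]) / 2.0
            iy = (image[y + 1][x] - image[y - 1][x]) / 2.0
            sxx += ix * ix
            sxy += ix * iy
            syy += iy * iy
    # Closed-form smallest eigenvalue of [[sxx, sxy], [sxy, syy]].
    trace_half = (sxx + syy) / 2.0
    root = math.sqrt(((sxx - syy) / 2.0) ** 2 + sxy ** 2)
    return trace_half - root

# Synthetic 16x16 image (assumption): a bright square in the lower-right quadrant.
SIZE = 16
image = [[1.0 if x >= 8 and y >= 8 else 0.0 for x in range(SIZE)]
         for y in range(SIZE)]

corner_score = min_eigenvalue(image, 8, 8)   # window centered on the square's corner
edge_score = min_eigenvalue(image, 12, 8)    # window centered on a straight edge
flat_score = min_eigenvalue(image, 3, 3)     # window inside a uniform area
```

Only the corner gets a clearly positive score, because gradients there point in two directions; edges and flat areas score zero and would be rejected as hard to track.
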
Optical flow computation
- Estimate the motion of the interest points.
- We use Bouguet's implementation (pyramidal Lucas-Kanade tracker).
- [Figure: frame t with its interest points + frame t+1 = optical flow vectors]
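
Bouguet's tracker is a pyramidal, iterative version of the Lucas-Kanade method. The sketch below shows only a single Lucas-Kanade step at one scale on a synthetic image pair; the pyramid and the iterative refinement are omitted, and the blob images, window size and function names are assumptions made for illustration.

```python
import math

def make_blob(size, cx, cy, sigma=3.0):
    """Synthetic frame: a Gaussian blob centered at (cx, cy)."""
    return [[math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2))
             for x in range(size)] for y in range(size)]

def lucas_kanade(frame0, frame1, px, py, half=7):
    """Estimate the displacement (u, v) of the window centered at (px, py)."""
    sxx = sxy = syy = bx = by = 0.0
    for y in range(py - half, py + half + 1):
        for x in range(px - half, px + half + 1):
            ix = (frame0[y][x + 1] - frame0[y][x - 1]) / 2.0  # spatial gradients
            iy = (frame0[y + 1][x] - frame0[y - 1][x]) / 2.0
            it = frame1[y][x] - frame0[y][x]                   # temporal difference
            sxx += ix * ix
            sxy += ix * iy
            syy += iy * iy
            bx -= ix * it
            by -= iy * it
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        return None  # untextured window: the motion cannot be estimated
    # Solve the 2x2 system [[sxx, sxy], [sxy, syy]] (u, v) = (bx, by).
    u = (syy * bx - sxy * by) / det
    v = (sxx * by - sxy * bx) / det
    return u, v

frame0 = make_blob(21, 10.0, 10.0)
frame1 = make_blob(21, 10.5, 10.3)   # same blob moved by (0.5, 0.3) pixels
u, v = lucas_kanade(frame0, frame1, 10, 10)
```

The recovered `(u, v)` is close to the true sub-pixel shift. For larger motions the linear approximation breaks down, which is what the pyramid in Bouguet's version addresses.
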
General approach: mid-level features (direction model and magnitude model)
Vector allocation to blocks
- Each vector is allocated to a block according to the position of its origin.
- Vectors with a very small or a very large magnitude are eliminated.
- [Figure: optical flow vectors allocated to a grid of 8x4 blocks]
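
The allocation step can be sketched as follows. The magnitude thresholds `min_mag` and `max_mag`, the frame size and the reading of "8x4" as 8 columns by 4 rows are assumptions; the slides do not give these values.

```python
import math

def allocate_to_blocks(vectors, frame_w, frame_h, cols=8, rows=4,
                       min_mag=0.5, max_mag=20.0):
    """Group flow vectors (x, y, dx, dy) by the block containing their origin.

    Returns {(row, col): [(angle, magnitude), ...]}.
    """
    block_w = frame_w / cols
    block_h = frame_h / rows
    blocks = {}
    for x, y, dx, dy in vectors:
        mag = math.hypot(dx, dy)
        if mag < min_mag or mag > max_mag:
            continue  # discard noise (too small) and tracking errors (too large)
        col = min(int(x / block_w), cols - 1)
        row = min(int(y / block_h), rows - 1)
        blocks.setdefault((row, col), []).append((math.atan2(dy, dx), mag))
    return blocks

vectors = [
    (10, 10, 3.0, 0.0),      # kept, top-left block
    (310, 230, 0.0, 2.0),    # kept, bottom-right block
    (100, 50, 0.1, 0.0),     # rejected: magnitude too small
    (200, 120, 50.0, 0.0),   # rejected: magnitude too large
]
blocks = allocate_to_blocks(vectors, 320, 240)
```

Each block then holds the orientations and magnitudes that feed the direction and magnitude models.
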
Direction model
- The orientations of the optical flow vectors are clustered within each block.
- Since orientations are circular data, they are clustered using von Mises distributions.
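
As an illustration of the building block of such a model, the sketch below fits a single von Mises distribution to a list of angles and evaluates its density. It is not the full mixture: the clustering loop that assigns angles to components is omitted. The concentration estimate uses a standard piecewise approximation, and the cap on `kappa` is an assumption added for numerical safety.

```python
import math

def bessel_i0(x, terms=50):
    """Modified Bessel function of the first kind, order 0 (power series)."""
    total, term = 1.0, 1.0
    for k in range(1, terms):
        term *= (x / 2.0) ** 2 / (k * k)
        total += term
    return total

def fit_von_mises(angles, max_kappa=20.0):
    """Return (mean direction, concentration) of a list of angles in radians."""
    c = sum(math.cos(a) for a in angles) / len(angles)
    s = sum(math.sin(a) for a in angles) / len(angles)
    mu = math.atan2(s, c)
    r = min(math.hypot(c, s), 1.0 - 1e-6)  # mean resultant length
    # Piecewise approximation of the maximum-likelihood concentration.
    if r < 0.53:
        kappa = 2 * r + r ** 3 + 5 * r ** 5 / 6
    elif r < 0.85:
        kappa = -0.4 + 1.39 * r + 0.43 / (1 - r)
    else:
        kappa = 1 / (r ** 3 - 4 * r ** 2 + 3 * r)
    return mu, min(kappa, max_kappa)

def von_mises_pdf(theta, mu, kappa):
    return math.exp(kappa * math.cos(theta - mu)) / (2 * math.pi * bessel_i0(kappa))

# Orientations spread around "up" (pi / 2).
angles = [math.pi / 2 + d for d in (-0.6, -0.3, 0.0, 0.3, 0.6)]
mu, kappa = fit_von_mises(angles)
```

Using circular statistics avoids the wrap-around problem of an ordinary mean: angles just below and just above 180 degrees average to 180 degrees, not to 0.
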
Direction model (2)
- The direction model is updated at each new frame for the whole duration of the video clip.
- [Figure: optical flow and direction model at t = 0, 40, 115 and 160; block size: 20x20]
Magnitude model
- The magnitude model is estimated following the same steps as the direction model.
- A Gaussian mixture is estimated for each block.
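
A per-block Gaussian mixture over magnitudes could be estimated with a few EM iterations, as in this hedged sketch. The number of components, the deterministic initialization and the variance floor are assumptions; the slides do not describe the estimation procedure in detail.

```python
import math

def gaussian_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def fit_gmm_1d(values, n_components=2, n_iter=30, min_var=1e-3):
    """Fit a 1-D Gaussian mixture with EM; returns (weights, means, variances)."""
    lo, hi = min(values), max(values)
    # Deterministic initialization: means spread evenly over the data range.
    means = [lo + (hi - lo) * k / max(n_components - 1, 1)
             for k in range(n_components)]
    variances = [max(((hi - lo) / n_components) ** 2, min_var)] * n_components
    weights = [1.0 / n_components] * n_components
    for _ in range(n_iter):
        # E-step: responsibility of each component for each value.
        resp = []
        for x in values:
            p = [w * gaussian_pdf(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
            total = sum(p) or 1e-300
            resp.append([pk / total for pk in p])
        # M-step: re-estimate weight, mean and variance of each component.
        for k in range(n_components):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue
            weights[k] = nk / len(values)
            means[k] = sum(r[k] * x for r, x in zip(resp, values)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, values)) / nk
            variances[k] = max(var, min_var)
    return weights, means, variances

# Magnitudes observed in one block: slow motion and fast motion (made-up values).
magnitudes = [0.8, 0.9, 1.0, 1.1, 1.2, 4.8, 4.9, 5.0, 5.1, 5.2]
weights, means, variances = fit_gmm_1d(magnitudes)
```

On this toy block the two components settle on the slow and the fast group of magnitudes.
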
General approach: applications
Human Action Recognition
- Different terminologies exist (action, activity, event).
- In this presentation, action recognition consists of identifying simple everyday actions (e.g. walking, running).
- Our input is a video (the query video) captured by a monocular camera.
- [Figure: example actions "answer the phone" and "boxing"]
Model associated with a video sequence
- Model of a video = (direction model, magnitude model)
Distance metric
- [Figure: the query model is compared with the template models (walking, running, jogging, handwaving, handclapping, boxing); the closest template gives the detected event]
Distance metric (2)
- Distance between two direction models
- Distance between two magnitude models
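
The matching step can be sketched as a nearest-template search. The two distance functions below are hypothetical stand-ins, since the formulas from the slides are not available here: each model is reduced to one dominant angle and one mean magnitude per block, and the two distances are simply added.

```python
import math

def direction_distance(dirs_a, dirs_b):
    """Hypothetical distance: mean of (1 - cos(angle difference)) / 2 over blocks."""
    total = sum((1.0 - math.cos(a - b)) / 2.0 for a, b in zip(dirs_a, dirs_b))
    return total / len(dirs_a)

def magnitude_distance(mags_a, mags_b):
    """Hypothetical distance: mean relative difference of block magnitudes."""
    total = sum(abs(a - b) / max(a, b, 1e-9) for a, b in zip(mags_a, mags_b))
    return total / len(mags_a)

def classify(query, templates):
    """Return the label of the template model closest to the query model."""
    def distance(label):
        t_dirs, t_mags = templates[label]
        return (direction_distance(query[0], t_dirs)
                + magnitude_distance(query[1], t_mags))
    return min(templates, key=distance)

# Toy models over three blocks: (dominant angles, mean magnitudes). Made-up values.
templates = {
    "walking": ([0.0, 0.0, 0.1], [1.0, 1.2, 1.1]),
    "running": ([0.0, 0.0, 0.1], [4.0, 4.5, 4.2]),
    "handwaving": ([math.pi / 2, -math.pi / 2, math.pi / 2], [2.0, 2.0, 2.0]),
}
query = ([0.05, 0.0, 0.1], [3.8, 4.4, 4.0])
detected = classify(query, templates)
```

Here walking and running share the same directions, so the magnitude term is what separates them; handwaving is rejected on direction.
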
Result comparison
- [Results on the KTH dataset and on the ADL dataset]
- [BALD11] Yassine Benabbas, Samir Amir, Adel Lablack, and Chabane Djeraba. Human action recognition using direction and magnitude models of motion. In International Conference on Computer Vision Theory and Applications (VISAPP), 2011.
Crowd Event Detection
- Objective: detect interesting events or situations that occur in a crowd scene.
- The targeted events are:
  - Running
  - Splitting
  - Local dispersion
  - Evacuation
  - Merging
- These events are defined in the PETS 2009 workshop.
Computing the instantaneous direction model
- Compute the direction model for the current frame.
- Keep only the main orientation of each block of the direction model.
Group Clustering and Tracking
- Cluster neighboring blocks that have a similar direction into a group.
- Define an orientation and a centroid for each group.
- Track each group over the following frames.
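
The grouping step can be sketched as a flood fill over the block grid. The 4-neighborhood, the 30-degree similarity threshold and the data layout (a grid holding one main orientation per block, or `None` for empty blocks) are assumptions; tracking across frames is not shown.

```python
import math
from collections import deque

def angle_diff(a, b):
    """Absolute angular difference in [0, pi]."""
    return abs(math.atan2(math.sin(a - b), math.cos(a - b)))

def cluster_blocks(grid, max_diff=math.radians(30)):
    """Group neighboring blocks whose main orientations are similar."""
    rows, cols = len(grid), len(grid[0])
    seen, groups = set(), []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] is None or (r, c) in seen:
                continue
            seen.add((r, c))
            queue, members = deque([(r, c)]), []
            while queue:
                cr, cc = queue.popleft()
                members.append((cr, cc))
                for nr, nc in ((cr - 1, cc), (cr + 1, cc), (cr, cc - 1), (cr, cc + 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and (nr, nc) not in seen
                            and grid[nr][nc] is not None
                            and angle_diff(grid[cr][cc], grid[nr][nc]) <= max_diff):
                        seen.add((nr, nc))
                        queue.append((nr, nc))
            # Group orientation: circular mean; centroid: mean block position.
            sin_sum = sum(math.sin(grid[mr][mc]) for mr, mc in members)
            cos_sum = sum(math.cos(grid[mr][mc]) for mr, mc in members)
            groups.append({
                "blocks": members,
                "orientation": math.atan2(sin_sum, cos_sum),
                "centroid": (sum(m[0] for m in members) / len(members),
                             sum(m[1] for m in members) / len(members)),
            })
    return groups

E, W = 0.0, math.pi   # east-moving and west-moving blocks
grid = [[E, E, None, W],
        [E, E, None, W],
        [None, None, None, W]]
groups = cluster_blocks(grid)
```

The example yields two groups moving in opposite directions; matching centroids and orientations from one frame to the next would then give the tracks.
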
Event detection
- We use two classifiers:
  - One for running and walking events, using the mean motion speed as a feature.
  - One for local dispersion, split, merge and evacuation events, using as features:
    - number of groups
    - mean orientation
    - circular variance
    - mean motion speed
    - mean distance between groups
- Using two classifiers allows us to detect [...]
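
The features listed for the second classifier can be computed from the tracked groups as in the sketch below. The group representation (dictionaries with `orientation`, `centroid` and `speed` keys) is an assumption; the classifiers themselves are not shown.

```python
import math
from itertools import combinations

def circular_stats(angles):
    """Return (mean orientation, circular variance) of a list of angles."""
    c = sum(math.cos(a) for a in angles) / len(angles)
    s = sum(math.sin(a) for a in angles) / len(angles)
    # Circular variance is 1 - R, where R is the mean resultant length.
    return math.atan2(s, c), 1.0 - math.hypot(c, s)

def mean_pairwise_distance(centroids):
    pairs = list(combinations(centroids, 2))
    if not pairs:
        return 0.0  # fewer than two groups: no distance to measure
    return sum(math.dist(p, q) for p, q in pairs) / len(pairs)

def crowd_features(groups):
    mean_orientation, variance = circular_stats([g["orientation"] for g in groups])
    return {
        "n_groups": len(groups),
        "mean_orientation": mean_orientation,
        "circular_variance": variance,
        "mean_speed": sum(g["speed"] for g in groups) / len(groups),
        "mean_group_distance": mean_pairwise_distance([g["centroid"] for g in groups]),
    }

# Two groups moving at right angles to each other (made-up values).
groups = [
    {"orientation": 0.0, "centroid": (0.0, 0.0), "speed": 1.0},
    {"orientation": math.pi / 2, "centroid": (3.0, 4.0), "speed": 3.0},
]
features = crowd_features(groups)
```

A high circular variance together with a growing distance between groups is the kind of signal one would expect for a split or dispersion, while low variance suggests a common direction.
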
Comparison
- [BID11] Yassine Benabbas, Nacim Ihaddadene, and Chabane Djeraba. Motion pattern extraction and event detection for automatic visual surveillance. EURASIP Journal on Image and Video Processing, 2011:15, 2011.
Motion Pattern Extraction
- It consists of extracting usual (or repetitive) patterns (or trends) of motion.
- The patterns can be seen as a summary of the motion behavior in a video.
Motion Pattern Extraction (2)
- Motion patterns learned from a given scene can be used to model the usual behavior of subjects, and have many applications:
  - They provide relevant information about the subjects' behavior.
  - They can improve tracking results.
  - They can help to detect events.
- Learning motion patterns in unstructured crowd scenes is a difficult task.
  - At some locations in the scene, the motion has several different orientations (example: a zebra crossing).
Clustering similar regions
- Assign at most k major orientations to each cell; they are obtained from the cell's mixture model.
- The result is a direction model.
- [Figure: representation of the learned direction model]
Clustering similar regions (2)
- Cluster similar blocks according to their major orientations.
- Two blocks are similar if:
  - they are neighbors (the neighborhood window is one block), and
  - two of their major orientations are close, i.e. their cosine similarity is above a predefined threshold.
- A block can belong to at most k clusters.
- [Figure: direction model and extracted patterns 1, 2 and 3]
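
One way to let a block belong to several patterns is to cluster (block, orientation) pairs instead of blocks, as in this hedged sketch. It assumes an 8-neighborhood for the one-block window and treats two orientations as matching when the cosine of their difference is at least `min_cos`; both choices, like the data layout, are assumptions.

```python
import math

def extract_patterns(model, min_cos=0.9):
    """model: {(row, col): [major orientation, ...]} -> list of patterns.

    Each pattern is a sorted list of blocks. A block with several major
    orientations may appear in several patterns.
    """
    seen, patterns = set(), []
    for block in sorted(model):
        for idx in range(len(model[block])):
            if (block, idx) in seen:
                continue
            seen.add((block, idx))
            stack, members = [(block, idx)], set()
            while stack:
                (r, c), i = stack.pop()
                members.add((r, c))
                theta = model[(r, c)][i]
                # Visit the neighbors within a window of one block.
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        nb = (r + dr, c + dc)
                        if nb == (r, c) or nb not in model:
                            continue
                        for j, other in enumerate(model[nb]):
                            if (nb, j) not in seen and math.cos(theta - other) >= min_cos:
                                seen.add((nb, j))
                                stack.append((nb, j))
            patterns.append(sorted(members))
    return patterns

# Toy crossing: an eastward lane in row 0, and an upward lane that shares block (0, 1).
model = {
    (0, 0): [0.0],
    (0, 1): [0.0, math.pi / 2],
    (0, 2): [0.0],
    (1, 1): [math.pi / 2],
}
patterns = extract_patterns(model)
```

Block `(0, 1)` ends up in both patterns, which is the situation the zebra-crossing example describes: one location, several usual directions.
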
Experiments
- Car traffic video from the AVSS dataset
- [Figure: the orientations of the optical flow vectors are represented]
Detected patterns
Putting it all together
Escalator
Comparison
- [BID11] Yassine Benabbas, Nacim Ihaddadene, and Chabane Djeraba. Motion pattern extraction and event detection for automatic visual surveillance. EURASIP Journal on Image and Video Processing, 2011:15, 2011.
Conclusion and future work
- Conclusions
  - General approach for video analysis
  - Based on motion, which preserves privacy
  - Very promising results
  - Can easily be improved and applied to other applications
- Future work
  - Open-source behavior analysis toolbox
  - Apply the approaches in real environments
  - Scale-independent features
  - In event detection: apply weights to the direction and magnitude models
  - Refine group analysis (detect walking and running persons inside a group)
Thank you for your attention. Questions?