Real-time Skeleton-tracking-based Human Action Recognition Using Kinect Data
Georgios Th. Papadopoulos, Apostolos Axenopoulos and Petros Daras
Information Technologies Institute, Centre for Research & Technology - Hellas, Thessaloniki, Greece

Abstract. In this paper, a real-time tracking-based approach to human action recognition is proposed. The method receives as input depth map data streams from a single Kinect sensor. Initially, a skeleton-tracking algorithm is applied. Then, a new action representation is introduced, which is based on the calculation of spherical angles between selected joints and the respective angular velocities. For invariance incorporation, a pose estimation step is applied and all features are extracted according to a continuously updated torso-centered coordinate system; this is different from the usual practice of using common normalization operators. Additionally, the approach includes a motion energy-based methodology for applying horizontal symmetry. Finally, action recognition is realized using Hidden Markov Models (HMMs). Experimental results using the Huawei/3DLife 3D human reconstruction and action recognition Grand Challenge dataset demonstrate the efficiency of the proposed approach.

Keywords: Action recognition, skeleton-tracking, action representation, depth map, Kinect

1 Introduction

Action recognition constitutes a widely studied field and a very active topic in the computer vision research community [7]. This is due to the wide set of potential fields where the research outcomes can be commercially applied, such as surveillance, security, human-computer interaction, smart houses and assistance of the elderly/disabled, to name a few. In order to develop a robust action recognition system, the respective algorithm needs to efficiently handle the differences in the appearance of the subjects, the human silhouette features, the execution of the same actions, etc.
Additionally, the system should incorporate the typical rotation, translation and scale invariances. Despite the fact that multiple research groups focus on this topic and numerous approaches have already been presented, significant challenges towards fully addressing the problem in the general case remain. Action recognition approaches can be roughly divided into the following three categories [10], irrespective of the data that they receive as input (i.e.
single-camera videos, multi-view video sequences, depth maps, 3D reconstruction data, etc.): Space-Time Interest Point (STIP)-based [6][5][2], spatio-temporal shape-based [13][14][3] and tracking-based [4][12][8][9]. STIP-based methods perform analysis at the local level; however, they typically exhibit increased computational complexity for reaching satisfactory recognition performance. Spatio-temporal shape approaches rely on the estimation of global-level representations for performing recognition, using e.g. the outer boundary of an action; however, they are prone to the detrimental effects caused by self-occlusions of the performing subjects. On the other hand, the performance of tracking-based approaches, which rely on the tracking of particular features or specific human body parts in subsequent frames (including optical-flow-based methods), depends heavily on the efficiency of the employed tracker. Nevertheless, the advantage of the latter category of methods is that they allow the real-time recognition of human actions. In this paper, a real-time tracking-based approach to human action recognition is proposed. The method receives as input a sequence of depth maps captured from a single Kinect sensor, in order to efficiently capture the human body movements in 3D space. Subsequently, a skeleton-tracking algorithm is applied, which iteratively detects the position of 15 joints of the human body in every captured depth map. Then, a new action representation is introduced, which is based on the calculation of spherical angles between selected joints and the respective angular velocities, for satisfactorily handling the differences in the appearance and/or execution of the same actions among individuals.
For incorporating further invariance to appearance, scale, rotation and translation, a pose estimation step is applied prior to the feature extraction procedure and all features are calculated according to a continuously updated torso-centered coordinate system; this is different from the usual practice of using normalization operators during the analysis process [4]. Additionally, the approach incorporates a motion energy-based methodology for applying horizontal symmetry and hence efficiently handling left- and right-body part executions of the exact same action. Finally, action recognition is realized using Hidden Markov Models (HMMs). Experimental results using the Huawei/3DLife 3D human reconstruction and action recognition Grand Challenge dataset demonstrate the efficiency of the proposed approach. The paper is organized as follows: Section 2 outlines the employed skeleton-tracking algorithm. The proposed action recognition approach is described in Section 3. Experimental results are presented in Section 4 and conclusions are drawn in Section 5.

2 Skeleton-tracking

Prior to the application of the proposed action recognition approach, the depth maps captured by the Kinect sensor are processed by a skeleton-tracking algorithm. The depth maps of the utilized dataset were acquired using the OpenNI
API. To this end, the OpenNI high-level skeleton-tracking module is also used for detecting the performing subject and tracking a set of joints of his/her body. More specifically, the OpenNI tracker detects the position of the following set of joints in the 3D space: G = {g_i, i ∈ [1, I]} ≡ {Torso, Neck, Head, Left shoulder, Left elbow, Left wrist, Right shoulder, Right elbow, Right wrist, Left hip, Left knee, Left foot, Right hip, Right knee, Right foot}. The position of joint g_i is given by the vector p_i(t) = [x y z]^T, where t denotes the frame for which the joint position is located and the origin of the orthogonal XYZ coordinate system is placed at the center of the Kinect sensor. An indicative example of a captured depth map and the tracked joints is given in Fig. 1. The OpenNI skeleton-tracking module requires user calibration in order to estimate several body characteristics of the subject. In recent versions of OpenNI, the auto-calibration mode enables user calibration without requiring the subject to assume any particular calibration pose. Since no calibration pose was captured for the employed dataset, OpenNI's auto-calibration mode is used in this work. The experimental evaluation showed that the employed skeleton-tracking algorithm is relatively robust for the utilized dataset. In particular, the position of the joints is usually detected accurately, although there are some cases where the tracking is not correct. Characteristic examples of the latter are the inaccurate detection of the joint positions when very sudden and intense movements occur (e.g. arm movements when performing actions like "punching") or when self-occlusions are present (e.g. occlusion of the knees when extensive body movements are observed during actions like "golf drive").

3 Action recognition

In this section, the proposed skeleton-tracking-based action recognition approach is detailed.
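For concreteness, the joint set G and one frame of tracked positions can be laid out as in the following sketch; NumPy and the 0-based index order are assumptions of this illustration (the paper itself indexes joints from 1):

```python
import numpy as np

# Joint names in the order listed for G above (0-based here; Torso is index 0).
JOINTS = [
    "Torso", "Neck", "Head",
    "Left shoulder", "Left elbow", "Left wrist",
    "Right shoulder", "Right elbow", "Right wrist",
    "Left hip", "Left knee", "Left foot",
    "Right hip", "Right knee", "Right foot",
]
J = {name: i for i, name in enumerate(JOINTS)}

# One frame of tracker output: a (15, 3) array of [x, y, z] positions in the
# Kinect-centered orthogonal XYZ coordinate system.
frame_t = np.zeros((len(JOINTS), 3))
frame_t[J["Right shoulder"]] = [0.2, 1.4, 2.0]  # illustrative values only
```

A sequence of such (15, 3) frames is the input consumed by the pose estimation and feature extraction steps of Section 3.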
The developed method satisfies the following two fundamental principles: a) the computational complexity needs to be relatively low, so that the real-time nature of the algorithm is maintained, and b) the dimensionality of the estimated action representation also needs to be low, which is a requirement for efficient HMM-based analysis [11].

3.1 Pose estimation

The first step in the proposed analysis process is a pose estimation procedure. This is performed to render the proposed approach invariant to differences in appearance, body silhouette and action execution among different subjects, in addition to the typically required invariances to scale, translation and rotation. The proposed methodology differs from the commonly adopted normalization procedures (e.g. [4]), which, in their effort to incorporate invariance characteristics, are inevitably led to some kind of information loss. In particular, the aim of this step is to estimate a continuously updated
Fig. 1. Indicative pose estimation example: (a) examined frame and (b) captured depth map along with the tracked joints and the estimated orthogonal pose vectors.

orthogonal basis of vectors for every frame t that represents the subject's pose. The calculation of the latter is based on the fundamental consideration that the orientation of the subject's torso is the most characteristic quantity of the subject during the execution of any action and for that reason it can be used as reference. For pose estimation, the position of the following three joints is taken into account: Left shoulder, Right shoulder and Right hip. These comprise joints around the torso area, whose relative position remains almost unchanged during the execution of any action. The motivation behind the consideration of the three aforementioned joints, instead of directly estimating the position of the Torso joint and the respective normal vector, is to reach a more accurate estimation of the subject's pose. It must be noted that the Right hip joint was preferred over the obvious Torso joint selection, so that the orthogonal basis of vectors is estimated from joints with larger in-between distances, which are more likely to lead to a more accurate pose estimation. However, no significant deviation in action recognition performance was observed when the Torso joint was used instead. In this work, the subject's pose comprises the following three orthogonal vectors {n_1, n_2, n_3}, which are calculated as follows:

n_1 = (p_7 - p_4) / ||p_7 - p_4||,   u = (p_7 - p_13) / ||p_7 - p_13||
n_3 = (n_1 × u) / ||n_1 × u||,   n_2 = n_3 × n_1                        (1)

where subscript t is omitted in the above expressions for clarity, ||.|| denotes the norm of a vector and × denotes the cross product of two vectors. An indicative example of the proposed pose estimation procedure is illustrated in Fig. 1.

3.2 Action representation

For realizing efficient action recognition, an appropriate representation is required that satisfactorily handles the differences in appearance, human body type and execution of actions among individuals. For that purpose, the angles of the joints' relative positions are used in this work, which proved more discriminative than using e.g. the joints' normalized coordinates directly. In order to compute a compact description, the aforementioned angles are estimated in the spherical coordinate system and the radial distance is omitted, since it contains information that is not necessary for the recognition process. Additionally, building on the fundamental idea of the previous section, all angles are computed using the Torso joint as reference, i.e. the origin of the spherical coordinate system is placed at the Torso joint position. For computing the proposed action representation, only a subset of the supported joints is used. This is due to the fact that the trajectory of some joints mainly contains redundant or noisy information.
To this end, only the joints that correspond to the upper and lower body limbs were considered after experimental evaluation, namely the joints Left shoulder, Left elbow, Left wrist, Right shoulder, Right elbow, Right wrist, Left knee, Left foot, Right knee and Right foot. Incorporating information from the remaining joints led to inferior performance, partly also due to the higher dimensionality of the calculated feature vector that hindered efficient HMM-based analysis. For every selected joint g_i the following spherical angles are estimated:

φ_i = arccos( ⟨(p_i - p_0), n_2⟩ / (||p_i - p_0|| ||n_2||) ) ∈ [0, π]
θ_i = arctan( ⟨(p_i - p_0), n_1⟩ / ⟨(p_i - p_0), n_3⟩ ) ∈ [-π, π]        (2)

where subscript t is omitted for clarity, φ_i is the computed polar angle, θ_i is the calculated azimuth angle, p_0 denotes the Torso joint position and ⟨.,.⟩ denotes the dot product of two vectors.
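Equations (1) and (2) can be sketched in NumPy as below; the function and variable names (pose_basis, p4, p7, p13) are illustrative assumptions, not the authors' code. Note that the azimuth's full [-π, π] range implies a quadrant-aware arctangent, hence arctan2:

```python
import numpy as np

def unit(v):
    """Normalize a vector (assumes non-zero length)."""
    return v / np.linalg.norm(v)

def pose_basis(p4, p7, p13):
    """Eq. (1): torso-centered orthonormal basis {n1, n2, n3}.
    p4: Left shoulder, p7: Right shoulder, p13: Right hip (paper's 1-based indices)."""
    n1 = unit(p7 - p4)
    u = unit(p7 - p13)
    n3 = unit(np.cross(n1, u))
    n2 = np.cross(n3, n1)  # already unit-length since n3 is orthogonal to n1
    return n1, n2, n3

def spherical_angles(p_i, p0, n1, n2, n3):
    """Eq. (2): polar angle phi in [0, pi] and azimuth theta in [-pi, pi]
    of joint position p_i relative to the Torso position p0."""
    d = p_i - p0
    phi = np.arccos(np.dot(d, n2) / (np.linalg.norm(d) * np.linalg.norm(n2)))
    theta = np.arctan2(np.dot(d, n1), np.dot(d, n3))
    return phi, theta
```

For a subject facing the sensor with shoulders along the x-axis, n_1 points left-to-right across the shoulders, n_2 roughly upward along the torso, and n_3 out of the body plane.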
Complementarily to the spherical angles, it was experimentally shown that the respective angular velocities provide additional discriminative information. To this end, the polar and azimuth velocities are estimated for each of the selected joints g_i, using the same expressions described in (2). The difference is that instead of the position vector p_i the corresponding velocity vector v_i is used. The latter is approximated by the displacement vector between two successive frames, i.e. v_i(t) = p_i(t) - p_i(t-1). The estimated spherical angles and angular velocities for frame t constitute the frame's observation vector. Collecting the computed observation vectors for all frames of a given action segment forms the respective action observation sequence h that will be used for performing HMM-based recognition, as will be described in the sequel.

3.3 Horizontal symmetry

A problem inherently present in action recognition tasks concerns the execution of the same action with either a right- or left-body part motion. In order to address this issue, a motion energy-based approach is followed in this work, which builds upon the idea of the already introduced subject's pose (Section 3.1). The proposed method goes beyond typical solutions (e.g. measuring the length or the variance of the trajectories of the left/right body joints) and is capable of identifying and applying symmetries not only for common upper limb movements (e.g. left/right-hand waving actions), but also for more extensive whole-body actions (e.g. right/left-handed golf drive). In particular, all joints defined in G are considered, except for the Torso, Head and Neck ones. For each of these joints, its motion energy, which is approximated by ||v_i(t)|| = ||p_i(t) - p_i(t-1)||, is estimated for every frame t.
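Assembling a per-frame observation vector from the spherical angles and angular velocities might then look as follows; this is a sketch under assumed naming conventions (the 4-values-per-joint layout follows the description above, and non-zero displacements are assumed):

```python
import numpy as np

def angles_of(d, n1, n2, n3):
    """Spherical angles (phi, theta) of a displacement d in the torso basis
    {n1, n2, n3}; d is assumed non-zero."""
    phi = np.arccos(np.dot(d, n2) / (np.linalg.norm(d) * np.linalg.norm(n2)))
    theta = np.arctan2(np.dot(d, n1), np.dot(d, n3))
    return phi, theta

def observation(p_t, p_prev, p0, basis, selected):
    """Observation vector for frame t: for each selected joint, the spherical
    angles of its position relative to the Torso p0, plus the angles of its
    velocity v_i(t) = p_i(t) - p_i(t-1). Returns a 4 * len(selected) vector."""
    n1, n2, n3 = basis
    obs = []
    for i in selected:
        obs.extend(angles_of(p_t[i] - p0, n1, n2, n3))        # phi_i, theta_i
        obs.extend(angles_of(p_t[i] - p_prev[i], n1, n2, n3)) # angular velocities
    return np.array(obs)
```

With the 10 selected joints this yields a 40-dimensional observation per frame, which is consistent with the low-dimensionality requirement stated for HMM-based analysis.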
Then, the calculated motion energy value ||v_i(t)|| is assigned to the Left or Right Body Part (LBP/RBP) according to the following criterion:

if ⟨(p_i(t) - p_0(t)), n_1(t)⟩ > 0:  ||v_i(t)|| → RBP
if ⟨(p_i(t) - p_0(t)), n_1(t)⟩ < 0:  ||v_i(t)|| → LBP                    (3)

where → denotes the assignment of energy value ||v_i(t)|| to LBP/RBP. By considering the overall motion energy that is assigned to LBP/RBP for the whole duration of the examined action, the most active body part is estimated. Then, a simple reallocation of the extracted feature values and application of horizontal symmetry attributes is performed. This is realized as follows: If the most active part is LBP, the representation described in Section 3.2 remains unchanged. In case RBP is the most active one, the feature values of horizontally symmetric joints (e.g. Left shoulder - Right shoulder) are exchanged in all observation vectors of the respective action observation sequence h, while the horizontal symmetry attribute is applied by inverting the sign of all estimated azimuth angles and azimuth angular velocities. In this way, efficient horizontal symmetry that covers both limb motions as well as more extensive body movements is imposed.
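The energy bookkeeping of Eq. (3) can be sketched as below; NumPy and the function name are assumptions of this illustration:

```python
import numpy as np

def most_active_side(positions, torso, n1, joints):
    """Eq. (3): accumulate motion energy ||v_i(t)|| into the left (LBP) or
    right (RBP) body part, depending on the sign of <p_i(t) - p_0(t), n1(t)>.
    positions: (T, num_joints, 3) joint tracks; torso: (T, 3) Torso positions;
    n1: (T, 3) per-frame shoulder-axis vectors; joints: indices considered."""
    energy = {"LBP": 0.0, "RBP": 0.0}
    for t in range(1, positions.shape[0]):
        for i in joints:
            e = np.linalg.norm(positions[t, i] - positions[t - 1, i])
            side = "RBP" if np.dot(positions[t, i] - torso[t], n1[t]) > 0 else "LBP"
            energy[side] += e
    return max(energy, key=energy.get)
```

If the routine reports RBP as the dominant side, the feature values of horizontally symmetric joints are swapped and all azimuth signs inverted, exactly as the paragraph above prescribes.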
3.4 HMM-based recognition

HMMs are employed in this work for performing action recognition, due to their suitability for modeling pattern recognition problems that exhibit an inherent temporality [11]. In particular, a set of J HMMs is employed, where an individual HMM is introduced for every supported action a_j. Each HMM receives as input the action observation sequence h (described in Section 3.2) and at the evaluation stage returns a posterior probability P(a_j | h), which represents the observation sequence's fitness to the particular model. Regarding the HMM implementation details, fully connected first-order HMMs, i.e. HMMs allowing all possible hidden state transitions, were utilized for performing the mapping of the low-level features to the high-level actions. For every hidden state the observations were modeled as a mixture of Gaussians (a single Gaussian was used for every state). The employed Gaussian mixture models (GMMs) were set to have full covariance matrices for exploiting all possible correlations between the elements of each observation. Additionally, the Baum-Welch (or Forward-Backward) algorithm was used for training, while the Viterbi algorithm was utilized during the evaluation. Furthermore, the number of hidden states of the HMMs was considered a free variable. The developed HMMs were implemented using the software libraries of [1].

4 Experimental results

In this section, experimental results from the application of the proposed approach to the Huawei/3DLife 3D human reconstruction and action recognition Grand Challenge dataset are presented. In particular, the second session of the first dataset is used, which provides RGB-plus-depth video streams from two Kinect sensors. In this work, the data stream from the frontal Kinect was used. The dataset includes captures of 14 human subjects, where each action is performed at least 5 times by every individual.
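The per-action scoring of Section 3.4 can be illustrated with a small log-space forward-algorithm routine over single-Gaussian, full-covariance emissions; this is a self-contained stand-in for the HTK models of [1], with all parameters illustrative rather than trained:

```python
import numpy as np

def log_gauss(x, mean, cov):
    """Log-density of a full-covariance Gaussian at x."""
    d = x - mean
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (len(x) * np.log(2 * np.pi) + logdet
                   + d @ np.linalg.solve(cov, d))

def hmm_loglik(obs, pi, A, means, covs):
    """Forward algorithm in log space: log P(obs | HMM).
    pi: (S,) initial state probabilities; A: (S, S) transition matrix
    (fully connected, as in the paper); means/covs: per-state Gaussian
    emission parameters; obs: (T, D) observation sequence."""
    S = len(pi)
    logA = np.log(A)
    alpha = np.log(pi) + np.array([log_gauss(obs[0], means[s], covs[s])
                                   for s in range(S)])
    for x in obs[1:]:
        emit = np.array([log_gauss(x, means[s], covs[s]) for s in range(S)])
        # log-sum-exp over previous states for each next state
        alpha = emit + np.logaddexp.reduce(alpha[:, None] + logA, axis=0)
    return np.logaddexp.reduce(alpha)
```

At evaluation time, recognition reduces to picking the action model with the highest likelihood for the observation sequence h, e.g. `max(models, key=lambda a: hmm_loglik(h, *models[a]))`.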
Out of the available 22 supported actions, the following set of 17 dynamic ones was considered for the experimental evaluation of the proposed approach: A = {a_j, j ∈ [1, J]} ≡ {Hand waving, Knocking the door, Clapping, Throwing, Punching, Push away, Jumping jacks, Lunges, Squats, Punching and kicking, Weight lifting, Golf drive, Golf chip, Golf putt, Tennis forehand, Tennis backhand, Walking on the treadmill}. The 5 discarded actions (namely Arms folded, T-Pose, Hands on the hips, T-Pose with bent arms and Forward arms raise) correspond to static ones that can be easily detected using a simple action representation; hence, they were not included in the conducted experiments, which aim at evaluating the performance of the proposed approach on complex and time-varying human actions. Performance evaluation was realized following the leave-one-out methodology, where in every iteration one subject was used for performance measurement and the remaining ones were used for training; eventually, an average performance measure was computed taking into account all intermediate recognition results.
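The leave-one-subject-out protocol described above can be sketched generically as below; the train and evaluate callables are placeholders for model fitting and classification, not the paper's implementation:

```python
def leave_one_subject_out(samples, train, evaluate):
    """samples: list of (subject_id, action_label, observation_sequence).
    In each fold one subject is held out for testing and the rest train the
    models; returns the accuracy averaged over all folds."""
    subjects = sorted({s for s, _, _ in samples})
    accuracies = []
    for held_out in subjects:
        train_set = [x for x in samples if x[0] != held_out]
        test_set = [x for x in samples if x[0] == held_out]
        models = train(train_set)
        correct = sum(evaluate(models, seq) == label
                      for _, label, seq in test_set)
        accuracies.append(correct / len(test_set))
    return sum(accuracies) / len(accuracies)
```

With the 14 subjects of the dataset this yields 14 folds, each testing on one unseen subject, which matches the protocol described above.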
Fig. 2. Obtained action recognition results: (a) estimated action confusion matrix and (b) calculated action recognition rates for the proposed approach and the Comp1-Comp3 variants. Supported actions: a_1: Hand waving, a_2: Knocking the door, a_3: Clapping, a_4: Throwing, a_5: Punching, a_6: Push away, a_7: Jumping jacks, a_8: Lunges, a_9: Squats, a_10: Punching and kicking, a_11: Weight lifting, a_12: Golf drive, a_13: Golf chip, a_14: Golf putt, a_15: Tennis forehand, a_16: Tennis backhand, a_17: Walking on the treadmill.

In Fig. 2, quantitative action recognition results are presented in the form of the estimated confusion matrix (Fig. 2 (a)) and the calculated recognition rates (Fig. 2 (b)), i.e. the percentage of the action instances that were correctly identified. Additionally, the value of the overall classification accuracy, i.e. the percentage of all action instances that were correctly classified, is also given. For performance evaluation, it has been considered that arg max_j (P(a_j | h)) indicates the action a_j that is assigned to observation sequence h. From the presented results, it can be seen that the proposed approach achieves satisfactory action recognition performance (overall accuracy equal to 76.03%), which demonstrates the capability of the developed method to combine real-time processing with increased recognition rates. Examining the results in detail, it can be seen that there are actions that exhibit high recognition rates (e.g. Jumping jacks, Punching and
kicking and Walking on the treadmill), since they present characteristic motion patterns among all subjects. However, there are also actions for which the recognition performance is lower (e.g. Punching, Knocking the door and Golf drive). This is mainly due to these actions presenting, over a period of time during their execution, motion patterns very similar to those of other actions (i.e. Punching and kicking, Hand waving and Golf chip, respectively). Moreover, careful examination of the obtained results revealed that further performance improvement was mainly hindered by the following two factors: a) the employed tracker sometimes provided inaccurate joint localizations (especially in cases of rapid movements or self-occlusions) and b) significant ambiguities in the execution of particular action pairs (e.g. some Golf drive instances presented such significant similarities with corresponding Golf chip ones that even a human observer would find it difficult to discriminate between them). The proposed approach is also quantitatively compared with the following variants: a) use of the normalized Euclidean distance of every selected joint from the Torso one (Comp1), b) use of the normalized spherical angles of each joint from the Torso one (Comp2), and c) a variant of the proposed approach, where the estimated spherical angles and angular velocities are linearly normalized with respect to the maximum/minimum values that they exhibit during the execution of a particular action (Comp3). Variants (a) and (b) follow the normalization approach described in [4], i.e. the human joint vector is position- and orientation-normalized without considering the relative position of the joints.

Table 1. Time efficiency evaluation (per-frame average processing times in msec)
Kinect sensor: Capturing period | Proposed approach: Skeleton tracking | Feature extraction | HMM-based recognition
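The reported per-action rates, confusion matrix and overall accuracy follow mechanically from the arg max decision rule; a generic computation (names assumed for illustration) might look like:

```python
import numpy as np

def recognition_stats(true_labels, pred_labels, num_actions):
    """Confusion matrix (rows: true action, columns: predicted action),
    per-action recognition rate, and overall classification accuracy."""
    cm = np.zeros((num_actions, num_actions), dtype=int)
    for t, p in zip(true_labels, pred_labels):
        cm[t, p] += 1
    rates = cm.diagonal() / cm.sum(axis=1)  # assumes every action occurs
    accuracy = cm.diagonal().sum() / cm.sum()
    return cm, rates, accuracy
```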
From the presented results, it can be seen that the proposed method significantly outperforms both the (a) and (b) variants. The latter demonstrates the usefulness of estimating the pose of the subject during the computation of the action representation, compared to performing a normalization step that does not explicitly take into account the relative position of the detected joints. The proposed method also outperforms variant (c), which suggests that normalizing the angles/velocities within the duration of a particular action leads to a decrease in performance. The time efficiency of the proposed approach is evaluated in Table 1. In particular, the per-frame average processing time, i.e. the time required for processing the data that correspond to a single frame, is given for every algorithmic step of the proposed method. More specifically, the following steps were considered: a) skeleton-tracking, described in Section 2, b) feature extraction, which includes the pose estimation, action representation computation and horizontal symmetry application procedures that are detailed in Sections 3.1, 3.2 and 3.3, respectively, and c) HMM-based recognition, outlined in Section 3.4. The duration of the aforementioned procedures is compared with the capturing period of the employed Kinect sensor, i.e. the time interval between two subsequently captured depth maps. The average processing times given in Table 1 were obtained using a PC with an Intel i7 processor at 2.67 GHz and a total of 6 GB RAM, while for their computation all video sequences of the employed dataset were taken into account. From the presented results, it can be seen that the employed skeleton-tracking algorithm constitutes the most time-consuming part of the proposed approach, corresponding to approximately 99.74% of the overall processing. The latter highlights the increased time efficiency (0.26% of the overall processing) of the proposed action recognition methodology (Section 3), a characteristic that received particular attention during the design of the proposed method. Additionally, it can be seen that the overall duration of all algorithmic steps is shorter than the respective Kinect's capturing period; in other words, all proposed algorithmic steps are completed for the currently examined depth map before the next one is captured by the Kinect sensor. This observation verifies the real-time nature of the proposed approach. It must be highlighted that the aforementioned time performances were measured without applying any particular code or algorithmic optimizations to the proposed method.

5 Conclusions

In this paper, a real-time tracking-based approach to human action recognition was presented and evaluated using the Huawei/3DLife 3D human reconstruction and action recognition Grand Challenge dataset. Future work includes the investigation of STIP-based approaches for overcoming the inherent limitations of skeleton-tracking algorithms and the incorporation of multimodal information from additional sensors.

Acknowledgment

The work presented in this paper was supported by the European Commission under contract FP REVERIE.

References

1. Hidden Markov Model Toolkit (HTK),
2.
Ballan, L., Bertini, M., Del Bimbo, A., Seidenari, L., Serra, G.: Effective codebooks for human action representation and classification in unconstrained videos. IEEE Transactions on Multimedia 14(4) (2012)
3. Gorelick, L., Blank, M., Shechtman, E., Irani, M., Basri, R.: Actions as space-time shapes. IEEE Transactions on Pattern Analysis and Machine Intelligence 29(12) (2007)
4. Gu, J., Ding, X., Wang, S., Wu, Y.: Action and gait recognition from recovered 3-d human joints. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics 40(4) (2010)
5. Haq, A., Gondal, I., Murshed, M.: On temporal order invariance for view-invariant action recognition. IEEE Transactions on Circuits and Systems for Video Technology 23(2) (2013)
6. Holte, M.B., Chakraborty, B., Gonzalez, J., Moeslund, T.B.: A local 3-d motion descriptor for multi-view human action recognition from 4-d spatio-temporal interest points. IEEE Journal of Selected Topics in Signal Processing 6(5) (2012)
7. Ji, X., Liu, H.: Advances in view-invariant human motion analysis: a review. IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews 40(1) (2010)
8. Junejo, I.N., Dexter, E., Laptev, I., Pérez, P.: View-independent action recognition from temporal self-similarities. IEEE Transactions on Pattern Analysis and Machine Intelligence 33(1) (2011)
9. Papadopoulos, G.T., Briassouli, A., Mezaris, V., Kompatsiaris, I., Strintzis, M.G.: Statistical motion information extraction and representation for semantic video analysis. IEEE Transactions on Circuits and Systems for Video Technology 19(10) (Oct 2009)
10. Poppe, R.: A survey on vision-based human action recognition. Image and Vision Computing 28(6) (2010)
11. Rabiner, L.R.: A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE 77(2) (1989)
12. Song, B., Kamal, A.T., Soto, C., Ding, C., Farrell, J.A., Roy-Chowdhury, A.K.: Tracking and activity recognition through consensus in distributed camera networks. IEEE Transactions on Image Processing 19(10) (2010)
13. Turaga, P., Veeraraghavan, A., Chellappa, R.: Statistical analysis on Stiefel and Grassmann manifolds with applications in computer vision. In: IEEE Conference on Computer Vision and Pattern Recognition (2008)
14. Weinland, D., Ronfard, R., Boyer, E.: Free viewpoint action recognition using motion history volumes. Computer Vision and Image Understanding 104(2) (2006)
Privacy Preserving Automatic Fall Detection for Elderly Using RGBD Cameras
Privacy Preserving Automatic Fall Detection for Elderly Using RGBD Cameras Chenyang Zhang 1, Yingli Tian 1, and Elizabeth Capezuti 2 1 Media Lab, The City University of New York (CUNY), City College New
Vision based Vehicle Tracking using a high angle camera
Vision based Vehicle Tracking using a high angle camera Raúl Ignacio Ramos García Dule Shu [email protected] [email protected] Abstract A vehicle tracking and grouping algorithm is presented in this work
Physics 160 Biomechanics. Angular Kinematics
Physics 160 Biomechanics Angular Kinematics Questions to think about Why do batters slide their hands up the handle of the bat to lay down a bunt but not to drive the ball? Why might an athletic trainer
AN IMPROVED DOUBLE CODING LOCAL BINARY PATTERN ALGORITHM FOR FACE RECOGNITION
AN IMPROVED DOUBLE CODING LOCAL BINARY PATTERN ALGORITHM FOR FACE RECOGNITION Saurabh Asija 1, Rakesh Singh 2 1 Research Scholar (Computer Engineering Department), Punjabi University, Patiala. 2 Asst.
How To Fix Out Of Focus And Blur Images With A Dynamic Template Matching Algorithm
IJSTE - International Journal of Science Technology & Engineering Volume 1 Issue 10 April 2015 ISSN (online): 2349-784X Image Estimation Algorithm for Out of Focus and Blur Images to Retrieve the Barcode
Character Animation from 2D Pictures and 3D Motion Data ALEXANDER HORNUNG, ELLEN DEKKERS, and LEIF KOBBELT RWTH-Aachen University
Character Animation from 2D Pictures and 3D Motion Data ALEXANDER HORNUNG, ELLEN DEKKERS, and LEIF KOBBELT RWTH-Aachen University Presented by: Harish CS-525 First presentation Abstract This article presents
VEHICLE LOCALISATION AND CLASSIFICATION IN URBAN CCTV STREAMS
VEHICLE LOCALISATION AND CLASSIFICATION IN URBAN CCTV STREAMS Norbert Buch 1, Mark Cracknell 2, James Orwell 1 and Sergio A. Velastin 1 1. Kingston University, Penrhyn Road, Kingston upon Thames, KT1 2EE,
Support Vector Machine-Based Human Behavior Classification in Crowd through Projection and Star Skeletonization
Journal of Computer Science 6 (9): 1008-1013, 2010 ISSN 1549-3636 2010 Science Publications Support Vector Machine-Based Human Behavior Classification in Crowd through Projection and Star Skeletonization
Ericsson T18s Voice Dialing Simulator
Ericsson T18s Voice Dialing Simulator Mauricio Aracena Kovacevic, Anna Dehlbom, Jakob Ekeberg, Guillaume Gariazzo, Eric Lästh and Vanessa Troncoso Dept. of Signals Sensors and Systems Royal Institute of
Synthetic Aperture Radar: Principles and Applications of AI in Automatic Target Recognition
Synthetic Aperture Radar: Principles and Applications of AI in Automatic Target Recognition Paulo Marques 1 Instituto Superior de Engenharia de Lisboa / Instituto de Telecomunicações R. Conselheiro Emídio
HANDS-FREE PC CONTROL CONTROLLING OF MOUSE CURSOR USING EYE MOVEMENT
International Journal of Scientific and Research Publications, Volume 2, Issue 4, April 2012 1 HANDS-FREE PC CONTROL CONTROLLING OF MOUSE CURSOR USING EYE MOVEMENT Akhil Gupta, Akash Rathi, Dr. Y. Radhika
Linear Threshold Units
Linear Threshold Units w x hx (... w n x n w We assume that each feature x j and each weight w j is a real number (we will relax this later) We will study three different algorithms for learning linear
Low-resolution Character Recognition by Video-based Super-resolution
2009 10th International Conference on Document Analysis and Recognition Low-resolution Character Recognition by Video-based Super-resolution Ataru Ohkura 1, Daisuke Deguchi 1, Tomokazu Takahashi 2, Ichiro
Deterministic Sampling-based Switching Kalman Filtering for Vehicle Tracking
Proceedings of the IEEE ITSC 2006 2006 IEEE Intelligent Transportation Systems Conference Toronto, Canada, September 17-20, 2006 WA4.1 Deterministic Sampling-based Switching Kalman Filtering for Vehicle
Traffic Monitoring Systems. Technology and sensors
Traffic Monitoring Systems Technology and sensors Technology Inductive loops Cameras Lidar/Ladar and laser Radar GPS etc Inductive loops Inductive loops signals Inductive loop sensor The inductance signal
ART Extension for Description, Indexing and Retrieval of 3D Objects
ART Extension for Description, Indexing and Retrieval of 3D Objects Julien Ricard, David Coeurjolly, Atilla Baskurt LIRIS, FRE 2672 CNRS, Bat. Nautibus, 43 bd du novembre 98, 69622 Villeurbanne cedex,
Tracking And Object Classification For Automated Surveillance
Tracking And Object Classification For Automated Surveillance Omar Javed and Mubarak Shah Computer Vision ab, University of Central Florida, 4000 Central Florida Blvd, Orlando, Florida 32816, USA {ojaved,shah}@cs.ucf.edu
Reconstructing 3D Pose and Motion from a Single Camera View
Reconstructing 3D Pose and Motion from a Single Camera View R Bowden, T A Mitchell and M Sarhadi Brunel University, Uxbridge Middlesex UB8 3PH [email protected] Abstract This paper presents a
Mean-Shift Tracking with Random Sampling
1 Mean-Shift Tracking with Random Sampling Alex Po Leung, Shaogang Gong Department of Computer Science Queen Mary, University of London, London, E1 4NS Abstract In this work, boosting the efficiency of
THE MS KINECT USE FOR 3D MODELLING AND GAIT ANALYSIS IN THE MATLAB ENVIRONMENT
THE MS KINECT USE FOR 3D MODELLING AND GAIT ANALYSIS IN THE MATLAB ENVIRONMENT A. Procházka 1,O.Vyšata 1,2,M.Vališ 1,2, M. Yadollahi 1 1 Institute of Chemical Technology, Department of Computing and Control
Hardware Implementation of Probabilistic State Machine for Word Recognition
IJECT Vo l. 4, Is s u e Sp l - 5, Ju l y - Se p t 2013 ISSN : 2230-7109 (Online) ISSN : 2230-9543 (Print) Hardware Implementation of Probabilistic State Machine for Word Recognition 1 Soorya Asokan, 2
Activity recognition in ADL settings. Ben Kröse [email protected]
Activity recognition in ADL settings Ben Kröse [email protected] Content Why sensor monitoring for health and wellbeing? Activity monitoring from simple sensors Cameras Co-design and privacy issues Necessity
Understanding Planes and Axes of Movement
Understanding Planes and Axes of Movement Terminology When describing the relative positions of the body parts or relationship between those parts it is advisable to use the same standard terminology.
A PHOTOGRAMMETRIC APPRAOCH FOR AUTOMATIC TRAFFIC ASSESSMENT USING CONVENTIONAL CCTV CAMERA
A PHOTOGRAMMETRIC APPRAOCH FOR AUTOMATIC TRAFFIC ASSESSMENT USING CONVENTIONAL CCTV CAMERA N. Zarrinpanjeh a, F. Dadrassjavan b, H. Fattahi c * a Islamic Azad University of Qazvin - [email protected]
A Learning Based Method for Super-Resolution of Low Resolution Images
A Learning Based Method for Super-Resolution of Low Resolution Images Emre Ugur June 1, 2004 [email protected] Abstract The main objective of this project is the study of a learning based method
A DECISION TREE BASED PEDOMETER AND ITS IMPLEMENTATION ON THE ANDROID PLATFORM
A DECISION TREE BASED PEDOMETER AND ITS IMPLEMENTATION ON THE ANDROID PLATFORM ABSTRACT Juanying Lin, Leanne Chan and Hong Yan Department of Electronic Engineering, City University of Hong Kong, Hong Kong,
VEHICLE TRACKING USING ACOUSTIC AND VIDEO SENSORS
VEHICLE TRACKING USING ACOUSTIC AND VIDEO SENSORS Aswin C Sankaranayanan, Qinfen Zheng, Rama Chellappa University of Maryland College Park, MD - 277 {aswch, qinfen, rama}@cfar.umd.edu Volkan Cevher, James
A Genetic Algorithm-Evolved 3D Point Cloud Descriptor
A Genetic Algorithm-Evolved 3D Point Cloud Descriptor Dominik Wȩgrzyn and Luís A. Alexandre IT - Instituto de Telecomunicações Dept. of Computer Science, Univ. Beira Interior, 6200-001 Covilhã, Portugal
Vision based approach to human fall detection
Vision based approach to human fall detection Pooja Shukla, Arti Tiwari CSVTU University Chhattisgarh, [email protected] 9754102116 Abstract Day by the count of elderly people living alone at home
Subspace Analysis and Optimization for AAM Based Face Alignment
Subspace Analysis and Optimization for AAM Based Face Alignment Ming Zhao Chun Chen College of Computer Science Zhejiang University Hangzhou, 310027, P.R.China [email protected] Stan Z. Li Microsoft
Tracking performance evaluation on PETS 2015 Challenge datasets
Tracking performance evaluation on PETS 2015 Challenge datasets Tahir Nawaz, Jonathan Boyle, Longzhen Li and James Ferryman Computational Vision Group, School of Systems Engineering University of Reading,
Learning Motion Categories using both Semantic and Structural Information
Learning Motion Categories using both Semantic and Structural Information Shu-Fai Wong, Tae-Kyun Kim and Roberto Cipolla Department of Engineering, University of Cambridge, Cambridge, CB2 1PZ, UK {sfw26,
Human-like Arm Motion Generation for Humanoid Robots Using Motion Capture Database
Human-like Arm Motion Generation for Humanoid Robots Using Motion Capture Database Seungsu Kim, ChangHwan Kim and Jong Hyeon Park School of Mechanical Engineering Hanyang University, Seoul, 133-791, Korea.
15.062 Data Mining: Algorithms and Applications Matrix Math Review
.6 Data Mining: Algorithms and Applications Matrix Math Review The purpose of this document is to give a brief review of selected linear algebra concepts that will be useful for the course and to develop
The Scientific Data Mining Process
Chapter 4 The Scientific Data Mining Process When I use a word, Humpty Dumpty said, in rather a scornful tone, it means just what I choose it to mean neither more nor less. Lewis Carroll [87, p. 214] In
Low-resolution Image Processing based on FPGA
Abstract Research Journal of Recent Sciences ISSN 2277-2502. Low-resolution Image Processing based on FPGA Mahshid Aghania Kiau, Islamic Azad university of Karaj, IRAN Available online at: www.isca.in,
Face detection is a process of localizing and extracting the face region from the
Chapter 4 FACE NORMALIZATION 4.1 INTRODUCTION Face detection is a process of localizing and extracting the face region from the background. The detected face varies in rotation, brightness, size, etc.
Static Environment Recognition Using Omni-camera from a Moving Vehicle
Static Environment Recognition Using Omni-camera from a Moving Vehicle Teruko Yata, Chuck Thorpe Frank Dellaert The Robotics Institute Carnegie Mellon University Pittsburgh, PA 15213 USA College of Computing
Neural Network based Vehicle Classification for Intelligent Traffic Control
Neural Network based Vehicle Classification for Intelligent Traffic Control Saeid Fazli 1, Shahram Mohammadi 2, Morteza Rahmani 3 1,2,3 Electrical Engineering Department, Zanjan University, Zanjan, IRAN
Semantic Video Annotation by Mining Association Patterns from Visual and Speech Features
Semantic Video Annotation by Mining Association Patterns from and Speech Features Vincent. S. Tseng, Ja-Hwung Su, Jhih-Hong Huang and Chih-Jen Chen Department of Computer Science and Information Engineering
REAL TIME TRAFFIC LIGHT CONTROL USING IMAGE PROCESSING
REAL TIME TRAFFIC LIGHT CONTROL USING IMAGE PROCESSING Ms.PALLAVI CHOUDEKAR Ajay Kumar Garg Engineering College, Department of electrical and electronics Ms.SAYANTI BANERJEE Ajay Kumar Garg Engineering
Real time vehicle detection and tracking on multiple lanes
Real time vehicle detection and tracking on multiple lanes Kristian Kovačić Edouard Ivanjko Hrvoje Gold Department of Intelligent Transportation Systems Faculty of Transport and Traffic Sciences University
Matlab Based Interactive Simulation Program for 2D Multisegment Mechanical Systems
Matlab Based Interactive Simulation Program for D Multisegment Mechanical Systems Henryk Josiński,, Adam Świtoński,, Karol Jędrasiak, Andrzej Polański,, and Konrad Wojciechowski, Polish-Japanese Institute
Extracting a Good Quality Frontal Face Images from Low Resolution Video Sequences
Extracting a Good Quality Frontal Face Images from Low Resolution Video Sequences Pritam P. Patil 1, Prof. M.V. Phatak 2 1 ME.Comp, 2 Asst.Professor, MIT, Pune Abstract The face is one of the important
Interactive person re-identification in TV series
Interactive person re-identification in TV series Mika Fischer Hazım Kemal Ekenel Rainer Stiefelhagen CV:HCI lab, Karlsruhe Institute of Technology Adenauerring 2, 76131 Karlsruhe, Germany E-mail: {mika.fischer,ekenel,rainer.stiefelhagen}@kit.edu
A General Framework for Tracking Objects in a Multi-Camera Environment
A General Framework for Tracking Objects in a Multi-Camera Environment Karlene Nguyen, Gavin Yeung, Soheil Ghiasi, Majid Sarrafzadeh {karlene, gavin, soheil, majid}@cs.ucla.edu Abstract We present a framework
Articulated Body Motion Tracking by Combined Particle Swarm Optimization and Particle Filtering
Articulated Body Motion Tracking by Combined Particle Swarm Optimization and Particle Filtering Tomasz Krzeszowski, Bogdan Kwolek, and Konrad Wojciechowski Polish-Japanese Institute of Information Technology
Speed Performance Improvement of Vehicle Blob Tracking System
Speed Performance Improvement of Vehicle Blob Tracking System Sung Chun Lee and Ram Nevatia University of Southern California, Los Angeles, CA 90089, USA [email protected], [email protected] Abstract. A speed
Establishing the Uniqueness of the Human Voice for Security Applications
Proceedings of Student/Faculty Research Day, CSIS, Pace University, May 7th, 2004 Establishing the Uniqueness of the Human Voice for Security Applications Naresh P. Trilok, Sung-Hyuk Cha, and Charles C.
A Study on SURF Algorithm and Real-Time Tracking Objects Using Optical Flow
, pp.233-237 http://dx.doi.org/10.14257/astl.2014.51.53 A Study on SURF Algorithm and Real-Time Tracking Objects Using Optical Flow Giwoo Kim 1, Hye-Youn Lim 1 and Dae-Seong Kang 1, 1 Department of electronices
Image Normalization for Illumination Compensation in Facial Images
Image Normalization for Illumination Compensation in Facial Images by Martin D. Levine, Maulin R. Gandhi, Jisnu Bhattacharyya Department of Electrical & Computer Engineering & Center for Intelligent Machines
An Efficient Way of Denial of Service Attack Detection Based on Triangle Map Generation
An Efficient Way of Denial of Service Attack Detection Based on Triangle Map Generation Shanofer. S Master of Engineering, Department of Computer Science and Engineering, Veerammal Engineering College,
COMPUTING CLOUD MOTION USING A CORRELATION RELAXATION ALGORITHM Improving Estimation by Exploiting Problem Knowledge Q. X. WU
COMPUTING CLOUD MOTION USING A CORRELATION RELAXATION ALGORITHM Improving Estimation by Exploiting Problem Knowledge Q. X. WU Image Processing Group, Landcare Research New Zealand P.O. Box 38491, Wellington
D-optimal plans in observational studies
D-optimal plans in observational studies Constanze Pumplün Stefan Rüping Katharina Morik Claus Weihs October 11, 2005 Abstract This paper investigates the use of Design of Experiments in observational
Template-based Eye and Mouth Detection for 3D Video Conferencing
Template-based Eye and Mouth Detection for 3D Video Conferencing Jürgen Rurainsky and Peter Eisert Fraunhofer Institute for Telecommunications - Heinrich-Hertz-Institute, Image Processing Department, Einsteinufer
EFFICIENT VEHICLE TRACKING AND CLASSIFICATION FOR AN AUTOMATED TRAFFIC SURVEILLANCE SYSTEM
EFFICIENT VEHICLE TRACKING AND CLASSIFICATION FOR AN AUTOMATED TRAFFIC SURVEILLANCE SYSTEM Amol Ambardekar, Mircea Nicolescu, and George Bebis Department of Computer Science and Engineering University
Practical Tour of Visual tracking. David Fleet and Allan Jepson January, 2006
Practical Tour of Visual tracking David Fleet and Allan Jepson January, 2006 Designing a Visual Tracker: What is the state? pose and motion (position, velocity, acceleration, ) shape (size, deformation,
Multi-view Intelligent Vehicle Surveillance System
Multi-view Intelligent Vehicle Surveillance System S. Denman, C. Fookes, J. Cook, C. Davoren, A. Mamic, G. Farquharson, D. Chen, B. Chen and S. Sridharan Image and Video Research Laboratory Queensland
Vision-based Walking Parameter Estimation for Biped Locomotion Imitation
Vision-based Walking Parameter Estimation for Biped Locomotion Imitation Juan Pedro Bandera Rubio 1, Changjiu Zhou 2 and Francisco Sandoval Hernández 1 1 Dpto. Tecnología Electrónica, E.T.S.I. Telecomunicación
Tracking in flussi video 3D. Ing. Samuele Salti
Seminari XXIII ciclo Tracking in flussi video 3D Ing. Tutors: Prof. Tullio Salmon Cinotti Prof. Luigi Di Stefano The Tracking problem Detection Object model, Track initiation, Track termination, Tracking
Normalisation of 3D Face Data
Normalisation of 3D Face Data Chris McCool, George Mamic, Clinton Fookes and Sridha Sridharan Image and Video Research Laboratory Queensland University of Technology, 2 George Street, Brisbane, Australia,
