Real Time Skeleton Tracking based Human Recognition System using Kinect and Arduino


Satish Prabhu, Jay Kumar Bhuchhada, Amankumar Dabhi, Pratik Shetty

ABSTRACT

The Microsoft Kinect sensor provides high-resolution RGB and depth sensing and is becoming available for widespread use. It supports object tracking, object detection and recognition, as well as human activity analysis, hand gesture analysis and 3D mapping. Facial expression detection is widely used in human-computer interfaces, and the Kinect depth camera can be used to detect common facial expressions. The face is tracked using the MS Kinect SDK 2.0, which uses the depth map to create a 3D frame model of the face. By recognizing facial expressions from facial images, a number of applications in the field of human-computer interaction can be built. This paper describes the working of the Kinect and its use in human skeleton tracking.

General Terms
Skeleton tracking algorithm & Action Recognition

Keywords
Skeleton Tracking, Kinect, Pose Estimation, Arduino, Actions

1. INTRODUCTION

Mobile robots have thousands of applications, from autonomously mapping and mowing a lawn to urban search-and-rescue with autonomous ground vehicles. One important future application is fighting wars in place of humans: humans would fight virtually, and whatever move a human makes, the mobile robot would copy. To achieve this, the robot must be taught how to copy human actions, so this project deals with building a robot that copies human actions. The idea is to use one of the most powerful capabilities of the Kinect, skeleton tracking, to build a servo-driven robot that copies human actions efficiently. Natural interaction applied to the robot has an important outcome: no physical connection is needed between the controller and the robot.
The project will also be extended to implement network connectivity, so that the robot can be controlled remotely from anywhere in the world. It uses skeleton tracking so that the Kinect can detect the user's joint and limb movements in space. The user data are mapped to servo angles and sent to the Arduino board controlling the servos of the robot.

The skeleton tracking feature maps the depth image of the human and tracks the positions of the joints of the human body. These positions are provided to the computer, which in turn sends a signal to the Arduino board in the form of a pulse train for every joint, making the corresponding servo motor rotate in accordance with the pulses. Eight servos are placed on the shoulders, elbows, hips and knees of the robot. A servo motor is a DC motor whose rotation depends on the number of signal pulses applied to it. Assuming that for one pulse the motor rotates through 1 degree, 90 pulses rotate it through 90 degrees, 180 pulses through 180 degrees, and so on.

The second important part of the paper is angle calculation. The skeleton information from the Kinect is stored in the computer, which runs a program that calculates the angle of inclination of every joint of the human body. Each calculated angle is then converted into a pulse train for the corresponding servo motor connected to the Arduino. According to the received pulses, the servo motor rotates through the angle observed by the Kinect sensor. Hence the robot copies the action of the human skeleton.

The third important part of the project is to extend it over the internet, so that the robot can be operated from anywhere around the globe. To do so, the user sets the external IP address of the computer in the Arduino program; through this the robot can emulate human actions from anywhere on earth.
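The angle calculation and angle-to-pulse mapping described above can be sketched in a few lines. This is a minimal illustration, not the project's actual code: the 0-180 degree range and the 500-2500 microsecond pulse endpoints are typical hobby-servo assumptions that should be checked against the servo datasheet.

```python
import math

def joint_angle_deg(a, b, c):
    """Angle at joint b (in degrees) formed by the 3-D points a-b-c,
    e.g. shoulder-elbow-wrist for the elbow angle."""
    u = [a[i] - b[i] for i in range(3)]          # vector b -> a
    v = [c[i] - b[i] for i in range(3)]          # vector b -> c
    dot = sum(u[i] * v[i] for i in range(3))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(x * x for x in v))
    cosang = max(-1.0, min(1.0, dot / norm))     # clamp against rounding error
    return math.degrees(math.acos(cosang))

def angle_to_pulse_us(angle_deg, min_us=500, max_us=2500):
    """Map a joint angle in [0, 180] degrees to a servo pulse width in
    microseconds (endpoint values are assumed, datasheet-dependent)."""
    angle_deg = max(0.0, min(180.0, angle_deg))  # clamp to the servo's range
    return min_us + (angle_deg / 180.0) * (max_us - min_us)

# A right-angle elbow maps to the servo's mid-range pulse width.
print(angle_to_pulse_us(joint_angle_deg((1, 0, 0), (0, 0, 0), (0, 1, 0))))
```

On the Arduino side the computed pulse width would simply be handed to the servo driver; the computation above runs on the PC that receives the Kinect joint positions.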
2. RELATED WORK

The project deals with making a robot that copies human actions. Microsoft's Xbox Kinect has proved useful for detecting human actions and gestures, so in this paper we propose to use the Kinect camera to capture human gestures and relay these actions to a robot controlled by the Kinect and an Arduino board.

2.1 Existing Systems

Previously, depth images were recorded with the help of silhouettes, which are the contours of the body part whose depth image is to be formed [1]. Silhouettes reject the shadow of the body and the colour of the clothes the person is wearing; they simply capture the border of the body. But it is very difficult for a digital system to predict the motion of a body part of an unknown person, since this type of model is based on a priori knowledge of the contours. Because humans around the world differ in size, length and many other physical parameters, it becomes difficult to store all such information. Therefore using silhouettes simply reduces the scope of depth images. [1]

The two major steps leading from a captured motion to a reconstructed one are: marker reconstruction from 2-D marker sets to 3-D positions, and marker tracking from one frame to the next, in 2-D and/or 3-D.

However, although 2-D and 3-D tracking ensure the identification of a large number of markers from one frame to another, ambiguities, sudden accelerations or occlusions often cause erroneous reconstructions or breaks in the tracking links. For this reason, it has proved necessary to increase the procedure's robustness by using the skeleton to drive the reconstruction and tracking process, introducing a third step: the accurate identification of each 3-D marker and a complete marker inventory in each frame. The approaches to these issues are addressed in the following paragraphs, starting with the human model used, keeping in mind that the entire approach is based on constant interaction between the model and the marker processing tasks above.

2.1.1 Skeleton model

The skeleton model is controlled by 32 degrees of freedom grouped in 9 joints in 3-D space. This is a simplified version of the complete skeleton generally used; it does not include detailed hands and feet.

Fig 1: Default Skeletal Joint Locations

2.1.2 Stereo triangulation

3-D markers are reconstructed from the 2-D data using stereo triangulation.

2.1.3 Binocular reconstruction

After reconstructing the 3-D markers in the first frame, the number of reconstructed markers is compared with the number of markers known to be carried by the subject. As all remaining processing is automatic, it is absolutely essential that all markers be identified in the first frame: any marker not present in the first frame is lost for the entire sequence. Therefore, if the number of reconstructed markers is insufficient, a second stereo matching is performed, this time also taking into account markers seen in only two views. [2]

There are three classes of techniques for tracking without markers. First, learning-based methods, which rely on prior probabilities for human poses and therefore assume limited motions.
Second, model-free methods, which do not use any a priori knowledge and recover articulated structures automatically. However, the articulated structure is likely to change over time, for instance when a new articulation is encountered, making identification and tracking difficult. Third, model-based approaches, which fit and track a known model using image information.

2.2 Proposed Approach

The paper aims at limiting the required a priori knowledge as much as possible while keeping the robustness of the method reasonable for most interaction applications. Hence, the given approach belongs to the third category [3]. Among model-based methods, a large class of approaches uses an a priori surface or volume to represent the human body, combining both shape and motion information [4]. The corresponding models range from fine mesh models to coarser models based on generalized cylinders, ellipsoids or other geometric shapes. To avoid complex joint estimation of both shape and motion, most approaches in this class assume known body dimensions. However, this strongly limits flexibility and becomes intractable in interaction systems where unknown persons are supposed to interact.

A more efficient solution is to find a model that reduces the shape information. To this purpose, a skeletal model can be used. This model does not include any volumetric information and hence has fewer dependencies on body dimensions. In addition, limb lengths tend to follow natural biological laws, whereas human shapes vary a lot among the population. Recovering motion using skeletal models has not been widely investigated; one approach fits a skeletal structure with the help of hand/feet/head tracking, but volumetric dimensions are still required for the arm and leg limbs. Given the complications and errors in these techniques, the use of the Kinect in this project tackles the difficulties encountered in the search for a robust approach. [3]
3. KINECT AND ITS WORKING

The Microsoft Kinect sensor provides high-resolution RGB and depth sensing and is becoming available for widespread use. It supports object tracking, object detection and recognition, human activity analysis, hand gesture analysis and 3D mapping, and can be used to detect and distinguish between different kinds of objects. The depth information is analysed to identify the different parts of fingers or hands, or the entire body, in order to interpret gestures from a human standing in front of the sensor. The Kinect is thus an effective tool for target tracking and action recognition. [5]

The Kinect camera consists of an infrared (IR) projector, a colour camera and an IR camera. The depth sensor consists of the IR projector combined with the IR camera, which is a monochrome complementary metal-oxide-semiconductor (CMOS) sensor. The IR projector is an IR laser that passes through a diffraction grating and turns into a set of IR dots. [6]

The relative geometry between the IR projector and the IR camera, as well as the projected IR dot pattern, are known. If a dot observed in an image matches a dot in the projector pattern, it can be reconstructed in 3D using triangulation. Because the dot pattern is relatively random, the matching between the IR image and the projector pattern can be done in a straightforward way by comparing small neighbourhoods using, for example, normalized cross-correlation. [6]

In skeletal tracking, a human body is represented by a number of joints corresponding to body parts such as the head, neck, shoulders and arms. Each joint is represented by its 3D coordinates. The goal is to determine all the 3D parameters of these joints in real time, allowing fluent interactivity with the limited computation resources allocated on the Xbox 360, so

as not to impact gaming performance. Rather than trying to determine the body pose directly in this high-dimensional space, Jamie Shotton and his team proposed per-pixel body-part recognition as an intermediate step. Shotton's team treats the segmentation of a depth image as a per-pixel classification task (no pairwise terms or conditional random field are necessary) [4]. Evaluating each pixel separately avoids a combinatorial search over the different body joints. For training data, realistic synthetic depth images of humans of many shapes and sizes, in highly varied poses sampled from a large motion-capture database, are generated. A deep randomized decision forest classifier is then trained, which avoids overfitting by using hundreds of thousands of training images. Simple, discriminative depth-comparison image features yield 3D translation invariance while maintaining high computational efficiency. [6]

4. SKELETON TRACKING ALGORITHM

The depth maps captured by the Kinect sensor are processed by a skeleton-tracking algorithm. The depth maps of the utilized dataset were acquired using the OpenNI API [7]. The OpenNI high-level skeleton-tracking module is used for detecting the performing subject and tracking a set of joints of his/her body. More specifically, the OpenNI tracker detects the position of the following set of joints in 3D space: Torso, Neck, Head, Left shoulder, Left elbow, Left wrist, Right shoulder, Right elbow, Right wrist, Left hip, Left knee, Left foot, Right hip, Right knee and Right foot. The position of joint gi is given by the vector pi(t) = [x y z]T, where t denotes the frame for which the joint position is located and the origin of the orthogonal XYZ coordinate system is placed at the centre of the Kinect sensor.
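The dot-pattern matching and triangulation described in Section 3 can be illustrated with a toy 1-D sketch. This is not the Kinect's actual pipeline: the patch size, the synthetic intensity row, and the 580 px focal length and 7.5 cm baseline (rough Kinect-v1 figures) are all assumptions.

```python
import math

def ncc(p, q):
    """Normalized cross-correlation between two equal-length patches."""
    n = len(p)
    mp, mq = sum(p) / n, sum(q) / n
    num = sum((p[i] - mp) * (q[i] - mq) for i in range(n))
    den = math.sqrt(sum((x - mp) ** 2 for x in p) * sum((x - mq) ** 2 for x in q))
    return num / den if den else 0.0

def best_match(patch, row, max_disp):
    """Slide `patch` along `row` of the reference dot pattern and return
    the offset with the highest NCC score (the matched dot position)."""
    w = len(patch)
    scores = [(ncc(patch, row[d:d + w]), d)
              for d in range(min(max_disp, len(row) - w) + 1)]
    return max(scores)[1]

def disparity_to_depth(disp_px, focal_px=580.0, baseline_m=0.075):
    """Triangulate depth from disparity: Z = f * b / d."""
    return focal_px * baseline_m / disp_px
```

A dot observed at image offset x whose best reference-pattern match lies at offset d has disparity |x - d|, and the depth follows from the triangulation formula; the real sensor performs this densely over small 2-D neighbourhoods.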
4.1 Action recognition

Action recognition can be further divided into three subtasks.

4.1.1 Pose estimation

The aim of this step is to estimate, for every frame t, a continuously updated orthogonal basis of vectors that represents the subject's pose. The calculation is based on the fundamental consideration that the orientation of the subject's torso is the most characteristic quantity of the subject during the execution of any action, and for that reason it can be used as a reference. For pose estimation, the positions of the following three joints are taken into account: Left shoulder, Right shoulder and Right hip. These are joints around the torso area whose relative positions remain almost unchanged during the execution of any action. The motivation for using these three joints, instead of directly estimating the position of the Torso joint and the respective normal vector, is to reach a more accurate estimation of the subject's pose. Note that the Right hip joint was preferred over the obvious choice of the Torso joint, so that the orthogonal basis of vectors is estimated from joints with larger inter-joint distances, which is more likely to lead to a more accurate pose estimate. However, no significant deviation in action recognition performance was observed when the Torso joint was used instead. [8]

4.1.2 Action Representation

For efficient action recognition, an appropriate representation is required that satisfactorily handles the differences in appearance, human body type and execution of actions among individuals. For that purpose, the angles of the joints' relative positions are used in this work, which proved more discriminative than using, for example, the joints' normalized coordinates directly. Additionally, building on the fundamental idea of the previous section, all angles are computed using the Torso joint as reference, i.e.
the origin of the spherical coordinate system is placed at the Torso joint position. For computing the proposed action representation, only a subset of the supported joints is used, because the trajectories of some joints mainly contain redundant or noisy information. To this end, only the joints corresponding to the upper and lower body limbs were considered after experimental evaluation, namely Left shoulder, Left elbow, Left wrist, Right shoulder, Right elbow, Right wrist, Left knee, Left foot, Right knee and Right foot. The velocity vector is approximated by the displacement vector between two successive frames, i.e. vi(t) = pi(t) - pi(t-1). The estimated spherical angles and angular velocities for frame t constitute the frame's observation vector. Collecting the computed observation vectors for all frames of a given action segment forms the respective action observation sequence h, which is used for performing HMM-based recognition, as described in the sequel. [8]

4.1.3 HMM-based recognition

A Markov model is a stochastic model describing a sequence of possible events in which the probability of each event depends only on the state attained in the previous event. This model is too restrictive to be applicable to the current problem of interest, so the concept of the Markov model is extended to form the Hidden Markov Model (HMM). An HMM is a doubly embedded stochastic process: the underlying stochastic process is not observable (it is hidden) and can only be observed through a second set of stochastic processes that produce the sequence of observations [12]. HMMs are employed in this work for performing action recognition, due to their suitability for modelling sequential patterns. In particular, a set of J HMMs is employed, where an individual HMM is introduced for every supported action aj.
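The per-frame observation vector just described can be sketched as follows, assuming joints are plain (x, y, z) tuples in the Kinect coordinate system; this is a minimal illustration of the representation, not the authors' implementation.

```python
import math

def spherical_angles(p, torso):
    """Spherical angles (theta, phi) of joint p, with the origin of the
    spherical coordinate system placed at the Torso joint."""
    x, y, z = (p[i] - torso[i] for i in range(3))
    r = math.sqrt(x * x + y * y + z * z)
    if r == 0.0:                 # joint coincides with the torso
        return 0.0, 0.0
    theta = math.acos(z / r)     # polar angle
    phi = math.atan2(y, x)       # azimuth
    return theta, phi

def observation_vector(joints_t, joints_prev, torso_t, torso_prev):
    """Frame observation: spherical angles of each joint plus angular
    velocities approximated by differences between successive frames."""
    obs = []
    for p, q in zip(joints_t, joints_prev):
        th, ph = spherical_angles(p, torso_t)
        th0, ph0 = spherical_angles(q, torso_prev)
        obs.extend([th, ph, th - th0, ph - ph0])
    return obs
```

The paper derives angular velocities from the displacement vector vi(t); differencing the angles directly, as above, is a simplification chosen here for brevity.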
Each HMM receives as input the action observation sequence h (described above) and, at the evaluation stage, returns a posterior probability P(aj | h), which represents the observation sequence's fitness to the particular model. The developed HMMs were implemented using the software libraries of the Hidden Markov Model Toolkit (HTK). [8]
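The evaluation stage can be illustrated with a minimal, self-contained sketch. The paper uses HTK with continuous observations; the two-state discrete toy models below are assumptions made purely to show the mechanics: each action's HMM scores the sequence with the (scaled) forward algorithm, and the best-scoring action is selected.

```python
import math

def forward_loglik(obs_seq, pi, A, B):
    """Log-likelihood of a discrete observation sequence under an HMM,
    computed with the forward algorithm and per-step scaling."""
    n = len(pi)
    alpha = [pi[i] * B[i][obs_seq[0]] for i in range(n)]
    loglik = 0.0
    for t in range(len(obs_seq)):
        if t > 0:   # propagate through transitions, then weight by emission
            alpha = [B[j][obs_seq[t]] * sum(alpha[i] * A[i][j] for i in range(n))
                     for j in range(n)]
        s = sum(alpha)
        loglik += math.log(s)
        alpha = [a / s for a in alpha]   # rescale to avoid underflow
    return loglik

def recognize(obs_seq, models):
    """Return the action aj whose HMM assigns obs_seq the highest likelihood."""
    return max(models, key=lambda name: forward_loglik(obs_seq, *models[name]))

# Toy example: one HMM per action (initial probs, transitions, emissions).
A = [[0.9, 0.1], [0.1, 0.9]]
start = [0.5, 0.5]
models = {
    "wave": (start, A, [[0.9, 0.1], [0.9, 0.1]]),  # mostly emits symbol 0
    "kick": (start, A, [[0.1, 0.9], [0.1, 0.9]]),  # mostly emits symbol 1
}
print(recognize([0, 0, 0, 1], models))  # wave
```

In the real system each action's HMM would be trained on observation sequences h of spherical angles and angular velocities, and the argmax over posteriors selects the recognized action.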

Fig 2: Initialization of Kinect Camera

5. METHODOLOGY

The entire process is divided into two parts: initialization and working.

5.1 Initialization

For smooth, error-free operation the Kinect is initialized to its default mode. Initialization is done with the help of a calibration card provided by Microsoft; this card helps to align the Tx and Rx infrared sensors of the Kinect. Fig 1 indicates the default joint locations that are used: these are treated as the reference joints, and with their help the other joints are calibrated.

5.2 Working

Initially, infrared (IR) rays are emitted from the IR transmitter of the Kinect camera. The emitted rays are received by the Kinect receiver and stored in its database. Since the system is monitoring for human joints, it waits until the human joints are recognized. If any object other than the skeleton joints is recognized, it discards the frame and restarts the scanning of the next frame until joints are recognized. The black frame in Fig 2 indicates that neither the object nor the skeletal joints have been detected; this results in a blackened frame, and the white spots on the black frame are due to noise present in the environment.

Once the joints are recognized, the Kinect uses the HMM algorithm for joint estimation and predicts the future movements. The recognized joint information is converted into PWM pulses by the programmed PWM pulse generator on the Arduino board. The generated PWM pulses serve as input to the servo motors, which perform an angular tilt according to the captured movement. Since this is real time, the entire process is repeated continuously for each frame.

Fig 3: Working of stage 1

6. RESULT

The framework required for the robot can be seen in fig 6. Along with the robot, a PCB was made to interface the HS-311 and HS-55 servo motors.
The PCB interfacing for the servos was built so that the connections remain proper and the assembly looks neat and compact, as can be seen in fig 5. Hence the Kinect camera was successfully interfaced through OpenNI and the skeleton was tracked.

Fig 4: Working of stage II

7. CONCLUSION

After analysing the studies mentioned above, it can be concluded that the Kinect is an incredible piece of technology, which has revolutionized the use of depth sensors in the last few years. Because of its relatively low cost, the Kinect has served as a great incentive for many projects in the most diverse fields, such as robotics and medicine, and some great results have been achieved. Throughout this project, it was possible to verify that although the information obtained by the Kinect may not be as accurate as that obtained by some other devices (e.g., laser sensors), it is accurate enough for many real-life applications, which makes the Kinect a powerful and useful device in many research fields.

Thus a real-time motion-capture robot was integrated and tested using the Kinect camera. The paper proposed natural, gesture-based communication with a robot, and the skeleton tracking algorithm has been explained for further work. The results are better than those of the techniques that were used before the Kinect camera.

Learning from demonstration is the scientific field which studies one of the easier ways a human has to deal with a humanoid robot: mimicking the particular task the subject wants to see reproduced by the robot. To achieve this, a gesture recognition system is required. The paper presents a novel and cheap humanoid robot implementation along with a visual, gesture-based interface, which enables users to deal with it. Users can control the robot just by mimicking, in front of the depth camera, the gestures they want the robot to perform. This should be seen as preliminary work, where elementary interaction tools are provided, and should be extended in many different directions depending on the tasks of the robot. [11]

Fig 5: PCB with Servo Interfaced

Fig 6: Robot Layout

8. FUTURE SCOPE

With the progress in Kinect technology over the last decade, it can be seen as a revolutionary tool in robotics. Further modifications may be as follows:

1. Only a few joints are tracked here. The tracking algorithm can be expanded to track all the joints in the human body, giving more reliable and robust copying of human actions.

2. The Kinect camera used is not portable, so reducing the Kinect camera to the size of a mobile-phone camera would be a good future development.

3. The servo motors used could be further investigated and changed to make the system more robust and natural.

4. The robot built is fixed. Instead it can be made mobile, so that it will not only copy human actions but even move around like a human.

5. It is possible to implement this project over a network: the Kinect camera feeds data into the network, the robot receives the data from the network, and thus it is possible to control the robot from any corner of the world.

9. REFERENCES

[1] Agarwal, A., Triggs, B., "3D human pose from silhouettes by relevance vector regression", in Proc. IEEE International Conference on Computer Vision and Pattern Recognition, pp. 882-888, 2004.

[2] Lorna Herda, Pascal Fua, Ralf Plänkers, "Skeleton-based motion capture for robust reconstruction of human motion", in Proc. Computer Animation 2000, pp. 77-83, 2000.

[3] Clement Menier, Edmond Boyer, Bruno Raffin, "3D Skeleton-Based Body Pose Recovery", in Proc. 3rd International Symposium on 3D Data Processing, Visualization and Transmission, pp. 389-396, 2006.

[4] Jamie Shotton, Toby Sharp, Alex Kipman, Andrew Fitzgibbon, "Real-Time Human Pose Recognition in Parts from Single Depth Images", in Proc. Conference on Computer Vision and Pattern Recognition, pp. 1297-1304, 2011.

[5] Dnyaneshwar R. Uttaarwar, "Motion Computing using Microsoft Kinect", in Proc. National Conference on Advances in Computing, 2013.

[6] Z. Zhang, "Microsoft Kinect Sensor and Its Effect", IEEE Multimedia Magazine, vol. 19, no. 2, pp. 4-10, April-June 2012.

[7] James Ashley and Jarrett Webb (Eds.), Beginning Kinect Programming with the Microsoft Kinect SDK, Apress, 2011.

[8] Georgios Th. Papadopoulos, Apostolos Axenopoulos and Petros Daras, "A Compact Multi-view Descriptor for 3D Object Retrieval", in Content-Based Multimedia Indexing, pp. 115-119, 2009.

[9] Michael Margolis (Ed.), Arduino Cookbook, O'Reilly, 2011.

[10] Jack Purdum (Ed.), Beginning C for Arduino, Apress, 2011.

[11] Giuseppe Broccia, Marco Livesu, and Riccardo Scateni, "Gestural Interaction for Robot Motion Control", in Proc. Eurographics Italian Chapter Conference, 2011.

[12] Lawrence R. Rabiner, "A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition", Proc. IEEE, vol. 77, no. 2, pp. 257-286, 1989.

IJCA TM: www.ijcaonline.org