Efficient Fingertip Tracking and Mouse Pointer Control for a Human Mouse


Efficient Fingertip Tracking and Mouse Pointer Control for a Human Mouse

Jiyoung Park and Juneho Yi
School of Information and Communication Engineering
Sungkyunkwan University, Suwon 440-746, Korea
{jiyp, jhyi}@ece.skku.ac.kr

Abstract. This paper discusses the design of a working system that visually recognizes hand gestures for the control of a window based user interface. We present a method for tracking the fingertip of the index finger using a single camera. Our method is based on the CAMSHIFT algorithm and reliably tracks the particular hand poses used in the system against complex backgrounds. We describe how the location of the fingertip is mapped to a location on the monitor, and how it is both necessary and possible to smooth the path of the fingertip location using a physical model of a mouse pointer. Our method tracks in real time, yet does not absorb a major share of computational resources. The performance of our system shows great promise that this methodology can be used to control computers in the near future.

1 Introduction

Computer vision has a significant role to play in the human-computer interaction devices of the future. Many research results [1-5, 7, 8] have been reported in the literature that attempt to replace currently used devices such as the mouse and keyboard with a vision based natural interface. Rehg and Kanade described DigitEyes [4], a model-based hand tracking system that uses two cameras to recover the state of a hand. Kjeldsen and Kender suggested a simple visual gesture system to manipulate the graphical user interface of a computer [5]. Andrew et al. presented a method for tracking the 3D position of a finger using a single camera placed several meters away from the user [7]. DigitEyes is based on a 3D model of the human hand and is computationally very expensive. The performance of the other two approaches is not robust to complex backgrounds.
This research presents a system that efficiently tracks the fingertip of the index finger for the purpose of application to a human mouse. The main features of the proposed system are as follows. First, adaptive online training of the skin color distribution is employed to reliably detect human hand regions. Segmentation of hand regions is robust to noise and is very accurate because a user-specific lookup table is used. Second, we have proposed a tracking method based on the CAMSHIFT algorithm [1], where the method is particularly tuned for tracking hand regions. As can be seen in the experimental results, the tracking performance is robust to cluttered backgrounds having colors similar to those of skin regions. Third, a novel method to display the

J.L. Crowley et al. (Eds.): ICVS 2003, LNCS 2626, pp. 88-97, 2003. Springer-Verlag Berlin Heidelberg 2003

mouse pointer gives users a natural feeling of controlling a mouse by physically modeling the mouse pointer. As a result, the motion of the mouse pointer on the monitor is very smooth even though the actual tracking rate is as low as 5-6 Hz. The problem due to the difference of resolution between the input image from the camera and the monitor has also been resolved. The experimental results show great promise that the proposed system can be applied to control computers in the near future.

This paper is organized as follows. The following section gives an overview of our system. Sections 3 and 4 present the proposed methods for fingertip detection and the display of the mouse pointer on the monitor, respectively. Experimental results are reported in Section 5.

2 System Overview

Let us briefly overview the entire system. Fig. 1 shows a block diagram of the computation in the system. Details of each computation will be presented in later sections.

Fig. 1. System overview

The system is divided into three parts: adaptive on-line training of skin colors for hand region segmentation, fingertip tracking, and mouse pointer display. The first part is on-line training of skin colors that is robust to changes of illumination, hand poses, and complex backgrounds. The system learns the probability distribution of the skin colors of the current user. The second part is fingertip tracking (the dashed region in Fig. 1). Hand

regions are segmented in the input image sequence using the learned skin color lookup table computed in the first part. After segmentation of the hand regions, we compute the location of the fingertip. To rapidly detect the fingertip location, we employ a region of interest (ROI) based on the size and the orientation of the hand region. The last part is the control of the mouse pointer. The mapping between camera coordinates and monitor coordinates of the fingertip location is computed. In addition, smoothing of the mouse pointer path is performed so that the user has a natural feeling, as if he or she were using a real mouse.

3 Fingertip Detection

3.1 Adaptive Online Training of Skin Colors

We use an on-line training method as can be seen in Fig. 2. The user is asked to place his or her hand so that the rectangular box stays within the hand. The method gets the system to learn variations of skin colors due to changes of local illumination and hand poses. The advantage of this method is that an accurate distribution of the current user's skin colors is obtained.

Fig. 2. The on-line training procedure to learn variations of skin colors due to changes of local illumination and hand poses

There has been a research report [1] claiming that "humans are all the same color (hue)." However, when skin color is detected using only hue information, we find it difficult to locate the skin regions of the user because detection is affected by colors in the environment having values of H (hue) similar to those of skin regions. Fig. 3 shows a couple of examples where the CAMSHIFT algorithm is applied to track a hand region. The intersection of the two lines represents the center of the detected hand region. However, the result of skin color segmentation is wrong because of (a) the red sleeve of the jacket and (b) the color of the bookshelf.
We have employed the H (hue) and S (saturation) values of HSV color space for learning the probability distribution of skin colors. A 2D lookup table indexed by H and S is computed as the result of on-line training of skin colors. The entries of this table are the probability densities of hue and saturation values.
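The H-S lookup table described above can be sketched as follows. This is a minimal illustration, not the paper's implementation; the bin counts and the OpenCV-style value ranges (H in [0, 180), S in [0, 256)) are assumptions.

```python
from collections import Counter

H_BINS, S_BINS = 32, 32   # quantization of the H-S plane (an assumption)

def hs_bin(h, s):
    """Quantize an (H, S) pair into a lookup-table bin."""
    return (int(h) * H_BINS // 180, int(s) * S_BINS // 256)

def build_hs_lookup(train_pixels):
    """Build the 2D hue-saturation probability lookup table from (H, S)
    pixels sampled inside the on-line training box."""
    counts = Counter(hs_bin(h, s) for h, s in train_pixels)
    total = sum(counts.values())
    return {b: n / total for b, n in counts.items()}

def skin_probability(h, s, lut):
    """Look up the learned skin probability for a pixel's (H, S) values."""
    return lut.get(hs_bin(h, s), 0.0)
```

Back-projecting `skin_probability` over every pixel of an input frame yields the skin probability image on which the hand region is segmented.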

Fig. 3. Examples of wrong segmentation results of the CAMSHIFT algorithm when only the hue (H) value is used. (a) the effect of the red sleeve of the jacket, (b) the effect of the color of the bookshelf

3.2 Detection and Tracking of Hand Regions

Determining the location and size of an ROI. A great deal of computation time is saved by confining the necessary computation to an ROI. The hand region is tracked in the current frame within the ROI that was computed in the previous frame. In the ROI of the current frame, the hand region is detected, but the centers of the ROI and the detected hand region do not coincide because of the hand motion. The center of the hand region within the current ROI becomes the center of the ROI to be used in the next frame. Noise around the hand region may cause the propagation of errors in detection of the hand region: when there is noise around the hand region, the location (i.e. center) of the ROI to be used in the next frame is not computed correctly, and this affects detection of the hand region. We reduce the effect of noise by discarding pixels in the skin region whose probability density is less than some threshold value. After thresholding, we obtain a binary image, B(x, y), in the ROI, where B(x, y) = 1 for pixels belonging to the hand region. However, in some cases part of the hand region is not detected due to this thresholding operation.

The size of an ROI is determined by considering the size and the orientation of the hand region. The center coordinates (x_c, y_c) of the hand region can be computed using the 0th and 1st order moments [6] as follows:

    x_c = M_{10} / M_{00},   y_c = M_{01} / M_{00},                       (1)

where M_{00} = \sum_{x,y} B(x,y), M_{10} = \sum_{x,y} x B(x,y), and M_{01} = \sum_{x,y} y B(x,y).

The 2D orientation of the detected hand region is also easy to obtain using the 2nd order moments, which are computed as follows:

    M_{20} = \sum_{x,y} x^2 B(x,y),   M_{02} = \sum_{x,y} y^2 B(x,y).     (2)
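The moment and centroid computations of equation (1) can be sketched directly. This is an illustrative reimplementation, not the paper's code; the binary mask is represented as a list of 0/1 rows.

```python
def moments_and_centroid(B):
    """0th/1st order moments of a binary mask B and the centroid
    (x_c, y_c) of equation (1)."""
    m00 = m10 = m01 = 0
    for y, row in enumerate(B):
        for x, v in enumerate(row):
            m00 += v          # M00 = sum of B(x, y)
            m10 += x * v      # M10 = sum of x * B(x, y)
            m01 += y * v      # M01 = sum of y * B(x, y)
    return m00, m10 / m00, m01 / m00   # M00, x_c, y_c
```

For a centered 2x2 block of ones, for example, the centroid falls at (1.5, 1.5), the middle of the block.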

Then the orientation of the object (major axis) is

    \theta = \frac{1}{2} \arctan\left( \frac{b}{a - c} \right),           (3)

where

    a = M_{20}/M_{00} - x_c^2,   b = 2 (M_{11}/M_{00} - x_c y_c),   c = M_{02}/M_{00} - y_c^2.

The size of the ROI is determined from the orientation \theta of the hand region, using the moments computed above and scaling the horizontal and vertical lengths of the ROI as in equation (4). The horizontal and vertical lengths of the ROI are denoted by R_x and R_y, respectively:

    R_x = s_x \sqrt{M_{00}},   R_y = s_y \sqrt{M_{00}},                   (4)

where s_x = |\cos\theta| + 1, s_y = |\sin\theta| + 1, and 0.1 is added to the larger of s_x and s_y. The scaling factors s_x and s_y are determined from the statistical knowledge that the ratio of the elongated axis of a human hand to its perpendicular axis does not exceed 2.1:1.

Hand tracking. The algorithm for tracking a hand region is based on the CAMSHIFT algorithm. The main difference between the CAMSHIFT algorithm and ours is the way the ROI is determined. Fig. 4 shows an example of three successive updates of the search window when the CAMSHIFT algorithm is applied to track a hand region. The search window tends to include noise regions next to the previous search window that have colors similar to those of skin regions. We can see that the size of the search window has grown and that the center of the detected hand region is not correctly computed.

Fig. 4. Growing of the search window in the CAMSHIFT algorithm

In contrast to CAMSHIFT, our algorithm computes the ROI in a way that is particularly tuned for a human hand. Therefore, the tracking performance of our algorithm is improved, especially when there are colors in the background similar to those of the skin region. A block diagram of the algorithm is shown in Fig. 5.
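Equations (3) and (4) can be sketched together. Note that our reconstruction of equation (4) from the garbled original (the square root of M_00 and the +0.1 added to the larger scale factor) is an interpretation and should be treated as an approximation of the paper's formula.

```python
import math

def orientation_and_roi(B):
    """Orientation theta (eq. 3) and ROI lengths R_x, R_y (eq. 4)
    of a binary hand mask B given as a list of 0/1 rows."""
    pts = [(x, y) for y, row in enumerate(B) for x, v in enumerate(row) if v]
    m00 = len(pts)
    xc = sum(x for x, _ in pts) / m00
    yc = sum(y for _, y in pts) / m00
    a = sum(x * x for x, _ in pts) / m00 - xc * xc        # M20/M00 - xc^2
    b = 2 * (sum(x * y for x, y in pts) / m00 - xc * yc)  # 2(M11/M00 - xc*yc)
    c = sum(y * y for _, y in pts) / m00 - yc * yc        # M02/M00 - yc^2
    theta = 0.5 * math.atan2(b, a - c)                    # eq. (3)
    sx = abs(math.cos(theta)) + 1                         # eq. (4) scale factors
    sy = abs(math.sin(theta)) + 1
    if sx >= sy:
        sx += 0.1
    else:
        sy += 0.1
    return theta, sx * math.sqrt(m00), sy * math.sqrt(m00)
```

For a horizontal bar of pixels, theta is 0 and the ROI comes out wider than it is tall, as expected.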

Fig. 5. A block diagram of hand tracking

3.3 Computation of the Fingertip Location

If there were no noise, skin regions would be detected perfectly. In that case, to compute the location of the fingertip, we could simply use the coordinates of boundary pixels [7] or compute the center coordinates of the first knuckle of the index finger. However, detection of the skin region cannot be perfect due to noise, so we have used a noise-robust way to detect the fingertip location. We assume that the fingertip lies in the first knuckle of the index finger. The length l of the elongated axis of the hand region can be computed using equation (5) [8], and the end point of l is taken as the location of the fingertip (refer to Fig. 6):

    l = \sqrt{ \frac{(a + c) + \sqrt{b^2 + (a - c)^2}}{2} },              (5)

where a, b and c are the same as in equation (3).

Fig. 6. Computation of the fingertip location in the cases of \theta < 0 (left) and \theta > 0 (right)
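The axis length of equation (5) follows directly from the central second moments a, b, c of equation (3). A minimal sketch, again over a binary mask given as 0/1 rows:

```python
import math

def major_axis_length(B):
    """Length l of the elongated axis of the hand region (eq. 5),
    using the central second moments a, b, c of eq. (3)."""
    pts = [(x, y) for y, row in enumerate(B) for x, v in enumerate(row) if v]
    m00 = len(pts)
    xc = sum(x for x, _ in pts) / m00
    yc = sum(y for _, y in pts) / m00
    a = sum(x * x for x, _ in pts) / m00 - xc * xc
    b = 2 * (sum(x * y for x, y in pts) / m00 - xc * yc)
    c = sum(y * y for _, y in pts) / m00 - yc * yc
    return math.sqrt(((a + c) + math.sqrt(b * b + (a - c) ** 2)) / 2)
```

For a one-pixel-high row of 20 pixels, b = c = 0 and l reduces to the square root of a, the variance of x along the bar.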

4 Display of a Mouse Pointer on the Monitor

Even if the fingertip is detected correctly, we might not get a natural motion of the mouse pointer, due to the following problems. First, because of the limitation on the tracking rate of commonly available PC class hardware, we can merely get discontinuous coordinates of fingertip locations. Accordingly, if these coordinates were used for the display on the monitor without any smoothing, the mouse pointer would jump around in a very unnatural fashion. Second, if the location of the fingertip in each frame were converted to monitor coordinates directly, the difference of resolution between the input image from the camera and the monitor would make it very difficult to position the mouse pointer accurately. Moreover, jitter can occur even though the hand remains still: a small movement of the fingertip in the input image can cause a significantly large movement of the mouse pointer on the monitor.

To circumvent these problems, we describe how the fingertip location in the input image is mapped to a location on the monitor, and how it is both necessary and possible to smooth the trajectory of the mouse pointer by physically modeling the mouse pointer. For simplicity, let us describe our method for the one-dimensional case. In the i-th frame, the difference between the location of the current mouse pointer (X_{i-1}) and the fingertip location (H_i) is converted to an amount of force that accelerates the mouse pointer in the appropriate direction, as in equation (6):

    D_i = H_i - X_{i-1},   F_i = f(D_i).                                  (6)

If the displacement of the fingertip location in the input image is smaller than a predetermined threshold, the minimum force, F_min, is set to zero because the mouse pointer should remain still on the monitor. The threshold is set to 1% of the resolution of each axis of the monitor. The mouse pointer is treated as an object having mass and velocity as well as position.

Its motion can be described using the second-order differential equation for the position x(t) based on Newton's law:

    \frac{d^2 x(t)}{dt^2} = \frac{F}{m},                                  (7)

where F and m denote the amount of force and the mass of the object, respectively. For the solution of equation (7), we use the Euler-Cromer algorithm [9] as follows:

    v_{i+1} = v_i + \frac{F_i}{m} \Delta t,   x_{i+1} = x_i + v_{i+1} \Delta t,   (8)

where \Delta t is the time interval between two successive frames. Our observations show that we need to achieve a tracking rate of more than 20 fps for a user to perceive a natural movement of the mouse pointer. If a system cannot support this rate, a value of k that satisfies \Delta t / k = 0.03 s is chosen. That is, k intermediate locations of the mouse pointer are generated on the monitor during the time between two image frames so that the resulting path of the mouse pointer is smooth. This one-dimensional method is simply extended to two dimensions, x and y, to be applied to our case.
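The one-dimensional pointer model of equations (6)-(8) can be sketched as below. The linear force law f(D) = gain * D and the velocity damping factor are our assumptions for illustration; the paper only specifies F_i = f(D_i), the 1% dead-zone threshold, and the Euler-Cromer update.

```python
def smooth_pointer(x0, fingertip, m=1.0, dt=0.03, gain=5.0,
                   damping=0.8, axis_res=1000):
    """Simulate the physically modelled pointer along one axis.
    fingertip: one fingertip coordinate H_i per frame (monitor units)."""
    x, v = x0, 0.0
    dead_zone = 0.01 * axis_res          # 1% of the monitor resolution
    path = []
    for h in fingertip:
        d = h - x                        # eq. (6): D_i = H_i - X_{i-1}
        f = 0.0 if abs(d) < dead_zone else gain * d
        v = damping * (v + (f / m) * dt)  # Euler-Cromer: velocity first...
        x += v * dt                       # ...then position, with new velocity
        path.append(x)
    return path
```

With a constant fingertip target, the simulated pointer accelerates toward it and settles once inside the dead zone, rather than jumping there in one frame; a fingertip displacement below the dead zone leaves the pointer untouched.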

5 Experimental Results

Fig. 7 shows examples of the tracking results for three lighting conditions: (a) natural and artificial lighting, (b) natural lighting, and (c) artificial lighting. The rectangular boxes represent the ROI, which depends on the size and the orientation of the hand region. The location of the fingertip is reliably computed in spite of the background having colors similar to those of skin regions.

Fig. 7. Tracking results of the fingertip location. (a) natural and artificial lighting, (b) natural lighting, (c) artificial lighting

Table 1 summarizes the experimental results. The accuracy measures the frequency of correct detection of the fingertip location.

Table 1. The accuracy of detecting a fingertip location

                                  # total frames   # detected frames   accuracy (%)
  natural and artificial lighting       400               373              93.25
  natural lighting                      400               368              92
  artificial lighting                   400               387              96.75
  total                                1200              1128              94

We can see that the tracking performance is quite robust to lighting conditions. Fig. 8 shows the smoothing result of the mouse pointer path when the tracking rate is 6 Hz. With raw tracking results, the monitor coordinates of the mouse pointer are quite discontinuous and not smooth, as can be seen in Fig. 8 (a). The path of the mouse pointer is smoothed using the proposed method that physically models the mouse pointer.

Fig. 8. An example trajectory of the mouse pointer. (a) raw tracking results (6 Hz), (b) a smoothed version of the raw tracking results

6 Conclusion

In this paper we have proposed a method to detect and track the fingertip of the index finger for the purpose of application to a human mouse system. The proposed method for fingertip tracking shows good performance under cluttered backgrounds and variations of local illumination due to hand poses, by virtue of online training of skin colors and optimal determination of the ROI. In addition, a natural motion of the mouse pointer is achieved by smoothing the path of the fingertip location using a physical model of the mouse pointer. The experimental results show great promise that the proposed method can be applied to manipulate computers in the near future.

Acknowledgements. This work was supported by grant number R01-1999-000-00339-0 from the Basic Research Program of the Korea Science and Engineering Foundation.

References

1. Gary R. Bradski, "Computer Vision Face Tracking For Use in a Perceptual User Interface", Intel Technology Journal Q2, 1998.
2. R. Kjeldsen and J. Kender, "Interaction with On-Screen Objects using Visual Gesture Recognition", IEEE Conference on Computer Vision and Pattern Recognition, pp. 788-793, 1997.
3. C. Jennings, "Robust finger tracking with multiple cameras", International Workshop on Recognition, Analysis, and Tracking of Faces and Gestures in Real-Time Systems, pp. 152-160, 1999.
4. J. Rehg and T. Kanade, "Visual Analysis of High DOF Articulated Objects with Application to Hand Tracking", CMU Tech. Report CMU-CS-95-138, Carnegie Mellon University, April 1993.

5. R. Kjeldsen and J. Kender, "Towards the use of Gesture in Traditional User Interfaces", International Conference on Automatic Face and Gesture Recognition, pp. 151-156, 1996.
6. B. K. P. Horn, "Robot Vision", MIT Press, 1986.
7. Andrew Wu, Mubarak Shah and N. da Vitoria Lobo, "A Virtual 3D Blackboard: 3D Finger Tracking using a Single Camera", Proceedings of the Fourth International Conference on Automatic Face and Gesture Recognition, pp. 536-543, 2000.
8. W. T. Freeman, K. Tanaka, J. Ohta and K. Kyuma, "Computer vision for computer games", Proceedings of the Second International Conference on Automatic Face and Gesture Recognition, pp. 100-105, 1996.
9. Harvey Gould and Jan Tobochnik, "An Introduction to Computer Simulation Methods", Addison-Wesley Publishing Company, 1996.