Gaze Estimation Using Image Segmentation, Feature Extraction, & Neural Networks, by Katie Morgan


Introduction Human gaze estimation is a rapidly growing field of research with many useful applications. It is especially useful for human-computer interaction (HCI) applications, such as video conferencing and eye-typing. Many eye and gaze tracking methods exist, but not all are desirable for the average computer user, because they are intrusive, expensive, or require frequent recalibration.

Previous Works One method [10] uses an infrared camera and an infrared light to obtain bright and dark pupil images, which highlight the pupil and cause an infrared glint on the eye. By fitting an ellipse to the pupil and using the glint coordinates, user gaze is accurately approximated. However, this method requires expensive and specialized equipment.

Previous Works Another method [3] uses a simple camera to capture grayscale images of the user. A small eye image is segmented from this larger image and is then input to a neural network, which returns the gaze position coordinates. Although straightforward, it was thought that the neural network in this method would be very large, since it would need to represent images containing 600 pixel values.

Proposed Feature Extraction Method The method researched combines the approaches of the infrared method and the image neural network method:
1. A small eye image is segmented from the larger image of the user.
2. The pupil is identified and an ellipse is fit around it.
3. The area of the sclera (white of the eye) to the left and right of the pupil is calculated.
4. Parameters from the fitted ellipse and the ratio of the left and right sclera areas are saved in feature vectors used to train 2 neural networks to approximate screen gaze coordinates.
By using feature vectors instead of images to train the neural networks, it was hypothesized that the networks would be smaller and more efficient.
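The five-element feature vector described in the steps above can be sketched as a small container; the field names here are illustrative, not from the original implementation:

```python
# Hypothetical sketch of the 5-element feature vector described above.
from dataclasses import dataclass, astuple

@dataclass
class GazeFeatures:
    ellipse_angle: float   # orientation of the fitted pupil ellipse (degrees)
    center_x: float        # x coordinate of the ellipse center (pixels)
    center_y: float        # y coordinate of the ellipse center (pixels)
    axis_ratio: float      # major axis / minor axis of the fitted ellipse
    sclera_ratio: float    # left sclera area / right sclera area

    def to_vector(self):
        """Return the flat feature vector used to train the two networks."""
        return list(astuple(self))

features = GazeFeatures(12.5, 40.0, 22.0, 1.8, 0.65)
print(features.to_vector())  # [12.5, 40.0, 22.0, 1.8, 0.65]
```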

Equipment and Setup The system setup was designed to run in an open office environment. Users sat about one meter away from the camera in front of the left monitor, after which the camera was manually adjusted once to put the subject's right eye near the center of the resulting images.

Feature Extraction Eye Segmentation Using thresholding and known information about average pupil features, extract the small image of the user's right eye from the larger image.
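A minimal sketch of this threshold-based segmentation step, assuming a grayscale NumPy frame; the dark-pixel fraction and crop-window size are illustrative guesses, not values from the actual system:

```python
# Sketch of threshold-based eye segmentation on a grayscale frame.
# dark_fraction and window are assumed values, not from the original system.
import numpy as np

def segment_eye(image, dark_fraction=0.001, window=(30, 60)):
    """Crop a fixed window around the darkest cluster, assumed to be the pupil."""
    # Keep only the darkest fraction of pixel values.
    cutoff = np.quantile(image, dark_fraction)
    dark = np.argwhere(image <= cutoff)
    # The centroid of the dark pixels serves as a crude pupil location.
    cy, cx = dark.mean(axis=0).astype(int)
    h, w = window
    top, left = max(cy - h // 2, 0), max(cx - w // 2, 0)
    return image[top:top + h, left:left + w]

rng = np.random.default_rng(0)
frame = rng.integers(100, 255, size=(240, 320)).astype(np.uint8)
frame[100:110, 150:160] = 0              # synthetic dark "pupil" blob
eye = segment_eye(frame)
print(eye.shape)  # (30, 60)
```

A real implementation would also use the known pupil-feature checks mentioned above to reject dark clusters that are not an eye (hair, eyebrows, shadows).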

Feature Extraction Pupil Ellipse Fitting Identify the pupil in the eye image and fit an ellipse around it.

Feature Extraction Pupil Ellipse Fitting Histogram normalization is performed on the eye image to enhance the brightest and darkest values. This image is thresholded so that only those points with the lowest 3% of pixel values remain. The resulting clusters of points are analyzed by size, and the cluster with the largest area is assumed to be the pupil. The image indices corresponding to the pupil cluster points are input to an ellipse fitting method to determine the parameters of the ellipse which best fits the pupil points. The angle of ellipse orientation, the x and y coordinates of the ellipse center, and the ratio of its major to minor axis are the 4 ellipse parameters used to train the networks.
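The text does not say which ellipse fitter was used; one common, self-contained stand-in fits an ellipse from the second-order central moments of the pupil cluster. The sketch below uses that moment-based fit, so the function name and approach are assumptions:

```python
# Illustrative ellipse fit from second-order image moments; the moment-based
# approach stands in for whichever fitter the original implementation used.
import numpy as np

def fit_pupil_ellipse(points):
    """points: (N, 2) array of (x, y) pupil-cluster indices.
    Returns [angle_deg, cx, cy, major/minor axis ratio]."""
    cx, cy = points.mean(axis=0)
    x, y = points[:, 0] - cx, points[:, 1] - cy
    mu20, mu02, mu11 = (x * x).mean(), (y * y).mean(), (x * y).mean()
    # Orientation of the principal axis of the cluster.
    angle = 0.5 * np.degrees(np.arctan2(2 * mu11, mu20 - mu02))
    # Eigenvalues of the covariance are proportional to squared semi-axes.
    common = np.sqrt((mu20 - mu02) ** 2 + 4 * mu11 ** 2)
    lam_max = (mu20 + mu02 + common) / 2
    lam_min = (mu20 + mu02 - common) / 2
    return [angle, cx, cy, float(np.sqrt(lam_max / lam_min))]

# Synthetic elongated cluster, roughly twice as wide as tall.
xs, ys = np.meshgrid(np.arange(-10, 11), np.arange(-5, 6))
pts = np.column_stack([xs.ravel(), ys.ravel()]).astype(float)
angle, cx, cy, ratio = fit_pupil_ellipse(pts)
print(round(ratio, 2))  # 1.91
```

The four returned values correspond to the four ellipse parameters listed above (orientation angle, center x, center y, axis ratio).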

Feature Extraction Sclera Detection & Area Calculation Detect the sclera to the left and right of the pupil and calculate the ratio of their areas.

Feature Extraction Sclera Detection & Area Calculation Starting from the center of the pupil and moving left, compare the difference in value between neighboring pixels. If two pixels are found whose difference is greater than the difference threshold, then the edge between the iris and sclera has been found. If no edge is detected, the difference threshold decreases and the process is repeated until an edge is found. The maximum and minimum values in a neighborhood around the first sclera pixel determine the range of pixel values of the sclera. Using these values, a recursive algorithm searches a set window for neighboring pixels within the value range, increasing the sclera area count until all appropriate pixels are found. The same procedure is repeated to find the right sclera area. Once both counts are obtained, their ratio is calculated and used as the 5th element in the feature vector.
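A sketch of this search on a synthetic eye image; the thresholds are guesses, and the recursive region growth described above is written iteratively here (a breadth-first queue) to avoid recursion limits, so the structure but not the exact code mirrors the original:

```python
# Illustrative sclera search: walk outward from the pupil center until a
# large brightness jump marks the iris/sclera edge, then grow a region
# around that seed. Threshold values are assumptions.
import numpy as np
from collections import deque

def sclera_area(eye, pupil_x, pupil_y, step=-1, diff_thresh=40, tol=20):
    """Count sclera pixels on one side of the pupil (step=-1 left, +1 right)."""
    row = eye[pupil_y].astype(int)
    # Lower the edge threshold until a bright jump is found.
    while diff_thresh > 0:
        x = pupil_x
        while 0 < x + step < len(row) - 1:
            if row[x + step] - row[x] > diff_thresh:
                seed = (pupil_y, x + step)
                val = int(eye[seed])
                return _grow(eye, seed, val - tol, val + tol)
            x += step
        diff_thresh -= 10
    return 0

def _grow(eye, seed, lo, hi):
    """Breadth-first region growth over pixels whose value lies in [lo, hi]."""
    seen, queue = {seed}, deque([seed])
    while queue:
        y, x = queue.popleft()
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            if (0 <= ny < eye.shape[0] and 0 <= nx < eye.shape[1]
                    and (ny, nx) not in seen and lo <= eye[ny, nx] <= hi):
                seen.add((ny, nx))
                queue.append((ny, nx))
    return len(seen)

eye = np.full((20, 40), 50, dtype=np.uint8)   # dark iris-valued background
eye[:, :10] = 220                             # bright band as the left sclera
left = sclera_area(eye, pupil_x=20, pupil_y=10)
print(left)  # 200 pixels in the synthetic sclera band
```

In the full pipeline this would be called once per side, and the ratio of the two counts appended as the fifth feature-vector element.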

Training Data Collection Four black, square points were displayed in the centers of the four screen quadrants. One by one, in random order, each of these points was made much larger than the remaining three. As a point was enlarged, the program allowed a one-second delay for the user to focus on the point. The interface captured three different images to avoid having an image of the subject blinking. The images and the position of the corresponding gaze point were appended to the end of a data file.

Training Data Collection Segmentation was then used to attempt to find the subject's right eye. If an eye candidate was not found, the program segmented the second image, and then the third. If no eye candidates were found, the test data was rejected. Once an eye candidate was found, the segmented image was saved to the same file as the data and full image. This procedure was repeated for all 4 gaze points. The operator then manually checked the segmented images of the saved data. If any segmented image was not of an eye, then that image, along with the full image and data corresponding to it, was deleted. In this way, the neural networks would only be trained with relevant data.

Timing Results

Average runtimes, from 77 images (combined method) and 50 successfully processed images (segmentation and feature extraction):

Step                  Avg. Time (s)
Image segmentation    0.3190
Feature extraction    0.0300
Combined              0.3596

Speed comparison (frames per second):

Method        Image Segmentation   Feature Extraction   Combined
IR            --                   --                   20
Image NN      --                   --                   20
This method   3.13                 33.33                2.78

Improving segmentation time will improve the combined speed.

Feature Extraction Results The method was tested on twelve test subjects of varying age, gender, and race (Asian and Caucasian). The method performed fairly well for many test subjects.

Feature Extraction Results Once the image was correctly segmented, features were accurately extracted in most cases. Glare, squints, and eyelashes were the usual causes of incorrect extraction. A correlation between gaze position and ellipse orientation can be seen: although there is some difference for each test subject, the main ellipse shape and orientation remain similar for each of the four gaze points.

Feature Extraction Results The feature extraction procedure also performs well for some users wearing glasses, although the results are dependent on the type of glasses worn, which affects the position of the glare on the lens.

Incorrect Results Examples of failure cases: hair detected; eye corner detected; one-sided sclera; eyebrow fit with ellipse.

Future Work Training the Neural Networks At this time, there is not sufficient training data for appropriate training of the neural networks; there was simply not enough time before this presentation to collect the hundreds of necessary feature vectors. The 50 feature vectors obtained in the data collection came from tests of many different people, and to effectively train the neural networks for this many subjects, many more feature vectors per person would be required. Until the feature extraction and segmentation procedure can be improved and sped up, the neural network data would also be inconsistent and incorrect.

Future Work Training the Neural Networks Eventually, there will be two neural networks, each trained with the same large set of input feature vectors and target gaze coordinates. One network will approximate the vertical coordinate and the other the horizontal, since there is a larger difference between horizontal movements than vertical movements. Many other methods use General Regression Neural Networks because of their faster training times. However, they often use varying numbers of layers, a capability not available in Matlab, so the use of Radial Basis Neural Networks will also be examined during future study, since the two network types are very similar.
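The planned setup of two radial-basis networks over the same feature vectors can be illustrated with a toy NumPy sketch; the fixed centers, Gaussian width, least-squares readout, and the synthetic data are all assumptions, not the Matlab implementation the author intends to use:

```python
# Toy sketch of two radial-basis networks, one per screen coordinate,
# trained on the same 5-element feature vectors. All parameters are assumed.
import numpy as np

class RBFNet:
    """Minimal RBF network: fixed centers, linear least-squares output layer."""
    def __init__(self, centers, width=1.0):
        self.centers, self.width = centers, width

    def _design(self, X):
        # Gaussian activations of each sample against each center.
        d2 = ((X[:, None, :] - self.centers[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * self.width ** 2))

    def fit(self, X, y):
        self.w, *_ = np.linalg.lstsq(self._design(X), y, rcond=None)
        return self

    def predict(self, X):
        return self._design(X) @ self.w

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 5))                 # stand-in 5-element feature vectors
gaze_x, gaze_y = X @ rng.normal(size=5), X @ rng.normal(size=5)
centers = X[:20]                             # a subset of samples as RBF centers
net_x = RBFNet(centers, width=2.0).fit(X, gaze_x)   # horizontal coordinate
net_y = RBFNet(centers, width=2.0).fit(X, gaze_y)   # vertical coordinate
pred = net_x.predict(X[:1])
```

Training two single-output networks rather than one two-output network lets each coordinate be fit independently, which matches the reasoning above about horizontal movement varying more than vertical.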

Future Work There is much future work to be done to explore and improve this method: testing the method with different head angles; experimenting with the eye segmentation portion of the method; using edge enhancement to improve sclera detection; and improving the accuracy of the system to divide the screen into 8 regions. Implementing the method with a web camera instead of the Sony DVR camera would also be a desirable future study, in order to decrease the cost of this method even further.

Works Cited
[1] A. García, A. Pérez, F. Sánchez, J.L. Pedraza, M.L. Córdoba, M.L. Muñoz, and R. Méndez, "A Precise Eye-Gaze Detection and Tracking System," in Proceedings of the 11th International Conference in Central Europe on Computer Graphics, Visualization and Computer Vision, 2003.
[2] A.S. Johansen, D.W. Hansen, J.P. Hansen, and M. Nielsen, "Eye Typing using Markov and Active Appearance Models," Sixth IEEE Workshop on Applications of Computer Vision, pp. 132-136, 2002.
[3] D. Machin, L.-Q. Xu, and P. Sheppard, "A Novel Approach to Real-time Non-intrusive Gaze Finding," in British Machine Vision Conference, pp. 428-437, 1998.
[4] J. Nusairat, N.O. Nawari, and R. Liang, "Artificial Intelligence Techniques for the Design and Analysis of Deep Foundations," The Electronic Journal of Geotechnical Engineering, 1999.
[5] J. Yang and J. Zhu, "Subpixel Eye Gaze Tracking," Proceedings of the Fifth IEEE International Conference on Automatic Face and Gesture Recognition, p. 131, 2002.
[6] K. Fujimura, Q. Ji, and Z. Zhu, "Combining Kalman Filtering and Mean Shift for Real Time Eye Tracking Under Active IR Illumination," in International Conference on Pattern Recognition, pp. IV 318-321, 2002.
[7] K. Fujimura, Q. Ji, and Z. Zhu, "Real-Time Eye Detection and Tracking Under Various Light Conditions," Eye Tracking Research & Application, ACM Press, New York, NY, USA, pp. 139-144, 2002.
[8] N. Mukawa and T. Ohno, "A Free-head, Simple Calibration, Gaze Tracking System That Enables Gaze-Based Interaction," Proceedings of the Eye Tracking Research & Applications Symposium, ACM Press, New York, NY, USA, pp. 115-122, 2004.
[9] Q. Ji and X. Yang, "Real Time 3D Face Pose Discrimination Based On Active IR Illumination," Proceedings, 16th International Conference on Pattern Recognition, pp. 310-313, 2002.
[10] Q. Ji and Z. Zhu, "Eye and Gaze Tracking for Interactive Graphic Display," ACM International Conference Proceeding Series, Vol. 24, ACM Press, New York, NY, USA, pp. 79-85, 2002.
[11] Y. Ebisawa, "Improved Video-Based Eye-Gaze Detection Method," IEEE Trans. Instr. and Meas., vol. 47, no. 4, pp. 948-955, 1998.