Low-resolution Character Recognition by Video-based Super-resolution




2009 10th International Conference on Document Analysis and Recognition
978-0-7695-3725-2/09 $25.00 © 2009 IEEE. DOI 10.1109/ICDAR.2009.168

Ataru Ohkura (1), Daisuke Deguchi (1), Tomokazu Takahashi (2), Ichiro Ide (1) and Hiroshi Murase (1)
(1) Nagoya University, Furo-cho, Chikusa-ku, Nagoya, Aichi, 464-8601 Japan
okura@murase.m.is.nagoya-u.ac.jp, {ddeguchi, ide, murase}@is.nagoya-u.ac.jp
(2) Gifu Shotoku Gakuen University, Nakauzura 1-38, Gifu-shi, Gifu, 501-6194 Japan
ttakahashi@gifu.shotoku.ac.jp

Abstract

In this paper, we propose a method for recognizing low-resolution characters using a super-resolution technique. Although portable digital cameras can be used for camera-based character recognition, the captured images contain several types of noise, which make the recognition task difficult. We introduce a super-resolution phase before the recognition to enhance the resolution of images obtained from a video. The proposed method uses the subspace method to recognize characters that are integrated from multiple low-resolution characters by the super-resolution technique. Experimental results show that the proposed method improves the recognition accuracy; we confirmed that the recognition rate was 90.35% for an input size of 7 × 7 and 99.97% for an input size of 9 × 9.

1. Introduction

Portable digital cameras have become widely used in recent years, and character recognition techniques using these cameras have come into practical use. They enable us to recognize characters simply by capturing a document. For example, we can scan and input personal information from business cards or URLs from magazines. Although many character recognition methods have already been proposed, it is still difficult to recognize characters accurately when a target document contains many characters. This is because the resolution of each character becomes lower when many characters are included in the target text region (Fig. 1).

Figure 1. Example of low-resolution characters obtained from a relatively large target text region.

Elms et al. [1] have proposed a method to recognize low-resolution characters captured by an image scanner. However, the performance of this method is insufficient when the resolution of a single character is low. To address this problem, methods for integrating multiple low-resolution images obtained from a video have been proposed. Yanadume et al. [2] have proposed to integrate a sequence of similarity values given by a subspace method [3, 4, 5, 6] for the classification. It is based on the idea that the use of multiple images can reduce misclassification. However, when many of the images are closer to an incorrect category, it is difficult to avoid misclassification.

As an alternative approach to low-resolution character recognition in a video, we propose a method that uses a super-resolution technique. The recognition scheme of the proposed method is presented in Fig. 2. The proposed method reconstructs a high-resolution image from multiple low-resolution images by means of the super-resolution technique, in order to enhance the resolution of both the training data and the test data images. Hence, we expect to recognize the test data correctly.

This paper is organized as follows: Section 2 introduces the super-resolution technique used to enhance the resolution of the character images. In Section 3, the recognition step using the subspace method is described. Experimental results are presented in Section 4.

Figure 2. The recognition scheme of the proposed method. (Diagram: low-resolution input images, super-resolution, high-resolution super-resoluted image, recognition, result.)

2. Super-resolution technique

An image captured by a camera is generally influenced by several kinds of noise (for example, hand movement and optical blur). These influences are usually modeled by a PSF (Point Spread Function). Moreover, when a full document is captured in a single shot, the image of each character tends to be too small for recognition.

The super-resolution technique reconstructs a high-resolution image from multiple low-resolution images that have different subpixel shifts from each other [7, 8, 9, 10, 11]. Given low-resolution character images obtained from a sequence of video frames captured by a video camera, the method first integrates the low-resolution images into a high-resolution image by subpixel matching. The subpixel shifts between the images are calculated by matching the images resampled from integer to subpixel resolution. Then, according to the subpixel shifts, the resampled low-resolution images are integrated into a high-resolution image. Next, a target image is updated iteratively until the evaluation function I defined in Eq. (1) becomes sufficiently small, which removes the influence of optical blur from the target image.

$$ I = \sum_{y=1}^{H} \sum_{x=1}^{W} \left[ b(x, y)^{T} h - f(x, y) \right]^{2}. \tag{1} $$

Here $h$ is the target image in vector form, $b(x, y)$ is a vector of the PSF kernel at position $(x, y)$, and $f(x, y)$ is the pixel value of the integrated image at that position. $H$ and $W$ are the height and the width of the image, respectively. The evaluation function I is minimized by the conjugate gradient method proposed by Fletcher and Reeves [12]. Fig. 3 shows examples of high-resolution character images reconstructed by the super-resolution technique.

Figure 3. Examples of characters A and B reconstructed by super-resolution. (a) Low-resolution A. (b) Low-resolution B. (c) Image super-resoluted from (a). (d) Image super-resoluted from (b).

3. Recognition by the subspace method

We employ the subspace method, which is robust against slight transformations of patterns, for the recognition step of the proposed method.
The subspace method consists of two stages: a learning stage and a recognition stage. The learning stage constructs a subspace for each category, and the recognition stage calculates similarities between an input image and the subspaces.

3.1. Training data

Training data are reconstructed from multiple frame images by the super-resolution technique. As training data, we use a sequence of frame images from a video that captures a printed list of alphabetic characters, in order to obtain character images with plenty of variations for each low-resolution character.

3.2. Construction of a subspace

A subspace approximates the distribution of the training data. The learning stage finds orthogonal bases that approximate the distribution of the training data well for each category. Each i-th training image is normalized to a unit vector whose mean value is equal to 0. The normalized vector is represented as

$$ x_i = [x_1, x_2, \ldots, x_N]^{T}, \tag{2} $$

where $N$ is the number of elements of the vector, i.e. the number of pixels of the normalized image. Next, a matrix $X$ is defined as

$$ X = [x_1, x_2, \ldots, x_K], \tag{3} $$

where $K$ is the number of training data used for a category. Then an autocorrelation matrix $Q$ is calculated from the matrix $X$ by

$$ Q = X X^{T}. \tag{4} $$

A subspace is constructed for each category from the $R$ ($\le K$) eigenvectors of $Q$ that correspond to the largest $R$ eigenvalues. The r-th eigenvector is represented as $e_r^{(c)}$, where $c$ ($= 1, 2, \ldots, C$) represents a category. Fig. 4 shows examples of eigenvectors.

Figure 4. Eigenvectors obtained from 100 samples of character A. (a) The first. (b) The second. (c) The third.

3.3. Recognition

Each input character image is segmented from the high-resolution images reconstructed by the super-resolution technique. Let $y$ be an input image that is normalized in the same manner as in the learning stage. The similarity between a category $c$ and the input image $y$ is defined as

$$ L^{(c)}(y) = \sum_{r=1}^{R} \left( e_r^{(c)} \cdot y \right)^{2}, \tag{5} $$

where $(a \cdot b)$ represents the inner product of $a$ and $b$. The input image is then classified into the category that maximizes Eq. (5):

$$ \hat{c} = \arg\max_{c} L^{(c)}(y). \tag{6} $$

4. Experiments and discussion

We confirmed the effectiveness of our method experimentally. Test samples were obtained by capturing a video sequence of printed characters with a portable digital camera (Table 1). Fig. 5 shows an example of the images that were used in the experiment.

Table 1. Specification of the digital camera.
Product model    Canon PowerShot G9
Resolution       360 × 240
Frame rate       30 fps

Figure 5. Example of a captured image.

Because subpixel shifts between the captured frames are essential to the proposed method, the anti-blur function of the digital camera was kept off during this experiment. An alphanumeric Century font was used in the experiments. The number of categories C was 62, as shown in Fig. 6.
In applying the super-resolution technique, we assumed that the PSF kernel could be represented by a Gaussian filter (σ = 0.3). When constructing a subspace, the number of eigenvectors R was set to 5, and the image size W × H was normalized to 32 × 32.

4.1. Recognition accuracy by image sizes

We compared the recognition rates of the proposed method with those of a comparative method while changing the size of the characters. In the comparative method, characters are recognized from multiple low-resolution images without applying the super-resolution technique. The comparative method obtains a recognition result $\hat{c}$ by the following equation:

$$ \hat{c} = \arg\max_{c} \sum_{m=1}^{M} L^{(c)}(y_m), \tag{7} $$

where $y_m$ represents the m-th low-resolution input image, $M$ is the number of input images, and $L^{(c)}(\cdot)$ is the similarity to the subspace of category $c$ constructed from low-resolution images of $c$. The proposed method, in contrast, applies the super-resolution technique before the recognition.
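
The subspace construction of Section 3.2 and the two decision rules, Eq. (6) for a single super-resoluted image and Eq. (7) for M low-resolution images, can be sketched as follows. This is a minimal illustration in Python with NumPy, not the authors' implementation (which used the MIST library). The function names are hypothetical, and the use of a singular value decomposition to obtain the eigenvectors of Q = XX^T is an assumption made for convenience.

```python
import numpy as np

def normalize(img):
    """Flatten an image, subtract its mean, and scale it to unit length (Section 3.2)."""
    v = np.asarray(img, dtype=float).ravel()
    v = v - v.mean()
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v

def build_subspace(train_imgs, R=5):
    """Return an N x R matrix whose columns are the leading eigenvectors of Q = X X^T."""
    X = np.stack([normalize(t) for t in train_imgs], axis=1)  # N x K, as in Eq. (3)
    # Assumption: the left singular vectors of X are used as the eigenvectors of
    # X X^T; NumPy returns them in order of decreasing eigenvalue. Requires R <= K.
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :R]

def similarity(E, img):
    """Eq. (5): sum of squared inner products with the R basis vectors."""
    return float(np.sum((E.T @ normalize(img)) ** 2))

def classify_single(subspaces, img):
    """Eq. (6): category whose subspace gives the largest similarity."""
    return max(subspaces, key=lambda c: similarity(subspaces[c], img))

def classify_multi(subspaces, imgs):
    """Eq. (7): category with the largest similarity summed over all input images."""
    return max(subspaces, key=lambda c: sum(similarity(subspaces[c], y) for y in imgs))
```

Here `subspaces` is assumed to be a dictionary that maps each category label to the matrix returned by `build_subspace`. Because the basis vectors are orthonormal and the input is a unit vector, each similarity lies between 0 and 1.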

Figure 6. Character categories (Century font):
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
a b c d e f g h i j k l m n o p q r s t u v w x y z
0 1 2 3 4 5 6 7 8 9

Figure 7. Example of test data for each size. (a) 6 × 6. (b) 7 × 7. (c) 8 × 8. (d) 9 × 9.

Figure 8. Recognition rates of low-resolution characters. In the proposed method, a character was super-resoluted from 30 frames. In the comparative method, 30 low-resolution characters were used directly.

As training data, 100 characters per category (a total of 6,200 images) were prepared for each method. In the case of the proposed method, characters were super-resoluted from 30 frames. In the case of the comparative method, the captured low-resolution characters were used directly. The average sizes of the captured characters were 6 × 6, 7 × 7, 8 × 8, and 9 × 9 (Fig. 7). Recognition rates were calculated from 50 sets of test data per category for each size and each method.

Fig. 8 shows that the recognition rate of the proposed method is higher than that of the comparative method for every size. This result indicates that the super-resolution technique worked as intended, and it demonstrates the advantage of the proposed method for low-resolution character recognition.

4.2. Recognition accuracy by the number of images used for super-resolution

We also evaluated the recognition rates while changing the number of frames used for the super-resolution: 1, 2, 4, 8, 16, and 32 frames. Subspaces were constructed under the same conditions as in Section 4.1. We used test data reconstructed from characters with a size of 8 × 8. Fig. 9 shows examples of characters reconstructed from low-resolution images. Recognition rates were calculated from 50 sets of test data per category for each number of frames, where each set consists of reconstructed input characters. Fig. 10 shows that the recognition rate increases with the number of frames used for the super-resolution.
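
One way to see why additional frames help is to look at the integration step of Section 2, in which low-resolution samples are placed on a high-resolution grid according to their subpixel shifts. The sketch below is a simplified shift-and-add illustration in Python with NumPy, not the authors' implementation. It assumes that the shifts are already known (the paper estimates them by subpixel matching), that they are pure translations, and that the magnification factor is 4 (8 × 8 to 32 × 32), which the paper does not state. It omits the deblurring of Eq. (1). All function names are hypothetical.

```python
import numpy as np

def shift_and_add(frames, shifts, scale=4):
    """Average low-resolution samples on a grid that is `scale` times finer.

    frames: list of (h, w) arrays.
    shifts: list of (dy, dx) offsets of each frame's sampling grid, in
            low-resolution pixels (assumed known here).
    Returns the averaged grid and a mask of cells that received a sample.
    """
    h, w = frames[0].shape
    H, W = h * scale, w * scale
    acc = np.zeros((H, W))
    cnt = np.zeros((H, W))
    ys, xs = np.mgrid[0:h, 0:w]
    for frame, (dy, dx) in zip(frames, shifts):
        # Nearest high-resolution cell for every low-resolution sample.
        hy = np.rint((ys + dy) * scale).astype(int)
        hx = np.rint((xs + dx) * scale).astype(int)
        ok = (hy >= 0) & (hy < H) & (hx >= 0) & (hx < W)
        np.add.at(acc, (hy[ok], hx[ok]), frame[ok])
        np.add.at(cnt, (hy[ok], hx[ok]), 1.0)
    filled = cnt > 0
    acc[filled] /= cnt[filled]
    return acc, filled

def coverage_by_frame_count(counts=(1, 2, 4, 8, 16, 32), size=8, scale=4, seed=0):
    """Fraction of high-resolution cells filled when using the first n frames."""
    rng = np.random.default_rng(seed)
    shifts = rng.uniform(0.0, 1.0, size=(max(counts), 2))  # random subpixel shifts
    frames = [np.ones((size, size))] * max(counts)
    return {n: float(shift_and_add(frames[:n], shifts[:n], scale)[1].mean())
            for n in counts}
```

Under these assumptions a single frame can fill at most 1/16 of the 32 × 32 grid, and the filled fraction can only grow as frames with new subpixel shifts are added. Cells that remain empty would have to be interpolated, which is one plausible reason why few frames give poorer reconstructions.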
The result in Fig. 10 indicates that the super-resolution technique compensates for the insufficient resolution of individual characters by integrating multiple low-resolution images.

Figure 9. The super-resoluted characters. (a) 2 frames. (b) 8 frames. (c) 32 frames.

Figure 10. Recognition rates by changing the number of frames used for the super-resolution.

4.3. Discussion

From Fig. 8 and Fig. 10, we confirmed that the proposed method could recognize characters more accurately than the comparative method. However, the recognition accuracy declines as the input character size decreases. As shown in Fig. 11, when a character was super-resoluted from an image size of 9 × 9, even the similar characters I and 1 could be distinguished. Meanwhile, when super-resoluted from 7 × 7, it is difficult to distinguish between them. We consider that such phenomena caused the decline of the recognition rate at low resolutions in Fig. 8.

Figure 11. Example of super-resoluted similar characters. (a) I (from 7 × 7). (b) 1 (from 7 × 7). (c) I (from 9 × 9). (d) 1 (from 9 × 9).

5. Conclusion

In this paper, we proposed a recognition method for low-resolution characters using the super-resolution technique. Experimental results showed the effectiveness of our method for the recognition of low-resolution characters. Future work includes using the structure of characters for low-resolution character recognition. For example, we will apply edge information to the super-resolution to improve the recognition accuracy.

6. Acknowledgement

The authors would like to thank their colleagues for useful discussions. Parts of this research were supported by the JSPS Grants-In-Aid for Scientific Research. The proposed method was implemented by using the MIST library (http://mist.suenaga.m.is.nagoya-u.ac.jp/).

References

[1] A. J. Elms, S. Procter, and J. Illingworth, The advantage of using an HMM-based approach for faxed word recognition, Int. J. Document Analysis and Recognition, vol.1, no.1, pp.18-36, February 1998.
[2] S. Yanadume, Y. Mekada, I. Ide, and H. Murase, Recognition of very low-resolution characters from motion images captured by a portable digital camera, Proc. 2004 Pacific-Rim Conf. on Multimedia, Lecture Notes in Computer Science, vol.3332, pp.489-496, Springer-Verlag, December 2004.
[3] E. Oja, Subspace methods of pattern recognition, Research Studies Press, Hertfordshire, UK, December 1983.
[4] H. Murase, H. Kimura, M. Yoshimura, and Y. Miyake, An improvement of the auto-correlation matrix in the pattern matching method and its application to handprinted HIRAGANA recognition (in Japanese), IECE Trans., vol.J64-D, no.3, pp.276-283, March 1981.
[5] H. Murase and S. K. Nayar, Visual learning and recognition of 3-D objects from appearance, Int. J. Computer Vision, vol.14, no.1, pp.5-24, January 1995.
[6] S. Omachi and H. Aso, A qualitative adaptation of subspace method for character recognition, Systems and Computers in Japan, vol.31, issue.12, pp.1-10, Wiley, November 2000.
[7] S. C. Park, M. K. Park, and M. G. Kang, Super-resolution image reconstruction: A technical overview, IEEE Signal Process. Mag., vol.20, no.3, pp.21-36, May 2003.
[8] M. Tanaka and M. Okutomi, A fast algorithm for reconstruction-based super-resolution and its accuracy evaluation, Systems and Computers in Japan, vol.38, issue.7, pp.44-52, Wiley, June 2007.
[9] B. C. Tom and A. K. Katsaggelos, Reconstruction of a high-resolution image by simultaneous registration, restoration, and interpolation of low-resolution images, Proc. 1995 IEEE Int. Conf. on Image Processing, vol.2, pp.539-542, October 1995.
[10] R. R. Schultz and R. L. Stevenson, Extraction of high-resolution frames from video sequences, IEEE Trans. Image Process., vol.5, no.6, pp.996-1011, June 1996.
[11] H. Stark and P. Oskoui, High resolution image recovery from image-plane arrays, using convex projections, J. Opt. Soc. Am. A, vol.6, pp.1715-1726, November 1989.
[12] R. Fletcher and C. Reeves, Function minimization by conjugate gradients, Computer Journal, vol.7, no.2, pp.149-154, British Computer Society, 1964.