Recognition Method for Handwritten Digits Based on Improved Chain Code Histogram Feature



Similar documents
ECE 533 Project Report Ashish Dhawan Aditi R. Ganesan

Signature Region of Interest using Auto cropping

Signature Segmentation from Machine Printed Documents using Conditional Random Field

How To Filter Spam Image From A Picture By Color Or Color

Keywords image processing, signature verification, false acceptance rate, false rejection rate, forgeries, feature vectors, support vector machines.

The Role of Size Normalization on the Recognition Rate of Handwritten Numerals

DIAGONAL BASED FEATURE EXTRACTION FOR HANDWRITTEN ALPHABETS RECOGNITION SYSTEM USING NEURAL NETWORK

APPLYING COMPUTER VISION TECHNIQUES TO TOPOGRAPHIC OBJECTS

A Method of Caption Detection in News Video

Multimodal Biometric Recognition Security System

Modelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches

RFID and Camera-based Hybrid Approach to Track Vehicle within Campus

A Lightweight and Effective Music Score Recognition on Mobile Phone

Research on Chinese financial invoice recognition technology

Using Lexical Similarity in Handwritten Word Recognition

Journal of Industrial Engineering Research. Adaptive sequence of Key Pose Detection for Human Action Recognition

Open Access A Facial Expression Recognition Algorithm Based on Local Binary Pattern and Empirical Mode Decomposition

Efficient on-line Signature Verification System

Palmprint Recognition. By Sree Rama Murthy kora Praveen Verma Yashwant Kashyap

3)Skilled Forgery: It is represented by suitable imitation of genuine signature mode.it is also called Well-Versed Forgery[4].

Colour Image Segmentation Technique for Screen Printing

High-Performance Signature Recognition Method using SVM

DEVNAGARI DOCUMENT SEGMENTATION USING HISTOGRAM APPROACH

Automatic Extraction of Signatures from Bank Cheques and other Documents

A Dynamic Approach to Extract Texts and Captions from Videos

AN IMPROVED DOUBLE CODING LOCAL BINARY PATTERN ALGORITHM FOR FACE RECOGNITION

Face detection is a process of localizing and extracting the face region from the

Image Compression through DCT and Huffman Coding Technique

Unconstrained Handwritten Character Recognition Using Different Classification Strategies

Chapter 2 The Research on Fault Diagnosis of Building Electrical System Based on RBF Neural Network

Visual Structure Analysis of Flow Charts in Patent Images

DESIGN OF DIGITAL SIGNATURE VERIFICATION ALGORITHM USING RELATIVE SLOPE METHOD

Poker Vision: Playing Cards and Chips Identification based on Image Processing

SIGNATURE VERIFICATION

Building an Advanced Invariant Real-Time Human Tracking System

Euler Vector: A Combinatorial Signature for Gray-Tone Images

FPGA Implementation of Human Behavior Analysis Using Facial Image

Circle Object Recognition Based on Monocular Vision for Home Security Robot

Neural Network based Vehicle Classification for Intelligent Traffic Control

Tracking and Recognition in Sports Videos

Analecta Vol. 8, No. 2 ISSN

Character Image Patterns as Big Data

REAL TIME TRAFFIC LIGHT CONTROL USING IMAGE PROCESSING

Detection of Heart Diseases by Mathematical Artificial Intelligence Algorithm Using Phonocardiogram Signals

Pattern Recognition of Japanese Alphabet Katakana Using Airy Zeta Function

Image Classification for Dogs and Cats

III. SEGMENTATION. A. Origin Segmentation

Handwritten Signature Verification using Neural Network

Face Recognition in Low-resolution Images by Using Local Zernike Moments

The Delicate Art of Flower Classification

Signature verification using Kolmogorov-Smirnov. statistic

Introduction to Pattern Recognition

Document Image Retrieval using Signatures as Queries

Categorical Data Visualization and Clustering Using Subjective Factors

A Study of Automatic License Plate Recognition Algorithms and Techniques

An Imbalanced Spam Mail Filtering Method

Galaxy Morphological Classification

Assessment. Presenter: Yupu Zhang, Guoliang Jin, Tuo Wang Computer Vision 2008 Fall

Electroencephalography Analysis Using Neural Network and Support Vector Machine during Sleep

Blog Post Extraction Using Title Finding

Low-resolution Character Recognition by Video-based Super-resolution

A Study on M2M-based AR Multiple Objects Loading Technology using PPHT

Railway Expansion Joint Gaps and Hooks Detection Using Morphological Processing, Corner Points and Blobs

Comparison of K-means and Backpropagation Data Mining Algorithms

Method of Fault Detection in Cloud Computing Systems

Automatic License Plate Recognition using Python and OpenCV

The enhancement of the operating speed of the algorithm of adaptive compression of binary bitmap images

A fast multi-class SVM learning method for huge databases

Manjeet Kaur Bhullar, Kiranbir Kaur Department of CSE, GNDU, Amritsar, Punjab, India

BEHAVIOR BASED CREDIT CARD FRAUD DETECTION USING SUPPORT VECTOR MACHINES

What is the Right Illumination Normalization for Face Recognition?

Face Recognition For Remote Database Backup System

Spam Detection Using Customized SimHash Function

Automatic 3D Reconstruction via Object Detection and 3D Transformable Model Matching CS 269 Class Project Report

Comparing the Results of Support Vector Machines with Traditional Data Mining Algorithms

Hybrid Lossless Compression Method For Binary Images

Thresholding technique with adaptive window selection for uneven lighting image

Recognition of Single Handed Sign Language Gestures using Contour Tracing Descriptor

AnalysisofData MiningClassificationwithDecisiontreeTechnique

K-means Clustering Technique on Search Engine Dataset using Data Mining Tool

Server Load Prediction

Volume 2, Issue 3, March 2014 International Journal of Advance Research in Computer Science and Management Studies

LEAF COLOR, AREA AND EDGE FEATURES BASED APPROACH FOR IDENTIFICATION OF INDIAN MEDICINAL PLANTS

LOCAL SURFACE PATCH BASED TIME ATTENDANCE SYSTEM USING FACE.

How To Predict Web Site Visits

Super-resolution Reconstruction Algorithm Based on Patch Similarity and Back-projection Modification

Recognizing Cats and Dogs with Shape and Appearance based Models. Group Member: Chu Wang, Landu Jiang

A Content based Spam Filtering Using Optical Back Propagation Technique

How To Solve The Kd Cup 2010 Challenge

Cursive Handwriting Recognition for Document Archiving

Determining optimal window size for texture feature extraction methods

Transcription:

3rd International Conference on Multimedia Technology ICMT 2013) Recognition Method for Handwritten Digits Based on Improved Chain Code Histogram Feature Qian You, Xichang Wang, Huaying Zhang, Zhen Sun and Jiang Liu Abstract. The chain code histogram feature is a simple and effective feature extraction technology. This paper proposes improvements based on Chain Code Histogram (CCH) and its first differential characteristics. According to the first Differential Chain Code Histogram (DCCH), the turning points in the direction are extracted and the judging method of Direction Turning Point (DTP) is given. We combine CCH and DTP into a new feature, then handwritten digits of MNIST database are recognized and classified by Support Vector Machine (SVM) classifier. The experimental results proved that the recognition rate of the improved method is not only higher than CCH and first differential CCH, but also closely to the recognition rate of their combination. Obviously, the new combination reduces the feature dimension, improves the speed of training and recognition. Keywords: Chain code histogram, Differential chain code Direction turning point Handwritten digit recognition Support vector machine 1 Introduction As computer technology and Internet technology are widely used, the transformation of the traditional text information on paper to digital format is a main trend in the world today. Digits as a kind of universal symbol in the world, are widely used in the systems of bank check, postal service, online marking and so on [1, 2]. In practical application, digits generally represent accurate data, and its recognition must be in high accuracy. Although the handwritten digit recognition research has been going on for decades, and get delectable achievements, but compared with Q. You ( ) X.Wang Z.Sun School of Management Science and Engineering, Shandong Normal University, Lixia District, East of Culture Road 88,250014 Jinan, China e-mail: qianyou77@163.com H.Zhang J.Liu Shandong Research Institute of Data Processing Oumasoft, 250100 Jinan, China 2013. The authors - Published by Atlantis Press 438

the human ability to recognize there is a certain distance. Therefore, handwritten digit recognition is still significant in research. Feature extraction technology is an important part of handwritten digit recognition. We must take peoples different writing habits into account when chose the features. A successful feature must be stable and easy to be distinguished. CCH is a relatively successful and simple feature extraction technology. Freeman firstly proposed chain code in 1977[3]. The CCH method has been successfully applied to handwritten Devnagari, Gurumukhi, Malayalam, Arabic [4, 5, 6, 7] character recognition, and received high recognition rate. J. Jain et al. achieved accuracy of 96% by using CCH and DCCH [8]. This method needs more storage space while the speed of training and recognition slows down. In this paper, we have improved the combination feature based on CCH and DCCH. Firstly, Direction turning point is computed from DCCH. Then we combine it with CCH into a new feature, describe the judgment of DTP in details. Finally, we estimate the algorithm by contrast experiment. The handwritten digits in this study used are all from MNIST handwritten numeral database. 2 Feature Extraction The chain code Freeman proposed is divided into 4-direction and 8-direction (Fig.1). This method extracts contour shape of the character pattern, and the contour is coded according to the direction change information of each point with the next one. CCH is the statistical characteristics of chain code. Similarly, DCCH is the statistical characteristics of differential chain code. Fig.1 Chain code directions 2.1 CCH Feature Extraction The CCH feature extraction technology gets directional information from the pixels of the extracted character contour. Subsequently, we count the different codes. The whole process of CCH feature extraction is showed as follows. 439

1) Transform the image into binary format and extract the contour shape of the character pattern. 2) Resize the image to m*m, and divide it into m 2 /n 2 blocks in the scale of n*n. 3) According to traversal sequence of top to bottom and left to right, each contour pixels direction was calculated, and the direction is marked as 8- direction as showed in Fig.1. 4) Count the frequency of each direction appears in every block. (a) (b) 30 25 No.of pixels 20 15 10 5 0 0 1 2 3 4 5 6 7 8 Direction (c) Fig.2 (a) shows the original image of handwritten digit 5, (b) is the image after preprocessing, and (c) represents the CCH feature of digit 5 The CCH feature of digit 5 is shown in Fig.2 (C), we use the direction value 1~8 instead of 0~7 for preventing the gray value 0 of background pixels from mixing up with the direction 0. The CCH is the statistical characteristic of the whole character with all blocks. 2.2 DCCH Feature Extraction The CCH feature only represents the directional information of handwritten digit, without any features on directional change. However, completely different charac- 440

ters may have the same characteristics DCCH can solve this problem by providing the variations in the directional information. The first DCCH can be obtained by computing the difference between two neighbor pixels. In general, we use the succeeding one minus the preceding one. The DCCH also can be calculated by counting the steps of the first chain code turning to the succeeding one in clockwise or counterclockwise. For example, a chain code of 4-direction is 10103322, conversely its differential chain code is 3133030 in counterclockwise. There will be not negative values in the differential chain code which obtained through this method. No.of pixels 80 70 60 50 40 30 20 10 0 0 1 2 3 4 5 6 7 8 Direction Fig.3 the DCCH representation of digit 5, its original image is fig.2 (a) The DCCH above is calculated in clockwise. We can notice that the differential chain code concentrating upon 1,2,7,8. Its distribution is in polarized trend. This feature can filter some extra unexpected directional information to some extent. 3 Modified Feature Extraction Method CCH and DCCH reflect the direction information and direction change information respectively. If they can be effectively combined, then the combined feature can represent more comprehensive characters. However, their direct combination is bound to cause a problem that the speed of training and recognition becomes slower. To solve this problem, we propose a new feature which is direction turning point (DTP). M. Blumenstein described the distinction of individual line segments [10]. According to this method, we can distinguish the DTP. The distinction of DTP based 8-direction chain code is described in details. DTP The DTP is a point where direction changes sharply. Assuming an 8-direction chain code as A1A2 AiAi+1 An, its differential chain code as B1B2 Bi Bn-1 441

(0 Ai 7,i and n represent nature number,and 1 i n). So Bi is the first differential of Ai and Ai+1. After computing the first differential chain code, if the differential chain code meets any one as follows: 1) Bi 0,and Bi-1=Bi-2=0; 2) Bi= Bi-1=Bi-2=1or 7; 3) 2 Bi 6. Then we consider the direction changes sharply at point Ai+1.That is to say, the point Ai+1 is a DTP. The first condition expresses that the current direction value changes and the previous direction value has sustained more than three times. The second condition represents that the previous direction changed 45. The last condition indicates that the previous direction changed greater than 90. Finally, the DTP is marked as 9 in the character pattern. Repeat the operation for the whole differential chain code. So we get the DTP feature of the character image. In this work, we combined the CCH feature and the DTP feature. The structure of the combined feature is shown as Table 1. Table 1 The feature vector of each block Frequency of direction 1 Frequency of direction 2 Frequency of direction 3 Frequency of direction 4 Frequency of direction 5 Frequency of direction 6 Frequency of direction 7 Frequency of direction 8 Frequency of DTP 4 The Process of Handwritten Digit Recognition In this paper, the process of handwritten digit recognition system is shown in Fig.4. The steps in the system are introduced in following sections. Training Sample Preprocessing Feature Extraction SVM Training Testing Sample Preprocessing Feature Extraction SVM Testing Recognition Result Fig.4 Handwritten numeral recognition system flow chart 442

4.1 Preprocessing This part includes image binarization, contour extraction, character position adjustment, and the effect figure after preprocessing as shown in fig.2 (b).we choose K-means clustering algorithm as the binarization methods [9], the clustering centers are 64 and 192. We use canny detector detect the edge of characters. One of the most important is position adjustment, it is illustrated as follows: 1) Find out the width and height of the characters. 2) Shear the image according to the width and height of the characters. 3) Enlarge the sheared image to according to the maximum of the width and height until either of their length is equal to the fixed size. 4) Normalize the image into a fixed size by filling background pixels. The character position cannot be inclined to one side, the distance from leftmost character to the edge of the image is equal to the right one. So the top distance and the bottom one are. Extracting feature vector Firstly, we capture the CCH feature. Secondly, DCCH feature is calculated from chain code. Finally, we distinguish DTP based on analysis of first differential chain code. CCH feature is combined with DTP into a new fusion feature. So the number of vector in each block is 9. 4.3Training and testing Supporting Vector Machine has been successfully applied in the field of pattern recognition, such as text recognition, face recognition and so on. It shows good performance in applications [11]. So this part we utilize SVM to train and test. The Radio Basis Function (RBF) is chosen as the kernel. 5 Experiment Results and Analysis In this paper, we improve the combination of CCH and DCCH by using DTP feature instead of DCCH. In order to demonstrate the effectiveness of this method, all the experiment data are from MNIST handwritten digit image standard database. This database has 60000 training samples and 10000 testing samples. In the expe- 443

riment, all the character images are scaled in size of 28*28 and the block size is set to 7*7. Firstly, 60000 training samples and 10000 testing samples are preprocessed, extracted feature vector, and prepared in different required matrixes. Secondly, we gain a classification model by training 60000 samples in the training set. At last, we use the model recognize the 10000 testing samples. This experiment is conducted in Widows XP operating system and Matlab.R2010.version. To verify the effectiveness of DTP feature, we have done four groups of experiment involves CCH, DCCH, CCH+DCCH and CCH+DTP. The penalty parameter C is set to 20 in the experiment. We take the dimensions of feature as the value of γ. Table 2 Experiment results of different feature extraction techniques Feature extraction γ Training Time (s) Testing Time (s) Model Storage Space(MB) No. of SV Recognition Accuracy (%) CCH 1/128 535 245 2.84 13019 97.54 DCCH 1/128 708 321 2.63 17072 95.32 CCH+DCCH 1/256 1210 553 5.35 15234 98.10 CCH+DTP 1/144 587 297 3.08 13600 98.00 The performances of handwritten digit recognition based on four feature extraction techniques are shown in Table 2. Obviously, CCH combined with DCCH achieves the highest recognition rate of 98.00%, CCH combined with DTP falls behind with 0.1% difference. However, the recognition method based on CCH and DTP takes shorter time, smaller storage space. Especially the training time of CCH and DCCH is 1210s while the training time of CCH and DTP is 587s. This is almost twice as much. If considered about comprehensive evaluation, CCH and DTP feature extraction method performs better. 6 Conclusions and future research This paper presented a new fusion feature extraction technique for the recognition of handwritten digit. The new fusion feature is the combination of CCH and DTP. A recognition system for handwritten digit was developed based on SVM. The new fusion feature achieves a recognition rate of 98%, it is comparable to the direct combination of CCH and DCCH of 98.1%. But in the respects of storage space and time-consume, the new fusion feature outperformed the direct combination. It not only expresses the directional information of the handwritten digit but also represents the directional change information. The experiment results reveal that the combined feature of CCH and DTP is stable and effective. 444

In the future research, many improvements may be proposed on preprocessing part, more fusion features will be explored. We may modify out classification technique with more outstanding performance applied to handwritten digit recognition system. Acknowledgements This study is funded by the Research and Application of High-speed Image Data Acquisition Processing, one of Shandong province Science and Technology Research Program of China (211GZC20106). References 1. Hu D (2012). Research and application of handwritten numeral recognition method. SM thesis, University of Nanchang, Nanchang, China. 2. Tong X (2009). Handwritten digital recognition technology research and its application in automatic checking system. SM thesis, Dongbei Normal University, Jilin, China. 3. Freeman H., Davis L.S. (1977). A corner finding algorithm for chain-coded curves. IEEE Transcations on computers, 26(5): 297-303. 4. Singh P., Verma A., Chaudhari N.S. (2011). Performance Evaluation of Classifiers Applying Directional Features for Devnagri Numeral Recognition. Advanced Materials Research, 1042, 403-408. 5. Kaur H., Kumar R. (2011). Slant Correction and Resampling for Online handwritten character Recognition of Gurumukhi Script. SM thesis, Thapar University, India. 6. John J., Pramod K.A., Balakrishnan K. (2011). Offline handwritten Malayalam Character Recognition based on chain code histogram. Emerging Trends in Electrical and Computer Technology (ICETECT), 736-741. doi: 10.1109/ICETECT.2011.5760215. 7. Lawal I.A., Abdel-Aal R.E., Mahmoud S.A. (2010). Recognition of Handwritten Arabic (Indian) Numerals Using Freeman s Chain Codes and Abductive Network Classifiers. Pattern Recognition (ICPR), 1884-1887. 8. Jain J., Sahoo K.S., Prasanna S.R., Reddy G.S.(2012). Modified Chain Code Histogram Feature for Handwritten Character Recognition.C-CSIT. Advances in Computer Science and Information Technology. Networks and Communications, 84, 611-619. 9. Gonzalez R.C., Woods R.E., Eddins S.L.(2004). Digital Image Processing Using MATLAB. Prentice Hall. 10. Blumenstein M., Verma B., Basli H. (2003). A Novel Feature Extraction Technique for the Recognition of Segmented Handwritten Characters. Proceedings of the Seventh International Joint Conference On Document Analysis and Recognition, 1:137-141. 11. Ding S, Qing B, Tan H. (2011). An Overview on Theory and Algorithm of Support Vector Machines. Journal of University of Electronic Science and Technology of China, 40(1):2-10. 445