OCR-ANN Back-Propagation Based Classifier

Available Online at www.ijcsmc.com
International Journal of Computer Science and Mobile Computing
A Monthly Journal of Computer Science and Information Technology
IJCSMC, Vol. 4, Issue 1, January 2015, pp. 307-313
RESEARCH ARTICLE
ISSN 2320-088X

Asmaa Qasim Shareef (1), Sukaina M. Altayar (2)
(1, 2) College of Science, University of Baghdad, Iraq
(1) asama_sal@yahoo.com, (2) sukaina_altayar@yahoo.com

Abstract
Optical Character Recognition (OCR) using a neural network is presented as a prototype system for recognizing characters. Pre-processing of the document is the preliminary step for accurate recognition: it transforms the data into a format that can be processed easily and effectively. The main task of pre-processing is to reduce the variation that lowers the recognition rate and increases complexity. In this paper, several pre-processing techniques are used to improve OCR accuracy by pre-processing the original image and its characters separately, including noise filtering, smoothing, thresholding, and skew correction. The experimental results show that recognition improved for many types of documents with noisy and non-uniform characters. Efficiency and recognition tests for the training method were performed and are reported. The paper shows how an artificial neural network can be used for an optical character recognition application, achieving high recognition quality and good performance by applying multiple image processing techniques.

Keywords: Optical Character Recognition; Back-Propagation; Neural Network; Image Analysis

I. INTRODUCTION
Optical Character Recognition (OCR) is the process of automatically recognizing characters in a document image. It provides full alphanumeric recognition of printed or handwritten characters, text, numerals, letters, and symbols, converting them into a form that a computer can process and format.
OCR has gained considerable impetus from its applications in computer vision, intelligent text recognition, and text-based decision-making systems [1,2]. The approach taken here to the OCR problem is based on the psychology of characters as perceived by humans; thus, the geometrical features of a character and its variants are considered for recognition. Adding neural networks improves the recognition of arbitrary font styles, as opposed to a single standard font. Differences in font and size within a scanned image make recognition difficult if no pre-processing is applied. Most document images obtained by scanning contain noisy pixels, and stroke width is a further factor that affects recognition. Therefore, to achieve good recognition accuracy, the noise must be eliminated after the binary image data is read, the image must be smoothed, and features must be extracted efficiently before the system is trained and patterns are classified [3,4].

2015, IJCSMC All Rights Reserved

II. RELATED WORK
[6] showed that success in OCR depends on two factors: the feature extraction algorithm and the classification algorithm. [7] applied a neural network approach with back-propagation to perform high-accuracy recognition of music scores. [8] presented a scheme for a complete OCR system covering five different fonts and sizes of characters, implementing the steps of pre-processing, feature extraction, segmentation, and classification, with an artificial neural network (ANN) used for classification. [9] explained classification methods based on learning from examples and their application to character recognition.

III. PROPOSED SYSTEM ARCHITECTURE
This paper presents a procedure for designing an OCR-ANN system that recognizes text from a scanned image. The model was built in C# with the OpenCV library. The system consists of two parts, shown in Fig.1: the ANN Training part and the Pattern Recognition part.

Fig.1 OCR-ANN system scheme

In the first part, a database of characters is created by training the ANN on character images with back-propagation (BP), generating a data matrix that is used for classification. In the second part, several processing steps, including pre-processing and classification, are used to recognize characters from the scanned image.

ANN Training Part
This part generates the data used in the recognition part. It can be represented by the flow chart in Fig.2. The BP algorithm is applied to a feed-forward multilayer (FFML) neural network. The nodes are organized in layers and send their signals forward, while errors are propagated backwards. The network receives its inputs through the neurons of the input layer, and its output is produced by the neurons of the output layer, with one hidden layer in between.
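The training scheme described above (forward pass through one hidden layer, errors propagated backwards) can be illustrated with a minimal sketch. This is not the authors' C# implementation: the class name `TinyBPNet`, the 3x3 glyph bitmaps, the sigmoid activation, the squared-error loss, the learning rate, and the layer sizes are all assumptions chosen only to keep the example small.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyBPNet:
    """Hypothetical one-hidden-layer feed-forward network trained with back-propagation."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        # Training starts from small random weights; the last entry of each row is a bias.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)] for _ in range(n_out)]

    def forward(self, x):
        xb = list(x) + [1.0]
        h = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in self.w1]
        hb = h + [1.0]
        y = [sigmoid(sum(w * v for w, v in zip(row, hb))) for row in self.w2]
        return h, y

    def train_step(self, x, target, lr=0.5):
        """One online update; returns the squared error measured before the update."""
        h, y = self.forward(x)
        xb = list(x) + [1.0]
        hb = h + [1.0]
        # Output deltas: error times the sigmoid derivative.
        d_out = [(t - o) * o * (1.0 - o) for t, o in zip(target, y)]
        # Hidden deltas: output deltas propagated backwards through the old w2.
        d_hid = [h[j] * (1.0 - h[j]) * sum(d_out[k] * self.w2[k][j] for k in range(len(y)))
                 for j in range(len(h))]
        for k, row in enumerate(self.w2):
            for j in range(len(row)):
                row[j] += lr * d_out[k] * hb[j]
        for j, row in enumerate(self.w1):
            for i in range(len(row)):
                row[i] += lr * d_hid[j] * xb[i]
        return sum((t - o) ** 2 for t, o in zip(target, y))

    def predict(self, x):
        _, y = self.forward(x)
        return max(range(len(y)), key=lambda k: y[k])


# Illustrative 3x3 "character images" (assumed data, not from the paper).
GLYPHS = {
    "I": [0, 1, 0, 0, 1, 0, 0, 1, 0],
    "L": [1, 0, 0, 1, 0, 0, 1, 1, 1],
    "T": [1, 1, 1, 0, 1, 0, 0, 1, 0],
}
LABELS = list(GLYPHS)


def one_hot(index, size):
    return [1.0 if k == index else 0.0 for k in range(size)]


def train(net, epochs=2000):
    """Repeat online updates over all glyphs; returns the summed error per epoch."""
    errors = []
    for _ in range(epochs):
        total = 0.0
        for idx, name in enumerate(LABELS):
            total += net.train_step(GLYPHS[name], one_hot(idx, len(LABELS)))
        errors.append(total)
    return errors
```

Calling `train(TinyBPNet(9, 4, 3))` returns the per-epoch error, which should fall as the weights are adjusted; the trained weight lists `w1` and `w2` play the role that the paper assigns to the saved data matrix.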
Each layer is fully connected to the next layer. Learning continues until the error is reduced, i.e. until the ANN has learned the training data. Training starts with random weights, and the aim is to adjust them to arrive at a minimal error. The number of layers and the number of neurons per layer are important decisions when applying this architecture. The complexity of the relationship between the input data and the desired output determines the number of nodes in the hidden layer. The amount of training data also sets an upper bound on the number of hidden nodes: divide the number of input-output example pairs in the training set by the total number of input and output nodes in the network, then divide the result by a scaling factor between five and ten [10].

Pattern Recognition Part
This part of the scheme imports the scanned image and passes it through the pre-processing operations to the recognition operation, where it is compared with the trained images in order to classify each character. Pre-processing enhances and segments the characters to prepare them for recognition. Image binarization (thresholding) is used for edge/boundary detection [11].

Fig.2 Input/output data matrix generation scheme

IV. EXPERIMENTAL RESULTS
The first programming step is to generate the DataMatrix file that is used later in the recognition process. The program requests a font from the operating system and converts each character into an image, which serves as input data to the BP-NN. Optionally, noise can be added to the character images so that the ANN is trained on non-uniform characters; this improves the recognition result at the cost of a longer training time. The second step is to initialize the FFML network and adjust its weights with BP. After training is completed, the weights are saved in the DataMatrix file. Fig.3 shows the BP-ANN front page.
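The option of adding noise to the character images before training can be sketched as follows. The paper does not say which noise model its program uses, so the salt-and-pepper model here (each pixel of a binary glyph flipped with a fixed probability) is an assumption, as are the function names `add_salt_pepper_noise` and `augment` and the default values.

```python
import random


def add_salt_pepper_noise(bitmap, flip_prob=0.05, seed=None):
    """Return a copy of a binary bitmap with each pixel flipped with probability flip_prob.

    The noise model is an assumption; the paper only states that noise can be added.
    """
    rng = random.Random(seed)
    return [[1 - px if rng.random() < flip_prob else px for px in row] for row in bitmap]


def augment(bitmap, copies=5, flip_prob=0.05, seed=0):
    """Return the clean glyph followed by `copies` noisy variants of it.

    All returned bitmaps share the label of the original character.
    """
    noisy = [add_salt_pepper_noise(bitmap, flip_prob, seed + i) for i in range(copies)]
    return [bitmap] + noisy
```

Each extra copy multiplies the number of training samples per character, which is consistent with the longer training time the authors report when this option is enabled.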

Fig.3 BP-ANN Training Dialog

After the training part is complete, the recognition part starts. First, the document image is loaded, either from an image file or through an optical device, and the trained weights are loaded for use in recognition. Fig.4 shows two sample images used to compare results: one with a white background and one with a graduated-lighting background.

Fig.4 Binarization of document images: (a) character image with white background; (b) its binarized image; (c) character image with graduated-lighting background; (d) its binarized image
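The difference between the two binarized samples in Fig.4 can be illustrated with a small sketch that compares a single global threshold with a local (adaptive) one on a background whose brightness falls from left to right. It is written in plain Python rather than with the OpenCV calls the authors' system presumably used; the function names, the threshold of 128, the window radius, the offset, and the pixel values are all illustrative assumptions.

```python
def global_threshold(gray, t=128):
    """Mark a pixel as ink (1) when it is darker than one fixed threshold."""
    return [[1 if px < t else 0 for px in row] for row in gray]


def adaptive_threshold(gray, radius=1, offset=10):
    """Mark a pixel as ink (1) when it is darker than its local mean minus an offset."""
    h, w = len(gray), len(gray[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            ys = range(max(0, y - radius), min(h, y + radius + 1))
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            window = [gray[j][i] for j in ys for i in xs]
            mean = sum(window) / len(window)
            row.append(1 if gray[y][x] < mean - offset else 0)
        out.append(row)
    return out


# Assumed test image: background fades from 220 to 100, with dark strokes in columns 1 and 5.
ROW = [220, 120, 180, 160, 140, 40, 100]
GRADIENT_IMAGE = [list(ROW) for _ in range(3)]
```

On `GRADIENT_IMAGE`, the global threshold also marks the dim background in the last column as ink, while the adaptive version keeps only the two strokes. This mirrors the noisy binarization the paper reports for graduated-lighting backgrounds.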

Fig.5 compares two documents with different backgrounds: the first is a character image with a white background, and the second is a character image with a graduated-lighting background. The first image has no noisy background (Fig.6-b) because it has no graduated lighting, so recognition is accurate. The problem appears with coloured and graduated-lighting backgrounds, where the noisy background makes classification difficult, as in Fig.6-d. The recognition result for both is shown in Fig.7.

Fig.5 Recognition of characters for the character image with white background and the character image with graduated-lighting background

Fig.6 Applying pre-processing algorithms on the graduated-lighting background

A scanned sample of character images with a white background was also used in the test. As shown in Fig.7, the background has graduated lighting due to the scanning process and also contains mirrored characters; this adds noise and unwanted features that cause many recognition errors.

Fig.7 Scanned character image: original scanned image; binarization of the image
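One way to suppress the isolated background pixels described above is a median filter applied to the binarized image. The paper names noise filtering as a pre-processing step but does not identify the filter, so the median filter, the function name `median_filter`, and the 3x3 window are assumptions made for illustration.

```python
def median_filter(img, radius=1):
    """Replace each pixel with the median of its neighbourhood (window clipped at the borders)."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = sorted(
                img[j][i]
                for j in range(max(0, y - radius), min(h, y + radius + 1))
                for i in range(max(0, x - radius), min(w, x + radius + 1))
            )
            # For even-sized (border) windows this picks the upper median.
            row.append(window[len(window) // 2])
        out.append(row)
    return out


# Assumed example: a two-pixel-wide vertical stroke plus one isolated noise pixel.
STROKE = [[0, 0, 1, 1, 0, 0] for _ in range(5)]
NOISY_STROKE = [list(r) for r in STROKE]
NOISY_STROKE[2][0] = 1
```

Here `median_filter(NOISY_STROKE)` removes the stray pixel and leaves the stroke intact. A stroke only one pixel wide would be erased by the same 3x3 window, so the window size has to be chosen relative to stroke width, which the paper already identifies as a factor affecting recognition.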

When the ANN is used to recognize the image directly, recognition clearly fails to identify the actual characters and produces many errors, as shown in Fig.8. After the pre-processing enhancements (smoothing, adaptive thresholding, and noise filtering) are applied, the noisy background is removed and only the characters are recognized, with good accuracy. Table 1 shows the recognition results for all the previously tested images, with and without pre-processing.

Fig.8 Recognition results: recognition by ANN without pre-processing; recognition by ANN with pre-processing

Table 1: Recognition results for trained document images

No. | Type of image                                  | Number of characters | No pre-processing         | With pre-processing
    |                                                |                      | Detect | Failed | Eff. % | Detect | Failed | Eff. %
1   | Computerized, clear white background           | 36                   | 35     | 1      | 97.22  | 36     | 0      | 100
2   | Computerized, graduated-lighting background    | 71                   | 64     | 7      | 90.14  | 71     | 0      | 100
3   | Scanned image with complex background          | 396                  | 334    | 773    | 43.20  | 334    | 62     | 84.34

V. CONCLUSION
Using BP-ANN in OCR gives good results. However, OCR is affected by many factors, such as noise, brightness, and coloured backgrounds. Pre-processing of document images is therefore a necessary initial step in character recognition systems, to remove the effects of these factors. Each image requires different pre-processing techniques, depending on which factor degrades its quality.

ACKNOWLEDGEMENTS
Our thanks go to the referee who read our paper, to the experts who contributed to the development of the template, and to the Editorial Support Team, International Journal of Computer Science and Mobile Computing.

References
[1] Sandeep S., Nabarag P., Sayam K. D. and Sandip K., Optical Character Recognition using 40-point Feature Extraction and Artificial Neural Network, International Journal of Advanced Research in Computer Science and Software Engineering, Volume 3, Issue 4, ISSN: 2277-128X, pp. 495-502, October 2012.
[2] Vivek S. and Navdeep S., Optical Character Recognition using Artificial Neural Networks, Signal & Image Processing: An International Journal (SIPIJ), Vol.3, No.5, October 2012.
[3] Rakesh B., Artificial Neural Network Based Optical Character Recognition, BLB-International Journal of Science & Technology, Vol.1, No.2, pp. 143-152, ISSN 0976-3074, 2010.
[4] Sameeksha B., Optical Character Recognition Using Artificial Neural Network, International Journal of Advanced Research in Computer Engineering & Technology, Vol.1, Issue 4, June 2012.
[5] Marinai S., Gori M. and Soda G., Artificial neural networks for document analysis and recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol.27, no.1, pp. 23-35, ISSN: 0162-8828, Jan. 2005.
[6] Ivan D., Machine Learning Methods for Optical Character Recognition, Chair of Computer Science, Department of Mathematics and Informatics, Novi Sad, Serbia, December 2005.
[7] Akinwonmi A. E., Adewale O.S., Alese B.K., and Adetunmbi O.S., Design of a Neural Network Based Optical Character Recognition System for Musical Notes, The Pacific Journal of Science and Technology, Vol.9, no.1, 2008.
[8] Raghuraj S., Yadav C. S., Prabhat V., and Vibhash Y., Optical Character Recognition for Printed Devnagari Script Using Artificial Neural Network, International Journal of Computer Science & Communication, Vol.1, no.1, pp. 91-95, 2010.
[9] Vijay L. S. and Babita K., Offline Handwritten Character Recognition Techniques using Neural Network: A Review, International Journal of Science and Research (IJSR), India Online ISSN: 2319-7064, Vol.2, Issue 1, January 2013.
[10] Jon M
[11] Khorsheed O. K., Produce Low-Pass and High-Pass Image Filter In Java, International Journal of Advances in Engineering & Technology (IJAET), Vol.7, Issue 3, pp. 712-722, ISSN: 2231-1963, July 2014.
[12] Dike U. I and Adoghe U. A., Back-Propagation Artificial Neural Network Techniques for Optical Character Recognition: A Survey, International Journal of Computers and Distributed Systems, Vol. No.3, Issue 2, ISSN: 2278-5183, Jun-July 2013.