Recognition of Handwritten Textual Annotations using Tesseract Open Source OCR Engine for information Just In Time (ijit)




Sandip Rakshit 1, Subhadip Basu 2, Hisashi Ikeda 3
1 Techno India College of Technology, Kolkata, India
2 Computer Science and Engineering Department, Jadavpur University, India
3 Intelligent Media Systems Department, Central Research Laboratory, Hitachi Limited, Japan
Corresponding author e-mail: subhadip@ieee.org

Abstract

The objective of the current work is to develop an Optical Character Recognition (OCR) engine for the information Just In Time (ijit) system that can be used for recognition of handwritten textual annotations of lower case Roman script. The Tesseract open source OCR engine, under Apache License 2.0, is used to develop user-specific handwriting recognition models, viz., the language sets, for the said system, where each user is identified by a unique identification tag associated with the digital pen. To generate the language set for any user, Tesseract is trained with labeled handwritten data samples of isolated and free-flow texts of Roman script, collected exclusively from that user. The designed system is tested on five different language sets with free-flow handwritten annotations as test samples. The system could successfully segment and subsequently recognize 87.92%, 81.53%, 92.88%, 86.75% and 90.80% of the handwritten characters in the test samples of five different users.

1. Introduction

In online character recognition, the trajectories of pen tip movements are recorded and analyzed to identify the linguistic information expressed. With the latest technological advancements in pen input devices, new interfaces are designed to capture the precise pen-trajectory information for subsequent analysis of online handwritten data, with user comfort in writing. It is now possible to write on ordinary paper and have the handwritten annotations transmitted immediately, over a wireless link, to a remote server [1]. With these technological advances, handwritten annotations in digital notebooks may be digitized in no time.

Traditionally, documents containing handwritten information are difficult to archive in digital form. Even with the help of the latest optical scanners, content based indexing techniques and search tools, it is difficult to find digitized versions of document pages based on user queries. Some work has recently been done on content based retrieval of handwritten documents [2-4]. In [2], Bertrand et al. have developed a technique for structural document recognition and recognition of handwritten names. In another work, Matthew et al. [3] developed a stroke feature based technique for retrieval of handwritten Chinese annotations based on typed/handwritten query. Srihari et al. [4] used stroke/shape features for indexing and retrieval of handwritten documents based on writer characteristics, textual content and writer profile. In one of our earlier works [5], a recognition based indexing technique was discussed for real-time retrieval of handwritten annotations based on typed/handwritten query. David Doermann, in his survey [6], highlighted key issues involved in indexing and retrieval of document images.

In any recognition based indexing technique, the overall performance predominantly depends on the accuracy of the underlying recognition engine. Development of a handwritten OCR engine with high recognition accuracy is still an open problem for the research community. A lot of research effort has already been reported [7-9] on different key aspects of handwritten character recognition systems. In this work, we have used Tesseract 2.01 [10], an open source OCR engine under Apache License 2.0, for segmentation and subsequent recognition of handwritten textual annotations of lower case Roman script.

The objective of the current work is to develop an Optical Character Recognition (OCR) engine for the ijit system that can be used for recognition of handwritten textual annotations of lower case Roman script. Tesseract is used to develop user-specific handwriting recognition models, viz., the language sets, for the said system. Each user of the ijit system may be identified by a unique identification tag associated with the Anoto digital pen [1]. The Tesseract OCR engine is customized to perform user-specific training on labeled handwriting samples of both isolated and free-flow texts, written using lower case Roman script. The performance is evaluated on both categories of document pages to observe segmentation and character recognition accuracies. The following sections describe an overview of the existing ijit system, an overview of the Tesseract OCR engine and the present experiment on designing an OCR engine for segmentation and recognition of handwritten textual annotations.

2. The ijit system

Just in time availability of meaningful information is the key to any real-time information retrieval system. The information Just In Time (ijit) system [5], developed at the Hitachi Central Research Laboratory, keeps track of all the digital documents stored in the ijit server. Using the proposed system, handwritten annotations made on the printed digital document pages with an Anoto digital pen [1] may be viewed/shared/searched based on typed/handwritten query. Fig. 1 shows a schematic overview of the recognition based query retrieval scheme designed for the ijit system. The ijit system uses ordinary paper, printed with the digitally legible, patent-protected dot pattern from Anoto [1], for printouts of each digital document through the server. The Anoto pattern consists of numerous nearly invisible, intelligent black dots that can be read by a digital pen. The pattern on each paper is unique so that each page can be kept separate from one another.

Fig. 1. A schematic architecture of the recognition based query retrieval system.

An Anoto digital pen [1] looks like its normal ballpoint counterpart and contains an integrated digital camera, an advanced image microprocessor and a wireless communication device. The pen can take around 50 digital snapshots per second, can store up to 50 full A4/letter size pages of handwritten data and then securely send the information to the ijit server through wireless communication or Universal Serial Bus (USB). Every snapshot contains enough data to determine the exact position of the pen on the paper, the time of the pen-stroke and the unique identification number of the Anoto paper. Each pen also has a unique identification number so that the ijit system can distinguish between every individual handwriting.

3. Overview of the Tesseract OCR engine

Tesseract is an open source (under Apache License 2.0) offline optical character recognition engine, originally developed at Hewlett Packard from 1984 to 1994. Tesseract is now partially funded by Google [10] and released under the Apache license, version 2.0. The latest version, Tesseract 2.03, was released in April 2008. In the current work, we have used Tesseract version 2.01, released in August 2007. Like any standard OCR engine, Tesseract is built on top of key functional modules: a line and word finder, a word recognizer, a static character classifier, a linguistic analyzer and an adaptive classifier. However, it does not support document layout analysis, output formatting or a graphical user interface. Currently, Tesseract can recognize printed text written in English, Spanish, French, Italian, Dutch, German and various other languages. To train Tesseract for the English language, 8 data files are required in the tessdata sub directory. The 8 files used for English are to be generated as follows:

    tessdata/eng.freq-dawg
    tessdata/eng.word-dawg
    tessdata/eng.user-words
    tessdata/eng.inttemp
    tessdata/eng.normproto
    tessdata/eng.pffmtable
    tessdata/eng.unicharset
    tessdata/eng.DangAmbigs

4. The present work

In the current work, Tesseract 2.01 is used for developing user-specific handwriting recognition models, viz., the language sets, for the ijit system. To generate the language set for each user, Tesseract is trained with labeled handwritten data samples of isolated and free-flow texts of lower case Roman script. Key functional modules of the developed system are discussed in the following sub-sections.

4.1. Collection of the dataset

For preparation of the dataset for the current experiment, digitized handwritten samples of lower case Roman script were collected from five different users. Six handwritten document pages, consisting of isolated characters and free-flow words, were collected from each of the users of the designed system. The pages are categorized into two datasets. Dataset-1 consists of four pages of isolated handwritten lower case Roman characters and Dataset-2 consists of two pages of free-flow handwritten words, copied from technical articles. For each user, three pages from Dataset-1 and one page from Dataset-2 were considered for training the Tesseract OCR engine. The remaining two pages, one from each dataset, constitute the test set for the current experiment. The overall distribution of the character samples in the training and the test sets for the five users is shown in Table 1.
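The page split described in Section 4.1 can be sketched in a few lines of Python. This is only an illustration: the `Page` record, the page numbering, and the choice to hold out the last page of each dataset are assumptions, since the text does not say which pages were reserved for testing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Page:
    """Hypothetical record for one collected page (not from the paper)."""
    user: int      # user index, 1..5
    dataset: int   # 1 = isolated characters, 2 = free-flow words
    page_no: int   # page index within the dataset (assumed numbering)


def split_pages(pages):
    """Hold out one page per dataset for testing and train on the rest.

    Assumption: the last page of each dataset is the held-out one.
    """
    train, test = [], []
    for dataset in (1, 2):
        subset = sorted((p for p in pages if p.dataset == dataset),
                        key=lambda p: p.page_no)
        train.extend(subset[:-1])
        test.append(subset[-1])
    return train, test


# Six pages for one user: four isolated-character pages, two free-flow pages.
pages = [Page(1, 1, n) for n in range(1, 5)] + [Page(1, 2, n) for n in range(1, 3)]
train, test = split_pages(pages)
print(len(train), "training pages,", len(test), "test pages")
```

With six pages per user this yields three Dataset-1 pages plus one Dataset-2 page for training, and one page from each dataset for testing, matching the proportions stated above.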

Table 1. Composition of the training and test sets of character samples for different users

    User   Set         Dataset-1   Dataset-2   Total
    1      Train set   1185        659         1844
    1      Test set    442         691         1133
    2      Train set   1006        529         1535
    2      Test set    468         718         1186
    3      Train set   992         884         1876
    3      Test set    546         1004        1550
    4      Train set   619         578         1197
    4      Test set    260         751         1011
    5      Train set   467         255         722
    5      Test set    234         277         511

4.2. Labeling training data

For labeling the training samples of each user using Tesseract, we have taken the help of a tool named bbtesseract [13]. To generate the training files for a specific user, we need to prepare the box file for each training image using the following command:

    tesseract fontfile.tif fontfile batch.nochop makebox

The box file is a text file that lists the characters in the training image, in order, one per line, with the coordinates of the bounding box around each character. Incorrect labels in the training set may be manually corrected using the bbtesseract tool. Then we have to rename the box file fontfile.txt to fontfile.box. Fig. 2 shows a screenshot of the bbtesseract tool.
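To make the box-file description concrete, here is a minimal Python sketch that reads box-file text and corrects one label, roughly what a manual correction in a box editor amounts to. The column order (character, left, bottom, right, top) is an assumption, as the text only says that each line carries a character and its bounding-box coordinates. The sample lines and the helper names `parse_box_file` and `relabel` are hypothetical.

```python
from typing import List, NamedTuple


class Box(NamedTuple):
    char: str
    left: int
    bottom: int
    right: int
    top: int


def parse_box_file(text: str) -> List[Box]:
    """Parse box-file text, one character per line.

    Assumed layout per line: <char> <left> <bottom> <right> <top>.
    Lines with fewer than five fields are skipped.
    """
    boxes = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 5:
            continue
        left, bottom, right, top = (int(v) for v in parts[1:5])
        boxes.append(Box(parts[0], left, bottom, right, top))
    return boxes


def relabel(boxes: List[Box], index: int, new_char: str) -> List[Box]:
    """Return a copy of the list with the label at `index` corrected."""
    fixed = list(boxes)
    fixed[index] = fixed[index]._replace(char=new_char)
    return fixed


# Hypothetical box-file content in which the second label is wrong.
sample = "t 10 20 30 60\nb 35 20 55 62\ne 60 20 78 45\n"
boxes = relabel(parse_box_file(sample), 1, "h")
print("".join(b.char for b in boxes))
```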

Fig. 2. A sample screenshot of a segmented training page using the bbtesseract tool.

4.3. Training the data using Tesseract OCR engine

To train a new language set for any user, we first have to put in the effort to get one good box file for a handwritten document page, and then run the rest of the training process, discussed below, to create a new language set. Tesseract is then run again with the newly created language set to label the box files for the remaining training images, using the process discussed in Section 4.2. For each of our (training image, box file) pairs, Tesseract is run in training mode using the following command:

    tesseract fontfile.tif junk nobatch box.train

The output of this step is fontfile.tr, which contains the features of each character of the training page. The character shape features can be clustered using the mftraining and cntraining programs:

    mftraining fontfile_1.tr fontfile_2.tr...

This will output three data files: inttemp, pffmtable and Microfeat. The following command:

    cntraining fontfile_1.tr fontfile_2.tr...

will output the normproto data file. Now, to generate the unicharset data file, the unicharset_extractor program is used as follows:

    unicharset_extractor fontfile_1.box fontfile_2.box...

Tesseract uses 3 dictionary files for each language. Two of the files are coded as a Directed Acyclic Word Graph (DAWG), and the other is a plain UTF-8 text file. The word list is formatted as a UTF-8 text file with one word per line. The corresponding commands are:

    wordlist2dawg frequent_words_list freq-dawg
    wordlist2dawg words_list word-dawg

The third dictionary file is named user-words and is usually empty. The final data file of Tesseract is the DangAmbigs file. This file cannot be used to translate characters from one set to another. The DangAmbigs file may also be empty. Now we have to collect all the 8 files and rename them with a lang. prefix, where lang is the 3-letter code for our language, and put them in our tessdata directory. Tesseract can then recognize text in our language set using the command:

    tesseract image.tif output -l lang
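The command sequence of Sections 4.2 and 4.3 can be collected into one ordered list, which is convenient when the same steps are repeated for each user. The Python sketch below only builds and prints the command lines quoted in the text and does not run them. The helper names `training_commands` and `installed_names`, and the language code `us1`, are hypothetical.

```python
import shlex

# The eight per-language data files named in the text, without the lang prefix.
DATA_FILES = ["freq-dawg", "word-dawg", "user-words", "inttemp",
              "normproto", "pffmtable", "unicharset", "DangAmbigs"]


def training_commands(stems):
    """Return the command lines, in order, for training pages <stem>.tif."""
    cmds = []
    for stem in stems:
        # Section 4.2: produce a box file to be corrected and renamed to .box.
        cmds.append(["tesseract", f"{stem}.tif", stem, "batch.nochop", "makebox"])
    for stem in stems:
        # Section 4.3: training mode, expected to write <stem>.tr.
        cmds.append(["tesseract", f"{stem}.tif", "junk", "nobatch", "box.train"])
    tr_files = [f"{stem}.tr" for stem in stems]
    box_files = [f"{stem}.box" for stem in stems]
    cmds.append(["mftraining", *tr_files])
    cmds.append(["cntraining", *tr_files])
    cmds.append(["unicharset_extractor", *box_files])
    cmds.append(["wordlist2dawg", "frequent_words_list", "freq-dawg"])
    cmds.append(["wordlist2dawg", "words_list", "word-dawg"])
    return cmds


def installed_names(lang):
    """Names the eight data files take once prefixed and placed in tessdata."""
    return [f"tessdata/{lang}.{name}" for name in DATA_FILES]


for argv in training_commands(["fontfile_1", "fontfile_2"]):
    print(shlex.join(argv))
for name in installed_names("us1"):  # "us1" is a made-up 3-letter code
    print(name)
```

Keeping the steps as data rather than running them directly leaves room for the manual correction of box files between the makebox and box.train stages.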

5. Experimental results

Proc. Int. Conf. on Information Technology and Business Intelligence (2009) 117-125

For conducting the current experiment, five user-specific language sets are generated using the Tesseract open source OCR engine. The training and test patterns of each individual user are spread over two types of datasets, as described in Sec. 4.1. The experiment is focused on testing the segmentation and core recognition accuracy of the Tesseract OCR engine on free-flow handwritten annotations written using a digital pen by different users. The linguistic analysis modules of Tesseract, involving the language files freq-dawg, word-dawg, user-words and DangAmbigs, are not utilized in the current experiment. To evaluate the performance of the present technique, the following expression is used:

$$\text{Recognition accuracy} = \frac{C_t}{C_t + C_m + C_s} \times 100$$

where $C_t$ is the number of character segments producing true classification results, $C_m$ is the number of misclassified character segments and $C_s$ is the number of characters Tesseract fails to segment, i.e., producing under-segmentation. The rejected character/word samples are excluded from the computation of recognition accuracy of the designed system. Table 2(a-e) shows an analysis of successful classification (SC), misclassification (Misc), segmentation failure (SF) and rejection (Rej) results on the test samples of the users. Fig. 3 shows a character-wise distribution of success and failure accuracies on the overall test dataset.

As observed from the experimentation, a significant proportion of rejection cases arise from word segmentation failures. This is because Tesseract is originally designed to recognize printed document pages with uniformity in baseline and character/word spacing. Another source of error is the internal segmentation of some of the characters. More specifically, the character 'i' often gets internally segmented into two parts, leading to high individual error rates.

[Fig. 3. Distribution of success and failure cases over the free-flow test pages. Bar chart of frequency (0 to 450) against the label of the test character (a to z), with Success and Failure series.]

Table 2. Analysis of recognition performance of the developed system (all values in %)

(a) Recognition performance of User-1 test dataset

            Dataset-1   Dataset-2   Overall
    SC      95.42       83.2        87.92
    Misc    4.1         16.19       11.52
    SF      0.48        0.61        0.56
    Rej     6.10        4.34        5.03

(b) Recognition performance of User-2 test dataset

            Dataset-1   Dataset-2   Overall
    SC      91.62       76.45       81.53
    Misc    8.38        18.31       15.00
    SF      0.00        5.24        3.47
    Rej     26.07       4.18        12.82

(c) Recognition performance of User-3 test dataset

            Dataset-1   Dataset-2   Overall
    SC      96.78       90.94       92.88
    Misc    3.22        6.18        5.19
    SF      0.00        2.88        1.93
    Rej     8.97        0.00        3.16

(d) Recognition performance of User-4 test dataset

            Dataset-1   Dataset-2   Overall
    SC      90.38       85.49       86.75
    Misc    8.85        7.32        7.72
    SF      0.77        7.19        6.03
    Rej     0           0           0

(e) Recognition performance of User-5 test dataset

            Dataset-1   Dataset-2   Overall
    SC      91.88       89.89       90.80
    Misc    8.12        10.11       9.20
    SF      0           0           0
    Rej     0           0           0

Fig. 4. Some of the successfully segmented and recognized word images.
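The accuracy measure of Section 5 can be written as a small function. The sketch below assumes that the denominator counts every accepted segment (correctly classified, misclassified and unsegmented), which is the reading under which the SC, misclassification and SF shares in Table 2 are parts of one total. The counts in the first call are made-up numbers for illustration. The second part averages the per-user overall SC values copied from Table 2.

```python
def recognition_accuracy(c_t, c_m, c_s):
    """Percentage of accepted character segments that are classified correctly.

    c_t: correctly classified segments
    c_m: misclassified segments
    c_s: characters the engine failed to segment
    Rejected samples are assumed to be excluded before counting.
    """
    total = c_t + c_m + c_s
    if total == 0:
        raise ValueError("no accepted character segments")
    return 100.0 * c_t / total


# Hypothetical counts, for illustration only.
print(recognition_accuracy(90, 8, 2))

# Overall SC values (percent) for users 1 to 5, copied from Table 2(a-e).
OVERALL_SC = [87.92, 81.53, 92.88, 86.75, 90.80]
mean_sc = sum(OVERALL_SC) / len(OVERALL_SC)
print(f"mean overall SC: {mean_sc:.2f}%")
```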

Fig. 5. Some of the misclassified word images: (a) recognition error in the 3rd character; (b) internal segmentation in the 8th character.

As shown in Table 2(a-e), the overall character-level recognition accuracy of the developed system is around 87.98%. The overall character misclassification rate is observed to be around 9.73%. Segmentation failures in the document pages account for around 2.29% of error cases. The segmentation failures are due to over-segmentation of some of the constituent characters, such as 'i' and 'j', and to under-segmentation of cursive words in the document pages. The designed system rejects around 9.24% of the characters in the test dataset. This is mainly due to the presence of multi-skewed handwritten text lines in the test documents. Completely cursive words were also rejected outright in many cases during the experimentation. Some of the sample word images successfully segmented and recognized by Tesseract are shown in Fig. 4. Fig. 5(a-b) shows some of the word images with erroneous segmentation and recognition results.

A major drawback of the current system is its failure to avoid over-segmentation in some of the characters. The system also fails to segment cursive words in many cases, leading to under-segmentation and rejection. The recognition performance of the designed system may be further improved by incorporating more training samples for each user and by including word-level dictionary matching techniques. Despite these limitations, the designed recognition engine is successfully integrated with the ijit system for online interpretation of handwritten textual annotations. The word-level recognition time of the OCR engine, as observed on reasonably powered computer hardware, is also found to be satisfactory. In a nutshell, the current work effectively customizes an open source OCR engine for segmentation and recognition of handwritten textual annotations of multiple users within the designed ijit system.

6. References

[1] www.anoto.com
[2] Bertrand Coüasnon, Jean Camillerapp, Ivan Leplumey, Access by content to handwritten archive documents: generic document recognition method and platform for annotations, IJDAR (2007) 9: 223-242.
[3] Matthew Ma, Chi Zhang and Patrick Wang, Studies of Radical Model for Retrieval of Cursive Chinese Handwritten Annotations, SSPR&SPR 2000, LNCS 1876, pp. 407-416, 2000.
[4] Sargur Srihari, Anantharaman Ganesh, Catalin Tomai, Yong-Chul Shin, and Chen Huang, Information Retrieval System for Handwritten Documents, DAS 2004, LNCS 3163, pp. 298-309, 2004.
[5] S. Basu, K. Konishi, N. Furukawa, H. Ikeda, A novel scheme for retrieval of handwritten textual annotations for information Just In Time (ijit), proceedings (CD) of IEEE Region 10 Conference (TENCON)-2008.
[6] David Doermann, The Indexing and Retrieval of Document Images: A Survey, Computer Vision and Image Understanding archive, Volume 70, Issue 3.
[7] R.M. Bozinovic and S.N. Srihari, Off-line Cursive Script Word Recognition, IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 11, pp. 68-83, 1989.
[8] B. B. Chaudhuri and U. Pal, A Complete Printed Bangla OCR System, Pattern Recognition, vol. 31, No. 5, pp. 531-549, 1998.
[9] S. Basu, C. Chawdhuri, M. Kundu, M. Nasipuri, D. K. Basu, A Two-pass Approach to Pattern Classification, N.R. Pal et al. (Eds.), ICONIP, LNCS 3316, pp. 781-786.
[10] http://code.google.com/p/tesseract-ocr