Forms Processing - user experiences of text and handwriting recognition (OCR/ICR)




Sponsored by: Parascript

About the research

As the non-profit association dedicated to nurturing, growing and supporting the user and supplier communities of ECM (Enterprise Content Management) and Social Business Systems (or Enterprise 2.0), AIIM is proud to provide this research at no charge. In this way, the entire community can leverage the education, thought leadership and direction provided by our work. Our objective is to present the wisdom of the crowds based on our 70,000-strong community.

We are happy to extend free use of the materials in this report to end-user companies and to independent consultants, but not to suppliers of ECM systems, products and services, other than Parascript and its subsidiaries and partners. Any use of this material must carry the attribution "AIIM 2012 www.aiim.org / Parascript 2012 www.parascript.com".

Rather than redistribute a copy of this report to your colleagues, we would prefer that you direct them to www.aiim.org/research for a free download of their own.

Our ability to deliver such high-quality research is made possible by the financial support of our underwriting sponsor, without whom we would have to return to a paid subscription model. For that, we hope you will join us in thanking our underwriter for this support:

Parascript LLC
6275 Monarch Park Place
Longmont, CO 80503 USA
Phone: (+1) 303-381-3100
Website: www.parascript.com

Process used and survey demographics

The survey results quoted in this report are taken from a survey carried out between 09 March 2012 and 29 March 2012, with 324 responses from individual members of the AIIM community surveyed using a Web-based tool. Invitations to take the survey were sent via email to a selection of AIIM's 70,000 registered individuals. Respondents are predominantly from North America and cover a representative spread of industry and government sectors. Results from organizations of less than 10 employees and from suppliers of ECM products and services have not been included, bringing the total respondents to 255.

About AIIM

AIIM has been an advocate and supporter of information professionals for nearly 70 years. The association's mission is to ensure that information professionals understand the current and future challenges of managing information assets in an era of social, mobile, cloud and big data. AIIM builds on a strong heritage of research and member service. Today, AIIM is a global, non-profit organization that provides independent research, education and certification programs to information professionals. AIIM represents the entire information management community: practitioners, technology suppliers, integrators and consultants. AIIM runs a series of training programs, including the Capture training course: www.aiim.org/training/capture-course.

About the author

Doug Miles is head of the AIIM Market Intelligence Division. He has over 25 years' experience of working with users and vendors across a broad spectrum of IT applications. He was an early pioneer of document management systems for business and engineering applications, and has produced many AIIM survey reports on issues and drivers for Capture, ECM, Email Management, Records Management, SharePoint and Social Business. Doug has also worked closely with other enterprise-level IT systems such as ERP, BI and CRM. He has an MSc in Communications Engineering and is a member of the IET in the UK.

2012 AIIM, 1100 Wayne Avenue, Suite 1100, Silver Spring, MD 20910, +1 301-587-8202, www.aiim.org
2012 Parascript LLC, 6273 Monarch Park Place, Longmont, CO 80503, +1 303-381-3100, www.parascript.com

Table of Contents

About the research
  About the research
  Process used and survey demographics
  About AIIM
  About the author
Introduction
  Introduction
  Key findings
Drivers and Adoption for Capture
  Non-Adopters - Forms Scanning
  Non-Adopters - OCR
Outsourcing
  Drivers for Outsourcing
Capture and Recognition Strategies
  Central vs. Distributed
  Data Capture and OCR
  Levels of OCR/ICR
  OCR Performance
  Currency of OCR/ICR Software
Hand-Writing Recognition
  Drivers
  ICR Adoption
Conclusion and Recommendations
  Recommendations
  References
Appendix 1: Survey Demographics
  Survey Background
  Organizational Size
  Geography
  Industry Sector
  Job Role
Underwritten in part by
  Parascript LLC
  AIIM

Introduction

Forms processing and character recognition is not a new technology. In fact the first application was in mailing address recognition for sorting machines over 40 years ago. From that day to this the challenge has been to scan, clean, analyse and match the characters fast enough to provide a sufficiently accurate recognition at a suitable document throughput rate, and these two factors are always a trade-off. This kind of image processing is very compute intensive, and also lends itself to multi-processing. In view of the advances over the last 5 years in multicore processors and the sheer compute power in even the most basic of servers, we can see that recognition technology is likely to have made dramatic strides in both performance and accuracy.

Sophisticated character analysis algorithms have been steadily refined whilst throughput keeps up with the fastest modern scanners. On-the-fly layout recognition can quickly adapt to mixed form types whilst superfast look-up arrays have extended character-validation to full-word context checking. In addition, multi-pass voting techniques improve ambiguous matches, and the long-term goal of handprint recognition and even cursive script recognition is now well within the capabilities of some systems.

However, because the core technology has been around for a long time, existing users can easily be complacent about the performance of their existing capture suites, and more importantly, may not recognize new potential applications that can be capture-enabled if the latest recognition and server technology is used. If the existing capture operation is separated from the downstream line-of-business process, or if it is outsourced, the process owner may not be aware that such new possibilities exist.

In this report we will review the different levels of forms-processing in use and the issues and potential benefits of character recognition.
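The multi-pass voting and word-level context checking mentioned in the introduction can be illustrated with a minimal Python sketch. It assumes that several recognition passes each return a candidate string of the same length for one field, takes a majority vote per character position, and then compares the result against a small look-up list. The function names, the sample readings and the word list are hypothetical; the report does not describe how any particular capture product implements these steps.

```python
from collections import Counter
from difflib import get_close_matches


def vote_characters(readings):
    """Majority vote per character position across equal-length readings."""
    if len({len(r) for r in readings}) != 1:
        raise ValueError("this sketch only handles equal-length readings")
    voted = []
    for chars in zip(*readings):
        # most_common(1) gives the character seen most often at this position
        voted.append(Counter(chars).most_common(1)[0][0])
    return "".join(voted)


def context_check(word, lexicon, cutoff=0.8):
    """Snap a word to the closest lexicon entry if one is similar enough."""
    matches = get_close_matches(word.upper(), lexicon, n=1, cutoff=cutoff)
    return matches[0] if matches else word


# Hypothetical output of three recognition passes over one hand-printed field
passes = ["L0NGMONT", "LONGM0NT", "LONGMONT"]
# Hypothetical look-up list used for full-word validation
lexicon = ["LONGMONT", "LOVELAND", "BOULDER"]

voted = vote_characters(passes)
print(voted)
print(context_check(voted, lexicon))
```

In this toy case each pass misreads a different letter "O" as a zero, and the vote recovers the intended word because the other two passes agree at that position. The look-up step is a separate safety net for cases where the vote itself is still wrong.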
We will explore the awareness and take-up of the latest technologies, particularly ICR (Intelligent Character Recognition), as generally applied to hand-writing, and compare this to the more traditional OCR (Optical Character Recognition), which is usually restricted to machine printed text.

Key Findings

88% of survey respondents scan forms but only 32% do text recognition. 55% workflow scanned images and manually re-key the data.
Localized decision-making is given as the main reason for non-adoption of forms scanning, followed by a lack of designated owner.
42% have half or more of their forms with handwritten data fields. For 38% half or more of their forms have hand-written name and address fields, and 32% have hand-written free text or open-ended data fields.
12% use ICR to recognize hand-printed constrained field entries. 6% use ICR to recognize hand-written script and free-form entries.
An average productivity improvement of 34.8% was considered possible if recognition of hand-written text could be automated. 36% of respondents would expect a 50% or more improvement.
In 26% of organizations, hand-written script fields play a key role in the efficiency of their business processes. A further 3% consider them to be quite important.
Non-users of recognition cite hand-filled form fields and technology reservations. But 43% haven't evaluated ICR lately.
60% of outsource users have never been offered hand-writing recognition, or have never asked. 13% feel that their outsource does not have up-to-date technology.
42% of users last updated their recognition software 3 or more years ago. 13% last updated 5 or more years ago.
Across OCR and ICR, 44% are achieving a 95% or better non-intervention rate per scanned form. 61% are getting 90% or better.

Drivers and Adoption for Capture

Every business has forms. The bigger the business, the more forms. Each form will relate to a process, and each process will involve employees entering and processing the data on each form. As we know from previous AIIM reports [1], scanning forms and moving image files rather than paper will improve productivity and speed up response. It provides an electronic workflow that can be readily monitored and managed, thereby eliminating bottlenecks in the process and improving transit times.

The next step in productivity improvement is to remove the need for keying the data from each form into the process application. Although this would seem to be an obvious step, it is not as widely adopted as might at first be thought. Whereas 88% of respondents to our survey scan forms, only 32% use OCR to recognize machine-written text, and only 12% recognize any level of hand-written text. As we will see later, most processes derive considerable value from handwritten forms data.

Figure 1: How do you pre-process forms coming into your business unit, or generated within it? (Check all that apply, including outsourced services) (N=255: part 1)
[Bar chart. Categories: We do not scan any forms; We scan forms and documents for archive; We scan forms and workflow the images, re-keying the data; We OCR scanned forms for machine-written text fields; We use ICR to recognize hand-printed field entries.]

Many potential users consider that there is a distinct point below which there are insufficient forms coming into the business to justify OCR automation. In fact, the correlation is not as strong as one might think. There is a degree of inversion at 1,000 forms per day, but 26% of OCR users are processing 100 forms per day or less. There are also organizations processing many thousands of forms per day who are not using OCR.

Figure 2: How many forms do you estimate you process daily on average in your business unit? (N=193)
[Bar chart comparing non-OCR users with OCR users. Categories: Less than 50 forms; 50-100 forms; 100-500 forms; 500-1,000 forms; 1,000-5,000 forms; 5,000-10,000 forms; 10,000-25,000 forms; >25,000 forms.]

Overall, 29% of our respondents are processing more than 1,000 forms per day (250,000 per year), rising to 42% of the largest organizations.

Looking in more detail at different levels of IMR (Intelligent Mark Recognition), OCR and ICR activity, we can see that around half of those deploying OCR for machine-written text use it for invoice automation (AP, Accounts Payable), although the most popular forms scanning application is actually check (cheque) scanning.

Figure 3: How do you pre-process forms coming into your business unit, or generated within it? (Check all that apply, including outsourced services) (N=255: part 2)
[Bar chart. Categories: We scan forms for barcodes and/or check boxes (IMR); We OCR invoices for AP automation; We scan checks/cheques; We OCR scanned forms for machine-written text fields; We OCR faxed forms for machine-written text fields; We use ICR to recognize hand-printed constrained field entries; We use ICR to recognize hand-written script and free-form entries.]

Non-Adopters - Forms Scanning

The main reason given for not adopting forms-scanning is localized decision-making within individual departments and processes, and this is particularly the case in mid-sized businesses. Even where a common, centralized requirement for scanning can be established, leadership will often fall between IT, records and facilities management. There is also confirmation here that a certain level of forms throughput is considered necessary before the technology becomes cost effective, a situation that can be improved by centralizing mail deliveries into a single address.

Figure 4: What are the main reasons you have not adopted forms scanning for your processes? (Max TWO) (N=23 non-users)
[Bar chart. Categories: We are planning it now; Each department/process/location makes its own decisions; No one is tasked to evaluate it; Not enough forms processed to be worth it; We don't think there is a big enough ROI; Too many different types of form; We process enough forms as a group, but are not centralized enough; Management prefers the traditional ways.]

Non-Adopters - OCR

The main reason given for not adopting recognition software is the feeling that it is difficult to accommodate multiple forms layouts. Although a certain amount of pre-setup is required for each form, a modern capture system will make this very easy. Multiple form types can then be fed from the scanner in a mixed feed, and the capture software will automatically separate and detect each form-type and its layout, and use the correct template to find the fields. The next most likely reason is the difficulty of dealing with hand-written fields, and we will deal with that in detail later.

Figure 5: What are the main reasons that you do not use recognition technologies to capture forms data? (Max TWO) (N=87 non-users)
[Bar chart. Categories: We are planning to do so; We have too many different types of form; Our forms are mostly filled-in by hand, so hard to recognize; We don't think the technology is good enough overall; Haven't gotten around to evaluating the technology; Each business unit makes its own decision so we don't have critical mass; We don't think the business case is strong enough; We don't process enough to make it worthwhile; We only need to archive the forms - no real process.]

Outsourcing

As we might expect, the largest organizations are nearly 3 times more likely to outsource their forms-processing than the smallest, and they process around five times as many forms overall. Mid-sized companies reflect a balance between these two, although they are likely to outsource a higher percentage of their forms.

Figure 6: What percentage of your forms throughput do you outsource? (N=240, excl. 10 Don't Know)

Size of organization | Average no. of forms/day | Use outsource | % of those outsourcing 25% or more of forms | % of those outsourcing 50% or more of forms
10-500 emps | 1,216 | 14% | 58% | 50%
500-5,000 emps | 2,349 | 27% | 77% | 59%
5,000+ emps | 6,787 | 41% | 74% | 39%
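The last two columns of Figure 6 are conditional on the "Use outsource" column, so it can help to multiply them through. The short Python sketch below does this using only the values in the table. The variable and function names are ours, and reading the 25% and 50% columns as shares of outsourcing organizations (rather than of all organizations) is an assumption based on the column headings.

```python
# Figure 6 values per size band:
# (average forms/day, share using outsource,
#  share of outsourcers sending out >=25% of forms, >=50% of forms)
figure6 = {
    "10-500 emps":    (1216, 0.14, 0.58, 0.50),
    "500-5,000 emps": (2349, 0.27, 0.77, 0.59),
    "5,000+ emps":    (6787, 0.41, 0.74, 0.39),
}


def share_of_all_orgs(use_outsource, share_of_outsourcers):
    """Share of ALL organizations in a size band, not just the outsourcers."""
    return use_outsource * share_of_outsourcers


for band, (forms, use, q25, q50) in figure6.items():
    print(f"{band}: {share_of_all_orgs(use, q25):.1%} outsource >=25%, "
          f"{share_of_all_orgs(use, q50):.1%} outsource >=50%")

small, large = figure6["10-500 emps"], figure6["5,000+ emps"]
print(f"likelihood of outsourcing, largest vs smallest: {large[1] / small[1]:.1f}x")
print(f"forms per day, largest vs smallest: {large[0] / small[0]:.1f}x")
```

Under that reading, roughly 8% of the smallest organizations, 21% of mid-sized ones and 30% of the largest outsource a quarter or more of their forms. The two ratios come out at about 2.9 and 5.6, in line with the "nearly 3 times" and "around five times" quoted in the text.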

Drivers for Outsourcing

The main driver for full outsourcing is that it is not a core function. The possibility that an outsource might have better equipment or expertise is not such a key issue, and nearly 25% have plans to bring forms processing back in-house.

Figure 7: What percentage of your forms throughput do you outsource? (N=17, fully outsource users)
[Bar chart. Categories: We are planning to bring it back in-house; It's not a core function for us; Bottom-line cost per form; Cheaper labor; Saves space in our buildings; They have bigger and better equipment; They have better expertise in recognition; They're better equipped to manage peak throughput.]

There is also a suspicion that once an outsource contract is in place the level of potential OCR/ICR capability is not upgraded, with 40% not being offered a higher level of recognition and 20% not asking about it. In addition, 13% feel that their outsource may be using old or outdated equipment. This indicates that outsourcers may be missing out on the potential of an increased level of capture and a greater value-add, particularly with regard to hand-writing recognition.

Figure 8: Do you think that the level of ICR/OCR technology at your outsource is hampering the degree of handwritten recognition they can do for you? (N=17, fully outsource users)
[Bar chart. Categories: Yes, they don't replace/upgrade their equipment that often; Don't really know - they've never offered us a higher-level capability; Don't really know - we've never asked them about a higher-level capability; No, they have the latest systems and technologies; It's not really applicable to what they do for us.]

Capture and Recognition Strategies

Central vs. Distributed

Although there has been some movement over the last few years away from centralized scanning towards distributed scanning using desk-top scanners and MFPs, more recently we have seen many organizations, particularly larger ones, move back to a centralized model, primarily to create a digital mailroom concept where incoming forms and mail are scanned on entry to the business and distributed electronically. As well as keeping paper out of the business, this can also justify a greater level of investment in associated capture and recognition technology. In this survey, 24% are using a digital mailroom for their forms scanning, and overall, 4% more are centralized compared to distributed. Those using a digital mailroom are more likely to funnel the majority of their forms through it.

Figure 9: What is the primary forms-scanning mechanism in your business unit for process input? (N=216, excl. 17 using outsource and 23 not scanning forms)
[Bar chart. Categories: Centralized digital mailroom; Centralized point of process; Distributed desk scanners; Distributed - MFPs; Ad hoc.]

Data Capture and OCR

Taking a closer look at how scanned forms are subsequently processed, we see that around 20% are simply scanning direct to archive, either before or after a paper-based process. Even amongst the largest organizations, 17% are scanning forms but have no capture functionality.

Figure 10: Do you capture data from your scanned forms? (N=195 scanning users)
[Bar chart by organization size (10-500 emps; 500-5,000 emps; 5,000+ emps). Categories: No, simply archive them that way; No, we re-key the data from the scanned forms; Yes, we recognize data using OMR and/or OCR and/or ICR.]

Amongst smaller organizations 45% are using data recognition rather than data re-keying, but more than a third are scanning their forms and re-keying the data from the scanned image. Only 15% of larger organizations are working this way, with two-thirds doing some form of data capture.

Even where data capture is being used, the most common application is for archive indexing (64%). Similar indexing procedures for workflow routing, and for search-term extraction, are also common. 40% use the data captured from the form to partially or fully populate the process, with data capture to financial processes likely to be the most popular application.

Figure 11: What uses do you make of captured forms data? (N=102 capture users)
[Bar chart. Categories: Indexing for archive; Indexing for routing to workflow or processes; Indexing for keyword or free text search; Data capture to financial processes (AP/AR/Checks); Partial forms data capture to other processes (e.g. IDs, address fields, personal data); Full forms data capture to process.]

Levels of OCR/ICR

Traditional OCR of machine text using character-matching techniques is by far the most popular method in use. Along with the much simpler optical mark or barcode recognition, this is the highest level of sophistication for two thirds of our recognition users. We can add to this 8% who are using the parametric analysis techniques known as ICR to capture machine text. 27% are recognizing hand-writing in some form, mostly as constrained hand-print where the person filling out the form needs to keep within a box or marker for each character. 6% are utilizing unconstrained hand-writing recognition and/or cursive script, which tend to be much more challenging for the recognition software.

Figure 12: What is the highest level of recognition that you use? (N=102 capture users)
[Bar chart. Categories: Barcode and OMR (mark recognition); OCR machine-text (character matching); OCR constrained hand-print; ICR on machine-text (parametric recognition); ICR constrained hand-print; ICR free entry hand-print; ICR cursive script.]

OCR Performance

An important aspect of any data capture operation is the level of straight-through recognitions that don't require QA intervention or manual re-keying. This can be measured on a character-by-character, field-by-field or form-by-form basis. We compared users' assessments for field-by-field and form-by-form and found little difference in reported results.

Figure 13: For your OCR/ICR technology, what recognition failure rate/QA intervention rate would you say you are getting on a form-by-form basis? (N=75 OCR/ICR users, excl. 21 Don't Know)
[Bar chart. Categories: 2% or less failures; 3% failures; 5% failures; 7% failures; 10% failures; 15% failures; 20% failures; 25% failures; 30% failures; 40% failures; 50% or more failures.]

More than a quarter have a very low intervention rate of just 2% or less. 56% are achieving a failure rate of 5% or less on a form-by-form basis, i.e. 95% of forms processed without intervention. 79% achieve a fail rate of 10% or less. There is evidence of some complacency in monitoring recognition performance, as only 25% regularly calibrate the performance of their scanners and OCR using standard test pieces.

Currency of OCR/ICR Software

Obviously, in this survey we do not know how difficult the forms are to recognize or how many fields there are on each form. We can, however, take a view on how many are using the latest recognition software. We can see from Figure 14 that 58% are up-to-date, at least to the capability they can afford. 29% last updated 3 years ago, and 13% updated 5 or more years ago. This could explain the long tail in recognition failure rates, and also has a bearing on users' experience of hand-writing recognition.

Figure 14: How would you describe the sophistication of your recognition software? (N=97 OCR/ICR users, excl. 5 Don't Know)
[Bar chart with median marked. Categories: Latest, state of the art, updated regularly; Best we can afford, updated regularly; Purchased/updated 3 years ago; Purchased/updated 5 years ago; Purchased/updated more than 5 years ago.]
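The form-level rates in the OCR Performance section (56% at 5% failures or less, 79% at 10% or less) are quoted on the 75 users who gave an answer, whereas the Key Findings quote 44% and 61% for the same thresholds. One possible explanation, which is our assumption and is not stated in the report, is that the Key Findings use all 96 OCR/ICR users as the base, including the 21 "Don't Know" responses. A small Python sketch of that re-basing, with a hypothetical helper name:

```python
def rebase(share, answered, dont_know):
    """Convert a share of those who answered into a share of everyone asked."""
    return share * answered / (answered + dont_know)


ANSWERED, DONT_KNOW = 75, 21  # taken from the Figure 13 caption

for label, share in [("5% failures or less", 0.56),
                     ("10% failures or less", 0.79)]:
    rebased = rebase(share, ANSWERED, DONT_KNOW)
    print(f"{label}: {share:.0%} of answers -> {rebased:.1%} of all OCR/ICR users")
```

This gives roughly 44% and 62%, close to the 44% and 61% in the Key Findings. The one-point gap on the second figure is within what rounding of the published percentages could produce, but the reconciliation remains a reading of the numbers rather than something the report confirms.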

Hand-Writing Recognition

As we outlined earlier, recognition of hand-writing is very compute-intensive and considerable advances have been made of late in terms of performance and throughput, although some vendors have made greater strides than others. More importantly, we need to look at the drivers and potential benefits of recognizing hand-written form fields.

Drivers

In most organizations, hand-written fields are prevalent on a significant number of their forms, with around a third having free-form, unconstrained fields on half or more of the forms they process.

Figure 15: How many of the forms processed by your business unit (or outsource) would you say have hand-written fields for: name and address, other textual data fields, unconstrained free-form data? (N=219 excl. 31 don't knows)

Hand-written fields | None | 25% or more of forms | 50% or more of forms
Name and address fields | 20% | 57% | 38%
Other data fields | 11% | 62% | 42%
Free text/open-ended data | 12% | 56% | 32%

Not only are these hand-written fields prevalent, they are also important to the efficiency of the process. In 20% of organizations they play a key role, and in a further 40% they are quite important, including the free form script fields. As we enter the big data era, the contents of these comment fields are becoming even more important as organizations look to glean all kinds of information for product improvement, sentiment analysis, fraud detection, etc.

Figure 16: How important are the contents of the following to the efficiency of your business processes? (N=224 excl. 29 don't knows)
[Stacked bar chart (Play a Key Role; Quite Important; Not That Important). Categories: Hand-printed addresses; Other hand-printed data fields; Hand-written/script fields for keyword extraction; Hand-written/script fields for full data extraction.]

As we might expect, therefore, our survey participants estimate that they would achieve a considerable productivity saving if they were able to automate the recognition of hand-written text. The average estimate is 34.8% improvement, with a median at 23%. 36% would expect a 50% or more improvement.
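The gap between the average estimate (34.8%) and the median (23%) suggests a skewed distribution, where a minority of very high estimates pulls the mean above the midpoint. The sketch below shows the effect using Python's statistics module. The ten responses are invented purely for illustration and are not the survey data.

```python
from statistics import mean, median

# Hypothetical productivity estimates (% improvement) from ten respondents.
# These values are made up for illustration; they are NOT the survey responses.
estimates = [0, 5, 10, 15, 20, 25, 33, 50, 75, 100]

# Share of respondents expecting an improvement of 50% or more
share_50_plus = sum(e >= 50 for e in estimates) / len(estimates)

print(f"mean   = {mean(estimates):.1f}%")
print(f"median = {median(estimates):.1f}%")
print(f"share expecting 50% or more = {share_50_plus:.0%}")
```

With these made-up values the mean lands well above the median, the same pattern the survey reports. When judging a typical expectation, the median is usually the safer figure to quote.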

Figure 17: How much more productive would you estimate that your admin staff would be (or are) if you could automate (or have automated) the recognition of hand-written text? (N=252)
[Bar chart with median marked. Categories: No more productive; 5% more productive; 10% more productive; 15% more productive; 20% more productive; 25% more productive; 33% more productive; 50% more productive; 75% more productive; 100% (Twice as productive); More than twice as productive.]

ICR Adoption

We asked the general survey sample for their assessment of current recognition technology for hand-written text. Overall, 20% are positive in their assessment, and a similar number feel it works well on constrained text. A third admit that they don't know, as they haven't evaluated it lately. Non-OCR users are likely to be much more sceptical of ICR technology (10% positive compared to 31% of OCR users) and are much less likely to be basing their view on a recent evaluation (43% have not evaluated it recently compared to 22% of OCR users).

Crucially, the main reason that most users are not doing hand-writing recognition is that they do not have ICR-capable software, followed by a lack of willingness to evaluate it. Doubt about the potential results is much lower down the list, and is centered on form design and content.

Figure 18: If you do not use hand-writing recognition, what are the main reasons? (Max TWO) (N=60 OCR but not ICR users)
[Bar chart. Categories: We don't have ICR-capable software; It's not a priority to investigate; We don't have enough hand-written content on our forms; The unrestrained hand-written content is too variable; The ICR software we have doesn't give good enough results; We need to re-design our forms to get better results; Not cost-effective compared to manual keying; Our ICR software is rather old.]

As we saw earlier in Figure 3, 12% of overall respondents are using ICR to recognize hand-printed constrained field entries, and 6% use ICR to recognize written script and free-form entries.

For these ICR users, the biggest beefit is time savig o fixed data such as ames ad addresses, ad these are relatively easy fields to validate agaist a ames ad address database. This is followed by variables data such as educatioal qualificatios, medical pre-coditios, claims history, etc., where a degree of cotextual validatio ca be applied. Extractig keywords from free text fields is also a importat beefit for autoclassificatio ad search. Figure 19: If you use had-writig recogitio (ICR), what are the mai beefits i your applicatio? (check those that apply) (N=25 ICR users) Saves me o keyig fixed data, eg. ames ad addresses Saves me o keyig variables data, eg. qualifica os, pre-codi os, claims history, etc. 0% 10% 20% 30% 40% 50% 60% 70% Provides keywords for retrieval search Drives rou g or excep o workflows Provides keywords for research/aaly cs ICR gives be er results tha OCR o poor machie text Coverts/recogizes commets for re-prit, eg, delivery istruc os Coclusio ad Recommedatios Util such time as we are all equipped with tablets ad e-forms, or withi easy reach of a always-o computer, paper forms will be the backboe of iformatio gatherig ad process iput. Scaig forms for archivig or image workflow is a widely accepted way of reducig storage space, improvig access ad speedig up processes. However, we have see that may orgaizatios are slow to take the ext step, which is to replace maual data-keyig with recogitio software, ad automatically trasfer data ito the routig or idexig egie, or better still, ito the process itself. A frequetly give reaso is the difficulty of recogizig had-writte text, ad we have foud that had-writte address, data ad free-format fields play a importat role i most busiess processes ad a icreasig oe as orgaizatios seek to exploit the big data they may cotai. We also foud that a sigificat proportio of busiess forms, o matter how well desiged, still cotai a sigificat umber of had-writte field etries. 
Over and above that is the disconnected decision-making in many organizations that makes it very difficult to consolidate scanning and capture requirements across multiple departments or processes, particularly with regard to implementing a digital mailroom scenario.

For those organizations that use OCR to recognize machine-printed text, performance is on the whole very good, with the percentage of hands-free throughput in the upper nineties. It is generally acknowledged that the accuracy of OCR on machine text will usually be higher than human re-keying. However, we have found that many organizations have not upgraded or refreshed their OCR software for some years so may be falling behind the curve of what is now possible. For many, the output of data capture is used merely to automate the indexing for routing and search, rather than to feed the process, and there is still much scanning that takes place at the end of the process, allowing paper to hold sway within the process itself.

We have found that although users understand the potential benefits of hand-writing recognition in terms of a substantial improvement in process efficiency of around 30%, there is a level of both perception and complacency that is based on out-of-date evaluations of how well a modern ICR hand-writing recognition system can work and indeed how much it might cost. In many cases, the agency for scanning and capture, whether it is an in-house unit or an outsource bureau, is not exploring the possibility with the business process managers for automatically capturing these very useful free-format fields. As might be expected in such a demanding application, there is also considerable variation in the sophistication of ICR algorithms embedded within the main capture system products.
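The report notes that fixed fields such as names and addresses are relatively easy to validate against a reference database. As an illustration only, the following minimal Python sketch shows one way such a look-up validation might work. The reference list, the function name `validate_field`, the status labels and the 0.8 similarity cutoff are all assumptions made for this example. It relies on the standard-library `difflib` module and does not represent the method of any particular capture product.

```python
from difflib import get_close_matches

# Hypothetical reference data; a real system would query a
# names-and-address database instead of an in-memory list.
KNOWN_SURNAMES = ["Anderson", "Johnson", "Thompson", "Henderson"]


def validate_field(icr_text, reference, cutoff=0.8):
    """Check one recognized field value against reference data.

    Returns a (value, status) pair, where status is:
      "accepted"      - exact match in the reference data
      "corrected"     - close match found, value replaced by it
      "manual_review" - no close match, route to a human operator
    """
    if icr_text in reference:
        return icr_text, "accepted"
    # Closest reference entry whose similarity ratio meets the cutoff.
    matches = get_close_matches(icr_text, reference, n=1, cutoff=cutoff)
    if matches:
        return matches[0], "corrected"
    return icr_text, "manual_review"


if __name__ == "__main__":
    for raw in ["Johnson", "Jchnson", "Xyzzy"]:
        print(raw, "->", validate_field(raw, KNOWN_SURNAMES))
```

Values that cannot be matched are passed to manual review instead of being silently accepted, so only the exceptions need keying. The cutoff is a tuning choice: a lower value corrects more fields automatically but raises the risk of a wrong substitution.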

Recommendations

- Ensure that there is a clear responsibility in your organization for pursuing paper-free processes.
- Consult with process owners to consolidate requirements, particularly if a digital mailroom solution serving multiple business processes might be appropriate.
- If you are not currently scanning forms at all, re-evaluate the reasons and include all of the benefits of paper-free processes: visibility, accessibility, speed of response, mobilization.
- If you are scanning forms but not capturing data through OCR, evaluate savings in keying costs, speed improvements, and quality of data.
- Do not assume that you do not have sufficient forms to be cost-effective, or that you have too many different types of form. Centralizing all mail processing can change the tipping point, and the cost/performance ratio of OCR technology has dramatically improved over the last few years.
- If the prevalence of hand-written fields on your forms has put you off automating your capture, or if you are currently using OCR for partial capture and ignoring valuable hand-written content, take a fresh look at handwriting recognition and the latest ICR capabilities. ICR could also improve your machine-text recognition.
- Collect a number of examples of both typical and demanding forms, with mixtures of machine text and handwriting, and have different capture vendors show how well they can capture the data. Be prepared to provide supporting data for look-up and validation. Ask about mixed feeding of form types.
- If you are using a bureau or DPO (Document Process Outsource), ask them if they have an up-to-date ICR capability that could further improve the level of capture they offer.
- If you are a bureau or DPO, have you geared up your capabilities to offer the maximum value add as far into the process as possible?

References

1. AIIM Industry Watch, "The Paper Free Office - Dream or Reality", Feb 2012, http://www.aiim.org/research/industry-watch/paper-free-capture-2012

Appendix 1: Survey Demographics

Survey Background

The survey was taken by 324 individual members of the AIIM community between 09 March 2012 and 29 March 2012 using a web-based tool. Invitations to take the survey were sent via email to a selection of the 65,000 AIIM community members.

Organizational Size

Organizations of 10 employees or less are excluded from all of the results in this report. On this basis, larger organizations (over 5,000 employees) represent 31%, with mid-sized organizations (500 to 5,000 employees) at 34%. Small-to-mid sized organizations (10 to 500 employees) are 37%.

- 11-100 emps: 13%
- 101-500 emps: 22%
- 501-1,000 emps: 8%
- 1,001-5,000 emps: 26%
- 5,001-10,000 emps: 11%
- over 10,000 emps: 20%

Geography

US and Canada make up 69% of respondents, with 18% from Europe.

- US: 55%
- Canada: 14%
- UK & Ireland: 8%
- Western Europe: 8%
- Eastern Europe, Russia: 2%
- Middle East, Africa: 2%
- Australasia, South Africa: 4%
- Asia, Far East: 4%
- Mexico, Central/S. America: 3%

Industry Sector

Local government and public services represent 18%, and national government 6%, reflecting a long history of forms processing in the government sector. Finance, banking and insurance represent 25%. ECM suppliers and outsource bureaus have been excluded. The remaining sectors are evenly split.

- Government & Public Services - Local/State: 18%
- Finance/Banking: 16%
- Insurance: 9%
- Government & Public Services - National: 6%
- Healthcare: 6%
- Education: 5%
- Non-Profit, Charity: 5%
- Manufacturing, Agriculture: 4%
- Mining, Oil & Gas: 4%
- Professional Services & Legal: 4%
- Power, Utilities, Telecoms: 4%
- IT & High Tech (not ECM): 3%
- Retail, Transport, Real Estate: 3%
- Consultants: 3%
- Engineering & Construction: 3%
- Pharmaceutical & Chemicals: 2%
- Media, Publishing, Web: 1%
- Other: 4%

Job Role

Records or Information Management disciplines make up 39% compared to 37% from IT. Line-of-business managers and business consultants make up 23%.

- Records or document management staff: 24%
- Head of records/compliance/information management: 15%
- IT staff: 16%
- IT Consultant or Project Manager: 16%
- Head of IT: 5%
- Line-of-business executive, department head or process owner: 9%
- Business Consultant: 8%
- President, CEO, Managing Director: 3%
- Other, please specify: 3%

UNDERWRITTEN BY

Parascript LLC

Parascript is a global leader in developing cursive, handprint, and machine print recognition technology. Leveraging digital image analysis and advanced pattern recognition, its software enables critical business automation in areas like forms processing, postal and financial automation, fraud prevention and medical imaging. Parascript's award-winning technology draws on a proven 15+ year track record and processes billions of critical document images annually. Through its partner network, it enables companies to save money by eliminating costly manual data entry with automated recognition that accurately, securely and quickly turns characters into useful data.

Parascript recognition technology reads all text styles, whole words or phrases, and deciphers poor quality machine print and handwriting unreadable by other recognition engines. Parascript recognition products are designed to be configurable and supported by its worldwide network of integrators, original equipment manufacturers and value-added resellers. Fortune 500 companies, postal operators, major government, and financial institutions rely on Parascript products, including the U.S. Postal Service, IBM, Bell and Howell, Fiserv, Selex Elsag, Lockheed Martin, NCR, Siemens, and Burroughs.

Visit Parascript online at http://www.parascript.com.

AIIM (www.aiim.org) is the global community of information professionals. We provide the education, research and certification that information professionals need to manage and share information assets in an era of mobile, social, cloud and big data.

Founded in 1943, AIIM builds on a strong heritage of research and member service. Today, AIIM is a global, non-profit organization that provides independent research, education and certification programs to information professionals. AIIM represents the entire information management community, with programs and content for practitioners, technology suppliers, integrators and consultants.

© 2012

AIIM
1100 Wayne Avenue, Suite 1100
Silver Spring, MD 20910
301.587.8202
www.aiim.org

AIIM Europe
The IT Centre, Lowesmoor Wharf
Worcester, WR1 2RR, UK
+44 (0)1905 727600
www.aiim.eu