Form Design Guidelines. Part of the Best-Practices Handbook.




The general layout of the form is very important to both human users and the recognition system. The general principle is to design a form that is user-friendly and at the same time meets the requirements of image-based recognition systems. Many of the requirements that make forms easy for humans are compatible with those of recognition systems.

There are three steps towards good form design:

- Deciding What to Capture
- Designing the Physical Layout of the Form
- Designing the Data-Entry Fields

IT CONSULTING
Managing projects, supporting implementations and roll-outs onsite, combined with expert knowledge.

CUSTOM DEVELOPMENT
Developing tailor-made templates designed for maximized recognition and automated workflow processing.

TRAINING
Knowledge is power. Knowing how to use software is key to getting the greatest benefits. Capture-Experts provides custom-sized training sessions.

Introduction to Form Design

A well-designed form speeds up and improves the reliability of any Intelligent Character Recognition process, so achieving efficient data recognition begins with correctly designing a master form. A form works well if it has the following characteristics:

- It is easy for the user to fill out.
- It uses as few methods as possible for collecting the information. Examples of different methods are multiple-choice questions, yes/no questions, constrained answers, and unconstrained answers.
- The data fields are clearly defined to encourage answers that are correctly formatted.
- The instructions are written in clear, simple language.

Deciding What to Capture

The first phase in designing a form for ICR processing is to decide what data needs to be captured from it. In general, the following steps should be taken:

- Identify the fields that will require Intelligent Character Recognition (ICR).
- List the fields by field name and identify the number of characters required for each field. Note: A date field should have six to eight characters representing day, month, and year. Other fields, such as phone and fax numbers and postal/zip codes, have an exact number of characters. For name or address fields, use the maximum number of characters that you would expect to see in that field.
- Group the fields you have listed according to function or type. A good form will contain as many yes/no and multiple-choice questions as possible.
- Decide on the headings for each group of data on the form.
- Write out the instructions and examples you want to use in the form:
  - Place long or detailed instructions on the back of the form or in a separate document, and provide a simple direction to the instructions on the face of the form.
  - Keep the instructions and examples simple and use plain language. A short instruction such as "PLEASE PRINT USING BLOCK CAPITAL LETTERS" can prove invaluable; something like "PLEASE PRINT WITH BLACK INK" can also be helpful.
  - If space allows, it is also useful to give a small example of a correctly filled-in field.
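The field inventory described above can be written down as a small table before any layout work starts. The following is a minimal sketch in Python; the `FieldSpec` class, the field names, the groups, and the lengths are hypothetical illustrations and are not taken from the handbook or from any recognition product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldSpec:
    name: str       # field name as listed in the inventory
    group: str      # heading under which the field is grouped
    max_chars: int  # number of character positions to provide
    kind: str       # "numeric", "alpha", or "checkbox"


# Hypothetical inventory; all names and lengths are examples only.
FIELDS = [
    FieldSpec("last_name", "Applicant", 25, "alpha"),
    FieldSpec("birth_date", "Applicant", 8, "numeric"),  # e.g. DDMMYYYY
    FieldSpec("postal_code", "Address", 5, "numeric"),
    FieldSpec("newsletter", "Options", 1, "checkbox"),
]


def check_date_length(spec: FieldSpec) -> bool:
    """A date field should have six to eight characters."""
    return 6 <= spec.max_chars <= 8


def total_character_boxes(fields) -> int:
    """Count the character boxes needed, ignoring check boxes."""
    return sum(f.max_chars for f in fields if f.kind != "checkbox")
```

Counting the character boxes early gives a rough idea of how much space the form needs before a page size is chosen.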

Designing the Form

Once the data to be captured has been specified, the form can be laid out using a spreadsheet or a drawing application.

Choosing the Form Size and Weight

The size of the page used in the form design may be determined by the distribution method or by other considerations such as the number of fields on the form. The size of the form, the weight of the paper, and the anticipated volume of forms to be processed all play a role in determining which scanner is used to process the forms.

Design the size of the form so that the forms can be fed automatically into the scanner you have chosen. Scanners readily handle A4 or letter-size documents. If the form design requires smaller documents, special paper-handling capabilities may be required.

The weight of the paper selected can also influence paper handling. Most scanners specify the weight range that can be handled, so this should be checked. Paper that is too light can tear and fold; paper that is too heavy can cause document jams. A good selection for most purposes is 80 g/m².

Using White Space

White space provides a "buffer" around the meaningful data on a form, ensuring that the recognition system can locate data easily even if the document is scanned at a slight angle or offset.

- A margin of at least 1/4" (6.4 mm) around the entire form should be provided. A margin of 1/2" (13 mm) is recommended.
- The registration marks should be at least 1/2" (13 mm) away from the edge of the paper.
- The registration marks should also be at least 3/8" (9.5 mm) away from any other black items on the form, colored items that may show up during scanning, or areas where data is to be entered.
- 1/4" (6.4 mm) of clear space should be left around each recognition field. This clear space may include items printed in drop-out ink, such as field constraint lines.
- Specified areas for endorsement stamps, initials, or signatures should be included but should be placed as far away as possible from any recognition areas. If signatures are to be detected as part of the recognition, they should be well away from any other data capture areas. Signatures tend to stray outside the designated area and may overlap other fields, causing errors.
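The clearances listed above are given in inches, while layout tools often work in millimetres. The sketch below converts the stated values and classifies a page margin against the minimum and recommended figures. The constant and function names are illustrative assumptions; only the inch values come from the text.

```python
MM_PER_INCH = 25.4


def inches_to_mm(inches: float) -> float:
    """Convert inches to millimetres."""
    return inches * MM_PER_INCH


# Clearances stated in the guidelines, converted to millimetres.
MIN_PAGE_MARGIN_MM = inches_to_mm(1 / 4)     # minimum margin around the form
RECOMMENDED_MARGIN_MM = inches_to_mm(1 / 2)  # recommended margin
MARK_TO_EDGE_MM = inches_to_mm(1 / 2)        # registration mark to paper edge
MARK_TO_OTHER_MM = inches_to_mm(3 / 8)       # registration mark to other items
FIELD_CLEAR_MM = inches_to_mm(1 / 4)         # clear space around each field


def margin_status(margin_mm: float) -> str:
    """Classify a page margin against the minimum and recommended values."""
    if margin_mm < MIN_PAGE_MARGIN_MM:
        return "too small"
    if margin_mm < RECOMMENDED_MARGIN_MM:
        return "acceptable"
    return "recommended"
```

For example, `margin_status(10)` reports a 10 mm margin as acceptable but below the recommended 1/2".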

Designing the Form: Other Considerations

A well-designed form will also include registration marks, or targets, to help the recognition system detect and correct image skew or distortions caused by the scanner.

To optimize results, forms should be designed with two colors to allow for the use of drop-out ink. Drop-out ink allows inclusion of information that is visible to the human reader but is not needed for the recognition process. The information printed in a drop-out color is removed during the scanning process.

Many forms also include information that identifies the form. The recognition system can use this information, known as a document identifier or Form ID, to identify the form it is currently processing. It can then locate the appropriate template to read the data.

Finally, one of the most important steps in laying out the form is the design of the data-entry fields. The data-entry fields tell the user what data to enter and how to fill it in. The design of the fields can also help or hinder the system in locating meaningful data. The better the system can locate and identify the data in the image, the faster and more accurately it can read the data.

Registration Marks

What Is a Registration Mark?

All scanners skew forms as they travel through the feeder mechanism, resulting in image distortion and recognition problems. Registration marks are special markings on the page that aid the recognition system in de-skewing scanned images. Most recognition engines can detect registration marks on a form to aid with alignment and/or auto-detection of form type.

Several marks may be defined on an image:

- If one mark is defined, all other fields are aligned to it and keep their defined dimensions.
- If two marks are defined, the angle between the two marks is used to correct for skew of the image.
- Three marks can be used to de-stretch an image.
- Further marks (up to a maximum of ten per form type) may be defined to provide anchor points for fields on the image.

For Recognition, registration marks may be in the form of a box, a rectangle (solid box), a blob (any distinct mark not connected to any others), a cross, corners (two lines such as form the corner of a rectangle), junctions, lines, or even text. The variety of registration marks available means that most forms can be registered, even if they were not specifically designed with this in mind. Note that if two marks are defined, they need not be of the same type. Each registration mark should be fairly wide and well defined.

Registration marks can also be used to auto-detect form types, provided that there is some form of detectable mark in distinct positions on each of the forms to be recognized. However, it is usually preferable to use consistent registration marks across all form types and to achieve form-type detection with a unique numeric ID (discussed in more detail later).

Positioning Registration Marks

Registration marks should be placed on the form in such a way that the box they define encloses all fields to be read. The box defined by the registration marks should be wide and high enough to allow for the detection of image distortion. For example, if two registration marks are both located in the left-hand section of the image, the scale and slope calculations for fields on the right-hand side of the form become less accurate.

For best results:

- Use three well-separated marks. This allows all normal combinations of skewing and stretching of the image to be detected, and is essential for forms that may be faxed. Two marks may be sufficient in other cases.
- Place registration marks as far apart as possible on the form. The best location is in three corners of the form. By omitting a mark in one corner, it is possible to auto-detect documents that have been rotated 180 degrees when scanned.
- Place registration marks at least 1/2" (13 mm) away from the edge of the paper and at least 3/8" (9.5 mm) away from any other black items on the form, colored items that may show up during scanning, or areas where data is to be entered. The further in from the corners the marks are positioned, the less likely it is that they will be lost when corners of the paper are torn or folded.
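The two-mark and three-mark corrections described above can be illustrated with a little geometry. The sketch below is an assumption about how such a correction might be computed, not the algorithm used by any particular recognition engine. The function names are hypothetical, and points are `(x, y)` tuples in any consistent unit. With two marks it estimates the rotation between template and scan. With three non-collinear marks it builds an affine mapping that also accounts for stretching.

```python
import math


def skew_angle_deg(template_a, template_b, scanned_a, scanned_b):
    """Rotation of the scan relative to the template, from two marks."""
    t_angle = math.atan2(template_b[1] - template_a[1],
                         template_b[0] - template_a[0])
    s_angle = math.atan2(scanned_b[1] - scanned_a[1],
                         scanned_b[0] - scanned_a[0])
    # Normalize the difference to the range [-180, 180) degrees.
    diff = (s_angle - t_angle + math.pi) % (2 * math.pi) - math.pi
    return math.degrees(diff)


def affine_from_three_marks(template, scanned):
    """Return a function mapping template coordinates to scan coordinates.

    template and scanned are each three (x, y) points in matching order.
    """
    p0, p1, p2 = template
    q0, q1, q2 = scanned
    u1x, u1y = p1[0] - p0[0], p1[1] - p0[1]
    u2x, u2y = p2[0] - p0[0], p2[1] - p0[1]
    det = u1x * u2y - u2x * u1y
    if det == 0:
        raise ValueError("registration marks must not be collinear")
    v1x, v1y = q1[0] - q0[0], q1[1] - q0[1]
    v2x, v2y = q2[0] - q0[0], q2[1] - q0[1]
    # Matrix A = V * inverse(U), written out for the 2 x 2 case.
    a = (v1x * u2y - v2x * u1y) / det
    b = (v2x * u1x - v1x * u2x) / det
    c = (v1y * u2y - v2y * u1y) / det
    d = (v2y * u1x - v1y * u2x) / det

    def to_scanned(point):
        dx, dy = point[0] - p0[0], point[1] - p0[1]
        return (q0[0] + a * dx + b * dy, q0[1] + c * dx + d * dy)

    return to_scanned
```

The collinearity check reflects the advice above: marks clustered along one line or in one region of the page give a poorly conditioned correction, which is why three well-separated corners work best.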

Drop-Out Ink

What Is Drop-Out Ink?

Drop-out ink is used to provide information that is visible to the human reader but is removed during the scanning process. The term refers to a color of ink on a form that cannot be detected by the scanner because of its high reflective value (generally a reflectance greater than 60%). Any item printed in drop-out ink is not an item for the recognition system to process.

Drop-out ink should be used when printing character boxes, check boxes, and other items on a form in or near an area to be recognized. When a form is scanned, the box does not appear in the scanned image, leaving only the image of the printed character to be processed. The character can then be processed by Recognition without interference from the lines or other items on the form.

Drop-out ink should also be used for any punctuation in a field. Users filling out the form will understand better what is expected, without the punctuation interfering with the recognition process.

Drop-out ink should be used for any text that is within 1/4" (6.4 mm) of data that the recognition system must read.

An Important Note on Half-Toning

Printers often use a technique known as half-toning to produce the effect of light colors. A dark ink is printed as a block of very fine dots, so that to the human eye it looks like a lighter shade of the same color. This uses fewer ink colors and reduces printing costs.

Do not use this technique where drop-out is required. While to the human eye the result looks like a light shade, a modern scanner has a very high optical resolution and will see the individual dots. The result will usually be a very noisy image, with worse recognition than either true drop-out ink or solid non-drop-out colors.

Non-Drop-Out Forms

Though using drop-out ink usually increases the accuracy of the character recognition process, the disadvantage is that the printing process is more costly. If cost is critical, character boxes may still be printed in non-drop-out ink, in which case the background removal algorithms built into Recognition can be used to maintain recognition accuracy. To improve recognition, make sure the lines of the boxes are wide enough to be well defined on the scanned image (typically about 0.4 mm).

Also note that for faxed forms it is usually preferable to use non-drop-out forms, since faxes do not reliably ignore the drop-out ink.
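The 60% reflectance figure and the half-toning warning above can be made concrete with a small check on sampled pixel values. This is a rough sketch under a simplifying assumption: an 8-bit grayscale value divided by 255 is treated as an approximate reflectance, which real scanners only loosely follow. The function names are hypothetical, and an actual scan test remains the only reliable check.

```python
DROP_OUT_REFLECTANCE = 0.60  # rule-of-thumb threshold from the text


def approx_reflectance(gray_value: int) -> float:
    """Map an 8-bit gray sample (0 = black, 255 = white) to the range 0..1."""
    if not 0 <= gray_value <= 255:
        raise ValueError("gray_value must be in 0..255")
    return gray_value / 255


def likely_drops_out(gray_samples) -> bool:
    """True only if every sample of the ink patch exceeds the threshold."""
    samples = list(gray_samples)
    if not samples:
        raise ValueError("at least one sample is required")
    return all(approx_reflectance(g) > DROP_OUT_REFLECTANCE for g in samples)
```

A solid pastel patch sampled as `[200, 210, 190]` passes, while a half-toned patch sampled as `[230, 40, 230, 40]` fails because the individual dark dots are visible to the scanner even though the patch looks light to the eye.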

Drop-Out Colors

Drop-out colors are a function of the scanner used. Most scanners available today are insensitive to pastel blue, green, and yellow. Some scanners have special lenses available which allow them to drop out red. Since each scanner is different, the only way to be sure which color will drop out on a particular scanner is to run a test.

To test the scanner for drop-out colors:

- Obtain a copy of a "Pantone Matching System Printing Inks" (PMS) book. This book contains the printed color samples and mixing instructions for the color inks used as a standard by printers.
- Fan out the pages of this book so that all of the colors mixed at 30 parts white to 1 part color are showing.
- Place the fanned book face down on the flatbed of your scanner and scan.
- Make several passes, lowering the brightness setting each time.

By completing the above process, several colors that are acceptable to a particular scanner can be found. You might try the following colors:

- PMS 100 Yellow
- PMS 196 Red
- PMS 217 Red
- PMS 250 Lt. Purple
- PMS 263 Blue Purple
- PMS 277 Blue
- PMS 290 Blue
- PMS 304 Lt. Blue
- PMS 317 Lt. Green
- PMS 331 Green

Form IDs

If more than one type of form is going to be read by the system, a form ID feature should be included. There are three mechanisms that Recognition uses to perform form identification: registration marks, form size, and form ID strings.

Registration Mark Form ID

Registration marks can be used to auto-detect form types, provided that there is some form of detectable mark in distinct positions on each of the forms to be recognized.

Document Size Form ID

The size of the form, or rather of its image, can be used to determine its type. Note that this facility relies on the scanner automatically cropping images of documents to the correct size.

ID Strings

Printed text on a form can be used to identify the form type. This will typically be a document identifier, preferably numeric, printed in the same position on many different forms. The use of ID strings, preferably printed in OCR-B, is strongly recommended even when only one form type will initially be used in a system, since this allows additional forms or new variants to be introduced into the process at a later date without loss of recognition accuracy.
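Selecting a template from a numeric ID string, as recommended above, amounts to a simple lookup once the ID has been read. The sketch below assumes the ID string has already been recognized as text. The ID values, template names, and function names are hypothetical.

```python
# Hypothetical mapping from printed numeric form IDs to template names.
TEMPLATES = {
    "1001": "application_v1",
    "1002": "application_v2",
    "2001": "claim_v1",
}


def normalize_form_id(raw: str) -> str:
    """Keep digits only, dropping spaces or stray marks from the read."""
    return "".join(ch for ch in raw if ch.isdigit())


def select_template(raw_id: str, templates=TEMPLATES):
    """Return the template name for a form ID, or None if it is unknown."""
    return templates.get(normalize_form_id(raw_id))
```

Adding a new form variant later only requires a new ID and a new table entry, which is the flexibility the ID-string recommendation is aiming for.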

Font Styles

Image-based recognition is the process of identifying objects in an image as characters or numbers and representing them in a form that a computer can process. Recognition systems can read a wide variety of data and are not usually limited to special printing or character-sensing technologies. The types of data that can be read by recognition systems include mark sense boxes, barcodes, MICR fonts, OCR fonts, machine print, and hand print.

Machine print includes any data printed by a laser, dot-matrix, or impact printer, a typewriter, or a typesetter. The most common use for machine print fields is for pre-printed items such as names, addresses, ID numbers, tracking codes, or serial numbers.

For pre-printed data, a simple font at least 10 points in size should be used. Fixed-space fonts, such as Courier, produce the best results, particularly if the characters are not touching. For data typed or machine printed by the user, a particular font and size should be specified. Make sure to leave sufficient space in the field to account for misalignments by the printing device. For best results, all printing should be upper-case.

Recognition can recognize any machine print font. A standard set of common fonts is supplied by default, but any other fonts can be accommodated by running the training utility. Of the common fonts, the read accuracy is as follows:

- Courier - excellent
- OCR B - excellent
- OCR A - excellent
- Arial - very good
- Times Roman - good

Courier, OCR A, and OCR B are easiest to read because they are fixed pitch, i.e. each character occupies exactly the same horizontal space regardless of its actual width. This means that if two or more characters are joined, it is easy to split them apart. Arial is not fixed pitch but is non-serifed, so it is less prone to character joining than Times Roman. In serifed, proportionally spaced fonts such as Times Roman, it can be difficult, for example, to distinguish between "m" and "rn".
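The advantage of fixed-pitch fonts described above is that a blob of joined characters can be cut at regular intervals. The following sketch shows the idea, assuming the character pitch is already known in the same units as the blob coordinates. The function name is hypothetical and this is not how any specific engine segments text.

```python
def split_fixed_pitch(blob_left: float, blob_right: float, pitch: float):
    """Split a joined blob into per-character (left, right) column ranges."""
    if pitch <= 0:
        raise ValueError("pitch must be positive")
    width = blob_right - blob_left
    # Estimate how many characters the blob spans, with a minimum of one.
    count = max(1, round(width / pitch))
    cell = width / count
    return [(blob_left + i * cell, blob_left + (i + 1) * cell)
            for i in range(count)]
```

With a proportional font no single `pitch` value exists, so this kind of split is not available and joined characters such as "rn" are harder to separate.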

Designing the Data-Entry Fields

Overview

Characters are naturally grouped in fields. A field is a set of data located in a particular region of the form that is to be read as a whole, such as a name or a telephone number. The length of the field is determined by the number of characters contained in it.

In a well-designed form, the data fields are clearly defined to encourage answers that are correctly formatted. In addition, the better the system can locate and identify meaningful data in the image, the faster and more accurately it can read the data. To make the fields easy to locate:

- A minimum of 1/4" (6.4 mm) clear space should be left around the data. (Drop-out ink can be used within this zone.)
- A field label should be included.
- As many constraints as possible should be used to guide the user to enter the data correctly.
- Formats for dollar amounts, dates, and times should be specified.
- Punctuation should be in drop-out ink.
- Check boxes should be used for multiple-choice selections or to indicate that a given item is relevant.

Check Boxes

Check boxes can be used for multiple-choice selections or to indicate that a given item is relevant. The recognition system uses "mark sense recognition" to determine whether the box has been checked. The system treats any data within the mark sense box as a "yes" response. Therefore, the user can indicate a choice by filling in the entire box or simply by marking it with an "X" or a check mark.

A check box can be almost any size and can be used for applications such as checking an option or verifying that a signature is present. A well-designed form will contain as many yes/no or multiple-choice questions as possible. If space allows, it is worth giving a sample of a check box filled out with an X, since an X is preferable to a tick, which can easily stray into neighboring boxes.

Guidelines for designing check boxes:

- Check boxes should be printed in drop-out ink.
- The white space within the check box must be large enough to allow clear and accurate marks.
- At least 1/4" (6.4 mm) should be left between check boxes to prevent marks intended for one box from straying into another.
- Clear instructions and examples should be provided to show the user how to fill in the boxes correctly and distinctly.
- It is recommended to print an X inside the check boxes as a guide, so that the user knows not to circle the box. This X must be printed in drop-out ink.
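Mark sense recognition as described above can be pictured as measuring how much of the box interior is dark. The sketch below works on a binarized box image given as rows of 0/1 values. The function names and the 5% threshold are assumptions added here to ignore scanner speckle; the text itself only says that any data within the box counts as a "yes".

```python
def fill_ratio(box_pixels) -> float:
    """Fraction of dark pixels; box_pixels is rows of 0/1 (1 = dark)."""
    total = sum(len(row) for row in box_pixels)
    if total == 0:
        raise ValueError("box must contain at least one pixel")
    dark = sum(sum(row) for row in box_pixels)
    return dark / total


def is_checked(box_pixels, threshold: float = 0.05) -> bool:
    """Treat the box as checked when enough of its interior is dark."""
    return fill_ratio(box_pixels) >= threshold
```

This also shows why the box outline and the guide X must be printed in drop-out ink: anything dark that survives scanning inside the box would count towards the fill ratio.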

Character Fields

Field constraints are lines or boxes in a form that guide (or constrain) the user in entering data. They ensure that the data is in the correct location, is formatted correctly, and does not overlap other data. Because individual handwriting varies so widely, the more constraint you impose on the user, the more likely the characters will be distinct and consistent.

Forms to be filled out with hand-printed information should be designed so that each letter or number is written in a specifically designated area. Individual character boxes are highly recommended. It is also a good idea to print on the form the recommended character set that gives the best results.

Note: It is extremely important that only drop-out ink is used within the designated field areas. If black ink is used within the fields, the system will confuse pre-printed information with the information filled out by the user.

The most common types of character fields are Isolated Character Fields, Semi-Constrained Character Fields, and Unconstrained Character Fields.

Isolated Character Fields

An isolated character field is a field type where each character position is clearly defined and is clearly separate from the other characters in that field. Isolated character fields yield the best results for forms that are to be filled out with hand print. They promote faster processing of characters with higher accuracy.

Recommendations:

- Drop-out ink should be used to print the boxes, with a line thickness of 0.5-1.0 mm.
- The size of each character box should be a minimum of 5 x 6 mm.
- There should be enough white space between each character box and between each field to prevent printed characters from overflowing the box boundaries.
- Each character box should be slightly taller than it is wide. People tend to fill boxes to their shape, so short, wide boxes would produce short, wide characters.
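The character box recommendations above are easy to check mechanically when a form layout is available as numbers. The sketch below is one possible check. It assumes that "5 x 6 mm" means 5 mm wide by 6 mm tall, which is consistent with the advice that boxes be taller than wide but is not stated explicitly. The function name is hypothetical.

```python
def check_character_box(width_mm: float, height_mm: float, line_mm: float):
    """Return a list of problems with a character box; empty if none."""
    problems = []
    if width_mm < 5 or height_mm < 6:
        problems.append("box smaller than 5 x 6 mm")
    if height_mm <= width_mm:
        problems.append("box should be slightly taller than it is wide")
    if not 0.5 <= line_mm <= 1.0:
        problems.append("line thickness outside 0.5-1.0 mm")
    return problems
```

For example, `check_character_box(5, 6, 0.5)` returns an empty list, while a 6 mm wide by 5 mm tall box drawn with 0.3 mm lines is flagged on all three points.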

Semi-Constrained Character Fields

A semi-constrained character field is a field type where each character position is well defined but not necessarily isolated. Semi-constrained character fields typically provide the best results in most practical situations. They are very similar to isolated fields, but there is a high potential for a character to leave its own area or "leak" into the next character's area. The best results occur when the character boxes are drawn separated from one another, just as with an isolated field. It is also possible to draw the character boxes so that they are touching, although the resulting accuracy may be lower.

Unconstrained Character Fields

An unconstrained character field is a field which does not contain any lines or boxes restricting the position of each character entered. Unconstrained fields are more difficult to recognize and require more processing time, but are invaluable where field designs cannot be controlled. Hand, machine, numeric, and alpha fields can be unconstrained, although the best results are realized for numeric fields. Alpha fields where characters are broken or touching are the most difficult to recognize.

For the best design of unconstrained fields, ample white space should be provided for the field, with no lines bounding the field. If the field is to be delineated (e.g., the courtesy amount on a check), the white space for the field should be surrounded with borders printed in a drop-out color.

Multiple Page Issues

If a form has multiple pages, or is double-sided, it is necessary to include page indicators on each page. This will, in most cases, be a page number. Recognition can perform pre-recognition on the page number to determine which page is being processed and, therefore, what data to expect.

Recommendations:

- Page numbers should be placed in exactly the same location on every page of the form.
- A comfortable margin of white space should be left around the number, about 1/8 inch (3 to 4 mm). The word "page" should not be in this margin.

An alternative page indication method uses rectangles, filled in according to the binary number system, to signal to the recognition system which page is being read.

Recommendations:

- The indicator should be placed in exactly the same location on every page of the form.
- A comfortable margin of white space should be left around it, about 1/8 inch (3 to 4 mm).
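The binary page indicator described above can be encoded and decoded with a few lines of arithmetic. In this sketch each rectangle is one bit, where a filled rectangle is 1 and an empty one is 0. The bit order (most significant rectangle first) and the function names are assumptions, since the text does not specify them.

```python
def decode_page_indicator(filled, msb_first: bool = True) -> int:
    """Turn a sequence of filled/empty flags into a page number."""
    bits = list(filled) if msb_first else list(reversed(list(filled)))
    value = 0
    for bit in bits:
        value = (value << 1) | (1 if bit else 0)
    return value


def encode_page_indicator(page: int, rectangles: int):
    """Return filled/empty flags (most significant first) for a page number."""
    if not 0 <= page < 2 ** rectangles:
        raise ValueError("page number does not fit in the given rectangles")
    return [bool((page >> i) & 1) for i in range(rectangles - 1, -1, -1)]
```

Three rectangles are enough to distinguish up to eight pages. For instance, `encode_page_indicator(5, 4)` yields empty, filled, empty, filled.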

Support, Maintenance and Requirements

SERVICES

- Installation and Setup
- Maintenance
- Training
- Application Support
- Custom development
- Cloud integration

For more information on our products and services, please visit us on the Web at www.capture-experts.com

Wayenborgstraat 5
BE-2800 Mechelen
Tel +32 (0)477.85.49.20
Fax +32 (0)9.324.45.07