Introduction to OCR in Revu



Optical Character Recognition (OCR), or text recognition, translates the text in scanned PDF documents into searchable text. Once OCR has been run on a scanned PDF, you can search the document for specific text, add bookmarks and hyperlinks on text, copy text to another document, or use one of Revu's advanced text editing tools.

Compatibility

Revu eXtreme 12.0 or higher

Contents

Running OCR on a Single Document
Running OCR on Multiple Documents (Batch)
Search for Text in a Document After Running OCR
Copying and Editing Text in a Document After Running OCR

Running OCR on a Single Document

1. Open the document on which OCR is to be run.

2. From the Command bar, go to Document > OCR or use the keyboard shortcut Ctrl+Shift+O. The OCR dialog box opens.

   The OCR function is also invoked when the Create PDF from Scanner or Camera function in Revu is used, opening the OCR dialog box automatically.

3. The languages that will be used by the OCR process are shown under Recognition Languages. The American English library is loaded by default. To add other libraries, click Add. To remove a library, select it and click Remove. Multiple libraries can be used on the same document.

4. Set the OCR Configuration options, as desired:

   - Correct Skew: Enable to correct angular deviations in scanned documents.
   - Detect Orientation: Enable to detect the page orientation (90, 180 and 270 degrees) of each page and correct it if needed.
   - Detect Text in Pictures and Drawings: Enable to detect text in graphics.
   - Rotate Markups: If Correct Skew is enabled, use this option to also adjust existing markups so they line up with skew-corrected text or images.
   - Skip Vector Pages: Enable to skip processing of pages with vector content.
   - Page Chunk Size: Use to determine the maximum number of pages sent to the OCR engine at one time. Increasing chunk size can increase speed, but will also consume more of the computer's resources.

     Note: Enabling Page Chunk Size and setting it to 1 is recommended for OCR jobs performed on PDFs that have a large number of pages, are of substantial file size or contain large format drawings. If OCR is run on a PDF with no results, running it again with a Page Chunk Size of 1 can correct the problem.

   - Max Vector Size: Use to set the maximum vector size that will be analyzed during the OCR process; any vectors larger than this setting will be discarded in pre-processing. Decreasing this value can increase speed, but might also cause larger text (for example, larger fonts) to be inadvertently ignored.
   - Optimize for: Use to optimize the OCR process for the selected document type. The CAD Drawing setting tends to ignore text formatting, for example, while the Text Document setting does not.

5. To select a Page Range, click the Pages menu and select from the following:

   - All Pages: Sets the range to all pages.
   - Current: Sets the range to the current page only. The current page number will appear in parentheses, for example, Current (2) if page 2 is the current page.
   - Selected: Sets the range to the current selection. This option only appears if pages were selected prior to invoking the command.
   - Custom: Sets the range to a custom value. When this option is selected the list becomes a text box. To enter a custom range:
       Use a dash between page numbers to define those two pages and all pages in between.
       Use a comma to define pages that are separated.
       For example: 1-3, 5, 9 will include pages 1, 2, 3, 5 and 9.

6. Click OK to run OCR.
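The Custom page range syntax and the Page Chunk Size option can be illustrated with a short Python sketch. This is not Revu's implementation, and Revu exposes no such functions. The names parse_page_range and chunk_pages, and the page_count parameter, are hypothetical and exist only to show how a string such as 1-3, 5, 9 expands into pages and how those pages might be grouped before being sent to an OCR engine.

```python
def parse_page_range(spec: str, page_count: int) -> list[int]:
    """Expand a range string such as "1-3, 5, 9" into sorted page numbers.

    Hypothetical helper: it mirrors the dash/comma rules described above,
    not any actual Revu API.
    """
    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            # A dash selects both end pages and every page in between.
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
        else:
            # A single number selects just that page.
            start = end = int(part)
        if start < 1 or end > page_count or start > end:
            raise ValueError(f"invalid page range: {part!r}")
        pages.update(range(start, end + 1))
    return sorted(pages)


def chunk_pages(pages: list[int], chunk_size: int) -> list[list[int]]:
    """Split pages into consecutive groups of at most chunk_size pages."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    return [pages[i:i + chunk_size] for i in range(0, len(pages), chunk_size)]


pages = parse_page_range("1-3, 5, 9", page_count=10)
print(pages)                  # [1, 2, 3, 5, 9]
print(chunk_pages(pages, 2))  # [[1, 2], [3, 5], [9]]
print(chunk_pages(pages, 1))  # [[1], [2], [3], [5], [9]]
```

With a chunk size of 1, each page is handled on its own. This matches the recommendation for large or complex PDFs: each job is smaller, at the cost of more jobs overall.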

Running OCR on Multiple Documents (Batch)

1. From the Command bar, go to File > Batch > OCR. The Batch: OCR dialog box opens.

2. Add documents using one (or both) of the following methods:

   - Click Add Open Files to add currently open files to the list.
   - Click Add to select files from a local or network drive to add to the list.

3. To select a Page Range, click the Pages menu and select from the following:

   - All Pages: Sets the range to all pages.
   - Custom: Sets the range to a custom value. When this option is selected the list becomes a text box. To enter a custom range:
       Use a dash between page numbers to define those two pages and all pages in between.
       Use a comma to define pages that are separated.
       For example: 1-3, 5, 9 will include pages 1, 2, 3, 5 and 9.

4. Click the Apply To lists to select among Even Pages Only, Odd Pages Only or Odd and Even Pages and among Landscape Pages, Portrait Pages or Landscape and Portrait Pages.

5. Select the next PDF in the File List and repeat steps 3 and 4 until Page Range and Page Filter options have been set for each PDF.

6. Click OK. The OCR dialog box opens.

7. The languages that will be used by the OCR process are shown under Recognition Languages. The American English library is loaded by default. To add other libraries, click Add. To remove a library, select it and click Remove. Multiple libraries can be used on the same document.

8. Set the OCR Configuration options, as desired:

   - Correct Skew: Enable to correct angular deviations in scanned documents.
   - Detect Orientation: Enable to detect the page orientation (90, 180 and 270 degrees) of each page and correct it if needed.
   - Detect Text in Pictures and Drawings: Enable to detect text in graphics.
   - Rotate Markups: If Correct Skew is enabled, use this option to also adjust existing markups so they line up with skew-corrected text or images.
   - Skip Vector Pages: Enable to skip processing of pages with vector content.
   - Page Chunk Size: Use to determine the maximum number of pages sent to the OCR engine at one time. Increasing chunk size can increase speed, but will also consume more of the computer's resources.

     Note: Enabling Page Chunk Size and setting it to 1 is recommended for OCR jobs performed on PDFs that have a large number of pages, are of substantial file size or contain large format drawings. If OCR is run on a PDF with no results, running it again with a Page Chunk Size of 1 can correct the problem.

   - Max Vector Size: Use to set the maximum vector size that will be analyzed during the OCR process; any vectors larger than this setting will be discarded in pre-processing. Decreasing this value can increase speed, but might also cause larger text (for example, larger fonts) to be inadvertently ignored.
   - Optimize for: Use to optimize the OCR process for the selected document type. The CAD Drawing setting tends to ignore text formatting, for example, while the Text Document setting does not.

9. Click OK to run OCR.

Search for Text in a Document After Running OCR

One advantage of running OCR on a scanned PDF is the ability to search it for a specific text string. Since scanned PDFs are images, this is not possible until after OCR is run.

To search for text in a document:

1. Select the Search tab. If the Search tab is not open, go to Tab Access > Search or use the keyboard shortcut Alt+1 or Ctrl+F.

2. Enter text to search for in the Text field.

3. Select Current Document from the Search In dropdown menu.

4. Select any of the desired Options:

   - Search Pages: Searches for text in the content of the PDF.
   - Search Filenames: Searches the file names in the Recents list when Search In is set to Recents.
   - Search File Properties: Searches for text in the Properties metadata fields.
   - Search Form Fields: Searches for text in the data entered in the form fields.
   - Search Markups: Searches for text in markups.
   - Case Sensitive: Searches for text with the exact case typed in the Search Terms field.
   - Whole Words Only: Searches only for instances where the search term exists as a complete word. If the search term is partially contained in another word and the Whole Words Only box is checked, it will not be included in the search results.

5. Click Search. Results are shown below the Options panel.

Copying and Editing Text in a Document After Running OCR

Many advanced features available in Revu can be applied to text in a scanned PDF on which OCR has been run. Use the Select Text tool or the keyboard shortcut Shift+T to select text, then right-click it to open a context menu with several useful commands:

   - Add Bookmark: Inserts a bookmark at this location using the selected text as the name of the bookmark.
   - Add Hyperlink: Opens the Action dialog box to define a hyperlink action for the selected text.
   - Mark for Redaction: Marks the selected text for redaction.
   - Copy: Copies the selected text.
   - Paste: Pastes previously copied text over the selected text.
   - Highlight Selected Text: Highlights the selected text.
   - Underline Selected Text: Underlines the selected text.
   - Squiggly Selected Text: Inserts a squiggly line under the selected text.
   - Strikethrough Selected Text: Strikes through the selected text.
   - Replace Selected Text: Opens a Replacement Text pop-up window for the selected text.
   - Insert Text at Cursor: Opens an Insert Text pop-up window at the current position of the cursor. This command is unavailable when text has been selected.
   - Select All Text: Selects all text on the current page.
   - Deselect All Text: Deselects currently selected text.
   - Look Up: Opens a WebTab to look up the selected text in Wikipedia.
   - Search: Opens the Search tab and searches the current document for the selected text.
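
The effect of the Case Sensitive and Whole Words Only search options can be illustrated outside Revu with a small Python sketch using the standard re module. It is only an approximation of the behavior described in this guide, not Revu's search engine. The function find_matches and its parameters are hypothetical names, and "word" here follows Python's own definition of a word character.

```python
import re


def find_matches(text: str, term: str,
                 case_sensitive: bool = False,
                 whole_words_only: bool = False) -> list[int]:
    """Return the start offset of every match of term in text.

    Hypothetical helper that approximates the two search options;
    it is not part of Revu.
    """
    # Escape the term so it is matched literally, not as a regex.
    pattern = re.escape(term)
    if whole_words_only:
        # Reject matches that touch a word character on either side.
        pattern = r"(?<!\w)" + pattern + r"(?!\w)"
    flags = 0 if case_sensitive else re.IGNORECASE
    return [m.start() for m in re.finditer(pattern, text, flags)]


sample = "Scan the plan. The Plan set includes floor plans."
print(find_matches(sample, "plan"))                         # [9, 19, 43]
print(find_matches(sample, "plan", case_sensitive=True))    # [9, 43]
print(find_matches(sample, "plan", whole_words_only=True))  # [9, 19]
```

In the last call, the occurrence inside "plans" is dropped because the term is only part of a longer word. This is the same exclusion the Whole Words Only option describes.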