ADOBE ACROBAT X PRO
SCAN AND OPTICAL CHARACTER RECOGNITION (OCR)
Last Edited: 2012-07-12
Scan a Paper Document to PDF
Configure Presets for Scan
Set up Optimization Options
Edit Settings
Recognize Text in a Scanned PDF
Review and Correct OCR Suspects

The following training document uses material from Lynda.com.

Scan a Paper Document to PDF

Make sure your scanner has been properly installed on your computer.
Open Adobe Acrobat X.
Click on Create > point your cursor to PDF from Scanner.
Here you will see five preset options:
    Autodetect Color Mode
    Black & White Document
    Grayscale Document
    Color Document
    Color Image
To see what each of these preset options does, click on Configure Presets.

Configure Presets for Scan

In the Configure Presets window, next to Presets, you can select one of the preset options from the menu to see its settings. For example, if you choose Color Document you can see that:
    Both sides of the paper will be scanned
    The color mode is Color
You can adjust the resolution of the scan.
You can also adjust the paper size.
Under the Document Settings section, you can do the following:
Check the box next to Optimize if you want Adobe to apply optimization settings to the scanned image that is being turned into a PDF. It is usually good to keep this checked.
Below that, adjust the slider depending on whether you prefer a lower-quality PDF with a small file size (Small Size) or a higher-quality PDF with a large file size (High Quality).
Click Options to adjust the optimization settings.
In the Optimize Scanned PDF window that opens, under Optimization Options, check the box next to Apply Adaptive Compression if you want Adobe to compress color or grayscale areas differently than monochrome areas. For example, with this option on, Adobe will highly compress monochrome areas, such as areas of text, to reduce the file size, but will only slightly compress pictures to retain their quality.
Under the Filters section, turn Deskew on if you want Adobe to straighten any pages that were scanned slightly crooked.
Choose a level of Background Removal if the paper you are scanning has a lot of creases or dirt, or an image on the back side showing through. All of these will be removed.
If you are scanning a printed document that has a halftone screen, you will sometimes get an unwanted pattern, called a moiré pattern. If this happens, make sure to turn Descreen on in this window.
Sometimes when you scan text you get a slight halo around the text, making it hard to read. Turn Text Sharpening on to remove this and make the text clearer.
Click OK when you are finished.
Back in the Configure Presets window, check the box next to Make Searchable (Run OCR) if you want to be able to search the scanned text.
Click on Options.
In the window that opens, next to Primary OCR Language, select the primary language of the document you are scanning.
Click OK.
Check the box next to Make PDF/A-1b compliant if you want to turn the scan into an archival PDF. An archival PDF is very difficult to modify, so you are essentially making an archive, or digital copy, of the scan.
When you are done, click Save.
All of these settings can be changed for each preset.
You can also create your own scanning settings for one-time use by clicking on Create > point your cursor to PDF from Scanner > click on Custom Scan.
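
To get a feel for how the resolution, paper size, and color mode settings in this section affect file size before any optimization is applied, the short Python sketch below estimates the uncompressed size of one scanned page. It is only an illustration and is not part of Acrobat. The function name scan_size_bytes and the bit depths (1 bit per pixel for black and white, 8 for grayscale, 24 for color) are assumptions chosen for the example, and real image files will differ because of compression and file overhead.

```python
def scan_size_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Rough uncompressed size of one scanned page, in bytes.

    Hypothetical helper for illustration only; it ignores compression,
    row padding, and any file format overhead.
    """
    width_px = round(width_in * dpi)    # pixels across the page
    height_px = round(height_in * dpi)  # pixels down the page
    return width_px * height_px * bits_per_pixel // 8


# Assumed bit depths for the three basic color modes.
COLOR_MODES = {"Black & White": 1, "Grayscale": 8, "Color": 24}

if __name__ == "__main__":
    # A letter-size page (8.5 x 11 inches) scanned at 300 dpi.
    for mode, bits in COLOR_MODES.items():
        size_mb = scan_size_bytes(8.5, 11, 300, bits) / 1_000_000
        print(f"{mode}: about {size_mb:.1f} MB uncompressed")
```

Under these assumptions a color page is 24 times larger than a black and white page at the same resolution, and doubling the resolution multiplies the size by four, which is why the optimization settings matter most for color scans at high resolution.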

Set up Optimization Options

If you have a scanned image of a document saved on your computer and you want to convert it to a PDF with searchable text, do the following:
Click on Create > select Create from File.
In the Open window, select the scanned image you want to convert to PDF.
Next to Files of type, select the file type of the image, for example JPEG or TIFF.
Now you can click on the Settings button.

Edit Settings

In the Adobe PDF Settings window that opens, make sure the box next to Scan Optimization and OCR is checked.
Click on the Settings button.
Under Optimization Options, check the box next to Apply Adaptive Compression if you want Adobe to compress color or grayscale areas differently than monochrome areas. For example, with this option on, Adobe will highly compress monochrome areas, such as areas of text, to reduce the file size, but will only slightly compress pictures to retain their quality.
Or, below that, adjust the slider depending on whether you prefer a lower-quality PDF with a small file size (Small Size) or a higher-quality PDF with a large file size (High Quality).
In the Filters section, click Edit. In the window that opens you can do the following:
Turn Deskew on if you want Adobe to straighten any pages that were scanned slightly crooked.
Choose a level of Background Removal if the paper you are scanning has a lot of creases or dirt, or an image on the back side showing through. All of these will be removed.
If you are scanning a printed document that has a halftone screen, you will sometimes get an unwanted pattern, called a moiré pattern. If this happens, make sure to turn Descreen on in this window.
Sometimes when you scan text you get a slight halo around the text, making it hard to read. Turn Text Sharpening on to remove this and make the text clearer.
Click OK when you are done.
Back in the Optimize Scanned PDF window, in the OCR Options section, check the box next to Make Searchable (Apply OCR) to make the text in the image searchable.
Click Edit.
Next to Primary OCR Language, select the primary language of the document you are turning into a PDF.
Next to PDF Output Style you have the following two choices:
    Searchable Image: The original scan will be layered on top of the text. This does not give the clearest appearance, but it is more accurate.
    Clear Scan: Just the text version; the original scan does not appear. This option is a little clearer but not as accurate, as the text version may differ from the original scan.
Click OK when finished.
In the Optimize Scanned PDF window, click OK when you have finished adjusting the settings.
Back in the Adobe PDF Settings window, leave the Color Management settings as they are and click OK.
In the Open window, click Open.
The image will be converted into a searchable PDF.

Recognize Text in a Scanned PDF

If you receive a PDF that is simply a scanned picture in which you cannot select text, and you do not have the original scan, it is still possible to convert it into a PDF with searchable text.
Open the PDF image.
Go to the Tools panel > Recognize Text pane > click on In This File.
Under Pages, select which pages you want to apply this to.
In the Settings section, click on Edit.
Next to Primary OCR Language, select the primary language of the document you are turning into a PDF.
Next to PDF Output Style you have the following three choices:
    Searchable Image: The original scan will be layered on top of the text. This does not give the clearest appearance, but it is more accurate.
    Searchable Image (Exact): This option keeps the PDF as close to the original scan as possible. For example, it won't deskew the image; however, the text will still be searchable.
    Clear Scan: Just the text version; the original scan does not appear. This option is a little clearer but not as accurate, as the text version may differ from the original. For example, if the typeface is something that Adobe does not recognize, it will make a guess as to what the text is. This may cause errors in the text, as well as a change in appearance.
Next to Downsample To, select the resolution you want the PDF to have, with 600 dpi being the highest.
Click OK when you are finished.
In the Recognize Text window, click OK.
Adobe will convert the document.

Review and Correct OCR Suspects

If you scan a document or convert an image to a PDF with the PDF Output Style set to Searchable Image, Adobe has a useful tool that helps correct any places where the OCR text differs from the original scan.
Go to the Tools panel > Recognize Text pane > click on Find First Suspect.
The Find Element window will open, showing the first suspect.
In the box you will see an image of what the original scan looked like.
This section has been highlighted in the PDF.
Click on the highlighted area to see the text that has been inserted for the scanned area.
In this example you can see that the original scan listed the product number as HP10-CP1; however, when Adobe converted it to text it became HPlO-CPl. So if anyone searches for HP10 they will not find a result. To change this, click in the box and type in what is closest to the original. In this case you would change each l to a 1 and the O to a 0.
Click Accept and Find to go on to the next suspect.
If a suspect is correct, and no change needs to be made, simply click Accept and Find.
If you want to skip a suspect and come back to it later, click Find Next.
If Adobe has turned an image into text, but you do not want it to appear as text, click Not Text to remove the text.
When you are finished, click Close.
This can also be done with the PDF Output Style set to Clear Scan; however, it is not as accurate because there is no original to compare against, so Searchable Image is recommended.
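
The HP10-CP1 example above shows why uncorrected suspects break searching: the letters l and O look like the digits 1 and 0 but are different characters. The short Python sketch below illustrates the problem outside of Acrobat. It is only an illustration under stated assumptions: the CONFUSABLE table and the normalize function are hypothetical names made up for this example, they are not Acrobat features, and the table lists only a few look-alike characters.

```python
# Hypothetical table of characters that OCR commonly confuses with digits.
CONFUSABLE = str.maketrans({"l": "1", "I": "1", "O": "0", "o": "0"})


def normalize(text):
    """Map look-alike letters to digits so both sides compare the same way."""
    return text.translate(CONFUSABLE)


ocr_text = "Product number: HPlO-CPl"  # what the OCR produced
query = "HP10"                         # what a reader would search for

# A plain search fails because "lO" is not the same as "10".
print(query in ocr_text)

# Normalizing both the query and the text lets the search match.
print(normalize(query) in normalize(ocr_text))
```

Note that this kind of normalization also changes ordinary words (every l and o is replaced), so it is only useful for comparison. Correcting the suspects in the PDF itself, as described above, remains the reliable fix.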