Implementing Optical Character Recognition on the Android Operating System for Business Cards
Sonia Bhaskar, Nicholas Lavassar, Scott Green
EE 368 Digital Image Processing

Abstract

This report presents an algorithm for accurate recognition of text on a business card, given an Android mobile phone camera image of the card taken under varying environmental conditions. First, a MATLAB implementation of the algorithm is described, whose main objective is to optimize the image for input to the Tesseract OCR (optical character recognition) engine. Then a simplified, reduced-complexity implementation on the DROID mobile phone is discussed. The MATLAB implementation is successful in a variety of adverse environmental conditions, including variable illumination across the card, varied background surrounding the card, rotation, perspective, and variable horizontal text flow. Direct implementation of the MATLAB algorithm on the DROID proved too time-intensive; therefore, a simplified version was implemented on the phone.

I. INTRODUCTION

Our objective is to use the visual capabilities of the Android mobile phone to extract information from a business card. We use the phone's camera to capture the data. Extracting information from the business card requires accurate recognition of its text. Any camera image of the card is subject to several environmental conditions, such as variable lighting, reflection, rotation, and scaling (we want the same data to be extracted regardless of the card's distance from the camera), among others. Our project works with the optical character recognition engine Tesseract, since it is considered the most accurate free OCR engine currently available. We correct for these environmental conditions so that the image is well suited as input to Tesseract.
II. CHOICE OF OPTICAL CHARACTER RECOGNITION ENGINE: TESSERACT

Optical character recognition (OCR) is the mechanical or electronic translation of scanned images of handwritten, typewritten, or printed text into machine-encoded text. OCR has been in development for almost 80 years: the first patent for an OCR machine was filed in Germany by Gustav Tauschek in 1929, and an American patent was filed subsequently. OCR has many applications, including use in the postal service, language translation, and digital libraries. It is now even in the hands of the general public, in the form of mobile applications.

We use the open source OCR software Tesseract as the basis for our business card recognition project. Development of Tesseract began in 1985 at Hewlett Packard. The project later changed hands, and some development was done by the University of Nevada, Las Vegas, followed by a period from 1996 onward in which little or no development was done. The code was eventually released as open source under the Apache 2.0 license. Google currently develops the project and sponsors the open development effort.
Today, Tesseract is considered the most accurate free OCR engine in existence. Google has benefited extensively from Tesseract in its Google Books project, which has attempted to digitize the world's libraries.

A. Tesseract Algorithm

The Tesseract OCR algorithm has the following steps:

1) A grayscale or color image is provided as input. The program takes .tiff and .bmp files natively, but plug-ins can be installed that allow the processing of other image formats. The input should ideally be a flat image from a flatbed scanner or a near-parallel image capture. No rectification capability is provided to correct for perspective distortions.
2) Adaptive thresholding (Otsu's method) - reduces a grayscale image to a binary image. The algorithm assumes that the image contains foreground (black) pixels and background (white) pixels. It then calculates the optimal threshold separating the two pixel classes, such that the variance within each class is minimal (equivalently, the variance between the two classes is maximal).
3) Connected-component labeling - Tesseract searches through the image, identifies the foreground pixels, and marks them as blobs, or potential characters.
4) Line finding algorithm - lines of text are found by analyzing the image space adjacent to potential characters. This algorithm does a Y projection of the binary image and finds locations having a pixel count less than a specific threshold. These areas are potential lines, and are further analyzed to confirm.
5) Baseline fitting algorithm - finds baselines for each of the lines. After each line of text is found, Tesseract examines the lines to find the approximate text height across the line. This process is the first step in determining how to recognize characters.
6) Fixed pitch detection - the other half of setting up character detection is finding the approximate character width. This allows for the correct incremental extraction of characters as Tesseract walks down a line.
7) Non-fixed pitch spacing delimiting - characters that are not of uniform width, or whose width does not agree with the surrounding neighborhood, are reclassified to be processed in an alternate manner.
8) Word recognition - after finding all of the possible character blobs in the document, Tesseract does word recognition word by word, on a line-by-line basis. Words are then passed through a contextual and syntactical analyzer, which improves recognition accuracy.

III. IMPLEMENTATION OVERVIEW

We developed our own application, called OCReader, to capture and rectify images to feed to the Tesseract OCR engine. OCReader allows the user to select an image already stored on their Android device or to use the device's camera to capture a new image; it then runs the image through an image rectification algorithm and passes the result to the Tesseract service. When the OCR process is complete, it returns a string of text, which is displayed on the main OCReader screen, where the user can edit the results and is presented with several options. The user may choose to copy the text to the phone's clipboard, send it by email, or create a new contact. If the user chooses to create a new contact, a series of contextual checks is performed to identify contact information in the results.

Since Tesseract is written in the C++ programming language, it is no trivial task to use it on the Java-based Android OS.
The C++ code needs to be wrapped in a Java class and run natively via the Java Native Interface (JNI). Though there is some effort involved, one great benefit of running Tesseract natively is that C++ is substantially faster than Java. After implementing our own rudimentary wrapper, we discovered that Google had already made a more complete wrapper available to the public as part of its Eyes-Free project, so we began using their version. The Google implementation of Tesseract has a convenient architecture that allowed us to seamlessly interface our preprocessed images with the OCR engine. While default Tesseract settings are configured in OCReader, the user has limited control over the settings via a preference panel, where processing speed intervals can be set and training data and languages can be modified.

Before implementing our project on the Android mobile phone, we first developed the image rectification algorithms in MATLAB in order to take advantage of the Image Processing Toolbox and faster processing times.

IV. MATLAB IMPLEMENTATION

Prior to attempting the implementation on the Android mobile phone, we developed an algorithm for optimal text recognition by the Tesseract OCR engine. The algorithm consists of several processing steps, described in detail in Sections IV-A and IV-B. It is motivated by the need to optimize the conditions for text recognition before feeding the image to Tesseract. The business cards under study were subjected to a variety of adverse conditions, including variable illumination across the card, varied background surrounding the card, rotation, perspective, and variable horizontal text flow.

A. Preprocessing

Prior to establishing a correspondence map that would allow rectification of any undesirable geometric properties of the image (a business card that is rotated or in perspective), the image needed to be preprocessed in order to separate the business card from its background, as well as to establish an outline of the business card to use for the correspondence map. The following steps were used for preprocessing:

1) Adaptive thresholding and card mask.
2) Locating the four corners of the card.
3) Consistency check via text flow directions.
4) Perspective transformation.

Details of these steps follow.

1) Adaptive Thresholding and Card Mask: [David Chen provided a large amount of help for adaptive thresholding; thanks David!] The image was subdivided into a 2 x 3 array of blocks, over which the MATLAB command graythresh was used to compute local thresholds, all of which were stored in another array. (In order to ensure smooth transitions in the threshold array between blocks, the adjacent edge was not subjected to thresholding.) The MATLAB command imresize was then used to upsample the threshold array to match the size of the image. The original image was then thresholded using the resized threshold array. The results of the adaptive thresholding are shown in Fig. 1.

Fig. 1. Adaptive Thresholding and Adaptively Thresholded Image. Panels: 2 x 3 block layout, with blocks indexed (1,1) through (2,3); (b) adaptive thresholding; (c) preprocessed image.

Following the adaptive thresholding, the morphological operations of opening, closing, and dilation, using square structuring elements, were performed. This created regions of white space, and the largest region found was chosen to be the mask of the card. The mask was then stored for use in determining the outline of the card. The thresholded image was generally a white quadrilateral with black text, surrounded by a black background. This black background was set to white for processing. This image was also stored for use in the next processing step.

2) Outline and Corner Detection: The outline of the card was then obtained from the card mask by dilating and eroding the mask with a disk structuring element and taking the difference of the dilated and eroded images. This difference image was then thresholded. While the outline of the card could have been determined by common edge detectors such as Sobel and Canny, we found that these edge detectors gave edges that were weak and often beyond repair by morphological operators. Since we intended to use the Hough transform in the next step, we wanted highly defined lines. Additionally, the edge detectors would potentially be computationally expensive to implement on the Android mobile phone.

The Hough transform for line detection was then run on the outline of the business card. The outline in all cases is a quadrilateral, which is defined by four lines. Therefore, the four highest non-contiguous peaks in the Hough matrix were taken. The output of the MATLAB command hough gave us the parameters of the four lines in terms of ρ and θ. To determine the four corners of the business card, the four line equations were solved for their intersections, and the relevant intersections within the image boundaries were taken. A series of comparisons was then made to determine which corner was the top left corner, the top right corner, and so on.

Several approaches could have been used for corner detection, such as the Harris corner detector.
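The corner computation described above (intersecting the four Hough lines and labeling the resulting points) can be illustrated with a minimal Python sketch. It is not the authors' MATLAB code. It assumes each line is given as (ρ, θ in degrees) in the normal form x·cos θ + y·sin θ = ρ, and the function names `intersect`, `card_corners`, and `order_corners` are hypothetical. The report does not specify its "series of comparisons", so the corner labeling below (smallest and largest x + y, smallest and largest y − x) is one common heuristic swapped in for illustration.

```python
import math
from itertools import combinations


def intersect(line_a, line_b, eps=1e-9):
    """Intersect two lines given as (rho, theta_deg) in the normal form
    x*cos(theta) + y*sin(theta) = rho. Returns (x, y), or None if the
    lines are (nearly) parallel."""
    (rho1, th1), (rho2, th2) = line_a, line_b
    a1, b1 = math.cos(math.radians(th1)), math.sin(math.radians(th1))
    a2, b2 = math.cos(math.radians(th2)), math.sin(math.radians(th2))
    det = a1 * b2 - a2 * b1
    if abs(det) < eps:
        return None
    # Cramer's rule on the 2x2 system [a1 b1; a2 b2] [x y]^T = [rho1 rho2]^T
    return ((rho1 * b2 - rho2 * b1) / det, (a1 * rho2 - a2 * rho1) / det)


def card_corners(lines, width, height):
    """Intersect every pair of lines and keep the points inside the image."""
    points = []
    for line_a, line_b in combinations(lines, 2):
        p = intersect(line_a, line_b)
        if p is not None and 0 <= p[0] < width and 0 <= p[1] < height:
            points.append(p)
    return points


def order_corners(points):
    """Label points as (top_left, top_right, bottom_right, bottom_left),
    with y growing downward as in image coordinates (assumed heuristic)."""
    top_left = min(points, key=lambda p: p[0] + p[1])
    bottom_right = max(points, key=lambda p: p[0] + p[1])
    top_right = min(points, key=lambda p: p[1] - p[0])
    bottom_left = max(points, key=lambda p: p[1] - p[0])
    return top_left, top_right, bottom_right, bottom_left


# Four lines of an axis-aligned card: x = 10, x = 110, y = 20, y = 80.
lines = [(10, 0), (110, 0), (20, 90), (80, 90)]
corners = order_corners(card_corners(lines, width=200, height=100))
print([(round(x), round(y)) for x, y in corners])
# prints [(10, 20), (110, 20), (110, 80), (10, 80)]
```

Under perspective, opposite card edges are no longer parallel, so their intersections exist but usually fall far outside the image; the bounds check is what discards them in this sketch.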
The Harris corner detector, however, was found to be too sensitive and computationally expensive, and the output of the related MATLAB commands was found unhelpful. If four corners could not be determined, for example for a white business card on a white background, the original image was accepted as input to the OCR engine and no perspective transformation was performed.

3) Text Flow Check: In considering several business cards that contained a pattern, we discovered that the outline of the business card may not follow a straight line, and this could throw off the calculation of the four corners. For this case we use an idea from [3] regarding text flow. First we find a rectangular bounding box for the thresholded image within the card mask by looking for the smallest rectangle that contains all the black pixels. We segment the card within the bounding box into 6 equal-height horizontal strips, and calculate the angle of the horizontal text flow within each strip using the Radon transform. The text flow angle of the topmost strip should be close to the angle of the top line of the card outline (the one connecting the top two corners); the same holds for the bottommost strip and the bottom line of the card outline. If there is any inconsistency, we accept the text flow angle as the correct angle and redraw the top (or bottom) line of the card outline to redefine the corresponding corner(s). [We are unfortunately only able to make this comparison for the horizontal edges of the business card, since business cards typically lack enough vertically aligned text from which to calculate vertical text flow angles.] Once this is done, we have four points, each corresponding to a corner of the card. An example where this is necessary is seen in Fig. 2.

Fig. 2. Business Card with Horizontal Text Flow Corner Correction: (b) with faulty corner computed from Hough transform; (c) with corner correction from horizontal text flow.

4) Perspective Transformation: We wish to establish a correspondence map between the image of the business card we have and a standard US business card, whose dimensions follow the ratio 3.5/2. We fit the lower left corner of this standard business card to the lower left corner from the image, and compute the other three corners of the standard business card, scaling it relative to the size of the image at hand and using the 3.5/2 ratio. Once these correspondences are established, we use the MATLAB command maketform to make a perspective transform based on the two sets of four corner points. The perspective transform is then applied in MATLAB to the thresholded image as well as to the original image (with the background intact). Both of these images are input to the post processing step. The steps described can be seen in Fig. 3.

Fig. 3. Geometric Rectification of the Business Card: (a) Hough transform (axes ρ and θ in degrees); (b) outline of image with Hough corners; (c) image with corners after text flow correction; (d) rectified image.
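To make the perspective transformation step concrete, the sketch below shows how a projective transform can be computed from two sets of four corner points, which is the kind of computation maketform carries out for a projective transform. This is a pure-Python illustration under stated assumptions, not the authors' code: the function names (`solve_linear`, `homography`, `apply_homography`, `target_rectangle`) are hypothetical, the target card width is an arbitrary choice standing in for the report's "scaling relative to the size of the image", and image coordinates are assumed to have y growing downward.

```python
def solve_linear(a, b):
    """Solve a*x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate corner configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography(src, dst):
    """3x3 projective transform mapping each src (x, y) to dst (u, v).

    With h9 fixed to 1, each correspondence gives two linear equations:
      h1*x + h2*y + h3 - u*(h7*x + h8*y) = u
      h4*x + h5*y + h6 - v*(h7*x + h8*y) = v
    """
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], h[6:8] + [1.0]]


def apply_homography(h, point):
    """Map one (x, y) point through the 3x3 matrix h."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


def target_rectangle(bottom_left, width, ratio=3.5 / 2):
    """Corners (tl, tr, br, bl) of an upright card with the given width and
    a 3.5:2 aspect ratio, anchored at the detected bottom-left corner."""
    x0, y0 = bottom_left
    height = width / ratio
    return [(x0, y0 - height), (x0 + width, y0 - height),
            (x0 + width, y0), (x0, y0)]


# Detected corners (tl, tr, br, bl) of a card seen in perspective.
detected = [(12, 18), (205, 30), (190, 140), (20, 125)]
standard = target_rectangle(detected[3], width=175)
H = homography(detected, standard)
```

Applying `H` to every pixel position (or, more commonly, applying its inverse to every output pixel and interpolating) would produce the rectified image; that resampling step is left out of this sketch.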
B. Post Processing

1) Image Bounding, Resizing, and Thresholding: Initial adaptive thresholding was done on the entire image (card plus background). After perspective correction, we again adaptively threshold the extracted card image (this time only within the rectangular bounding box; we also enlarge it by a scale factor prior to thresholding) to check whether this provides a better binarized text enhancement. It turns out that this is not always the case. We used the ratio of black pixels within the card boundaries between the re-thresholded and the originally thresholded binarized text images as a measure of improvement. If the ratio was greater than 2.3 (significant change), we accept the re-thresholded image and store it as output for the next step; otherwise (no significant change: extra black blobs appear) we use the initial thresholded image and store it as output for the next step.

2) Vertical Segmentation: Tesseract often has trouble reading lines of text that are not horizontally aligned. For this reason, we include an additional post processing step that segments the business card vertically. We sweep a line from the top line at different slopes from different starting points, and search along a 10-pixel-wide swath to the right of it. If we find that there are no more than 2 black pixels in this 10-pixel-wide swath along the entire card image, we split the card into two parts: a left part and a right part. [We do so only if neither of the two split cards is entirely white, that is, without any black pixels.] We see the benefits in Figs. 4 and 5.

Fig. 4. Vertical Segmentation: (a) card with line segment; (b) right segment; (c) left segment.

Fig. 5. Another Case of Vertical Segmentation: (b) image with line segment; (c) left segment; (d) right segment.

C. Tesseract Outputs

The images were then input to the Tesseract OCR engine and the output was obtained. The image shown in Fig. 6 provided a nearly ideal output.

Fig. 6. Original and Rectified Image: (b) rectified and post processed image.

1) Tesseract Output, Fig. 6: GARY A. RUST, M.D. A EAR. NOSE & THROAT BOARD CERTIFIED IN OTOLARYNGOLOGY/HEAD & NECK SURGERY 45 CASTRO SI. STE. 210 SAN FRANCISCO, CA (415) FAX (415)

2) Tesseract Output, from Fig. 1: Michele Marchesan Vice President, Global Sales and Marketing 3-D Modeling Products _.t 3D Systems Corporation Three D Systems Circle " Rock Hill, SC 29730? USA C S phone fax mobile marchesanm@3dsystemsc0h

3) Tesseract Output, from Fig. 2: James Lee Director of Product Development 2077 N. Capitol Avenue San Jose, Califomia Tel: (408) Ext. 276 D/L: (408)941?82 /6 Cell: (949)293?1636 Fax: (406794%-0901 jamos. [email protected] W\Wl.$Up0fl8 6I\i.C0m

4) Tesseract Output, from Fig. 5: SCSI! Visi0n SyS,em$ Naseer Khan Chief oi Sales lhone (630) cell phone (630) 56l -4l 72 tax _ (248) e [email protected] e Steinbichler Vision Systems, Inc. I Country Club Drive, Suite C-30 Farmington Hllls, Ml 4833l r

V. CONCLUSION

Reliably interpreting text from real-world photos is a challenging problem due to variations in environmental factors. Even the best open source OCR engine available (Tesseract) is easily thwarted by uneven illumination, off-angle rotation and perspective, and misaligned text, among others. We have shown that by correcting for these factors it is possible to dramatically improve an OCR engine's accuracy. A simple adaptive thresholding technique was used to reduce the effect of illumination gradients. Rotation and perspective were determined by processing a Hough transform. Text alignment was improved by performing vertical segmentation.
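As an illustration of the adaptive thresholding technique summarized above (Section IV-A: one Otsu threshold per block of a 2 x 3 grid, upsampled to the image size and applied pixel by pixel), here is a minimal pure-Python sketch. It is not the authors' MATLAB code, and the function names are hypothetical. Two choices are assumptions made for this sketch: the threshold grid is upsampled with bilinear interpolation between block centers (the report only says imresize was used), and ties in the Otsu criterion are averaged. The report's special handling of adjacent block edges is not reproduced.

```python
def otsu_threshold(pixels):
    """Otsu's method on 8-bit gray values: the threshold that maximizes
    the between-class variance (ties are averaged)."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_var, best_ts = -1.0, [0]
    w0 = sum0 = 0
    for t in range(256):
        w0 += hist[t]          # pixel count of the dark class (values <= t)
        sum0 += t * hist[t]    # intensity sum of the dark class
        w1 = total - w0
        if w0 == 0:
            continue
        if w1 == 0:
            break
        var = w0 * w1 * (sum0 / w0 - (sum_all - sum0) / w1) ** 2
        if var > best_var:
            best_var, best_ts = var, [t]
        elif var == best_var:
            best_ts.append(t)
    return sum(best_ts) / len(best_ts)


def block_thresholds(img, rows=2, cols=3):
    """One Otsu threshold per block of a rows x cols grid over the image."""
    h, w = len(img), len(img[0])
    grid = []
    for i in range(rows):
        y0, y1 = i * h // rows, (i + 1) * h // rows
        grid_row = []
        for j in range(cols):
            x0, x1 = j * w // cols, (j + 1) * w // cols
            grid_row.append(otsu_threshold(
                img[y][x] for y in range(y0, y1) for x in range(x0, x1)))
        grid.append(grid_row)
    return grid


def interp_threshold(grid, y, x, height, width):
    """Bilinearly interpolate the threshold grid at pixel (y, x)."""
    rows, cols = len(grid), len(grid[0])
    gy = min(max((y + 0.5) * rows / height - 0.5, 0.0), rows - 1.0)
    gx = min(max((x + 0.5) * cols / width - 0.5, 0.0), cols - 1.0)
    y0, x0 = int(gy), int(gx)
    y1, x1 = min(y0 + 1, rows - 1), min(x0 + 1, cols - 1)
    fy, fx = gy - y0, gx - x0
    top = grid[y0][x0] * (1 - fx) + grid[y0][x1] * fx
    bottom = grid[y1][x0] * (1 - fx) + grid[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def adaptive_binarize(img, rows=2, cols=3):
    """Return a binary image: 255 where a pixel exceeds its local threshold."""
    h, w = len(img), len(img[0])
    grid = block_thresholds(img, rows, cols)
    return [[255 if img[y][x] > interp_threshold(grid, y, x, h, w) else 0
             for x in range(w)] for y in range(h)]
```

Here `img` is a list of rows of integer gray values in 0..255. Because each block gets its own threshold, dark text in a brightly lit region and dark text in a shadowed region can both be separated from their local background, which a single global threshold cannot do when the illumination varies strongly across the card.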
While some of the operations described proved too intensive for inclusion in version 1.0 of the Android application, we discovered that it was possible to secure a majority of the accuracy gains by correcting only for illumination and rotation in the constrained environment of an Android user. The result is a highly functional and fun-to-use mobile application!

VI. FUTURE WORK

Due to time and resource constraints, we were unable to implement all of the preprocessing steps designed in MATLAB on the mobile device. The MATLAB-based algorithm makes very heavy use of morphological and other computationally expensive operations, which are not well suited to the phone's limited resources. As a result, we decided to implement an abbreviated version of the described algorithm that performs only the most important of the processing steps. The processing algorithm used on the Android device performs the following steps:

1) Image resize to 800x600 to ease the burden on the mobile device.
2) 2x3 adaptive thresholding (using our own JJIL addition for Otsu thresholding).
3) Hough transform (specialized to horizontal edge pixels).
4) Flat rotation.

Notably, we do not implement a perspective transformation, which saves considerable computational expense (especially since we do not need to identify the corners of the card). The lack of perspective transformation on the Android platform is arguably unimportant, since the user can simply take another photo if their first attempt is heavily distorted. A simple flat rotation will correct for most of the user's error in a majority of situations. Also, trusting the user to determine the appropriate perspective prevents the app from making assumptions about the content (that it is in fact a business card) and allows it to be used as a more general OCR tool.

Probably the most important feature left out of the Android image rectification algorithm is vertical segmentation. This step considerably improves Tesseract's accuracy, but it is also extremely computationally expensive. We firmly believe, however, that by optimizing the current MATLAB implementation of this operation and implementing it in C++ (via the JNI, as with the OCR service), this step could be brought to the Android phone.

ACKNOWLEDGMENTS

The authors would like to thank David Chen for his advice throughout this project, as well as Professor Bernd Girod.

CODE RESOURCES

Eyes-Free Project (Speech Enabled Eyes-Free Android Applications)
JJIL (Jon's Java Imaging Library)
GMSE Imaging, JDeskew

REFERENCES

[1] H. Li and D. S. Doermann. Text Enhancement in Digital Video Using Multiple Frame Integration. ACM Multimedia 99, Orlando, Florida, pages 19-22.
[2] G. Zhu and D. Doermann. Logo Matching for Document Image Retrieval. International Conference on Document Analysis and Recognition (ICDAR 2009).
[3] J. Liang, et al. Geometric Rectification of Camera-captured Document Images. IEEE Transactions on Pattern Analysis and Machine Intelligence, July.
[4] B. Girod. EE 368 Digital Image Processing Notes. EE 368 Digital Image Processing, Spring.
[5] D. Chen. Correspondence, May-June.

APPENDIX

A. Sonia Bhaskar

Sonia Bhaskar's work is detailed in Section IV, MATLAB Implementation. She developed the image processing algorithms described in that section and implemented the MATLAB side of the project.

B. Nicholas LaVassar

Nick LaVassar produced the Android application OCReader.
He researched a vast number of image processing techniques and Android resources to determine the best way to create an app that met the group's goal of making a business card reader. He learned that it was necessary to modify the original algorithm to operate smoothly on the Android phone and created a substitute suited to the platform. While coding OCReader, he found it necessary to code many image processing operations, including one that he plans to contribute to the JJIL project.

C. Scott Green

Scott Green researched the Tesseract source code. He also reproduced a rudimentary version of Tesseract in MATLAB, which processed flat images and found characters, similar to the Rice example in class. However, full reproduction was found infeasible in the time allotted. He coded the shell to enable bitmap transformations, added new contextual rules for contact creation in Nick's Android application (multiple-line addresses, multiple telephone number distinctions (cell, fax, work, home), and contact job title extraction), and added a home screen and help window. He also made the poster presentation.
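The contextual rules for contact creation mentioned above (telephone number distinctions such as cell, fax, work, and home, plus job title extraction) are not specified in the report. The following Python sketch shows one plausible shape such rules could take. Everything in it is an assumption for illustration: the regular expressions, the keyword lists, the `parse_contact` function, and the sample card text are hypothetical and are not taken from OCReader.

```python
import re

# Illustrative patterns only; real OCR output is noisier than this.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}")
PHONE_LABELS = [("fax", "fax"), ("cell", "cell"),
                ("mobile", "cell"), ("home", "home")]
TITLE_WORDS = ("director", "president", "manager",
               "engineer", "chief", "officer")


def parse_contact(ocr_text):
    """Sort the lines of an OCR result into contact fields using simple
    keyword and pattern checks (assumed rules, not the report's)."""
    contact = {"name": None, "title": None, "emails": [], "phones": []}
    for line in (raw.strip() for raw in ocr_text.splitlines()):
        if not line:
            continue
        lower = line.lower()
        emails = EMAIL_RE.findall(line)
        phones = PHONE_RE.findall(line)
        if emails:
            contact["emails"].extend(emails)
        elif phones:
            # A keyword on the same line decides the kind; default is work.
            kind = next((k for word, k in PHONE_LABELS if word in lower),
                        "work")
            contact["phones"].extend((kind, p) for p in phones)
        elif contact["title"] is None and any(w in lower for w in TITLE_WORDS):
            contact["title"] = line
        elif contact["name"] is None:
            contact["name"] = line  # first otherwise-unclassified line
    return contact


sample = """Jane Doe
Director of Engineering
Tel: (408) 555-0100
Cell: (949) 555-0101
Fax: (408) 555-0102
jane.doe@example.com"""
print(parse_contact(sample))
```

On clean input like the sample, the name, title, three labeled phone numbers, and the email address each land in their own field. On OCR output with character errors such as those shown in Section IV-C, rules like these would need to be considerably more tolerant, which is one reason the application lets the user edit the result before saving.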
Morphological segmentation of histology cell images
Morphological segmentation of histology cell images A.Nedzved, S.Ablameyko, I.Pitas Institute of Engineering Cybernetics of the National Academy of Sciences Surganova, 6, 00 Minsk, Belarus E-mail [email protected]
The Implementation of Face Security for Authentication Implemented on Mobile Phone
The Implementation of Face Security for Authentication Implemented on Mobile Phone Emir Kremić *, Abdulhamit Subaşi * * Faculty of Engineering and Information Technology, International Burch University,
