Script Independent Word Spotting in Multilingual Documents
|
|
|
- Cathleen Farmer
- 9 years ago
- Views:
Transcription
1 Script Independent Word Spotting in Multilingual Documents Anurag Bhardwaj, Damien Jose and Venu Govindaraju Center for Unified Biometrics and Sensors (CUBS) University at Buffalo, State University of New York Amherst, New York 4228 Abstract This paper describes a method for script independent word spotting in multilingual handwritten and machine printed documents. The system accepts a query in the form of text from the user and returns a ranked list of word images from document image corpus based on similarity with the query word. The system is divided into two main components. The first component known as Indexer, performs indexing of all word images present in the document image corpus. This is achieved by extracting Moment Based features from word images and storing them as index. A template is generated for keyword spotting which stores the mapping of a keyword string to its corresponding word image which is used for generating query feature vector. The second component, Similarity Matcher, returns a ranked list of word images which are most similar to the query based on a cosine similarity metric. A manual Relevance feedback is applied based on Rocchio s formula, which re-formulates the query vector to return an improved ranked listing of word images. The performance of the system is seen to be superior on printed text than on handwritten text. Experiments are reported on documents of three different languages: English, Hindi and Sanskrit. For handwritten English, an average precision of 67% was obtained for 30 query words. For machine printed Hindi, an average precision of 7% was obtained for 75 query words and for Sanskrit, an average precision of 87% with 00 queries was obtained. Figure : A Sample English Document - Spotted Query word shown in the bounding box. Introduction The vast amount of information available in the form of handwritten and printed text in different languages poses a great challenge to the task of effective information extraction. Research in this area has primarily focussed on OCR based solutions which are adequate for Roman Language (A sample English document is shown in Figure ). However efficient solutions do not exist for scripts like Devanagari. One of the main reasons for this is lack of generalisation. OCR solutions tend to be specific to script type. Ongoing research continues to scale these methods to different types and font sizes. Furthermore, non-latin scripts exhibit complex character classes (like in the Sanskrit document shown in Figure 2) and poor quality documents are common. The notion of Word spotting [6] has been introduced as an alternative to OCR based solutions. It can be defined as an information retrieval task that finds all occurences of a typed query word in a set of handwritten
2 conclusions are outlined in Section 7. 2 Previous Work Figure 2: A Sample Sanskrit Document - Spotted Query word shown in the bounding box. or machine printed documents. While spotting words in English has been explored [3, 5, 4, 7, ], generalising these approaches to multiple scripts is still an ongoing research task. Harish et.al [] describe a Gradient, Structural, Concavity (GSC) based feature set for word spotting in multiple scripts. However, they do not report the average precision rate for all queries in their experimental results which makes it difficult to estimate the performance of their methodology. One important factor in finding a script independent solution to word spotting is use of image based features which are invariant to script type, image scale and translations. This paper proposes the use of moment based features for spotting word images in different scripts. We describe a moment-function based feature extraction scheme and use the standard vector space model to represent the word images. Similarity between the query feature vector and the indexed feature set is computed using a cosine similarity metric. We also apply the Rocchio formula based Manual Relevance feedback to improve the ranking of results obtained. We evaluate the performance of our system by conducting experiments on document images of three different scripts: English, Hindi and Sanskrit. The organization of the rest of the paper is as follows: Section 2 describes the previous work. Section 3 describes the theory of moment functions. Section 4 describes indexing word images and feature extraction. Section 5 describes the Similarity Matching and Relevance Feedback method applied to re-rank results. Section 6 describes the experiments and results. Future work and Spotting words in English has recently received considerable attention. Manmatha et al. [7], have proposed a combination of feature sets well suited for this application. For finding similarity between a query word image and the document word image, Dynamic Time warping [8] is commonly used. Although the approach has been promising with English handwritten documents, it does not generalise well across scripts. For eg., presence of Shirorekha in Devanagari script (an example shown in Figure 3) renders most of the profile based features ineffective. Also, DTW based approaches are slow. Approaches which use a filter based feature set [2], are efficient with uniform font size and type but are not able to handle font variations and translations. Harish et al. [] use a Gradient, Structural and Concavity (GSC) feature set which measures the image characteristics at local, intermediate and large scales. Features are extracted using a 4x8 sampling window to gather information locally. Since character segmentation points are not perfectly located, local information about stroke orientation and image gradient is not sufficient to characterize the change in font scale and type. Moreover, presence of noise in small regions of the word image lead to inconsistency in the overall feature extraction process. The performance of their approach is presented in terms of percentage of the number of times the correct match was returned, which does not capture the recall rate of system. For English word spotting, their results do not state the size of the dataset and precision recall values have been reported for only 4 query words. For Sanskrit word spotting, the total number of query words is not mentioned which makes understanding of precision recall curve difficult. A comparison of their results against our proposed method is presented in section 6. 3 Moment Functions Moments and functions of moments have been previously used to achieve an invariant representation of a twodimensional image pattern [9]. Geometrical moments
3 6 4 Avg. Precision Vs Moment Order 2 Figure 3: A Sample Hindi Document - Spotted Query words shown in the bounding box. [9] have the desirable property of being invariant under the image translation, scale and stretching and squeezing in either X or Y direction. Mathematically, such affine transformations are of the form of X = ax + b, and Y = cy + d [0]. Geometrical Moments (GM) of order (p + q) for a continuous image function f(x, y) are defined as : M pq = x p y q f(x, y) dx dy () where p, q = 0,, 2,...,. The above definition has the form of the projection of the function f(x, y) onto the mononomial x p y q. In our case, where the function f(x, y) has only two possible values of 0 and, the equation reduces to : M pq = X x p y q f(x, y) (2) Y where X and Y represent x, y coordinates of the image. The center of gravity of the image has the coordinates : x = M 0, ȳ = M 0, (3) If we refer to the center of gravity as origin, we obtain : M pq = X (x x) p (y ȳ) q f(x, y) (4) Y These moments are also referred to as Central Moments and can be expressed as a linear combination of M jk and the moments of lower order. The variances of the moment are defined as : σ x = M20, σ y = M02, (5) Avg Prec Moment Order Figure 4: Average Precision curve Vs Moment Order for a Hindi Image Subset. They are used to normalise the coordinates by setting: x = (x x), y (y ȳ) =, (6) σ x σ y Using the normalised values of coordinates as obtained in equation 6, the moment equation is as follows : X Y m pq = (x ) p (y ) q f(x, y) (7) which is invariant under image translation and scale transformations. 4 Feature Extraction and Indexing Feature extraction is preceeded by preprocessing of documents prior to computing moment based functions. Firstly, the Horizontal Profile feature of the document image is used to segment into line images. Thereafter, Vertical Profile features of each line image is used to extract individual word images. The word images are normalised to equal height and width of 256 pixels. Using equation 7, moments up to the 7th order are extracted from the normalised word images. A feature vector consisting of 30 moment values obtained is constructed for each word image and stored in the main in-
4 dex. Experiments were conducted to determine the number of orders up to which moments should be computed. As shown in Figure 4, average precision increases with the rise in moment orders ( up to a threshold of 7 orders ), after which the precision rate falls. This can be attributed to the nature of higher order Geometrical Moments which are prone to adding noise in the feature set and thereby reduce the overall precision after a certain threshold. After the index has been constructed using the moment features, we create a template which keeps the mapping between a word image and its corresponding text. This template is used to generate a query word image corresponding to the query text input by the user. A similar feature extraction mechanism is performed on the query word image to obtain a query feature vector which is used to find the similarity between the query word image and all other word images present in the corpus. 5 Similarity Matching and Relevance Feedback 5. Cosine Similarity A standard Vector Space Model is used represent the query word and all the candidate words. The index is maintained in the form of a word-feature matrix, where each word image w occupies one row of the matrix and all columns in a single row correspond to the moment values computed for the given word image. When the user enters any query word, a lookup operation is performed in the stored template to obtain the corresponding normalised word image for the input text. Feature extraction is performed on the word image to construct the query feature vector q. A cosine similarity score is computed for this query feature vector and all the rows of the word-feature matrix. The cosine similarity is calculated as follows: q. w SIM(q, w) = q w (8) All the words of the document corpus are then ranked according to the cosine similarity score. The top choice returned by the ranking mechanism represents the word image which is most similar to the input query word. 5.2 Relevance Feedback Since the word images present in the document corpus may be of poor print quality and may contain noise, the moment features computed may not be effective in ranking relevant word images higher in the obtained result. Also the presence of higher order moments may lead to inconsistency in the overall ranking of word images. To overcome this limitation, we have implemented a Rocchio s formula based manual Relevance Feedback mechanism. This mechanism re-formulates the query feature vector by adjusting the values of the individual moment orders present in the query vector. The relevance feedback mechanism assumes a user input after the presentation of the initial results. A user enters either a denoting a result to be relevant or 0 denoting a result to be irrelevant. The new query vector is computed as follows: q new = γ.q old + α i=r R. d i β j=nr NR. d j (9) i= j= where α, β and γ are term re-weighting constants. R denotes a relevant result set and NR denotes a non-relevant result set. For this experiment, we chose α =, β = 5 and γ = Experiments and Results The moment based features seem more robust in handling different image transformations compared to commonly used feature sets for word spotting such as GSC features [] and Gabor filter based features [2]. This can be obseerved in Figure 5. The first row of the image corresponds to different types of transformations applied to normal English handwritten word images ((a)) such as changing the image scale as in (b) or (c). The second row corresponds to linear ((f)) and scale transformation ((e)), when applied to the normal machine printed Hindi word image ((d)). Even after undergoing such transformations, the cosine similarity score between the moment features extracted from all image pairs is still close to, which reflects the strength of invariance of moment based features with respect to image transformations. Table shows the cosine similarity score between all pairs of English word
5 (a) (b) (c) Table 2: Cosine Similarity Score for Hindi Transformed Word Image Pairs. Word Image Pair (d) (e) (f) (d) (e) (f) Average Precision Curve for few Queries (d) (e) (f) Figure 5: Various forms of Image Transformations. (a) & (d) Sample Word Image. (b),(c) & (e) Scale Transformation Examples (f) Linear Transformation Example. Table : Cosine Similarity Score for English Transformed Word Image Pairs. Word Image Pair (a) (b) (c) (a) (b) (c) images. Table 2 shows the similarity score between all pairs of hindi word images. The data set for evaluating our methodology consists of documents in three scripts, namely English, Hindi and Sanskrit. For English, we used publicly available IAMdb [3] handwritten word images and word images extracted from George Washington s publicly available historical manuscripts [4]. The dataset for English consists of 707 word images. For Hindi, 763 word images were extracted from publicly available Million Book Project documents [2]. For Sanskrit, 693 word images were extracted from 5 Sanskrit documents downloaded from the URL: For public testing and evaluation, we have also made our dataset available at the location: For evaluating the system performance, we use the commonly used Average Precision Metric. Precision for Avg Precision Query Figure 6: Average Precision curve for English Word Spotting. Average Precision Avg. Precision for few Queries Queries Figure 7: Average Precision curve for Hindi Word Spotting.
6 Average Precision Avg Precision for few Queries Queries Figure 8: Spotting. Average Precision curve for Sanskrit Word each query image was calculated at every recall level, and then averaged over to give an Average Precision per query. Figure 6 shows the average precision values for some query words in English. Figure 7 shows the average precision values for query words in Hindi. Figure 8 shows the average precision values for query words in Sanskrit. The experimental results for all three scripts are summarised in Table 3. The Average Precision rates as shown in the table have been averaged over 30 queries in English, 75 queries in Hindi and 00 queries in Sanskrit. As shown here, the system works better for machine printed text (7.8 and 87.88) as compared to handwritten (67.0). The best performance is seen with Sanskrit script (87.88), which has a variable length words allowing it to be more discriminative in its feature analysis as compared to other two scripts. Table 4 compares the performance of GSC based word spotting as reported in [] against our methodology. At 50% recall level, Moment based features perform better than GSC based features for both handwritten English and machine printed Sanskrit documents. We also evaluate the performance of Gabor Feature based word spotting method [2] on our dataset. Features are extracted using an array of Gabor filters having a scale from 4 pixels to 6 pixels and 8 orientations. Table 5 summarizes the performance of Gabor features based method as opposed to our Moment based system. As shown, Mo- Table 3: Average Precision rate for word spotting in all 3 Scripts. Script Before RF After RF English Hindi Sanskrit Table 4: Comparison of GSC and Moments based features at 50% recall level. Script GSC Moments English Sanskrit ment based features outperform Gabor based features in terms of average precision rates obtained for all 3 scripts used in the experiment. 7 Summary and Conclusion In this paper, we have proposed a framework for script independent word spotting in document images. We have shown the effectiveness of using statistical Moment based features as opposed to some of the structural and profile based features which may constrain the approach to few scripts. Another advantage of using moment based features is that they are image scale and translation invariant which makes them suitable for font independent feature analysis. In order to deal with the noise sensitivity of the higher order moments, we use a manual relevance feedback to improve the ranking of the relevant word images. We are currently working on extending our methodology to larger data sets and incorporating more scripts in future experiments. Table 5: Comparison of Gabor filter based and Moments Features. Script Gabor Moments English Hindi Sanskrit
7 References [] S. N. Srihari, H. Srinivasan, C. Huang and S. Shetty, Spotting Words in Latin, Devanagari and Arabic Scripts, Vivek: Indian Journal of Artificial Intelligence, [2] Huaigu Cao, Venu Govindaraju, Template-Free Word Spotting in Low-Quality Manuscripts, the Sixth International Conference on Advances in Pattern Recognition (ICAPR), Calcutta, India, [3] S. Kuo and O. Agazzi, Keyword spotting in poorly printed documents using 2-d hidden markov models, in IEEE Trans. Pattern Analysis and Machine Intelligence, 6, pp , 994. [4] M. Burl and P.Perona, Using hierarchical shape models to spot keywords in cursive handwriting, in IEEE- CS Conference on Computer Vision and Pattern Recognition, June 23-28, pp , 998. [5] A. Kolz, J. Alspector, M. Augusteijn, R. Carlson, and G. V. Popescu, A line oriented approach to word spotting in hand written documents, in Pattern Analysis and Applications, 2(3), pp. 5368, [6] R. Manmatha and W. B. Croft, Word spotting: Indexing handwritten archives, In M. Maybury, editor, Intelligent Multimedia Information Retrieval Collection, AAAI/MIT Press, Menlo Park, CA, 997. [7] T. Rath and R. Manmatha,.Features for word spotting in historical manuscripts,. in Proc. International Conference on Document Analysis and Recognition, pp , [8] T. Rath and R. Manmatha,.Word image matching using dynamic time warping,. in Proceeding of the Conference on Computer Vision and Pattern Recognition (CVPR), pp , [9] Teh C.-H. and Chin R.T., On Image Analysis by the Methods of Moments, in IEEE Trans. Pattern Analysis and Machine Intelligence, 0, No. 4, pp , 988. [0] Franz L. Alt, Digital Pattern Recognition by Moments, in The Journal of the ACM, Vol. 9, Issue 2, pp , 962. [] Jeff L. Decurtins and Edward C. Chen, Keyword spotting via word shape recognition in Proc. SPIE Vol. 2422, p , Document Recognition II, Luc M. Vincent; Henry S. Baird; Eds., vol. 2422, pp , March 995. [2] Carnegie Mellon University - Million book project, URL: [3] IAM Database for Off-line Cursive Handwritten Text, URL: zimmerma/iamdb/iamdb.html. [4] Word Image Dataset at CIIR - UMass, URL:
Document Image Retrieval using Signatures as Queries
Document Image Retrieval using Signatures as Queries Sargur N. Srihari, Shravya Shetty, Siyuan Chen, Harish Srinivasan, Chen Huang CEDAR, University at Buffalo(SUNY) Amherst, New York 14228 Gady Agam and
2 Signature-Based Retrieval of Scanned Documents Using Conditional Random Fields
2 Signature-Based Retrieval of Scanned Documents Using Conditional Random Fields Harish Srinivasan and Sargur Srihari Summary. In searching a large repository of scanned documents, a task of interest is
The Role of Size Normalization on the Recognition Rate of Handwritten Numerals
The Role of Size Normalization on the Recognition Rate of Handwritten Numerals Chun Lei He, Ping Zhang, Jianxiong Dong, Ching Y. Suen, Tien D. Bui Centre for Pattern Recognition and Machine Intelligence,
Signature verification using Kolmogorov-Smirnov. statistic
Signature verification using Kolmogorov-Smirnov statistic Harish Srinivasan, Sargur N.Srihari and Matthew J Beal University at Buffalo, the State University of New York, Buffalo USA {srihari,hs32}@cedar.buffalo.edu,[email protected]
Signature Segmentation from Machine Printed Documents using Conditional Random Field
2011 International Conference on Document Analysis and Recognition Signature Segmentation from Machine Printed Documents using Conditional Random Field Ranju Mandal Computer Vision and Pattern Recognition
Using Lexical Similarity in Handwritten Word Recognition
Using Lexical Similarity in Handwritten Word Recognition Jaehwa Park and Venu Govindaraju Center of Excellence for Document Analysis and Recognition (CEDAR) Department of Computer Science and Engineering
Keywords image processing, signature verification, false acceptance rate, false rejection rate, forgeries, feature vectors, support vector machines.
International Journal of Computer Application and Engineering Technology Volume 3-Issue2, Apr 2014.Pp. 188-192 www.ijcaet.net OFFLINE SIGNATURE VERIFICATION SYSTEM -A REVIEW Pooja Department of Computer
ECE 533 Project Report Ashish Dhawan Aditi R. Ganesan
Handwritten Signature Verification ECE 533 Project Report by Ashish Dhawan Aditi R. Ganesan Contents 1. Abstract 3. 2. Introduction 4. 3. Approach 6. 4. Pre-processing 8. 5. Feature Extraction 9. 6. Verification
NAVIGATING SCIENTIFIC LITERATURE A HOLISTIC PERSPECTIVE. Venu Govindaraju
NAVIGATING SCIENTIFIC LITERATURE A HOLISTIC PERSPECTIVE Venu Govindaraju BIOMETRICS DOCUMENT ANALYSIS PATTERN RECOGNITION 8/24/2015 ICDAR- 2015 2 Towards a Globally Optimal Approach for Learning Deep Unsupervised
Galaxy Morphological Classification
Galaxy Morphological Classification Jordan Duprey and James Kolano Abstract To solve the issue of galaxy morphological classification according to a classification scheme modelled off of the Hubble Sequence,
Analecta Vol. 8, No. 2 ISSN 2064-7964
EXPERIMENTAL APPLICATIONS OF ARTIFICIAL NEURAL NETWORKS IN ENGINEERING PROCESSING SYSTEM S. Dadvandipour Institute of Information Engineering, University of Miskolc, Egyetemváros, 3515, Miskolc, Hungary,
Interactive person re-identification in TV series
Interactive person re-identification in TV series Mika Fischer Hazım Kemal Ekenel Rainer Stiefelhagen CV:HCI lab, Karlsruhe Institute of Technology Adenauerring 2, 76131 Karlsruhe, Germany E-mail: {mika.fischer,ekenel,rainer.stiefelhagen}@kit.edu
Offline Word Spotting in Handwritten Documents
Offline Word Spotting in Handwritten Documents Nicholas True Department of Computer Science University of California, San Diego San Diego, CA 9500 [email protected] Abstract The digitization of written
Cursive Handwriting Recognition for Document Archiving
International Digital Archives Project Cursive Handwriting Recognition for Document Archiving Trish Keaton Rod Goodman California Institute of Technology Motivation Numerous documents have been conserved
Palmprint Recognition. By Sree Rama Murthy kora Praveen Verma Yashwant Kashyap
Palmprint Recognition By Sree Rama Murthy kora Praveen Verma Yashwant Kashyap Palm print Palm Patterns are utilized in many applications: 1. To correlate palm patterns with medical disorders, e.g. genetic
Face detection is a process of localizing and extracting the face region from the
Chapter 4 FACE NORMALIZATION 4.1 INTRODUCTION Face detection is a process of localizing and extracting the face region from the background. The detected face varies in rotation, brightness, size, etc.
An Information Retrieval using weighted Index Terms in Natural Language document collections
Internet and Information Technology in Modern Organizations: Challenges & Answers 635 An Information Retrieval using weighted Index Terms in Natural Language document collections Ahmed A. A. Radwan, Minia
DIAGONAL BASED FEATURE EXTRACTION FOR HANDWRITTEN ALPHABETS RECOGNITION SYSTEM USING NEURAL NETWORK
DIAGONAL BASED FEATURE EXTRACTION FOR HANDWRITTEN ALPHABETS RECOGNITION SYSTEM USING NEURAL NETWORK J.Pradeep 1, E.Srinivasan 2 and S.Himavathi 3 1,2 Department of ECE, Pondicherry College Engineering,
Script and Language Identification for Handwritten Document Images. Judith Hochberg Kevin Bowers * Michael Cannon Patrick Kelly
Script and Language Identification for Handwritten Document Images Judith Hochberg Kevin Bowers * Michael Cannon Patrick Kelly Computer Research and Applications Group (CIC-3) Mail Stop B265 Los Alamos
Medical Image Segmentation of PACS System Image Post-processing *
Medical Image Segmentation of PACS System Image Post-processing * Lv Jie, Xiong Chun-rong, and Xie Miao Department of Professional Technical Institute, Yulin Normal University, Yulin Guangxi 537000, China
Simultaneous Gamma Correction and Registration in the Frequency Domain
Simultaneous Gamma Correction and Registration in the Frequency Domain Alexander Wong [email protected] William Bishop [email protected] Department of Electrical and Computer Engineering University
Classification of Fingerprints. Sarat C. Dass Department of Statistics & Probability
Classification of Fingerprints Sarat C. Dass Department of Statistics & Probability Fingerprint Classification Fingerprint classification is a coarse level partitioning of a fingerprint database into smaller
Introduction to Pattern Recognition
Introduction to Pattern Recognition Selim Aksoy Department of Computer Engineering Bilkent University [email protected] CS 551, Spring 2009 CS 551, Spring 2009 c 2009, Selim Aksoy (Bilkent University)
Research on Chinese financial invoice recognition technology
Pattern Recognition Letters 24 (2003) 489 497 www.elsevier.com/locate/patrec Research on Chinese financial invoice recognition technology Delie Ming a,b, *, Jian Liu b, Jinwen Tian b a State Key Laboratory
DEVNAGARI DOCUMENT SEGMENTATION USING HISTOGRAM APPROACH
DEVNAGARI DOCUMENT SEGMENTATION USING HISTOGRAM APPROACH Vikas J Dongre 1 Vijay H Mankar 2 Department of Electronics & Telecommunication, Government Polytechnic, Nagpur, India 1 [email protected]; 2
How To Fix Out Of Focus And Blur Images With A Dynamic Template Matching Algorithm
IJSTE - International Journal of Science Technology & Engineering Volume 1 Issue 10 April 2015 ISSN (online): 2349-784X Image Estimation Algorithm for Out of Focus and Blur Images to Retrieve the Barcode
Online Farsi Handwritten Character Recognition Using Hidden Markov Model
Online Farsi Handwritten Character Recognition Using Hidden Markov Model Vahid Ghods*, Mohammad Karim Sohrabi Department of Electrical and Computer Engineering, Semnan Branch, Islamic Azad University,
Handwritten Signature Verification using Neural Network
Handwritten Signature Verification using Neural Network Ashwini Pansare Assistant Professor in Computer Engineering Department, Mumbai University, India Shalini Bhatia Associate Professor in Computer Engineering
PARTIAL FINGERPRINT REGISTRATION FOR FORENSICS USING MINUTIAE-GENERATED ORIENTATION FIELDS
PARTIAL FINGERPRINT REGISTRATION FOR FORENSICS USING MINUTIAE-GENERATED ORIENTATION FIELDS Ram P. Krish 1, Julian Fierrez 1, Daniel Ramos 1, Javier Ortega-Garcia 1, Josef Bigun 2 1 Biometric Recognition
Detection and mitigation of Web Services Attacks using Markov Model
Detection and mitigation of Web Services Attacks using Markov Model Vivek Relan [email protected] Bhushan Sonawane [email protected] Department of Computer Science and Engineering, University of Maryland,
Semantic Video Annotation by Mining Association Patterns from Visual and Speech Features
Semantic Video Annotation by Mining Association Patterns from and Speech Features Vincent. S. Tseng, Ja-Hwung Su, Jhih-Hong Huang and Chih-Jen Chen Department of Computer Science and Information Engineering
Incorporating Window-Based Passage-Level Evidence in Document Retrieval
Incorporating -Based Passage-Level Evidence in Document Retrieval Wensi Xi, Richard Xu-Rong, Christopher S.G. Khoo Center for Advanced Information Systems School of Applied Science Nanyang Technological
Semantic Concept Based Retrieval of Software Bug Report with Feedback
Semantic Concept Based Retrieval of Software Bug Report with Feedback Tao Zhang, Byungjeong Lee, Hanjoon Kim, Jaeho Lee, Sooyong Kang, and Ilhoon Shin Abstract Mining software bugs provides a way to develop
Make and Model Recognition of Cars
Make and Model Recognition of Cars Sparta Cheung Department of Electrical and Computer Engineering University of California, San Diego 9500 Gilman Dr., La Jolla, CA 92093 [email protected] Alice Chu Department
Towards Online Recognition of Handwritten Mathematics
Towards Online Recognition of Handwritten Mathematics Vadim Mazalov, joint work with Oleg Golubitsky and Stephen M. Watt Ontario Research Centre for Computer Algebra Department of Computer Science Western
Feature Weighting for Improving Document Image Retrieval System Performance
Feature Weighting for Improving Document Image Retrieval System Performance ohammadreza Keyvanpour 1, Reza Tavoli 2 1 Department Of Computer Engineering, Alzahra University Tehran, Iran 2 Department of
Comparison of Elastic Matching Algorithms for Online Tamil Handwritten Character Recognition
Comparison of Elastic Matching Algorithms for Online Tamil Handwritten Character Recognition Niranjan Joshi, G Sita, and A G Ramakrishnan Indian Institute of Science, Bangalore, India joshi,sita,agr @ragashrieeiiscernetin
Efficient on-line Signature Verification System
International Journal of Engineering & Technology IJET-IJENS Vol:10 No:04 42 Efficient on-line Signature Verification System Dr. S.A Daramola 1 and Prof. T.S Ibiyemi 2 1 Department of Electrical and Information
Domain Classification of Technical Terms Using the Web
Systems and Computers in Japan, Vol. 38, No. 14, 2007 Translated from Denshi Joho Tsushin Gakkai Ronbunshi, Vol. J89-D, No. 11, November 2006, pp. 2470 2482 Domain Classification of Technical Terms Using
Automatic Extraction of Signatures from Bank Cheques and other Documents
Automatic Extraction of Signatures from Bank Cheques and other Documents Vamsi Krishna Madasu *, Mohd. Hafizuddin Mohd. Yusof, M. Hanmandlu ß, Kurt Kubik * *Intelligent Real-Time Imaging and Sensing group,
A Method of Caption Detection in News Video
3rd International Conference on Multimedia Technology(ICMT 3) A Method of Caption Detection in News Video He HUANG, Ping SHI Abstract. News video is one of the most important media for people to get information.
Poker Vision: Playing Cards and Chips Identification based on Image Processing
Poker Vision: Playing Cards and Chips Identification based on Image Processing Paulo Martins 1, Luís Paulo Reis 2, and Luís Teófilo 2 1 DEEC Electrical Engineering Department 2 LIACC Artificial Intelligence
Modelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches
Modelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches PhD Thesis by Payam Birjandi Director: Prof. Mihai Datcu Problematic
FRACTAL RECOGNITION AND PATTERN CLASSIFIER BASED SPAM FILTERING IN EMAIL SERVICE
FRACTAL RECOGNITION AND PATTERN CLASSIFIER BASED SPAM FILTERING IN EMAIL SERVICE Ms. S.Revathi 1, Mr. T. Prabahar Godwin James 2 1 Post Graduate Student, Department of Computer Applications, Sri Sairam
A Dynamic Approach to Extract Texts and Captions from Videos
Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 3, Issue. 4, April 2014,
MATH1231 Algebra, 2015 Chapter 7: Linear maps
MATH1231 Algebra, 2015 Chapter 7: Linear maps A/Prof. Daniel Chan School of Mathematics and Statistics University of New South Wales [email protected] Daniel Chan (UNSW) MATH1231 Algebra 1 / 43 Chapter
Biometric Authentication using Online Signatures
Biometric Authentication using Online Signatures Alisher Kholmatov and Berrin Yanikoglu [email protected], [email protected] http://fens.sabanciuniv.edu Sabanci University, Tuzla, Istanbul,
Applied Mathematical Sciences, Vol. 7, 2013, no. 112, 5591-5597 HIKARI Ltd, www.m-hikari.com http://dx.doi.org/10.12988/ams.2013.
Applied Mathematical Sciences, Vol. 7, 2013, no. 112, 5591-5597 HIKARI Ltd, www.m-hikari.com http://dx.doi.org/10.12988/ams.2013.38457 Accuracy Rate of Predictive Models in Credit Screening Anirut Suebsing
Dataset Preparation and Indexing for Data Mining Analysis Using Horizontal Aggregations
Dataset Preparation and Indexing for Data Mining Analysis Using Horizontal Aggregations Binomol George, Ambily Balaram Abstract To analyze data efficiently, data mining systems are widely using datasets
Determining the Resolution of Scanned Document Images
Presented at IS&T/SPIE EI 99, Conference 3651, Document Recognition and Retrieval VI Jan 26-28, 1999, San Jose, CA. Determining the Resolution of Scanned Document Images Dan S. Bloomberg Xerox Palo Alto
Visual Structure Analysis of Flow Charts in Patent Images
Visual Structure Analysis of Flow Charts in Patent Images Roland Mörzinger, René Schuster, András Horti, and Georg Thallinger JOANNEUM RESEARCH Forschungsgesellschaft mbh DIGITAL - Institute for Information
Vision based Vehicle Tracking using a high angle camera
Vision based Vehicle Tracking using a high angle camera Raúl Ignacio Ramos García Dule Shu [email protected] [email protected] Abstract A vehicle tracking and grouping algorithm is presented in this work
Open issues and research trends in Content-based Image Retrieval
Open issues and research trends in Content-based Image Retrieval Raimondo Schettini DISCo Universita di Milano Bicocca [email protected] www.disco.unimib.it/schettini/ IEEE Signal Processing Society
How To Filter Spam Image From A Picture By Color Or Color
Image Content-Based Email Spam Image Filtering Jianyi Wang and Kazuki Katagishi Abstract With the population of Internet around the world, email has become one of the main methods of communication among
ART Extension for Description, Indexing and Retrieval of 3D Objects
ART Extension for Description, Indexing and Retrieval of 3D Objects Julien Ricard, David Coeurjolly, Atilla Baskurt LIRIS, FRE 2672 CNRS, Bat. Nautibus, 43 bd du novembre 98, 69622 Villeurbanne cedex,
Open Access A Facial Expression Recognition Algorithm Based on Local Binary Pattern and Empirical Mode Decomposition
Send Orders for Reprints to [email protected] The Open Electrical & Electronic Engineering Journal, 2014, 8, 599-604 599 Open Access A Facial Expression Recognition Algorithm Based on Local Binary
SOURCE SCANNER IDENTIFICATION FOR SCANNED DOCUMENTS. Nitin Khanna and Edward J. Delp
SOURCE SCANNER IDENTIFICATION FOR SCANNED DOCUMENTS Nitin Khanna and Edward J. Delp Video and Image Processing Laboratory School of Electrical and Computer Engineering Purdue University West Lafayette,
Low-resolution Character Recognition by Video-based Super-resolution
2009 10th International Conference on Document Analysis and Recognition Low-resolution Character Recognition by Video-based Super-resolution Ataru Ohkura 1, Daisuke Deguchi 1, Tomokazu Takahashi 2, Ichiro
CS 534: Computer Vision 3D Model-based recognition
CS 534: Computer Vision 3D Model-based recognition Ahmed Elgammal Dept of Computer Science CS 534 3D Model-based Vision - 1 High Level Vision Object Recognition: What it means? Two main recognition tasks:!
Dublin City University at CLEF 2004: Experiments with the ImageCLEF St Andrew s Collection
Dublin City University at CLEF 2004: Experiments with the ImageCLEF St Andrew s Collection Gareth J. F. Jones, Declan Groves, Anna Khasin, Adenike Lam-Adesina, Bart Mellebeek. Andy Way School of Computing,
Search Taxonomy. Web Search. Search Engine Optimization. Information Retrieval
Information Retrieval INFO 4300 / CS 4300! Retrieval models Older models» Boolean retrieval» Vector Space model Probabilistic Models» BM25» Language models Web search» Learning to Rank Search Taxonomy!
Mining Text Data: An Introduction
Bölüm 10. Metin ve WEB Madenciliği http://ceng.gazi.edu.tr/~ozdemir Mining Text Data: An Introduction Data Mining / Knowledge Discovery Structured Data Multimedia Free Text Hypertext HomeLoan ( Frank Rizzo
Static Environment Recognition Using Omni-camera from a Moving Vehicle
Static Environment Recognition Using Omni-camera from a Moving Vehicle Teruko Yata, Chuck Thorpe Frank Dellaert The Robotics Institute Carnegie Mellon University Pittsburgh, PA 15213 USA College of Computing
RFID and Camera-based Hybrid Approach to Track Vehicle within Campus
2009 International Symposium on Computing, Communication, and Control (ISCCC 2009) Proc.of CSIT vol.1 (2011) (2011) IACSIT Press, Singapore RFID and Camera-based Hybrid Approach to Track Vehicle within
2. Distributed Handwriting Recognition. Abstract. 1. Introduction
XPEN: An XML Based Format for Distributed Online Handwriting Recognition A.P.Lenaghan, R.R.Malyan, School of Computing and Information Systems, Kingston University, UK {a.lenaghan,r.malyan}@kingston.ac.uk
Detection and Restoration of Vertical Non-linear Scratches in Digitized Film Sequences
Detection and Restoration of Vertical Non-linear Scratches in Digitized Film Sequences Byoung-moon You 1, Kyung-tack Jung 2, Sang-kook Kim 2, and Doo-sung Hwang 3 1 L&Y Vision Technologies, Inc., Daejeon,
APPLYING COMPUTER VISION TECHNIQUES TO TOPOGRAPHIC OBJECTS
APPLYING COMPUTER VISION TECHNIQUES TO TOPOGRAPHIC OBJECTS Laura Keyes, Adam Winstanley Department of Computer Science National University of Ireland Maynooth Co. Kildare, Ireland [email protected], [email protected]
LOCAL SURFACE PATCH BASED TIME ATTENDANCE SYSTEM USING FACE. [email protected]
LOCAL SURFACE PATCH BASED TIME ATTENDANCE SYSTEM USING FACE 1 S.Manikandan, 2 S.Abirami, 2 R.Indumathi, 2 R.Nandhini, 2 T.Nanthini 1 Assistant Professor, VSA group of institution, Salem. 2 BE(ECE), VSA
Data, Measurements, Features
Data, Measurements, Features Middle East Technical University Dep. of Computer Engineering 2009 compiled by V. Atalay What do you think of when someone says Data? We might abstract the idea that data are
Recognition Method for Handwritten Digits Based on Improved Chain Code Histogram Feature
3rd International Conference on Multimedia Technology ICMT 2013) Recognition Method for Handwritten Digits Based on Improved Chain Code Histogram Feature Qian You, Xichang Wang, Huaying Zhang, Zhen Sun
CINDOR Conceptual Interlingua Document Retrieval: TREC-8 Evaluation.
CINDOR Conceptual Interlingua Document Retrieval: TREC-8 Evaluation. Miguel Ruiz, Anne Diekema, Páraic Sheridan MNIS-TextWise Labs Dey Centennial Plaza 401 South Salina Street Syracuse, NY 13202 Abstract:
Bayesian Network Modeling of Hangul Characters for On-line Handwriting Recognition
Bayesian Network Modeling of Hangul haracters for On-line Handwriting Recognition Sung-ung ho and in H. Kim S Div., EES Dept., KAIS, 373- Kusong-dong, Yousong-ku, Daejon, 305-70, KOREA {sjcho, jkim}@ai.kaist.ac.kr
BEHAVIOR BASED CREDIT CARD FRAUD DETECTION USING SUPPORT VECTOR MACHINES
BEHAVIOR BASED CREDIT CARD FRAUD DETECTION USING SUPPORT VECTOR MACHINES 123 CHAPTER 7 BEHAVIOR BASED CREDIT CARD FRAUD DETECTION USING SUPPORT VECTOR MACHINES 7.1 Introduction Even though using SVM presents
Euler Vector: A Combinatorial Signature for Gray-Tone Images
Euler Vector: A Combinatorial Signature for Gray-Tone Images Arijit Bishnu, Bhargab B. Bhattacharya y, Malay K. Kundu, C. A. Murthy fbishnu t, bhargab, malay, [email protected] Indian Statistical Institute,
Automatic 3D Reconstruction via Object Detection and 3D Transformable Model Matching CS 269 Class Project Report
Automatic 3D Reconstruction via Object Detection and 3D Transformable Model Matching CS 69 Class Project Report Junhua Mao and Lunbo Xu University of California, Los Angeles [email protected] and lunbo
Towards Inferring Web Page Relevance An Eye-Tracking Study
Towards Inferring Web Page Relevance An Eye-Tracking Study 1, [email protected] Yinglong Zhang 1, [email protected] 1 The University of Texas at Austin Abstract We present initial results from a project,
Bildverarbeitung und Mustererkennung Image Processing and Pattern Recognition
Bildverarbeitung und Mustererkennung Image Processing and Pattern Recognition 1. Image Pre-Processing - Pixel Brightness Transformation - Geometric Transformation - Image Denoising 1 1. Image Pre-Processing
OCR-Based Electronic Documentation Management System
OCR-Based Electronic Documentation Management System Khalaf S. Alkhalaf, Abdulelah I. Almishal, Anas O. Almahmoud, and Majed S. Alotaibi Abstract Optical character recognition (OCR) is one of the latest
SAS BI Dashboard 3.1. User s Guide
SAS BI Dashboard 3.1 User s Guide The correct bibliographic citation for this manual is as follows: SAS Institute Inc. 2007. SAS BI Dashboard 3.1: User s Guide. Cary, NC: SAS Institute Inc. SAS BI Dashboard
Algorithm for License Plate Localization and Recognition for Tanzania Car Plate Numbers
Algorithm for License Plate Localization and Recognition for Tanzania Car Plate Numbers Isack Bulugu Department of Electronics Engineering, Tianjin University of Technology and Education, Tianjin, P.R.
Template-based Eye and Mouth Detection for 3D Video Conferencing
Template-based Eye and Mouth Detection for 3D Video Conferencing Jürgen Rurainsky and Peter Eisert Fraunhofer Institute for Telecommunications - Heinrich-Hertz-Institute, Image Processing Department, Einsteinufer
Visualizing Data: Scalable Interactivity
Visualizing Data: Scalable Interactivity The best data visualizations illustrate hidden information and structure contained in a data set. As access to large data sets has grown, so has the need for interactive
Signature Segmentation and Recognition from Scanned Documents
Signature Segmentation and Recognition from Scanned Documents Ranju Mandal, Partha Pratim Roy, Umapada Pal and Michael Blumenstein School of Information and Communication Technology, Griffith University,
Continued Fractions and the Euclidean Algorithm
Continued Fractions and the Euclidean Algorithm Lecture notes prepared for MATH 326, Spring 997 Department of Mathematics and Statistics University at Albany William F Hammond Table of Contents Introduction
Architecture of an Ontology-Based Domain- Specific Natural Language Question Answering System
Architecture of an Ontology-Based Domain- Specific Natural Language Question Answering System Athira P. M., Sreeja M. and P. C. Reghuraj Department of Computer Science and Engineering, Government Engineering
Optimization of Image Search from Photo Sharing Websites Using Personal Data
Optimization of Image Search from Photo Sharing Websites Using Personal Data Mr. Naeem Naik Walchand Institute of Technology, Solapur, India Abstract The present research aims at optimizing the image search
Automated Process for Generating Digitised Maps through GPS Data Compression
Automated Process for Generating Digitised Maps through GPS Data Compression Stewart Worrall and Eduardo Nebot University of Sydney, Australia {s.worrall, e.nebot}@acfr.usyd.edu.au Abstract This paper
