2 Signature-Based Retrieval of Scanned Documents Using Conditional Random Fields

Harish Srinivasan and Sargur Srihari

Summary. In searching a large repository of scanned documents, a task of interest is that of retrieving documents from a database using a signature image as a query. This chapter presents a signature retrieval strategy using document indexing and retrieval. Indexing is done using (i) a model based on Conditional Random Fields (CRF) to label extracted segments of scanned documents as Machine-Print, Signature and Noise, (ii) a technique using a support vector machine to remove noise and printed text overlapping the signature images and (iii) a global shape-based feature extractor that is computed for each signature image. The documents are first segmented into patches using a region growing algorithm, and the CRF based model is used to infer the labels of each of these patches. The robustness of the method is due to the CRF's inherent modeling of neighboring spatial dependencies in the labels as well as in the observed data. The model parameters are learnt using conjugate gradient descent with line search optimization to maximize pseudo-likelihood estimates, and the inference of labels is done by computing the probability of the labels under the model with Gibbs sampling. Further post-processing of the labeled patches yields signature regions which are used to index the documents. Retrieval is performed using a matching algorithm to compare the query with the indexed documents. Signature matching is based on a normalized correlation similarity measure using global shape-based binary feature vectors. The end-to-end system is a content-based image retrieval system designed for signatures.

Introduction

Retrieving relevant documents from a repository of scanned documents has many applications, including the legal and forensic domains. In particular, documents containing handwriting have a potentially useful role in counterterrorism operations, e.g., retrieving forms filled out by certain applicants for opening post-office boxes, identifying envelopes of interest in the mail stream, etc. In searching complex documents, a task of relevance is relating the signature in a given document to the closest matches within a database of documents; this is the signature retrieval task which is addressed in this chapter.

S. Argamon, N. Howard (eds.), Computational Methods for Counterterrorism, DOI / _2, Springer-Verlag Berlin Heidelberg 2009

Retrieval of handwritten words has been found to be more challenging than image matching due to the lack of low level distinguishing features like color and texture. Handwritten word retrieval has been discussed in Rath et al. (2004), Zhang et al. (2004), Kolz et al. (2000) and Plamondon and Lorette (2000). The method of Kolz et al. (2000) extracts profile-based holistic shape features from a line or word image and uses dynamic time warping (DTW) to match words. A word shape based method was shown to perform better than the DTW method in terms of both efficiency and effectiveness (Zhang et al. 2004). Considering historical manuscripts, Rath et al. (2004) describe a method for retrieval based on text queries without recognition, using a transcribed set of pages for training.

This chapter presents an effective signature extraction and retrieval technique. It is based on a statistical model for machine learning known as Conditional Random Fields (CRFs) (Lafferty et al. 2001; Kumar and Hebert 2003; Quattoni et al. 2005). CRFs are more general than Hidden Markov Models in that there are no implicit independence assumptions. The CRF model is used in extracting signatures from complex documents by isolating the different contents present in the documents. The motivation to use a CRF based model for this application arises from the spatial inter-dependencies of the different regions in documents. The problem is formulated as follows. Given a document: (i) segment the document into a number of patches (each approximately the size of a word), and (ii) label each of the patches as one of Machine-Print, Handwriting or Noise. The regions containing the signatures are then identified from the labeled patches and isolated.

Given a database of signed documents, the retrieval task (Srihari et al. 2006) is to relate a query document to other documents in this database which have been signed by the same author.
The documents under consideration are indexed by the features of the signatures extracted from them. The retrieval task is to retrieve all the other documents signed by the same author. This involves extracting the features of the query signature and matching these features to those of the indexed documents. A technique based on query expansion using automatic relevance feedback (Salton and McGill 1983) has also been implemented, where the highest ranked result is used along with the original query to retrieve relevant documents. This retrieval technique can be extended to accept a text query of the author's name, provided each author has been previously enrolled with at least one signature.

Indexing

The steps involved in indexing the document images are described here.

Signature block location

The first step in indexing a scanned document image is to extract the signature block. A signature block is defined as a rectangular image snippet containing the entire signature. The signature block is further processed to remove non-signature material, e.g., printed name of the signatory, portions of the accompanying text, spots, etc. The operational steps in signature extraction (Fig. 2.1) are: (i) segmentation into patches and neighbor determination, (ii) classification of patches into signature and non-signature classes, (iii) isolating the signature region (image snippet) from the rest of the image, (iv) removal of noise and printed text from the signature region and (v) extraction of features required for signature matching.

Fig. 2.1. Block diagram of indexing the documents (blocks: Scanned Document Image; Segmentation and Neighbor Determination; Signature Patch Classification; Selection of Signature Region; Noise Removal; Compute Signature Features).

Segmentation and neighbor determination

A patch is defined to be a region in a document such that, if a rectangular window (size determined dynamically for each document) is drawn with each foreground pixel within the patch at its center, then the window shall not contain any foreground pixel from another patch. The window size was tuned so that a patch is approximately the size of a word. The patches are generated by a region growing algorithm, briefly described below.

1. Initialize every pixel to be a separate patch.
2. Start with a foreground pixel that is not already marked.
3. With this pixel as the center, draw a rectangular window of size proportional to the height and width of the document being considered.
4. All foreground pixels of connected components with any pixel enclosed within this rectangular window are marked as belonging to the same patch as that of the center pixel.
5. Repeat steps 2 through 4 until all pixels are marked.
6. Patches with fewer pixels than a fixed threshold are ignored as noise, and no attempt is made to label them as machine-print, handwriting/signature or noise.
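The region growing steps above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `connected_components` and `grow_patches`, the 8-connectivity, the use of union-find to merge components, and passing the window size as explicit `win_h` and `win_w` arguments (rather than deriving it from the document size) are all assumptions made for illustration.

```python
from collections import deque


def connected_components(img):
    """Label 8-connected foreground pixels (value 1).

    Returns (comp, count), where comp maps (row, col) -> component id.
    """
    height, width = len(img), len(img[0])
    comp, count = {}, 0
    for r in range(height):
        for c in range(width):
            if not img[r][c] or (r, c) in comp:
                continue
            comp[(r, c)] = count
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny in range(max(0, y - 1), min(height, y + 2)):
                    for nx in range(max(0, x - 1), min(width, x + 2)):
                        if img[ny][nx] and (ny, nx) not in comp:
                            comp[(ny, nx)] = count
                            queue.append((ny, nx))
            count += 1
    return comp, count


def grow_patches(img, win_h, win_w, min_pixels=1):
    """Merge components that share a window; drop patches below min_pixels."""
    comp, count = connected_components(img)
    parent = list(range(count))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    # Steps 2-5: centre a window on every foreground pixel and merge every
    # component that has a pixel inside that window.
    for (r, c), cid in comp.items():
        for ny in range(r - win_h // 2, r + win_h // 2 + 1):
            for nx in range(c - win_w // 2, c + win_w // 2 + 1):
                other = comp.get((ny, nx))
                if other is not None:
                    parent[find(other)] = find(cid)

    patches = {}
    for pixel, cid in comp.items():
        patches.setdefault(find(cid), []).append(pixel)
    # Step 6: patches with too few pixels are treated as noise.
    return [sorted(p) for p in patches.values() if len(p) >= min_pixels]
```

Calling `grow_patches(img, win_h, win_w, min_pixels)` on a binary image given as a list of rows returns one sorted pixel list per surviving patch. Merging whole components rather than individual pixels mirrors step 4, where a component joins a patch as soon as any of its pixels falls inside the window.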

Once all the patches are obtained for a document, the neighboring patches are identified. A total of 6 neighbors are identified for each patch: the closest patch above, the closest patch below, the two closest on the left and the two closest on the right, in terms of the convex-hull distance between the patches considered. More neighbors are included from the right and left because scanned documents have greater dependency across the width of the document. The definitions of top, bottom, left and right are determined from the center of gravity of the patch being considered. However, the convex-hull distance between two patches is measured taking the entirety of both the patches.

Conditional Random Field model description

A model based on Conditional Random Fields is used to label each of the patches identified, using the labels of the neighboring patches. The probabilistic model of the Conditional Random Field used is given below.

$$P(y \mid x, \theta) = \frac{e^{\psi(y, x; \theta)}}{\sum_{y'} e^{\psi(y', x; \theta)}} \tag{2.1}$$

where $y$ is the labeling, each label taking a value in {Machine-print, Handwriting, Noise}, $x$ is the observed document and $\theta$ are the CRF model parameters. It is assumed that a document is segmented into $m$ non-overlapping patches. Then

$$\psi(y, x; \theta) = \sum_{j=1}^{m} A(j, y_j, x; \theta^s) + \sum_{(j,k) \in E} I(j, k, y_j, y_k, x; \theta^t) \tag{2.2}$$

The first term in Eq. 2.2 is called the state term and it associates the characteristics of that patch with its corresponding label. $\theta^s$ are called the state parameters for the CRF model. Analogous to it, the second term captures the neighbor/contextual dependencies by associating pairwise interaction of the neighboring labels and the observed data. $\theta^t$ are called the transition parameters of the CRF model. $E$ is a set of edges that represent the neighbors of a patch. The association potential can be modeled as

$$A(j, y_j, x; \theta^s) = \sum_i \left( h_i \cdot \theta^{s_2}_{ij} \right)$$

where $h_i$ is typically the state feature value associated with the patch being considered. In order to introduce a non-linear decision boundary we define $h_i$ to be a transformed state feature vector

$$h_i = \tanh\left( \sum_l \left( f^s_l(j, y_j, x) \cdot \theta^{s_1}_{li} \right) \right)$$

where $f^s_l$ is the $l$th state feature extracted for that patch. The state features that are used for this problem are defined in Table 2.1. The state features $f^s_l$ are transformed by the tanh function to give the feature vector $h$. The state parameters $\theta^s$ are a union of the two sets of parameters $\theta^{s_1}$ and $\theta^{s_2}$.

The interaction potential $I(\cdot)$ is generally an inner product between the transition parameters $\theta^t$ and the transition features $f^t$. To introduce nonlinearity, we use the idea of kernels, and the interaction potential is defined as follows:

$$I(j, k, y_j, y_k, x; \theta^t) = \sum_l \left( \phi_l \cdot \theta^t_l \right)$$

where $\phi_l$ is the $l$th transition feature after applying a quadratic kernel on the original transition features as defined below.

$$\phi_l = f^t(j, k, y_j, y_k, x) \, f^t(j, k, y_j, y_k, x)$$

Table 2.1. Description of the 23 state features used.

| State Feature | Description |
|---|---|
| Height | Maximum height of the patch |
| Avg component width | The mean width of the connected components within a patch |
| Density | Density of foreground pixels within the patch |
| Aspect ratio | Width/Height of the patch |
| Gabor filter | 8 features capturing the different stroke orientations |
| Variation of height | Variation in height within a patch |
| Width variation | Variation in width within a patch |
| Overlap | Sum of overlap in area between the connected components within a patch |
| Percentage of text above | Relative location of the patch with respect to the entire document |
| Number of components | Count of the connected components within a patch |
| Maximum component size | Maximum size of a component within a patch |
| Points in convex hull | Number of points in the convex hull of the patch |
| Maximum run length | The maximum horizontal run length within a patch |
| Avg run length | The average horizontal run length within a patch |
| Horizontal Transitions | A count of the number of times the pixel value transitions from white to black horizontally |
| Vertical Transitions | A count of the number of times the pixel value transitions from white to black vertically |
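To make the structure of Eqs. 2.1 and 2.2 concrete, the following Python sketch evaluates the state and transition terms for a toy document and normalizes over every possible labeling by brute force. Several details are assumptions rather than the chapter's specification: the state weights `theta_s2` and transition weights `theta_t` are indexed by label (and label pair), the quadratic kernel is read as "all pairwise products of the transition features", and all function and parameter names are hypothetical.

```python
import itertools
import math

LABELS = ("machine-print", "handwriting", "noise")


def association(state_feats, label, theta_s1, theta_s2):
    """State term A: tanh hidden units h_i, then a label-specific weighted sum."""
    hidden = [math.tanh(sum(f * w for f, w in zip(state_feats, weights)))
              for weights in theta_s1]      # one weight vector per hidden unit i
    return sum(h * w for h, w in zip(hidden, theta_s2[label]))


def interaction(trans_feats, label_j, label_k, theta_t):
    """Transition term I: weighted sum of pairwise feature products (assumed kernel)."""
    phi = [a * b for a, b in
           itertools.combinations_with_replacement(trans_feats, 2)]
    return sum(p * w for p, w in zip(phi, theta_t[(label_j, label_k)]))


def psi(labels, state_feats, edges, trans_feats, params):
    """Eq. 2.2: state terms over patches plus transition terms over edges."""
    total = sum(association(state_feats[j], labels[j],
                            params["s1"], params["s2"])
                for j in range(len(labels)))
    total += sum(interaction(trans_feats[(j, k)], labels[j], labels[k],
                             params["t"])
                 for j, k in edges)
    return total


def posterior(state_feats, edges, trans_feats, params):
    """Eq. 2.1 by enumerating all 3^m labelings (feasible only for tiny m)."""
    scores = {y: math.exp(psi(y, state_feats, edges, trans_feats, params))
              for y in itertools.product(LABELS, repeat=len(state_feats))}
    z = sum(scores.values())                # partition function
    return {y: s / z for y, s in scores.items()}
```

The enumeration in `posterior` grows as 3^m with the number of patches m, which is why the normalizing sum of Eq. 2.1 is impractical to compute exactly for real documents.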

Parameter estimation

There are numerous ways to estimate the parameters of this CRF model (Wallach 2002). In order to avoid the computation of the partition function, we learn the parameters by maximizing the pseudo-likelihood of the documents, which is an approximation of the maximum likelihood value. We estimate the maximum pseudo-likelihood parameters using conjugate gradient descent with line search optimization. The pseudo-likelihood estimate of the parameters $\theta$ is given by Eq. 2.3:

$$\hat{\theta}_{ML} \approx \arg\max_{\theta} \prod_{i=1}^{M} P(y_i \mid y_{N_i}, x, \theta) \tag{2.3}$$

where $P(y_i \mid y_{N_i}, x, \theta)$, the probability of the label $y_i$ for a particular patch $i$ given the labels of its neighbors $y_{N_i}$, is given below.

$$P(y_i \mid y_{N_i}, x, \theta) = \frac{e^{\psi(y_i, x; \theta)}}{\sum_a e^{\psi(y_i = a, x; \theta)}} \tag{2.4}$$

where $\psi(y_i, x; \theta)$ is defined as before in Eq. 2.2. Note that Eq. 2.3 has an additional $y_{N_i}$ in the conditioning set, and hence the factorization into products is feasible, as the set of neighbors of the patch forms the minimal Markov blanket. From Eqs. 2.3 and 2.4, the log pseudo-likelihood of the data is given by

$$L(\theta) = \sum_{i=1}^{M} \left( \psi(y_i, x; \theta) - \log \sum_a e^{\psi(y_i = a, x; \theta)} \right)$$

Features for signature classification

State features try to associate each patch to a label using characteristics of that patch alone. Analogous to these, transition features associate a patch to a label using information from the neighboring patches. Twenty-three state features are extracted for each patch, as described in Table 2.1. Then, the four transition features described in Table 2.2 are computed using the state features and neighbor information. Using these extracted features from each of the 3500 patches in the training set, the parameters of the CRF were estimated as described above. Figure 2.2a shows an example of a document used for feature extraction.

Classification

The goal of inference is to assign a label to each of the patches being considered. The algorithm for inference uses the idea of Gibbs sampling (Casella and George 1992).

1. Randomly assign labels to each of the patches in a document based on an intuitive prior distribution of the labels.
2. Choose a patch at random and compute the probability of assigning each of the labels using the model from Eq. 2.4 to obtain a probability distribution p for the labels.
3. Use Gibbs sampling to sample from this distribution p to assign a probable label to the patch.
4. Repeat steps 2 and 3 until the assignments do not change. Store the set of label assignments along with the probability distribution p.
5. Repeat steps 1 to 4 for a sufficient number of iterations in order to eliminate the dependency on the initial random label assignments.
6. Consider the set of assignments arrived at in step 4 in each of the iterations, and for all the patches pick the labels with the maximum probability as the final set of labels.

Figure 2.2b shows an example of a document image obtained as a result of the classification of the signature labels on the document in Fig. 2.2a.

Table 2.2. Description of the 4 transition features used. Transition features are computed for a patch and its neighbor.

| Transition Feature | Description |
|---|---|
| Relative location | Assigned weights based on the relative location: top/bottom or right/left |
| Convex hull distance | The convex hull distance between the 2 patches |
| Ratio of aspect ratio | The ratio of the aspect ratio values of the 2 patches |
| Ratio of number of components | The ratio of the number of components present in the 2 patches |

Fig. 2.2. Sample signature extraction results: (a) Step 1: Feature extraction; (b) Step 2: Classification; (c) Step 3: Post-processing.

Post-processing

In this step, only the patches labeled as possible signatures are considered. Each of these patches is merged with other neighboring possible signature patches, the components on the right and left side being weighted more than those on the top and bottom. A region growing algorithm like the one described above, but with a larger window size, is used to merge the patches. Other small components which were left out initially are inserted back into the signature blocks being considered. Figure 2.2c shows the result of the post-processing step on the image in Fig. 2.2b.

Noise removal

Noise removal is carried out to get rid of any noise or printed text overlapping the extracted signature region. We use Support Vector Machines (SVM) (Burges 1998) to classify each connected component as either a part of a signature or a noise component, the latter comprising printed text, small handwritten text, logos, noise, etc. The SVM is previously trained on the connected components extracted from 10 sample signatures with noise. At the end of the classification step we obtain the signature image with only the signature components remaining. The features used include directional features, height, perimeter and aspect ratio. An example of the results obtained by this noise removal procedure is shown in Fig. 2.3.

Fig. 2.3. Example of noise removal.
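The six inference steps listed under Classification can be sketched in Python as follows. This is an illustrative reading, not the authors' code: `score(i, a, labels)` is a hypothetical callable standing in for the part of ψ that involves patch i when it takes label a, "until the assignments do not change" is approximated by a `patience` count of consecutive unchanged draws, and `max_steps`, `restarts` and `seed` are assumed parameters.

```python
import math
import random


def local_distribution(i, labels, label_set, score):
    """Eq. 2.4: normalise exp(score) over the candidate labels of patch i."""
    vals = [score(i, a, labels) for a in label_set]
    top = max(vals)                               # subtract max for stability
    exps = [math.exp(v - top) for v in vals]
    z = sum(exps)
    return [e / z for e in exps]


def gibbs_label(n, label_set, score, prior, restarts=5, max_steps=500,
                patience=30, seed=0):
    """Label n patches; a patch never visited in any restart keeps label None."""
    rng = random.Random(seed)
    best = [(0.0, None)] * n                      # (probability, label) per patch
    for _ in range(restarts):                     # step 5: several random starts
        labels = rng.choices(label_set, weights=prior, k=n)      # step 1
        probs = [None] * n
        unchanged = 0
        for _ in range(max_steps):
            i = rng.randrange(n)                                 # step 2
            p = local_distribution(i, labels, label_set, score)
            new = rng.choices(label_set, weights=p, k=1)[0]      # step 3
            unchanged = unchanged + 1 if new == labels[i] else 0
            labels[i] = new
            probs[i] = p[label_set.index(new)]
            if unchanged >= patience:                            # step 4
                break
        for i in range(n):                                       # step 6
            if probs[i] is not None and probs[i] > best[i][0]:
                best[i] = (probs[i], labels[i])
    return [label for _, label in best]
```

In practice `score` would sum the state term of patch i and the transition terms on its edges, so that each draw depends only on the current labels of the patch's six neighbors.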

Signature feature extraction

The next step involves indexing each document by converting the signature image extracted from the document into a set of binary feature vectors. The features used here are the Gradient, Structural and Concavity (GSC) features, which measure the image characteristics at local, intermediate and large scales and hence approximate a heterogeneous multiresolution paradigm to feature extraction. The features for the signature images, which are extracted under a 4 × 8 division, contain 384 bits of gradient features, 384 bits of structural features and 256 bits of concavity features, giving a binary feature vector of length 1024 (Zhang and Srihari 2003b). Each of these sets of binary features uniquely represents a given sample signature. Figure 2.4 shows an example of a signature image under this 4 × 8 division and the corresponding binary feature vector obtained.

The gradient features capture the stroke flow orientation and its variations using the frequency of gradient directions, as obtained by convolving the image with a Sobel edge operator, in each of 12 directions and then thresholding the resultant values to yield a 384-bit vector. The structural features represent the coarser shape of the word and capture the presence of corners, diagonal lines, and vertical and horizontal lines in the gradient image, as determined by 12 rules (Favata and Srikantan 1996). The concavity features capture the major topological and geometrical features including direction of bays, presence of holes, and large vertical and horizontal strokes.

Fig. 2.4. Feature extraction: (a) Signature image under a 4 × 8 division; (b) 1024 bit binary feature vector extracted.

Retrieval

Document retrieval is performed using a matching algorithm to compare the query with the indexed signatures. Figure 2.5 shows the various operational steps in the retrieval process: (i) noise removal from the query signature; (ii) feature extraction from the query signature after noise removal; (iii) matching the query signature features to each of the indexed documents; and (iv) ranking the documents in accordance with the results from the matching algorithm.

Fig. 2.5. Block diagram of document retrieval (blocks: Query Signature Image; Noise Removal; Compute Features; Indexed Documents; Matching; ranked output listing Doc Id and Dist).

Matching algorithm

Given a query signature image, the relevant documents are retrieved using a matching algorithm. The GSC binary feature vectors are extracted for the query, and the matching algorithm's task is to compare these features with the indexed features of the signatures present in the database of documents. Figure 2.6 shows a query signature image being matched against a few extracted signatures and the resulting dissimilarity measures obtained using the matching algorithm. The distance between the queried signature and each of the indexed documents in the database is calculated using a normalized correlation similarity measure (Zhang and Srihari 2003a, b). Given the two binary feature vectors $X \in \Omega$ and $Y \in \Omega$, each similarity score $S(X, Y)$ uses all or some of the four possible values, i.e. $S_{00}$, $S_{01}$, $S_{10}$, $S_{11}$. Here $S_{ij}$, $i, j \in \{0, 1\}$, is the number of occurrences where pattern $i$ occurs in the first binary vector and pattern $j$ occurs in the second vector in the same position. The similarity distance $S(X, Y)$ between two feature vectors $X$ and $Y$ is given by Eq. 2.5.

$$S(X, Y) = \frac{1}{2} - \frac{S_{11} S_{00} - S_{10} S_{01}}{2 \left( (S_{10} + S_{11})(S_{01} + S_{00})(S_{11} + S_{01})(S_{00} + S_{10}) \right)^{1/2}} \tag{2.5}$$

where

- $S_{00}$ = the first binary vector has a 0 and the second vector too has a 0 in the corresponding positions.
- $S_{11}$ = the first binary vector has a 1 and the second vector too has a 1 in the corresponding positions.
- $S_{01}$ = the first binary vector has a 0 while the second vector has a 1 in the corresponding positions.
- $S_{10}$ = the first binary vector has a 1 while the second vector has a 0 in the corresponding positions.

Fig. 2.6. Subset of retrieval results with the query image on the left and the signatures matched against and their corresponding dissimilarity distances on the right.

When constructing the similarity distance measure, all possible matches $S_{ij}$, $i, j \in \{0, 1\}$, are considered for better classification. Also, $S_{00}$ has been weighted with a beta value of 0.5 to boost classification. The results are ranked in increasing order of this dissimilarity distance, which varies between 0 and 1, a value of 0 indicating an exact match. In the signature retrieval process there is no prior knowledge of the writer's signature; the goal is to identify the closest matching signatures and to identify all the documents containing signatures by the writer of the queried signature. Each of the retrieved signature images is also linked with its corresponding document ID, which allows the user to easily retrieve its location and the document it belongs to. Before the matching algorithm is applied, the query signature image is processed to remove any overlapping printed or noisy components as mentioned above. Following this, the GSC features for this component are extracted.

Query expansion using automatic relevance feedback

A query expansion is done using the feedback (retrieval results) of the matching algorithm. The matching score $S_i$ for a query $q$ matched against a document $D_i$, given by Eq. 2.6, is computed for each document and the scores are sorted in ascending order. The document with the lowest $S_i$ is the most relevant document retrieved.

$$S_i = S(f(q), f(D_i)) \tag{2.6}$$

where $f(q)$ is the binary feature vector of the image $q$, $f(D_i)$ is the binary feature vector indexed in $D_i$, and $S(f(q), f(D_i))$ is given by Eq. 2.5. Let document $D_i$ correspond to the document with the lowest $S_i$. The signature image extracted from the document $D_i$ is used as a new query $q_{new}$ and added to the existing query to formulate an expanded query consisting of the 2 images, $q$ and $q_{new}$. The retrieval is performed using the matching algorithm with this new query $\{q, q_{new}\}$. The new score for each document, $S_i(\{q, q_{new}\}, D_i)$, is computed as the minimum distance obtained from the 2 queries, as given by Eq. 2.7.

$$S_i(\{q, q_{new}\}, D_i) = \min\{ S(f(q), f(D_i)),\; S(f(q_{new}), f(D_i)) \} \tag{2.7}$$

This technique improves the accuracy of the retrieved results as the matching algorithm consistently returns relevant documents in the top results.

Dataset

The dataset used for this experiment was taken from a set of 744 document images signed by 67 different authors.
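As an illustration of the matching and query-expansion steps described above (Eqs. 2.5 to 2.7), the Python sketch below computes the correlation distance between binary vectors and re-ranks an index using the top result as a second query. It assumes the form of Eq. 2.5 with the leading 1/2 term, which is the form under which identical vectors score 0 and complementary vectors score 1; it omits the beta weighting of $S_{00}$, and the names `correlation_distance`, `rank` and `rank_with_expansion` and the dict-based index are hypothetical.

```python
import math


def correlation_distance(x, y):
    """Eq. 2.5 on two equal-length 0/1 sequences; 0 means an exact match."""
    counts = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 0}
    for a, b in zip(x, y):
        counts[(a, b)] += 1
    s00, s01 = counts[(0, 0)], counts[(0, 1)]
    s10, s11 = counts[(1, 0)], counts[(1, 1)]
    denom = 2.0 * math.sqrt((s10 + s11) * (s01 + s00)
                            * (s11 + s01) * (s00 + s10))
    if denom == 0.0:
        # Happens when either vector is all zeros or all ones.
        raise ValueError("correlation distance is undefined for constant vectors")
    return 0.5 - (s11 * s00 - s10 * s01) / denom


def rank(query, index):
    """Eq. 2.6: (distance, doc_id) pairs sorted by ascending distance."""
    return sorted((correlation_distance(query, feats), doc)
                  for doc, feats in index.items())


def rank_with_expansion(query, index):
    """Eq. 2.7: use the top hit as a second query and keep the smaller distance."""
    top_doc = rank(query, index)[0][1]
    q_new = index[top_doc]
    return sorted((min(correlation_distance(query, feats),
                       correlation_distance(q_new, feats)), doc)
                  for doc, feats in index.items())
```

Here `index` maps a document ID to the binary feature vector of its extracted signature; in the chapter those vectors are the 1024-bit GSC features.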
This set of documents consists of a variety of documents, a majority of which have printed text with a signature at the bottom. There are also documents with handwritten text around this printed text, only handwritten documents, documents with images like tables, graphs, etc., and multiple signatures per document or no signatures at all. Many of these documents also have logos, other symbolic text and noisy components like words circled or scratched, handwritten text overlapping the printed text, or printed text overlapping the signatures. There are also documents with lines and black borders and noise. Some of the writers have several types of signatures like the writer's full name, initials, only first name, etc. Documents with multiple signatures per document and purely handwritten documents with signatures have also been considered here. For this experiment we randomly picked several different authors and picked 2 to 5 documents per author, making a total of 101 documents containing a total of 114 signatures. Figure 2.7 shows all the signature samples automatically extracted from the documents belonging to one of the writers.

Fig. 2.7. All the automatically extracted samples for writer 10.

Experiments and results

In this section, the test setup and the experimental results obtained for the signature retrieval task are described. In the test setup for signature retrieval, the images were divided into 2 groups per writer: one group consisting of known document images and a second group consisting of the questioned signatures for testing. The image formats supported are png, jpeg and tiff. The database of documents with known signatures is first processed to index each document. Of the 114 signatures in the 101 documents, 104 (91.2%) were extracted correctly, with the extracted region containing the entire signature image. Following this, the signature image in question is selected and this queried image is preprocessed to remove any overlapping printed text or noise. The set of indexed documents is selected and the signature retrieval process is carried out against this set of known documents. In each case, the precision and recall measures are calculated.
The precision and recall measures (Salton and McGill 1983; van Rijsbergen 1979) for a rank R, where the author of the questioned signature is represented by $a$, are defined as follows:

$$\text{Recall of label } a = \frac{\text{Amount of correctly classified data of label } a}{\text{Total amount of data of label } a}$$

$$\text{Precision of label } a = \frac{\text{Amount of correctly classified data of label } a}{\text{Total amount of data classified to be of label } a}$$

The testing was done for 1 to 2 extracted signature images per writer, which were randomly selected from the entire set. Each of these signatures was queried against the entire set of 114 indexed signature images in the database. The ranks of the retrieved documents which were signed by the author of the questioned signature were noted in each case, and the average precision and recall values were estimated for different ranks.
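A small Python sketch of these measures at a rank cutoff R follows. It assumes the usual retrieval reading of the definitions: "correctly classified" means a retrieved signature within the top R that belongs to the query's author, and the total for recall is the number of that author's signatures in the index. The function name and arguments are hypothetical.

```python
def precision_recall_at_rank(ranked_authors, author, total_relevant, rank):
    """Precision and recall over the top `rank` retrieved signatures.

    ranked_authors: author of each retrieved signature, best match first.
    total_relevant: number of indexed signatures written by `author`.
    """
    if rank <= 0 or total_relevant <= 0:
        raise ValueError("rank and total_relevant must be positive")
    hits = sum(1 for a in ranked_authors[:rank] if a == author)
    precision = hits / rank            # share of the top results that are relevant
    recall = hits / total_relevant     # share of the relevant items retrieved
    return precision, recall
```

Averaging these values over all queries at each rank cutoff gives the points of a precision-recall curve.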

Fig. 2.8. Precision-recall curves for signature retrieval results: Precision of 84.2% at recall of 78.4% after query expansion. (Plot title: Precision Recall Curve for Signature Retrieval; axes: Recall % and Precision %.)

The experiments were conducted using query expansion, where the top results from the retrieval results for the initial query were used along with the initial query to retrieve relevant documents. Figure 2.8 shows the precision-recall curves obtained in this experiment. In the top 5 results a recall of 78.4% is obtained; the precision at this point is 84.2%. Table 2.3 shows the results at the end of this phase. There is an increase in the retrieval accuracy on using query expansion, which shows that the system consistently retrieved a relevant document as the top choice, and that using this top choice result along with the original query strengthened the retrieval accuracy. The retrieval accuracy has also been impacted by several factors: the signature extraction was effective in 91.2% of the cases, so some of the indexed documents contained spurious signature images; the noise removal technique led to the removal of some components belonging to the signature in a small number of cases; and some of the documents were of poor quality.

Table 2.3. Recall measures for signature retrieval from entire database. Columns: No. of Results Considered (rank cutoff); Recall Measure (%).

Conclusions

This chapter presented a set of experiments on document retrieval using signatures, together with their results. The tests were conducted on a variety of document and signature samples including those with noise, logos, figures, printed and handwritten text. Although the presence of noise and text overlapping the signatures makes retrieval a challenging task, our technique returned a relatively high precision and recall of 84.2% and 78.4% respectively when considering the top 5 results. This can be attributed to the usage of conditional random fields for the removal of printed and noisy data from the documents, leading to an accurate signature extraction in most cases, followed by the usage of an effective matching algorithm using global shape-based features.

References

Burges, C. 1998. A tutorial on support vector machines for pattern recognition. Data Mining and Knowledge Discovery, 2(2).
Casella, G. and E. George. 1992. Explaining the Gibbs sampler. The American Statistician, 46.
Favata, J. T. and G. Srikantan. 1996. A multiple feature resolution approach for handprinted digit and character recognition. International Journal of Imaging Systems and Technology, 7.
Kolz, A., J. Alspector, M. Augusteijn, R. Carlson, and G. V. Popescu. 2000. A line-oriented approach to word spotting in handwritten documents. Pattern Analysis and Applications, 2(3).
Kumar, S. and M. Hebert. 2003. Discriminative fields for modeling spatial dependencies in natural images. Advances in Neural Information Processing Systems (NIPS-2003).
Lafferty, J., A. McCallum, and F. Pereira. 2001. Conditional random fields: Probabilistic models for segmenting and labeling sequential data. Eighteenth International Conference on Machine Learning (ICML-2001).
Plamondon, R. and G. Lorette. 2000. On-line and offline handwriting recognition: A comprehensive survey. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(1).
Quattoni, A., M. Collins, and T. Darrell. 2005. Conditional random fields for object recognition. Advances in Neural Information Processing Systems 17 (NIPS 2004).
Rath, T., R. Manmatha, and V. Lavrenko. 2004. A search engine for historical manuscript images. Proceedings of the 27th Annual Int'l SIGIR Conference.
Salton, G. and M. J. McGill. 1983. Introduction to Modern Information Retrieval. McGraw-Hill, New York.
Srihari, S., S. Shetty, S. Chen, H. Srinivasan, and C. Huang. 2006. Document image retrieval using signatures as queries. Document Image Analysis for Libraries (DIAL '06).
van Rijsbergen, C. J. 1979. Information Retrieval. Butterworths, London.
Wallach, H. 2002. Efficient training of conditional random fields. Proceedings of 6th Annual CLUK Research Colloquium.
Zhang, B. and S. Srihari. 2003a. Binary vector dissimilarity measures for handwriting identification. SPIE, Document Recognition and Retrieval X.
Zhang, B. and S. Srihari. 2003b. Properties of binary vector dissimilarity measures. Cary, North Carolina, September.
Zhang, B., S. N. Srihari, and C. Huang. 2004. Word image retrieval using binary features. Document Recognition and Retrieval XI, SPIE, San Jose, CA.


More information

Comparison of Non-linear Dimensionality Reduction Techniques for Classification with Gene Expression Microarray Data

Comparison of Non-linear Dimensionality Reduction Techniques for Classification with Gene Expression Microarray Data CMPE 59H Comparison of Non-linear Dimensionality Reduction Techniques for Classification with Gene Expression Microarray Data Term Project Report Fatma Güney, Kübra Kalkan 1/15/2013 Keywords: Non-linear

More information

Modelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches

Modelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches Modelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches PhD Thesis by Payam Birjandi Director: Prof. Mihai Datcu Problematic

More information

Offline Word Spotting in Handwritten Documents

Offline Word Spotting in Handwritten Documents Offline Word Spotting in Handwritten Documents Nicholas True Department of Computer Science University of California, San Diego San Diego, CA 9500 ntrue@cs.ucsd.edu Abstract The digitization of written

More information

Categorical Data Visualization and Clustering Using Subjective Factors

Categorical Data Visualization and Clustering Using Subjective Factors Categorical Data Visualization and Clustering Using Subjective Factors Chia-Hui Chang and Zhi-Kai Ding Department of Computer Science and Information Engineering, National Central University, Chung-Li,

More information

Automatic Extraction of Signatures from Bank Cheques and other Documents

Automatic Extraction of Signatures from Bank Cheques and other Documents Automatic Extraction of Signatures from Bank Cheques and other Documents Vamsi Krishna Madasu *, Mohd. Hafizuddin Mohd. Yusof, M. Hanmandlu ß, Kurt Kubik * *Intelligent Real-Time Imaging and Sensing group,

More information

Keywords image processing, signature verification, false acceptance rate, false rejection rate, forgeries, feature vectors, support vector machines.

Keywords image processing, signature verification, false acceptance rate, false rejection rate, forgeries, feature vectors, support vector machines. International Journal of Computer Application and Engineering Technology Volume 3-Issue2, Apr 2014.Pp. 188-192 www.ijcaet.net OFFLINE SIGNATURE VERIFICATION SYSTEM -A REVIEW Pooja Department of Computer

More information

The Scientific Data Mining Process

The Scientific Data Mining Process Chapter 4 The Scientific Data Mining Process When I use a word, Humpty Dumpty said, in rather a scornful tone, it means just what I choose it to mean neither more nor less. Lewis Carroll [87, p. 214] In

More information

Recognizing Cats and Dogs with Shape and Appearance based Models. Group Member: Chu Wang, Landu Jiang

Recognizing Cats and Dogs with Shape and Appearance based Models. Group Member: Chu Wang, Landu Jiang Recognizing Cats and Dogs with Shape and Appearance based Models Group Member: Chu Wang, Landu Jiang Abstract Recognizing cats and dogs from images is a challenging competition raised by Kaggle platform

More information

Social Media Mining. Data Mining Essentials

Social Media Mining. Data Mining Essentials Introduction Data production rate has been increased dramatically (Big Data) and we are able store much more data than before E.g., purchase data, social media data, mobile phone data Businesses and customers

More information

Handwritten Character Recognition from Bank Cheque

Handwritten Character Recognition from Bank Cheque International Journal of Computer Sciences and Engineering Open Access Research Paper Volume-4, Special Issue-1 E-ISSN: 2347-2693 Handwritten Character Recognition from Bank Cheque Siddhartha Banerjee*

More information

Biometric Authentication using Online Signatures

Biometric Authentication using Online Signatures Biometric Authentication using Online Signatures Alisher Kholmatov and Berrin Yanikoglu alisher@su.sabanciuniv.edu, berrin@sabanciuniv.edu http://fens.sabanciuniv.edu Sabanci University, Tuzla, Istanbul,

More information

COMPARISON OF OBJECT BASED AND PIXEL BASED CLASSIFICATION OF HIGH RESOLUTION SATELLITE IMAGES USING ARTIFICIAL NEURAL NETWORKS

COMPARISON OF OBJECT BASED AND PIXEL BASED CLASSIFICATION OF HIGH RESOLUTION SATELLITE IMAGES USING ARTIFICIAL NEURAL NETWORKS COMPARISON OF OBJECT BASED AND PIXEL BASED CLASSIFICATION OF HIGH RESOLUTION SATELLITE IMAGES USING ARTIFICIAL NEURAL NETWORKS B.K. Mohan and S. N. Ladha Centre for Studies in Resources Engineering IIT

More information

Comparing the Results of Support Vector Machines with Traditional Data Mining Algorithms

Comparing the Results of Support Vector Machines with Traditional Data Mining Algorithms Comparing the Results of Support Vector Machines with Traditional Data Mining Algorithms Scott Pion and Lutz Hamel Abstract This paper presents the results of a series of analyses performed on direct mail

More information

Signature Segmentation and Recognition from Scanned Documents

Signature Segmentation and Recognition from Scanned Documents Signature Segmentation and Recognition from Scanned Documents Ranju Mandal, Partha Pratim Roy, Umapada Pal and Michael Blumenstein School of Information and Communication Technology, Griffith University,

More information

Clustering. Adrian Groza. Department of Computer Science Technical University of Cluj-Napoca

Clustering. Adrian Groza. Department of Computer Science Technical University of Cluj-Napoca Clustering Adrian Groza Department of Computer Science Technical University of Cluj-Napoca Outline 1 Cluster Analysis What is Datamining? Cluster Analysis 2 K-means 3 Hierarchical Clustering What is Datamining?

More information

Support Vector Machines with Clustering for Training with Very Large Datasets

Support Vector Machines with Clustering for Training with Very Large Datasets Support Vector Machines with Clustering for Training with Very Large Datasets Theodoros Evgeniou Technology Management INSEAD Bd de Constance, Fontainebleau 77300, France theodoros.evgeniou@insead.fr Massimiliano

More information

How To Cluster On A Search Engine

How To Cluster On A Search Engine Volume 2, Issue 2, February 2012 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: A REVIEW ON QUERY CLUSTERING

More information

Search Taxonomy. Web Search. Search Engine Optimization. Information Retrieval

Search Taxonomy. Web Search. Search Engine Optimization. Information Retrieval Information Retrieval INFO 4300 / CS 4300! Retrieval models Older models» Boolean retrieval» Vector Space model Probabilistic Models» BM25» Language models Web search» Learning to Rank Search Taxonomy!

More information

Analecta Vol. 8, No. 2 ISSN 2064-7964

Analecta Vol. 8, No. 2 ISSN 2064-7964 EXPERIMENTAL APPLICATIONS OF ARTIFICIAL NEURAL NETWORKS IN ENGINEERING PROCESSING SYSTEM S. Dadvandipour Institute of Information Engineering, University of Miskolc, Egyetemváros, 3515, Miskolc, Hungary,

More information

High-Performance Signature Recognition Method using SVM

High-Performance Signature Recognition Method using SVM High-Performance Signature Recognition Method using SVM Saeid Fazli Research Institute of Modern Biological Techniques University of Zanjan Shima Pouyan Electrical Engineering Department University of

More information

Automatic 3D Reconstruction via Object Detection and 3D Transformable Model Matching CS 269 Class Project Report

Automatic 3D Reconstruction via Object Detection and 3D Transformable Model Matching CS 269 Class Project Report Automatic 3D Reconstruction via Object Detection and 3D Transformable Model Matching CS 69 Class Project Report Junhua Mao and Lunbo Xu University of California, Los Angeles mjhustc@ucla.edu and lunbo

More information

Galaxy Morphological Classification

Galaxy Morphological Classification Galaxy Morphological Classification Jordan Duprey and James Kolano Abstract To solve the issue of galaxy morphological classification according to a classification scheme modelled off of the Hubble Sequence,

More information

Handwritten Signature Verification using Neural Network

Handwritten Signature Verification using Neural Network Handwritten Signature Verification using Neural Network Ashwini Pansare Assistant Professor in Computer Engineering Department, Mumbai University, India Shalini Bhatia Associate Professor in Computer Engineering

More information

Visual Structure Analysis of Flow Charts in Patent Images

Visual Structure Analysis of Flow Charts in Patent Images Visual Structure Analysis of Flow Charts in Patent Images Roland Mörzinger, René Schuster, András Horti, and Georg Thallinger JOANNEUM RESEARCH Forschungsgesellschaft mbh DIGITAL - Institute for Information

More information

Statistical Models in Data Mining

Statistical Models in Data Mining Statistical Models in Data Mining Sargur N. Srihari University at Buffalo The State University of New York Department of Computer Science and Engineering Department of Biostatistics 1 Srihari Flood of

More information

DATA MINING CLUSTER ANALYSIS: BASIC CONCEPTS

DATA MINING CLUSTER ANALYSIS: BASIC CONCEPTS DATA MINING CLUSTER ANALYSIS: BASIC CONCEPTS 1 AND ALGORITHMS Chiara Renso KDD-LAB ISTI- CNR, Pisa, Italy WHAT IS CLUSTER ANALYSIS? Finding groups of objects such that the objects in a group will be similar

More information

Tracking Groups of Pedestrians in Video Sequences

Tracking Groups of Pedestrians in Video Sequences Tracking Groups of Pedestrians in Video Sequences Jorge S. Marques Pedro M. Jorge Arnaldo J. Abrantes J. M. Lemos IST / ISR ISEL / IST ISEL INESC-ID / IST Lisbon, Portugal Lisbon, Portugal Lisbon, Portugal

More information

The Artificial Prediction Market

The Artificial Prediction Market The Artificial Prediction Market Adrian Barbu Department of Statistics Florida State University Joint work with Nathan Lay, Siemens Corporate Research 1 Overview Main Contributions A mathematical theory

More information

Classification of Fingerprints. Sarat C. Dass Department of Statistics & Probability

Classification of Fingerprints. Sarat C. Dass Department of Statistics & Probability Classification of Fingerprints Sarat C. Dass Department of Statistics & Probability Fingerprint Classification Fingerprint classification is a coarse level partitioning of a fingerprint database into smaller

More information

Low-resolution Character Recognition by Video-based Super-resolution

Low-resolution Character Recognition by Video-based Super-resolution 2009 10th International Conference on Document Analysis and Recognition Low-resolution Character Recognition by Video-based Super-resolution Ataru Ohkura 1, Daisuke Deguchi 1, Tomokazu Takahashi 2, Ichiro

More information

High-dimensional labeled data analysis with Gabriel graphs

High-dimensional labeled data analysis with Gabriel graphs High-dimensional labeled data analysis with Gabriel graphs Michaël Aupetit CEA - DAM Département Analyse Surveillance Environnement BP 12-91680 - Bruyères-Le-Châtel, France Abstract. We propose the use

More information

SOURCE SCANNER IDENTIFICATION FOR SCANNED DOCUMENTS. Nitin Khanna and Edward J. Delp

SOURCE SCANNER IDENTIFICATION FOR SCANNED DOCUMENTS. Nitin Khanna and Edward J. Delp SOURCE SCANNER IDENTIFICATION FOR SCANNED DOCUMENTS Nitin Khanna and Edward J. Delp Video and Image Processing Laboratory School of Electrical and Computer Engineering Purdue University West Lafayette,

More information

Data Mining. Cluster Analysis: Advanced Concepts and Algorithms

Data Mining. Cluster Analysis: Advanced Concepts and Algorithms Data Mining Cluster Analysis: Advanced Concepts and Algorithms Tan,Steinbach, Kumar Introduction to Data Mining 4/18/2004 1 More Clustering Methods Prototype-based clustering Density-based clustering Graph-based

More information

DIAGONAL BASED FEATURE EXTRACTION FOR HANDWRITTEN ALPHABETS RECOGNITION SYSTEM USING NEURAL NETWORK

DIAGONAL BASED FEATURE EXTRACTION FOR HANDWRITTEN ALPHABETS RECOGNITION SYSTEM USING NEURAL NETWORK DIAGONAL BASED FEATURE EXTRACTION FOR HANDWRITTEN ALPHABETS RECOGNITION SYSTEM USING NEURAL NETWORK J.Pradeep 1, E.Srinivasan 2 and S.Himavathi 3 1,2 Department of ECE, Pondicherry College Engineering,

More information

Data Mining Cluster Analysis: Basic Concepts and Algorithms. Lecture Notes for Chapter 8. Introduction to Data Mining

Data Mining Cluster Analysis: Basic Concepts and Algorithms. Lecture Notes for Chapter 8. Introduction to Data Mining Data Mining Cluster Analysis: Basic Concepts and Algorithms Lecture Notes for Chapter 8 by Tan, Steinbach, Kumar 1 What is Cluster Analysis? Finding groups of objects such that the objects in a group will

More information

A Lightweight and Effective Music Score Recognition on Mobile Phone

A Lightweight and Effective Music Score Recognition on Mobile Phone J Inf Process Syst, http://dx.doi.org/.3745/jips ISSN 1976-913X (Print) ISSN 92-5X (Electronic) A Lightweight and Effective Music Score Recognition on Mobile Phone Tam Nguyen* and Gueesang Lee** Abstract

More information

Predict Influencers in the Social Network

Predict Influencers in the Social Network Predict Influencers in the Social Network Ruishan Liu, Yang Zhao and Liuyu Zhou Email: rliu2, yzhao2, lyzhou@stanford.edu Department of Electrical Engineering, Stanford University Abstract Given two persons

More information

Neovision2 Performance Evaluation Protocol

Neovision2 Performance Evaluation Protocol Neovision2 Performance Evaluation Protocol Version 3.0 4/16/2012 Public Release Prepared by Rajmadhan Ekambaram rajmadhan@mail.usf.edu Dmitry Goldgof, Ph.D. goldgof@cse.usf.edu Rangachar Kasturi, Ph.D.

More information

E-commerce Transaction Anomaly Classification

E-commerce Transaction Anomaly Classification E-commerce Transaction Anomaly Classification Minyong Lee minyong@stanford.edu Seunghee Ham sham12@stanford.edu Qiyi Jiang qjiang@stanford.edu I. INTRODUCTION Due to the increasing popularity of e-commerce

More information

MapReduce Approach to Collective Classification for Networks

MapReduce Approach to Collective Classification for Networks MapReduce Approach to Collective Classification for Networks Wojciech Indyk 1, Tomasz Kajdanowicz 1, Przemyslaw Kazienko 1, and Slawomir Plamowski 1 Wroclaw University of Technology, Wroclaw, Poland Faculty

More information

Character Image Patterns as Big Data

Character Image Patterns as Big Data 22 International Conference on Frontiers in Handwriting Recognition Character Image Patterns as Big Data Seiichi Uchida, Ryosuke Ishida, Akira Yoshida, Wenjie Cai, Yaokai Feng Kyushu University, Fukuoka,

More information

Introduction to Pattern Recognition

Introduction to Pattern Recognition Introduction to Pattern Recognition Selim Aksoy Department of Computer Engineering Bilkent University saksoy@cs.bilkent.edu.tr CS 551, Spring 2009 CS 551, Spring 2009 c 2009, Selim Aksoy (Bilkent University)

More information

A Two-Pass Statistical Approach for Automatic Personalized Spam Filtering

A Two-Pass Statistical Approach for Automatic Personalized Spam Filtering A Two-Pass Statistical Approach for Automatic Personalized Spam Filtering Khurum Nazir Junejo, Mirza Muhammad Yousaf, and Asim Karim Dept. of Computer Science, Lahore University of Management Sciences

More information

How To Fix Out Of Focus And Blur Images With A Dynamic Template Matching Algorithm

How To Fix Out Of Focus And Blur Images With A Dynamic Template Matching Algorithm IJSTE - International Journal of Science Technology & Engineering Volume 1 Issue 10 April 2015 ISSN (online): 2349-784X Image Estimation Algorithm for Out of Focus and Blur Images to Retrieve the Barcode

More information

Online Farsi Handwritten Character Recognition Using Hidden Markov Model

Online Farsi Handwritten Character Recognition Using Hidden Markov Model Online Farsi Handwritten Character Recognition Using Hidden Markov Model Vahid Ghods*, Mohammad Karim Sohrabi Department of Electrical and Computer Engineering, Semnan Branch, Islamic Azad University,

More information

Footwear Print Retrieval System for Real Crime Scene Marks

Footwear Print Retrieval System for Real Crime Scene Marks Footwear Print Retrieval System for Real Crime Scene Marks Yi Tang, Sargur N. Srihari, Harish Kasiviswanathan and Jason J. Corso Center of Excellence for Document Analysis and Recognition (CEDAR) University

More information

SIGNATURE VERIFICATION

SIGNATURE VERIFICATION SIGNATURE VERIFICATION Dr. H.B.Kekre, Dr. Dhirendra Mishra, Ms. Shilpa Buddhadev, Ms. Bhagyashree Mall, Mr. Gaurav Jangid, Ms. Nikita Lakhotia Computer engineering Department, MPSTME, NMIMS University

More information

Predicting the Risk of Heart Attacks using Neural Network and Decision Tree

Predicting the Risk of Heart Attacks using Neural Network and Decision Tree Predicting the Risk of Heart Attacks using Neural Network and Decision Tree S.Florence 1, N.G.Bhuvaneswari Amma 2, G.Annapoorani 3, K.Malathi 4 PG Scholar, Indian Institute of Information Technology, Srirangam,

More information

Recognition of Handwritten Digits using Structural Information

Recognition of Handwritten Digits using Structural Information Recognition of Handwritten Digits using Structural Information Sven Behnke Martin-Luther University, Halle-Wittenberg' Institute of Computer Science 06099 Halle, Germany { behnke Irojas} @ informatik.uni-halle.de

More information

Face detection is a process of localizing and extracting the face region from the

Face detection is a process of localizing and extracting the face region from the Chapter 4 FACE NORMALIZATION 4.1 INTRODUCTION Face detection is a process of localizing and extracting the face region from the background. The detected face varies in rotation, brightness, size, etc.

More information

Circle Object Recognition Based on Monocular Vision for Home Security Robot

Circle Object Recognition Based on Monocular Vision for Home Security Robot Journal of Applied Science and Engineering, Vol. 16, No. 3, pp. 261 268 (2013) DOI: 10.6180/jase.2013.16.3.05 Circle Object Recognition Based on Monocular Vision for Home Security Robot Shih-An Li, Ching-Chang

More information

How To Cluster Of Complex Systems

How To Cluster Of Complex Systems Entropy based Graph Clustering: Application to Biological and Social Networks Edward C Kenley Young-Rae Cho Department of Computer Science Baylor University Complex Systems Definition Dynamically evolving

More information

PDF hosted at the Radboud Repository of the Radboud University Nijmegen

PDF hosted at the Radboud Repository of the Radboud University Nijmegen PDF hosted at the Radboud Repository of the Radboud University Nijmegen The following full text is a publisher's version. For additional information about this publication click this link. http://hdl.handle.net/2066/54957

More information

Establishing the Uniqueness of the Human Voice for Security Applications

Establishing the Uniqueness of the Human Voice for Security Applications Proceedings of Student/Faculty Research Day, CSIS, Pace University, May 7th, 2004 Establishing the Uniqueness of the Human Voice for Security Applications Naresh P. Trilok, Sung-Hyuk Cha, and Charles C.

More information

Research on Chinese financial invoice recognition technology

Research on Chinese financial invoice recognition technology Pattern Recognition Letters 24 (2003) 489 497 www.elsevier.com/locate/patrec Research on Chinese financial invoice recognition technology Delie Ming a,b, *, Jian Liu b, Jinwen Tian b a State Key Laboratory

More information

Numerical Field Extraction in Handwritten Incoming Mail Documents

Numerical Field Extraction in Handwritten Incoming Mail Documents Numerical Field Extraction in Handwritten Incoming Mail Documents Guillaume Koch, Laurent Heutte and Thierry Paquet PSI, FRE CNRS 2645, Université de Rouen, 76821 Mont-Saint-Aignan, France Laurent.Heutte@univ-rouen.fr

More information

. Learn the number of classes and the structure of each class using similarity between unlabeled training patterns

. Learn the number of classes and the structure of each class using similarity between unlabeled training patterns Outline Part 1: of data clustering Non-Supervised Learning and Clustering : Problem formulation cluster analysis : Taxonomies of Clustering Techniques : Data types and Proximity Measures : Difficulties

More information

Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.

Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not. Statistical Learning: Chapter 4 Classification 4.1 Introduction Supervised learning with a categorical (Qualitative) response Notation: - Feature vector X, - qualitative response Y, taking values in C

More information

How To Cluster

How To Cluster Data Clustering Dec 2nd, 2013 Kyrylo Bessonov Talk outline Introduction to clustering Types of clustering Supervised Unsupervised Similarity measures Main clustering algorithms k-means Hierarchical Main

More information

International Journal of Computer Science Trends and Technology (IJCST) Volume 3 Issue 3, May-June 2015

International Journal of Computer Science Trends and Technology (IJCST) Volume 3 Issue 3, May-June 2015 RESEARCH ARTICLE OPEN ACCESS Data Mining Technology for Efficient Network Security Management Ankit Naik [1], S.W. Ahmad [2] Student [1], Assistant Professor [2] Department of Computer Science and Engineering

More information

Environmental Remote Sensing GEOG 2021

Environmental Remote Sensing GEOG 2021 Environmental Remote Sensing GEOG 2021 Lecture 4 Image classification 2 Purpose categorising data data abstraction / simplification data interpretation mapping for land cover mapping use land cover class

More information

Medical Image Segmentation of PACS System Image Post-processing *

Medical Image Segmentation of PACS System Image Post-processing * Medical Image Segmentation of PACS System Image Post-processing * Lv Jie, Xiong Chun-rong, and Xie Miao Department of Professional Technical Institute, Yulin Normal University, Yulin Guangxi 537000, China

More information

STATIC SIGNATURE RECOGNITION SYSTEM FOR USER AUTHENTICATION BASED TWO LEVEL COG, HOUGH TRANSFORM AND NEURAL NETWORK

STATIC SIGNATURE RECOGNITION SYSTEM FOR USER AUTHENTICATION BASED TWO LEVEL COG, HOUGH TRANSFORM AND NEURAL NETWORK Volume 6, Issue 3, pp: 335343 IJESET STATIC SIGNATURE RECOGNITION SYSTEM FOR USER AUTHENTICATION BASED TWO LEVEL COG, HOUGH TRANSFORM AND NEURAL NETWORK Dipti Verma 1, Sipi Dubey 2 1 Department of Computer

More information

Protein Protein Interaction Networks

Protein Protein Interaction Networks Functional Pattern Mining from Genome Scale Protein Protein Interaction Networks Young-Rae Cho, Ph.D. Assistant Professor Department of Computer Science Baylor University it My Definition of Bioinformatics

More information

Determining optimal window size for texture feature extraction methods

Determining optimal window size for texture feature extraction methods IX Spanish Symposium on Pattern Recognition and Image Analysis, Castellon, Spain, May 2001, vol.2, 237-242, ISBN: 84-8021-351-5. Determining optimal window size for texture feature extraction methods Domènec

More information

ENHANCED WEB IMAGE RE-RANKING USING SEMANTIC SIGNATURES

ENHANCED WEB IMAGE RE-RANKING USING SEMANTIC SIGNATURES International Journal of Computer Engineering & Technology (IJCET) Volume 7, Issue 2, March-April 2016, pp. 24 29, Article ID: IJCET_07_02_003 Available online at http://www.iaeme.com/ijcet/issues.asp?jtype=ijcet&vtype=7&itype=2

More information

Open Access A Facial Expression Recognition Algorithm Based on Local Binary Pattern and Empirical Mode Decomposition

Open Access A Facial Expression Recognition Algorithm Based on Local Binary Pattern and Empirical Mode Decomposition Send Orders for Reprints to reprints@benthamscience.ae The Open Electrical & Electronic Engineering Journal, 2014, 8, 599-604 599 Open Access A Facial Expression Recognition Algorithm Based on Local Binary

More information

Probabilistic Latent Semantic Analysis (plsa)

Probabilistic Latent Semantic Analysis (plsa) Probabilistic Latent Semantic Analysis (plsa) SS 2008 Bayesian Networks Multimedia Computing, Universität Augsburg Rainer.Lienhart@informatik.uni-augsburg.de www.multimedia-computing.{de,org} References

More information

Robust Outlier Detection Technique in Data Mining: A Univariate Approach

Robust Outlier Detection Technique in Data Mining: A Univariate Approach Robust Outlier Detection Technique in Data Mining: A Univariate Approach Singh Vijendra and Pathak Shivani Faculty of Engineering and Technology Mody Institute of Technology and Science Lakshmangarh, Sikar,

More information

Linear Threshold Units

Linear Threshold Units Linear Threshold Units w x hx (... w n x n w We assume that each feature x j and each weight w j is a real number (we will relax this later) We will study three different algorithms for learning linear

More information

Template-based Eye and Mouth Detection for 3D Video Conferencing

Template-based Eye and Mouth Detection for 3D Video Conferencing Template-based Eye and Mouth Detection for 3D Video Conferencing Jürgen Rurainsky and Peter Eisert Fraunhofer Institute for Telecommunications - Heinrich-Hertz-Institute, Image Processing Department, Einsteinufer

More information

Data Mining - Evaluation of Classifiers

Data Mining - Evaluation of Classifiers Data Mining - Evaluation of Classifiers Lecturer: JERZY STEFANOWSKI Institute of Computing Sciences Poznan University of Technology Poznan, Poland Lecture 4 SE Master Course 2008/2009 revised for 2010

More information

An Overview of Knowledge Discovery Database and Data mining Techniques

An Overview of Knowledge Discovery Database and Data mining Techniques An Overview of Knowledge Discovery Database and Data mining Techniques Priyadharsini.C 1, Dr. Antony Selvadoss Thanamani 2 M.Phil, Department of Computer Science, NGM College, Pollachi, Coimbatore, Tamilnadu,

More information

Department of Mechanical Engineering, King s College London, University of London, Strand, London, WC2R 2LS, UK; e-mail: david.hann@kcl.ac.

Department of Mechanical Engineering, King s College London, University of London, Strand, London, WC2R 2LS, UK; e-mail: david.hann@kcl.ac. INT. J. REMOTE SENSING, 2003, VOL. 24, NO. 9, 1949 1956 Technical note Classification of off-diagonal points in a co-occurrence matrix D. B. HANN, Department of Mechanical Engineering, King s College London,

More information

Relative Permeability Measurement in Rock Fractures

Relative Permeability Measurement in Rock Fractures Relative Permeability Measurement in Rock Fractures Siqi Cheng, Han Wang, Da Huo Abstract The petroleum industry always requires precise measurement of relative permeability. When it comes to the fractures,

More information

Lecture 6: Classification & Localization. boris. ginzburg@intel.com

Lecture 6: Classification & Localization. boris. ginzburg@intel.com Lecture 6: Classification & Localization boris. ginzburg@intel.com 1 Agenda ILSVRC 2014 Overfeat: integrated classification, localization, and detection Classification with Localization Detection. 2 ILSVRC-2014

More information

Poker Vision: Playing Cards and Chips Identification based on Image Processing
