Optimizing the Global Execution Time with CUDA and BIGDATA from a Neural System of Off-line Signature Verification on Checks.

Int'l Conf. Par. and Dist. Proc. Tech. and Appl. | PDPTA'15 | 495

Francisco Javier Luna Rosas 1,2, Julio Cesar Martínez Romo 1, Damián Martínez Díaz 2, Gricelda Medina Veloz 3, Valentín López Rivas 1, Cesar Dunay Acevedo 1
e-mail: fcoluna2000@yahoo.com.mx
1 Computer Science Department, Inst. Tec. Aguascalientes, México
2 Universidad Cuauhtémoc, Campus Aguascalientes, México
3 Universidad Tecnológica del Norte de Aguascalientes, México

Abstract - The CUDA platform lets us use graphics cards not only to process graphics but also to execute instructions in parallel; these software improvements have generated a wide catalogue of applications powered by the CUDA architecture. In this article we propose to optimize the global execution time of an off-line system for verifying signatures on checks. Our architecture operates in two phases: the training phase and the verification phase. The training phase consists of several stages whose purpose is to generate a neural network model that recognizes an off-line signature on checks. The verification phase repeats the first stages of the training phase in order to extract features from the signature. The features extracted in the verification phase are compared in the classifier with the results obtained from the training-phase model.

Keywords: CUDA, Neural Networks, Off-line Signature Verification, Checks, Optimization.

1. Overview. Signature verification consists of determining, given a number of sample signatures of a person, whether an additional signature was made by the same person. In this sense signature verification can be used as a means of authenticating a person's identity. A signature can be verified on-line or off-line.
In the first case, an instrumented pen or a digitizing tablet is used to capture the shape of the signature and the dynamic movement of the hand [5]; this, of course, requires the signature's owner to be present. The off-line technique refers to situations in which the signature was previously written on paper and recorded as an image [5]. In this case the valuable dynamic information is lost and essentially cannot be recovered. In both methods, once a signature is available, a number of features are extracted that should be reliable enough to recognize genuine signatures and to reject forged ones, even skilled forgeries. A certain number of features is extracted and computed from each sample signature, forming a group of patterns that is then used to train and test a classifier. The job of the classifier is to learn the habitual behavior of the features in a signature, in order to test later whether those features behave the same way in a test signature. In off-line signature verification the performance achieved cannot be as high as in on-line signature verification, due to the lack of dynamic information.

1.1 Previous work in off-line handwritten signature verification. Off-line signature verification has been a research problem for many decades and in different countries [5], with many practical and potential applications. In [8], the authors identified the three kinds of forgeries described in Table 1. In the literature on off-line verification, classification error rates on random forgeries are generally good, as in [8], where the error is about 0.38%, or in [4], where it is 3%. On simulated and skilled forgeries, however, the percentage of classification errors grows dramatically; to mention one case, Fang in [5] reports up to 33.7%.
However, this situation is not related to the ability of the researcher but to the nature of the off-line signature verification problem itself and to the loss of dynamic information. A factor that worsens the problem of high error under skilled forgeries is that a model of each individual's signature must be prepared from only a few samples, say 8 to 15 [5], which generates uncertainty in the probability of recognizing genuine signatures as well as of rejecting forgeries.

Table 1. Types of Forgeries According to Justino. Courtesy of [8].

No.  Name               Description
1    Random Forgery     No attempt is made to reproduce the shape or aspect of the genuine signatures. No resemblance of the genuine signature is desired by the forger.
2    Simulated Forgery  Signature is loosely copied, not too detailed or accurate. The false signature tends to be similar to a genuine one.
3    Skilled Forgery    Signature is very similar to a genuine one.

Another main problem in the off-line verification approach is to establish and extract a group of features from the test signatures that is sufficient to recognize genuine test specimens with high reliability and, at the same time, to discriminate and reject forgeries. In the literature we find basically three approaches to generating features. One consists of isolating and characterizing some segments of the signature (curvature, smoothness) [5]. Another is to place a grid or array of squares on the signature and treat each element (square) of the grid as an area to be characterized; a classical example of the latter is in [7]. In both cases the aim is an explicit representation using vectors that describe the values of each feature. In some cases overlapping squares are used to represent the signature, as in [12], in which Murshed et al. used squares of 6x6 pixels with an overlap of 50%. Finally, a third approach is based on schemes in which the features are implicit in some kind of parameter, as in the case of Gouvêa in [7], who used neural networks in their autoassociative version; there the features are implicit in the neurons' weights.

In the last stage, the classifier decides whether the signature is genuine or not. Neural networks of different types have been used [7], [4], [2], [5], as have hidden Markov models [8] and less sophisticated classifiers such as k-nearest neighbors [7] and minimum distance based on the Mahalanobis distance [5]. Whatever the kind of classifier, the performance reported on skilled forgeries is lower than that obtained with on-line verification.

2. Verifier Architecture in Checks. The architecture of our automatic verifier for off-line handwritten signatures is shown in Fig. 1. Two phases are distinguished: the training phase (upper part) and the verification phase (lower part). The objective of the training phase is to generate a model for each person enrolled in the system, while the verification phase performs the verification itself. Next we describe the constituent blocks of each phase.

Fig. 1 Verifier's Architecture in Checks. Courtesy of [].

2.1 Training Phase. The stages of the training phase are found in the upper part of Fig. 1 and are described in this section.

2.1.1 Acquiring signatures on checks. The first block is the acquisition of images of checks with signatures, performed with an image search for checks on the Web. The search returns images of different checks with different signatures, which are stored in a database of signatures on checks (Fig. 2).

Fig. 2 Data Base of Checks. Courtesy of [3].

An example of John Joner's signature is shown in Fig. 3.

Fig. 3 Acquisition of John Joner's Signature on a Check's Image.

2.1.2 Extracting and Processing of Signatures. Each digital check image contains a variety of patterns, so the relevant features must be separated from the rest of the image (Fig. 4).

Fig. 4 Segmentation of a Check's Image. Courtesy of [].

As we can observe, region eight is the place reserved for the person who signs the check; we must take into account that the signature has to be handwritten. The signature lies in a fixed region of the image (lower right side) and, once located, must be extracted from the check. To do this we apply the following algorithm:

1. Convert the RGB image to gray scale.
2. Binarize the image using a specific threshold per image.
3. Search for the lit (black) bits in the binary image.
4. Add the coordinates where lit bits were found to a list.
5. From the coordinates in the list, compute the size of the output image (see Fig. 5).

If the generated image does not reach suitable dimensions at the current threshold, repeat steps 2 to 5, incrementing the binarization threshold until the image reaches optimal dimensions.

Fig. 5 Extraction of a Signature from a Check.

As we can see in Fig. 5, the extracted signature still contains noise, so some post-processing is necessary to obtain only the signature.

2.1.3 Feature Extraction. Features Description. Since only the static information of the handwritten signature is available, verifying a signature is more complex than in on-line verification if the purpose is to reject skilled forgeries. Our verification strategy is based on the method that human experts use to verify handwritten signatures. The elements on which a human verifier relies, according to Slyter [16], include static and dynamic elements. The static elements have to do with the shape and design of the signature.
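The extraction steps of Section 2.1.2 (gray-scale conversion, binarization, search for lit bits, bounding box, threshold increment) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the luminance weights, the starting threshold, the step, and the minimum size used as the "optimal dimensions" test are assumptions chosen for illustration, and all function names are hypothetical.

```python
def to_gray(rgb):
    """Step 1: convert rows of (r, g, b) tuples to gray levels 0..255.
    The integer luminance weights are an assumption, not taken from the paper."""
    return [[(r * 299 + g * 587 + b * 114) // 1000 for (r, g, b) in row]
            for row in rgb]


def binarize(gray, threshold):
    """Step 2: a pixel is 'lit' (1) when it is darker than the threshold."""
    return [[1 if v < threshold else 0 for v in row] for row in gray]


def lit_coordinates(binary):
    """Steps 3 and 4: list the (row, column) of every lit bit."""
    return [(y, x) for y, row in enumerate(binary)
            for x, v in enumerate(row) if v]


def extract_signature(rgb, threshold=64, step=16, min_size=(2, 2)):
    """Step 5 plus the retry loop: crop the bounding box of the lit bits,
    raising the threshold until the crop is at least min_size (rows, columns).
    threshold, step and min_size are hypothetical parameters."""
    gray = to_gray(rgb)
    while threshold <= 255:
        binary = binarize(gray, threshold)
        coords = lit_coordinates(binary)
        if coords:
            top = min(y for y, _ in coords)
            bottom = max(y for y, _ in coords)
            left = min(x for _, x in coords)
            right = max(x for _, x in coords)
            if (bottom - top + 1 >= min_size[0]
                    and right - left + 1 >= min_size[1]):
                return [row[left:right + 1] for row in binary[top:bottom + 1]]
        threshold += step
    return None  # no threshold produced a large enough crop
```

Called on the signature region of a check image, `extract_signature` returns the cropped binary signature, or `None` when no threshold yields a crop of the required size. Noise removal, which the text notes is still needed after extraction, is not covered here.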
The dynamic elements include the absolute pressure and the variations of pressure and speed, grouped in what Slyter calls the rhythms. Rhythms and shape are combined during the execution of the signature in a way that is unique to each individual and that reflects the habits developed by signing repeatedly over a long period of time. Thus any attempt to verify a signature should consider the balance between rhythms and shape.

Fig. 6 Feature Vector of 10 Signatures of John Doe. Courtesy of [].

Using mathematical morphology and a number of structural elements, it is possible to detect, for any signature, the position of the curved lines (equivalent to regions of low speed). A graph of one feature vector of John Doe's signature is shown on the left side of Fig. 6. Each value in that vector represents the number of lit pixels (pixels with a value of 1) after the signature image has been eroded; this number equals the number of occurrences of the structural element in the image. The right side of Fig. 6 shows a graph of ten feature vectors corresponding to ten of John Doe's signatures. Notice the repetition in the vectors.

Table 2. Examples of Training for the Neural Network. Courtesy of [].

2.1.4 Generation of the Model. This sub-section explains the generation of the signature model, including the design of the classifier. After the signature has been binarized, we apply the following steps to generate the model of the signature and the classifier.

1. Morphological Filtering. The basic idea of morphological image processing is that a structural element is used to probe an image; a set of operations then yields structural information about the image. Historically, morphological image processing has been applied to binary images and to gray-scale images. Here we work only with the binary case, and the morphological operation used is erosion.

2. Generation of Training Patterns. Table 2 shows the input patterns provided to the neural network; each row of the table is one pattern (or training example). The table is divided into the following sections:

Real Patterns: Rows 1 to 10, formed by eroding each of the ten signatures of a single signer (our fictitious signer "John Doe"). Each column of this section corresponds to one structural element with which the signature was eroded, so the integer in the column is the count of lit bits (ones) that remained after eroding the signature with that structural element. Each row is a feature vector.

Synthetic Positive Patterns: Rows 11 to 60.
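The erosion counts that make up each real pattern (the number of lit pixels left after eroding the binarized signature with each structural element) can be sketched in plain Python as follows. This is an illustration under assumptions, not the authors' code: the structural elements in the usage example are arbitrary, the element is anchored at its top-left corner (which does not change the count), and positions where the element would cross the image border are treated as misses.

```python
def erode(image, element):
    """Binary erosion of a 0/1 image by a 0/1 structural element.
    The output pixel is 1 where every lit cell of the element lies on a lit
    pixel of the image, with the element anchored at its top-left corner."""
    h, w = len(image), len(image[0])
    eh, ew = len(element), len(element[0])
    offsets = [(dy, dx) for dy in range(eh) for dx in range(ew)
               if element[dy][dx]]
    out = [[0] * w for _ in range(h)]
    for y in range(h - eh + 1):
        for x in range(w - ew + 1):
            if all(image[y + dy][x + dx] for dy, dx in offsets):
                out[y][x] = 1
    return out


def feature_vector(image, elements):
    """One entry per structural element: lit pixels remaining after erosion."""
    return [sum(map(sum, erode(image, element))) for element in elements]


# Hypothetical example: a tiny binary "signature" and three small elements.
signature = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 1, 1, 1],
]
elements = [
    [[1, 1]],          # horizontal pair
    [[1], [1]],        # vertical pair
    [[1, 1], [1, 1]],  # 2x2 block
]
print(feature_vector(signature, elements))  # prints [5, 4, 2]
```

Applying `feature_vector` to each of a signer's ten binarized signatures would give ten rows of the kind described for the real patterns, one column per structural element.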
To obtain a column of these numbers, random numbers are generated in the range

$$\frac{1}{10}\sum_{k=1}^{10} x_k \pm \sigma \qquad (1)$$

where $x_k$ is the value of each structural element (EE) from Table 2 across the signatures Sig1, Sig2, Sig3, ..., Sig10.

Synthetic Negative Patterns: Rows 61 to 110. To obtain a column of these numbers, random numbers are generated in a range of and 300.

3. Backpropagation Neural Network (Classifier). The architecture of the neural network [2] in Fig. 7 consists of an input layer with 54 neurons, a single hidden layer with 108 neurons, and one neuron in the output layer with a sigmoidal function. The single output neuron maps each input pattern to +5 (genuine) or -5 (forgery).

[Figure: inputs (examples) feeding the input layer (neurons 1 to 54), the hidden layer (neurons 1 to 108) and the output layer; training ranges 1..10, 11..60 and 61..110 with targets +5, +5 and -5.]

Fig. 7 Architecture of the Neural Network. Courtesy of [].

4. Classifier Training. Once the classifier architecture has been designed, the next step is to train the classifier so that it works as a recognition tool in the next phase of the verifier. The quadratic error goal was set to 0.003; the maximum number of iterations, in case the error goal is not reached, was set to 1000; and the learning rate was set to 1E-20, so as to move over the error surface with small increments of the weights.

2.2 Verification Phase. When a signature under test is presented to the verifier, the following events take place:

1.- The first three stages of the training phase are carried out, generating the features that make up the patterns for each signature.
2.- Once the patterns are generated, the classifier is not trained; it only verifies the signature, declaring it genuine (+5) or false (-5).

Table 3 shows the verification results for a group of training and test signatures from a single person. As we can observe, the genuine test signatures (Sig 1 to Sig 5) obtain outputs from the neural network very close to +5. The verification succeeds because the training set provided the neural network with plenty of samples of genuine signatures (real plus synthetic), and because the network also has knowledge of what genuine signatures are NOT, information contained in the group of negative synthetic examples.

Table 3. Neural Network Output Obtained During Verification. Courtesy of [].

The negative outputs close to -5 correspond to signatures that differ greatly from the genuine ones, owing to the knowledge provided by the negative synthetic examples. The neural network was further tested with more positive synthetic examples (Sig 6 to Sig 10), which were recognized as genuine. With non-skilled forgeries (Sig 11 to Sig 15), the neural network showed good rejection; with skilled forgeries (Sig 16, Sig 17) the neural network could reject two (according to the criterion established in the next paragraph) and was unable to reject one (Sig 18).

There is a region of uncertainty in the output of the neural network when classifying a signature as genuine or false, namely the region between +2 and +3. In terms of the similarity of a test signature to a genuine one, this is the region where a forgery can look very much like a genuine signature. The decision must therefore be made on the basis of a threshold θ, and the effectiveness of the verifier depends on this parameter. A very low θ will allow false signatures to be classified as genuine; on the other hand, a very high θ will cause genuine signatures to be classified as forgeries. The output of the neural network is transformed into a degree of certainty that the signature is genuine, in the range 0-100%, so the final classification is given by an S-type parametric function that maps θ as shown in Fig. 8, which also shows the genuine/false verdict.

Fig. 8 Making Decision About the Genuineness of a Signature as an Output Function of the Neural Network. Courtesy of [].

3. Optimizing the Global Execution from a Neural System of Off-line Signature Verification on Checks.

3.1 Big Data and CUDA. Current Big Data applications require great computing capacity, which can be combined with new parallel programming architectures based on graphics cards (GPUs) to try to improve performance. CUDA is a parallel programming platform that exploits the processing power and memory of current video cards (GPUs), which have a greater number of processing cores than a CPU [3]. The CUDA platform allows us to use graphics cards not only for processing graphics but also for executing instructions and processing information in parallel, performing the same task on different data and thus reducing the processing time of operations with high arithmetic cost. We use CUDA technology for extracting the signature from the check, creating the structural elements, and training and recognizing the signature. The process was run as a global process, and we consider each global

process as a sprint; we ran 10 sprints, as can be observed in Table 4.

Table 4. Number of Signatures by Global Process.

Sprint  Signatories  Signatures per Signatory  Total
1       15           15                        225
2       30           15                        450
3       45           15                        675
4       60           15                        900
5       75           15                        1125
6       90           15                        1350
7       105          15                        1575
8       120          15                        1800
9       135          15                        2025
10      150          15                        2250

It should be mentioned that the segmentation of the check and the mathematical morphology were done sequentially; only part of the training and verification of the signature was processed in parallel. Table 5 shows the training times on CPU vs GPUs for the backpropagation neural network.

Table 5. Training Times of the Backpropagation Neural Network on CPU vs GPUs.

Sprint  CPU          GPUs
1       0:4:0.000    0:0:3.000
2       0:29:5.000   0:02:59.000
3       0:55:2.000   0:04:29.000
4       :02:00.000   0:06:0.000
5       :2:00.000    0:07:9.000
6       :5:00.000    0:08:53.000
7       2:0:00.000   0:0:2.000
8       2:9:00.000   0:27:57.000
9       2:5:00.000   0:40:8.000
10      3:3:00.000   0:48:28.000

Fig. 9 shows the graph of the training times of the signature. As can be observed in Fig. 9, the execution times on GPUs are lower than those on CPU, which shows that applications implementing parallel processing on CUDA architectures (GPUs) are more efficient than on CPU: the times show a wide gap between the two architectures.

Fig. 9 CPU vs GPUs Processing Times in the Training of Neural Network.

Table 6 shows the global execution times for the whole process: extraction of the signature from the check, creation of the structural elements, and training and recognition of the signature.

Table 6. Global Execution Time.

Sprint  Total of Signatures  CPU          GPUs
1       225                  0:24:2.000   0::52.000
2       450                  0:50:7.000   0:23:47.000
3       675                  :25:00.000   0:35:38.000
4       900                  :44:00.000   0:47:34.000
5       1125                 2:2:00.000   0:59:57.000
6       1350                 2:53:00.000  ::00.000
7       1575                 3:24:00.000  :23:00.000
8       1800                 3:43:00.000  :52:00.000
9       2025                 4:27:00.000  2:26:00.000
10      2250                 4:58:00.000  2:57:00.000

In Fig. 10 we can observe the difference in times between both implementations (CPU vs GPUs) of the off-line verification system for signatures on checks. Based on these results, using GPUs reduces the execution time of an off-line check verification architecture.

Fig. 10 Global Execution Time from a Neural System of Off-line Signature Verification on Checks.

4. Conclusions. Current Big Data applications require great computing capacity, which can be combined with new parallel programming architectures based on graphics cards (GPUs) to try to improve performance. CUDA allows us to use graphics cards not only for processing graphics but also for executing instructions and processing information in parallel, performing the same task on different data and thus reducing the processing time of operations with high arithmetic cost; these improvements have generated a wide catalogue of applications powered by the CUDA architecture. In this article we proposed a neural system for the off-line verification of handwritten signatures on checks. Our architecture operates in two stages, the training

stage and the verification stage. The training stage consists of several steps whose purpose is to generate a neural network that recognizes the signature. The verification stage repeats the first steps of the training stage in order to extract features from the signature. The features extracted in the verification stage are compared in the classifier against the results of the model from the training stage. The neural system for off-line verification of handwritten signatures on checks requires great computing capacity, which can be combined with new parallel programming architectures based on graphics cards (GPUs) to optimize the global response time.

5. References.

[1] Asesores Bancarios y Financieros. Cheque Bancario, Conceptos y Caracteristicas, 2015. http://www.abanfin.com/?tit=cheque-bancarioconcepto-ycaracteristicas&name=manuales&fid=eh0bcab.
[2] R. Baron and R. Plamondon. Acceleration measurement with an instrumented pen for signature verification and handwriting analysis. IEEE Transactions on Instrumentation and Measurement, 38:32-38, 1989.
[3] Shane Cook. CUDA Programming: A Developer's Guide to Parallel Computing with GPUs. Morgan Kaufmann, 2013. ISBN: 978-0-2-45933-4.
[4] J. P. Drouhard, R. Sabourin, and M. Godbout. Evaluation of a training method and of various rejection criteria for a neural network classifier used for off-line signature verification. IEEE International Conference on Neural Networks, pp. 4294-4299, IEEE World Congress on Computational Intelligence, NY, USA, 1994.
[5] B. Fang, Y. Wang, C. H. Leung, Y. Tang, P. C. K. Kwok, K. W. Tse, and Y. K. Wong. A Smoothness Index Based Approach for Off-line Signature Verification. IEEE Proceedings of the Fifth International Conference on Document Analysis and Recognition (ICDAR), pp. 785-787, N.Y., USA, 1999.
[6] J. B. Fasquel, C. Stolz, and M. Bruynooghe. Real-time verification of handwritten signatures using a hybrid opto-electronical method. Proceedings of the 2nd International Symposium on Image and Signal Processing and Analysis, pp. 552-557, Pula, Croatia, 2001.
[7] R. J. N. Gouvêa and G. C. Vasconcelos. Off-line Signature Verification Using an Autoassociator Cascade-Correlation Architecture. IEEE Proceedings of the Fifth International Conference on Document Analysis and Recognition, pp. 2882-2886, NY, USA, 1999.
[8] E. J. R. Justino, F. Bortolozi, and R. Sabourin. Off-line signature verification using HMM for random, simple and skilled forgeries. IEEE Proceedings of the Sixth International Conference on Document Analysis and Recognition, pp. 03-034, Seattle, WA, USA, 2001.
[9] David B. Kirk and Wen-mei W. Hwu. Programming Massively Parallel Processors. Second Edition. Morgan Kaufmann, 2013. ISBN: 978-0-2-45992-.
[10] L. Lee and M. G. Lizárraga. An Off-Line Method for Human Signature Verification. IEEE Proceedings of the 13th International Conference on Pattern Recognition, pp. 95-98, N.Y., USA, 1996.
[11] Fco. Javier Luna Rosas and Julio Cesar Martínez Romo. Improving Dynamic Load Balancing Under CORBA with a Genetic Strategy in a Neural System of Off-line Signature Verification. The 2007 International Conference on Parallel and Distributed Processing Techniques and Applications. In Computer Science & Computer Engineering, Las Vegas, Nevada, USA. ISBN: -6032-093-0, -6032-094-9 (-6032-095-7). CSREA Press, June 2007.
[12] N. A. Murshed, F. Bortolozi, and R. Sabourin. Off-line Signature Verification, Without a Priori Knowledge of Class w2. IEEE Proceedings of the Third International Conference on Document Analysis and Recognition, pp. 9-96, N.Y., USA, 1995.
[13] M. J. Pemeena Priyadarsini, K. Murugesan, S. Rao Inbathini, A. Jabeena, and K. Sai Tej. Bank Cheque Authentication Using Signature. International Journal of Advanced Research in Computer Science and Software Engineering, Volume 3, Issue 5, May 2013. ISSN: 227728X.
[14] R. Plamondon and M. Parizeau. Signature verification from position, velocity and acceleration signals: a comparative study. IEEE Proceedings of the 9th International Conference on Pattern Recognition, pp. 260-265, USA, 1988.
[15] R. Plamondon and S. N. Srihari. On-line and off-line handwriting recognition: a comprehensive survey. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22:63-84, 2000.
[16] S. A. Slyter. Forensic Signature Examination, ed. Springfield, Illinois, 1995.
[17] R. Sabourin and G. Genest. An extended-shadow-code based approach for off-line signature verification. IEEE Proceedings of the 12th IAPR International Conference on Computer Vision & Image Processing, pp. 450-453, N.Y., USA, 1994.
[18] R. Sabourin, G. Genest, and F. Preteux. Off-line signature verification by local granulometric size distributions. IEEE Transactions on Pattern Analysis and Machine Intelligence, 9:976-988, 1997.