Incremental and Offline Handwriting Recognition for the Venice Time Machine




CERL seminar, Oslo, October 2014
Incremental and Offline Handwriting Recognition for the Venice Time Machine
Andrea Mazzei, Fouad Slimane, Giovanni Colavizza, Lorenzo Tomasin, Frédéric Kaplan
Digital Humanities Laboratory

State Archive of Venice: 80 km of archives documenting every detail of Venetian history over 1,000 years.

A 10-year digitization program to transform this archive into an information system.

Marciana Library: a new digitization project focusing on rare books.

Digitized materials

Segmentation

Segmentation steps:
Detect left and right page
Crop pages (arbitrary shape, deformed and damaged)
Isolate the text
Detect textlines / words

Deliberazioni Senato, Terra (Registro 105) Is this the left page or the right one?

Deliberazioni Senato, Terra (Registro 105) And this one?

Where is the page?

What is missing? Or what should be missing?

ASVe. X Savi sopra le Decime. Condizioni di decima (Filze) Where is the text?


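Parameter optimization: choose the parameter pair (param 1, param 2) that maximizes the score

$$(\text{param}_1, \text{param}_2)^{*} = \arg\max_{\text{param}_1,\ \text{param}_2} \frac{\mathrm{StdX} \cdot \mathrm{contour\_size}}{\mathrm{StdY} \cdot \mathrm{num\_contours}}$$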
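The slide does not define the terms of the score, so the following is only a sketch of one possible reading. It assumes that StdX and StdY are the standard deviations of the contour centroids along each axis, that contour_size is the mean number of points per contour, and that the numerator and denominator group as (StdX * contour_size) / (StdY * num_contours). The `detect` callable is a hypothetical stand-in for the text detector that turns a parameter pair into a list of contours.

```python
from itertools import product
from statistics import mean, pstdev


def score(contours):
    """Score a detection result; each contour is a list of (x, y) points."""
    if len(contours) < 2:
        return 0.0
    # Assumption: StdX / StdY are the spreads of the contour centroids.
    cx = [mean(p[0] for p in c) for c in contours]
    cy = [mean(p[1] for p in c) for c in contours]
    std_x, std_y = pstdev(cx), pstdev(cy)
    if std_y == 0:
        return 0.0
    # Assumption: contour_size is the mean number of points per contour.
    contour_size = mean(len(c) for c in contours)
    return (std_x * contour_size) / (std_y * len(contours))


def best_params(detect, grid1, grid2):
    """Grid search: return the (param 1, param 2) pair with the highest score."""
    return max(product(grid1, grid2), key=lambda p: score(detect(*p)))
```

Under this reading, a high score favors a few large contours spread horizontally, which is what a detected text line looks like.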

Where are the textlines?

Word Spotting

Principle: compression of historical document images into clusters of visually similar and syntactically equivalent text sequences.
Limitation: it depends on the handwriting style.

Are these two words the same?

Shape matching using shape contexts: find a subset of correspondences between the two shapes, then estimate a transformation of the plane that maps any point on the first shape to a point on the second shape.

Image Subtraction Before and After Morphing the Image
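As an illustration of the correspondence step, here is a minimal sketch of shape context descriptors in plain Python. The bin counts, the radial range and the greedy nearest-descriptor matching are assumptions made for brevity; the slides do not state them, and the greedy step stands in for an optimal assignment. The transformation estimate is not shown.

```python
import math


def shape_contexts(points, n_r=5, n_theta=12):
    """One normalized log-polar histogram per point, describing where the
    other points of the shape lie relative to it."""
    n = len(points)
    dists = [[math.dist(p, q) for q in points] for p in points]
    # Dividing by the mean pairwise distance makes the descriptor scale-invariant.
    mean_d = sum(map(sum, dists)) / (n * (n - 1))
    # Upper edges of the radial bins, log-spaced from 1/8 to 2 mean distances.
    edges = [2 ** (-3 + 4 * k / (n_r - 1)) for k in range(n_r)]
    hists = []
    for i, (px, py) in enumerate(points):
        hist = [0.0] * (n_r * n_theta)
        for j, (qx, qy) in enumerate(points):
            if i == j:
                continue
            r = dists[i][j] / mean_d
            r_bin = min(n_r - 1, sum(r > e for e in edges))
            theta = math.atan2(qy - py, qx - px) % (2 * math.pi)
            t_bin = min(n_theta - 1, int(theta / (2 * math.pi) * n_theta))
            hist[r_bin * n_theta + t_bin] += 1.0 / (n - 1)
        hists.append(hist)
    return hists


def chi2_cost(a, b):
    """Chi-square distance between two histograms (0 means identical)."""
    return 0.5 * sum((x - y) ** 2 / (x + y) for x, y in zip(a, b) if x + y > 0)


def greedy_match(hists_a, hists_b):
    """Pair each point of shape A with its cheapest point of shape B."""
    return [
        (i, min(range(len(hists_b)), key=lambda j: chi2_cost(ha, hists_b[j])))
        for i, ha in enumerate(hists_a)
    ]
```

Because the histograms use relative positions and normalized distances, a translated and uniformly scaled copy of a shape yields the same descriptors, so matching costs between true correspondences are zero in that case.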

Cluster Analysis using KNN

Handwriting Recognition using HMM
Principal investigator: Fouad Slimane
Email: fouad.slimane@epfl.ch

HMM-based Recognition System

Text sequences are transformed into arrays of feature vectors, which constitute the observation input for the HMM. An HMM is then used to model the text sequence. Each state is associated with characters, subcharacters, or directly with their variations. Transition probabilities between states typically model the probability that one character follows another.
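To make the decoding step concrete, the following is a generic Viterbi sketch for an HMM of the kind described above. It is not the authors' system: the inputs are assumed to be precomputed log-probabilities, and the names `obs_loglik`, `log_trans` and `log_init` are illustrative.

```python
def viterbi(obs_loglik, log_trans, log_init):
    """Return the most likely state sequence.

    obs_loglik[t][s]: log-likelihood of feature vector t under state s.
    log_trans[r][s]:  log-probability of moving from state r to state s.
    log_init[s]:      log-probability of starting in state s.
    """
    n_states = len(log_init)
    score = [log_init[s] + obs_loglik[0][s] for s in range(n_states)]
    back = []  # back[t][s] = best predecessor of state s at frame t + 1
    for frame in obs_loglik[1:]:
        prev, score, ptr = score, [], []
        for s in range(n_states):
            best = max(range(n_states), key=lambda r: prev[r] + log_trans[r][s])
            ptr.append(best)
            score.append(prev[best] + log_trans[best][s] + frame[s])
        back.append(ptr)
    # Trace the best path backwards from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return path[::-1]
```

With a left-to-right transition matrix (forbidden transitions set to `float("-inf")`), the returned path reads as a segmentation of the window sequence into consecutive character states.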

[Figure: HMM-based text recognition system]

Preprocessing and Feature Extraction

Sliding window procedure: the image is transformed into a sequence of feature vectors.
Window size = W_S
Window shift = W_Sh
Feature vector size = n (C_0, C_1, ..., C_n)
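The sliding window procedure can be sketched as follows, assuming a binarized text-line image stored as a list of rows with 1 for ink and 0 for background. The parameter names `w_s` and `w_sh` mirror the window size and window shift on the slide; the single `pixel_density` feature is only a placeholder for the full feature set.

```python
def sliding_windows(image, w_s, w_sh):
    """Yield windows of w_s columns, moving w_sh columns at each step."""
    width = len(image[0])
    for x in range(0, width - w_s + 1, w_sh):
        yield [row[x:x + w_s] for row in image]


def pixel_density(window):
    """Fraction of ink pixels in the window."""
    ink = sum(map(sum, window))
    return ink / (len(window) * len(window[0]))


def feature_sequence(image, w_s, w_sh, features=(pixel_density,)):
    """Turn the image into one feature vector per window position."""
    return [[f(w) for f in features] for w in sliding_windows(image, w_s, w_sh)]
```

A shift smaller than the window size makes consecutive windows overlap, which gives the HMM a longer and smoother observation sequence for the same image.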

Transcription Alignment using HMMs

chose opagera. Et etiam deo stagando collui encarcere se sauera la che sia dellauer de collui lodoxe comandera chello sia entromesso edara sse allo so credetor. Et etiam deo selo creditor uora enuestir lapprietade del debitor enquella fia da alcreditor sera data en uestixon. Mosella femena che none maritata sera 9depnata segon do che desoura edito tuto se fara segondo che nui auemo soura dito delomo remetuda questa cho sa chello stara enlo teratorio de san çacharia e


Thank you for your attention Questions?

Backup Slides

Preprocessing and Feature Extraction
Density of pixels in the window
Vertical position of the gravity center in the whole window, normalized by the height of the window
12 Zernike moments computed from the window
Mean of the vertical projection, normalized by the window width
Mean of the horizontal projection
Number N1 of black connected components
Number N2 of white connected components
Ratio N1/N2 of black to white connected components
Position of the smallest black connected component, divided by the height of the window
Perimeter of all components in the window divided by the perimeter of the window
Compactness
Gravity center of the window, of its right and left halves, and of its first, second and last thirds
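Two of the features listed above can be sketched in plain Python, assuming a binarized window stored as a list of rows with 1 for black (ink) and 0 for white. The use of 4-connectivity and the value returned for an empty window are assumptions; the slides do not specify either.

```python
def gravity_center_y(window):
    """Vertical position of the ink center of gravity, divided by window height."""
    height = len(window)
    ink = sum(map(sum, window))
    if ink == 0:
        return 0.5  # assumption: an empty window maps to the middle
    return sum(y * sum(row) for y, row in enumerate(window)) / (ink * height)


def count_components(window, value):
    """Count 4-connected components of pixels equal to value (flood fill)."""
    h, w = len(window), len(window[0])
    seen = [[False] * w for _ in range(h)]
    count = 0
    for y0 in range(h):
        for x0 in range(w):
            if window[y0][x0] != value or seen[y0][x0]:
                continue
            count += 1
            seen[y0][x0] = True
            stack = [(y0, x0)]
            while stack:
                y, x = stack.pop()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and not seen[ny][nx] and window[ny][nx] == value):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
    return count
```

Here `count_components(window, 1)` gives N1 and `count_components(window, 0)` gives N2, so the ratio feature follows directly whenever N2 is nonzero.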