Overview of the EVALITA 2009 PoS Tagging Task



Overview of the EVALITA 2009 PoS Tagging Task
G. Attardi, M. Simi
Dipartimento di Informatica, Università di Pisa, and the SemaWiki project team

Outline
- Introduction to the PoS Tagging Task
- Task definition
- Data sets
- Evaluation metrics
- Results: systems, results, discussion
- Conclusion

Introduction
State of the art:
- Penn Treebank data set: 36 categories, accuracy above 97%
- EVALITA 2007 data sets: EAGLES-compliant (32 categories); DISTRIB (16 categories); accuracy above 98% with external resources and multi-word lists
The challenge for EVALITA 2009:
- A larger tag set: 37 tags with morphological variants, 336 morphed tags (234 actually present in the corpus)
- Domain adaptation: training data from newspaper articles (La Repubblica), dev and test data from the Italian Wikipedia

Task definition
PoS categories: the TANL tagset, an EAGLES-compliant tagset (236 different tags); http://medialab.di.unipi.it/wiki/tanl_pos_tagset/
Data sets:
- Training set: from the newspaper La Repubblica; 108,874 word forms divided into 3,719 sentences
- Development set: from Wikipedia; 5,021 forms; 147 sentences
- Test set: from Wikipedia; 5,066 forms; 147 sentences
- New words: about 17%
Data format: one token per line, UTF-8 encoded
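The one-token-per-line format can be read with a few lines of code. A minimal sketch, assuming (the slides do not specify this) that each line holds a tab-separated token and tag, with a blank line separating sentences:

```python
def read_sentences(path):
    """Yield sentences as lists of (token, tag) pairs from a
    one-token-per-line, UTF-8 encoded file (assumed format:
    token<TAB>tag, blank line between sentences)."""
    sentence = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            if not line:              # blank line ends the current sentence
                if sentence:
                    yield sentence
                    sentence = []
            else:
                token, tag = line.split("\t")
                sentence.append((token, tag))
    if sentence:                      # last sentence may lack a trailing blank
        yield sentence
```

Streaming the file with a generator keeps memory usage flat even on a full training set of over 100,000 word forms.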

Two subtasks
- Open task: participants may use external resources
- Closed task: participants may not use external resources

Evaluation metrics
- Tagging Accuracy (TA): the percentage of correctly tagged tokens with respect to the total number of tokens in the test set
- Unknown Words Tagging Accuracy (UWTA): tagging accuracy restricted to the unknown words, i.e. tokens present in the test set but not in the training set
TA and UWTA are measured w.r.t.:
- POS: the complete tagset
- CPOS: coarse-grained tags without morphology
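Both metrics are straightforward to compute. A minimal sketch, assuming gold and predicted tags come as parallel lists and the training vocabulary as a set:

```python
def tagging_accuracy(gold, pred):
    """TA: fraction of tokens whose predicted tag equals the gold tag."""
    correct = sum(g == p for g, p in zip(gold, pred))
    return correct / len(gold)

def unknown_word_ta(tokens, gold, pred, train_vocab):
    """UWTA: tagging accuracy restricted to tokens absent from training."""
    pairs = [(g, p) for t, g, p in zip(tokens, gold, pred)
             if t not in train_vocab]
    if not pairs:
        return 0.0
    return sum(g == p for g, p in pairs) / len(pairs)
```

For the CPOS variants of both scores, gold and predicted tags would first be mapped to their coarse form (morphology stripped) before the comparison.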

Participating groups

Research team   Main investigator   Affiliation
SemaWiki        G. Attardi          Dip. di Informatica, Univ. di Pisa, Italy
CST_Søgaard     A. Søgaard          Centre for Lang. Tech., Univ. of Copenhagen, Denmark
Gesmundo        A. Gesmundo         Università di Genova, Italy
Felice-ILC      F. Dell'Orletta     ILC-CNR, Pisa, Italy
Lesmo           L. Lesmo            Dip. di Informatica, Univ. di Torino, Italy
Pianta          E. Pianta           Fondazione B. Kessler (FBK-IRST), Trento, Italy
Rigutini        L. Rigutini         Dip. di Ing. Informatica, Univ. di Siena, Italy
Tamburini       F. Tamburini        DSLO, Università di Bologna, Italy

Open task results

Team         POS TA   CPOS TA   POS UWTA   CPOS UWTA   Rank
SemaWiki 2   96.75%   97.03%    94.62%     95.30%      1
SemaWiki 1   96.44%   96.73%    94.27%     95.07%      2
SemaWiki 4   96.38%   96.67%    93.13%     93.81%      3
SemaWiki 3   96.14%   96.42%    92.55%     93.24%      4
Pianta       96.06%   96.36%    92.21%     93.24%      5
Lesmo        95.95%   96.26%    92.33%     93.01%      6
Tamburini 1  95.93%   96.40%    90.95%     92.67%      7
Tamburini 2  95.63%   96.16%    91.07%     92.78%      8

Closed task results

Team           POS TA   CPOS TA   POS UWTA   CPOS UWTA   Rank
Felice_ILC     96.34%   96.91%    91.07%     93.36%      1
Gesmundo       95.85%   96.48%    91.41%     93.81%      2
SemaWiki 2     95.73%   96.52%    90.15%     93.47%      3
SemaWiki 1     95.24%   96.00%    87.40%     90.72%      4
Pianta         93.54%   94.10%    85.45%     87.74%      5
Rigutini 2     93.37%   94.15%    86.03%     88.43%      6
Rigutini 3     93.31%   94.15%    86.03%     88.55%      7
Rigutini 4     93.29%   94.17%    85.34%     88.09%      8
Rigutini 1     93.10%   93.76%    84.54%     87.06%      9
CST_Søgaard 1  91.90%   93.21%    86.03%     89.58%      10
CST_Søgaard 2  91.64%   93.21%    86.14%     89.92%      11

Summary of approaches

- SemaWiki: combination or cascade + rules; components: Hunpos, TreeTagger; second-order 3-gram model; Viterbi search; 1-gram features in classifier
- CST_Søgaard: combination; components: Brill, TreeTagger, MaxEntropy + combination classifier
- Gesmundo: single perceptron; first-order 2-gram model; bidirectional beam search; 515 features
- Felice-ILC: combination; components: HMM, SVM, MaxEntropy; second-order 4-gram model; features: SVM 91,400, ME 939,000; Viterbi search
- Lesmo: rule-based; 573 rules
- Pianta: cascade of 4 classifiers
- Rigutini: SVM; first-order model, varying n-gram; Viterbi search
- Tamburini: combination; components: HMM, TBL; 2-gram and 3-gram models; Viterbi search; Brill algorithm
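Several of the systems above decode an HMM with the Viterbi algorithm. A much-simplified first-order (bigram) sketch with add-one smoothing; it is not any participant's actual system, which use second-order models, richer features, and dedicated unknown-word handling:

```python
import math
from collections import defaultdict

def train_hmm(sentences):
    """Count tag-bigram transitions and word emissions from tagged sentences."""
    trans, emit, tags = defaultdict(int), defaultdict(int), set()
    for sent in sentences:
        prev = "<s>"
        for word, tag in sent:
            trans[(prev, tag)] += 1
            emit[(tag, word)] += 1
            tags.add(tag)
            prev = tag
    return trans, emit, tags

def viterbi(words, trans, emit, tags, vocab_size=10000):
    """Most probable tag sequence under an add-one-smoothed bigram HMM."""
    trans_tot, emit_tot = defaultdict(int), defaultdict(int)
    for (prev, _), c in trans.items():
        trans_tot[prev] += c
    for (t, _), c in emit.items():
        emit_tot[t] += c

    def log_t(prev, t):   # smoothed transition log-probability
        return math.log((trans[(prev, t)] + 1) / (trans_tot[prev] + len(tags)))

    def log_e(t, w):      # smoothed emission log-probability
        return math.log((emit[(t, w)] + 1) / (emit_tot[t] + vocab_size))

    # score[t] = best log-probability of any tag path ending in t
    score = {t: log_t("<s>", t) + log_e(t, words[0]) for t in tags}
    back = []
    for w in words[1:]:
        new_score, pointers = {}, {}
        for t in tags:
            prev_best = max(tags, key=lambda p: score[p] + log_t(p, t))
            new_score[t] = score[prev_best] + log_t(prev_best, t) + log_e(t, w)
            pointers[t] = prev_best
        back.append(pointers)
        score = new_score

    # Follow back-pointers from the best final tag
    seq = [max(score, key=score.get)]
    for pointers in reversed(back):
        seq.append(pointers[seq[-1]])
    return list(reversed(seq))
```

A production tagger would replace add-one smoothing with suffix-based unknown-word models (as in TnT/Hunpos) and extend the state to tag trigrams.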

Error Analysis

Conclusions
We can deal with the higher complexity of the task:
- build practical taggers that perform tagging and morphological analysis at the same time
- reusable across domains
Comparison with Hunpos, an open-source implementation of TnT:
- Open task: 96.27% TA, 4th scoring system (same lexicon as the 1st)
- Closed task: 95.14% TA, last scoring system
- Efficiency: ~0.03 min for training, ~0.07 min for tagging the test set