Building and exploiting a dependency treebank for French radio broadcasts
1 Building and exploiting a dependency treebank for French radio broadcasts
Christophe Cerisara, Claire Gardent and Corinna Anderson
LORIA, Nancy
2 Goals, Corpus, Annotation Tools and Methodology, Annotation schema, The Impact of Speech Constructs on Parsing, Conclusions
3 Goals
Long term
- Use syntax to improve speech recognition (INRIA Collaborative Action Rapsodis)
Medium term
- Build a treebank of spoken data (transcriptions of radio broadcast news)
- Empirical study of speech constructs
- Analyse the impact of speech constructs on parsing
- Parse speech
4 The Ester Corpus
- 37 hours of manual transcriptions of French radio broadcasts ( )
- Annotations with speakers, words, noise symbols, sometimes punctuation
- Normalisation to match the output of speech recognition systems:
  - Remove punctuation and the upper-case first letter of the first word
  - Remove incomplete words: "Le pe- petit..."
  - But keep disfluencies made of complete words: "Le le petit"
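The normalisation rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' tool: the function name `normalise`, the exact punctuation set, and the convention that an incomplete word is marked by a trailing hyphen (as in "pe-") are assumptions; apostrophes and word-internal hyphens (e.g. "lui-même") are kept.

```python
import re

# Punctuation to strip. Apostrophes and word-internal hyphens are kept
# (assumption: the slides do not say how these are treated).
_PUNCT = re.compile(r"[.,;:!?\"()«»…]")


def normalise(utterance: str) -> str:
    """Normalise a transcription so it resembles speech recogniser output."""
    tokens = _PUNCT.sub(" ", utterance).split()
    # Drop incomplete words, assumed to be marked by a trailing hyphen ("pe-").
    tokens = [t for t in tokens if not t.endswith("-")]
    # Lower-case the first letter of the first word.
    if tokens:
        tokens[0] = tokens[0][0].lower() + tokens[0][1:]
    # Repeated complete words ("Le le") are disfluencies and are kept as-is.
    return " ".join(tokens)


print(normalise("Le pe- petit..."))  # incomplete word removed
print(normalise("Le le petit."))     # disfluency with complete words kept
```

Note that the sketch removes the incomplete word before lower-casing, so an utterance that starts with a truncated word still ends up with a lower-case first token.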
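5 Example transcriptions
- Header; no punctuation: quiberon Frédéric Colas France Bleue Armorique pour France-Inter
- Non-verbal utterance: bonsoir (good evening)
- Incomplete utterances: l'enquête sur l'office HLM de Paris Jean Tiberi le maire de la capitale annonce lui-même dans une interview au Monde (the investigation into the Paris public housing office Jean Tiberi the mayor of the capital himself announces in an interview with Le Monde)
- Incorrect sentence segmentation: sa mise en examen pour complicité de trafic d'influence (his indictment for complicity in influence peddling)
- Hesitations: je pense que cela doit conduire euh Jean Tiberi le premier euh à une réflexion (I think this should lead uh Jean Tiberi first uh to some reflection)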
6 Methodology for constructing ETB
- Manual annotation supported by pre-parsing
- Active Learning to selectively extend the annotated data and improve the parser using a small training corpus (Christophe's talk)
7 Manual Annotation
- On and off since 2009
- Uses the JSafran framework
- Iterative process:
  1. Design of an annotation scheme
  2. Manual annotation of 5000 words
  3. Training of a Malt Parser model
  4. Automatic parsing of a new corpus segment
  5. Manual correction of this corpus segment
  6. Addition of this corrected segment to the training corpus
  7. Iterate from step 3
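Steps 3 to 7 of the iterative process form a simple bootstrapping loop, sketched below. The function name and the three callables `train`, `parse` and `correct` are hypothetical placeholders: in the actual workflow they would correspond to Malt Parser training, Malt Parser parsing and manual correction in the annotation GUI, none of which is modelled here.

```python
def bootstrap_treebank(seed, segments, train, parse, correct):
    """Grow a training corpus by alternating parsing and manual correction.

    seed     -- manually annotated starting corpus (step 2)
    segments -- iterable of new, unannotated corpus segments
    train, parse, correct -- hypothetical callables supplied by the caller
    """
    training = list(seed)
    for segment in segments:
        model = train(training)              # step 3: train on current corpus
        auto_parsed = parse(model, segment)  # step 4: parse a new segment
        corrected = correct(auto_parsed)     # step 5: manual correction
        training.extend(corrected)           # step 6: add to training corpus
    return training                          # step 7 is the loop itself
```

The point of the design is that each new segment is pre-parsed by a model trained on everything corrected so far, so the manual correction effort should shrink as the corpus grows.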
8 The ETB Annotation Schema
15 dependency relations:
- SUJ (subject)
- OBJ (object)
- POBJ (prepositional object)
- ATTS (subject attribute)
- ATTO (object attribute)
- MOD (modifier)
- COMP (complementizer)
- AUX (auxiliary)
- DET (determiner)
- CC (coordination)
- REF (reflexive pronoun)
- JUXT (juxtaposition)
- APPOS (apposition)
- DUMMY (syntactically governed but semantically empty dependent, e.g. expletive subject)
- DISFL (disfluency)
9 ETB and PTB annotations

ETB label   Description             P7Dep
MOD         modifier                mod, mod_rel, dep
COMP        complementizer          obj
DET         determiner              det
SUJ         subject                 suj
OBJ         object                  obj
DISFL       disfluency              mod
CC          coordination            coord, dep_coord
POBJ        prepositional object    a_obj, de_obj, p_obj
ATTS        subject attribute       ats
JUXT        juxtaposition           mod
MultiMots   multi-word expression   mod
AUX         auxiliary               aux_tps, aux_pass, aux_caus
DUMMY       empty dependent         aff
REF         reflexive pronoun       obj, a_obj, de_obj
APPOS       apposition              mod
ATTO        object attribute        ato
10 Rule Converter ETB to PTB

ETB labels with several possible P7Dep labels:
ETB     P7Dep
MOD     mod, mod_rel, dep
CC      coord, dep_coord
POBJ    a_obj, de_obj, p_obj
AUX     aux_tps, aux_pass, aux_caus
REF     obj, a_obj, de_obj

ETB labels merged into a single P7Dep label:
ETB         P7Dep
DISFL       mod
JUXT        mod
MultiMots   mod
APPOS       mod

Converter accuracy on an ESTER test corpus manually annotated in the P7Dep format:
- LAS (labelled attachment score) = 92.6%
- UAS (unlabelled attachment score) = 98.5%
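The label correspondences can be written down as a lookup table, which makes the two kinds of conversion explicit: labels with a single P7Dep target convert directly, while MOD, CC, POBJ, AUX and REF need a disambiguation rule. The slides do not describe the converter's rules, so the `disambiguate` callback below is a hypothetical hook rather than the authors' method; the underscores in the P7Dep labels (`mod_rel`, `a_obj`, ...) are assumed to have been lost in the transcription.

```python
# ETB label -> candidate P7Dep labels, taken from the slide's table.
ETB_TO_P7DEP = {
    "MOD": ["mod", "mod_rel", "dep"],
    "COMP": ["obj"],
    "DET": ["det"],
    "SUJ": ["suj"],
    "OBJ": ["obj"],
    "DISFL": ["mod"],
    "CC": ["coord", "dep_coord"],
    "POBJ": ["a_obj", "de_obj", "p_obj"],
    "ATTS": ["ats"],
    "JUXT": ["mod"],
    "MultiMots": ["mod"],
    "AUX": ["aux_tps", "aux_pass", "aux_caus"],
    "DUMMY": ["aff"],
    "REF": ["obj", "a_obj", "de_obj"],
    "APPOS": ["mod"],
    "ATTO": ["ato"],
}


def convert_label(etb_label, disambiguate=None):
    """Convert one ETB label to a P7Dep label.

    disambiguate -- hypothetical callback (etb_label, candidates) -> label,
                    required only when several P7Dep labels are possible.
    """
    candidates = ETB_TO_P7DEP[etb_label]
    if len(candidates) == 1:
        return candidates[0]
    if disambiguate is None:
        raise ValueError(f"{etb_label} is ambiguous: {candidates}")
    choice = disambiguate(etb_label, candidates)
    if choice not in candidates:
        raise ValueError(f"{choice} is not a valid target for {etb_label}")
    return choice
```

A real rule would look at the context of the dependency (for instance, the preposition governing a POBJ), which is presumably where the converter's remaining labelling errors come from.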
11 Example Annotations
Figure: Screenshot of the J-Safran GUI for dependency tree editing
12 J-Safran software
GUI with the following functionalities:
- Visualisation and editing of dependency graphs
- POS-tagging: Tree-Tagger (French version) and OpenNLP Tagger (CRF trained on French TreeBank)
- Parsing with the Malt Parser (ETB or FTB models)
- Training of parsing models on annotated data
- Search functions (words, dependencies, sequences, ...)
- Evaluation with CoNLL scripts
13 Utterance-level annotations
Part 2 of the ETB corpus was annotated with utterance-level labels:
- GUEST: et euh je je pense que pourri beaucoup l'image de de la conduite (and hum I I think deteriorates much the image of of driving)
- SPEAKER: les deux gouvernements cherchent un compromis (both governments look for some compromise)
- ELLIPSIS: je cite de mémoire qu'un tiers des morts à l'avant euh n'avaient pas leur ceinture et euh non un quart à l'avant et je crois près du tiers à l'arrière (... a third of the dead in front did not have their safety belt on huh no a quarter in front and I think a third at the back)
- HEADER: quiberon frédéric colas france bleue armorique pour france-inter (Quiberon Frédéric Colas France Bleue Armorique for France-Inter)
14 Model performance on ETB Part 2
- Training corpus: 8544 words
- Test corpus: 1747 words
- Labelled attachment score (LAS), i.e. the percentage of tokens with the correct governor and dependency relation: 63.6%
- Which constructs most affect parsing accuracy? We look at Speaker/Guest differences, disfluencies and radio headlines
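The three scores reported in the following slides can be computed from gold and predicted (governor, relation) pairs as sketched below. LAS follows the definition given above; UAS (correct governor only) and LAC (correct relation label only) follow their usual definitions, which the slides do not spell out. The function name, the toy sentence and the "ROOT" label are illustrative assumptions.

```python
def attachment_scores(gold, predicted):
    """Return (LAS, UAS, LAC) in percent.

    gold, predicted -- equal-length lists of (governor_index, relation) pairs,
                       one pair per token.
    """
    if len(gold) != len(predicted) or not gold:
        raise ValueError("gold and predicted must be non-empty and aligned")
    n = len(gold)
    # LAS: both governor and relation are correct.
    las = sum(g == p for g, p in zip(gold, predicted))
    # UAS: governor is correct, relation ignored.
    uas = sum(g[0] == p[0] for g, p in zip(gold, predicted))
    # LAC: relation is correct, governor ignored.
    lac = sum(g[1] == p[1] for g, p in zip(gold, predicted))
    return 100.0 * las / n, 100.0 * uas / n, 100.0 * lac / n


# Toy 4-token example ("ROOT" is a hypothetical label for the root token).
gold = [(2, "DET"), (3, "SUJ"), (0, "ROOT"), (3, "OBJ")]
pred = [(2, "DET"), (3, "OBJ"), (0, "ROOT"), (2, "OBJ")]
print(attachment_scores(gold, pred))  # (50.0, 75.0, 75.0)
```

In the toy example, token 2 has the right governor but the wrong label and token 4 the reverse, which is why UAS and LAC are both higher than LAS.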
15 Impact of disfluencies
- Ratio of utterances with disfluencies: 41% (D sub-corpus)
- Manual removal of disfluencies in the test corpus

Performance on the D sub-corpus:
       W/o disfl   With disfl   Δ (w, w/o)
LAS    70.2%       66.1%        +4.1
UAS    77.2%       73.5%        +3.7
LAC    76.5%       72.7%        +3.8

Performance on the whole test corpus:
       W/o disfl   With disfl   Δ (w, w/o)
LAS    67.3%       65.7%        +1.6
UAS    74.2%       73.0%        +1.2
LAC    74.2%       72.6%        +1.6
16 Impact of speaking style
- Ratio of journalistic/guest utterances: 72%/28%

Performance on both types of speech:
       Journalist   Guest    Δ (J, G)
LAS    70.8%        65.2%    -5.6
UAS    76.5%        71.8%    -4.7
LAC    77.5%        72.0%    -5.5

Is this difference due to disfluencies? After removing disfluencies:
       Journalist   Guest    Δ (J, G)
LAS    71.2%        67.8%    -3.4
UAS    77.2%        74.1%    -3.1
LAC    78.2%        74.5%    -3.7

Disfluencies explain about 40% of the LAS degradation observed between journalist and guest speech: the gap shrinks from 5.6 to 3.4 points, i.e. (5.6 - 3.4) / 5.6 ≈ 39%.
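The "40%" figure can be reproduced from the two tables as the share of the journalist/guest gap that disappears once disfluencies are removed. The short sketch below does this for all three metrics; the helper name `explained_share` is an assumption, the numbers are those of the slide.

```python
def explained_share(gap_with, gap_without):
    """Share of the journalist/guest gap that vanishes without disfluencies."""
    return (gap_with - gap_without) / gap_with


# (journalist, guest) scores with disfluencies, then without.
scores = {
    "LAS": ((70.8, 65.2), (71.2, 67.8)),
    "UAS": ((76.5, 71.8), (77.2, 74.1)),
    "LAC": ((77.5, 72.0), (78.2, 74.5)),
}

shares = {}
for metric, ((j_with, g_with), (j_wo, g_wo)) in scores.items():
    shares[metric] = explained_share(j_with - g_with, j_wo - g_wo)
    print(f"{metric}: {shares[metric]:.1%}")
```

For LAS this gives roughly 39%, which the slide rounds to 40%; for UAS and LAC the same computation gives roughly 34% and 33%, so the stated figure refers to LAS.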
17 Impact of headers
- Ratio of header utterances: 14%
- Guest utterances removed
- 10-fold cross-validation

Comparative results on headers / journalist style:
       Journalist without headers   Headers   Δ (-H, +H)
LAS    70.6%                        61.7%     -8.9
UAS    76.2%                        69.7%     -6.5
LAC    77.4%                        67.5%     -9.9
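For readers unfamiliar with the protocol, 10-fold cross-validation can be sketched as follows: the utterances are split into ten folds, each fold serves once as the test set while the other nine are used for training, and the scores are averaged. The slides do not say how the folds were built, so the round-robin split and the function name below are assumptions.

```python
def k_fold_splits(items, k=10):
    """Yield (train, test) pairs; each item lands in exactly one test fold."""
    # Round-robin assignment of items to k folds (an assumed strategy).
    folds = [items[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, test


# Usage with 25 hypothetical utterance identifiers.
utterances = list(range(25))
for train, test in k_fold_splits(utterances):
    assert not set(train) & set(test)  # train and test never overlap
```

With a small test set such as the one used here, cross-validation makes the header/journalist comparison less dependent on one particular split.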
18 Summarising
- Disfluencies degrade parsing performance by 1.6 LAS points on average (whole test corpus)
- Guest utterances are harder to parse, even after disfluencies are removed, with a LAS decrease of 3.4 points
- Radio-specific constructs (headlines) show a LAS decrease of 8.9 points (different syntactic structure, sparse data)
19 Conclusions and future work
Current status
- Current ETB: words (53000 Ester 2, Etape)
- LAS with MATE Parser: 76%
Future work
- Continue annotations
- Automatically detect incorrect annotations
- Finer-grained annotation of disfluencies (hesitations, repairs, repetitions, false starts)
- Investigate Active Learning
- Investigate different parsing strategies (pre-parse disfluencies and named entities, joint model for named entity recognition and parsing)