Universal Dependencies




Joakim Nivre
Uppsala University, Department of Linguistics and Philology
Based on collaborative work with Jinho Choi, Timothy Dozat, Filip Ginter, Yoav Goldberg, Jan Hajič, Chris Manning, Ryan McDonald, Natalia Silveira, Marie de Marneffe, Slav Petrov, Sampo Pyysalo, Reut Tsarfaty, Daniel Zeman

Natural Language Processing
  Linguistic diversity makes our life harder
    Why 90% parsing accuracy for English but only 80% for Finnish?
    Can we even compare the numbers?
  Current NLP relies heavily on linguistic annotation
    Treebank annotation schemes vary across languages
    How do we avoid comparing apples and oranges?

[Figure: dependency trees for the same sentence in three languages, first labelled Language X, Y and Z and then revealed as Swedish ("En katt jagar råttor och möss"), Danish ("En kat jager rotter og mus") and English ("A cat chases rats and mice"). Each treebank analyses the coordination with a different arrangement of conj and cc arcs. The slide annotates the comparisons between the trees with the fractions 1/5, 2/5 and 2/5 and asks: Which languages are most closely related?]

Why is this a problem?
  Hard to compare empirical results across languages
  Hard to evaluate cross-lingual learning
  Hard to build and maintain multilingual systems
  Hard to make progress towards a universal parser

Universal Dependencies
http://universaldependencies.github.io/docs/

[Figure: the Swedish, Danish and English sentences ("En katt jagar råttor och möss", "En kat jager rotter og mus", "A cat chases rats and mice") now share one analysis of the coordination, using conj and cc arcs. A French example, "Toutefois, les filles adorent les desserts.", illustrates the three annotation layers: part-of-speech tags (ADV PUNCT DET NOUN VERB DET NOUN PUNCT), morphological features (such as Gender=Fem, Number=Plur, Person=3, Tense=Pres) and dependency relations (such as advmod).]

Universal Dependencies
http://universaldependencies.github.io/docs/

[Figure: earlier schemes that feed into Universal Dependencies: Stanford Dependencies, CLEAR, Google UD, Stanford UD, HamleDT, Interset and Google universal tags.]

Milestones:
  Kick-off meeting at EACL in Gothenburg, April 2014
  Release of annotation guidelines, Version 1, October 2014
  Release of treebanks for 10 languages, January 2015
  Release of treebanks for 18 languages, May 2015
Open community effort: anyone can contribute!

Goals and Requirements
  Cross-linguistically consistent grammatical annotation
  Support multilingual research and development in NLP
  Based on common usage and existing de facto standards

Design Principles
  Dependency
    Widely used in practical NLP systems
    Available in treebanks for many languages
  Lexicalism
    Basic annotation units are words (syntactic words)
    Words have morphological properties
    Words enter into syntactic relations
  Recoverability
    Transparent mapping from input text to word segmentation

Golden Rules
  Maximize parallelism
    Don't annotate the same thing in different ways
    Don't make different things look the same
  But don't overdo it
    Don't annotate things that are not there
    Languages select from a universal pool of categories
    Allow language-specific extensions

Morphology

Example: Toutefois, les filles adorent les desserts.

Form       Lemma      POS    Features
Toutefois  toutefois  ADV
,          ,          PUNCT
les        le         DET    Definite=Def, Number=Plur
filles     fille      NOUN   Gender=Fem, Number=Plur
adorent    adorer     VERB   Number=Plur, Person=3, Tense=Pres
les        le         DET    Definite=Def, Number=Plur
desserts   dessert    NOUN   Gender=Masc, Number=Plur
.          .          PUNCT

  Lemma representing the semantic content of the word
  Part-of-speech tag representing the abstract lexical category associated with the word
  Features representing lexical and grammatical properties associated with the lemma or the particular word form
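The three layers on this slide map naturally onto a small record type. The following is a minimal sketch, not the project's own tooling: the `Word` class, the `parse_feats` helper, and the `|` separator (with `_` for "no features") are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Word:
    """One syntactic word with its lemma, POS tag and features (hypothetical type)."""
    form: str
    lemma: str
    upos: str
    feats: dict = field(default_factory=dict)


def parse_feats(text):
    """Turn 'Number=Plur|Person=3' into {'Number': 'Plur', 'Person': '3'}.

    Assumption of this sketch: features are joined with '|' and an empty
    string or '_' means the word carries no features.
    """
    if text in ("", "_"):
        return {}
    return dict(pair.split("=", 1) for pair in text.split("|"))


# The French example from the slide, one Word per token.
sentence = [
    Word("Toutefois", "toutefois", "ADV"),
    Word(",", ",", "PUNCT"),
    Word("les", "le", "DET", parse_feats("Definite=Def|Number=Plur")),
    Word("filles", "fille", "NOUN", parse_feats("Gender=Fem|Number=Plur")),
    Word("adorent", "adorer", "VERB", parse_feats("Number=Plur|Person=3|Tense=Pres")),
    Word("les", "le", "DET", parse_feats("Definite=Def|Number=Plur")),
    Word("desserts", "dessert", "NOUN", parse_feats("Gender=Masc|Number=Plur")),
    Word(".", ".", "PUNCT"),
]

# Features make cross-word queries simple, for example all plural forms.
plural_forms = [w.form for w in sentence if w.feats.get("Number") == "Plur"]
print(plural_forms)
```

Keeping features as a dictionary rather than a raw string makes it easy to query a single property without caring which other features a word happens to carry.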

Part-of-Speech Tags

Open    Closed  Other
ADJ     ADP     PUNCT
ADV     AUX     SYM
INTJ    CONJ    X
NOUN    DET
PROPN   NUM
VERB    PART
        PRON
        SCONJ

  Taxonomy of 17 universal part-of-speech tags, based on the Google Universal Tagset (Petrov et al., 2012)
  All languages use the same inventory, but not all tags have to be used by all languages
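The tag table can be written down as data and used to check a treebank's tag set against the shared inventory. This is an illustrative sketch only; the names `UPOS_CLASSES`, `unknown_tags` and `unused_tags` are made up here.

```python
# The 17 universal tags from the slide, grouped by class.
UPOS_CLASSES = {
    "open": ["ADJ", "ADV", "INTJ", "NOUN", "PROPN", "VERB"],
    "closed": ["ADP", "AUX", "CONJ", "DET", "NUM", "PART", "PRON", "SCONJ"],
    "other": ["PUNCT", "SYM", "X"],
}

# Reverse lookup: tag -> class.
TAG_TO_CLASS = {tag: cls for cls, tags in UPOS_CLASSES.items() for tag in tags}


def unknown_tags(tags_in_treebank):
    """Tags a treebank uses that are not in the universal inventory."""
    return sorted(set(tags_in_treebank) - set(TAG_TO_CLASS))


def unused_tags(tags_in_treebank):
    """Universal tags a treebank does not use (allowed by the guidelines)."""
    return sorted(set(TAG_TO_CLASS) - set(tags_in_treebank))


print(len(TAG_TO_CLASS))
print(unknown_tags(["NOUN", "VERB", "NN"]))
```

Only `unknown_tags` signals a problem: a language may leave some universal tags unused, but it may not introduce tags outside the inventory.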

Features

Lexical    Inflectional (Nominal)  Inflectional (Verbal)
PronType   Gender                  VerbForm
NumType    Animacy                 Mood
Poss       Number                  Tense
Reflex     Case                    Aspect
           Definite                Voice
           Degree                  Person
                                   Negative

  Standardized inventory of morphological features, based on the Interset system (Zeman, 2008)
  Languages select relevant features and can add language-specific features or values with documentation
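One way to read "can add language-specific features with documentation" is as an explicit allow-list per language. The sketch below assumes that reading; `undocumented_features` and the feature name `Clitic` are hypothetical and not taken from the guidelines.

```python
# The universal feature inventory listed on the slide.
UNIVERSAL_FEATURES = {
    "PronType", "NumType", "Poss", "Reflex",                        # lexical
    "Gender", "Animacy", "Number", "Case", "Definite", "Degree",    # nominal
    "VerbForm", "Mood", "Tense", "Aspect", "Voice", "Person", "Negative",  # verbal
}


def undocumented_features(feats, language_specific=frozenset()):
    """Return feature names that are neither universal nor declared for the language."""
    allowed = UNIVERSAL_FEATURES | set(language_specific)
    return sorted(name for name in feats if name not in allowed)


# 'Clitic' is a made-up language-specific feature used only for this example.
print(undocumented_features({"Number": "Sing", "Clitic": "Yes"}))
print(undocumented_features({"Number": "Sing", "Clitic": "Yes"}, {"Clitic"}))
```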

Syntax

Example: The cat could have chased all the dogs down the street.

The  cat   could  have  chased  all  the  dogs  down  the  street  .
DET  NOUN  AUX    AUX   VERB    DET  DET  NOUN  ADP   DET  NOUN    PUNCT

[Figure: dependency tree over this sentence, built up in three steps; the arc labels that survive in the transcript are aux, case and nmod.]

  Content words are related by dependency relations
  Function words attach to the content word they modify
  Punctuation attaches to the head of the phrase or clause
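The three attachment principles can be checked mechanically once a tree is written as head indices. The sketch below is one possible reading, not the slide's own figure: the heads and the det, nsubj, dobj and punct labels in `TREE` are a plausible analysis supplied here (only aux, case and nmod are legible in the transcript), and `misattached` and `content_arcs` are hypothetical helpers.

```python
# (form, upos, head, deprel); head is a 1-based token index, 0 marks the root.
# Assumed analysis for illustration; not copied from the slide.
TREE = [
    ("The", "DET", 2, "det"),
    ("cat", "NOUN", 5, "nsubj"),
    ("could", "AUX", 5, "aux"),
    ("have", "AUX", 5, "aux"),
    ("chased", "VERB", 0, "root"),
    ("all", "DET", 8, "det"),
    ("the", "DET", 8, "det"),
    ("dogs", "NOUN", 5, "dobj"),
    ("down", "ADP", 11, "case"),
    ("the", "DET", 11, "det"),
    ("street", "NOUN", 5, "nmod"),
    (".", "PUNCT", 5, "punct"),
]

# Open-class tags are treated as content words in this sketch.
CONTENT_TAGS = {"ADJ", "ADV", "NOUN", "PROPN", "VERB"}


def misattached(tree):
    """Forms whose head is not a content word (the root itself is skipped)."""
    bad = []
    for form, upos, head, deprel in tree:
        if head == 0:
            continue
        if tree[head - 1][1] not in CONTENT_TAGS:
            bad.append(form)
    return bad


def content_arcs(tree):
    """(head form, dependent form, relation) for arcs between content words."""
    return [
        (tree[head - 1][0], form, deprel)
        for form, upos, head, deprel in tree
        if head != 0 and upos in CONTENT_TAGS
    ]


print(misattached(TREE))
print(content_arcs(TREE))
```

Under this analysis the preposition "down" hangs off "street" rather than heading it, so the content words "chased" and "street" are linked directly; a preposition-headed analysis would be flagged by `misattached`.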

[Figure: parallel analyses of a passive clause in Swedish and English, built up in four steps.]

Hunden        jagades     av   katten        .
NOUN          VERB        ADP  NOUN          PUNCT
Definite=Def  Voice=Pass       Definite=Def

The  dog   was  chased  by   the  cat   .
DET  NOUN  AUX  VERB    ADP  DET  NOUN  PUNCT

Relations shown: nsubjpass, auxpass, nmod, case
Swedish marks the passive on the verb (Voice=Pass), while English uses the auxiliary "was" (auxpass); the remaining relations are the same in both languages.

Dependency Relations
  Taxonomy of 40 universal grammatical relations, broadly attested in language typology (de Marneffe et al., 2014)
  Language-specific subtypes may be added
  Organizing principles
    Three types of structures: nominals, clauses, modifiers
    Core arguments vs. other dependents (not complements vs. adjuncts)

Dependents of Clausal Predicates

          Nominal    Clausal    Other
Core      nsubj      csubj
          nsubjpass  csubjpass
          dobj       ccomp
          iobj       xcomp
Non-Core  nmod       advcl      advmod
          vocative              neg
          discourse             aux
          expl                  auxpass
                                cop
                                mark

[Figure: three annotated example sentences; the arc labels that survive in the transcript are listed after each one.]

Mary   was  quietly  reading  a    book  in   the  garden  .
PROPN  AUX  ADV      VERB     DET  NOUN  ADP  DET  NOUN    PUNCT
Relations: aux, advmod, nmod, case

If     you   are  sick  ,      you   should  not  exercise  .
SCONJ  PRON  AUX  ADJ   PUNCT  PRON  AUX     ADV  VERB      PUNCT
Relations: advcl, mark, cop, aux, neg

Peter  thought  that   he    should  stop  smoking  .
PROPN  VERB     SCONJ  PRON  AUX     VERB  VERB     PUNCT
Relations: ccomp, mark, aux, xcomp

Dependents of Nominals

Nominal  Clausal  Other
nummod   acl      amod
appos             det
nmod              case

Cairo  ,      the  lovely  capital  of   Egypt
PROPN  PUNCT  DET  ADJ     NOUN     ADP  PROPN
Relations: appos, amod, nmod, case

Coordination

Huey   ,      Dewey  and   Louie
PROPN  PUNCT  PROPN  CONJ  PROPN
Relations: conj, cc, punct

  Coordinate structures are headed by the first conjunct
    Subsequent conjuncts depend on it via the conj relation
    Conjunctions depend on it via the cc relation
    Punctuation marks depend on it via the punct relation
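The coordination rules above can be sketched as a small function. It assumes the simplest case, where every conjunct is a single word, and the name `coordinate` is made up for this example.

```python
def coordinate(tokens):
    """Attach every token of a flat coordination to the first conjunct.

    tokens: list of (form, upos) pairs.
    Returns (form, head, deprel) for each token except the first conjunct,
    with head given as a 1-based index. Single-word conjuncts are assumed.
    """
    # The first token that is neither a conjunction nor punctuation is the head.
    first = next(
        i for i, (_, upos) in enumerate(tokens, start=1)
        if upos not in ("CONJ", "PUNCT")
    )
    arcs = []
    for i, (form, upos) in enumerate(tokens, start=1):
        if i == first:
            continue
        # Conjunctions get cc, punctuation gets punct, everything else is a conjunct.
        deprel = {"CONJ": "cc", "PUNCT": "punct"}.get(upos, "conj")
        arcs.append((form, first, deprel))
    return arcs


phrase = [("Huey", "PROPN"), (",", "PUNCT"), ("Dewey", "PROPN"),
          ("and", "CONJ"), ("Louie", "PROPN")]
for arc in coordinate(phrase):
    print(arc)
```

Because every dependent points at the first conjunct, the structure stays the same however many conjuncts are added.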

Multiword Expressions

Relation  Examples
mwe       in spite of, as well as, ad hoc
name      Roger Bacon, Carl XVI Gustaf, New York
compound  phone book, four thousand, dress up
goeswith  notwith standing, with out

  UD annotation does not permit words with spaces
  Multiword expressions are analysed using special relations
    The mwe, name and goeswith relations are always head-initial
    The compound relation reflects the internal structure

Other Relations

Relation    Explanation
parataxis   Loosely linked clauses of same rank
list        Lists without syntactic structure
remnant     Orphans in ellipsis linked to parallel elements
reparandum  Disfluency linked to (speech) repair
foreign     Elements within opaque stretches of code switching
dep         Unspecified dependency
root        Syntactically independent element of clause/phrase

Language-Specific Relations
  Language-specific relations are subtypes of universal relations added to capture important phenomena
  Subtyping permits us to back off to universal relations

Relation      Explanation
acl:relcl     Relative clause
compound:prt  Verb particle (dress up)
nmod:poss     Genitive nominal (Mary's book)
nmod:agent    Agent in passive (saved by the bell)
cc:preconj    Preconjunction (both ... and)
det:predet    Predeterminer (all those ...)
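Because subtypes are written as universal:subtype, backing off to the universal relation is a matter of cutting the label at the colon. A minimal sketch, with `universal_relation` as a hypothetical helper name:

```python
from collections import Counter


def universal_relation(deprel):
    """Strip a language-specific subtype: 'acl:relcl' becomes 'acl'."""
    return deprel.split(":", 1)[0]


# Labels as they might appear in one treebank (illustrative list).
labels = ["acl:relcl", "nmod:poss", "nmod:agent", "nmod", "cc:preconj", "cc"]

# After backing off, counts are comparable across languages that
# use different subtypes, or none at all.
counts = Counter(universal_relation(label) for label in labels)
print(counts)
```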

Papers
Nivre, J. (2015) Towards a Universal Grammar for Natural Language Processing. In Proceedings of CICLing, 3-16.
Naseem, T., Chen, H., Barzilay, R. and Johnson, M. (2010) Using Universal Linguistic Knowledge to Guide Grammar Induction. In Proceedings of EMNLP, 1234-1244.
McDonald, R., Petrov, S. and Hall, K. (2011) Multi-Source Transfer of Delexicalized Dependency Parsers. In Proceedings of EMNLP, 62-72.
Zeman, D., Mareček, D., Popel, M., Ramasamy, L., Štěpánek, J., Žabokrtský, Z. and Hajič, J. (2012) HamleDT: To Parse or Not to Parse? In Proceedings of LREC, 2735-2741.
McDonald, R., Nivre, J., Quirmbach-Brundage, Y., Goldberg, Y., Das, D., Ganchev, K., Hall, K., Petrov, S., Zhang, H., Täckström, O., Bedini, C., Bertomeu Castelló, N. and Lee, J. (2013) Universal Dependency Annotation for Multilingual Parsing. In Proceedings of ACL (Short Papers), 92-97.
De Marneffe, M.-C., Dozat, T., Silveira, N., Haverinen, K., Ginter, F., Nivre, J. and Manning, C.D. (2014) Universal Stanford Dependencies: A Cross-Linguistic Typology. In Proceedings of LREC, 4585-4592.
Swanson, B. and Charniak, E. (2014) Data Driven Language Transfer Hypotheses. In Proceedings of EACL, 169-173.
Östling, R. (2015) Word Order Typology through Multilingual Word Alignment. In Proceedings of ACL (Short Papers), 205-211.
Futrell, R., Mahowald, K. and Gibson, E. (2015) Quantifying Word Order Freedom in Dependency Corpora. In Proceedings of Depling, 91-100.


Research Problems
  Annotation
    Produce guidelines and/or annotation for language X
    Study the annotation of construction Y across languages
  Parsing
    Develop and/or evaluate a parser for language X
    Study cross-lingual transfer learning and/or annotation projection
    Develop better evaluation schemes for multilingual parsing
  Typology
    Study word order patterns in language X
    Compare the realisation of construction Y across languages

Questions?