Outlook. Hungarian verbal complexes and the pre-verbal field: towards an MCTAG analysis. (L)TAG: Basics. Motivation fot (L)TAG



Similar documents
ž Lexicalized Tree Adjoining Grammar (LTAG) associates at least one elementary tree with every lexical item.

Constraints in Phrase Structure Grammar

Chapter 13, Sections Auxiliary Verbs CSLI Publications

Outline of today s lecture

Surface Realisation using Tree Adjoining Grammar. Application to Computer Aided Language Learning

L130: Chapter 5d. Dr. Shannon Bischoff. Dr. Shannon Bischoff () L130: Chapter 5d 1 / 25

Complex Predications in Argument Structure Alternations

Parsing Beyond Context-Free Grammars: Introduction

Structure of the talk. The semantics of event nominalisation. Event nominalisations and verbal arguments 2

Paraphrasing controlled English texts

XMG : extensible MetaGrammar

Building a Question Classifier for a TREC-Style Question Answering System

Noam Chomsky: Aspects of the Theory of Syntax notes

Ling 201 Syntax 1. Jirka Hana April 10, 2006

How the Computer Translates. Svetlana Sokolova President and CEO of PROMT, PhD.

Tools and resources for Tree Adjoining Grammars

Mildly Context Sensitive Grammar Formalisms

Phase 2 of the D4 Project. Helmut Schmid and Sabine Schulte im Walde

Scrambling in German - Extraction into the Mittelfeld

Semantics and Generative Grammar. Quantificational DPs, Part 3: Covert Movement vs. Type Shifting 1

A note on the complexity of the recognition problem for the Minimalist Grammars with unbounded scrambling and barriers

Slot Grammars. Michael C. McCord. Computer Science Department University of Kentucky Lexington, Kentucky 40506

Object Agreement in Hungarian

IP PATTERNS OF MOVEMENTS IN VSO TYPOLOGY: THE CASE OF ARABIC

The Adjunct Network Marketing Model

Proceedings of the LFG02 Conference. National Technical University of Athens, Athens. Miriam Butt and Tracy Holloway King (Editors) CSLI Publications

ORAKEL: A Portable Natural Language Interface to Knowledge Bases


Learning Translation Rules from Bilingual English Filipino Corpus

Test Suite Generation

I have eaten. The plums that were in the ice box

Module Catalogue for the Bachelor Program in Computational Linguistics at the University of Heidelberg

Cross-linguistic differences in the interpretation of sentences with more than one QP: German (Frey 1993) and Hungarian (É Kiss 1991)

CINTIL-PropBank. CINTIL-PropBank Sub-corpus id Sentences Tokens Domain Sentences for regression atsts 779 5,654 Test

Structure of Clauses. March 9, 2004

How To Understand A Sentence In A Syntactic Analysis

Tense as an Element of INFL Phrase in Igbo

Lecture 9. Semantic Analysis Scoping and Symbol Table

What s in a Lexicon. The Lexicon. Lexicon vs. Dictionary. What kind of Information should a Lexicon contain?

CALICO Journal, Volume 9 Number 1 9

Statistical Machine Translation

The finite verb and the clause: IP

Natural Language Database Interface for the Community Based Monitoring System *

Errata: Carnie (2008) Constituent Structure. Oxford University Press (MAY 2009) A NOTE OF EXPLANATION:

19. Morphosyntax in L2A

LGPLLR : an open source license for NLP (Natural Language Processing) Sébastien Paumier. Université Paris-Est Marne-la-Vallée

Syntactic and Semantic Differences between Nominal Relative Clauses and Dependent wh-interrogative Clauses

Motivation. Korpus-Abfrage: Werkzeuge und Sprachen. Overview. Languages of Corpus Query. SARA Query Possibilities 1

According to the Argentine writer Jorge Luis Borges, in the Celestial Emporium of Benevolent Knowledge, animals are divided

Context Grammar and POS Tagging

Treebank Search with Tree Automata MonaSearch Querying Linguistic Treebanks with Monadic Second Order Logic

CHARTES D'ANGLAIS SOMMAIRE. CHARTE NIVEAU A1 Pages 2-4. CHARTE NIVEAU A2 Pages 5-7. CHARTE NIVEAU B1 Pages CHARTE NIVEAU B2 Pages 11-14

The Predicate-Argument Structure of Discourse Connectives: A Corpus-based Study 1 Introduction

Translating while Parsing

Double Genitives in English

Syntax: Phrases. 1. The phrase

Comprendium Translator System Overview

Non-nominal Which-Relatives

Movement and Binding

The Specific Text Analysis Tasks at the Beginning of MDA Life Cycle

Negative quantifiers in Hungarian 1. Katalin É. Kiss

Constituency. The basic units of sentence structure


Natural Language News Generation from Big Data

LASSY: LARGE SCALE SYNTACTIC ANNOTATION OF WRITTEN DUTCH

Phrase Structure Rules, Tree Rewriting, and other sources of Recursion Structure within the NP

Hybrid Strategies. for better products and shorter time-to-market

Towards portable natural language interfaces to knowledge bases The case of the ORAKEL system

Index. 344 Grammar and Language Workbook, Grade 8

COMPUTATIONAL DATA ANALYSIS FOR SYNTAX

A chart generator for the Dutch Alpino grammar

Interactive Dynamic Information Extraction

A Chart Parsing implementation in Answer Set Programming

D2.4: Two trained semantic decoders for the Appointment Scheduling task

Semantic Mapping Between Natural Language Questions and SQL Queries via Syntactic Pairing

Syntactic Theory. Background and Transformational Grammar. Dr. Dan Flickinger & PD Dr. Valia Kordoni

The Predicate-Argument Structure of Discourse Connectives A Corpus-based Study

English prepositional passive constructions

Structurally ambiguous sentences

SYNTAX: THE ANALYSIS OF SENTENCE STRUCTURE

Sentence Structure/Sentence Types HANDOUT

ENGLISH GRAMMAR Elementary

Varieties of specification and underspecification: A view from semantics

Introduction. BM1 Advanced Natural Language Processing. Alexander Koller. 17 October 2014

The syntactic position of the subject in Hungarian existential constructions

Special Topics in Computer Science

Lecture 9. Phrases: Subject/Predicate. English 3318: Studies in English Grammar. Dr. Svetlana Nuernberg

Syntactic Theory on Swedish

CONSTRAINING THE GRAMMAR OF APS AND ADVPS. TIBOR LACZKÓ & GYÖRGY RÁKOSI rakosi, laczko

Sense-Tagging Verbs in English and Chinese. Hoa Trang Dang

31 Case Studies: Java Natural Language Tools Available on the Web

Formalism. Abstract This paper describes our work on parsing Turkish using the lexical-functional grammar

Self-Training for Parsing Learner Text

Predicate Logic. For example, consider the following argument:

Appendix to Chapter 3 Clitics

Natural Language Processing

LESSON THIRTEEN STRUCTURAL AMBIGUITY. Structural ambiguity is also referred to as syntactic ambiguity or grammatical ambiguity.

Natural Language Dialogue in a Virtual Assistant Interface

Example-Based Treebank Querying. Liesbeth Augustinus Vincent Vandeghinste Frank Van Eynde

Chinese Open Relation Extraction for Knowledge Acquisition

Transcription:

Outlook Hungarian verbal complexes and the pre-verbal field: towards an MCTAG analysis Kata Balogh HHU Düsseldorf 16 th zklarska Poreba Workshop 20-23 February 2015 goal: Hungarian grammar using TAG + XMG Tree-Adjoining Grammar + extensible MetaGrammar today: some problems and analysis around free word order a very quick introduction to LTAG Lexicalized TAG grammar writing with XMG Hungarian data: verbal complexes and the pre-verbal field some results and proposed analysis using XMG & MCTAG Multi-component TAG Balogh HHU zklarska 15 1/30 Balogh HHU zklarska 15 2/30 Motivation fot LTAG TAGs are mildly context-sensitive parsing in polynomial time generation of crossing dependencies constant growth property semilinearity large coverage TAG grammars English and Korean XTAG; Joshi et al. French TAG Crabbé s PhD-thesis; German GerTT; Kallmeyer & Lichte grammar implementation with TAG XTAG tools UPenn! parser editor viewer... XMG + TuLiPA Tübingen F XMG: extensible MetaGrammar Duchier et al 2004 F TuLiPA: Tübingen Linguistic Parsing Architecture Parmentier et al 2008 Balogh HHU zklarska 15 3/30 LTAG: Basics Tree Adjoining Grammar TAG is a set of elementary trees a finite set of initial trees a finite set of auxiliary trees two combinatorial operations substitution: replacing a non-terminal leaf with an initial tree adjunction: replacinganinternalnodewithanauxiliarytree NP pim Adv always LTAG:LexicalizedTAG each elementary tree contains at least one lexical item Balogh HHU zklarska 15 4/30

LTAG: Basics to increase the expressive power: adjunction constraints whether adjunction is mandatory and which trees can be adjoined: Null Adjunction NA Obligatory Adjunction OA elective Adjunction A feature structures as non-terminal nodes; reasons wrt TAG: generalizing agreement and case marking via underspecification modeling adjunction constraints smaller grammars that are easier to maintain 2 cat case 6 4 agr 3 np nom apple 7 pers 3 5 num sg she 2 3 cat np 6 7 4case nom5 agr 1 hcat s i # 2 3 cat vp 6 apple 4 pers 3 7 5 agr 1 num sg Balogh HHU zklarska 15 5/30 Linguistic analyses with LTAG the ideal grammar formalism! linguistically adequate: phenomena: linearization agreement discontinuity ellipsis... generalizations: valency active/passive diathesis alternations... intuitive implementation LTAG = set of elementary trees What is an elementary tree and what is its shape?? syntactic/semantic elementary trees = of linguistic objects yntactic design principles Frank 2002: Lexicalization Condition on Elementary Tree Minimality CETM Fundamental TAG Hypothesis FTH -Criterion for TAG emantic design principles Abeillé & Rambow 2000 Design principle of economy properties Balogh HHU zklarska 15 6/30 yntactic design principles Grammar architecture Fundamental TAG Hypothesis FTH Every syntactic dependency subcategorization binding... is expressed locally within an elementary tree. -Criterion for TAG a f H is the lexical head of an elementary tree T H assigns all of its -roles in T. b f A is a frontier non-terminal in T A must be assigned a -role in T. alency/subcategorization is expressed within the elementary tree of the predicate: either a substitution node or a footnode Morph Database yntactic Database Tree Database inflected form root form PO inflectional information list of tree templates and tree families likes wants Balogh HHU zklarska 15 7/30 Balogh HHU zklarska 15 8/30

Tree templates and tree families Tree templates: e.g. for the declarative transitive verb and the transitive verb with object extraction marks the lexical insertion site a tree family is a set of tree templates represents a subcategorization frame and unifies all syntactic configurations the subcategorization frame can be realized in Balogh HHU zklarska 15 9/30 NP Hungarian flexible word order & discourse configurationality w.r.t. information structure: post-verbal and pre-verbal field post-verbal field: argument positions order is free 1 Adott gave 2 Adott gave egy könyvet Marinak..nom a book.acc Mary.dat Marinak egy könyvet. Mary.dat.nom a book.acc all 6 permutations: gave a book to Mary. pre-verbal field: functional projections fixed order Topic < Quantifier < Focus < erb <... 3 Marinak mindenki egy könyvet adott. Mary.dat everyone.nom a book.acc gave t was a book that everyone gave to Mary. Balogh HHU zklarska 15 10 / 30 Hungarian and LTAG XMG LTAG! fixed positions for grammatical functions flexible word order? larger tree families larger set of elementary trees; e.g. grammar writing! using extensible MetaGrammar XMG [Crabbé et al. 2012]... etc. extensible MetaGrammar specifying an F-LTAG LTAG set of elementary trees most information contained in the elementary trees XMG generate the elementary trees for a given grammar meta-grammar expressing generalizations additional abstraction level factoring out reusable tree-fragments: classes; e.g. F ubject position in English or Topic/Focus positions in Hungarian appearing in elementary trees of verbs with di erent subcategorization frames classes tree-fragments can be combined by conjunction and disjunction Balogh HHU zklarska 15 11 / 30 Balogh HHU zklarska 15 12 / 30

XMG XMG & Hungarian by combining tree fragments! tree templates; e.g. ubject! Canubject _ WhNpubject Object! CanObject _ WhNpObject ActiveTranserb! ubject ^ Activeerb ^ CanObject description language for tree fragments class Canubj declare???np { <syn> { node? color=black [cat=s] ; node?np mark=substcolor=black [cat=np] ; node? color=white [cat=vp] ;? ->?NP ;? ->? ;?NP >>? } } dominance -> and precedence>> also with transitive closure color codes for specifying node equations post-verbal field! free argument order example without verbal prefixes; the tree fragments: Proj NoMerb subject object and oblique argument in post-verbal argument position ubjarg ObjArg OblArg nom acc dat DiTrerbP! Proj ^ NoMerb ^ ubjarg ^ ObjArg providing all 6 elementary trees Balogh HHU zklarska 15 13 / 30 Balogh HHU zklarska 15 14 / 30 XMG & Hungarian pre-verbal field! fixed positions for topic & focus arguments can be in topic position ubjtop ObjTop... etc. erbal complexes verb + verbal modifier M! verbal complexes verbal modifiers: broad group verbal prefixes " # case nom ref + " # case acc ref + arguments can be in focus position ubjfoc ObjFoc... etc. " # case nom foc + h vminv +/nil i " # case acc foc + also implemented: verbal modifiers sentential negation h vminv +/nil i 4 meg-látogatta Marit. Pref-visited Mary.acc visited Mary. infinitives without M 5 úszni akar. swim.inf wants wants to swim. DPs adjectives bare nouns syntax: complementary distribution with negation and focus semantics: secondary predication relation to aspect Balogh HHU zklarska 15 15 / 30 Balogh HHU zklarska 15 16 / 30

yntactic position XMG & Hungarian in neutral sentences without Foc and Neg pre-verbal position 6 meg-látogatta Marit. Pref-visited Mary.acc visited Mary. in non-neutral sentences with Foc and/or Neg post-verbal position 7 nem látogatta meg Marit. not visited pref Mary.acc did not visit Mary. 8 MART nem látogatta meg. Mary.acc not visited pref t is Mary whom did not visited. two positions of the verbal modifier " vcomp # + vminv - M tree templates " vcomp # + vminv + M Neutubj! ubjarg _ ubjtop NeutObj! ObjArg _ ObjTop entproj! Proj ^ NoMerb _ Merb NeutTranserb! entproj ^ Neutubj ^ NeutObj NonNeutTranserb! entproj ^ ubjfoc ^ NeutObj _ ObjFoc ^ Neutubj _ entneg ^ Neutubj ^ NeutObj _ entneg ^ ubjfoc^neutobj _ ObjFoc^Neutubj Balogh HHU zklarska 15 17 / 30 Balogh HHU zklarska 15 18 / 30 erbal complexes and clausal complements nfinitival complements two types of control verbs e.g. fél is afraid! take main stress e.g. akar want! avoids main stress di erent behavior wrt the M of the embedded infinitive 9 el fél el- a levelet. Pref is-afraid Pref-read.inf the letter.acc is afraid to read the letter. 10 el akarja el- a levelet. Pref wants Pref-read.inf the letter.acc wants to read the letter. two verbs both with pre-verbal and post-verbal fields [... pre-...] matrix- [... post-...] [... pre-...] inf- [... post-...] preferred position of a focused/topicalized argument of the embedded verb is in the pre-verbal field of the matrix verb need to deal with scrambling Multi-component TAG MCTAG Koopman-zabolcsi 2004 classification: Auxiliaries: no main accent M-climbing e.g. akar want Nonauxiliaries 1: main accent no M-climbing e.g. fél is-afraid Balogh HHU zklarska 15 19 / 30 Balogh HHU zklarska 15 20 / 30

Nonauxiliaries Nonauxiliaries Example: fél is afraid 11 fél el- a levelet. is-afraid Pref-read.inf the letter.acc is afraid to read the letter. neutral sentences no Foc! standard LTAG analysis 12? fél [a levelet] T el-. is-afraid the letter.acc Pref-read.inf 13 [a levelet] T fél el-. the letter.acc is-afraid Pref-read.inf is afraid to read the letter. 14 fél [a levelet] F is-afraid the letter.acc el. read.inf Pref M el 15 [a levelet] F fél el-. the letter.acc is-afraid Pref-read.inf t is the letter that is afraid the read. fél Balogh HHU zklarska 15 21 / 30 Balogh HHU zklarska 15 22 / 30 Auxiliaries Auxiliaries Example: akar want 16 el akarja a levelet. Pref wants read.inf the letter.acc wants to read the letter. 17 akarja wants el- a levelet. Pref-read.inf the letter.acc standard LTAG analysis cannot derive the M-climbing M 18? el akarja [a levelet] T. Pref wants the letter.acc read.inf 19 [a levelet] T el akarja. the letter.acc Pref wants read.inf wants to read the letter. 20 el akarja [a levelet] F Pref wants the letter.acc. read.inf 21 [a levelet] F akarja el-. the letter.acc wants Pref-read.inf t is the letter what wants to read. el akar Balogh HHU zklarska 15 23 / 30 Balogh HHU zklarska 15 24 / 30

Multi-component TAG MCTAG - Basics standard LTAG cannot analyze discontinuity extraposition extraction scrambling ellipsis gapping subject deletion right node raising crambling: challenge variability in word order; German example: daß ihm Peter den Kühlschrank heute zu reparieren versprach daß ihm den Kühlschrank Peter heute zu reparieren versprach... that Peter promised him to repair the fridge today Problem: if ihm is considered to be an argument/complement of versprach thetreeforversprach has to split into three pieces when conjoined with the tree of zu reparieren possibilities in an LTAG-Analysis: zu reparieren adjoins to versprach contradicts -criterion ihm adjoins to zu reparieren contradicts -criterion multi-component TAG! elementary structures are sets of trees tree-local MCTAG TT-MCTAG all trees in the set have to attach to the same elementary tree strongly equivalent to TAG set-local MCTAG all trees in the set have to attach to the same elementary tree set weakly equivalent to LCFR and simple RCG non-local MCTAG the fixed recognition problem is NP-complete even with lexicalization and dominance links mainly TT-MCTAGs are considered for natural language grammars due to complexity issues Balogh HHU zklarska 15 25 / 30 Balogh HHU zklarska 15 26 / 30 MCTAG - German scrambling TT-MCTAG can handle scrambling up to two levels of embedding i.e. three verbs with one complement each forming a coherent construction. [Joshi et. al 2000] daß ihm den Kühlschrank Peter zu reparieren versprach that Peter promised him to repair the fridge zu reparieren NP dat # versprach Balogh HHU zklarska 15 27 / 30 TT-MCTAG for Hungarian Auxiliaries vs. Nonauxiliaries fél is-afraid vs. akar want proposal using TT-MCTAG elementary trees of the infinitival verb el- Pref-read.inf h av-str i acc M el h av-str + i acc obtained by XMG as before M NA el M NA el Balogh HHU zklarska 15 28 / 30

TT-MCTAG for Hungarian TT-MCTAG for Hungarian tree sets for fél is-afraid and want want and NP nom # NP nom # h av-str i fél h av-str i M h av-str h av-str + i akar el + i acc tree sets for fél is-afraid and want want and NP nom # NP nom # h av-str i fél acc h av-str +/ i M NA h av-str + i akar el Balogh HHU zklarska 15 29 / 30 Balogh HHU zklarska 15 30 / 30