Treebank Search with Tree Automata MonaSearch Querying Linguistic Treebanks with Monadic Second Order Logic


Authors: H. Maryns, S. Kepser
Speaker: Stephanie Ehrbächer
July 31st

Content
- Treebanks
- Query language: Monadic Second Order Logic
- Query tool: MonaSearch

Part 1: Motivation and Treebanks

Treebanks
Recall our overall topic: corpus search.
- corpus or text corpus (in linguistics): a large, structured set of texts, usually stored and processed electronically
What is a treebank?
- a text corpus
- each sentence is parsed, i.e. annotated with its syntactic structure
- the syntactic structure is represented as a tree
- treebank = parsed corpus

Treebanks
Treebanks can be created
- completely manually: linguists annotate each sentence with its syntactic structure
- semi-automatically: a parser assigns the syntactic structure; a linguist checks it and corrects it if necessary
Two main groups can be distinguished:
- treebanks that annotate phrase structure (e.g. the Penn Treebank)
- treebanks that annotate dependency structure (e.g. the Prague Dependency Treebank)

Examples of Treebanks
- TIGER treebank (http://www.ims.uni-stuttgart.de/projekte/tiger)
- NEGRA (http://www.coli.uni-saarland.de/projects/sfb378/negra-corpus)
- TüBa-D/S, the Tuebingen Treebank of Spoken German (http://www.sfs.uni-tuebingen.de/en_tuebads.shtml)
- TüBa-D/Z, the Tuebingen Treebank of Written German (http://www.sfs.uni-tuebingen.de/en_tuebadz.shtml)
- Penn (http://www.cis.upenn.edu/~treebank)
- English Dependency Treebank (http://www.cis.upenn.edu/~creswell/dependency/)
- British Component of the International Corpus of English (ICE-GB; http://www.ucl.ac.uk/english-usage/projects/ice-gb)

Purpose of Treebanks
Treebanks can be used
- to investigate linguistic theories
- to study syntactic phenomena
- for training or testing parsers


Purpose of Treebanks
Example problem: we want to investigate how often the order verb/subject occurs in German or English sentences (yes/no questions).
What do we do? We search a treebank. But what do we have to keep in mind?

Treebanks
- Treebanks can be large: several tens of thousands of trees are no exception.
- Searching manually is not feasible, so a query tool is necessary.
- The query tool needs high expressive power. Reason: small answer sets.

Overview
- MonaSearch: query tool for linguistic treebanks
- Query language: monadic second order logic (MSO)
- [Diagram: MSO query, MonaSearch, TA, treebank]
- Queries are compiled into tree automata (TA).
- Each tree of the treebank is checked for whether the TA of the query accepts it.

Part 2: Query Language: Monadic Second Order Logic (MSO)


Monadic Second Order Logic MSO
- Here used as query language
- Decidable over trees
- Extension of first order predicate logic by set variables
- Set variables can be quantified over; they represent (finite) sets of nodes
Example with a set variable P:
- $\exists P(\forall x(x \notin P))$: There is an empty set.
- $\forall P(\exists x(x \in P))$: There is no empty set.

MSO Examples
- For every predicate: the predicate holds for Socrates or it does not hold for Socrates.
  $\forall P((\mathrm{Socrates} \in P) \vee (\mathrm{Socrates} \notin P))$
- Peano's axiom of induction for the natural numbers: for all predicates P of arity 1, if P holds for 0 and, whenever it holds for an element x, it also holds for the successor x' of x, then P holds for all natural numbers.
  $\forall P(((0 \in P) \wedge \forall x((x \in P) \rightarrow (x' \in P))) \rightarrow \forall x(x \in P))$


Monadic Second Order Logic MSO
- Monadic: remember that the arity of the quantified predicates is 1.
- Counterexample (a predicate of arity 3): $\exists P\,\exists x\,\exists y\,\exists z\,(Pxyz)$
Why MSO and not simply FOL?
- Necessary expressive power
- Example: the ability to express the transitive closure of any binary relation that is definable in this language


Monadic Second Order Logic MSO
Queries for sentences ending in a sequence of prepositional phrases:
- embedded: ...[pp ...[pp ...[pp ...[pp ...]]]]
  The dog buried the bone [pp behind the tree [pp in the garden [pp in front of the house [pp at the end of the street ]]]].
  The extension cannot be expressed in FOL. Why?
- independent: ...[pp ...][pp ...][pp ...][pp ...]
  The dog buried the bone [pp with his paws ] [pp under a stone ] [pp behind the tree ] [pp in the afternoon ].
Consider extensions to PP sequences of arbitrary length. Can both queries still be formulated in FOL? Try it! Which query cannot be expressed in FOL?

Monadic Second Order Logic MSO
independent: ...[pp ...][pp ...][pp ...][pp ...]
The dog buried the bone [pp with his paws ] [pp under a stone ] [pp behind the tree ] [pp in the afternoon ] ... extension ...
[Figure: syntax tree of "The dog buried the bone with his paws under a stone ..." with node labels S, NP, VP, D, N, V, NP, D, N, PP, PP, ..., PP]
FOL x: Px


Monadic Second Order Logic MSO
embedded: ...[pp ...[pp ...[pp ...[pp ...]]]]
The dog buried the bone [pp behind the tree [pp in the garden [pp in front of the house [pp at the end of the street ... extension ... ]]]].
[Figure: syntax tree of "The dog buried the bone behind the tree in the garden ..." with nested PPs]
- Transitivity cannot be expressed in FOL!
- But it can in MSO! Remember Peano's axiom: $\forall P(((0 \in P) \wedge \forall x((x \in P) \rightarrow (x' \in P))) \rightarrow \forall x(x \in P))$
- Transitive closure of dom(pp,np) gives a new relation; the query asks for the highest PP and a deeply embedded NP in this relation.
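As a worked illustration of the point above, the reflexive transitive closure of a binary relation can be written with one universally quantified set variable, in the same style as the induction axiom. This is the standard textbook construction, not a formula taken from the slides; the names $R$, $R^{*}$, $u$ and $v$ are chosen here for illustration only.

```latex
% R^*(x,y): every set X that contains x and is closed under R also contains y.
R^{*}(x,y) \;\equiv\; \forall X\,\Big(
    \big(x \in X \;\wedge\; \forall u\,\forall v\,((u \in X \wedge R(u,v)) \rightarrow v \in X)\big)
    \rightarrow y \in X \Big)
```

Reading $R$ as the one-step relation between a PP and the NP it directly contains, $R^{*}$ relates the highest PP to every NP embedded at any depth below it, which is what the embedded query needs.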

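[Figure: tree for the embedded example, in bracket notation]
[S [NP [D The] [N dog]] [VP [V buried] [NP [D the] [N bone] [PP [P behind] [NP [D the] [N tree] [PP [P in] [NP [D the] [N garden] [PP ... [NP ...]]]]]]]]]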

WS2S Logic
- WS2S: weak monadic second order theory of 2 successors
- Obtained from WS1S by using two successors (left and right) instead of one (+1)
- WS1S: interpretations correspond to strings; a first order variable is interpreted as a natural number
- WS2S: interpretations correspond to finite labeled trees; a first order variable is interpreted as a position in the infinite binary tree
- MonaSearch, or rather Mona, can run either in linear mode or in tree mode

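WS2S example
Prefix ordering (here restricted to $k = 2$ successors):
$x \leq y \equiv \forall X.\Big(\big(y \in X \wedge \forall z.\big(\big(\bigvee_{i=1}^{k} zi \in X\big) \rightarrow z \in X\big)\big) \rightarrow x \in X\Big)$
- $y \in X$: X contains y
- $\forall z.(\ldots)$: X is closed under predecessor
- In words: every set that contains y and is closed under predecessor contains x.
Note: $\leq$ can be expressed by WSkS formulas, so it can actually be removed.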
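The set-based definition of the prefix ordering can be checked by brute force on a small part of the binary tree. The sketch below is only an illustration under stated assumptions: positions are written as strings over "0" and "1", the universe is cut off at depth 2 (WS2S itself ranges over the infinite binary tree), and all function names are made up here.

```python
from itertools import combinations


def positions(max_depth):
    """All positions of the binary tree up to max_depth, as strings over '0'/'1'."""
    level, out = [""], [""]
    for _ in range(max_depth):
        level = [p + s for p in level for s in "01"]
        out.extend(level)
    return out


def all_subsets(items):
    """Every subset of items (the finite sets X quantified over)."""
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def closed_under_predecessor(X, universe):
    """For all z: if z0 or z1 is in X, then z is in X."""
    return all(z in X for z in universe if (z + "0") in X or (z + "1") in X)


def leq_by_definition(x, y, universe):
    """x <= y iff every set containing y and closed under predecessor contains x."""
    return all(
        x in X
        for X in all_subsets(universe)
        if y in X and closed_under_predecessor(X, universe)
    )


UNIVERSE = positions(2)  # 7 positions, hence 128 candidate sets
agrees_with_prefix = all(
    leq_by_definition(x, y, UNIVERSE) == y.startswith(x)
    for x in UNIVERSE
    for y in UNIVERSE
)
```

On this finite universe the definition coincides with the plain string prefix test, which is the intended meaning of the ordering.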

Our Example in MSO
Example problem: we want to investigate how often the order verb/subject occurs in German or English sentences (yes/no questions).
$\exists x,y,z\,(\mathrm{cat}(x)=\mathrm{SIMPX} \wedge \mathrm{cat}(y)=\mathrm{VXFIN} \wedge \mathrm{fct}(z)=\mathrm{ON} \wedge x \triangleleft^{+} y \wedge x \triangleleft^{+} z \wedge y < z)$
There exists a node with category SIMPX, a node with category VXFIN and a node with function ON...
ON codes the grammatical function subject in TüBa-D/Z.

MSO in the Querying process
Strong connection between MSO and tree automata:
- MSO formula <-> bottom-up tree automaton that accepts the set of corresponding trees
- MSO formula -> automaton: by an algorithm

MSO in the Querying process
General evaluation strategy of MonaSearch:
- Step 1: Convert the user query into a TA
- Step 2: Run the TA on each tree in the treebank
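Step 2 of this strategy can be sketched with a tiny bottom-up tree automaton over binary trees. This is an illustrative toy, not MonaSearch's or MONA's implementation: the classes `Node` and `BottomUpTA`, the function `search` and the hand-written automaton `contains_pp` are assumptions made for the example, whereas in MonaSearch the automaton is produced by compiling the MSO query.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Hashable, List, Optional, Sequence


@dataclass(frozen=True)
class Node:
    """Binary tree node; a missing child is None."""
    label: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


@dataclass(frozen=True)
class BottomUpTA:
    """Deterministic bottom-up tree automaton (hypothetical minimal form)."""
    empty_state: Hashable                                  # state of a missing child
    delta: Callable[[Hashable, Hashable, str], Hashable]   # (left, right, label) -> state
    accepting: FrozenSet[Hashable]

    def run(self, node: Optional[Node]) -> Hashable:
        # States are computed from the leaves upwards.
        if node is None:
            return self.empty_state
        return self.delta(self.run(node.left), self.run(node.right), node.label)

    def accepts(self, tree: Node) -> bool:
        return self.run(tree) in self.accepting


def search(treebank: Sequence[Node], automaton: BottomUpTA) -> List[int]:
    """Run the automaton on each tree and return the indices of accepted trees."""
    return [i for i, tree in enumerate(treebank) if automaton.accepts(tree)]


# Hand-written automaton for the query "the tree contains a node labelled PP".
contains_pp = BottomUpTA(
    empty_state=False,
    delta=lambda left, right, label: left or right or label == "PP",
    accepting=frozenset({True}),
)

treebank = [
    Node("S", Node("NP", None, Node("VP", Node("V", None, Node("PP"))))),
    Node("S", Node("NP", None, Node("VP"))),
]
hits = search(treebank, contains_pp)
```

Because the automaton is built once and each tree is visited once, the cost of a query is one compilation plus a linear pass over the treebank.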

Part 3: Query Tool: Mona and MonaSearch

What is MonaSearch?
MonaSearch
- query tool for linguistic treebanks
- query language: MSO
- uses Mona

What is MonaSearch?
Mona
- tree automata toolkit that compiles MSO formulas into TA
- developed for hardware verification, but also applicable to querying treebanks
- pure monadic second order logic of two successors, or of one successor
- no extensions
- only binary trees
- no node labels


Strategy to employ MONA to query treebanks
[Diagram]
- MSO formula -> Mona formula compiler -> special variant of TA; the representation of the compiled automaton is written into a file
- Treebank trees -> transformation -> transformed trees
- Library function: check each tree; output if the formula is satisfied
How to do this transformation?


MonaSearch Precompilation of Treebanks
Problems of arbitrary trees: disconnected subparts
[Figure: nodes root, a, b, c, d]
Solution: simplified structures
- integrate disconnected subparts by introducing a new virtual root
- connect the disconnected subparts to this super root
[Figure: nodes super root, root, d, a, b, c]

MonaSearch Precompilation of Treebanks
Example tree of the TIGER corpus; VROOT is the virtual root
[Figure: tree for "Damit sei jedoch nicht zu rechnen :". Labels: VROOT, S, OC, VP, OP, MO, NG, VZ, PM; POS tags: PROAV, VAFIN, ADV, PTKNEG, PTKZU, VVINF, $.; morphology: 3.Sg.Pres.Konj.; lemmas: damit, sein, jedoch, nicht, zu, rechnen, :]


MonaSearch Precompilation of Treebanks
Problems of arbitrary trees: crossing edges
[Figure: nodes x, y, a, b with crossing edges]
Solution: simplified structures
- ignore crossing edges
- take only the order of the children as seen by their parents into account
[Figure: nodes x, y, b, a without crossing edges]

MonaSearch Precompilation of Treebanks
[Figure: example of crossing edges]

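MonaSearch Precompilation of Treebanks
Problems of arbitrary trees: secondary relations
[Figure: nodes x, y, z, a, b with a secondary relation]
Solution: simplified structures, i.e. ignore secondary relations
[Figure: nodes x, y, z, a, b without the secondary relation]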


MonaSearch Precompilation of Treebanks
Tasks of precompilation:
1. Trees with arbitrary branching have to be transformed into binary trees.
2. Linguistic labels have to be taken care of.
How can this be done?
- Use the First Child Next Sibling encoding.
- Edge labels are moved down to the node below the edge.
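The second task, moving edge labels down, might look roughly like the following sketch. The data structures `EdgeTree` and `LabeledNode`, the function `push_labels_down` and the placeholder label "-" for a node without an incoming edge are assumptions for illustration; the category and function names (MF, NX, ADVX, ADJX, ON, MOD, PRED) are taken from the TüBa-D/Z example in the slides, but the concrete subtree is only a plausible fragment.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class EdgeTree:
    """Input node: a category plus outgoing edges, each carrying a function label."""
    cat: str
    edges: List[Tuple[str, "EdgeTree"]] = field(default_factory=list)


@dataclass
class LabeledNode:
    """Output node: the incoming edge label is stored on the node itself."""
    cat: str
    fct: str
    children: List["LabeledNode"] = field(default_factory=list)


def push_labels_down(node: EdgeTree, incoming: str = "-") -> LabeledNode:
    """Copy the tree, moving each edge label onto the node below the edge."""
    return LabeledNode(
        cat=node.cat,
        fct=incoming,
        children=[push_labels_down(child, label) for label, child in node.edges],
    )


# Assumed fragment: MF with three labelled edges.
mf = EdgeTree("MF", [("ON", EdgeTree("NX")), ("MOD", EdgeTree("ADVX")), ("PRED", EdgeTree("ADJX"))])
result = push_labels_down(mf)
pairs = [(child.cat, child.fct) for child in result.children]
```

After this step every node carries both its category and its grammatical function, so predicates such as cat(x) and fct(x) can both be treated as node labels.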

MonaSearch Precompilation of Treebanks
Example tree of TüBa-D/Z
[Figure: tree for "Oder ist Bremerhaven nicht günstiger?". Node categories: SIMPX, KOORD, LK, MF, VXFIN, NX, ADVX, ADJX; edge labels: ON, PRED, MOD; POS tags: KON, VAFIN, NE, PTKNEG, ADJD, $.]

First Child Next Sibling Encoding
Node x in the original tree -> node x' in the binary tree
- If x has any children, call its leftmost child y; then y' becomes the left child of x'.
- If x has any right siblings, call the leftmost one z; then z' becomes the right child of x'.
[Figure: x with first child y and right sibling z, next to x' with left child y' and right child z']
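The two rules above translate almost directly into a recursive function. This is a minimal sketch with hypothetical class and function names (`Tree`, `BinNode`, `fcns`), not the precompilation code of MonaSearch.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Sequence


@dataclass
class Tree:
    """Node of the original, arbitrarily branching tree."""
    label: str
    children: List["Tree"] = field(default_factory=list)


@dataclass
class BinNode:
    """Node of the binary encoding."""
    label: str
    left: Optional["BinNode"] = None    # encodes the first child
    right: Optional["BinNode"] = None   # encodes the next sibling


def fcns(node: Tree, right_siblings: Sequence[Tree] = ()) -> BinNode:
    """First Child Next Sibling encoding of node, given its siblings to the right."""
    encoded = BinNode(node.label)
    if node.children:
        # Leftmost child y becomes the left child; its siblings follow on its right branch.
        encoded.left = fcns(node.children[0], node.children[1:])
    if right_siblings:
        # Leftmost right sibling z becomes the right child.
        encoded.right = fcns(right_siblings[0], right_siblings[1:])
    return encoded


# x has the children y and z; y has one child u.
original = Tree("x", [Tree("y", [Tree("u")]), Tree("z")])
binary = fcns(original)
```

In the result, `binary.left` is y', `binary.left.right` is z' and `binary.left.left` is u'; the root has no right child because it has no siblings.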

Moving down of Edge Labels
[Figure: the TüBa-D/Z example tree for "Oder ist Bremerhaven nicht günstiger?" before the transformation, with the edge labels ON, PRED, MOD on the edges]

Moving down of Edge Labels
[Figure: the same tree after the transformation; the edge labels sit on the nodes below the edges: NX with ON, ADVX with MOD, ADJX with PRED]

Disconnected Subtrees
[Figure: the same tree; the punctuation node $. ("?") is not connected to the SIMPX tree]

Virtual Root
[Figure: a virtual root is added above SIMPX and the punctuation node $. ("?")]

First Child Next Sibling Encoding
[Figure sequence: the example tree "Oder ist Bremerhaven nicht günstiger?" under the virtual root is rearranged step by step into its First Child Next Sibling encoding; the last frame shows the resulting binary coding]

Steps of Querying in Detail
1. Getting an MSO query $\varphi$ on the original treebank from the user
2. Translating the query $\varphi$ into a MONA formula $\varphi'$ on binary trees
3. Compiling the MONA formula into a MONA tree automaton
4. For each tree of the precompiled treebank:
   1. preparing the tree for the query
   2. running the automaton on the translated tree, noting whether it is accepted or not
5. Presenting the results to the user

Querying
- Most constructs (boolean connectives, quantification, some atomic relations): the MONA counterpart can be taken over directly.
- Problem: relations that say something about the shape of the tree, e.g. dominance and precedence.
- Solution: auxiliary predicates on the binary tree, defined in the MONA language:
  - dom(x,y): x dominates y in the binary tree
  - right_branch(x,y): y lies on the branch of right children starting at x

Querying: Translation of the parent, precedence and dominance relations
Relation     Formula                  Translation
parenthood   $x \triangleleft y$      ex1 z: x.0 = z & right_branch(z,y)
precedence   $x < y$                  ex1 z: (x = z | dom(z.0,x)) & dom(z.1,y)
dominance    $x \triangleleft^{+} y$  ex1 z: x.0 = z & dom(z,y)
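The three translations can be sanity-checked on a small tree by computing each relation twice: once on the original tree and once through the binary-tree predicates. The sketch below rests on assumptions not stated in the slides: nodes of the binary tree are addressed by strings over "0" (first child) and "1" (next sibling), and dom and right_branch are read as reflexive, which is what makes the translations line up. All function names are hypothetical.

```python
def fcns_positions(tree):
    """Map each node's address in the original tree to its FCNS position string."""
    out = {}

    def visit(node, gorn, pos, later):
        out[gorn] = pos
        _, children = node
        if children:
            visit(children[0], gorn + (0,), pos + "0", children[1:])
        if later:
            visit(later[0], gorn[:-1] + (gorn[-1] + 1,), pos + "1", later[1:])

    visit(tree, (), "", [])
    return out


def dom(x, y):
    """y lies in the binary subtree rooted at x (assumed reflexive)."""
    return y.startswith(x)


def right_branch(x, y):
    """y lies on the branch of right children starting at x (assumed reflexive)."""
    return y.startswith(x) and set(y[len(x):]) <= {"1"}


def parent_t(x, y):
    return right_branch(x + "0", y)      # ex1 z: x.0 = z & right_branch(z,y)


def dominance_t(x, y):
    return dom(x + "0", y)               # ex1 z: x.0 = z & dom(z,y)


def precedence_t(x, y, nodes):
    # ex1 z: (x = z | dom(z.0,x)) & dom(z.1,y)
    return any((x == z or dom(z + "0", x)) and dom(z + "1", y) for z in nodes)


def translations_agree(tree):
    """Compare the translations with the relations computed on the original tree."""
    pos = fcns_positions(tree)
    nodes = set(pos.values())
    for gx, x in pos.items():
        for gy, y in pos.items():
            x_above_y = len(gx) < len(gy) and gy[:len(gx)] == gx
            y_above_x = len(gy) < len(gx) and gx[:len(gy)] == gy
            is_parent = x_above_y and len(gy) == len(gx) + 1
            precedes = gx != gy and not x_above_y and not y_above_x and gx < gy
            if (parent_t(x, y), dominance_t(x, y), precedence_t(x, y, nodes)) != (
                is_parent, x_above_y, precedes
            ):
                return False
    return True


# Trees are (label, children) pairs; this one is a toy sentence structure.
example = ("S", [("NP", [("D", []), ("N", [])]), ("VP", [("V", []), ("NP", [])])])
positions_of_example = fcns_positions(example)
all_agree = translations_agree(example)
```

For instance, the VP node ends up at position "01" (first child of S, then one step to the right), so `parent_t("", "01")` holds because "01" lies on the right branch starting at "0".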

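Querying: possible result of our example
Example problem: we want to investigate how often the order verb/subject occurs in German or English sentences (yes/no questions).
$\exists x,y,z\,(\mathrm{cat}(x)=\mathrm{SIMPX} \wedge \mathrm{cat}(y)=\mathrm{VXFIN} \wedge \mathrm{fct}(z)=\mathrm{ON} \wedge x \triangleleft^{+} y \wedge x \triangleleft^{+} z \wedge y < z)$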

Querying: possible result of our example
Example tree of TüBa-D/Z
[Figure: tree for "Oder ist Bremerhaven nicht günstiger?". Node categories: SIMPX, KOORD, LK, MF, VXFIN, NX, ADVX, ADJX; edge labels: ON, PRED, MOD; POS tags: KON, VAFIN, NE, PTKNEG, ADJD, $.]


Guided Tree Automata
Question: Are normal bottom-up tree automata sufficient for deciding validity and generating counterexamples in WS2S?
Answer: Theoretically yes. But:
- transition tables have an additional dimension compared to string automata -> extra level of complexity
- problem: state space explosions
Mona's solution: a special kind of tree automata, called Guided Tree Automata

Guided Tree Automata
Guide $G = (D, \mu, d_0)$: a top-down deterministic TA whose states are used to designate the state space names of bottom-up TAs
- $D$: finite set of state space names
- $\mu\colon D \to D \times D$: guide function
- $d_0 \in D$: initial state space name
A Guided Tree Automaton (GTA) $M_G$ with guide $G$ is a set of bottom-up tree automata:
$M_G = (\{Q_d\}_{d \in D}, \Sigma, \{\delta_d\}_{d \in D}, \{q_d\}_{d \in D}, F)$
(the guide function is used in this definition)
How to use this guide?

Guided Tree Automata
How to use this guide? Given a tree t, a GTA accepts t as follows:
1. A state space is assigned to every node in t: the tree is labeled top-down with state spaces according to the guide function.
2. Each subtree of the resulting tree is assigned a state, bottom-up.
- A GTA can be seen as an ordinary tree automaton whose state space has been factorized according to the guide.
- A GTA with only one state space is an ordinary tree automaton.

Guided Tree Automata
The guide is defined in the header with the guide construct.
Example:
    guide a -> (b,c), b -> (d,e), c -> (c,c), d -> (d,d), e -> (e,f), f -> (f,f)
- a: initial state space
- a, b, c, d, e: state spaces
- (boolean state space, reserved for bool variables)
Variables can also be restricted to state spaces by universes.
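The top-down labelling step can be illustrated with the guide from the example above. The sketch assumes that tree positions are written as strings over "0" (left successor) and "1" (right successor); the names `GUIDE` and `state_space_at` are made up for this illustration and are not part of MONA.

```python
# Guide function from the example: state space name -> (left, right).
GUIDE = {
    "a": ("b", "c"),
    "b": ("d", "e"),
    "c": ("c", "c"),
    "d": ("d", "d"),
    "e": ("e", "f"),
    "f": ("f", "f"),
}


def state_space_at(position, guide=GUIDE, initial="a"):
    """Follow the guide top-down from the root to the given position."""
    name = initial
    for step in position:
        name = guide[name][int(step)]
    return name


# State spaces of all positions up to depth 2.
labelling = {p: state_space_at(p) for p in ["", "0", "1", "00", "01", "10", "11"]}
```

Each position gets exactly one state space name, so the bottom-up run only ever needs the transition table that belongs to that name instead of one table over the product of all state spaces.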

Example: Exponential Savings with Guides
Without a guide:
    ws2s;
    var2 A, B;
    ex1 p1,p2,p3,p4,p5:
      p1<p2 & p2<p3 & p3<p4 & p4<p5 & A = {p1,p2,p3,p4,p5};
    ex1 p1,p2,p3,p4,p5,p6,p7:
      p1<p2 & p2<p3 & p3<p4 & p4<p5 & p5<p6 & p6<p7 & B = {p1,p2,p3,p4,p5,p6,p7};
With a guide and universes:
    ws2s;
    guide d0 -> (a,b), a -> (a,a), b -> (b,b);
    universe ua:0, ub:1;
    var2 [ua] A;
    var2 [ub] B;
    ex1 [ua] p1,p2,p3,p4,p5:
      p1<p2 & p2<p3 & p3<p4 & p4<p5 & A = {p1,p2,p3,p4,p5};
    ex1 [ub] p1,p2,p3,p4,p5,p6,p7:
      p1<p2 & p2<p3 & p3<p4 & p4<p5 & p5<p6 & p6<p7 & B = {p1,p2,p3,p4,p5,p6,p7};

Performance: Comparison of TIGERSearch, fsq and MonaSearch
Queries:
1. An NP dominating an S dominating a PP
2. An NP dominating an S dominating a PP and an NP, which do not dominate each other
3. Sentences where the verb precedes the subject
4. An NP not dominating an S which dominates a PP
5. A PP dominating an NP which is part of a chain of embedded PPs
Time in seconds. Each cell gives two values; on the slide one is shown in red (TIGER treebank) and the other in green (TüBa-D/Z). A dash means that no value is given.
Query   TIGERSearch   fsq         MonaSearch
1       5.5 / 5.5     23 / 13.5   15 / 10
2       9 / 5.5       23 / 13.5   15 / 10
3       15 / 16       23 / 13.5   15 / 10
4       -             23 / 13.5   15 / 10
5       -             -           15 / 10

Summary
We considered
- Treebanks
- Query language: MSO
- Query tool: MonaSearch

Conclusions
MonaSearch
- very high expressive power through MSO
- high performance of the query engine
- fastest query system for advanced queries


THANK YOU FOR YOUR ATTENTION QUESTIONS?