Tools and resources for Tree Adjoining Grammars
|
|
|
- Bryan Mosley
- 9 years ago
- Views:
Transcription
1 Tools and resources for Tree Adjoining Grammars François Barthélemy, CEDRIC CNAM, 92 Rue St Martin FR Paris Cedex 03 Pierre Boullier, Philippe Deschamp Linda Kaouane, Abdelaziz Khajour Éric Villemonte de la Clergerie ATOLL - INRIA, Domaine de Voluceau - BP 105 FR Le Chesnay Cedex [email protected] Abstract This paper presents a workbench for Tree Adjoining Grammars that we are currently developing. This workbench includes several tools and resources based on the markup language XML, used as a convenient language to format and exchange linguistic resources. 1 Introduction Our primary concern lies in the development of efficient parsers for various grammatical formalisms of interest for Natural Language Processing. Tree Adjoining Grammars [TAG] is one of these formalisms, important from a linguistic point of view but also because it is possible to design efficient parsers. However, during our work on TAG, we were confronted with a lack of standardization of grammars, especially when dealing with wide coverage grammars. The XTAG System 1 (The XTAG Research Group, 1995) provides an implicit standard, but it is not very readable and lacks explicit specifications. The various grammars we studied presented many variations. Moreover, we also noted many problems of consistencies in most of them. Following others, amongst whom LT XML 2 and especially (Bonhomme and Lopez, 2000), we considered that the markup language XML would be a good choice to represent TAG, especially with the possibility of providing an explicit and logical specification via a DTD. Being textual, resources in XML can be read by humans and easily exchanged and maintained. Finally, there exists more and more supports to handle XML resources. We have also found that XML is a convenient language to store linguistic results, such as the shared derivation forests output by our TAG parsers. The paper starts with a brief introduction to TAGs. Section 3 presents the different XML encodings that we have designed for the representation of grammars and derivation forests. Section 4 presents several different maintenance tools we are developing to handle grammars and derivation forests. Section 5 presents servers used to access different kind of informations. Interfaces for these servers are presented in Section 6. 2 Tree Adjoining Grammars The TAG formalism (Joshi, 1987) is particularly suitable to describe many linguistic phenomena. A TAG is given by a set of elementary trees partitioned into initial trees and auxiliary trees. Internal nodes are labeled by non-terminals and leaves by non-terminal or terminals. Each auxiliary tree has a distinguished leaf, called its foot and labeled by a non-terminal, the same as the root node of. Two operations may be used to derive trees from elementary trees. The first one, called itution, replaces a leaf node labeled by a nonterminal by an initial tree whose root is also labeled by. The second operation, called ad-
2 Adjunction node v A A Spine A Root Foot Auxiliary Tree β Adjunction Figure 1: Adjunction junction, is illustrated by Figure 1. An auxiliary tree whose root is labeled by may be adjoined at any node labeled by. The subtree rooted at is grafted to the foot of. Feature TAGs extend TAGs by attaching to n- odes a pair of first-order terms represented by Feature Structures [FS] and called top and bottom arguments. These feature structures may be used, for instance, to handle agreement or enforce semantic restrictions. Lexicalized (Feature) TAGs assumes that each elementary tree has at least one lexical node labeled by a terminal. However, explicit lexicalized grammars would be huge, with one or more elementary trees for each entry in the lexicon. The choice made by the XTAG group and by all the designers of wide coverage TAGs is to factorize the grammars and gives enough information to lexicalize parts of the grammars when needed. Morphological entries (or inflected forms) reference one or more lemma entries, which, in turn, refer to families of tree schema. A tree schema is an elementary tree with a distinguished leaf called anchor node that is to be replaced by a morphological entry. Each reference may be completed by additional constraints. For instance, extracted from a small French grammar, Figure 2 shows the basic elements (morphological entry donne, lemma \DONNER\, and tree schema tn1pn2) used to build the tree tn1pn2(donne) corresponding to the syntactic pattern (1) and illustrated by sentence (2). The lemma part states that the subject NP and the prepositional complement NP must both be human and that NP is introduced by the preposition à (co-anchoring). In the tree tn1pn2, the itution nodes are marked with and the anchor node with. A A (1) quelqu un somebody (2) donne quelque chose à quelqu un gives something to somebody donne un joli livre à Sabine gives a nice book to Sabine donne: \DONNER,\ V {mode=,indnum=sing} \DONNER,\V: tn1pn2[p_2=à] {NP_0.t:restr=+,hum NP_2.t:restr=+hum} NP tn1pn2 S V <>V VP NP 3 XML Encoding P PP NP Figure 2: Tree schema 3.1 Representing grammars We have designed a DTD 4 that clearly specifies the relations between the various components of a grammar. For instance, the following DTD fragment states that a morphological entry is characterized by a field lex and includes zero or more description entries (for documentation) and at least one reference to a lemma (lemmaref). Similarly, an element lemmaref is characterized by the fields name and cat, and may be completed by a FS argument (fs). <!ELEMENT morph (desc*,lemmaref+)> <!ATTLIST morph lex CDATA #REQUIRED> <!ELEMENT lemmaref (fs?)> <!ATTLIST lemmaref name CDATA #REQUIRED cat CDATA #REQUIRED> Following the DTD, the various elements described in Figure 2 may be represented by the (tiny) following XML fragment, omitting the FS specification on nodes for sake of space and clarity. <tag axiom="s"> <morph lex="donne"> <lemmaref cat="v" name="*donner*"> <fs> <f name="mode"> <val>ind</val> <val>subj</val> 4
3 </f> <f name="num"> <val>sing</val> </f> </fs> </lemmaref> </morph> <lemma cat="v" name="*donner*"> <anchor <coanchor node_id="p_2"> <lex>à</lex> </coanchor> <equation node_id="np_0" type="top"> <fs> <f name="restr"> <val>plushum</val> </f> </fs> </equation> <equation node_id="np_2" type="top"> <fs> <f name="restr"> <val>plushum</val> </f> </fs> </equation> </anchor> </lemma> <family name="tn1pn2"> <tree name="tn1pn2"> <node cat="s" adj="yes" type="std"> <node cat="np" id="np_0" type="" /> <node cat="vp" adj="yes" type="std"> <node cat="v" adj="yes" type="anchor" /> <node cat="np" type="" /> <node cat="pp" adj="yes" type="std"> <node cat="p" id="p_2" type=""/> <node cat="np" id="np_2" type=""/> </node> </node> </node> </tree> </family> </tag> A (deterministic) TAG parser may return either the result of the analysis as a parse tree, or the steps of the derivation as a derivation tree. These two alternatives are illustrated for sentence (3) by Figures 3 and 4 (with Figure 5 showing the elementary lexicalized trees). A derivation tree indicates which operation (itution or adjunction of some tree) has taken place on which node of which tree, for instance the adjunction of tree a joli at node labeled N. It is worth noting that the parse tree may be retrieved from the derivation tree, which motivates our interest in derivation trees. (3) NP np() donne gives V S donne D un un a joli livre à Sabine nice book to Sabine NP NP VP Adj joli N N P à livre Figure 3: Parse Tree tn1pn2(donne,à) npdn(livre) PP NP Sabine np(sabine) Currently, we have encoded a small French grammar (50 tree schemata, 117 lemmas and 345 morphological entries) and an English grammar (456 tree schemata, 333 lemmas and 507 morphological entries). We are processing some other larger grammars (for both English and French). 3.2 Encoding derivations!"$#%'& d(un) adj a(joli) Figure 4: Derivation Tree )!*"!"$#%(& )!" )!",+!*"$#%(& + Figure 6: Organization of a derivation forest In case of ambiguity (frequent in NLP), several or even an unbounded number of derivation trees may actually be compacted into a shared derivation forest, equivalent to a Context-Free Grammar (Lang, 1991). This remark has guided the design
4 S tn1pn2(donne,à) NP VP V NP PP NP np() donne NP npdn(livre) D N P à NP NP np(sabine) - D d(un) a(joli) N un Adj N joli livre Sabine Figure 5: Elementary trees of a DTD 5 to encode shared derivation forests. This DTD introduces the primary elements op, deriv, and node as well as an element opref used to reference elements op. The logical organization of these elements is sketched in Figure 6. More precisely: op, identified by its attribute id, denotes either an operation of itution or of adjunction (attribute type) on some syntactic category (attribute cat) for some span of the input string (attribute span). Sub-elements of op may also be present to specify the feature values associated to the operation. deriv details a possible derivation for some operation, based on some lexicalized tree given by a tree schema (attribute tree) and an anchor (anchor). node specifies which operation op has been performed on some node (attribute node_id) of an elementary tree during a derivation. A derivation tree may be expressed in a nested way using only elements op, deriv, and node. A shared forest will require the use of opref to denote multiple occurrences of a same operation. 5 The above derivation tree may be represented by the following XML fragment (omitting information about the feature structures). <forest parser="light DyALog"> <sentence> donne un joli livre à Sabine </sentence> <op cat="s" span="0 7" id="1" type=""> <deriv tree="tn1pn2" anchor="donne"> <node id="p_2"><opref ref="5" /></node> <node id="np_0"><opref ref="2" /></node> <node id="1"><opref ref="4" /></node> <node id="np_2"><opref ref="6" /></node> </deriv> <op cat="np" span="0 1" id="2" type=""> <deriv tree="np" anchor="" /> <op cat="np" span="2 5" id="4" type=""> <deriv tree="npdn" anchor="livre"> <node id="n_"><opref ref="10" /></node> <node id="0"><opref ref="8" /></node> </deriv> <op cat="p" span="5 6" id="5" type=""> <deriv tree="p" anchor="à" /> <op cat="np" span="6 7" id="6" type=""> <deriv tree="np" anchor="sabine" /> <op cat="d" span="2 3" id="8" type=""> <deriv tree="d" anchor="un" /> <op cat="n" span=" " id="10" type="adj"> <deriv tree="an" anchor="joli" /> </forest> 4 Maintenance tools 4.1 For the grammars The XML encoding of grammars is convenient for maintenance and exchange. However, it does
5 not correspond to the input formats expected by the two parser compilers we develop. One of them (DyALog) expects a prolog-like representation of the grammars (Alonso Pardo et al., 2000) while the second one expects Range Concatenation Grammars [RCG] (Boullier, 2000). LP RCG FOREST XTAG Tree Dep HTML XML Analyzer Strip Checker Figure 8: Maintenance Tools for the forests XTAG XML RCG TAG LP LaTeX SQL Figure 7: Maintenance Tools for the grammars Therefore, we have developed in Perl a set of maintenance modules, for these conversions and for other tasks (Figure 7). The central module TAG implements an object-oriented view of the logical structure specified by the Grammar DTD. The other modules add new methods to the classes introduced by TAG. Besides the conversion modules LP and RCG, we also have a read/write XML module. Module Checker is used to check (partially) the coherence of the grammar and to produce some s- tatistics. Module Analyzer extracts information needed for the compilation by the DyALog system. Module Strip deletes all information relative to feature structures from the grammar. Module SQL may be used to store to and load from a SQL database. Our choice of Perl has been motivated by the availability from archive sites of many Perl modules to handle XML resources or database access. Moreover, the development of a Perl module is fast (for a prototype), generally only a few hours. For instance, we have realized a prototype module LaTeX, useful to build the documentation of a grammar. We are also thinking of an HTML module to build online versions. 4.2 For the derivation forests Similarly, we have also developed a set of modules to handle derivation forests (Fig. 8) with a central module FOREST and conversion modules. Modules LP, RCG, and XTAG read the output formats of the derivation forests produced by our parsers and by the TAG parser 6. The forests can then be emitted in XML format (module XM- L), in HTML format (module HTML), as trees (module Tree) or as dependency graphs (module Dep). Other modules should be added in the future, such as SQL module to read to and to write from a database used as a derivation tree-bank, a Strip module to remove features, or different filtering modules to extract subsets of a forest. 5 Servers Exploiting some of these modules, but also other components developed in Java, we are installing several servers to access different kinds of information (parsers, grammars, forests) in uniform and distributed ways. 5.1 A server of parsers We are exploring several ways to build efficient parsers for TAGs (Éric Villemonte de la Clergerie and Alonso Pardo, 1998; Alonso Pardo et al., 2000; Éric Villemonte de la Clergerie, 2001; Boullier, 2000; Barthélemy et al., 2001; Barthélemy et al., 2000), which leads us to maintain a growing set of parsers. Moreover, we wish to be able to compare the output produced by these parsers to check soundness, completeness and level of sharing in the derivation forests. To achieve these objectives, we provide a uniform setting by installing a simple server of parsers, written in Perl. Once connected to this server, one 6
6 selects a parser and sends a sentence to parse; the server returns the shared derivation forest in raw, HTML, XML, Tree or Dep formats. A WEB front-end 7 may be used to connect to this server. Figures 9 and 10 show two views of a derivation forest built using the server. Another WEB front-end 8 allows the direct submission to the server of a derivation forest in one of the 3 recognized input formats (LP, RCG, XTAG). Submission in XML format should be added soon. np 0 1 np() np_0 s 0 7 tn1pn2(donne) np#1 np 2 5 npdn(livre) p_2 p 5 6 p(a) np_2 np 6 7 np(sabine) phase may be performed for (almost) any kind of XML documents. The second main component of the server is a small query language used to fetch information from the database while hiding, for non specialists, the complexity of languages SQL, XQL or XPath. We have chosen an object oriented notation which, once again, reflects the structure of the TAG DTD and which is also close to path equations familiar to computational linguists in, for instance, HPSG. We have several types such as family, morph, tree or node corresponding to the different kinds of elements of the DTD. For each type, several methods are available. For example, the following query returns the name and the Database Id (DBId) of all trees belonging to family tn1pn2. A second kind of requests takes a DBId. and returns the full XML fragment associated to the XML element whose index in the database is.. d 2 3 d(un) d#0 n_ n an(joli) Figure 9: Derivation tree 5.2 A server of grammars Because of the size of wide-coverage grammars, we believe that working with grammars stored in files is no longer a valid option (except for exchanging them). Databases should be used instead where bits of information can be efficiently retrieved. Moreover, modern database managers provide server fonctionalities and security mechanisms. Around these ideas, we are currently developing in Java a server of grammars. First the grammars are loaded into a SQL database (namely MySQL 9 ). It should be noted that the structure of the database reflects the XML structure for grammars and not directly the structure of the grammars. This means that the loading var tree t; select t.name; t; where t.family.name = tn1pn2 end The grammar server works as a Java servlet, integrated in a HTTP server Apache using JServ 10. It may be accessed using URL with parameters (as done for CGI scripts). The server decodes the parameters and transforms the query into a SQL query send to the database corresponding to the selected grammar. The result is either a table encoded in XML format or an XML fragment of the grammar. A small WEB interface 11 is available to display the results in a navigator (by transforming them into HTML) but it is also possible to get the results as an XML file. Tools can therefore query the server by sending a URL and getting back the results in XML. The server should be soon completed for edition tasks. Full deletion of an element and of its descendants may be achieved using DBIds. Addition can be achieved by sending an XML fragment and a DBId (stating where to attach the XM- L fragment)
7 a donne p_2 p Sabine np np_0 tn1pn2 np_2 np np#1 un livre d joli d#0 n_ npdn an Figure 10: Dependency view 5.3 A server of derivation forests The development of a server of derivation forests has just started, along the same lines we followed for the server of grammars. Such a server will be an alternative to treebanks. Two main functionalities are planned for this server. First the possibility to add a derivation forest to the underlying database (in order to build corpora of derivations) and second, ways to access these derivations through a simple but powerful query language. 6 Interfaces We have already mentioned WEB interfaces to the parser server and the grammar server. Besides these interfaces, we (Kaouane, 2000) have also modified and enriched a Java interface developed by Patrice Lopez (Lopez, 2000). The new version can import grammars and derivation forests that follow our XML DTD. It can also use the server of parsers, sending the sentence to parse and receiving the shared derivation forest in XML format. The derivations are extracted from the derivation forests and, for each derivation, the tool displays both the derivation tree and the corresponding parse tree (see Figure 11). It is also possible to follow step by step a derivation by moving forward or backward. We found this functionality useful to understand or explain to students the workouts of TAG. The viewer may also be used to browse the d- ifferent components of a grammar (trees and lexicons), therefore helping in its maintenance. This tool already exploits the parser server, but we also plan to extend it to exploit the grammar server (for browsing the grammars or displaying a derived tree) and the forest server (for accessing derivation forests). Conclusion The experiments we have performed have strengthened our opinion that XML is really adequate to maintain and exchange linguistic resources and that XML allows us to quickly develop tools to handle these resources. Most of the components presented in this paper have been developed over a short period of time, and, while still preliminary, are fully usable. We believe that XML and these tools gives us solid foundations to further develop a complete environment to handle TAGs, based on many simple and easyto-maintain tools (instead of having a monolithic system). We also think that the availability of such tools or resources may prove useful for linguists (to develop grammars with browsing and maintenance tools), for parser developers (to browse grammars and derivations), and for students in computational linguistics (to understand Tree Adjoining Grammars). The tools and resources that we develop are freely available. Tools are based on a modular architecture with a specification given by the DTD and we hope that new components will be added by other people. References Miguel Alonso Pardo, Djamé Seddah, and Éric Villemonte de la Clergerie Practical aspects in compiling tabular TAG parsers. In Proceedings of the / th International Workshop on Tree Adjoining Grammars and Related Formalisms (TAG+5), pages 27 32, Université Paris 7, Jussieu, Paris, France, May.
8 Figure 11: Screen capture of the Java derivation viewer F. Barthélemy, P. Boullier, Ph. Deschamp, and É. Villemonte de la Clergerie Guided parsing of range concatenation languages. In Proceedings of the 39th Annual Meeting of the Association for Computational Linguistics (ACL 01), University of Toulouse, France, July. to be published. Patrick Bonhomme and Patrice Lopez TagML: XML encoding of resources for lexicalized tree adjoining grammars. In Proceedings of LREC 2000, Athens. Pierre Boullier Range concatenation grammars. In Proceedings of the 0 th International Workshop on Parsing Technologies (IWPT2000), pages 53 64, Trento, Italy, February. see also Rapport de recherche n1 3342, online at fr/rrrt/rr-3342.html, INRIA, France, January 1998, 41 pages. Aravind K. Joshi An introduction to tree adjoining grammars. In Alexis Manaster- Ramer, editor, Mathematics of Language, pages John Benjamins Publishing Co., Amsterdam/Philadelphia. Linda Kaouane Adaptation et utilisation d un environnement graphique pour les TAG au dessus du système DyALog. Mémoire de DEA, Université d Orléans. Bernard Lang Towards a uniform formal framework for parsing. In Masaru Tomita, editor, Current issues in Parsing Technology, chapter 11. Kluwer Academic Publishers. Also appeared in the Proceedings of International Workshop on Parsing Technologies IWPT89. Patrice Lopez LTAG workbench: A general framework for LTAG. In Proceedings of the / th International Workshop on Tree Adjoining Grammars and Related Formalisms (TAG+5), Paris, May. Éric Villemonte de la Clergerie and Miguel Alonso Pardo A tabular interpretation of a class of 2-stack automata. In Proceedings of ACL/COLING 98, August. online at ftp://ftp.inria.fr/inria/projects/ Atoll/Eric.Clergerie/SD2SA.ps.gz. Éric Villemonte de la Clergerie Refining tabular parsers for TAGs. In Proceedings of NAACL 01, June. to be published. The XTAG Research Group A lexicalized tree adjoining grammar for English. Technical Report IRCS 95-03, Institute for Research in Cognitive Science, University of Pennsylvania, Philadelphia, PA, USA, March.
Automatic Text Analysis Using Drupal
Automatic Text Analysis Using Drupal By Herman Chai Computer Engineering California Polytechnic State University, San Luis Obispo Advised by Dr. Foaad Khosmood June 14, 2013 Abstract Natural language processing
Managing large sound databases using Mpeg7
Max Jacob 1 1 Institut de Recherche et Coordination Acoustique/Musique (IRCAM), place Igor Stravinsky 1, 75003, Paris, France Correspondence should be addressed to Max Jacob ([email protected]) ABSTRACT
A Workbench for Prototyping XML Data Exchange (extended abstract)
A Workbench for Prototyping XML Data Exchange (extended abstract) Renzo Orsini and Augusto Celentano Università Ca Foscari di Venezia, Dipartimento di Informatica via Torino 155, 30172 Mestre (VE), Italy
XML: extensible Markup Language. Anabel Fraga
XML: extensible Markup Language Anabel Fraga Table of Contents Historic Introduction XML vs. HTML XML Characteristics HTML Document XML Document XML General Rules Well Formed and Valid Documents Elements
Natural Language to Relational Query by Using Parsing Compiler
Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 4, Issue. 3, March 2015,
Interactive Dynamic Information Extraction
Interactive Dynamic Information Extraction Kathrin Eichler, Holmer Hemsen, Markus Löckelt, Günter Neumann, and Norbert Reithinger Deutsches Forschungszentrum für Künstliche Intelligenz - DFKI, 66123 Saarbrücken
Motivation. Korpus-Abfrage: Werkzeuge und Sprachen. Overview. Languages of Corpus Query. SARA Query Possibilities 1
Korpus-Abfrage: Werkzeuge und Sprachen Gastreferat zur Vorlesung Korpuslinguistik mit und für Computerlinguistik Charlotte Merz 3. Dezember 2002 Motivation Lizentiatsarbeit: A Corpus Query Tool for Automatically
Introduction to XML Applications
EMC White Paper Introduction to XML Applications Umair Nauman Abstract: This document provides an overview of XML Applications. This is not a comprehensive guide to XML Applications and is intended for
Modern Databases. Database Systems Lecture 18 Natasha Alechina
Modern Databases Database Systems Lecture 18 Natasha Alechina In This Lecture Distributed DBs Web-based DBs Object Oriented DBs Semistructured Data and XML Multimedia DBs For more information Connolly
10CS73:Web Programming
10CS73:Web Programming Question Bank Fundamentals of Web: 1.What is WWW? 2. What are domain names? Explain domain name conversion with diagram 3.What are the difference between web browser and web server
A Plan for the Continued Development of the DNS Statistics Collector
A Plan for the Continued Development of the DNS Statistics Collector Background The DNS Statistics Collector ( DSC ) software was initially developed under the National Science Foundation grant "Improving
Generating SQL Queries Using Natural Language Syntactic Dependencies and Metadata
Generating SQL Queries Using Natural Language Syntactic Dependencies and Metadata Alessandra Giordani and Alessandro Moschitti Department of Computer Science and Engineering University of Trento Via Sommarive
Outline of today s lecture
Outline of today s lecture Generative grammar Simple context free grammars Probabilistic CFGs Formalism power requirements Parsing Modelling syntactic structure of phrases and sentences. Why is it useful?
Chapter 1: Introduction
Chapter 1: Introduction Database System Concepts, 5th Ed. See www.db book.com for conditions on re use Chapter 1: Introduction Purpose of Database Systems View of Data Database Languages Relational Databases
Quiz! Database Indexes. Index. Quiz! Disc and main memory. Quiz! How costly is this operation (naive solution)?
Database Indexes How costly is this operation (naive solution)? course per weekday hour room TDA356 2 VR Monday 13:15 TDA356 2 VR Thursday 08:00 TDA356 4 HB1 Tuesday 08:00 TDA356 4 HB1 Friday 13:15 TIN090
Semantic annotation of requirements for automatic UML class diagram generation
www.ijcsi.org 259 Semantic annotation of requirements for automatic UML class diagram generation Soumaya Amdouni 1, Wahiba Ben Abdessalem Karaa 2 and Sondes Bouabid 3 1 University of tunis High Institute
A first step towards modeling semistructured data in hybrid multimodal logic
A first step towards modeling semistructured data in hybrid multimodal logic Nicole Bidoit * Serenella Cerrito ** Virginie Thion * * LRI UMR CNRS 8623, Université Paris 11, Centre d Orsay. ** LaMI UMR
Parsing Technology and its role in Legacy Modernization. A Metaware White Paper
Parsing Technology and its role in Legacy Modernization A Metaware White Paper 1 INTRODUCTION In the two last decades there has been an explosion of interest in software tools that can automate key tasks
ANALEC: a New Tool for the Dynamic Annotation of Textual Data
ANALEC: a New Tool for the Dynamic Annotation of Textual Data Frédéric Landragin, Thierry Poibeau and Bernard Victorri LATTICE-CNRS École Normale Supérieure & Université Paris 3-Sorbonne Nouvelle 1 rue
Learning Translation Rules from Bilingual English Filipino Corpus
Proceedings of PACLIC 19, the 19 th Asia-Pacific Conference on Language, Information and Computation. Learning Translation s from Bilingual English Filipino Corpus Michelle Wendy Tan, Raymond Joseph Ang,
International Journal of Scientific & Engineering Research, Volume 4, Issue 11, November-2013 5 ISSN 2229-5518
International Journal of Scientific & Engineering Research, Volume 4, Issue 11, November-2013 5 INTELLIGENT MULTIDIMENSIONAL DATABASE INTERFACE Mona Gharib Mohamed Reda Zahraa E. Mohamed Faculty of Science,
Search and Information Retrieval
Search and Information Retrieval Search on the Web 1 is a daily activity for many people throughout the world Search and communication are most popular uses of the computer Applications involving search
Open Domain Information Extraction. Günter Neumann, DFKI, 2012
Open Domain Information Extraction Günter Neumann, DFKI, 2012 Improving TextRunner Wu and Weld (2010) Open Information Extraction using Wikipedia, ACL 2010 Fader et al. (2011) Identifying Relations for
NATURAL LANGUAGE QUERY PROCESSING USING PROBABILISTIC CONTEXT FREE GRAMMAR
NATURAL LANGUAGE QUERY PROCESSING USING PROBABILISTIC CONTEXT FREE GRAMMAR Arati K. Deshpande 1 and Prakash. R. Devale 2 1 Student and 2 Professor & Head, Department of Information Technology, Bharati
Semistructured data and XML. Institutt for Informatikk INF3100 09.04.2013 Ahmet Soylu
Semistructured data and XML Institutt for Informatikk 1 Unstructured, Structured and Semistructured data Unstructured data e.g., text documents Structured data: data with a rigid and fixed data format
DEPENDENCY PARSING JOAKIM NIVRE
DEPENDENCY PARSING JOAKIM NIVRE Contents 1. Dependency Trees 1 2. Arc-Factored Models 3 3. Online Learning 3 4. Eisner s Algorithm 4 5. Spanning Tree Parsing 6 References 7 A dependency parser analyzes
VIRTUAL LABORATORY: MULTI-STYLE CODE EDITOR
VIRTUAL LABORATORY: MULTI-STYLE CODE EDITOR Andrey V.Lyamin, State University of IT, Mechanics and Optics St. Petersburg, Russia Oleg E.Vashenkov, State University of IT, Mechanics and Optics, St.Petersburg,
XML Processing and Web Services. Chapter 17
XML Processing and Web Services Chapter 17 Textbook to be published by Pearson Ed 2015 in early Pearson 2014 Fundamentals of http://www.funwebdev.com Web Development Objectives 1 XML Overview 2 XML Processing
D2.4: Two trained semantic decoders for the Appointment Scheduling task
D2.4: Two trained semantic decoders for the Appointment Scheduling task James Henderson, François Mairesse, Lonneke van der Plas, Paola Merlo Distribution: Public CLASSiC Computational Learning in Adaptive
Application of XML Tools for Enterprise-Wide RBAC Implementation Tasks
Application of XML Tools for Enterprise-Wide RBAC Implementation Tasks Ramaswamy Chandramouli National Institute of Standards and Technology Gaithersburg, MD 20899,USA 001-301-975-5013 [email protected]
Database System Concepts
s Design Chapter 1: Introduction Departamento de Engenharia Informática Instituto Superior Técnico 1 st Semester 2008/2009 Slides (fortemente) baseados nos slides oficiais do livro c Silberschatz, Korth
Lecture 9. Semantic Analysis Scoping and Symbol Table
Lecture 9. Semantic Analysis Scoping and Symbol Table Wei Le 2015.10 Outline Semantic analysis Scoping The Role of Symbol Table Implementing a Symbol Table Semantic Analysis Parser builds abstract syntax
Modeling the User Interface of Web Applications with UML
Modeling the User Interface of Web Applications with UML Rolf Hennicker,Nora Koch,2 Institute of Computer Science Ludwig-Maximilians-University Munich Oettingenstr. 67 80538 München, Germany {kochn,hennicke}@informatik.uni-muenchen.de
Natural Language Database Interface for the Community Based Monitoring System *
Natural Language Database Interface for the Community Based Monitoring System * Krissanne Kaye Garcia, Ma. Angelica Lumain, Jose Antonio Wong, Jhovee Gerard Yap, Charibeth Cheng De La Salle University
How the Computer Translates. Svetlana Sokolova President and CEO of PROMT, PhD.
Svetlana Sokolova President and CEO of PROMT, PhD. How the Computer Translates Machine translation is a special field of computer application where almost everyone believes that he/she is a specialist.
Terminology Extraction from Log Files
Terminology Extraction from Log Files Hassan Saneifar, Stéphane Bonniol, Anne Laurent, Pascal Poncelet, Mathieu Roche To cite this version: Hassan Saneifar, Stéphane Bonniol, Anne Laurent, Pascal Poncelet,
Shallow Parsing with Apache UIMA
Shallow Parsing with Apache UIMA Graham Wilcock University of Helsinki Finland [email protected] Abstract Apache UIMA (Unstructured Information Management Architecture) is a framework for linguistic
So today we shall continue our discussion on the search engines and web crawlers. (Refer Slide Time: 01:02)
Internet Technology Prof. Indranil Sengupta Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur Lecture No #39 Search Engines and Web Crawler :: Part 2 So today we
Compiler Construction
Compiler Construction Regular expressions Scanning Görel Hedin Reviderad 2013 01 23.a 2013 Compiler Construction 2013 F02-1 Compiler overview source code lexical analysis tokens intermediate code generation
Module Catalogue for the Bachelor Program in Computational Linguistics at the University of Heidelberg
Module Catalogue for the Bachelor Program in Computational Linguistics at the University of Heidelberg March 1, 2007 The catalogue is organized into sections of (1) obligatory modules ( Basismodule ) that
Enforcing Data Quality Rules for a Synchronized VM Log Audit Environment Using Transformation Mapping Techniques
Enforcing Data Quality Rules for a Synchronized VM Log Audit Environment Using Transformation Mapping Techniques Sean Thorpe 1, Indrajit Ray 2, and Tyrone Grandison 3 1 Faculty of Engineering and Computing,
WebSphere Business Monitor
WebSphere Business Monitor Monitor models 2010 IBM Corporation This presentation should provide an overview of monitor models in WebSphere Business Monitor. WBPM_Monitor_MonitorModels.ppt Page 1 of 25
CINTIL-PropBank. CINTIL-PropBank Sub-corpus id Sentences Tokens Domain Sentences for regression atsts 779 5,654 Test
CINTIL-PropBank I. Basic Information 1.1. Corpus information The CINTIL-PropBank (Branco et al., 2012) is a set of sentences annotated with their constituency structure and semantic role tags, composed
How to Design and Create Your Own Custom Ext Rep
Combinatorial Block Designs 2009-04-15 Outline Project Intro External Representation Design Database System Deployment System Overview Conclusions 1. Since the project is a specific application in Combinatorial
LabVIEW Internet Toolkit User Guide
LabVIEW Internet Toolkit User Guide Version 6.0 Contents The LabVIEW Internet Toolkit provides you with the ability to incorporate Internet capabilities into VIs. You can use LabVIEW to work with XML documents,
XFlash A Web Application Design Framework with Model-Driven Methodology
International Journal of u- and e- Service, Science and Technology 47 XFlash A Web Application Design Framework with Model-Driven Methodology Ronnie Cheung Hong Kong Polytechnic University, Hong Kong SAR,
Moving Enterprise Applications into VoiceXML. May 2002
Moving Enterprise Applications into VoiceXML May 2002 ViaFone Overview ViaFone connects mobile employees to to enterprise systems to to improve overall business performance. Enterprise Application Focus;
What s in a Lexicon. The Lexicon. Lexicon vs. Dictionary. What kind of Information should a Lexicon contain?
What s in a Lexicon What kind of Information should a Lexicon contain? The Lexicon Miriam Butt November 2002 Semantic: information about lexical meaning and relations (thematic roles, selectional restrictions,
Generating XML from Relational Tables using ORACLE. by Selim Mimaroglu Supervisor: Betty O NeilO
Generating XML from Relational Tables using ORACLE by Selim Mimaroglu Supervisor: Betty O NeilO 1 INTRODUCTION Database: : A usually large collection of data, organized specially for rapid search and retrieval
RRSS - Rating Reviews Support System purpose built for movies recommendation
RRSS - Rating Reviews Support System purpose built for movies recommendation Grzegorz Dziczkowski 1,2 and Katarzyna Wegrzyn-Wolska 1 1 Ecole Superieur d Ingenieurs en Informatique et Genie des Telecommunicatiom
Everyday Lessons from Rakudo Architecture. Jonathan Worthington
Everyday Lessons from Rakudo Architecture Jonathan Worthington What do I do? I teach our advanced C#, Git and software architecture courses Sometimes a mentor at various companies in Sweden Core developer
Ficha técnica de curso Código: IFCAD320a
Curso de: Objetivos: LDAP Iniciación y aprendizaje de todo el entorno y filosofía al Protocolo de Acceso a Directorios Ligeros. Conocer su estructura de árbol de almacenamiento. Destinado a: Todos los
Factoring Surface Syntactic Structures
MTT 2003, Paris, 16 18 jui003 Factoring Surface Syntactic Structures Alexis Nasr LATTICE-CNRS (UMR 8094) Université Paris 7 [email protected] Mots-clefs Keywords Syntaxe de surface, représentation
Firewall Builder Architecture Overview
Firewall Builder Architecture Overview Vadim Zaliva Vadim Kurland Abstract This document gives brief, high level overview of existing Firewall Builder architecture.
Last Week. XML (extensible Markup Language) HTML Deficiencies. XML Advantages. Syntax of XML DHTML. Applets. Modifying DOM Event bubbling
XML (extensible Markup Language) Nan Niu ([email protected]) CSC309 -- Fall 2008 DHTML Modifying DOM Event bubbling Applets Last Week 2 HTML Deficiencies Fixed set of tags No standard way to create new
Extensible Markup Language (XML): Essentials for Climatologists
Extensible Markup Language (XML): Essentials for Climatologists Alexander V. Besprozvannykh CCl OPAG 1 Implementation/Coordination Team The purpose of this material is to give basic knowledge about XML
Accelerating and Evaluation of Syntactic Parsing in Natural Language Question Answering Systems
Accelerating and Evaluation of Syntactic Parsing in Natural Language Question Answering Systems cation systems. For example, NLP could be used in Question Answering (QA) systems to understand users natural
Log Analysis Software Architecture
Log Analysis Software Architecture Contents 1 Introduction 1 2 Definitions 2 3 Software goals 2 4 Requirements 2 4.1 User interaction.......................................... 3 4.2 Log file reading..........................................
Open-Source, Cross-Platform Java Tools Working Together on a Dialogue System
Open-Source, Cross-Platform Java Tools Working Together on a Dialogue System Oana NICOLAE Faculty of Mathematics and Computer Science, Department of Computer Science, University of Craiova, Romania [email protected]
A Visual Tagging Technique for Annotating Large-Volume Multimedia Databases
A Visual Tagging Technique for Annotating Large-Volume Multimedia Databases A tool for adding semantic value to improve information filtering (Post Workshop revised version, November 1997) Konstantinos
ONTOLOGIES A short tutorial with references to YAGO Cosmina CROITORU
ONTOLOGIES p. 1/40 ONTOLOGIES A short tutorial with references to YAGO Cosmina CROITORU Unlocking the Secrets of the Past: Text Mining for Historical Documents Blockseminar, 21.2.-11.3.2011 ONTOLOGIES
The Mjølner BETA system
FakePart I FakePartTitle Software development environments CHAPTER 2 The Mjølner BETA system Peter Andersen, Lars Bak, Søren Brandt, Jørgen Lindskov Knudsen, Ole Lehrmann Madsen, Kim Jensen Møller, Claus
TD 271 Rev.1 (PLEN/15)
INTERNATIONAL TELECOMMUNICATION UNION STUDY GROUP 15 TELECOMMUNICATION STANDARDIZATION SECTOR STUDY PERIOD 2009-2012 English only Original: English Question(s): 12/15 Geneva, 31 May - 11 June 2010 Source:
A MEDIATION LAYER FOR HETEROGENEOUS XML SCHEMAS
A MEDIATION LAYER FOR HETEROGENEOUS XML SCHEMAS Abdelsalam Almarimi 1, Jaroslav Pokorny 2 Abstract This paper describes an approach for mediation of heterogeneous XML schemas. Such an approach is proposed
Instant YANG. The Basics. Hakan Millroth, Tail- f Systems (email: hakan@tail- f.com)
Instant YANG Hakan Millroth, Tail- f Systems (email: hakan@tail- f.com) This is a short primer on the NETCONF data modeling language YANG. To learn more about YANG, take a look at the tutorials and examples
Lightweight Data Integration using the WebComposition Data Grid Service
Lightweight Data Integration using the WebComposition Data Grid Service Ralph Sommermeier 1, Andreas Heil 2, Martin Gaedke 1 1 Chemnitz University of Technology, Faculty of Computer Science, Distributed
Surface Realisation using Tree Adjoining Grammar. Application to Computer Aided Language Learning
Surface Realisation using Tree Adjoining Grammar. Application to Computer Aided Language Learning Claire Gardent CNRS / LORIA, Nancy, France (Joint work with Eric Kow and Laura Perez-Beltrachini) March
Parsing Beyond Context-Free Grammars: Introduction
Parsing Beyond Context-Free Grammars: Introduction Laura Kallmeyer Heinrich-Heine-Universität Düsseldorf Sommersemester 2016 Parsing Beyond CFG 1 Introduction Overview 1. CFG and natural languages 2. Polynomial
estatistik.core: COLLECTING RAW DATA FROM ERP SYSTEMS
WP. 2 ENGLISH ONLY UNITED NATIONS STATISTICAL COMMISSION and ECONOMIC COMMISSION FOR EUROPE CONFERENCE OF EUROPEAN STATISTICIANS Work Session on Statistical Data Editing (Bonn, Germany, 25-27 September
Special Topics in Computer Science
Special Topics in Computer Science NLP in a Nutshell CS492B Spring Semester 2009 Jong C. Park Computer Science Department Korea Advanced Institute of Science and Technology INTRODUCTION Jong C. Park, CS
Translating between XML and Relational Databases using XML Schema and Automed
Imperial College of Science, Technology and Medicine (University of London) Department of Computing Translating between XML and Relational Databases using XML Schema and Automed Andrew Charles Smith acs203
Semantic Lifting of Unstructured Data Based on NLP Inference of Annotations 1
Semantic Lifting of Unstructured Data Based on NLP Inference of Annotations 1 Ivo Marinchev Abstract: The paper introduces approach to semantic lifting of unstructured data with the help of natural language
An Eclipse Plug-In for Visualizing Java Code Dependencies on Relational Databases
An Eclipse Plug-In for Visualizing Java Code Dependencies on Relational Databases Paul L. Bergstein, Priyanka Gariba, Vaibhavi Pisolkar, and Sheetal Subbanwad Dept. of Computer and Information Science,
M LTO Multilingual On-Line Translation
O non multa, sed multum M LTO Multilingual On-Line Translation MOLTO Consortium FP7-247914 Project summary MOLTO s goal is to develop a set of tools for translating texts between multiple languages in
Visualizing an OrientDB Graph Database with KeyLines
Visualizing an OrientDB Graph Database with KeyLines Visualizing an OrientDB Graph Database with KeyLines 1! Introduction 2! What is a graph database? 2! What is OrientDB? 2! Why visualize OrientDB? 3!
Specify the location of an HTML control stored in the application repository. See Using the XPath search method, page 2.
Testing Dynamic Web Applications How To You can use XML Path Language (XPath) queries and URL format rules to test web sites or applications that contain dynamic content that changes on a regular basis.
Distributed Database for Environmental Data Integration
Distributed Database for Environmental Data Integration A. Amato', V. Di Lecce2, and V. Piuri 3 II Engineering Faculty of Politecnico di Bari - Italy 2 DIASS, Politecnico di Bari, Italy 3Dept Information
A Visual Language Based System for the Efficient Management of the Software Development Process.
A Visual Language Based System for the Efficient Management of the Software Development Process. G. COSTAGLIOLA, G. POLESE, G. TORTORA and P. D AMBROSIO * Dipartimento di Informatica ed Applicazioni, Università
Mildly Context Sensitive Grammar Formalisms
Mildly Context Sensitive Grammar Formalisms Petra Schmidt Pfarrer-Lauer-Strasse 23 66386 St. Ingbert [email protected] Overview 3 Tree-djoining-Grammars 4 Co 7 8 9 CCG 9 LIG 10 TG 13 Linear
XML Schema Definition Language (XSDL)
Chapter 4 XML Schema Definition Language (XSDL) Peter Wood (BBK) XML Data Management 80 / 227 XML Schema XML Schema is a W3C Recommendation XML Schema Part 0: Primer XML Schema Part 1: Structures XML Schema
Keywords: Regression testing, database applications, and impact analysis. Abstract. 1 Introduction
Regression Testing of Database Applications Bassel Daou, Ramzi A. Haraty, Nash at Mansour Lebanese American University P.O. Box 13-5053 Beirut, Lebanon Email: rharaty, [email protected] Keywords: Regression
Comprendium Translator System Overview
Comprendium System Overview May 2004 Table of Contents 1. INTRODUCTION...3 2. WHAT IS MACHINE TRANSLATION?...3 3. THE COMPRENDIUM MACHINE TRANSLATION TECHNOLOGY...4 3.1 THE BEST MT TECHNOLOGY IN THE MARKET...4
Regular Expressions with Nested Levels of Back Referencing Form a Hierarchy
Regular Expressions with Nested Levels of Back Referencing Form a Hierarchy Kim S. Larsen Odense University Abstract For many years, regular expressions with back referencing have been used in a variety
Data Tool Platform SQL Development Tools
Data Tool Platform SQL Development Tools ekapner Contents Setting SQL Development Preferences...5 Execution Plan View Options Preferences...5 General Preferences...5 Label Decorations Preferences...6
Overview of MT techniques. Malek Boualem (FT)
Overview of MT techniques Malek Boualem (FT) This section presents an standard overview of general aspects related to machine translation with a description of different techniques: bilingual, transfer,
