9 AMOS: A NATURAL LANGUAGE PARSER IMPLEMENTED AS A DEDUCTIVE DATABASE IN LOLA
|
|
|
- Ashlyn Gibson
- 10 years ago
- Views:
Transcription
1 Published as R. Ramakrishnan, editor, Applications of Logic Databases, chapter 9, pages Kluwer Academic Publishers, Boston, AMOS: A NATURAL LANGUAGE PARSER IMPLEMENTED AS A DEDUCTIVE DATABASE IN LOLA Günther Specht* and Burkhard Freitag** * Technische Universität München, Orleansstr. 34, D München, Germany ** Universität Passau, D Passau, Germany ABSTRACT In this paper we present the set-oriented bottom-up parsing system AMOS which is a major application of the deductive database system LOLA. AMOS supports the morpho-syntactical analysis of old Hebrew and has now been operationally used by linguists for a couple of years. The system allows the declarative specification of Definite Clause Grammar rules. Due to the set-oriented bottom-up evaluation strategy of LOLA it is particularly well suited to the analysis of language ambiguities. 1 INTRODUCTION In this paper the set-oriented bottom-up natural language parsing system AMOS, a major application of the deductive database system LOLA [3, 4], is presented. The AMOS system serves for the morpho-syntactical analysis of old Hebrew text and is intensively used by linguists. A grammar for old Hebrew [12] has been formalized as a Definite Clause Grammar (DCG) and represented as a LOLA program. The Definite Clause Grammar formalism and the evaluation of the corresponding logic programs by a PROLOG interpreter are well-known [10, 11]. Definite Clause Grammars were designed to provide a means for the declarative specification of grammar rules which can directly be interpreted as a logic program. PROLOG-based DCG parsers, though, suffer from a number of drawbacks
2 2 Chapter 9 Left-recursive rules can not be interpreted directly. Backtracking involves the duplicate construction of syntactic structures. In case of ambiguity of the grammar rules only one parse tree at a time is constructed thus introducing another source of backtracking. If information beyond the word position within a text (e.g. morphological attributes) is required, the dictionary has to be stored as a collection of unit clauses. The connection of a PROLOG based system with its tupleat-a-time strategy to a set-oriented relational database needs a buffering interface. This problem is known as the impedance mismatch problem. Solutions have been proposed for each of the problems listed above. In [10] bottom-up parsers for DCG s based on the left-corner method and interpretable by PROLOG are introduced. In [8] a bottom-up parsing system making use of similar techniques has been described. They translate the original DCG rules into a PROLOG program which incorporates subgoals governing the control of the evaluation process thus gaining efficiency as compared to ordinary DCG interpretation. In this paper we present a DCG parser we have implemented, and which has the following properties: Grammar rules can be specified in a purely declarative way. Recursive rules of any type can be processed without modifications. There is no duplicate construction of syntactic structures. In case of ambiguities all applicable parse trees are constructed simultaneously. The dictionary can be stored in an external relational database. Arbitrary queries concerning the entire text base can be processed. The paper is organized as follows. In Section 2 we give a brief overview of the deductive database system LOLA. Section 3 presents the general ideas underlying the representation of Definite Clause Grammars as LOLA programs. Section 4 contains a description of the AMOS system, while Section 5 explains how the system is currently used by linguists and what results have been
3 AMOS Natural Language Parser 3 achieved using AMOS. The paper ends with some concluding remarks in Section 6. Since the original grammar rules for old Hebrew are very complex we use a simple English example taken from [7] to explicate the essential features of the AMOS system. The AMOS system has been developed at the Munich University of Technology in cooperation with the research group of W. Richter, Institut für Assyriologie und Hethitologie at the University of Munich. Part of the project has been funded by the "Deutsche Forschungsgemeinschaft" under contract Ba 722/3 "Effiziente Deduktion". 2 SKETCH OF THE DEDUCTIVE DATABASE SYSTEM LOLA The system described in this paper has been implemented using the deductive database system LOLA [3, 4]. The centerpiece of the deductive database system prototype LOLA is a logic programming language with complex terms, (stratified) negation, explicit existential quantification and type declarations 1. The LOLA language has an interated fixpoint semantics and integrates deduction and efficient data access using relational techniques. The evaluation scheme is semi-naive, bottom-up, and set-at-a-time. Functions of the host language Common Lisp can be called from within LOLA programs and vice versa. The user can choose among various optimizations, e.g. magic set transformation, common subexpression elimination, indexing based on goal-functions. An explanation facility is integrated into the system. The LOLA system is implemented in Common Lisp and runs on UNIX workstations. LOLA provides automatic access to external relational database systems via a SQL interface. The language support special database goals of the form $db(hdb-identifieri,hdb-atomi) A database goal may occur in a LOLA program at any place a normal goal would appear. Syntactically, hdb-atomi is a DATALOG atom, i.e. an atom 1 Most of the LOLA systax is very close to the Prolog syntax. Variable names begin with or a capital letter. Constants, predicate symbols and function symbols begin with a small letter.
4 4 Chapter 9 having only variables or constants as argument terms. The predicate symbol of hdb-atomi is the name of a relation that can reside in main memory or in an external database. The argument terms of database atoms are treated as in normal logical subgoals: Constants restrict the corresponding attribute values, i.e. only matching tuples are selected. Variables are bound to the corresponding attribute values as returned by the database during query evaluation. This allows to pass attribute values to other subgoals or to the rule head as usual. If the database identifier hdb-identifieri is equal to the constant $main the database goal refers to a relation of the LOLA main memory database 2. hdb-identifieri may also be a term of the form hdbms-identifieri(hdb-namei). where the function symbol hdbms-identifieri indicates the database management system that manages the database hdb-namei. This information is necessary for the proper selection of the gateway between the LOLA system and the external database system during query processing. To evaluate fragments of a deductive database program in an external database system, SQL-queries equivalent to the corresponding subgoals are generated by the LOLA compiler. There is also an option to cluster SQL-queries as far as possible. More details on external database access using LOLA can be found in [2]. 3 GRAMMARS AS LOGIC PROGRAMS 3.1 The Grammar Rules Consider the DCG rule N --> Z 1, :::, Z p. The corresponding LOLA rule is automatically obtained by augmenting each (terminal or nonterminal) symbol by two attributes describing the position within the text stream: 2 The LOLA system provides a convenient interface for defining a main memory relation as a set of facts in LOLA syntax.
5 AMOS Natural Language Parser 5 N(I 0,I p ) :- Z 1 (I 0,I 1 ), :::, Z p (I p;1, I p ). See Figure 1 for an example. Extra conditions can be represented as additional Input text: Positions: John saw a man with a mirror noun verb det noun prep det noun John saw a man with a mirror Figure 1 Sample input text and word positions subgoals in the body of the LOLA rule. As opposed to Prolog-related definitions of the DCG formalism we do not need to restrict the use of recursive DCG rules in any way. In particular, left-recursive rules are allowed. Figure 2 shows sample LOLA rules representing a simple DCG grammar. We use the following abbreviations: s for sentence, np for noun phrase, pp for preposition phrase, vp for verb phrase, det for determinator, and prep for preposition. (Partial) parse trees are represented as complex (Herbrand) terms which are successively constructed during query evaluation. In the example, the parse tree terms, e.g. s(np TREE,VP TREE), occur at the additional third attribute position of rule heads and subgoals. 3.2 The Dictionary The dictionary stores the terminal symbols together with their positions within the text stream that is to be analyzed. It normally depends on the particular text to be analyzed while the grammar rules themselves do not change. Furthermore, the dictionary is likely to consist of a very large number of entries (see Section 4). In LOLA we are able to store the dictionary separate from the grammar rules as a collection of external relations where every lexical category <C> corresponds to a relation <C>. Using the SQL-database interface of the
6 6 Chapter 9 s(x,y,s(np_tree,vp_tree)) :- np(x,z,np_tree), vp(z,y,vp_tree). s(x,y,s(s_tree,pp_tree)) :- s(x,z,s_tree), pp(z,y,pp_tree). np(x,y,np1(n_tree)) :- noun(x,y,n_tree). np(x,y,np2(d_tree,n_tree)) :- det(x,z,d_tree), noun(z,y,n_tree). np(x,y,np2(np_tree,pp_tree)) :- np(x,z,np_tree), pp(z,y,pp_tree). pp(x,y,pp(p_tree,np_tree)) :- prep(x,z,p_tree), np(z,y,np_tree). vp(x,y,vp(v_tree,np_tree)) :- verb(x,z,v_tree), np(z,y,np_tree). Figure 2 Rules representing a simple DCG grammar LOLA system (cf. Section 2) these relations may reside on an external relational database system. At query evaluation time the appropriate (portions of) the dictionary relations are downloaded into the LOLA main memory database. Position identifiers can easily be generated if the text is already separated into sentences. Languages without punctuation symbols, such as old Hebrew, require an additional pass. The position identifiers specify the string position relative to a sentence marker. The coding of string positions by position identifiers rather than difference lists is particularly suited to relational databases. The dictionary, i.e. the fact base of the parsing system, can be generated automatically. Normally it is the result of the morphological analysis of the input text (see Section 4). As usual for DCGs, additional attributes can be used, e.g. to store case, gender and number. It occurs frequently that the morphological analysis produces ambiguous results, i.e. different classifications for the same occurrence of a word. Every such classification is stored as a separate tuple in the corresponding dictionary relation. Part of a dictionary database containing classifications of the words occurring in the sample input sentence of Figure 1 is shown in Figure 3. Note, that the word "saw" has an ambiguous lexical classification.
7 AMOS Natural Language Parser 7 Schema of Dictionary Database: noun_rel(text_id,from,to,word,case, gender,number) det_rel(text_id,from,to,word) verb_rel(text_id,from,to,word,person, number,tense,groundform) prep_rel(text_id,from,to,word) Dictionary relations: noun_rel = {(ch3s1,0,1,"john",nominative, masculin,singular), (ch3s1,3,4,"man",accusative, masculin,singular), (ch3s1,1,2,"saw",nominative, neutrum,singular),... } det_rel = {(ch3s1,2,3,"a"), (ch3s1,5,6,"a"),... } verb_rel = {(ch3s1,1,2,"saw",3,singular, past,"to see"),... } prep_rel = {(ch3s1,4,5,"with"),... } Figure 3 Sample dictionary database 3.3 The Grammar as a LOLA Program The complete LOLA program representing the parser for the simple grammar of Figure 2 is shown in Figures 4 and 5. The first part of the program (Figure 4) contains type declarations for each predicate and function symbol occurring in the program starting with the predicate symbols which do not have a result type and followed by the function symbols. In the sample program all function symbols have the result type tree. Finally, the external relations and computed predicates are declared. The type system turned out to be very useful to increase the programming security. While there
8 8 Chapter 9 has been work on type systems for Prolog [9, 5] and Gödel [6], commercial PROLOG-systems use types only for non-logical constructs e.g. arithmetics. The second part of the program (Figure 5) contains the defining rules for the predicates, i.e. the rules representing the DCG grammar. The dictionary $program(simple_grammar). % Declaration of Predicate Symbols ::= parse(<number>,<tree>). ::= s(<number>,<number>,<tree>). ::= np(<number>,<number>,<tree>). ::= pp(<number>,<number>,<tree>). ::= vp(<number>,<number>,<tree>). ::= noun(<number>,<number>,<tree>). ::= det(<number>,<number>,<tree>). ::= prep(<number>,<number>,<tree>). ::= verb(<number>,<number>,<tree>). % Declaration of Function Symbols <tree> ::= noun(<word>) det(<word>) prep(<word>) verb(<word>) s(<tree>,<tree>) np1(<tree>) np2(<tree>,<tree>) pp(<tree>,<tree>) vp(<tree>,<tree>). % Declaration of external Relations <$db_relation> ::= noun_rel(<text_id>,<number>,<number>, <word>,<case>, <gender>,<numerus>). <$db_relation> ::= verb_rel(<text_id>,<number>,<number>, <word>,<person>,<numerus>, <tense>,<groundform>). <$db_relation> ::= det_rel(<text_id>,<number>,<number>, <word>). <$db_relation> ::= prep_rel(<text_id>,<number>,<number>, <word>). % Declaration of Computed Predicates <$built_in_function([])> ::= print_tree(<tree>). Figure 4 Declaration part of LOLA program representing simple grammar
9 AMOS Natural Language Parser 9 % Parse Rule and DCG Rules parse(x,y,tree) :- s(x,y,tree), $built_in([$b], print_tree(tree)). s(x,y,s(np_tree,vp_tree)) :- np(x,z,np_tree), vp(z,y,vp_tree). s(x,y,s(s_tree,pp_tree)) :- s(x,z,s_tree), pp(z,y,pp_tree). np(x,y,np1(n_tree)) :- noun(x,y,n_tree). np(x,y,np2(d_tree,n_tree)) :- det(x,z,d_tree), noun(z,y,n_tree). np(x,y,np2(np_tree,pp_tree)) :- np(x,z,np_tree), pp(z,y,pp_tree). pp(x,y,pp(p_tree,np_tree)) :- prep(x,z,p_tree), np(z,y,np_tree). vp(x,y,vp(v_tree,np_tree)) :- verb(x,z,v_tree), np(z,y,np_tree). %--- Link to external Dictionary Database --- noun(x,y,noun(w)) :- $db($main, noun_rel(_,x,y,w,_,_,_)). verb(x,y,verb(w)) :- $db($main, verb_rel(_,x,y,w,_,_,_,_)). det(x,y,det(w)) :- $db($main, det_rel(_,x,y,w)). prep(x,y,prep(w)) :- $db($main, prep_rel(_,x,y,w)). Figure 5 Rules of LOLA program representing simple grammar database is linked to the program using LOLA s database goals 3. The external database facility of LOLA automatically generates SQL-code for dictionary queries and transfers the resulting SQL-queries to the external database system. As described above, parse trees are represented by complex terms 4. The term representing the final complete parse tree, i.e. the term constructed by the s-rules, is graphically displayed as a tree by the user interface. To this end the 3 Note, that for the sake of simplicity additional attributes such as case are eliminated by projection. The AMOS system, however, makes intensive use of additional attributes. 4 At this point it should be mentioned that symbols can be overloaded in LOLA in the sense that the same symbol may denote both a predicate and a function, even of different arity, without disturbing type correctness.
10 10 Chapter 9 s / \ np vp / \ noun verb np / \ john saw np pp / \ / \ det noun prep np / \ a man with det noun a mirror Figure 6 First parse tree resulting from sample query computed predicate 5 print tree is called when processing the built in subgoal of the parse rule. The query :- parse(0,x,tree). results in the construction and display of all possible parse trees for complete sentences 6. The first argument of parse represents the begin position, the second the end position of the sentence parsed. The third argument contains the parse tree term. In our example the answer relation consists of three answer tuples, two of them with end position 7 and one with end position 4. The display of the former two parse trees is visualized in Figures 6 and 7. The reader may have noticed that we use different function symbols as constructors of the parse trees while the parse trees shown in Figures 6 and 7 do not make this distinction. To preserve type correctness we have to distinguish the noun phrase constructors of arity 1 (np1) from those of arity 2 (np2). On the other hand, we want to display the parse trees with maximum convenience 5 In the current version of LOLA computed predicates are implemended as Common Lisp functions. See [4] for more details on computed predicates. 6 For the AMOS system it is important to consider all sentences and not only those of maximum length since punctuation is lacking in old Hebrew.
11 AMOS Natural Language Parser 11 s / \ s pp / \ / \ np vp prep np _ / \ / \ noun verb np with det noun _ _ / \ john saw det noun a mirror a man Figure 7 Second parse tree resulting from sample query for linguists. The print tree function performs the appropriate processing of the computed parse trees. Instead of implementing special purpose display routines such as print tree the explanation facility built-in to the LOLA system [14, 15], can be used to display the parse trees together with information how the particular trees have been derived, i.e. which rules have been applied etc.. 4 THE AMOS PARSING SYSTEM The AMOS system for the morpho-syntactical analysis of old Hebrew text has been implemented applying the techniques described above. The grammar for old Hebrew [12] has been formalized as DCG representation and written as a LOLA program [13]. The dictionary relations are generated by the morphological analysis system SALOMO for old Hebrew [1] and can be stored in an external relational database.
12 12 Chapter 9 The sample AMOS rules 7 attp(sentence,x,y,advp1(n_tree,adj_tree), Case,Gender,Number) :- noun(sentence,x,z,n_tree,case,gender,number), adj(sentence,z,y,adj_tree,case,gender,number). attp(sentence,x,y,advp2(a_tree,adj_tree), Case,Gender,Number) :- attp(sentence,x,z,a_tree,case,gender,number), adj(sentence,z,y,adj_tree,case,gender,number). represent the grammar rule for the attribute-phrase (attp): An attribute-phrase in old Hebrew is a noun, followed by an adjective, or an attribute-phrase followed by an adjective. In addition, the rules specify that congruence of case, gender and number is required. The following (simplified) AMOS rules show that even nonlinear recursion is necessary. The apposition-phrase (appp) defines compound nouns and compound noun-phrases: appp(sentence,x,y,appp(n_tree1,n_tree2)) :- noun(sentence,x,z,n_tree1,absolutus, _, _), noun(sentence,z,y,n_tree2,_, _, _). appp(sentence,x,y,appp(n_tree,a_tree)) :- noun(sentence,x,z,n_tree,absolutus, _, _), appp(sentence,z,y,a_tree). appp(sentence,x,y,appp(a_tree1,a_tree2)) :- appp(sentence,x,z,a_tree1), appp(sentence,z,y,a_tree2). The AMOS program contains several more appp-rules which are more complicated and selective. A section of the dependency graph of the AMOS program is shown in Figure 8. The very beginning of the book of the Ecclesiasts is an example of an apposition-phrase to which the above nonlinear recursive rule is applicable. A literal translation results in: 7 The real AMOS system contains more attp and appp rules than shown here. Note, that the schema of logic predicates and base relations is extended as compared to the simple examples shown in Section 3.
13 AMOS Natural Language Parser 13 words Kohelet son David king of Jerusalem noun noun noun noun noun preposition noun This can be parsed in several ways (using additional morphem attributes not mentioned here) corresponding to the following interpretations: 1. words [of] Kohelet [who was a] son [of] David [and David was] king of Jerusalem 2. words [of] Kohelet [who was a] son [of] David [and every son of David was] king of Jerusalem 3. words [of] Kohelet [who was a] son [of] David [and who (Kohelet) was] king of Jerusalem 4. words [of] Kohelet [who was a] son [of] David [and words of the] king of Jerusalem AMOS derives a parse tree for each of the above interpretations. Linguists are often very interested in finding such (syntactical) ambiguities. The characteristics of the AMOS system are Large volumes of text like the Old Testament; Ambiguities in the dictionary due to words not uniquely classifiable at the morphological level; Ambiguous parse trees as derived using the DCG rules. The dictionary database consists of 25 relations with a total of 117 attributes. The extension of the database relations depends on the text which is to be analyzed and may be rather large thus preventing the downloading of the entire dictionary. However, we can still benefit from the advantages of deductive databases. Since AMOS adopts the set-at-a-time evaluation strategy of LOLA there is no need to parse one sentence after the other as in Prolog-based systems. We found that processing an entire paragraph or chapter gives the best performance results. On the other hand, database support is needed to store the input text as well as the computed answer relations. The AMOS
14 14 Chapter 9 apposition preposition constructus annexion attribute numeral Figure 8 Section of the dependency graph of the AMOS program dictionary for one chapter of the Genesis contains 167 KBytes with 4282 tuples. The entire Genesis contains MBytes with tuples, and the Old Testament has MBytes with 2.7 million tuples. The complete morpho-syntactical parser for old Hebrew is represented by about 200 LOLA rules. The program contains 4 linearly recursive, 13 mutually recursive, and 1 nonlinearly recursive (quadratically recursive) predicate symbols. An overview of the AMOS system architecture is shown in Figure 9. Frequently used queries are precompiled and can be invoked by the user. In addition, the system allows ad-hoc queries, which may include additional rules and facts. The latter feature facilitates hypothetical reasoning and testing of new grammar rules. The system is currently running on UNIX workstations with Allegro Common Lisp, Lucid Common Lisp, and AKCL. 5 AMOS AS A TOOL FOR LINGUISTIC RESEARCH The overall goal when using AMOS as a tool for linguistic research is twofold:
15 Figure 9 Overview of the AMOS system architecture 1. Finding a consistent and complete grammar for the old Hebrew language and 2. generating complete morphological and syntactical descriptions of the input texts. Since there is no native speaker, these goals can only be achieved in by mutual approximation, i.e. in a ping-pong kind of strategy. As a first step we used rules that had been extracted from textbooks on old Hebrew grammar and subsequently formalized according to the DCG formalism to obtain a LOLA program that could be applied to old Hebrew text. The results showed that not all phrases had been covered correctly, which means that some rules had been too sloppy or too restrictive or had been missing. By testing new rules using the ad-hoc query facility we reached a good stable state of the rule system in a step-by-step fashion.
16 16 Chapter 9 The historical and prosaic parts of the bible have already completely been analyzed using AMOS. New research is on the way concerning more complex rules for lyric texts. Because the AMOS system facilitates the application of alternative sets of grammar rules to huge text bases and supports the user in automatically extracting the differences in the corresponding result relations, it helped developing a correct and much more exact grammar of old Hebrew than was known before. The second goal linguists try to achieve when using AMOS is to obtain a complete (electronical) concordance, i.e. a kind of index, not only for single words but also for word phrases and sentences. By default, the larger part of the computed answer relations containing the parse trees is persistently stored in the external database without user interaction. Only for phrases with ambiguous parse trees and for some cases particularly interesting for further research confirmation by the user is required. About 98 % of the input text can be analyzed correctly without manual confirmation. A comparison to the results previously obtained in similar cases can additionally be taken as a validation check. 6 CONCLUSION We have described the AMOS parsing system for old Hebrew text. The system has been implemented using the LOLA system. Only the predicate causing side-effects, i.e. print tree, had to be encoded in the host language. There are considerable advantages as compared to Prolog based systems, e.g. the indepence of rule order and the ability to handle any type of recursion directly. Acknowledgement We would like to thank W. Richter and W. Eckardt, Seminar of Hebraistic and Ugaristic at the University of Munich (LMU), for supporting the formalization of the grammar of old Hebrew.
17 AMOS Natural Language Parser 17 REFERENCES [1] W. Eckardt. Computergestützte Analyse althebräischer Texte: Algorithmische Erkennung der Morphologie, volume 29 of Arbeiten zu Text und Sprache im Alten Testament. EOS Verlag, St. Ottilien, Germany, (In German). [2] B. Freitag and J. Kempe. Multidatabase access with the LOLA system. Technical Report TUM-I9416, Technische Universität München, Munich, Germany, [3] B. Freitag, H. Schütz, and G. Specht. LOLA a logic language for deductive databases and its implementation. In Proc. 2nd Intl. Symp. on Database Systems for Advanced Applications (DASFAA 91), pages , Tokyo, [4] B. Freitag, H. Schütz, G. Specht, R. Bayer, and U. Güntzer. LOLA a deductive database system with integrated SQL-database access. Technical report, Technische Universität München, Munich, Germany, [5] M. Hanus. Horn clause programs with polymorphic types: Semantics and resolution. In Proc. TAPSOFT 89, volume 352 of LNCS, pages , Berlin, Springer-Verlag. [6] P. M. Hill and J. W. Lloyd. The Gödel report. Technical Report TR-91-02, University of Bristol, Dep. of Computer Science, [7] B. Lang. Datalog automata. In Proc. 3rd Intl. Conf. on Data and Knowledge Bases: Improving Usability and Responsiveness, pages , Jerusalem, [8] Y. Matsumoto and R. Sugimura. A parsing system based on logic programming. In Proc. Intl. Joint Conf. on Artificial Intellingence (IJCAI 87), pages , [9] A. Mycroft and R. A. O Keefe. A polymorphic type system for Prolog. Artificial Intelligence, 23: , [10] F. Pereira and S. Shieber. Prolog and Natural-Language Analysis, volume 10 of Lecture Notes. CSLI, Stanford, [11] F. Pereira and D. Warren. Definite clause grammars for language analysis - a survey of the formalism and a comparison with augmented transition networks. Artificial Intelligence, 13: , 1980.
18 18 Chapter 9 [12] W. Richter. Grundlagen einer althebräischen Grammatik: B. Die Beschreibungsebenen, II. Die Wortfügung (Morphosyntax), volume 10 of Arbeiten zu Text und Sprache im Alten Testament. EOS Verlag, St. Ottilien, Germany, (In German). [13] G. Specht. Wissensbasierte Analyse althebräischer Morphosyntax: Das Expertensystem AMOS, volume 35 of Arbeiten zu Text und Sprache im Alten Testament. EOS Verlag, St. Ottilien, Germany, (In German). [14] G. Specht. Source-to-Source Transformationen zur Erklärung des Programmverhaltens bei deduktiven Datenbanken. PhD thesis, Technische Universität München, Munich, Germany, volume 42 of DISKI, infix Verlag, St. Augustin, 1993 (In German). [15] G. Specht. Generating explanation trees even for negations in deductive database systems. In Proc. 5th Workshop on Logic Programming Environments, in conjunction with Int. Logic Programming Symposium (ILPS 93), Vancouver, 1993.
How the Computer Translates. Svetlana Sokolova President and CEO of PROMT, PhD.
Svetlana Sokolova President and CEO of PROMT, PhD. How the Computer Translates Machine translation is a special field of computer application where almost everyone believes that he/she is a specialist.
Natural Language to Relational Query by Using Parsing Compiler
Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 4, Issue. 3, March 2015,
Paraphrasing controlled English texts
Paraphrasing controlled English texts Kaarel Kaljurand Institute of Computational Linguistics, University of Zurich [email protected] Abstract. We discuss paraphrasing controlled English texts, by defining
Special Topics in Computer Science
Special Topics in Computer Science NLP in a Nutshell CS492B Spring Semester 2009 Jong C. Park Computer Science Department Korea Advanced Institute of Science and Technology INTRODUCTION Jong C. Park, CS
Overview of the TACITUS Project
Overview of the TACITUS Project Jerry R. Hobbs Artificial Intelligence Center SRI International 1 Aims of the Project The specific aim of the TACITUS project is to develop interpretation processes for
Artificial Intelligence Exam DT2001 / DT2006 Ordinarie tentamen
Artificial Intelligence Exam DT2001 / DT2006 Ordinarie tentamen Date: 2010-01-11 Time: 08:15-11:15 Teacher: Mathias Broxvall Phone: 301438 Aids: Calculator and/or a Swedish-English dictionary Points: The
Extending Data Processing Capabilities of Relational Database Management Systems.
Extending Data Processing Capabilities of Relational Database Management Systems. Igor Wojnicki University of Missouri St. Louis Department of Mathematics and Computer Science 8001 Natural Bridge Road
Language. Johann Eder. Universitat Klagenfurt. Institut fur Informatik. Universiatsstr. 65. A-9020 Klagenfurt / AUSTRIA
PLOP: A Polymorphic Logic Database Programming Language Johann Eder Universitat Klagenfurt Institut fur Informatik Universiatsstr. 65 A-9020 Klagenfurt / AUSTRIA February 12, 1993 Extended Abstract The
Domain Knowledge Extracting in a Chinese Natural Language Interface to Databases: NChiql
Domain Knowledge Extracting in a Chinese Natural Language Interface to Databases: NChiql Xiaofeng Meng 1,2, Yong Zhou 1, and Shan Wang 1 1 College of Information, Renmin University of China, Beijing 100872
International Journal of Scientific & Engineering Research, Volume 4, Issue 11, November-2013 5 ISSN 2229-5518
International Journal of Scientific & Engineering Research, Volume 4, Issue 11, November-2013 5 INTELLIGENT MULTIDIMENSIONAL DATABASE INTERFACE Mona Gharib Mohamed Reda Zahraa E. Mohamed Faculty of Science,
Outline of today s lecture
Outline of today s lecture Generative grammar Simple context free grammars Probabilistic CFGs Formalism power requirements Parsing Modelling syntactic structure of phrases and sentences. Why is it useful?
A Knowledge-based System for Translating FOL Formulas into NL Sentences
A Knowledge-based System for Translating FOL Formulas into NL Sentences Aikaterini Mpagouli, Ioannis Hatzilygeroudis University of Patras, School of Engineering Department of Computer Engineering & Informatics,
NATURAL LANGUAGE QUERY PROCESSING USING SEMANTIC GRAMMAR
NATURAL LANGUAGE QUERY PROCESSING USING SEMANTIC GRAMMAR 1 Gauri Rao, 2 Chanchal Agarwal, 3 Snehal Chaudhry, 4 Nikita Kulkarni,, 5 Dr. S.H. Patil 1 Lecturer department o f Computer Engineering BVUCOE,
Enhancing ECA Rules for Distributed Active Database Systems
Enhancing ECA Rules for Distributed Active Database Systems 2 Thomas Heimrich 1 and Günther Specht 2 1 TU-Ilmenau, FG Datenbanken und Informationssysteme, 98684 Ilmenau Universität Ulm, Abteilung Datenbanken
Object-Oriented Software Specification in Programming Language Design and Implementation
Object-Oriented Software Specification in Programming Language Design and Implementation Barrett R. Bryant and Viswanathan Vaidyanathan Department of Computer and Information Sciences University of Alabama
A Chart Parsing implementation in Answer Set Programming
A Chart Parsing implementation in Answer Set Programming Ismael Sandoval Cervantes Ingenieria en Sistemas Computacionales ITESM, Campus Guadalajara [email protected] Rogelio Dávila Pérez Departamento de
Introduction to formal semantics -
Introduction to formal semantics - Introduction to formal semantics 1 / 25 structure Motivation - Philosophy paradox antinomy division in object und Meta language Semiotics syntax semantics Pragmatics
Presented to The Federal Big Data Working Group Meetup On 07 June 2014 By Chuck Rehberg, CTO Semantic Insights a Division of Trigent Software
Semantic Research using Natural Language Processing at Scale; A continued look behind the scenes of Semantic Insights Research Assistant and Research Librarian Presented to The Federal Big Data Working
Interactive Dynamic Information Extraction
Interactive Dynamic Information Extraction Kathrin Eichler, Holmer Hemsen, Markus Löckelt, Günter Neumann, and Norbert Reithinger Deutsches Forschungszentrum für Künstliche Intelligenz - DFKI, 66123 Saarbrücken
Knowledge Engineering for Business Rules in PROLOG
Knowledge Engineering for Business Rules in PROLOG Ludwig Ostermayer, Dietmar Seipel University of Würzburg, Department of Computer Science Am Hubland, D 97074 Würzburg, Germany {ludwig.ostermayer,dietmar.seipel}@uni-wuerzburg.de
Natural Language Database Interface for the Community Based Monitoring System *
Natural Language Database Interface for the Community Based Monitoring System * Krissanne Kaye Garcia, Ma. Angelica Lumain, Jose Antonio Wong, Jhovee Gerard Yap, Charibeth Cheng De La Salle University
A View Integration Approach to Dynamic Composition of Web Services
A View Integration Approach to Dynamic Composition of Web Services Snehal Thakkar, Craig A. Knoblock, and José Luis Ambite University of Southern California/ Information Sciences Institute 4676 Admiralty
Language Evaluation Criteria. Evaluation Criteria: Readability. Evaluation Criteria: Writability. ICOM 4036 Programming Languages
ICOM 4036 Programming Languages Preliminaries Dr. Amirhossein Chinaei Dept. of Electrical & Computer Engineering UPRM Spring 2010 Language Evaluation Criteria Readability: the ease with which programs
Static vs. Dynamic. Lecture 10: Static Semantics Overview 1. Typical Semantic Errors: Java, C++ Typical Tasks of the Semantic Analyzer
Lecture 10: Static Semantics Overview 1 Lexical analysis Produces tokens Detects & eliminates illegal tokens Parsing Produces trees Detects & eliminates ill-formed parse trees Static semantic analysis
CALICO Journal, Volume 9 Number 1 9
PARSING, ERROR DIAGNOSTICS AND INSTRUCTION IN A FRENCH TUTOR GILLES LABRIE AND L.P.S. SINGH Abstract: This paper describes the strategy used in Miniprof, a program designed to provide "intelligent' instruction
Basic Parsing Algorithms Chart Parsing
Basic Parsing Algorithms Chart Parsing Seminar Recent Advances in Parsing Technology WS 2011/2012 Anna Schmidt Talk Outline Chart Parsing Basics Chart Parsing Algorithms Earley Algorithm CKY Algorithm
Module Catalogue for the Bachelor Program in Computational Linguistics at the University of Heidelberg
Module Catalogue for the Bachelor Program in Computational Linguistics at the University of Heidelberg March 1, 2007 The catalogue is organized into sections of (1) obligatory modules ( Basismodule ) that
Livingston Public Schools Scope and Sequence K 6 Grammar and Mechanics
Grade and Unit Timeframe Grammar Mechanics K Unit 1 6 weeks Oral grammar naming words K Unit 2 6 weeks Oral grammar Capitalization of a Name action words K Unit 3 6 weeks Oral grammar sentences Sentence
Semester Review. CSC 301, Fall 2015
Semester Review CSC 301, Fall 2015 Programming Language Classes There are many different programming language classes, but four classes or paradigms stand out:! Imperative Languages! assignment and iteration!
Declarative Parsing and Annotation of Electronic Dictionaries
Declarative Parsing and Annotation of Electronic Dictionaries Christian Schneiker 1, Dietmar Seipel 1, Werner Wegstein 2, and Klaus Prätor 3 1 Department of Computer Science {schneiker seipel}@informatik.uni-wuerzburg.de
Cedalion A Language Oriented Programming Language (Extended Abstract)
Cedalion A Language Oriented Programming Language (Extended Abstract) David H. Lorenz Boaz Rosenan The Open University of Israel Abstract Implementations of language oriented programming (LOP) are typically
How To Understand And Understand Common Lisp
Language-Oriented Programming am Beispiel Lisp Arbeitskreis Objekttechnologie Norddeutschland HAW Hamburg, 6.7.2009 Prof. Dr. Bernhard Humm Hochschule Darmstadt, FB Informatik und Capgemini sd&m Research
Extraction of Legal Definitions from a Japanese Statutory Corpus Toward Construction of a Legal Term Ontology
Extraction of Legal Definitions from a Japanese Statutory Corpus Toward Construction of a Legal Term Ontology Makoto Nakamura, Yasuhiro Ogawa, Katsuhiko Toyama Japan Legal Information Institute, Graduate
Structure of the talk. The semantics of event nominalisation. Event nominalisations and verbal arguments 2
Structure of the talk Sebastian Bücking 1 and Markus Egg 2 1 Universität Tübingen [email protected] 2 Rijksuniversiteit Groningen [email protected] 12 December 2008 two challenges for a
Elena Baralis, Silvia Chiusano Politecnico di Torino. Pag. 1. Query optimization. DBMS Architecture. Query optimizer. Query optimizer.
DBMS Architecture INSTRUCTION OPTIMIZER Database Management Systems MANAGEMENT OF ACCESS METHODS BUFFER MANAGER CONCURRENCY CONTROL RELIABILITY MANAGEMENT Index Files Data Files System Catalog BASE It
Efficiently Identifying Inclusion Dependencies in RDBMS
Efficiently Identifying Inclusion Dependencies in RDBMS Jana Bauckmann Department for Computer Science, Humboldt-Universität zu Berlin Rudower Chaussee 25, 12489 Berlin, Germany [email protected]
COMP 356 Programming Language Structures Notes for Chapter 4 of Concepts of Programming Languages Scanning and Parsing
COMP 356 Programming Language Structures Notes for Chapter 4 of Concepts of Programming Languages Scanning and Parsing The scanner (or lexical analyzer) of a compiler processes the source program, recognizing
Chapter 1. Dr. Chris Irwin Davis Email: [email protected] Phone: (972) 883-3574 Office: ECSS 4.705. CS-4337 Organization of Programming Languages
Chapter 1 CS-4337 Organization of Programming Languages Dr. Chris Irwin Davis Email: [email protected] Phone: (972) 883-3574 Office: ECSS 4.705 Chapter 1 Topics Reasons for Studying Concepts of Programming
Lecture 9. Semantic Analysis Scoping and Symbol Table
Lecture 9. Semantic Analysis Scoping and Symbol Table Wei Le 2015.10 Outline Semantic analysis Scoping The Role of Symbol Table Implementing a Symbol Table Semantic Analysis Parser builds abstract syntax
Statistical Machine Translation
Statistical Machine Translation Some of the content of this lecture is taken from previous lectures and presentations given by Philipp Koehn and Andy Way. Dr. Jennifer Foster National Centre for Language
Design Patterns in Parsing
Abstract Axel T. Schreiner Department of Computer Science Rochester Institute of Technology 102 Lomb Memorial Drive Rochester NY 14623-5608 USA [email protected] Design Patterns in Parsing James E. Heliotis
PTE Academic Preparation Course Outline
PTE Academic Preparation Course Outline August 2011 V2 Pearson Education Ltd 2011. No part of this publication may be reproduced without the prior permission of Pearson Education Ltd. Introduction The
Division of Mathematical Sciences
Division of Mathematical Sciences Chair: Mohammad Ladan, Ph.D. The Division of Mathematical Sciences at Haigazian University includes Computer Science and Mathematics. The Bachelor of Science (B.S.) degree
Artificial Intelligence
Artificial Intelligence ICS461 Fall 2010 1 Lecture #12B More Representations Outline Logics Rules Frames Nancy E. Reed [email protected] 2 Representation Agents deal with knowledge (data) Facts (believe
Artificial Intelligence. Class: 3 rd
Artificial Intelligence Class: 3 rd Teaching scheme: 4 hours lecture credits: Course description: This subject covers the fundamentals of Artificial Intelligence including programming in logic, knowledge
03 - Lexical Analysis
03 - Lexical Analysis First, let s see a simplified overview of the compilation process: source code file (sequence of char) Step 2: parsing (syntax analysis) arse Tree Step 1: scanning (lexical analysis)
A Test Case Generator for the Validation of High-Level Petri Nets
A Test Case Generator for the Validation of High-Level Petri Nets Jörg Desel Institut AIFB Universität Karlsruhe D 76128 Karlsruhe Germany E-mail: [email protected] Andreas Oberweis, Torsten
Motivation. Korpus-Abfrage: Werkzeuge und Sprachen. Overview. Languages of Corpus Query. SARA Query Possibilities 1
Korpus-Abfrage: Werkzeuge und Sprachen Gastreferat zur Vorlesung Korpuslinguistik mit und für Computerlinguistik Charlotte Merz 3. Dezember 2002 Motivation Lizentiatsarbeit: A Corpus Query Tool for Automatically
Sentence Structure/Sentence Types HANDOUT
Sentence Structure/Sentence Types HANDOUT This handout is designed to give you a very brief (and, of necessity, incomplete) overview of the different types of sentence structure and how the elements of
Overview of MT techniques. Malek Boualem (FT)
Overview of MT techniques Malek Boualem (FT) This section presents an standard overview of general aspects related to machine translation with a description of different techniques: bilingual, transfer,
Syntactic Theory on Swedish
Syntactic Theory on Swedish Mats Uddenfeldt Pernilla Näsfors June 13, 2003 Report for Introductory course in NLP Department of Linguistics Uppsala University Sweden Abstract Using the grammar presented
An XML Framework for Integrating Continuous Queries, Composite Event Detection, and Database Condition Monitoring for Multiple Data Streams
An XML Framework for Integrating Continuous Queries, Composite Event Detection, and Database Condition Monitoring for Multiple Data Streams Susan D. Urban 1, Suzanne W. Dietrich 1, 2, and Yi Chen 1 Arizona
No Evidence. 8.9 f X
Section I. Correlation with the 2010 English Standards of Learning and Curriculum Framework- Grade 8 Writing Summary Adequate Rating Limited No Evidence Section I. Correlation with the 2010 English Standards
NATURAL LANGUAGE QUERY PROCESSING USING PROBABILISTIC CONTEXT FREE GRAMMAR
NATURAL LANGUAGE QUERY PROCESSING USING PROBABILISTIC CONTEXT FREE GRAMMAR Arati K. Deshpande 1 and Prakash. R. Devale 2 1 Student and 2 Professor & Head, Department of Information Technology, Bharati
Context free grammars and predictive parsing
Context free grammars and predictive parsing Programming Language Concepts and Implementation Fall 2012, Lecture 6 Context free grammars Next week: LR parsing Describing programming language syntax Ambiguities
Generating SQL Queries Using Natural Language Syntactic Dependencies and Metadata
Generating SQL Queries Using Natural Language Syntactic Dependencies and Metadata Alessandra Giordani and Alessandro Moschitti Department of Computer Science and Engineering University of Trento Via Sommarive
Extracting translation relations for humanreadable dictionaries from bilingual text
Extracting translation relations for humanreadable dictionaries from bilingual text Overview 1. Company 2. Translate pro 12.1 and AutoLearn 3. Translation workflow 4. Extraction method 5. Extended
2. The Language of First-order Logic
2. The Language of First-order Logic KR & R Brachman & Levesque 2005 17 Declarative language Before building system before there can be learning, reasoning, planning, explanation... need to be able to
A + dvancer College Readiness Online Alignment to Florida PERT
A + dvancer College Readiness Online Alignment to Florida PERT Area Objective ID Topic Subject Activity Mathematics Math MPRC1 Equations: Solve linear in one variable College Readiness-Arithmetic Solving
Predicate Logic. For example, consider the following argument:
Predicate Logic The analysis of compound statements covers key aspects of human reasoning but does not capture many important, and common, instances of reasoning that are also logically valid. For example,
ON FUNCTIONAL SYMBOL-FREE LOGIC PROGRAMS
PROCEEDINGS OF THE YEREVAN STATE UNIVERSITY Physical and Mathematical Sciences 2012 1 p. 43 48 ON FUNCTIONAL SYMBOL-FREE LOGIC PROGRAMS I nf or m at i cs L. A. HAYKAZYAN * Chair of Programming and Information
English Descriptive Grammar
English Descriptive Grammar 2015/2016 Code: 103410 ECTS Credits: 6 Degree Type Year Semester 2500245 English Studies FB 1 1 2501902 English and Catalan FB 1 1 2501907 English and Classics FB 1 1 2501910
Introduction to tuple calculus Tore Risch 2011-02-03
Introduction to tuple calculus Tore Risch 2011-02-03 The relational data model is based on considering normalized tables as mathematical relationships. Powerful query languages can be defined over such
Integrating Pattern Mining in Relational Databases
Integrating Pattern Mining in Relational Databases Toon Calders, Bart Goethals, and Adriana Prado University of Antwerp, Belgium {toon.calders, bart.goethals, adriana.prado}@ua.ac.be Abstract. Almost a
The Prolog Interface to the Unstructured Information Management Architecture
The Prolog Interface to the Unstructured Information Management Architecture Paul Fodor 1, Adam Lally 2, David Ferrucci 2 1 Stony Brook University, Stony Brook, NY 11794, USA, [email protected] 2 IBM
Efficient Techniques for Improved Data Classification and POS Tagging by Monitoring Extraction, Pruning and Updating of Unknown Foreign Words
, pp.290-295 http://dx.doi.org/10.14257/astl.2015.111.55 Efficient Techniques for Improved Data Classification and POS Tagging by Monitoring Extraction, Pruning and Updating of Unknown Foreign Words Irfan
n Introduction n Art of programming language design n Programming language spectrum n Why study programming languages? n Overview of compilation
Lecture Outline Programming Languages CSCI-4430 & CSCI-6430, Spring 2016 www.cs.rpi.edu/~milanova/csci4430/ Ana Milanova Lally Hall 314, 518 276-6887 [email protected] Office hours: Wednesdays Noon-2pm
CSCI 3136 Principles of Programming Languages
CSCI 3136 Principles of Programming Languages Faculty of Computer Science Dalhousie University Winter 2013 CSCI 3136 Principles of Programming Languages Faculty of Computer Science Dalhousie University
Syntaktická analýza. Ján Šturc. Zima 208
Syntaktická analýza Ján Šturc Zima 208 Position of a Parser in the Compiler Model 2 The parser The task of the parser is to check syntax The syntax-directed translation stage in the compiler s front-end
Moving from CS 61A Scheme to CS 61B Java
Moving from CS 61A Scheme to CS 61B Java Introduction Java is an object-oriented language. This document describes some of the differences between object-oriented programming in Scheme (which we hope you
Programming Languages
Programming Languages Qing Yi Course web site: www.cs.utsa.edu/~qingyi/cs3723 cs3723 1 A little about myself Qing Yi Ph.D. Rice University, USA. Assistant Professor, Department of Computer Science Office:
5.1 Database Schema. 5.1.1 Schema Generation in SQL
5.1 Database Schema The database schema is the complete model of the structure of the application domain (here: relational schema): relations names of attributes domains of attributes keys additional constraints
Chapter 1: Introduction
Chapter 1: Introduction Database System Concepts, 5th Ed. See www.db book.com for conditions on re use Chapter 1: Introduction Purpose of Database Systems View of Data Database Languages Relational Databases
Hybrid Strategies. for better products and shorter time-to-market
Hybrid Strategies for better products and shorter time-to-market Background Manufacturer of language technology software & services Spin-off of the research center of Germany/Heidelberg Founded in 1999,
Symbiosis of Evolutionary Techniques and Statistical Natural Language Processing
1 Symbiosis of Evolutionary Techniques and Statistical Natural Language Processing Lourdes Araujo Dpto. Sistemas Informáticos y Programación, Univ. Complutense, Madrid 28040, SPAIN (email: [email protected])
Moving Enterprise Applications into VoiceXML. May 2002
Moving Enterprise Applications into VoiceXML May 2002 ViaFone Overview ViaFone connects mobile employees to to enterprise systems to to improve overall business performance. Enterprise Application Focus;
Textual Modeling Languages
Textual Modeling Languages Slides 4-31 and 38-40 of this lecture are reused from the Model Engineering course at TU Vienna with the kind permission of Prof. Gerti Kappel (head of the Business Informatics
DESIGN OF A WORKFLOW SYSTEM TO IMPROVE DATA QUALITY USING ORACLE WAREHOUSE BUILDER 1
DESIGN OF A WORKFLOW SYSTEM TO IMPROVE DATA QUALITY USING ORACLE WAREHOUSE BUILDER 1 Esther BOROWSKI Institut für Informatik, Freie Universität Berlin, Germany E-mail: [email protected] Hans-J.
Functional Programming
FP 2005 1.1 3 Functional Programming WOLFRAM KAHL [email protected] Department of Computing and Software McMaster University FP 2005 1.2 4 What Kinds of Programming Languages are There? Imperative telling
1 File Processing Systems
COMP 378 Database Systems Notes for Chapter 1 of Database System Concepts Introduction A database management system (DBMS) is a collection of data and an integrated set of programs that access that data.
LESSON THIRTEEN STRUCTURAL AMBIGUITY. Structural ambiguity is also referred to as syntactic ambiguity or grammatical ambiguity.
LESSON THIRTEEN STRUCTURAL AMBIGUITY Structural ambiguity is also referred to as syntactic ambiguity or grammatical ambiguity. Structural or syntactic ambiguity, occurs when a phrase, clause or sentence
Lesson 8: Introduction to Databases E-R Data Modeling
Lesson 8: Introduction to Databases E-R Data Modeling Contents Introduction to Databases Abstraction, Schemas, and Views Data Models Database Management System (DBMS) Components Entity Relationship Data
The University of Jordan
The University of Jordan Master in Web Intelligence Non Thesis Department of Business Information Technology King Abdullah II School for Information Technology The University of Jordan 1 STUDY PLAN MASTER'S
Modern Natural Language Interfaces to Databases: Composing Statistical Parsing with Semantic Tractability
Modern Natural Language Interfaces to Databases: Composing Statistical Parsing with Semantic Tractability Ana-Maria Popescu Alex Armanasu Oren Etzioni University of Washington David Ko {amp, alexarm, etzioni,
Genetic programming with regular expressions
Genetic programming with regular expressions Børge Svingen Chief Technology Officer, Open AdExchange [email protected] 2009-03-23 Pattern discovery Pattern discovery: Recognizing patterns that characterize
Glossary of Object Oriented Terms
Appendix E Glossary of Object Oriented Terms abstract class: A class primarily intended to define an instance, but can not be instantiated without additional methods. abstract data type: An abstraction
