Modern Natural Language Interfaces to Databases: Composing Statistical Parsing with Semantic Tractability
|
|
|
- Lesley Brown
- 10 years ago
- Views:
Transcription
1 Modern Natural Language Interfaces to Databases: Composing Statistical Parsing with Semantic Tractability Ana-Maria Popescu Alex Armanasu Oren Etzioni University of Washington David Ko {amp, alexarm, etzioni, daveko, Alexander Yates Abstract Natural Language Interfaces to Databases (NLIs) can benefit from the advances in statistical parsing over the last fifteen years or so. However, statistical parsers require training on a massive, labeled corpus, and manually creating such a corpus for each database is prohibitively expensive. To address this quandary, this paper reports on the PRECISE NLI, which uses a statistical parser as a plug in. The paper shows how a strong semantic model coupled with light re-training enables PRECISE to overcome parser errors, and correctly map from parsed questions to the corresponding SQL queries. We discuss the issues in using statistical parsers to build database-independent NLIs, and report on experimental results with the benchmark ATIS data set where PRECISE achieves 94% accuracy. 1 Introduction and Motivation Over the last fifteen years or so, much of the NLP community has focused on the use of statistical and machine learning techniques to solve a wide range of problems in parsing, machine translation, and more. Yet, classical problems such as building Natural Language Interfaces to Databases (NLIs) (Grosz et al., 1987) are far from solved. There are many reasons for the limited success of past NLI efforts (Androutsopoulos et al., 1995). We highlight several problems that are remedied by our approach. First, manually authoring and tuning a semantic grammar for each new database is brittle and prohibitively expensive. In response, we have implemented a transportable NLI that aims to minimize manual, database-specific configuration. Second, NLI systems built in the 70s and 80s had limited syntactic parsing capabilities. Thus, we have an opportunity to incorporate the important advances made by statistical parsers over the last two decades in an NLI. However, attempting to use a statistical parser in a database-independent NLI leads to a quandary. On the one hand, to parse questions posed to a particular database, the parser has to be trained on a corpus of questions specific to that database. Otherwise, many of the parser s decisions will be incorrect. For example, the Charniak parser (trained on the 40,000 sentences in the WSJ portion of the Penn Treebank) treats list as a noun, but in the context of the ATIS database it is a verb. 1 On the other hand, manually creating and labeling a massive corpus of questions for each database is prohibitively expensive. We consider two methods of resolving this quandary and assess their performance individually and in concert on the ATIS data set. First, we use a strong semantic model to correct parsing errors. We introduce a theoretical framework for discriminating between Semantically Tractable (ST) questions and difficult ones, and we show that ST questions are prevalent in the well-studied ATIS data set (Price, 1990). Thus, we show that the semantic component of the NLI task can be surprisingly easy and can be used to compensate for syntactic parsing errors. Second, we re-train the parser using a relatively small set of 150 questions, where each word is labeled by its part-of-speech tag. To demonstrate how these methods work in practice, we sketch the fully-implemented PRECISE NLI, where a parser is a modular plug in. This modularity enables PRECISE to leverage continuing advances in parsing technology over time by plugging in improved parsers as they become available. The remainder of this paper is organized as follows. We describe PRECISE in Section 2, sketch our theory in Section 3, and report on our experiments in Section 4. We consider related work in Section 5, and conclude in Section 6. 2 The PRECISE System Overview Our recent paper (Popescu et al., 2003) introduced the PRECISE architecture and its core algorithm for 1 This is an instance of a well known machine learning principle typically, a learning algorithm is effective when its test examples are drawn from roughly the same distribution as its training examples.
2 reducing semantic interpretation to a graph matching problem that is solved by MaxFlow. In this section we provide a brief overview of PRECISE, focusing on the components necessary to understanding its performance on the ATIS data set in Section 4. To discuss PRECISE further, we must first introduce some terminology. We say that a database is made up of three types of elements: relations, attributes and values. Each element is unique: an attribute element is a particular column in a particular relation and each value element is the value of a particular attribute. A value is compatible with its attribute and also with the relation containing this attribute. An attribute is compatible with its relation. Each attribute in the database has associated with it a special value, which we call a wh-value, that corresponds to a wh-word (what, where, etc.). We define a lexicon as a tuple (T, E, M), where T is a set of strings, called tokens (intuitively, tokens are strings of one or more words, like New York ); E is a set of database elements, wh-values, and join paths; 2 and M is a subset of T E a binary relation between tokens and database elements. PRECISE takes as input a lexicon and a parser. Then, given an English question, PRECISE maps it to one (or more) corresponding SQL queries. We concisely review how PRECISE works through a simple example. Consider the following question q: What are the flights from Boston to Chicago? First, the parser plug-in automatically derives a dependency analysis for q from q s parse tree, represented by the following compact syntactic logical form: LF (q) = what(0), is(0, 1), f light(1), f rom(1, 2), boston(2), to(1, 3), chicago(3). LF (q) contains a predicate for each question word. Head nouns correspond to unary predicates whose arguments are constant identifiers. Dependencies are encoded by equality constraints between arguments to different predicates. The first type of dependency is represented by noun and adjective pre-modifiers corresponding to unary predicates whose arguments are the identifiers for the respective modified head nouns. A second type of dependency is represented by noun postmodifiers and mediated by prepositions (in the above example, from and to ). The prepositions correspond to binary predicates whose arguments specify the attached noun phrases. For instance, from attaches flight to boston. Finally, subject/predicate, predicate/direct object and predicate/indirect object dependency information is computed for the various 2 A join path is a set of equality constraints between the attributes of two or more tables. See Section 3 for more details and a formal definition. verbs present in the question. Verbs correspond to binary or tertiary predicates whose arguments indicate what noun phrases play the subject and object roles. In our example, the verb is mediates the dependency between what and flight. 3 PRECISE s lexicon is generated by automatically extracting value, attribute, and relation names from the database. We manually augmented the lexicon with relevant synonyms, prepositions, etc.. The tokenizer produces a single complete tokenization of this question and lemmatizes the tokens: (what, is, flight, from, boston, to, chicago). By looking up the tokens in the lexicon, PRECISE efficiently retrieves the set of potentially matching database elements for every token. In this case, what, boston and chicago are value tokens, to and from are attribute tokens and flight is a relation token. In addition to this information, the lexicon also contains a set of restrictions for tokens that are prepositions or verbs. The restrictions specify the database elements that are allowed to match to the arguments of the respective preposition or verb. For example, from can take as arguments a flight and a city. The restrictions also specify the join paths connecting these relations/attributes. The syntactic logical form is used to retrieve the relevant set of restrictions for a given question. The matcher takes as input the information described above and reduces the problem of satisfying the semantic constraints imposed by the definition of a valid interpretation to a graph matching problem (Popescu et al., 2003). In order for each attribute token to match a value token, Boston and Chicago map to the respective values of the database attribute city.cityname, from maps to flight.fromairport or fare.fromairport and to maps to flight.toairport or fare.toairport. The restrictions validate the output of the matcher and are then used in combination with the syntactic information to narrow down even further the possible interpretations for each token by enforcing local dependencies. For example, the syntactic information tells us that from refers to flight and since flight uniquely maps to flight, this means that from will map to flight.fromairport rather than fare.fromairport (similarly, to maps to flight.toairport and what maps to flight.flightid). Finally, the matcher compiles a list of all relations satisfying all the clauses in the syntactic logical form using each constant and narrows down the set 3 PRECISE uses a larger set of constraints on dependency relations, but for brevity, we focus on those relevant to our examples.
3 of possible interpretations for each token accordingly. Each set of (constant, corresponding database element) pairs represents a semantic logical form. The query generator takes each semantic logical form and uses the join path information available in the restrictions to form the final SQL queries corresponding to each semantic interpretation. S VP PP pronoun verb noun prep noun prep noun prep noun WhatareflightsfromBostontoChicagoonMonday? Figure 1: Example of an erroneous parse tree corrected by PRECISE s semantic over-rides. PRECISE detects that the parser attached the PP on Monday to Chicago in error. PRECISE attempts to re-attach on Monday first to the PP to Chicago, and then to the flights from Boston to Chicago, where it belongs. 2.1 Parser Enhancements We used the Charniak parser (Charniak, 2000) for the experiments reported in this paper. We found that the Charniak parser, which was trained on the WSJ corpus, yielded numerous syntactic errors. Our first step was to hand tag a set of 150 questions with Part Of Speech (POS) tags, and re-train the parser s POS tagger. As a result, the probabilities associated with certain tags changed dramatically. For example, initially, list was consistently tagged as a noun, but after re-training it was consistently labeled as a verb. This change occurs because, in the ATIS domain, list typically occurs in imperative sentences, such as List all flights. Focusing exclusively on the tagger drastically reduced the amount of data necessary for re-training. Whereas the Charniak parser was originally trained on close to 40,000 sentences, we only required 150 sentences for re-training. Unfortunately, the retrained parser still made errors when solving difficult syntactic problems, most notably preposition attachment and preposition ellipsis. PRECISE corrects both types of errors using semantic information. PP PP We refer to PRECISE s use of semantic information to correct parser errors as semantic over-rides. Specifically, PRECISE detects that an attachment decision made by the parser is inconsistent with the semantic information in its lexicon. 4 When this occurs, PRECISE attempts to repair the parse tree as follows. Given a noun phrase or a prepositional phrase whose corresponding node n in the parse tree has the wrong parent p, PRECISE traverses the path in the parse tree from p to the root node, searching for a suitable node to attach n to. PRECISE chooses the first ancestor of p such that when n is attached to the new node, the modified parse tree agrees with PRECISE s semantic model. Thus, the semantic over-ride procedure is a generate-and-test search where potential solutions are generated in the order of ancestors of node n in the parse tree. The procedure s running time is linear in the depth of the parse tree. Consider, for example, the question What are flights from Boston to Chicago on Monday? The parser attaches the prepositional phrase on Monday to Chicago whereas it should be attached to flights (see Figure 1). The parser merely knows that flights, Boston, and Chicago are nouns. It then uses statistics to decide that on Monday is most likely to attach to Chicago. However, this syntactic decision is inconsistent with the semantic information in PRECISE s lexicon the preposition on does not take a city and a day as arguments, rather it takes a flight and a day. Thus, PRECISE decides to over-ride the parser and attach on elsewhere. As shown in Figure 1, PRECISE detects that the parser attached the PP on Monday to Chicago in error. PRECISE attempts to re-attach on Monday first to the PP to Chicago, and then to the flights from Boston to Chicago, where it belongs. While in our example the parser violated a constraint in PRECISE s lexicon, the violation of any semantic constraint will trigger the over-ride procedure. In the above example, we saw how semantic overrides help PRECISE fix prepositional attachment errors; they also enable it to correct parser errors in topicalized questions (e.g., What are Boston to Chicago flights? ) and in preposition ellipsis (e.g., when on is omitted in the question What are flights from Boston to Chicago Monday? ). Unfortunately, semantic over-rides do not correct all of the parser s errors. Most of the remaining parser errors fall into the following categories: relative clause attachment, verb attachment, numeric 4 We say that node n is attached to node p if p is the parent of n in the parse tree.
4 noun phrases, and topicalized prepositional phrases. In general, semantic over-rides can correct local attachment errors, but cannot over-come more global problems in the parse tree. Thus, PRECISE can be forced to give up and ask the user to paraphrase her question. 3 PRECISE Theory The aim of this section is to explain the theoretical under-pinnings of PRECISE s semantic model. We show that PRECISE always answers questions from the class of Semantically Tractable (ST) questions correctly, given correct lexical and syntactic information. 5 We begin by introducing some terminology that builds on the definitions given Section Definitions A join path is a set of equality constraints between a sequence of database relations. More formally, a join path for relations R 1,..., R n is a set of constraints C {R i.a = R i+1.b 1 i n 1}. Here the notation R i.a refers to the value of attribute a in relation R i. We say a relation between token set T and a set of database elements and join paths E respects a lexicon L if it is a subset of M. A question is simply a string of characters. A tokenization of a question (with respect to a lexicon) is an ordered set of strings such that each element of the tokenization is an element of the lexicon s token set, and the concatenation of the elements of the tokenization, in order, is equal to the original question. For a given lexicon and question, there may be zero, one, or several tokenizations. Any question that has at least one tokenization is tokenizable. An attachment function is a function F L,q : T T, where L is the lexicon, q is a question, and T is the set of tokens in the lexicon. The attachment function is meant to represent dependency information available to PRECISE through a parser. For example, if a question includes the phrase restaurants in Seattle, the attachment function would attach Seattle to restaurants for this question. Not all tokens are attached to something in every question, so the attachment function is not a total function. We say that a relation R between tokens in a question q respects the attachment function if t 1, t 2, R(t 1, t 2 ) (F L,q (t 1 ) = t 2 ) (F L,q does not take on a value for t 1 ). 5 We do not claim that NLI users will restrict their questions to the ST subset of English in practice, but rather that identifying classes of questions as semantically tractable (or not), and experimentally measuring the prevalence of such questions, is a worthwhile avenue for NLI research. In an NLI, interpretations of a question are SQL statements. We define a valid interpretation of a question as being an SQL statement that satisfies a number of conditions connecting it to the tokens in the question. Because of space constraints, we provide only one such constraint as an example: There exists a tokenization t of the question and a set of database elements E such that there is a one-to-one map from t to E respecting the lexicon, and for each value element v E, there is exactly one equality constraint in the SQL clause that uses v. For a complete definition of a valid interpretation, see (Popescu et al., 2003). 3.2 Semantic Tractability Model In this section we formally define the class of ST questions, and show that PRECISE can provably map such questions to the corresponding SQL queries. Intuitively, ST questions are easy to understand questions where the words or phrases correspond to database elements or constraints on join paths. Examining multiple questions sets and databases, we have found that nouns, adjectives, and adverbs in easy questions refer to database relations, attributes, or values. Moreover, the attributes and values in a question pair up naturally to indicate equality constraints in SQL. However, values may be paired with implicit attributes that do not appear in the question (e.g., the attribute cuisine in What are the Chinese restaurants in Seattle? is implicit). Interestingly, there is no notion of implicit value the question What are restaurants with cuisine in Seattle? does not make sense. A preposition indicates a join between the relations corresponding to the arguments of the preposition. For example, consider the preposition from in the question what airlines fly from Boston to Chicago? from connects the value Boston (in the relation cities ) to the relation airlines. Thus, we know that the corresponding SQL query will join airlines and cities. We formalize these observations about questions below. We say that a question q is semantically tractable using lexicon L and attachment function F L,q if: 1. It is possible to split q up into words and phrases found in L. (More formally, q is tokenizable according to L.) 2. While words may have multiple meanings in the lexicon, it must be possible to find a oneto-one correspondence between tokens in the question and some set of database elements.
5 (More formally, there exists a tokenization t and a set of database elements and join paths E t such that there is a bijective function f from t to E t that respects L.) 3. There is at least one such set E t that has exactly one wh-value. 4. It is possible to add implicit attributes to E t to get a set E t with exactly one compatible attribute for every value. (More formally, for some E t with a wh-value there exist attributes a 1,..., a n such that E t = E t {a 1,..., a n } and there is a bijective function g from the set of value elements (including wh-values) V to the set of attribute elements A in E t.) 5. At least one such E t obeys the syntactic restrictions of F L,q. (More formally, let A = A E t. Then we require that {(f 1 (g 1 (a)), f 1 (a)) a A } respects F L,q.) 3.3 Results and Discussion We say that an NLI is sound for a class of questions Q using lexicon L and attachment function F L if for every input q Q, every output of the NLI is a valid interpretation. We say the NLI is complete if it returns all valid interpretations. Our main result is the following: Theorem 1 Given a lexicon L and attachment function F L, PRECISE is sound and complete for the class of semantically tractable questions. In practical terms, the theorem states that given correct and complete syntactic and lexical information, PRECISE will return exactly the set of valid interpretations of a question. If PRECISE is missing syntactic or semantic constraints, it can generate extraneous interpretations that it believes are valid. Also, if a person uses a term in a manner inconsistent with PRECISE s lexicon, then PRECISE will interpret her question incorrectly. Finally, PRECISE will not answer a question that contains words absent from its lexicon. The theorem is clearly an idealization, but the experiments reported in Section 4 provide evidence that it is a useful idealization. PRECISE, which embodies the model of semantic tractability, achieves very high accuracy because in practice it either has correct and complete lexical and syntactic information or it has enough semantic information to compensate for its imperfect inputs. In fact, as we explained in Section 2.1, PRECISE s semantic model enables it to correct parser errors in some cases. Finding all the valid interpretations for a question is computationally expensive in the worst case (even just tokenizing a question is -complete (Popescu et al., 2003)). Moreover, if the various syntactic and semantic constraints are fed to a standard constraint solver, then the problem of finding even a single valid interpretation is exponential in the worst case. However, we have been able to formulate PRECISE s constraint satisfaction problem as a graph matching problem that is solved in polynomial time by the MaxFlow algorithm: Theorem 2 For lexicon L, PRECISE finds one valid interpretation for a tokenization T of a semantically tractable question in time O(Mn 2 ), where n is the number of tokens in T and M is the maximum number of interpretations that a token can have in L. 4 Experimental Evaluation Semantic Tractability (ST) theory and PRECISE s architecture raise a four empirical questions that we now address via experiments on the ATIS data set (Price, 1990): how prevalent are ST questions? How effective is PRECISE in mapping ATIS questions to SQL queries? What is the impact of semantic over-rides? What is the impact of parser retraining? Our experiments utilized the 448 contextindependent questions in the ATIS Scoring Set A. We chose the ATIS data set because it is a standard benchmark (see Table 2) where independently generated questions are available to test the efficacy of an NLI. We found that 95.8% of the ATIS questions were ST questions. We classified each question as ST (or not) by running PRECISE on the question and System Setup PRECISE PRECISE-1 P arser ORIG 61.9% 60.3% P arser ORIG % 85.5% P arser T RAINED 92.4% 88.2% P arser T RAINED % 89.2% P arser CORRECT 95.8% 91.9% Table 1: Impact of Parser Enhancements. The PRECISE column records the percentage of questions where the small set of SQL queries returned by PRECISE contains the correct query; PRECISE-1 refers to the questions correctly interpreted if PRECISE is forced to return exactly one SQL query. P arser ORIG is the original version of the parser, P arser T RAINED is the version re-trained for the ATIS domain, and P arser CORRECT is the version whose output is corrected manually. System configurations marked by + indicate the automatic use of semantic over-rides to correct parser errors.
6 PRECISE PRECISE-1 AT&T CMU MIT SRI BBN UNISYS MITRE HEY 94.0% 89.1% 96.2% 96.2% 95.5% 93% 90.6% 76.4% 69.4% 92.5% Table 2: Accuracy Comparison between PRECISE, PRECISE-1 and the major ATIS NLIs. Only PRECISE and the HEY NLI are database independent. All results are for performance on the context-independent questions in ATIS. recording its response. Intractable questions were due to PRECISE s incomplete semantic information. Consider, for example, the ATIS request List flights from Oakland to Salt Lake City leaving after midnight Thursday. PRECISE fails to answer this question because it lacks a model of time, and so cannot infer that after midnight Thursday means early Friday morning. In addition, we found that the prevalence of ST questions in the ATIS data is consistent with our earlier results on the set of 1,800 natural language questions compiled by Ray Mooney in his experiments in three domains (Tang and Mooney, 2001). As reported in (Popescu et al., 2003), we found that approximately 80% of Mooney s questions were ST. PRECISE performance on the ATIS data was also comparable to its performance on the Mooney data sets. Table 1 quantifies the impact of the parser enhancements discussed in Section 2.1. Since PRE- CISE can return multiple distinct SQL queries when it judges a question to be ambiguous, we report its results in two columns. The left column (PRECISE) records the percentage of questions where the set of returned SQL queries contains the correct query. The right column (PRECISE-1) records the percentage of questions where PRECISE is correct if it is forced to return exactly one query per question. In our experiments, PRECISE returned a single query 92.4% of the time, and returned two queries the rest of the time. Thus, the difference between the two columns is not great. Initially, plugging the Charniak parser into PRE- CISE yielded only 61.9% accuracy. Introducing semantic over-rides to correct prepositional attachment and preposition ellipsis errors increased PRE- CISE s accuracy to 89.7% the parser s erroneous POS tags still led PRECISE astray in some cases. After re-training the parser on 150 POS-tagged ATIS questions, but without utilizing semantic overrides, PRECISE achieved 92.4% accuracy. Combining both re-training and semantic over-rides, PRE- CISE achieved 94.0% accuracy. This accuracy is close to the maximum that PRECISE can achieve, given its incomplete semantic information we found that, when all parsing errors are corrected by hand, PRECISE s accuracy is 95.8%. To assess PRECISE s performance, we compared it with previous work. Table 2 shows PRECISE s accuracy compared with the most successful ATIS NLIs (Minker, 1998). We also include, for comparison, the more recent database-independent HEY system (He and Young, 2003). All systems were compared on the ATIS scoring set A, but we did clean the questions by introducing sentence breaks, removing verbal errors, etc.. Since we could add modules to PRECISE to automatically handle these various cases, we don t view this as significant. Given the database-specific nature of most previous ATIS systems, it is remarkable that PRECISE is able to achieve comparable accuracy. PRECISE does return two interpretations a small percentage of the time. However, even when restricted to returning a single interpretation, PRECISE-1 still achieved an impressive 89.1% accuracy (Table 1). 5 Related Work We discuss related work in three categories: Database-independent NLIs, ATIS-specific NLIs, and sublanguages. Database-independent NLIs There has been extensive previous work on NLIs (Androutsopoulos et al., 1995), but three key elements distinguish PRE- CISE. First, we introduce a model of ST questions and show that it produces provably correct interpretations of questions (subject to the assumptions of the model). We measure the prevalence of ST questions to demonstrate the practical import of our model. Second, we are the first to use a statistical parser as a plug in, experimentally measure its efficacy, and analyze the attendant challenges. Finally, we show how to leverage our semantic model to correct parser errors in difficult syntactic cases (e.g., prepositional attachment). A more detailed comparison of PRECISE with a wide range of NLI systems appears in (Popescu et al., 2003). The advances in this paper over our previous one include: reformulation of ST THEORY, the parser retraining, semantic over-rides, and the experiments testing PRECISE on the ATIS data. ATIS NLIs The typical ATIS NLIs used either domain-specific semantic grammars (Seneff, 1992; Ward and Issar, 1996) or stochastic models that required fully annotated domain-specific corpora for reliable parameter estimation (Levin and Pieraccini, 1995). In contrast, since it uses its model of semantically tractable questions, PRECISE does not
7 require heavy manual processing and only a small number of annotated questions. In addition, PRE- CISE leverages existing domain-independent parsing technology and offers theoretical guarantees absent from other work. Improved versions of ATIS systems such as Gemini (Moore et al., 1995) increased their coverage by allowing an approximate question interpretation to be computed from the meanings of some question fragments. Since PRE- CISE focuses on high precision rather than recall, we analyze every word in the question and interpret the question as a whole. Most recently, (He and Young, 2003) introduced the HEY system, which learns a semantic parser without requiring fully-annotated corpora. HEY uses a hierarchical semantic parser that is trained on a set of questions together with their corresponding SQL queries. HEY is similar to (Tang and Mooney, 2001). Both learning systems require a large set of questions labeled by their SQL queries an expensive input that PRECISE does not require and, unlike PRECISE, both systems cannot leverage continuing improvements to statistical parsers. Sublanguages The early work with the most similarities to PRECISE was done in the field of sublanguages. Traditional sublanguage work (Kittredge, 1982) has looked at defining sublanguages for various domains, while more recent work (Grishman, 2001; Sekine, 1994) suggests using AI techniques to learn aspects of sublanguages automatically. Our work can be viewed as a generalization of traditional sublanguage research. We restrict ourselves to the semantically tractable subset of English rather than to a particular knowledge domain. Finally, in addition to offering formal guarantees, we assess the prevalence of our sublanguage in the ATIS data. 6 Conclusion This paper is the first to provide evidence that statistical parsers can support NLIs such as PRECISE. We identified the quandary associated with appropriately training a statistical parser: without special training for each database, the parser makes numerous errors, but creating a massive, labeled corpus of questions for each database is prohibitively expensive. We solved this quandary via light re-training of the parser s tagger and via PRECISE s semantic over-rides, and showed that in concert these methods enable PRECISE to rise from 61.9% accuracy to 94% accuracy on the ATIS data set. Even though PRECISE is database independent, its accuracy is comparable to the best of the database-specific ATIS NLIs developed in previous work (Table 2). References I. Androutsopoulos, G. D. Ritchie, and P. Thanisch Natural Language Interfaces to Databases - An Introduction. In Natural Language Engineering, vol 1, part 1, pages E. Charniak A Maximum-Entropy-Inspired Parser. In Proc. of NAACL R. Grishman Adaptive information extraction and sublanguage analysis. In Proc. of IJCAI B.J. Grosz, D. Appelt, P. Martin, and F. Pereira TEAM: An Experiment in the Design of Transportable Natural Language Interfaces. In Artificial Intelligence 32, pages Y. He and S. Young A data-driven spoken language understanding system. In IEEE Workshop on Automatic Speech Recognition and Understanding. R. Kittredge Variation and homogeneity of sublanguages. In R. Kittredge and J. Lehrberger, editors, Sublanguage: Studies of Language in Restricted Semantic Domains, pages de Gruyter, Berlin. E. Levin and R. Pieraccini Chronus, the next generation. In Proc. of the DARPA Speech and Natural Language Workshop, pages W. Minker Evaluation methodologies for interactive speech systems. In First International Conference on Language Resources and Evaluation, pages R. Moore, D. Appelt, J. Dowding, J. M. Gawron, and D. Moran Combining linguistic and statistical knowledge sources in natural-language processing for atis. In Proc. of the ARPA Spoken Language Technology Workshop. A. Popescu, O. Etzioni, and H. Kautz Towards a theory of natural language interfaces to databases. In Proc. of IUI P. Price Evaluation of spoken language systems: the atis domain. In Proc. of the DARPA Speech and Natural Language Workshop, pages S. Sekine A New Direction For Sublanguage NLP. In Proc. of the International Conference on New Methods in Language Processing, pages S. Seneff Robust parsing for spoken language systems. In Proc. of the IEEE International Conference on Acoustics, Speech and Signal Processing. L.R. Tang and R.J. Mooney Using Multiple Clause Constructors in Inductive Logic Programming for Semantic Parsing. In Proc. of the 12th European Conference on Machine Learning (ECML- 2001), Freiburg, Germany, pages W. Ward and S. Issar Recent improvements in the cmu spoken language understanding system. In Proc. of the ARPA Human Language Technology Workshop, pages
Natural Language Database Interface for the Community Based Monitoring System *
Natural Language Database Interface for the Community Based Monitoring System * Krissanne Kaye Garcia, Ma. Angelica Lumain, Jose Antonio Wong, Jhovee Gerard Yap, Charibeth Cheng De La Salle University
Classification of Natural Language Interfaces to Databases based on the Architectures
Volume 1, No. 11, ISSN 2278-1080 The International Journal of Computer Science & Applications (TIJCSA) RESEARCH PAPER Available Online at http://www.journalofcomputerscience.com/ Classification of Natural
NATURAL LANGUAGE QUERY PROCESSING USING PROBABILISTIC CONTEXT FREE GRAMMAR
NATURAL LANGUAGE QUERY PROCESSING USING PROBABILISTIC CONTEXT FREE GRAMMAR Arati K. Deshpande 1 and Prakash. R. Devale 2 1 Student and 2 Professor & Head, Department of Information Technology, Bharati
Semantic Analysis of Natural Language Queries Using Domain Ontology for Information Access from Database
I.J. Intelligent Systems and Applications, 2013, 12, 81-90 Published Online November 2013 in MECS (http://www.mecs-press.org/) DOI: 10.5815/ijisa.2013.12.07 Semantic Analysis of Natural Language Queries
Interactive Dynamic Information Extraction
Interactive Dynamic Information Extraction Kathrin Eichler, Holmer Hemsen, Markus Löckelt, Günter Neumann, and Norbert Reithinger Deutsches Forschungszentrum für Künstliche Intelligenz - DFKI, 66123 Saarbrücken
Building a Question Classifier for a TREC-Style Question Answering System
Building a Question Classifier for a TREC-Style Question Answering System Richard May & Ari Steinberg Topic: Question Classification We define Question Classification (QC) here to be the task that, given
Open Domain Information Extraction. Günter Neumann, DFKI, 2012
Open Domain Information Extraction Günter Neumann, DFKI, 2012 Improving TextRunner Wu and Weld (2010) Open Information Extraction using Wikipedia, ACL 2010 Fader et al. (2011) Identifying Relations for
Natural Language to Relational Query by Using Parsing Compiler
Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 4, Issue. 3, March 2015,
International Journal of Advance Foundation and Research in Science and Engineering (IJAFRSE) Volume 1, Issue 1, June 2014.
A Comprehensive Study of Natural Language Interface To Database Rajender Kumar*, Manish Kumar. NIT Kurukshetra [email protected] *, [email protected] A B S T R A C T Persons with no knowledge
Generating SQL Queries Using Natural Language Syntactic Dependencies and Metadata
Generating SQL Queries Using Natural Language Syntactic Dependencies and Metadata Alessandra Giordani and Alessandro Moschitti Department of Computer Science and Engineering University of Trento Via Sommarive
S. Aquter Babu 1 Dr. C. Lokanatha Reddy 2
Model-Based Architecture for Building Natural Language Interface to Oracle Database S. Aquter Babu 1 Dr. C. Lokanatha Reddy 2 1 Assistant Professor, Dept. of Computer Science, Dravidian University, Kuppam,
ABSTRACT 2. SYSTEM OVERVIEW 1. INTRODUCTION. 2.1 Speech Recognition
The CU Communicator: An Architecture for Dialogue Systems 1 Bryan Pellom, Wayne Ward, Sameer Pradhan Center for Spoken Language Research University of Colorado, Boulder Boulder, Colorado 80309-0594, USA
International Journal of Scientific & Engineering Research, Volume 4, Issue 11, November-2013 5 ISSN 2229-5518
International Journal of Scientific & Engineering Research, Volume 4, Issue 11, November-2013 5 INTELLIGENT MULTIDIMENSIONAL DATABASE INTERFACE Mona Gharib Mohamed Reda Zahraa E. Mohamed Faculty of Science,
How the Computer Translates. Svetlana Sokolova President and CEO of PROMT, PhD.
Svetlana Sokolova President and CEO of PROMT, PhD. How the Computer Translates Machine translation is a special field of computer application where almost everyone believes that he/she is a specialist.
Tibetan-Chinese Bilingual Sentences Alignment Method based on Multiple Features
, pp.273-280 http://dx.doi.org/10.14257/ijdta.2015.8.4.27 Tibetan-Chinese Bilingual Sentences Alignment Method based on Multiple Features Lirong Qiu School of Information Engineering, MinzuUniversity of
Computer Standards & Interfaces
Computer Standards & Interfaces 35 (2013) 470 481 Contents lists available at SciVerse ScienceDirect Computer Standards & Interfaces journal homepage: www.elsevier.com/locate/csi How to make a natural
Object-Relational Database Based Category Data Model for Natural Language Interface to Database
Object-Relational Database Based Category Data Model for Natural Language Interface to Database Avinash J. Agrawal Shri Ramdeobaba Kamla Nehru Engineering College Nagpur-440013 (INDIA) Ph. No. 91+ 9422830245
Pattern based approach for Natural Language Interface to Database
RESEARCH ARTICLE OPEN ACCESS Pattern based approach for Natural Language Interface to Database Niket Choudhary*, Sonal Gore** *(Department of Computer Engineering, Pimpri-Chinchwad College of Engineering,
Accelerating and Evaluation of Syntactic Parsing in Natural Language Question Answering Systems
Accelerating and Evaluation of Syntactic Parsing in Natural Language Question Answering Systems cation systems. For example, NLP could be used in Question Answering (QA) systems to understand users natural
Overview of MT techniques. Malek Boualem (FT)
Overview of MT techniques Malek Boualem (FT) This section presents an standard overview of general aspects related to machine translation with a description of different techniques: bilingual, transfer,
Learning Translation Rules from Bilingual English Filipino Corpus
Proceedings of PACLIC 19, the 19 th Asia-Pacific Conference on Language, Information and Computation. Learning Translation s from Bilingual English Filipino Corpus Michelle Wendy Tan, Raymond Joseph Ang,
Special Topics in Computer Science
Special Topics in Computer Science NLP in a Nutshell CS492B Spring Semester 2009 Jong C. Park Computer Science Department Korea Advanced Institute of Science and Technology INTRODUCTION Jong C. Park, CS
Architecture of an Ontology-Based Domain- Specific Natural Language Question Answering System
Architecture of an Ontology-Based Domain- Specific Natural Language Question Answering System Athira P. M., Sreeja M. and P. C. Reghuraj Department of Computer Science and Engineering, Government Engineering
Automatic Text Analysis Using Drupal
Automatic Text Analysis Using Drupal By Herman Chai Computer Engineering California Polytechnic State University, San Luis Obispo Advised by Dr. Foaad Khosmood June 14, 2013 Abstract Natural language processing
Natural Language Web Interface for Database (NLWIDB)
Rukshan Alexander (1), Prashanthi Rukshan (2) and Sinnathamby Mahesan (3) Natural Language Web Interface for Database (NLWIDB) (1) Faculty of Business Studies, Vavuniya Campus, University of Jaffna, Park
PoS-tagging Italian texts with CORISTagger
PoS-tagging Italian texts with CORISTagger Fabio Tamburini DSLO, University of Bologna, Italy [email protected] Abstract. This paper presents an evolution of CORISTagger [1], an high-performance
NATURAL LANGUAGE QUERY PROCESSING USING SEMANTIC GRAMMAR
NATURAL LANGUAGE QUERY PROCESSING USING SEMANTIC GRAMMAR 1 Gauri Rao, 2 Chanchal Agarwal, 3 Snehal Chaudhry, 4 Nikita Kulkarni,, 5 Dr. S.H. Patil 1 Lecturer department o f Computer Engineering BVUCOE,
Department of Computer Science and Engineering, Kurukshetra Institute of Technology &Management, Haryana, India
Volume 5, Issue 4, 2015 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com A Survey of Natural
Word Completion and Prediction in Hebrew
Experiments with Language Models for בס"ד Word Completion and Prediction in Hebrew 1 Yaakov HaCohen-Kerner, Asaf Applebaum, Jacob Bitterman Department of Computer Science Jerusalem College of Technology
Shallow Parsing with Apache UIMA
Shallow Parsing with Apache UIMA Graham Wilcock University of Helsinki Finland [email protected] Abstract Apache UIMA (Unstructured Information Management Architecture) is a framework for linguistic
Semantic Mapping Between Natural Language Questions and SQL Queries via Syntactic Pairing
Semantic Mapping Between Natural Language Questions and SQL Queries via Syntactic Pairing Alessandra Giordani and Alessandro Moschitti Department of Computer Science and Engineering University of Trento
Customizing an English-Korean Machine Translation System for Patent Translation *
Customizing an English-Korean Machine Translation System for Patent Translation * Sung-Kwon Choi, Young-Gil Kim Natural Language Processing Team, Electronics and Telecommunications Research Institute,
Brill s rule-based PoS tagger
Beáta Megyesi Department of Linguistics University of Stockholm Extract from D-level thesis (section 3) Brill s rule-based PoS tagger Beáta Megyesi Eric Brill introduced a PoS tagger in 1992 that was based
Domain Knowledge Extracting in a Chinese Natural Language Interface to Databases: NChiql
Domain Knowledge Extracting in a Chinese Natural Language Interface to Databases: NChiql Xiaofeng Meng 1,2, Yong Zhou 1, and Shan Wang 1 1 College of Information, Renmin University of China, Beijing 100872
CINTIL-PropBank. CINTIL-PropBank Sub-corpus id Sentences Tokens Domain Sentences for regression atsts 779 5,654 Test
CINTIL-PropBank I. Basic Information 1.1. Corpus information The CINTIL-PropBank (Branco et al., 2012) is a set of sentences annotated with their constituency structure and semantic role tags, composed
Overview of the TACITUS Project
Overview of the TACITUS Project Jerry R. Hobbs Artificial Intelligence Center SRI International 1 Aims of the Project The specific aim of the TACITUS project is to develop interpretation processes for
Chapter 8. Final Results on Dutch Senseval-2 Test Data
Chapter 8 Final Results on Dutch Senseval-2 Test Data The general idea of testing is to assess how well a given model works and that can only be done properly on data that has not been seen before. Supervised
Statistical Machine Translation
Statistical Machine Translation Some of the content of this lecture is taken from previous lectures and presentations given by Philipp Koehn and Andy Way. Dr. Jennifer Foster National Centre for Language
Enforcing Data Quality Rules for a Synchronized VM Log Audit Environment Using Transformation Mapping Techniques
Enforcing Data Quality Rules for a Synchronized VM Log Audit Environment Using Transformation Mapping Techniques Sean Thorpe 1, Indrajit Ray 2, and Tyrone Grandison 3 1 Faculty of Engineering and Computing,
Symbiosis of Evolutionary Techniques and Statistical Natural Language Processing
1 Symbiosis of Evolutionary Techniques and Statistical Natural Language Processing Lourdes Araujo Dpto. Sistemas Informáticos y Programación, Univ. Complutense, Madrid 28040, SPAIN (email: [email protected])
A Case Study of Question Answering in Automatic Tourism Service Packaging
BULGARIAN ACADEMY OF SCIENCES CYBERNETICS AND INFORMATION TECHNOLOGIES Volume 13, Special Issue Sofia 2013 Print ISSN: 1311-9702; Online ISSN: 1314-4081 DOI: 10.2478/cait-2013-0045 A Case Study of Question
Testing LTL Formula Translation into Büchi Automata
Testing LTL Formula Translation into Büchi Automata Heikki Tauriainen and Keijo Heljanko Helsinki University of Technology, Laboratory for Theoretical Computer Science, P. O. Box 5400, FIN-02015 HUT, Finland
31 Case Studies: Java Natural Language Tools Available on the Web
31 Case Studies: Java Natural Language Tools Available on the Web Chapter Objectives Chapter Contents This chapter provides a number of sources for open source and free atural language understanding software
Clustering Connectionist and Statistical Language Processing
Clustering Connectionist and Statistical Language Processing Frank Keller [email protected] Computerlinguistik Universität des Saarlandes Clustering p.1/21 Overview clustering vs. classification supervised
Testing Data-Driven Learning Algorithms for PoS Tagging of Icelandic
Testing Data-Driven Learning Algorithms for PoS Tagging of Icelandic by Sigrún Helgadóttir Abstract This paper gives the results of an experiment concerned with training three different taggers on tagged
Domain Classification of Technical Terms Using the Web
Systems and Computers in Japan, Vol. 38, No. 14, 2007 Translated from Denshi Joho Tsushin Gakkai Ronbunshi, Vol. J89-D, No. 11, November 2006, pp. 2470 2482 Domain Classification of Technical Terms Using
Semantic annotation of requirements for automatic UML class diagram generation
www.ijcsi.org 259 Semantic annotation of requirements for automatic UML class diagram generation Soumaya Amdouni 1, Wahiba Ben Abdessalem Karaa 2 and Sondes Bouabid 3 1 University of tunis High Institute
DEPENDENCY PARSING JOAKIM NIVRE
DEPENDENCY PARSING JOAKIM NIVRE Contents 1. Dependency Trees 1 2. Arc-Factored Models 3 3. Online Learning 3 4. Eisner s Algorithm 4 5. Spanning Tree Parsing 6 References 7 A dependency parser analyzes
Paraphrasing controlled English texts
Paraphrasing controlled English texts Kaarel Kaljurand Institute of Computational Linguistics, University of Zurich [email protected] Abstract. We discuss paraphrasing controlled English texts, by defining
Automated Extraction of Security Policies from Natural-Language Software Documents
Automated Extraction of Security Policies from Natural-Language Software Documents Xusheng Xiao 1 Amit Paradkar 2 Suresh Thummalapenta 3 Tao Xie 1 1 Dept. of Computer Science, North Carolina State University,
Robustness of a Spoken Dialogue Interface for a Personal Assistant
Robustness of a Spoken Dialogue Interface for a Personal Assistant Anna Wong, Anh Nguyen and Wayne Wobcke School of Computer Science and Engineering University of New South Wales Sydney NSW 22, Australia
Conceptual Schema Approach to Natural Language Database Access
Conceptual Schema Approach to Natural Language Database Access In-Su Kang, Seung-Hoon Na, Jong-Hyeok Lee Div. of Electrical and Computer Engineering Pohang University of Science and Technology (POSTECH)
The Specific Text Analysis Tasks at the Beginning of MDA Life Cycle
SCIENTIFIC PAPERS, UNIVERSITY OF LATVIA, 2010. Vol. 757 COMPUTER SCIENCE AND INFORMATION TECHNOLOGIES 11 22 P. The Specific Text Analysis Tasks at the Beginning of MDA Life Cycle Armands Šlihte Faculty
Phase 2 of the D4 Project. Helmut Schmid and Sabine Schulte im Walde
Statistical Verb-Clustering Model soft clustering: Verbs may belong to several clusters trained on verb-argument tuples clusters together verbs with similar subcategorization and selectional restriction
BILINGUAL TRANSLATION SYSTEM
BILINGUAL TRANSLATION SYSTEM (FOR ENGLISH AND TAMIL) Dr. S. Saraswathi Associate Professor M. Anusiya P. Kanivadhana S. Sathiya Abstract--- The project aims in developing Bilingual Translation System for
Presented to The Federal Big Data Working Group Meetup On 07 June 2014 By Chuck Rehberg, CTO Semantic Insights a Division of Trigent Software
Semantic Research using Natural Language Processing at Scale; A continued look behind the scenes of Semantic Insights Research Assistant and Research Librarian Presented to The Federal Big Data Working
A Mixed Trigrams Approach for Context Sensitive Spell Checking
A Mixed Trigrams Approach for Context Sensitive Spell Checking Davide Fossati and Barbara Di Eugenio Department of Computer Science University of Illinois at Chicago Chicago, IL, USA [email protected], [email protected]
Semantic Search in Portals using Ontologies
Semantic Search in Portals using Ontologies Wallace Anacleto Pinheiro Ana Maria de C. Moura Military Institute of Engineering - IME/RJ Department of Computer Engineering - Rio de Janeiro - Brazil [awallace,anamoura]@de9.ime.eb.br
Specialty Answering Service. All rights reserved.
0 Contents 1 Introduction... 2 1.1 Types of Dialog Systems... 2 2 Dialog Systems in Contact Centers... 4 2.1 Automated Call Centers... 4 3 History... 3 4 Designing Interactive Dialogs with Structured Data...
TOOL OF THE INTELLIGENCE ECONOMIC: RECOGNITION FUNCTION OF REVIEWS CRITICS. Extraction and linguistic analysis of sentiments
TOOL OF THE INTELLIGENCE ECONOMIC: RECOGNITION FUNCTION OF REVIEWS CRITICS. Extraction and linguistic analysis of sentiments Grzegorz Dziczkowski, Katarzyna Wegrzyn-Wolska Ecole Superieur d Ingenieurs
NATURAL LANGUAGE TO SQL CONVERSION SYSTEM
International Journal of Computer Science Engineering and Information Technology Research (IJCSEITR) ISSN 2249-6831 Vol. 3, Issue 2, Jun 2013, 161-166 TJPRC Pvt. Ltd. NATURAL LANGUAGE TO SQL CONVERSION
Efficient Techniques for Improved Data Classification and POS Tagging by Monitoring Extraction, Pruning and Updating of Unknown Foreign Words
, pp.290-295 http://dx.doi.org/10.14257/astl.2015.111.55 Efficient Techniques for Improved Data Classification and POS Tagging by Monitoring Extraction, Pruning and Updating of Unknown Foreign Words Irfan
Motivation. Korpus-Abfrage: Werkzeuge und Sprachen. Overview. Languages of Corpus Query. SARA Query Possibilities 1
Korpus-Abfrage: Werkzeuge und Sprachen Gastreferat zur Vorlesung Korpuslinguistik mit und für Computerlinguistik Charlotte Merz 3. Dezember 2002 Motivation Lizentiatsarbeit: A Corpus Query Tool for Automatically
Deposit Identification Utility and Visualization Tool
Deposit Identification Utility and Visualization Tool Colorado School of Mines Field Session Summer 2014 David Alexander Jeremy Kerr Luke McPherson Introduction Newmont Mining Corporation was founded in
Collecting Polish German Parallel Corpora in the Internet
Proceedings of the International Multiconference on ISSN 1896 7094 Computer Science and Information Technology, pp. 285 292 2007 PIPS Collecting Polish German Parallel Corpora in the Internet Monika Rosińska
Transition-Based Dependency Parsing with Long Distance Collocations
Transition-Based Dependency Parsing with Long Distance Collocations Chenxi Zhu, Xipeng Qiu (B), and Xuanjing Huang Shanghai Key Laboratory of Intelligent Information Processing, School of Computer Science,
A System for Labeling Self-Repairs in Speech 1
A System for Labeling Self-Repairs in Speech 1 John Bear, John Dowding, Elizabeth Shriberg, Patti Price 1. Introduction This document outlines a system for labeling self-repairs in spontaneous speech.
Context Grammar and POS Tagging
Context Grammar and POS Tagging Shian-jung Dick Chen Don Loritz New Technology and Research New Technology and Research LexisNexis LexisNexis Ohio, 45342 Ohio, 45342 [email protected] [email protected]
Semantic parsing with Structured SVM Ensemble Classification Models
Semantic parsing with Structured SVM Ensemble Classification Models Le-Minh Nguyen, Akira Shimazu, and Xuan-Hieu Phan Japan Advanced Institute of Science and Technology (JAIST) Asahidai 1-1, Nomi, Ishikawa,
Chinese Open Relation Extraction for Knowledge Acquisition
Chinese Open Relation Extraction for Knowledge Acquisition Yuen-Hsien Tseng 1, Lung-Hao Lee 1,2, Shu-Yen Lin 1, Bo-Shun Liao 1, Mei-Jun Liu 1, Hsin-Hsi Chen 2, Oren Etzioni 3, Anthony Fader 4 1 Information
INF5820 Natural Language Processing - NLP. H2009 Jan Tore Lønning [email protected]
INF5820 Natural Language Processing - NLP H2009 Jan Tore Lønning [email protected] Semantic Role Labeling INF5830 Lecture 13 Nov 4, 2009 Today Some words about semantics Thematic/semantic roles PropBank &
Automatic Detection and Correction of Errors in Dependency Treebanks
Automatic Detection and Correction of Errors in Dependency Treebanks Alexander Volokh DFKI Stuhlsatzenhausweg 3 66123 Saarbrücken, Germany [email protected] Günter Neumann DFKI Stuhlsatzenhausweg
Ngram Search Engine with Patterns Combining Token, POS, Chunk and NE Information
Ngram Search Engine with Patterns Combining Token, POS, Chunk and NE Information Satoshi Sekine Computer Science Department New York University [email protected] Kapil Dalwani Computer Science Department
POSBIOTM-NER: A Machine Learning Approach for. Bio-Named Entity Recognition
POSBIOTM-NER: A Machine Learning Approach for Bio-Named Entity Recognition Yu Song, Eunji Yi, Eunju Kim, Gary Geunbae Lee, Department of CSE, POSTECH, Pohang, Korea 790-784 Soo-Jun Park Bioinformatics
Detecting Parser Errors Using Web-based Semantic Filters
Detecting Parser Errors Using Web-based Semantic Filters Alexander Yates Stefan Schoenmackers University of Washington Computer Science and Engineering Box 352350 Seattle, WA 98195-2350 Oren Etzioni {ayates,
Mining the Software Change Repository of a Legacy Telephony System
Mining the Software Change Repository of a Legacy Telephony System Jelber Sayyad Shirabad, Timothy C. Lethbridge, Stan Matwin School of Information Technology and Engineering University of Ottawa, Ottawa,
Effective Self-Training for Parsing
Effective Self-Training for Parsing David McClosky [email protected] Brown Laboratory for Linguistic Information Processing (BLLIP) Joint work with Eugene Charniak and Mark Johnson David McClosky - [email protected]
Component visualization methods for large legacy software in C/C++
Annales Mathematicae et Informaticae 44 (2015) pp. 23 33 http://ami.ektf.hu Component visualization methods for large legacy software in C/C++ Máté Cserép a, Dániel Krupp b a Eötvös Loránd University [email protected]
An Approach towards Automation of Requirements Analysis
An Approach towards Automation of Requirements Analysis Vinay S, Shridhar Aithal, Prashanth Desai Abstract-Application of Natural Language processing to requirements gathering to facilitate automation
Language Interface for an XML. Constructing a Generic Natural. Database. Rohit Paravastu
Constructing a Generic Natural Language Interface for an XML Database Rohit Paravastu Motivation Ability to communicate with a database in natural language regarded as the ultimate goal for DB query interfaces
D2.4: Two trained semantic decoders for the Appointment Scheduling task
D2.4: Two trained semantic decoders for the Appointment Scheduling task James Henderson, François Mairesse, Lonneke van der Plas, Paola Merlo Distribution: Public CLASSiC Computational Learning in Adaptive
An NLP Curator (or: How I Learned to Stop Worrying and Love NLP Pipelines)
An NLP Curator (or: How I Learned to Stop Worrying and Love NLP Pipelines) James Clarke, Vivek Srikumar, Mark Sammons, Dan Roth Department of Computer Science, University of Illinois, Urbana-Champaign.
Parsing Technology and its role in Legacy Modernization. A Metaware White Paper
Parsing Technology and its role in Legacy Modernization A Metaware White Paper 1 INTRODUCTION In the two last decades there has been an explosion of interest in software tools that can automate key tasks
Module Catalogue for the Bachelor Program in Computational Linguistics at the University of Heidelberg
Module Catalogue for the Bachelor Program in Computational Linguistics at the University of Heidelberg March 1, 2007 The catalogue is organized into sections of (1) obligatory modules ( Basismodule ) that
CS4025: Pragmatics. Resolving referring Expressions Interpreting intention in dialogue Conversational Implicature
CS4025: Pragmatics Resolving referring Expressions Interpreting intention in dialogue Conversational Implicature For more info: J&M, chap 18,19 in 1 st ed; 21,24 in 2 nd Computing Science, University of
Data Quality in Information Integration and Business Intelligence
Data Quality in Information Integration and Business Intelligence Leopoldo Bertossi Carleton University School of Computer Science Ottawa, Canada : Faculty Fellow of the IBM Center for Advanced Studies
Analysis and Synthesis of Help-desk Responses
Analysis and Synthesis of Help-desk s Yuval Marom and Ingrid Zukerman School of Computer Science and Software Engineering Monash University Clayton, VICTORIA 3800, AUSTRALIA {yuvalm,ingrid}@csse.monash.edu.au
Adaptive Context-sensitive Analysis for JavaScript
Adaptive Context-sensitive Analysis for JavaScript Shiyi Wei and Barbara G. Ryder Department of Computer Science Virginia Tech Blacksburg, VA, USA {wei, ryder}@cs.vt.edu Abstract Context sensitivity is
Why language is hard. And what Linguistics has to say about it. Natalia Silveira Participation code: eagles
Why language is hard And what Linguistics has to say about it Natalia Silveira Participation code: eagles Christopher Natalia Silveira Manning Language processing is so easy for humans that it is like
A Survey of Natural Language Interface to Database Management System
A Survey of Natural Language Interface to Database Management System B.Sujatha Dr.S.Viswanadha Raju Humera Shaziya Research Scholar Professor & Head Lecturer in Computers Dept. of CSE Dept. of CSE Dept.
Access Control Fundamentals
C H A P T E R 2 Access Control Fundamentals An access enforcement mechanism authorizes requests (e.g., system calls) from multiple subjects (e.g., users, processes, etc.) to perform operations (e.g., read,,
131-1. Adding New Level in KDD to Make the Web Usage Mining More Efficient. Abstract. 1. Introduction [1]. 1/10
1/10 131-1 Adding New Level in KDD to Make the Web Usage Mining More Efficient Mohammad Ala a AL_Hamami PHD Student, Lecturer m_ah_1@yahoocom Soukaena Hassan Hashem PHD Student, Lecturer soukaena_hassan@yahoocom
