Using Knowledge Extraction and Maintenance Techniques To Enhance Analytical Performance
David Bixler, Dan Moldovan and Abraham Fowler
Language Computer Corporation
1701 N. Collins Blvd #2000, Richardson, TX, 75080, USA
{bixler, moldovan,

Keywords: Information Sharing and Collaboration, Search and Retrieval, Novel Intelligence from Massive Data, Knowledge Discovery and Dissemination

Abstract

Analysts are constantly overwhelmed by large amounts of data which lack meaningful or useful structure. LCC is working on two tools which help to alleviate this problem, Jaguar and Polaris. The technical contributions of each of these tools, namely automatic extraction of semantic relations, automatic ontology construction, and metrics to evaluate ontology quality, as well as experimental results, are discussed.

1. Introduction

Intelligence analysts are constantly plagued with an overabundance of information. Individual analysts approach this problem in a variety of ways, using organizational methods which work on a small scale but do not lend themselves to interoperability with methods used by other analysts. Even these methods do not solve the problem, as analysts can only handle a tiny fraction of the information available to them. Unfortunately, many of the clues and answers they are looking for reside in the vast amounts of information left untouched, and even the information they do have at their disposal lacks many of the data bridges which could help drive inferences and hypotheses.

LCC has been developing two tools which will help address these problems by enabling technologies that leverage prior and tacit knowledge, such as Question Answering (QA), Information Extraction (IE), and Summarization. These two tools are Polaris, a semantic parser, and Jaguar, an automatic ontology builder. Both Polaris and Jaguar operate automatically on text, allowing an analyst to perform other tasks while these tools run in the background.
The end result of Jaguar (which uses Polaris in its processing) is a set of automatically generated, semantically rich, domain-specific ontologies which analysts can use while working on a task related to a domain or set of domains. These ontologies can capture data specific to a given analyst as well as data for broader use, allowing analysts to keep their own specific knowledge while being able to share and exchange information with other analysts in an efficient, streamlined fashion. The ontologies and semantic clusters can also be integrated with other tools to boost their accuracy and performance.

2. Motivation

Analysts lack tools which can assist them in higher modes of critical thinking, yet these are the tools analysts need to improve analysis on complex issues [Heuer]. One method is to structure the information in a way which is easy to understand and allows the analyst to be more efficient. More information, however, is not necessarily better. Many psychological studies have demonstrated that accuracy generally increases very little, if at all, as more information is given to an expert; what is needed is "more truly useful information" [Heuer]. Since analysis tends not to improve with more information, it is important that the information used is the most important and is structured in a useful fashion.

It is also well known that the capacity of short term memory (STM) is very limited, and long term memory (LTM) retrieval is difficult for tasks not performed recently. Humans are also not good at identifying patterns between chunks of data, structuring data in ways which are useful, and analogizing. External memory aids are helpful in resolving these issues, and semantically enriched ontologies can serve as external memory aids by both identifying patterns between concepts and groups of concepts and simulating a highly structured LTM from which information is simple to retrieve.
Heuer also notes that human memory rarely changes retroactively, and well-maintained knowledge bases can compensate for this shortcoming.

3. Approach

3.1 Polaris

Polaris is based on a set of 40 semantic relations which LCC has defined. Semantic relations are abstractions of underlying relations between concepts, and can occur within a word, between words, between phrases, and between sentences. Semantic relations are useful because they provide denser connectivity between concepts and contexts. Also, detecting semantic relations is one essential step toward the ultimate goal of machine text understanding. Semantic relations allow for richer ontologies and knowledge bases which can capture contextual knowledge, events, and firmer assertions.

LCC's set of 40 relations is summarized in Table 1. These 40 relations have been carefully selected for their usefulness in natural language processing, for the feasibility of their automatic extraction from text, and for the broadest semantic coverage with the least amount of overlap. While no list will ever be perfect, LCC feels this list strikes a good balance between being too specific (too many relations making reasoning difficult) and too general (not enough information to be useful).

 #  Semantic Relation            #  Semantic Relation          #  Semantic Relation
 1  Possession                  15  Source-From               29  Possibility
 2  Kinship                     16  Topic                     30  Certainty
 3  Property-Attribute Holder   17  Manner                    31  Theme-Patient
 4  Agent                       18  Means                     32  Result
 5  Temporal                    19  Accompaniment-Companion   33  Stimulus
 6  Depiction                   20  Experiencer               34  Extent
 7  Part-Whole                  21  Recipient                 35  Predicate
 8  Hyponymy                    22  Frequency                 36  Belief
 9  Entail                      23  Influence                 37  Goal
10  Cause                       24  Associated-with/Other     38  Meaning
11  Make-Produce                25  Measure                   39  Justification
12  Instrument                  26  Synonymy-Name             40  Explanation
13  Location-Space              27  Antonymy
14  Purpose                     28  Probability-of/Existence

Table 1: LCC's 40 Semantic Relations

An example of semantic relations is the sentence "He carefully disarmed the letter bomb." The compound nominal "letter bomb" alone contains at least 5 semantic relations: letter bomb IS-A bomb, letter bomb IS-A letter, letter is the LOCATION of the bomb, bombing is the PURPOSE of letter bomb, and letter is the MEANS of bombing. The sentence also includes several other relations: He is the AGENT of disarm; carefully is the MANNER of disarmed; and the letter bomb is the THEME (or object) of disarmed.
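One way to picture the relations listed for the letter bomb sentence is as labeled pairs that can be grouped around a concept. The following is a minimal sketch, not Polaris output: the `Relation` tuple, the argument order, and the `relations_about` helper are illustrative assumptions introduced here.

```python
from collections import defaultdict
from typing import NamedTuple


class Relation(NamedTuple):
    """One semantic relation written as label(first, second). Hypothetical layout."""
    label: str
    first: str
    second: str


# Relations the text lists for "He carefully disarmed the letter bomb."
# The argument order within each tuple is an assumption made for illustration.
RELATIONS = [
    Relation("IS-A", "letter bomb", "bomb"),
    Relation("IS-A", "letter bomb", "letter"),
    Relation("LOCATION", "letter", "bomb"),
    Relation("PURPOSE", "bombing", "letter bomb"),
    Relation("MEANS", "letter", "bombing"),
    Relation("AGENT", "he", "disarmed"),
    Relation("MANNER", "carefully", "disarmed"),
    Relation("THEME", "letter bomb", "disarmed"),
]


def relations_about(concept, relations):
    """Group the other argument of every relation that mentions `concept` by label."""
    grouped = defaultdict(list)
    for rel in relations:
        if concept in (rel.first, rel.second):
            other = rel.second if rel.first == concept else rel.first
            grouped[rel.label].append(other)
    return dict(grouped)


# The event centered on the verb: who acted, how, and on what.
event = relations_about("disarmed", RELATIONS)
```

Grouping by the verb gathers the agent, manner, and theme of the event in one place, while grouping by "letter bomb" gathers what is known about the object itself.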
Together, these semantic relations can give a structured picture of the event: who was involved, what was done, and to what; and what the purpose of the object involved was.

To find semantic relations in text, Polaris uses a combination of state-of-the-art text processing and machine learning techniques. In the first step, low-level NLP processing, such as named entity recognition, part-of-speech tagging, syntactic parsing and word sense disambiguation, is used to structure the text. The parse tree is then broken down into a number of syntactic patterns that Polaris can analyze. These syntactic patterns include verbs and their arguments, complex nominals, adjective phrases, adjective clauses, and others.

Polaris next runs classifiers on each section of text that matched a syntactic pattern. The classifiers examine features of the text and attempt to determine whether any of the 40 relations apply between the elements of the pattern. Most of the classifiers are based on one of four different machine learning algorithms: Decision Trees, Naïve Bayes, Support Vector Machine (SVM), and Semantic Scattering (a new learning algorithm that uses WordNet classes to find the most probable relation that holds between two nouns [Badulescu]). Some of these machine-learning classifiers use a per-relation approach to output only the one specific relation they were trained to recognize, while others use a per-pattern approach which could potentially output any of the 40 semantic relations. Additionally, some classifiers containing human-coded rules are used for the most explicit and unambiguous cases. These three methods form a hybrid approach which produces better results than any one approach on its own.

As an example of actual system performance, Table 2 shows the relations discovered by Polaris in the sentence "Bin Laden reportedly purchased anthrax a half decade ago from a supplier in North Korea."
Human-generated relations                          System output
AGENT(Bin Laden, purchased)                        AGENT(Bin Laden, purchased)
TOPIC(purchased, reportedly)
THEME(anthrax, purchased)                          THEME(anthrax, purchased)
RECIPIENT(a supplier in North Korea, purchased)    LOCATION(from a supplier in North Korea, purchased)
TEMPORAL(a half decade ago, purchased)             TEMPORAL(a half decade ago, purchased)
MEASURE(a half, decade)                            PROPERTY(half, decade)
LOCATION(in North Korea, a supplier)               LOCATION(in North Korea, a supplier)

Table 2: List of relations discovered from example sentence

3.2 Jaguar

Jaguar automatically builds domain-specific ontologies by processing plain text from a variety of sources. These ontologies can be fine-tuned to contain the level of detail desired by an analyst. Ontologies built by Jaguar contain (i) ontological concepts, which are the basic building blocks of an ontology, (ii) a hierarchy, consisting of a structure imposed on certain ontological concepts via transitive relations that generally hold to be universally true (e.g. IS-A, part-whole, locative, etc.), and (iii) the contextual knowledge base, consisting of semantic contexts that encapsulate knowledge of events via semantic relations. Current work also includes a fourth component called Axioms on Demand, which captures assertions about knowledge and is useful for reasoning.

Jaguar is a complex text processing project, using both basic and advanced NLP tools to accomplish its task. The first step in the process is to filter and clean up the input text. Raw input to Jaguar can come from many types of sources, including Word documents, PDF files and web pages in HTML format, and is therefore prone to irregularities such as incomplete or strangely formatted sentences, headings, and tabular information. The filtering mechanism of Jaguar is a crucial step that makes the input acceptable for the subsequent NLP tools.

A single run of Jaguar can be divided into two major processes: (i) text processing, and (ii) classification/hierarchy formation. In text processing, Jaguar is provided with a set of seeds which are used to determine the set of sentences of interest. Until recently, these were always selected manually; now, seeds can be automatically generated if desired and used in place of, or to augment, the manually selected seed set. The set of sentences selected based on the seeds goes through a set of NLP processing tools: named-entity recognition, part-of-speech tagging, parsing, word-sense disambiguation, coreference resolution, and semantic relation discovery (Polaris).
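The seed-driven selection of sentences can be sketched as a simple filter. This is only an illustration under stated assumptions: the source does not say how Jaguar matches seeds against sentences, so the whole-word, case-insensitive match, the naive sentence splitter, the function names, and the sample text are all hypothetical.

```python
import re


def split_sentences(text):
    """Naive splitter on end punctuation; a real pipeline would use a trained tokenizer."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def select_sentences(text, seeds):
    """Keep sentences that mention at least one seed (whole words, case-insensitive)."""
    patterns = [
        re.compile(r"\b" + re.escape(seed) + r"\b", re.IGNORECASE) for seed in seeds
    ]
    return [s for s in split_sentences(text) if any(p.search(s) for p in patterns)]


# Hypothetical input text and seed set, invented for this sketch.
doc = (
    "Anthrax is a biological agent. The weather was mild. "
    "The supplier shipped anthrax spores abroad."
)
selected = select_sentences(doc, ["anthrax", "biological weapon"])
```

Only the selected sentences would then be passed to the heavier NLP tools, which keeps the expensive processing focused on text relevant to the domain.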
The resulting data structure is processed and used to populate one or many semantic contexts, which are groups of relations or nested contexts that hold true around a common central concept. Another aspect of text processing is concept discovery, which entails the discovery of noun concepts in sentences which are related to the target words or seeds. Each processed sentence is scanned for noun phrases, and targeted noun concepts are added to a local data structure for subsequent processing into the ontology's hierarchy. Figure 1 shows an example hierarchy and semantic context.

[Figure 1: Example Hierarchy and Semantic Context within a Knowledge Base]

Classification is the determination of a hierarchical structure within a group of concepts. Isolated IS-A (hypernymy) relations are discovered in the text processing stage. Classification uses a set of well-formed and tested procedures to impose a hierarchical structure on the set of discovered concepts, and it uses WordNet [Miller] as its upper ontology. Details of these procedures are presented in [Moldovan and Girju].

Hypernymy relations discovered via classification may contain anomalies or redundancies. Jaguar contains a conflict resolution engine which detects and corrects possible inconsistencies. The hierarchies in Jaguar are created link by link (or relation by relation) and follow a conflict avoidance technique, wherein each new relation is tested for anomalies and redundancies before being added to the hierarchy.

Although single runs of Jaguar yield rich ontologies, its real power lies in the option to layer ontologies from many different runs. Jaguar can currently merge disparate ontologies into one by using the aforementioned conflict resolution technique. The merge tool merges the two ontologies' concept sets, hierarchies (using conflict resolution), and their knowledge bases (sets of semantic contexts).
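The link-by-link construction with a check before each insertion can be sketched as follows. This is a simplified illustration, not Jaguar's conflict resolution engine: the source does not specify which anomalies are tested, so the two checks used here (a cycle and an already-implied link) and the `Hierarchy` class itself are assumptions.

```python
class Hierarchy:
    """IS-A hierarchy built one link at a time, checking each link before adding it."""

    def __init__(self):
        self.parents = {}  # concept -> set of direct hypernyms

    def ancestors(self, concept):
        """Every hypernym reachable from `concept` by following IS-A links upward."""
        seen, stack = set(), list(self.parents.get(concept, ()))
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(self.parents.get(node, ()))
        return seen

    def add_isa(self, child, parent):
        """Add `child IS-A parent` unless it would form a cycle or is already implied."""
        if child == parent or child in self.ancestors(parent):
            return "conflict"  # the parent already sits below the child
        if parent in self.ancestors(child):
            return "redundant"  # existing links already imply this one
        self.parents.setdefault(child, set()).add(parent)
        self.parents.setdefault(parent, set())
        return "added"

    def merge(self, other):
        """Replay another hierarchy's links through the same checks."""
        results = []
        for child, parents in sorted(other.parents.items()):
            for parent in sorted(parents):
                results.append((child, parent, self.add_isa(child, parent)))
        return results


# Hypothetical concepts, chosen only to exercise each outcome.
h = Hierarchy()
first = h.add_isa("anthrax", "bacterium")
second = h.add_isa("bacterium", "organism")
implied = h.add_isa("anthrax", "organism")
cyclic = h.add_isa("organism", "anthrax")
```

Because `merge` reuses `add_isa`, a merged ontology is held to the same checks as one built in a single run, which mirrors the text's description of merging via the conflict resolution technique. A fuller version would also need to handle links that make an earlier direct link redundant.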
Merging is useful for distributed or parallel systems where small chunks of the input text may be processed on some portions of the system and then subsequently merged. It also provides a foundation for future work in contextual reasoning and epistemic logic. The result is a rich knowledge base which can be viewed at many different levels of granularity, providing an analyst with the level of detail desired.

4. Results

4.1 Polaris

As mentioned earlier, Polaris uses four machine learning algorithms to discover semantic relations in syntactic patterns: Semantic Scattering, Decision Trees, Naïve Bayes and Support Vector Machine. There are six primary pattern types discovered within noun phrases: N-N and Adj-N (which comprise compound nominals), 's and of (Genitive patterns), Adjective Phrases, and Adjective Clauses. The first five are further subdivided into nominalized and non-nominalized occurrences, giving a total of 11 patterns discovered within compound nominals. Table 3 summarizes the accuracy over the training data of each machine learning algorithm for each noun phrase pattern. In this table, non-al refers to nominalized forms and al refers to non-nominalized. The training corpus source for the noun phrase patterns is Wall Street Journal (TreeBank 2), L.A. Times (TREC 9), and XWN 2.0 [Harabagiu and Moldovan].

[Table 3: Machine Learning Accuracy for Noun Phrase Level. Rows: Semantic Scattering, Decision Tree, Naïve Bayes, SVM. Columns: complex nominals (NN, AdjN), genitives (of, 's), adjective phrases, and adjective clauses, split into al and non-al forms. The accuracy values are missing from this transcription.]

There are also five argument level patterns being discovered: NP, NP, PP, ADVP, and S. Table 4 summarizes the accuracy over the training data for two machine learning algorithms. The training corpus source for the argument patterns is FrameNet [Baker]. Neither table is an indication of overall system score; however, if all inputs were perfect, each would indicate the expected best performance for the current system.

[Table 4: Machine Learning for Verb Argument Level. Rows: Decision Tree, SVM. Columns: NP, NP, PP, ADVP, S. The accuracy values are missing from this transcription.]

LCC has created a benchmark corpus to evaluate the Polaris system. The corpus contains 300 sentences, but currently only 51 have been fully annotated due to the large manual effort required. Within these 51 sentences, human annotators discovered 683 total relations; 290 of these match the syntactic patterns that Polaris currently recognizes. A scorer program runs Polaris over these same 51 sentences and compares the generated relations to the human annotations. As of March 29, 2005, Polaris discovered 265 relations within the syntactic patterns that it uses. Of these, 94 were exact matches to the human annotations. Partial matches contributed an additional 38.2 in discounted credit; in a partial match the relation type was correct and the argument bracketing at least overlapped, but there were some extra or missing tokens in the generated arguments. The partial matches are scored using precision, recall, and F-measure on the overlapping tokens. The total score for all matches, including discounting for partial matches, is shown in Table 5. The first column indicates performance on all human annotations, including those on syntactic patterns Polaris currently cannot see.
The second column shows the performance within the syntactic patterns Polaris currently recognizes. The second column is a better indication of the overall potential of Polaris' approach if it were extended to include more syntactic patterns.

Measured over:   All relations    Only relations covered by syntactic patterns
Precision        49.89% (shared by both columns)
Recall           19.63%           50.04%
F-Measure        28.18%           49.96%

Table 5: Polaris System Score

The numbers continue to improve but are obviously not perfect. There are many reasons for this, resulting from both external and internal factors. The external NLP techniques which Polaris depends on offer varying degrees of precision. Automatic word sense disambiguation is [value missing] percent accurate for nouns, and lower than that for verbs. Syntactic parsing is close to 90 percent accurate for subtrees, but this precision degenerates to somewhere between 50 and 70 percent for an entire, complex sentence. The part of speech tagger is around 95 percent accurate, and the named entity tagger ranges from [value missing] percent accuracy. Additionally, there is currently no true coreference resolution library. Multiplying the accuracies of each tool which Polaris depends upon shows that the likelihood of accuracy on real-world, complex sentences is probably below 50 percent.

Internally, there are also many issues which affect precision and recall. The training data has a fair number of issues: insufficient examples for syntactic patterns or semantic relations; narrow domain for the training corpora; inconsistency in the order of relation arguments; noisy data; and lack of a one-to-one mapping to the source. Additionally, there are currently not enough features for each of the semantic relations. Relation arguments are often ambiguous within a parse tree structure, and syntactic patterns do not always capture all relations. The machine learning classifiers tend to return only one relation per syntactic pattern even if there are multiple possibilities.
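The scoring described for Polaris can be sketched in a few lines. The token-overlap function below is one plausible reading of "precision, recall, and F-measure on the overlapping tokens"; how the actual scorer turns partial matches into the 38.2 figure is not stated, so `token_overlap_prf` and `f_measure` are assumptions introduced here. The argument strings are borrowed from Table 2 purely as sample input.

```python
def token_overlap_prf(generated, reference):
    """Precision, recall and F-measure over the tokens two argument strings share."""
    gen, ref = generated.split(), reference.split()
    remaining = list(ref)
    overlap = 0
    for tok in gen:
        if tok in remaining:  # count each reference token at most once
            remaining.remove(tok)
            overlap += 1
    if overlap == 0:
        return 0.0, 0.0, 0.0
    p, r = overlap / len(gen), overlap / len(ref)
    return p, r, 2 * p * r / (p + r)


def f_measure(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# A generated argument with one extra token relative to the reference.
p, r, f = token_overlap_prf(
    "from a supplier in North Korea", "a supplier in North Korea"
)

# Counts reported in the text: 94 exact matches plus 38.2 discounted
# partial-match credit, out of 265 relations generated by Polaris.
precision = (94 + 38.2) / 265

# F-measure recomputed from the rounded precision and recall in Table 5.
f_all = f_measure(0.4989, 0.1963)
f_covered = f_measure(0.4989, 0.5004)
```

Under these assumptions the recomputed precision agrees with the 49.89% in Table 5, and the two F-measures agree with the reported 28.18% and 49.96% to within rounding of the inputs.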
There are also issues caused by metonymy (figures of speech) and multiple relations found within the same phrase. Work is being done on all of these areas to help improve precision and recall.

4.2 Jaguar

LCC has recently developed a battery of evaluation metrics to assess the quality of ontologies. They are summarized in Table 6.

Metric Name                   Metric Description
Conceptual Precision (CP)     number of well-formed and relevant concepts in the ontology divided by the total number of concepts in the ontology
Subsumption Precision (SP)    number of correct subsumption links in the ontology divided by the total number of subsumption links in the ontology
Conceptual Recall (CR)        number of well-formed and relevant concepts in the ontology divided by the union of this number and this number from a reference ontology
Subsumption Recall (SR)       number of correct subsumption links in the ontology divided by the union of this number and this number from a reference ontology
Unlinked Concepts (UC)        proportion of orphan concepts in the ontology
Conceptual Expansion (CE)     proportional difference between number of seed concepts and number of concepts in generated ontology

Table 6: Ontology Evaluation Metrics

These ontology evaluation metrics were used to evaluate two versions of Jaguar, one which uses a manually selected set of seed concepts and one which selects seeds automatically. The document collection used for this evaluation was 5.67 megabytes of text from a CNS (Center for Nonproliferation Studies) corpus focused on chemical and biological weapons. The manually selected seed set consisted of 158 concepts associated with biological agents and weapons, and the automatically selected seed set consisted of 100 concepts. Both sets were used as input to Jaguar to create two separate ontologies for the biological weapons and agents domain. Two manually built, hand-edited ontologies focusing on the biological weapons domain were used as reference ontologies.
These reference ontologies were pruned from the original ontologies to remove information about chemical and nuclear weapons, and one of them was additionally pruned to remove concepts not found in the document collection. The first reference ontology, which contains 151 concepts and 208 subsumption links, will be referred to as BW-manual, and the second one, which contains 68 concepts and 93 subsumption links, will be referred to as BW-manual-filtered. Jaguar was run two times, first with the 158 manually selected seeds (labeled BW-KAT1), and second with the 100 automatically selected seeds (labeled BW-KAT2). BW-KAT1 contained 4,712 concepts, with 896 considered to be well-formed and relevant to the domain; 85 of these were unsubsumed, and 756 of the remaining 811 subsumed concepts were considered to be accurate when checked manually. BW-KAT2 contained 7,197 concepts, with 1,147 considered to be well-formed and relevant to the domain; 68 of these were unsubsumed, and 977 of the remaining 1079 subsumed concepts were considered to be accurate. The metrics described above are summarized for BW-KAT1 and BW-KAT2 in Table 7. With the exception of conceptual precision, the results are very good. The results are also very comparable between the manual and automatic selection of seeds. There are, however, still issues which need to be addressed to improve the results. Due to its dependency on Polaris, Jaguar also depends on a number of lower level NLP components. Their shortcomings and effect on Polaris have previously been discussed and thus impact the performance of Jaguar. Improvement in lower level components should increase the performance of Jaguar. There is still a good bit of noise in the input to Jaguar, and better filtering techniques will increase the overall quality of the resultant ontology. The classifier uses a variety of heuristics, many of which possess some degree of ambiguity. 
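The metrics defined in Table 6 can be recomputed from the counts reported above for BW-KAT1 and BW-KAT2. The sketch below makes two assumptions that should be read as such: the "union" in the recall definitions is taken as a simple sum of the two counts, and conceptual expansion is taken as (relevant concepts minus seeds) divided by seeds. Both readings reproduce the BW-KAT2 percentages that survive in Table 7. The `evaluate` function and its parameter names are hypothetical.

```python
def evaluate(total, relevant, unlinked, correct_links, seeds, ref_concepts, ref_links):
    """Compute the Table 6 metrics as fractions from raw counts."""
    linked = relevant - unlinked  # relevant concepts that have a subsumption link
    return {
        "CP": relevant / total,
        "SP": correct_links / linked,
        # Assumption: "union" is read as the sum of generated and reference counts.
        "CR": relevant / (relevant + ref_concepts),
        "SR": correct_links / (correct_links + ref_links),
        "UC": unlinked / relevant,
        # Assumption: growth of the concept set relative to the seed set.
        "CE": (relevant - seeds) / seeds,
    }


# Reference ontology sizes from the text: (concepts, subsumption links).
BW_MANUAL = (151, 208)
BW_MANUAL_FILTERED = (68, 93)

# BW-KAT2: 7,197 concepts, 1,147 relevant, 68 unsubsumed, 977 accurate links, 100 seeds.
kat2 = evaluate(7197, 1147, 68, 977, 100, *BW_MANUAL)
kat2_filtered = evaluate(7197, 1147, 68, 977, 100, *BW_MANUAL_FILTERED)

# BW-KAT1: 4,712 concepts, 896 relevant, 85 unsubsumed, 756 accurate links, 158 seeds.
kat1 = evaluate(4712, 896, 85, 756, 158, *BW_MANUAL)

for name, value in kat2.items():
    print(f"{name}: {value:.2%}")
```

The same call produces recall and expansion figures for BW-KAT1, but those outputs are derived under the assumptions above and are not values taken from the source.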
Additionally, anomalies in the hypernymy tree, such as two very different concepts sharing the same hypernym several levels removed, introduce more noise into the data. Conflict resolution is still being researched, and though an initial implementation is in place, further refinement should also improve the quality of the built ontologies.

Metric                       BW-KAT1                      BW-KAT2
Conceptual Precision (CP)    19.02% (896/4712)            15.94% (1147/7197)
Subsumption Precision (SP)   93.22% (756/811)             90.55% (977/1079)
Conceptual Recall (1)        [value missing] (896/( ))    88.37% (1147/( ))
Conceptual Recall (2)        [value missing] (896/( ))    94.40% (1147/( ))
Subsumption Recall (1)       [value missing] (756/( ))    82.45% (977/( ))
Subsumption Recall (2)       [value missing] (756/( ))    91.31% (977/( ))
Conceptual Expansion (CE)    [value missing] (( )/158)    1047% (( )/100)
Unlinked Concepts (UC)       9.49% (85/896)               5.93% (68/1147)

Table 7: Results of Jaguar Evaluation

Much effort has been made to build a collection of domain-specific ontologies on a regular and automatic basis. Using web harvesting tools developed at LCC, Jaguar has been extended to build ontologies automatically from the web. Seed concepts are used as query keywords for a search engine like Google, and the documents found are ranked accordingly and then processed by Jaguar. Over 30 different ontologies have been built which include IS-A hierarchies; work is being done to augment them with other relation types, such as part-whole and locative. Example domains which have been built and made available via the web include HR, biological weapons, Al Qaeda, North Korean Nuclear Program, acid rain, and trains.

5. Conclusion

LCC has made great strides toward extracting, structuring, and maintaining knowledge which can assist an analyst in higher levels of critical thinking for better analysis, but there is still much work to be done. Continued improvement of the quality of knowledge extracted and the relationships between chunks of knowledge is needed to ensure that the most useful information is always available to the analyst. More detailed work on extracting and formulating Axioms on Demand will allow ontologies to become more useful knowledge bases. Work on reasoning will allow the system to perform preliminary analysis and present it to the analyst to aid the critical thinking process. Mechanisms for connecting with disparate knowledge bases and ontologies are also being explored to improve the utility and structure of knowledge available to the analyst. This work has already had a large impact on text processing by bridging the gap to machine text understanding, enabling powerful technologies like QA, reasoning and inference, IE, and summarization. Overall, the current system provides a very strong foundation for future endeavors and possesses a great deal of utility in its own right.

Acknowledgments

This material is based upon work funded in part by the U.S. Government and any opinions, findings, conclusions, or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the U.S. Government. Thanks to Altaf Mohammed, Lowell Boggs, Adriana Badulescu, and Ian Niles for their contributions.

References

Adriana Badulescu. Classification of Semantic Relations Between Nouns. Ph.D. Dissertation, University of Texas at Dallas.

Collin F. Baker, Charles J. Fillmore, and John B. Lowe. The Berkeley FrameNet Project. In Proceedings of COLING/ACL '98, Montreal, Canada.

Roxana Girju, et al. Support Vector Machines Applied to the Classification of Semantic Relations in Nominalized Noun Phrases. In Proc. of the Lexical Semantics Workshop, HLT 2004, Boston.

Sanda Harabagiu and Dan Moldovan. Knowledge Processing on an Extended WordNet. In WordNet: An Electronic Lexical Database, C. Fellbaum, editor. MIT Press.

Richards J. Heuer, Jr. Psychology of Intelligence Analysis. Center for the Study of Intelligence, Central Intelligence Agency.

George Miller. WordNet: a lexical database for English. Communications of the ACM, Vol. 38, No. 11: 39-41.

Dan I. Moldovan and Roxana C. Girju. An Interactive Tool for the Rapid Development of Knowledge Bases. International Journal on Artificial Intelligence Tools, vol 10, no 1-2, March.
More informationWhy are Organizations Interested?
SAS Text Analytics Mary-Elizabeth ( M-E ) Eddlestone SAS Customer Loyalty M-E.Eddlestone@sas.com +1 (607) 256-7929 Why are Organizations Interested? Text Analytics 2009: User Perspectives on Solutions
More informationTravis Goodwin & Sanda Harabagiu
Automatic Generation of a Qualified Medical Knowledge Graph and its Usage for Retrieving Patient Cohorts from Electronic Medical Records Travis Goodwin & Sanda Harabagiu Human Language Technology Research
More informationClustering Technique in Data Mining for Text Documents
Clustering Technique in Data Mining for Text Documents Ms.J.Sathya Priya Assistant Professor Dept Of Information Technology. Velammal Engineering College. Chennai. Ms.S.Priyadharshini Assistant Professor
More informationANALYSIS OF LEXICO-SYNTACTIC PATTERNS FOR ANTONYM PAIR EXTRACTION FROM A TURKISH CORPUS
ANALYSIS OF LEXICO-SYNTACTIC PATTERNS FOR ANTONYM PAIR EXTRACTION FROM A TURKISH CORPUS Gürkan Şahin 1, Banu Diri 1 and Tuğba Yıldız 2 1 Faculty of Electrical-Electronic, Department of Computer Engineering
More informationSPATIAL DATA CLASSIFICATION AND DATA MINING
, pp.-40-44. Available online at http://www. bioinfo. in/contents. php?id=42 SPATIAL DATA CLASSIFICATION AND DATA MINING RATHI J.B. * AND PATIL A.D. Department of Computer Science & Engineering, Jawaharlal
More informationWhat Is This, Anyway: Automatic Hypernym Discovery
What Is This, Anyway: Automatic Hypernym Discovery Alan Ritter and Stephen Soderland and Oren Etzioni Turing Center Department of Computer Science and Engineering University of Washington Box 352350 Seattle,
More information72. Ontology Driven Knowledge Discovery Process: a proposal to integrate Ontology Engineering and KDD
72. Ontology Driven Knowledge Discovery Process: a proposal to integrate Ontology Engineering and KDD Paulo Gottgtroy Auckland University of Technology Paulo.gottgtroy@aut.ac.nz Abstract This paper is
More informationModern Natural Language Interfaces to Databases: Composing Statistical Parsing with Semantic Tractability
Modern Natural Language Interfaces to Databases: Composing Statistical Parsing with Semantic Tractability Ana-Maria Popescu Alex Armanasu Oren Etzioni University of Washington David Ko {amp, alexarm, etzioni,
More informationSemantic Search in Portals using Ontologies
Semantic Search in Portals using Ontologies Wallace Anacleto Pinheiro Ana Maria de C. Moura Military Institute of Engineering - IME/RJ Department of Computer Engineering - Rio de Janeiro - Brazil [awallace,anamoura]@de9.ime.eb.br
More informationEnglish Grammar Checker
International l Journal of Computer Sciences and Engineering Open Access Review Paper Volume-4, Issue-3 E-ISSN: 2347-2693 English Grammar Checker Pratik Ghosalkar 1*, Sarvesh Malagi 2, Vatsal Nagda 3,
More informationTaxonomy learning factoring the structure of a taxonomy into a semantic classification decision
Taxonomy learning factoring the structure of a taxonomy into a semantic classification decision Viktor PEKAR Bashkir State University Ufa, Russia, 450000 vpekar@ufanet.ru Steffen STAAB Institute AIFB,
More informationAn Open Platform for Collecting Domain Specific Web Pages and Extracting Information from Them
An Open Platform for Collecting Domain Specific Web Pages and Extracting Information from Them Vangelis Karkaletsis and Constantine D. Spyropoulos NCSR Demokritos, Institute of Informatics & Telecommunications,
More informationCINTIL-PropBank. CINTIL-PropBank Sub-corpus id Sentences Tokens Domain Sentences for regression atsts 779 5,654 Test
CINTIL-PropBank I. Basic Information 1.1. Corpus information The CINTIL-PropBank (Branco et al., 2012) is a set of sentences annotated with their constituency structure and semantic role tags, composed
More informationFlattening Enterprise Knowledge
Flattening Enterprise Knowledge Do you Control Your Content or Does Your Content Control You? 1 Executive Summary: Enterprise Content Management (ECM) is a common buzz term and every IT manager knows it
More informationWord Completion and Prediction in Hebrew
Experiments with Language Models for בס"ד Word Completion and Prediction in Hebrew 1 Yaakov HaCohen-Kerner, Asaf Applebaum, Jacob Bitterman Department of Computer Science Jerusalem College of Technology
More informationSustaining Privacy Protection in Personalized Web Search with Temporal Behavior
Sustaining Privacy Protection in Personalized Web Search with Temporal Behavior N.Jagatheshwaran 1 R.Menaka 2 1 Final B.Tech (IT), jagatheshwaran.n@gmail.com, Velalar College of Engineering and Technology,
More informationTaxonomies in Practice Welcome to the second decade of online taxonomy construction
Building a Taxonomy for Auto-classification by Wendi Pohs EDITOR S SUMMARY Taxonomies have expanded from browsing aids to the foundation for automatic classification. Early auto-classification methods
More informationC o p yr i g ht 2015, S A S I nstitute Inc. A l l r i g hts r eser v ed. INTRODUCTION TO SAS TEXT MINER
INTRODUCTION TO SAS TEXT MINER TODAY S AGENDA INTRODUCTION TO SAS TEXT MINER Define data mining Overview of SAS Enterprise Miner Describe text analytics and define text data mining Text Mining Process
More informationAn Approach towards Automation of Requirements Analysis
An Approach towards Automation of Requirements Analysis Vinay S, Shridhar Aithal, Prashanth Desai Abstract-Application of Natural Language processing to requirements gathering to facilitate automation
More informationAccelerating and Evaluation of Syntactic Parsing in Natural Language Question Answering Systems
Accelerating and Evaluation of Syntactic Parsing in Natural Language Question Answering Systems cation systems. For example, NLP could be used in Question Answering (QA) systems to understand users natural
More informationData Catalogs for Hadoop Achieving Shared Knowledge and Re-usable Data Prep. Neil Raden Hired Brains Research, LLC
Data Catalogs for Hadoop Achieving Shared Knowledge and Re-usable Data Prep Neil Raden Hired Brains Research, LLC Traditionally, the job of gathering and integrating data for analytics fell on data warehouses.
More informationDATA MINING TECHNOLOGY. Keywords: data mining, data warehouse, knowledge discovery, OLAP, OLAM.
DATA MINING TECHNOLOGY Georgiana Marin 1 Abstract In terms of data processing, classical statistical models are restrictive; it requires hypotheses, the knowledge and experience of specialists, equations,
More informationDomain Independent Knowledge Base Population From Structured and Unstructured Data Sources
Proceedings of the Twenty-Fourth International Florida Artificial Intelligence Research Society Conference Domain Independent Knowledge Base Population From Structured and Unstructured Data Sources Michelle
More informationPhase 2 of the D4 Project. Helmut Schmid and Sabine Schulte im Walde
Statistical Verb-Clustering Model soft clustering: Verbs may belong to several clusters trained on verb-argument tuples clusters together verbs with similar subcategorization and selectional restriction
More informationSemantic annotation of requirements for automatic UML class diagram generation
www.ijcsi.org 259 Semantic annotation of requirements for automatic UML class diagram generation Soumaya Amdouni 1, Wahiba Ben Abdessalem Karaa 2 and Sondes Bouabid 3 1 University of tunis High Institute
More informationWikipedia and Web document based Query Translation and Expansion for Cross-language IR
Wikipedia and Web document based Query Translation and Expansion for Cross-language IR Ling-Xiang Tang 1, Andrew Trotman 2, Shlomo Geva 1, Yue Xu 1 1Faculty of Science and Technology, Queensland University
More informationAN APPROACH TO WORD SENSE DISAMBIGUATION COMBINING MODIFIED LESK AND BAG-OF-WORDS
AN APPROACH TO WORD SENSE DISAMBIGUATION COMBINING MODIFIED LESK AND BAG-OF-WORDS Alok Ranjan Pal 1, 3, Anirban Kundu 2, 3, Abhay Singh 1, Raj Shekhar 1, Kunal Sinha 1 1 College of Engineering and Management,
More informationChapter 8. Final Results on Dutch Senseval-2 Test Data
Chapter 8 Final Results on Dutch Senseval-2 Test Data The general idea of testing is to assess how well a given model works and that can only be done properly on data that has not been seen before. Supervised
More informationDomain Classification of Technical Terms Using the Web
Systems and Computers in Japan, Vol. 38, No. 14, 2007 Translated from Denshi Joho Tsushin Gakkai Ronbunshi, Vol. J89-D, No. 11, November 2006, pp. 2470 2482 Domain Classification of Technical Terms Using
More informationA Framework for Ontology-Based Knowledge Management System
A Framework for Ontology-Based Knowledge Management System Jiangning WU Institute of Systems Engineering, Dalian University of Technology, Dalian, 116024, China E-mail: jnwu@dlut.edu.cn Abstract Knowledge
More informationBusiness Intelligence and Decision Support Systems
Chapter 12 Business Intelligence and Decision Support Systems Information Technology For Management 7 th Edition Turban & Volonino Based on lecture slides by L. Beaubien, Providence College John Wiley
More informationONLINE RESUME PARSING SYSTEM USING TEXT ANALYTICS
ONLINE RESUME PARSING SYSTEM USING TEXT ANALYTICS Divyanshu Chandola 1, Aditya Garg 2, Ankit Maurya 3, Amit Kushwaha 4 1 Student, Department of Information Technology, ABES Engineering College, Uttar Pradesh,
More informationParaphrasing controlled English texts
Paraphrasing controlled English texts Kaarel Kaljurand Institute of Computational Linguistics, University of Zurich kaljurand@gmail.com Abstract. We discuss paraphrasing controlled English texts, by defining
More informationEffective Data Retrieval Mechanism Using AML within the Web Based Join Framework
Effective Data Retrieval Mechanism Using AML within the Web Based Join Framework Usha Nandini D 1, Anish Gracias J 2 1 ushaduraisamy@yahoo.co.in 2 anishgracias@gmail.com Abstract A vast amount of assorted
More informationWeb-Scale Extraction of Structured Data Michael J. Cafarella, Jayant Madhavan & Alon Halevy
The Deep Web: Surfacing Hidden Value Michael K. Bergman Web-Scale Extraction of Structured Data Michael J. Cafarella, Jayant Madhavan & Alon Halevy Presented by Mat Kelly CS895 Web-based Information Retrieval
More informationWIKITOLOGY: A NOVEL HYBRID KNOWLEDGE BASE DERIVED FROM WIKIPEDIA. by Zareen Saba Syed
WIKITOLOGY: A NOVEL HYBRID KNOWLEDGE BASE DERIVED FROM WIKIPEDIA by Zareen Saba Syed Thesis submitted to the Faculty of the Graduate School of the University of Maryland in partial fulfillment of the requirements
More informationDiagnosis Code Assignment Support Using Random Indexing of Patient Records A Qualitative Feasibility Study
Diagnosis Code Assignment Support Using Random Indexing of Patient Records A Qualitative Feasibility Study Aron Henriksson 1, Martin Hassel 1, and Maria Kvist 1,2 1 Department of Computer and System Sciences
More informationCENG 734 Advanced Topics in Bioinformatics
CENG 734 Advanced Topics in Bioinformatics Week 9 Text Mining for Bioinformatics: BioCreative II.5 Fall 2010-2011 Quiz #7 1. Draw the decompressed graph for the following graph summary 2. Describe the
More informationONTOLOGY FOR MOBILE PHONE OPERATING SYSTEMS
ONTOLOGY FOR MOBILE PHONE OPERATING SYSTEMS Hasni Neji and Ridha Bouallegue Innov COM Lab, Higher School of Communications of Tunis, Sup Com University of Carthage, Tunis, Tunisia. Email: hasni.neji63@laposte.net;
More informationResolving Common Analytical Tasks in Text Databases
Resolving Common Analytical Tasks in Text Databases The work is funded by the Federal Ministry of Economic Affairs and Energy (BMWi) under grant agreement 01MD15010B. Database Systems and Text-based Information
More informationImplementation of hybrid software architecture for Artificial Intelligence System
IJCSNS International Journal of Computer Science and Network Security, VOL.7 No.1, January 2007 35 Implementation of hybrid software architecture for Artificial Intelligence System B.Vinayagasundaram and
More informationLanguage and Computation
Language and Computation week 13, Thursday, April 24 Tamás Biró Yale University tamas.biro@yale.edu http://www.birot.hu/courses/2014-lc/ Tamás Biró, Yale U., Language and Computation p. 1 Practical matters
More informationDATA PREPARATION FOR DATA MINING
Applied Artificial Intelligence, 17:375 381, 2003 Copyright # 2003 Taylor & Francis 0883-9514/03 $12.00 +.00 DOI: 10.1080/08839510390219264 u DATA PREPARATION FOR DATA MINING SHICHAO ZHANG and CHENGQI
More informationSearch and Information Retrieval
Search and Information Retrieval Search on the Web 1 is a daily activity for many people throughout the world Search and communication are most popular uses of the computer Applications involving search
More informationQuestion Answering and Multilingual CLEF 2008
Dublin City University at QA@CLEF 2008 Sisay Fissaha Adafre Josef van Genabith National Center for Language Technology School of Computing, DCU IBM CAS Dublin sadafre,josef@computing.dcu.ie Abstract We
More informationIMPROVING DATA INTEGRATION FOR DATA WAREHOUSE: A DATA MINING APPROACH
IMPROVING DATA INTEGRATION FOR DATA WAREHOUSE: A DATA MINING APPROACH Kalinka Mihaylova Kaloyanova St. Kliment Ohridski University of Sofia, Faculty of Mathematics and Informatics Sofia 1164, Bulgaria
More informationNgram Search Engine with Patterns Combining Token, POS, Chunk and NE Information
Ngram Search Engine with Patterns Combining Token, POS, Chunk and NE Information Satoshi Sekine Computer Science Department New York University sekine@cs.nyu.edu Kapil Dalwani Computer Science Department
More informationDetecting Parser Errors Using Web-based Semantic Filters
Detecting Parser Errors Using Web-based Semantic Filters Alexander Yates Stefan Schoenmackers University of Washington Computer Science and Engineering Box 352350 Seattle, WA 98195-2350 Oren Etzioni {ayates,
More informationThe University of Washington s UW CLMA QA System
The University of Washington s UW CLMA QA System Dan Jinguji, William Lewis,EfthimisN.Efthimiadis, Joshua Minor, Albert Bertram, Shauna Eggers, Joshua Johanson,BrianNisonger,PingYu, and Zhengbo Zhou Computational
More informationCustomer Intentions Analysis of Twitter Based on Semantic Patterns
Customer Intentions Analysis of Twitter Based on Semantic Patterns Mohamed Hamroun mohamed.hamrounn@gmail.com Mohamed Salah Gouider ms.gouider@yahoo.fr Lamjed Ben Said lamjed.bensaid@isg.rnu.tn ABSTRACT
More informationHow To Use Data Mining For Knowledge Management In Technology Enhanced Learning
Proceedings of the 6th WSEAS International Conference on Applications of Electrical Engineering, Istanbul, Turkey, May 27-29, 2007 115 Data Mining for Knowledge Management in Technology Enhanced Learning
More informationLanguage Interface for an XML. Constructing a Generic Natural. Database. Rohit Paravastu
Constructing a Generic Natural Language Interface for an XML Database Rohit Paravastu Motivation Ability to communicate with a database in natural language regarded as the ultimate goal for DB query interfaces
More informationWHITEPAPER. Text Analytics Beginner s Guide
WHITEPAPER Text Analytics Beginner s Guide What is Text Analytics? Text Analytics describes a set of linguistic, statistical, and machine learning techniques that model and structure the information content
More informationDynamic Data in terms of Data Mining Streams
International Journal of Computer Science and Software Engineering Volume 2, Number 1 (2015), pp. 1-6 International Research Publication House http://www.irphouse.com Dynamic Data in terms of Data Mining
More informationA Statistical Text Mining Method for Patent Analysis
A Statistical Text Mining Method for Patent Analysis Department of Statistics Cheongju University, shjun@cju.ac.kr Abstract Most text data from diverse document databases are unsuitable for analytical
More information» A Hardware & Software Overview. Eli M. Dow <emdow@us.ibm.com:>
» A Hardware & Software Overview Eli M. Dow Overview:» Hardware» Software» Questions 2011 IBM Corporation Early implementations of Watson ran on a single processor where it took 2 hours
More informationDatabase Marketing, Business Intelligence and Knowledge Discovery
Database Marketing, Business Intelligence and Knowledge Discovery Note: Using material from Tan / Steinbach / Kumar (2005) Introduction to Data Mining,, Addison Wesley; and Cios / Pedrycz / Swiniarski
More information11-792 Software Engineering EMR Project Report
11-792 Software Engineering EMR Project Report Team Members Phani Gadde Anika Gupta Ting-Hao (Kenneth) Huang Chetan Thayur Suyoun Kim Vision Our aim is to build an intelligent system which is capable of
More informationShallow Parsing with Apache UIMA
Shallow Parsing with Apache UIMA Graham Wilcock University of Helsinki Finland graham.wilcock@helsinki.fi Abstract Apache UIMA (Unstructured Information Management Architecture) is a framework for linguistic
More informationTowards Robust High Performance Word Sense Disambiguation of English Verbs Using Rich Linguistic Features
Towards Robust High Performance Word Sense Disambiguation of English Verbs Using Rich Linguistic Features Jinying Chen and Martha Palmer Department of Computer and Information Science, University of Pennsylvania,
More informationRequirements Analysis Concepts & Principles. Instructor: Dr. Jerry Gao
Requirements Analysis Concepts & Principles Instructor: Dr. Jerry Gao Requirements Analysis Concepts and Principles - Requirements Analysis - Communication Techniques - Initiating the Process - Facilitated
More informationMETA DATA QUALITY CONTROL ARCHITECTURE IN DATA WAREHOUSING
META DATA QUALITY CONTROL ARCHITECTURE IN DATA WAREHOUSING Ramesh Babu Palepu 1, Dr K V Sambasiva Rao 2 Dept of IT, Amrita Sai Institute of Science & Technology 1 MVR College of Engineering 2 asistithod@gmail.com
More informationPersonalization of Web Search With Protected Privacy
Personalization of Web Search With Protected Privacy S.S DIVYA, R.RUBINI,P.EZHIL Final year, Information Technology,KarpagaVinayaga College Engineering and Technology, Kanchipuram [D.t] Final year, Information
More informationTechnical Report. The KNIME Text Processing Feature:
Technical Report The KNIME Text Processing Feature: An Introduction Dr. Killian Thiel Dr. Michael Berthold Killian.Thiel@uni-konstanz.de Michael.Berthold@uni-konstanz.de Copyright 2012 by KNIME.com AG
More informationI. INTRODUCTION NOESIS ONTOLOGIES SEMANTICS AND ANNOTATION
Noesis: A Semantic Search Engine and Resource Aggregator for Atmospheric Science Sunil Movva, Rahul Ramachandran, Xiang Li, Phani Cherukuri, Sara Graves Information Technology and Systems Center University
More informationFolksonomies versus Automatic Keyword Extraction: An Empirical Study
Folksonomies versus Automatic Keyword Extraction: An Empirical Study Hend S. Al-Khalifa and Hugh C. Davis Learning Technology Research Group, ECS, University of Southampton, Southampton, SO17 1BJ, UK {hsak04r/hcd}@ecs.soton.ac.uk
More informationA Case Study of Question Answering in Automatic Tourism Service Packaging
BULGARIAN ACADEMY OF SCIENCES CYBERNETICS AND INFORMATION TECHNOLOGIES Volume 13, Special Issue Sofia 2013 Print ISSN: 1311-9702; Online ISSN: 1314-4081 DOI: 10.2478/cait-2013-0045 A Case Study of Question
More informationLegal Informatics Final Paper Submission Creating a Legal-Focused Search Engine I. BACKGROUND II. PROBLEM AND SOLUTION
Brian Lao - bjlao Karthik Jagadeesh - kjag Legal Informatics Final Paper Submission Creating a Legal-Focused Search Engine I. BACKGROUND There is a large need for improved access to legal help. For example,
More informationMining Opinion Features in Customer Reviews
Mining Opinion Features in Customer Reviews Minqing Hu and Bing Liu Department of Computer Science University of Illinois at Chicago 851 South Morgan Street Chicago, IL 60607-7053 {mhu1, liub}@cs.uic.edu
More informationdm106 TEXT MINING FOR CUSTOMER RELATIONSHIP MANAGEMENT: AN APPROACH BASED ON LATENT SEMANTIC ANALYSIS AND FUZZY CLUSTERING
dm106 TEXT MINING FOR CUSTOMER RELATIONSHIP MANAGEMENT: AN APPROACH BASED ON LATENT SEMANTIC ANALYSIS AND FUZZY CLUSTERING ABSTRACT In most CRM (Customer Relationship Management) systems, information on
More informationThe compositional semantics of same
The compositional semantics of same Mike Solomon Amherst College Abstract Barker (2007) proposes the first strictly compositional semantic analysis of internal same. I show that Barker s analysis fails
More informationINF5820 Natural Language Processing - NLP. H2009 Jan Tore Lønning jtl@ifi.uio.no
INF5820 Natural Language Processing - NLP H2009 Jan Tore Lønning jtl@ifi.uio.no Semantic Role Labeling INF5830 Lecture 13 Nov 4, 2009 Today Some words about semantics Thematic/semantic roles PropBank &
More informationClick to edit Master title style
Click to edit Master title style UNCLASSIFIED//FOR OFFICIAL USE ONLY Dr. Russell D. Richardson, G2/INSCOM Science Advisor UNCLASSIFIED//FOR OFFICIAL USE ONLY 1 UNCLASSIFIED Semantic Enrichment of the Data
More information