# Fact-based question decomposition in DeepQA

Save this PDF as:

Size: px
Start display at page:

## Transcription

4 types of patterns, targeting parallel configurations as described below. Table 1 Parallel decomposition rule coverage. 1. Independent subtreesvone strategy for identifying potentially independent subquestions within a question is to look for clauses that are likely to capture a separate piece of information about the answer from the rest of the question. Relative or subordinate clauses are examples of such potentially independent subtrees and are typically indicative of parallel decomposition, as shown in the example below: (6) FICTIONAL ANIMALS: The name of this character, introduced in 1894, comes from the Hindi for Bbear[. We use syntactic cues that indicate such clauses within the PAS of the question to identify these parallel decomposable questions. Labels of edges in the PAS that connect such subtrees to the focus are generally good indicators of a subquestion and are used to identify the subtree as a decomposable fact. In example (6), the patterns identify Bthis character, first introduced in 1894[ as a decomposed subquestion. Another syntactic cue involves patterns using conjunctions as decomposition points, as shown in example (2) mentioned earlier. 2. Composable unitsvan alternate strategy for identifying subquestions is to Bcompose[ the fact by combining elements from the question. In contrast to the previous type of patterns where we try to find a portion of the PAS that can be Bbroken off[ as an independent subquestion, the Bcomposable units[ patterns take different parts of the PAS and combine them to form a subquestion. For example, from question (7), (7) THE CABINET: James Wilson of Iowa, who headed this Department for 16 years, served longer than any other cabinet officer, we derive a subquestion BJames Wilson of Iowa headed this Department[ by composing the elements in subject, verb, and object positions of the PAS, respectively. 3. Segments with qualifiersvwe employ a separate group of patterns for cases where the modifier of the focus is a relative qualifier, such as the first, only, and the westernmost. In such cases, information from another clause is usually required to Bcomplete[ the relative qualifier. Without it, the statement or fact has incomplete information (e.g., Bthe third man[) and needs a supporting clause (e.g., Bthe third man... to climb Mt. Everest[) for the decomposition to make sense as a fact about the correct answer. Because it occurs often enough in the Jeopardy! data set, we have a separate category for such patterns. To deal with these cases, we created patterns that combine the characteristics of composable unit patterns with the independent subtree patterns. We Bcompose[ the relative qualifier, the focus (along with its modifiers), and the attached supporting clause subtree to instantiate this type of pattern. We defined rules to capture each of the patterns and applied our parallel decomposition rules to a blind set of 1,269 Final Jeopardy! questions to get an estimate of how often these patterns occur in the data. Results are shown in Table 1. The table also shows the total distinct questions found parallel decomposable as multiple rules can fire on the same question. We did not evaluate the precision and recall for the parallel decomposition rules for three reasons: Manually creating a standard of parallel decomposable questions with their subparts is laborious, particularly because it requires judging whether facts are mutually independent or not. Even with such a standard, it is difficult to measure our rules by meaningfully comparing or aligning their output with the decompositions in the standard. We use a machine learning model (described later) to suitably combine and weigh the decomposition rules on the basis of their performance on the training set, thus implicitly taking into account their precision and recall (e.g., precise rules get a higher weight in the model). Rewriting subquestions Decomposition rules described in the previous subsection are applied in sequence to a given question to determine whether it can be decomposed. For example, given the question: (8) HISTORIC PEOPLE: The life story of this man who died in 1801 was chronicled in an A&E Biography DVD titled BTriumph and Treason[. The system decomposes the question into two subquestions: Q1: This man who died in Q2: The life story of this man was chronicled 13 : 4 A. KALYANPUR ET AL.

6 Table 2 Evaluating parallel decomposition. for the candidate answer to the sum of the confidences obtained for that answer across all the subquestions. For the machine learning algorithm, we use the Weka [3] implementation of logistic regression with instance weighting. Evaluating parallel decomposition Evaluation data Our data set is a collection of Final Jeopardy! question answer pairs obtained from the J! Archive website (http://www.j-archive.com/). This gives us ground truth for both training a system and evaluating its performance. We split our set of roughly 3,000 Final Jeopardy! questions into a training set of 1,138 questions, a development set of 517 questions, and a blind test set of 1,269 questions. Experiments The decomposition rules were defined and tuned during the development. The final re-ranking model was trained on the training set using the features described in the previous subsection. The model was trained using logistic regression with instance weighting, where positive instances are weighed four times that of negative instances to address the significant imbalance of instances in the two classes. The results of applying the parallel decomposition rules followed by the re-ranking model to the 1,269 blind test questions are shown in the last row of Table 2. The baseline is the performance of the underlying QA (Watson) system used in our meta-framework without the decomposition components or analysis. We further evaluate the importance of the contextualization aspect of our question rewriting technique. For this purpose, we altered our algorithm to issue the subquestion text as is, using the original category, and retrained and evaluated the resultant model again on our test set. Results are in the second row of Table 2. Finally, we also wanted to compare our machine learning-based re-ranking model with a simple heuristic strategy that re-ranks answers based on a new final score computed as the sum of the confidences for the answer across all subquestions, including the original answer confidence for the entire question. Results are in the third row of Table 2. Discussion of results Table 2 shows the end-to-end accuracy of the baseline system against different configurations of the parallel decomposition answering extension over two sets of questions, i.e., the entire blind test set (1,269 questions) and a subset of this identified as parallel decomposable by our patterns (598 out of 1,269 questions, i.e., 47%). Interestingly, the performance of the baseline QA system on the parallel decomposable subset was 56.6%, 6% higher than the performance on all test questions. This is because questions in the parallel decomposable class typically contain more than one fact or constraint that the answer must satisfy, and even without decomposition, the system can exploit this redundancy in some cases such as when one fact is strongly associated with the correct answer and there is evidence supporting this in the sources. Our results also show that without rewriting the questions to include contextual information in the category for subquestions, the system performance did not improve much over the baseline. On the other hand, using rewriting to insert context, the parallel decomposition algorithm using a simple heuristic re-ranking approach started to show some gains over the baseline. Finally, the best result was obtained using a machine learning-based re-ranker with question rewriting, where the system gained 1.4% in accuracy (with 10 gains and 2 losses) on the decomposable question set, which translated to an end-to-end gain of 0.6%. This gain was statistically significant, with significance assessed for p G :05 using McNemar s test [4] with Yates correction for continuity. 13 : 6 A. KALYANPUR ET AL.

11 Adam Lally IBM Research Division, Thomas J. Watson Research Center, Yorktown Heights, NY USA Mr. Lally is a Senior Technical Staff Member at the IBM T. J. Watson Research Center. He received a B.S. degree in computer science from Rensselaer Polytechnic Institute in 1998 and an M.S. degree in computer science from Columbia University in As a member of the IBM DeepQA Algorithms Team, he helped develop the Watson system architecture that gave the machine its speed. He also worked on the natural-language processing algorithms that enable Watson to understand questions and categories and gather and assess evidence in natural language. Before working on Watson, he was the lead software engineer for the Unstructured Information Management Architecture project, an open-source platform for creating, integrating, and deploying unstructured information management solutions. Jennifer Chu-Carroll IBM Research Division, Thomas J. Watson Research Center, Yorktown Heights, NY USA Dr. Chu-Carroll is a Research Staff Member in the Semantic Analysis and Integration Department at the T. J. Watson Research Center. She received a Ph.D. degree in computer science from the University of Delaware in Prior to joining IBM in 2001, she spent 5 years as a Member of Technical Staff at Lucent Technologies Bell Laboratories. Dr. Chu-Carroll s research interests are in the area of natural-language processing, more specifically in question-answering and dialogue systems. Dr. Chu-Carroll serves on numerous technical committees, including as program committee co-chair of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL HLT) 2006 and as general chair of NAACL HLT A. KALYANPUR ET AL. 13 : 11

### Structure Mapping for Jeopardy! Clues

Structure Mapping for Jeopardy! Clues J. William Murdock murdockj@us.ibm.com IBM T.J. Watson Research Center P.O. Box 704 Yorktown Heights, NY 10598 Abstract. The Jeopardy! television quiz show asks natural-language

### Watson: An Overview of the DeepQA Project

Watson: An Overview of the DeepQA Project AI Magazine 2010 IMB TEAM: David Ferrucci, Eric Brown, Jennifer Chu-Carroll, James Fan, David Gondek, Aditya A. Kalyanpur, Adam Lally, J. William Murdock, Eric

### Decision Making in IBM Watson Question Answering

Decision Making in IBM Watson Question ing Dr. J. William Murdock IBM Watson Research Center IBM, the IBM logo, ibm.com, Smarter Planet and the planet icon are trademarks of International Business Machines

### » A Hardware & Software Overview. Eli M. Dow

» A Hardware & Software Overview Eli M. Dow Overview:» Hardware» Software» Questions 2011 IBM Corporation Early implementations of Watson ran on a single processor where it took 2 hours

### Semantic Technologies in IBM Watson TM

Semantic Technologies in IBM Watson TM Alfio Gliozzo IBM Watson Research Center Yorktown Heights, NY 10598 gliozzo@us.ibm.com Siddharth Patwardhan IBM Watson Research Center Yorktown Heights, NY 10598

The Prolog Interface to the Unstructured Information Management Architecture Paul Fodor 1, Adam Lally 2, David Ferrucci 2 1 Stony Brook University, Stony Brook, NY 11794, USA, pfodor@cs.sunysb.edu 2 IBM

### Finding Answers in Semi-Structured Data

Finding Answers in Semi-Structured Data Joe DiFebo Sean Massung December 19, 2012 Abstract Using Wikipedia [5] as a knowledge base is a relatively new branch in the thought process of the Question Answering

### IBM Bluemix- Developing a Travel guide using Watson services

IBM Bluemix- Developing a Travel guide using Watson services by Mahalakshmy Krishnamoorthy A Project Report Submitted in Partial Fulfillment of the Requirements for the Degree of Master of Science in Computer

### Generating SQL Queries Using Natural Language Syntactic Dependencies and Metadata

Generating SQL Queries Using Natural Language Syntactic Dependencies and Metadata Alessandra Giordani and Alessandro Moschitti Department of Computer Science and Engineering University of Trento Via Sommarive

### How Watson Works. Dave Mobley Watson Solutions Architect, Watson Technical Sales 1/22/14

How Watson Works Dave Mobley Watson Solutions Architect, Watson Technical Sales /22/4 Page IBM is a registered trademark of the International Business Machines Corporation in the United States, or other

### MAN VS. MACHINE. How IBM Built a Jeopardy! Champion. 15.071x The Analytics Edge

MAN VS. MACHINE How IBM Built a Jeopardy! Champion 15.071x The Analytics Edge A Grand Challenge In 2004, IBM Vice President Charles Lickel and coworkers were having dinner at a restaurant All of a sudden,

### A Statistical Text Mining Method for Patent Analysis

A Statistical Text Mining Method for Patent Analysis Department of Statistics Cheongju University, shjun@cju.ac.kr Abstract Most text data from diverse document databases are unsuitable for analytical

### Overview of the TACITUS Project

Overview of the TACITUS Project Jerry R. Hobbs Artificial Intelligence Center SRI International 1 Aims of the Project The specific aim of the TACITUS project is to develop interpretation processes for

### Answering Questions Requiring Cross-passage Evidence

Answering Questions Requiring Cross-passage Evidence Kisuh Ahn Dept. of Linguistics Hankuk University of Foreign Studies San 89 Wangsan-li, Mohyeon-myeon Yongin-si, Gyeongi-do, Korea kisuhahn@gmail.com

### Natural Language to Relational Query by Using Parsing Compiler

Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 4, Issue. 3, March 2015,

### TREC 2003 Question Answering Track at CAS-ICT

TREC 2003 Question Answering Track at CAS-ICT Yi Chang, Hongbo Xu, Shuo Bai Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China changyi@software.ict.ac.cn http://www.ict.ac.cn/

### Architecture of an Ontology-Based Domain- Specific Natural Language Question Answering System

Architecture of an Ontology-Based Domain- Specific Natural Language Question Answering System Athira P. M., Sreeja M. and P. C. Reghuraj Department of Computer Science and Engineering, Government Engineering

### IBM Announces Eight Universities Contributing to the Watson Computing System's Development

IBM Announces Eight Universities Contributing to the Watson Computing System's Development Press release Related XML feeds Contact(s) information Related resources ARMONK, N.Y. - 11 Feb 2011: IBM (NYSE:

### Typing candidate answers using type coercion

Typing candidate answers using type coercion Many questions explicitly indicate the type of answer required. One popular approach to answering those questions is to develop recognizers to identify instances

### A Survey on Product Aspect Ranking

A Survey on Product Aspect Ranking Charushila Patil 1, Prof. P. M. Chawan 2, Priyamvada Chauhan 3, Sonali Wankhede 4 M. Tech Student, Department of Computer Engineering and IT, VJTI College, Mumbai, Maharashtra,

### Finding needles in the haystack: Search and candidate generation

Finding needles in the haystack: Search and candidate generation A key phase in the DeepQA architecture is Hypothesis Generation, in which candidate system responses are generated for downstream scoring

### End-to-End Sentiment Analysis of Twitter Data

End-to-End Sentiment Analysis of Twitter Data Apoor v Agarwal 1 Jasneet Singh Sabharwal 2 (1) Columbia University, NY, U.S.A. (2) Guru Gobind Singh Indraprastha University, New Delhi, India apoorv@cs.columbia.edu,

### Building a Question Classifier for a TREC-Style Question Answering System

Building a Question Classifier for a TREC-Style Question Answering System Richard May & Ari Steinberg Topic: Question Classification We define Question Classification (QC) here to be the task that, given

21st International Congress on Modelling and Simulation, Gold Coast, Australia, 29 Nov to 4 Dec 2015 www.mssanz.org.au/modsim2015 On the Feasibility of Answer Suggestion for Advice-seeking Community Questions

### Identifying implicit relationships

Identifying implicit relationships Answering natural-language questions may often involve identifying hidden associations and implicit relationships. In some cases, an explicit question is asked by the

### PRODUCT REVIEW RANKING SUMMARIZATION

PRODUCT REVIEW RANKING SUMMARIZATION N.P.Vadivukkarasi, Research Scholar, Department of Computer Science, Kongu Arts and Science College, Erode. Dr. B. Jayanthi M.C.A., M.Phil., Ph.D., Associate Professor,

### The goals of IBM Research are to advance computer science

Building Watson: An Overview of the DeepQA Project David Ferrucci, Eric Brown, Jennifer Chu-Carroll, James Fan, David Gondek, Aditya A. Kalyanpur, Adam Lally, J. William Murdock, Eric Nyberg, John Prager,

### Software Engineering of NLP-based Computer-assisted Coding Applications

Software Engineering of NLP-based Computer-assisted Coding Applications 1 Software Engineering of NLP-based Computer-assisted Coding Applications by Mark Morsch, MS; Carol Stoyla, BS, CLA; Ronald Sheffer,

### Mavuno: A Scalable and Effective Hadoop-Based Paraphrase Acquisition System

Mavuno: A Scalable and Effective Hadoop-Based Paraphrase Acquisition System Donald Metzler and Eduard Hovy Information Sciences Institute University of Southern California Overview Mavuno Paraphrases 101

### IBM Research Report. Towards the Open Advancement of Question Answering Systems

RC24789 (W0904-093) April 22, 2009 Computer Science IBM Research Report Towards the Open Advancement of Question Answering Systems David Ferrucci 1, Eric Nyberg 2, James Allan 3, Ken Barker 4, Eric Brown

### Subordinating to the Majority: Factoid Question Answering over CQA Sites

Journal of Computational Information Systems 9: 16 (2013) 6409 6416 Available at http://www.jofcis.com Subordinating to the Majority: Factoid Question Answering over CQA Sites Xin LIAN, Xiaojie YUAN, Haiwei

### Domain Adaptive Relation Extraction for Big Text Data Analytics. Feiyu Xu

Domain Adaptive Relation Extraction for Big Text Data Analytics Feiyu Xu Outline! Introduction to relation extraction and its applications! Motivation of domain adaptation in big text data analytics! Solutions!

### A View Integration Approach to Dynamic Composition of Web Services

A View Integration Approach to Dynamic Composition of Web Services Snehal Thakkar, Craig A. Knoblock, and José Luis Ambite University of Southern California/ Information Sciences Institute 4676 Admiralty

### Wikipedia and Web document based Query Translation and Expansion for Cross-language IR

Wikipedia and Web document based Query Translation and Expansion for Cross-language IR Ling-Xiang Tang 1, Andrew Trotman 2, Shlomo Geva 1, Yue Xu 1 1Faculty of Science and Technology, Queensland University

### DEVELOPMENT AND ANALYSIS OF HINDI-URDU PARALLEL CORPUS

DEVELOPMENT AND ANALYSIS OF HINDI-URDU PARALLEL CORPUS Mandeep Kaur GNE, Ludhiana Ludhiana, India Navdeep Kaur SLIET, Longowal Sangrur, India Abstract This paper deals with Development and Analysis of

### Mining the Software Change Repository of a Legacy Telephony System

Mining the Software Change Repository of a Legacy Telephony System Jelber Sayyad Shirabad, Timothy C. Lethbridge, Stan Matwin School of Information Technology and Engineering University of Ottawa, Ottawa,

### CONFIOUS * : Managing the Electronic Submission and Reviewing Process of Scientific Conferences

CONFIOUS * : Managing the Electronic Submission and Reviewing Process of Scientific Conferences Manos Papagelis 1, 2, Dimitris Plexousakis 1, 2 and Panagiotis N. Nikolaou 2 1 Institute of Computer Science,

### Facilitating Business Process Discovery using Email Analysis

Facilitating Business Process Discovery using Email Analysis Matin Mavaddat Matin.Mavaddat@live.uwe.ac.uk Stewart Green Stewart.Green Ian Beeson Ian.Beeson Jin Sa Jin.Sa Abstract Extracting business process

### Text Analytics with Ambiverse. Text to Knowledge. www.ambiverse.com

Text Analytics with Ambiverse Text to Knowledge www.ambiverse.com Version 1.0, February 2016 WWW.AMBIVERSE.COM Contents 1 Ambiverse: Text to Knowledge............................... 5 1.1 Text is all Around

### A Quirk Review of Translation Models

A Quirk Review of Translation Models Jianfeng Gao, Microsoft Research Prepared in connection with the 2010 ACL/SIGIR summer school July 22, 2011 The goal of machine translation (MT) is to use a computer

### Data Mining Yelp Data - Predicting rating stars from review text

Data Mining Yelp Data - Predicting rating stars from review text Rakesh Chada Stony Brook University rchada@cs.stonybrook.edu Chetan Naik Stony Brook University cnaik@cs.stonybrook.edu ABSTRACT The majority

### Watson and Jeopardy! In the game: The interface between. B. L. Lewis

In the game: The interface between Watson and Jeopardy! B. L. Lewis To play as a contestant in Jeopardy!TM, IBM Watson TM needed an interface program to handle the communications between the Jeopardy!

### Computational Linguistics: Syntax I

Computational Linguistics: Syntax I Raffaella Bernardi KRDB, Free University of Bozen-Bolzano P.zza Domenicani, Room: 2.28, e-mail: bernardi@inf.unibz.it Contents 1 Syntax...................................................

### Collecting Polish German Parallel Corpora in the Internet

Proceedings of the International Multiconference on ISSN 1896 7094 Computer Science and Information Technology, pp. 285 292 2007 PIPS Collecting Polish German Parallel Corpora in the Internet Monika Rosińska

### Qualitative Corporate Dashboards for Corporate Monitoring Peng Jia and Miklos A. Vasarhelyi 1

Qualitative Corporate Dashboards for Corporate Monitoring Peng Jia and Miklos A. Vasarhelyi 1 Introduction Electronic Commerce 2 is accelerating dramatically changes in the business process. Electronic

### Numerical Algorithms for Predicting Sports Results

Numerical Algorithms for Predicting Sports Results by Jack David Blundell, 1 School of Computing, Faculty of Engineering ABSTRACT Numerical models can help predict the outcome of sporting events. The features

### Paraphrasing controlled English texts

Paraphrasing controlled English texts Kaarel Kaljurand Institute of Computational Linguistics, University of Zurich kaljurand@gmail.com Abstract. We discuss paraphrasing controlled English texts, by defining

### Effect of Using Neural Networks in GA-Based School Timetabling

Effect of Using Neural Networks in GA-Based School Timetabling JANIS ZUTERS Department of Computer Science University of Latvia Raina bulv. 19, Riga, LV-1050 LATVIA janis.zuters@lu.lv Abstract: - The school

### Effective Data Retrieval Mechanism Using AML within the Web Based Join Framework

Effective Data Retrieval Mechanism Using AML within the Web Based Join Framework Usha Nandini D 1, Anish Gracias J 2 1 ushaduraisamy@yahoo.co.in 2 anishgracias@gmail.com Abstract A vast amount of assorted

### Interactive Dynamic Information Extraction

Interactive Dynamic Information Extraction Kathrin Eichler, Holmer Hemsen, Markus Löckelt, Günter Neumann, and Norbert Reithinger Deutsches Forschungszentrum für Künstliche Intelligenz - DFKI, 66123 Saarbrücken

### An Open Platform for Collecting Domain Specific Web Pages and Extracting Information from Them

An Open Platform for Collecting Domain Specific Web Pages and Extracting Information from Them Vangelis Karkaletsis and Constantine D. Spyropoulos NCSR Demokritos, Institute of Informatics & Telecommunications,

### Experiments in Web Page Classification for Semantic Web

Experiments in Web Page Classification for Semantic Web Asad Satti, Nick Cercone, Vlado Kešelj Faculty of Computer Science, Dalhousie University E-mail: {rashid,nick,vlado}@cs.dal.ca Abstract We address

### Mining Opinion Features in Customer Reviews

Mining Opinion Features in Customer Reviews Minqing Hu and Bing Liu Department of Computer Science University of Illinois at Chicago 851 South Morgan Street Chicago, IL 60607-7053 {mhu1, liub}@cs.uic.edu

### Phrases. Topics for Today. Phrases. POS Tagging. ! Text transformation. ! Text processing issues

Topics for Today! Text transformation Word occurrence statistics Tokenizing Stopping and stemming Phrases Document structure Link analysis Information extraction Internationalization Phrases! Many queries

### Network Working Group

Network Working Group Request for Comments: 2413 Category: Informational S. Weibel OCLC Online Computer Library Center, Inc. J. Kunze University of California, San Francisco C. Lagoze Cornell University

### Domain Classification of Technical Terms Using the Web

Systems and Computers in Japan, Vol. 38, No. 14, 2007 Translated from Denshi Joho Tsushin Gakkai Ronbunshi, Vol. J89-D, No. 11, November 2006, pp. 2470 2482 Domain Classification of Technical Terms Using

### Data Quality Mining: Employing Classifiers for Assuring consistent Datasets

Data Quality Mining: Employing Classifiers for Assuring consistent Datasets Fabian Grüning Carl von Ossietzky Universität Oldenburg, Germany, fabian.gruening@informatik.uni-oldenburg.de Abstract: Independent

### 72. Ontology Driven Knowledge Discovery Process: a proposal to integrate Ontology Engineering and KDD

72. Ontology Driven Knowledge Discovery Process: a proposal to integrate Ontology Engineering and KDD Paulo Gottgtroy Auckland University of Technology Paulo.gottgtroy@aut.ac.nz Abstract This paper is

### Watson. An analytical computing system that specializes in natural human language and provides specific answers to complex questions at rapid speeds

Watson An analytical computing system that specializes in natural human language and provides specific answers to complex questions at rapid speeds I.B.M. OHJ-2556 Artificial Intelligence Guest lecturing

### Putting IBM Watson to Work In Healthcare

Martin S. Kohn, MD, MS, FACEP, FACPE Chief Medical Scientist, Care Delivery Systems IBM Research marty.kohn@us.ibm.com Putting IBM Watson to Work In Healthcare 2 SB 1275 Medical data in an electronic or

### Research Statement Immanuel Trummer www.itrummer.org

Research Statement Immanuel Trummer www.itrummer.org We are collecting data at unprecedented rates. This data contains valuable insights, but we need complex analytics to extract them. My research focuses

### Making Watson fast. E. A. Epstein M. I. Schor B. S. Iyer A. Lally E. W. Brown J. Cwiklik

Making Watson fast IBM Watsoni is a system created to demonstrate DeepQA technology by competing against human champions in a question-answering game designed for people. The DeepQA architecture was designed

### Multi-source hybrid Question Answering system

Multi-source hybrid Question Answering system Seonyeong Park, Hyosup Shim, Sangdo Han, Byungsoo Kim, Gary Geunbae Lee Pohang University of Science and Technology, Pohang, Republic of Korea {sypark322,

### Reverse Engineering of Relational Databases to Ontologies: An Approach Based on an Analysis of HTML Forms

Reverse Engineering of Relational Databases to Ontologies: An Approach Based on an Analysis of HTML Forms Irina Astrova 1, Bela Stantic 2 1 Tallinn University of Technology, Ehitajate tee 5, 19086 Tallinn,

### Models of Dissertation Research in Design

Models of Dissertation Research in Design S. Poggenpohl Illinois Institute of Technology, USA K. Sato Illinois Institute of Technology, USA Abstract This paper is a meta-level reflection of actual experience

### Mining Navigation Histories for User Need Recognition

Mining Navigation Histories for User Need Recognition Fabio Gasparetti and Alessandro Micarelli and Giuseppe Sansonetti Roma Tre University, Via della Vasca Navale 79, Rome, 00146 Italy {gaspare,micarel,gsansone}@dia.uniroma3.it

### Answer: Who is Watson and Why Do We Care? Question: The first computer to win on Jeopardy, by a score of \$40,300 to \$30,000

Answer: Who is Watson and Why Do We Care? Question: The first computer to win on Jeopardy, by a score of \$40,300 to \$30,000 Moderator: Dr. Steve Buckley IBM Business Analytics & Optimization Applied Research

### The English Language Learner CAN DO Booklet

WORLD-CLASS INSTRUCTIONAL DESIGN AND ASSESSMENT The English Language Learner CAN DO Booklet Grades 9-12 Includes: Performance Definitions CAN DO Descriptors For use in conjunction with the WIDA English

### The Role of Sentence Structure in Recognizing Textual Entailment

Blake,C. (In Press) The Role of Sentence Structure in Recognizing Textual Entailment. ACL-PASCAL Workshop on Textual Entailment and Paraphrasing, Prague, Czech Republic. The Role of Sentence Structure

### Using News Articles to Predict Stock Price Movements

Using News Articles to Predict Stock Price Movements Győző Gidófalvi Department of Computer Science and Engineering University of California, San Diego La Jolla, CA 9237 gyozo@cs.ucsd.edu 21, June 15,

### Healthcare Measurement Analysis Using Data mining Techniques

www.ijecs.in International Journal Of Engineering And Computer Science ISSN:2319-7242 Volume 03 Issue 07 July, 2014 Page No. 7058-7064 Healthcare Measurement Analysis Using Data mining Techniques 1 Dr.A.Shaik

### Natural Language Updates to Databases through Dialogue

Natural Language Updates to Databases through Dialogue Michael Minock Department of Computing Science Umeå University, Sweden Abstract. This paper reopens the long dormant topic of natural language updates

### RNN for Sentiment Analysis: Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank

RNN for Sentiment Analysis: Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank Borui(Athena) Ye University of Waterloo borui.ye@uwaterloo.ca July 15, 2015 1 / 26 Overview 1 Introduction

### Clustering Connectionist and Statistical Language Processing

Clustering Connectionist and Statistical Language Processing Frank Keller keller@coli.uni-sb.de Computerlinguistik Universität des Saarlandes Clustering p.1/21 Overview clustering vs. classification supervised

### Overview of MT techniques. Malek Boualem (FT)

Overview of MT techniques Malek Boualem (FT) This section presents an standard overview of general aspects related to machine translation with a description of different techniques: bilingual, transfer,

### Automatic Mining of Internet Translation Reference Knowledge Based on Multiple Search Engines

, 22-24 October, 2014, San Francisco, USA Automatic Mining of Internet Translation Reference Knowledge Based on Multiple Search Engines Baosheng Yin, Wei Wang, Ruixue Lu, Yang Yang Abstract With the increasing

### MODEL DRIVEN DEVELOPMENT OF BUSINESS PROCESS MONITORING AND CONTROL SYSTEMS

MODEL DRIVEN DEVELOPMENT OF BUSINESS PROCESS MONITORING AND CONTROL SYSTEMS Tao Yu Department of Computer Science, University of California at Irvine, USA Email: tyu1@uci.edu Jun-Jang Jeng IBM T.J. Watson

### Abstraction in Computer Science & Software Engineering: A Pedagogical Perspective

Orit Hazzan's Column Abstraction in Computer Science & Software Engineering: A Pedagogical Perspective This column is coauthored with Jeff Kramer, Department of Computing, Imperial College, London ABSTRACT

### How the Computer Translates. Svetlana Sokolova President and CEO of PROMT, PhD.

Svetlana Sokolova President and CEO of PROMT, PhD. How the Computer Translates Machine translation is a special field of computer application where almost everyone believes that he/she is a specialist.

### Mathematics Cognitive Domains Framework: TIMSS 2003 Developmental Project Fourth and Eighth Grades

Appendix A Mathematics Cognitive Domains Framework: TIMSS 2003 Developmental Project Fourth and Eighth Grades To respond correctly to TIMSS test items, students need to be familiar with the mathematics

### Term extraction for user profiling: evaluation by the user

Term extraction for user profiling: evaluation by the user Suzan Verberne 1, Maya Sappelli 1,2, Wessel Kraaij 1,2 1 Institute for Computing and Information Sciences, Radboud University Nijmegen 2 TNO,

### CINDOR Conceptual Interlingua Document Retrieval: TREC-8 Evaluation.

CINDOR Conceptual Interlingua Document Retrieval: TREC-8 Evaluation. Miguel Ruiz, Anne Diekema, Páraic Sheridan MNIS-TextWise Labs Dey Centennial Plaza 401 South Salina Street Syracuse, NY 13202 Abstract:

### 3 Paraphrase Acquisition. 3.1 Overview. 2 Prior Work

Unsupervised Paraphrase Acquisition via Relation Discovery Takaaki Hasegawa Cyberspace Laboratories Nippon Telegraph and Telephone Corporation 1-1 Hikarinooka, Yokosuka, Kanagawa 239-0847, Japan hasegawa.takaaki@lab.ntt.co.jp

### Financial Trading System using Combination of Textual and Numerical Data

Financial Trading System using Combination of Textual and Numerical Data Shital N. Dange Computer Science Department, Walchand Institute of Rajesh V. Argiddi Assistant Prof. Computer Science Department,

### Knowledge Discovery from patents using KMX Text Analytics

Knowledge Discovery from patents using KMX Text Analytics Dr. Anton Heijs anton.heijs@treparel.com Treparel Abstract In this white paper we discuss how the KMX technology of Treparel can help searchers

### Introducing diversity among the models of multi-label classification ensemble

Introducing diversity among the models of multi-label classification ensemble Lena Chekina, Lior Rokach and Bracha Shapira Ben-Gurion University of the Negev Dept. of Information Systems Engineering and

### Semantic Search in Portals using Ontologies

Semantic Search in Portals using Ontologies Wallace Anacleto Pinheiro Ana Maria de C. Moura Military Institute of Engineering - IME/RJ Department of Computer Engineering - Rio de Janeiro - Brazil [awallace,anamoura]@de9.ime.eb.br

### IBM SPSS Direct Marketing 23

IBM SPSS Direct Marketing 23 Note Before using this information and the product it supports, read the information in Notices on page 25. Product Information This edition applies to version 23, release

### Tibetan-Chinese Bilingual Sentences Alignment Method based on Multiple Features

, pp.273-280 http://dx.doi.org/10.14257/ijdta.2015.8.4.27 Tibetan-Chinese Bilingual Sentences Alignment Method based on Multiple Features Lirong Qiu School of Information Engineering, MinzuUniversity of

### Chapter 8. Final Results on Dutch Senseval-2 Test Data

Chapter 8 Final Results on Dutch Senseval-2 Test Data The general idea of testing is to assess how well a given model works and that can only be done properly on data that has not been seen before. Supervised

### Robust Question Answering for Speech Transcripts: UPC Experience in QAst 2009

Robust Question Answering for Speech Transcripts: UPC Experience in QAst 2009 Pere R. Comas and Jordi Turmo TALP Research Center Technical University of Catalonia (UPC) {pcomas,turmo}@lsi.upc.edu Abstract

### DATA MINING TECHNOLOGY. Keywords: data mining, data warehouse, knowledge discovery, OLAP, OLAM.

DATA MINING TECHNOLOGY Georgiana Marin 1 Abstract In terms of data processing, classical statistical models are restrictive; it requires hypotheses, the knowledge and experience of specialists, equations,

### Research on News Video Multi-topic Extraction and Summarization

International Journal of New Technology and Research (IJNTR) ISSN:2454-4116, Volume-2, Issue-3, March 2016 Pages 37-39 Research on News Video Multi-topic Extraction and Summarization Di Li, Hua Huo Abstract

### Service composition in IMS using Java EE SIP servlet containers

Service composition in IMS using Java EE SIP servlet containers Torsten Dinsing, Göran AP Eriksson, Ioannis Fikouras, Kristoffer Gronowski, Roman Levenshteyn, Per Pettersson and Patrik Wiss The IP Multimedia

### ONTOLOGIES A short tutorial with references to YAGO Cosmina CROITORU

ONTOLOGIES p. 1/40 ONTOLOGIES A short tutorial with references to YAGO Cosmina CROITORU Unlocking the Secrets of the Past: Text Mining for Historical Documents Blockseminar, 21.2.-11.3.2011 ONTOLOGIES

### HIGH SCHOOL MASS MEDIA AND MEDIA LITERACY STANDARDS

Guidelines for Syllabus Development of Mass Media Course (1084) DRAFT 1 of 7 HIGH SCHOOL MASS MEDIA AND MEDIA LITERACY STANDARDS Students study the importance of mass media as pervasive in modern life

### IBM SPSS Direct Marketing 22

IBM SPSS Direct Marketing 22 Note Before using this information and the product it supports, read the information in Notices on page 25. Product Information This edition applies to version 22, release

### National Assessment of Educational Progress (NAEP)

Exploring Cognitive Demand in Instruction and Assessment Karin K. Hess Over the past decades, educators and psychologists have attempted to develop models for understanding cognitive complexity as it relates