A Framework-based Online Question Answering System. Oliver Scheuer, Dan Shen, Dietrich Klakow



Similar documents
Shallow Parsing with Apache UIMA

TREC 2003 Question Answering Track at CAS-ICT

Architecture of an Ontology-Based Domain- Specific Natural Language Question Answering System

Automatic Text Analysis Using Drupal

Search and Information Retrieval

Ngram Search Engine with Patterns Combining Token, POS, Chunk and NE Information

Ask your Database: Natural Language Processing using In-Memory Technology

Automatic Knowledge Base Construction Systems. Dr. Daisy Zhe Wang CISE Department University of Florida September 3th 2014

Apache Jakarta Tomcat

An NLP Curator (or: How I Learned to Stop Worrying and Love NLP Pipelines)

31 Case Studies: Java Natural Language Tools Available on the Web

Interactive Dynamic Information Extraction

GATE Mímir and cloud services. Multi-paradigm indexing and search tool Pay-as-you-go large-scale annotation

Open Domain Information Extraction. Günter Neumann, DFKI, 2012

Schema documentation for types1.2.xsd

Customizing an English-Korean Machine Translation System for Patent Translation *

Natural Language Processing

Development of Monitoring and Analysis Tools for the Huawei Cloud Storage

Benchmark Study on Distributed XML Filtering Using Hadoop Distribution Environment. Sanjay Kulhari, Jian Wen UC Riverside

Presents. WITSML Solutions For Your Business

Natural Language to Relational Query by Using Parsing Compiler

Semantic annotation of requirements for automatic UML class diagram generation

Search and Data Mining: Techniques. Text Mining Anya Yarygina Boris Novikov

OWB Users, Enter The New ODI World

A stream computing approach towards scalable NLP

Tier Architectures. Kathleen Durant CS 3200

Chapter 8. Final Results on Dutch Senseval-2 Test Data

High-Volume Data Warehousing in Centerprise. Product Datasheet

The University of Amsterdam s Question Answering System at QA@CLEF 2007

Accelerating and Evaluation of Syntactic Parsing in Natural Language Question Answering Systems

Designing a Windows Server 2008 Applications Infrastructure

Building a Question Classifier for a TREC-Style Question Answering System

Rotorcraft Health Management System (RHMS)

Testing Data-Driven Learning Algorithms for PoS Tagging of Icelandic

BIRT Document Transform

Managed Video as a Service RFP Template

Performance Testing Process A Whitepaper

Developing XML Solutions with JavaServer Pages Technology

CHAPTER FIVE RESULT ANALYSIS

MDM Multidomain Edition (Version 9.6.0) For Microsoft SQL Server Performance Tuning

SSC - Web development Model-View-Controller for Java web application development

Semester Thesis Traffic Monitoring in Sensor Networks

NetFlow probe on NetFPGA

XTM Web 2.0 Enterprise Architecture Hardware Implementation Guidelines. A.Zydroń 18 April Page 1 of 12

PoS-tagging Italian texts with CORISTagger

Analysis and Research of Cloud Computing System to Comparison of Several Cloud Computing Platforms

Improved metrics collection and correlation for the CERN cloud storage test framework

Performance Analysis and Visualization of SystemC Models. Adam Donlin and Thomas Lenart Xilinx Research

DTS Web Developers Guide

A Middleware Strategy to Survive Compute Peak Loads in Cloud

Natural Language Processing in the EHR Lifecycle

6231B: Maintaining a Microsoft SQL Server 2008 R2 Database

Software Performance and Scalability

McAfee Product Entitlement Definitions

Hosting as a Service (HaaS) Playbook. Version 0.92

The EMSX Platform. A Modular, Scalable, Efficient, Adaptable Platform to Manage Multi-technology Networks. A White Paper.

How To Retire A Legacy System From Healthcare With A Flatirons Eas Application Retirement Solution

Log Analysis Software Architecture

Memory Channel Storage ( M C S ) Demystified. Jerome McFarland

Master Degree Project Ideas (Fall 2014) Proposed By Faculty Department of Information Systems College of Computer Sciences and Information Technology

Salesforce Certified Data Architecture and Management Designer. Study Guide. Summer 16 TRAINING & CERTIFICATION

Word Completion and Prediction in Hebrew

ILMT Central Team. Performance tuning. IBM License Metric Tool 9.0 Questions & Answers IBM Corporation

VMware vcloud Director for Service Providers

BENCHMARKING CLOUD DATABASES CASE STUDY on HBASE, HADOOP and CASSANDRA USING YCSB

Last Updated: July STATISTICA Enterprise Server Security

Holistic Performance Analysis of J2EE Applications

Replication on Virtual Machines

AlienVault Unified Security Management (USM) 4.x-5.x. Deployment Planning Guide

CS 6740 / INFO Ad-hoc IR. Graduate-level introduction to technologies for the computational treatment of information in humanlanguage

MARTe Framework. Middleware for RT Control Development

JReport Server Deployment Scenarios

3-Tier Architecture. 3-Tier Architecture. Prepared By. Channu Kambalyal. Page 1 of 19

Optimization of Internet Search based on Noun Phrases and Clustering Techniques

High Level Design Distributed Network Traffic Controller

A Modular Approach to Teaching Mobile APPS Development

Origins of Operating Systems OS/360. Martin Grund HPI

Open Flow Controller and Switch Datasheet

CLOUD DEVELOPMENT BEST PRACTICES & SUPPORT APPLICATIONS

Best practices and use cases for consistent, enterprise-wide SIEM security policy management

Crystal Enterprise Report Application Server

Oracle WebLogic Server 11g: Administration Essentials

Distributed Computing and Big Data: Hadoop and MapReduce

Oracle WebLogic Server: Remote Monitoring and Management

Transcription:

A Framework-based Online Question Answering System Oliver Scheuer, Dan Shen, Dietrich Klakow

Outline General Structure for Online QA System Problems in General Structure Framework-based Online QA system Requirements of Online QA system Framework Architecture Requirements Satisfaction Web Demo

General Structure for QA System Different researchers are responsible for different modules One module s output pass to the other module Modules share NLP tools, external resources or machine learning techniques

Problems in General Structure Frequent Change of NLP tools QA Module Integration Consecutive Module Evaluation Web-related Considerations

Problem 1 -- Frequent Changing of NLP tools Many choices of NLP tools POS Tagging Named Entity Recognition Coreference Resolution Text Chunking Parsing Semantic Analysis Reasoning Brill s TBL Tagger / Upenn MXPOST Tagger / fntbl tagger / TiMBL tagger / Edinburgh C&C tools / Lingpipe / Edinburgh C&C tools / Irene s / Lingpipe / Abney s Chunker / Edinburgh C&C tools / BaseNP Chunker / Stanford Parser / Collins Parser / Charniak Parser / CCG Parser / MiniPar /

Problem 1 -- Frequent Changing of NLP tools One development circle Decide which kind of analysis to use NP chunking or full chunking? PCFG-based Syntactic parsing or dependency parsing? Evaluate NLP tool separately Accuracy vs. Running time Robustness Test various combinations Contributions to the overall performance How to easily and quickly change NLP tools?

Problem 2 QA Module Integration Each Current Module will be enhanced New Module will be added into the system Different modules are chosen for different processing Web-based passage retrieval module Corpus-based passage retrieval module Answer validation module How to minimize the effort for module integration?

Problem 3 Consecutive Module Evaluation Easy to evaluate module separately Question Processing Module Experimental data for question classification Passage Retrieval Module Document Ranking Task in TREC QA Answer Extraction Module generate proper sentences set by TREC judgment file Enhance module based on separate evaluation Difficult to test the effectiveness of a module on the whole system How to make a consecutive module evaluation?

Problem 4 Web-related Considerations Response Time Client-Server Communication Thread Synchronization System Logging User Information Backup User address, User request, How to cope with all web-related aspects?

Framework-based Online QA System Framework define an overall structure for the system consider all of the aspects not directly related to language processing module Functions Provide well defined interfaces for modules and tools Manage collaborations of modules Enable consecutive module evaluation Handle all web-related aspects

Requirements of Framework 1 Modularity Minimize dependencies between modules between modules and framework Framework won t be changed any more once it is built To minimize effects of code modification Interaction only over a small interface Flexibility Dynamically load modules into framework Allowing to plug in/out arbitrary modules Allowing to pass data in any format between modules

Requirements of Framework 2 Configurability for system setting Not distributed across the whole source codes Exposed to the users Access a readable and editable configuration file Avoids recompiling Scalability Adjustable with respect to hardware How many user can be served in parallel? Max. number of user requests Max. ability of resource consumption CPU, working memory

Requirements of Framework 3 Initialization Long initialization time of Modules Separate module work into 2 phases time-uncritical (one time) initialization phase time-critical (frequent) working phase hold modules in working memory Synchronization Web-Apps work in multi-thread mode Shared objects have to be synchronized One thread per user request

Framework

Framework

Framework

Framework

Framework

Framework

Framework

Session Manager Java servlet class Accept user question return answer to user Control all activities on the top level Initialization Stage Instantiate Configuration Manager, Persistence Manager, Response Manager, Logger Manager Build a Worker Manager Pool Instantiate n Worker Managers Put them into Work Manager Pool Working Stage Get a work manager from pool to process a question

Configuration Manager Read a XML configuration file Provide Session Manager Maximum number of the users Provide Work Manager Which processing modules to use Information for NLP tools Provide Persistence Manager Database Information

Worker Manager Instantiate the selected modules Question Processing Module Passage Retrieval Module Answer Extraction Module Call the modules to process user questions and extract answer A standardized interface Worker Hold in working memory once system runs

Persistence Manager Gather data Question retrieved sentences, extracted answers Running time for each module Offline Evaluation accuracy and running time single modules / module combinations Question Corpus Further enhance question processing module Logging to Database / XML file

Response Manager Choose an output format the specifics of requesting client Different output formats Specified by the respective Response Generator module HTML pages for normal PC clients WML pages for small screened devices (PDA)

Collaboration of Modules datasheet object Instantiated by Session Manager Exchange information between modules

Requirements Satisfaction

Requirements Satisfaction 1 Modularity

Requirements Satisfaction 2 Flexibility

Requirements Satisfaction 3 Configurability

Requirements Satisfaction 4 Initialization

Requirements Satisfaction 5 Scalability

Requirements Satisfaction 6 Synchronization

Requirements Satisfaction Flexibility Modularity Initialization Scalability Synchronization Configurability

Conclusion Framework for online QA system Web-related aspects Module integration Module collaboration Consecutive module evaluation Release QA researchers from all non-nlp related tasks Platform for researchers working on QA together

Demo