Human Goal Classification of Natural Language Text


Human Goal Classification of Natural Language Text

Mark Kröll, Knowledge Management Institute, Graz University of Technology
Reid Swanson and Andrew Gordon, Institute for Creative Technologies, University of Southern California

Excerpt from Barack Obama's Denver speech:
  "I will stop giving the wealthiest Americans tax cuts that they don't need and didn't ask for, and restore fairness to our economy. I'll give a tax cut to working people; provide relief to homeowners; and eliminate the income tax for seniors making under $50,000 so they can retire with the dignity and security they have earned."

Intentional profile of this speech: Charity, Helping the needy
(categories from the Taxonomy of Human Goals developed by Read et al. [Chulef01])

However, human goals are seldom mentioned explicitly in plain text, so a connection is needed between the text and the human goal taxonomy. Actions that contribute to the achievement of a goal are expressed quite often.

Profiles of People's Interests

Knowledge about a person's interests can be used to create an informative profile. From knowing people's goals and interests, one can infer:
  their opinions
  their relationship with other people
  their attitude towards life

Acquiring the data is the easy part:
  weblogs
  transcripts of political speeches

Creating an interest profile out of this textual data is the more challenging part.

The idea is to:
1.) collect a list of representative actions that hint towards goal categories (the knowledge base)
2.) assign goal categories to textual content based on the actions identified in it

[Diagram: Textual Content, Knowledge Base, Taxonomy of Human Goals]

Profile Creation by Action Identification

[Diagram components: Taxonomy of Human Goals; Brainstorming; Phrases; Causal Relations; Phrase Search Queries; Yahoo! BOSS API; Knowledge Base / Index; Political Speeches; Processing of textual content; Data preparation and searching the index; Profile]

Category: Looking Young

Phrases (from brainstorming):
  avoid wrinkles
  age well
  be vibrant with energy
  looking vital

Causal relations (phrase search queries):
  "in order to avoid wrinkles"
  "essential for aging well"
  "necessary for looking vital"

Knowledge base entries shown for Looking Young:
  "you need to moisturize inside and out"
  "but the biggest reason women have such high risk of vitamin D deficit according to Holick, women are encouraged to avoid all sunlight and skin cancer."
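The step from brainstormed phrases to phrase search queries can be sketched as follows. This is a minimal illustration, not the authors' implementation: the names `GOAL_PHRASES`, `CAUSAL_TEMPLATES`, and `build_phrase_queries` are hypothetical, the three templates are only the ones visible on the slide, and the two word forms per phrase are written by hand because the slide does not say how inflection was handled.

```python
from typing import Dict, List

# Hypothetical data: goal phrases per category, written by hand in the two
# forms the templates below need (base verb form and -ing form).
GOAL_PHRASES: Dict[str, List[Dict[str, str]]] = {
    "Looking Young": [
        {"base": "avoid wrinkles", "ing": "avoiding wrinkles"},
        {"base": "age well", "ing": "aging well"},
        {"base": "look vital", "ing": "looking vital"},
    ],
}

# Causal-relation templates taken from the slide's three example queries.
CAUSAL_TEMPLATES: List[str] = [
    "in order to {base}",
    "essential for {ing}",
    "necessary for {ing}",
]


def build_phrase_queries(category: str) -> List[str]:
    """Combine every goal phrase with every causal template.

    Each query is wrapped in double quotes so that a search engine would
    treat it as an exact phrase.
    """
    queries = []
    for phrase in GOAL_PHRASES[category]:
        for template in CAUSAL_TEMPLATES:
            # str.format ignores the form the template does not use.
            queries.append('"' + template.format(**phrase) + '"')
    return queries


if __name__ == "__main__":
    for query in build_phrase_queries("Looking Young"):
        print(query)
```

Each generated query would then be submitted to the search API, and the returned sentences stored under the category in the knowledge base.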

Quality of the Knowledge Base

Some facts:
  contains 168,657 sentences
  min: 12 (Category: Firm Values)
  max: 7,323 (Category: Helping Others)
  yielding a skewed distribution

Annotation task to approximate the precision of the entries. Criteria for an incorrect entry:
  not relevant to the category
  not containing an action that can be performed to achieve the goal

Random sample of 674 entries: 57% correct entries vs. 43% incorrect entries
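To get a feeling for how precise the 57% estimate is, one could attach an approximate confidence interval to it. The sketch below uses the normal approximation for a proportion and assumes the 674 entries are a simple random sample; the function name `proportion_interval` is hypothetical, and the interval is not reported on the slide.

```python
import math
from typing import Tuple


def proportion_interval(p_hat: float, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Normal-approximation interval for a sample proportion.

    p_hat: observed proportion of correct entries
    n:     sample size
    z:     critical value (1.96 corresponds to roughly 95% coverage)
    """
    standard_error = math.sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - z * standard_error, p_hat + z * standard_error


if __name__ == "__main__":
    low, high = proportion_interval(0.57, 674)
    print(f"estimated precision 0.57, approximate 95% interval {low:.3f} to {high:.3f}")
```

Under these assumptions the sample is large enough that the estimate is uncertain by only a few percentage points, so the noise in the knowledge base is substantial rather than a sampling artifact.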

Barack Obama: 51 Speeches (135 Categories)

[Figure: profile matrix with goal categories on one axis (Aspirations, Being better than others, Being creative, Being free, Being responsible, ...) and time on the other (speech dates from Jan 03 08 to Jun 30 08)]

Comparing Average Profiles: John McCain vs. Barack Obama

The average profiles are based on 51 speeches of Obama and 43 speeches of McCain, given between January and June.
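Averaging per-speech profiles could look like the sketch below. It assumes a profile is a mapping from goal category to a score and that a category missing from a speech counts as zero; the slides do not state either detail, and the names `average_profile` and `profile_difference` as well as the toy numbers are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List

Profile = Dict[str, float]


def average_profile(profiles: List[Profile]) -> Profile:
    """Average category scores over all speeches of one speaker.

    A category that does not occur in a speech contributes 0 for that speech.
    """
    if not profiles:
        return {}
    totals: Dict[str, float] = defaultdict(float)
    for profile in profiles:
        for category, score in profile.items():
            totals[category] += score
    return {category: total / len(profiles) for category, total in totals.items()}


def profile_difference(a: Profile, b: Profile) -> Profile:
    """Per-category difference a - b over the union of categories."""
    categories = set(a) | set(b)
    return {c: a.get(c, 0.0) - b.get(c, 0.0) for c in categories}


if __name__ == "__main__":
    # Toy profiles for illustration only, not values from the talk.
    speaker_a = average_profile([{"Charity": 1.0}, {"Charity": 0.0, "Ethical": 1.0}])
    speaker_b = average_profile([{"Ethical": 1.0}])
    print(profile_difference(speaker_a, speaker_b))
```

Comparing two speakers then reduces to looking at the categories with the largest positive or negative differences.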

Evaluation

Sentences out of a speech, with assigned category and score:

Sentence: "I'll give a tax cut to working people; provide relief to homeowners; and eliminate the income tax for seniors making under $50,000 so they can retire with the dignity and security they have earned."
Assigned category: Charity
Score: 0.59

Sentence: "We need to widely reform the way we do business in Washington; to end wasteful spending that does little if anything to meet government's obligations to the American people."
Assigned category: Ethical
Score: 0.62

Sentence: "I am running for President because I believe that we need fundamental change in America."
Assigned category: Bills
Score: 0.92

Improving the Quality

by more sophisticated pre-processing
  using bigrams
  using verb/noun bigrams (needs part-of-speech tagging)
by applying a pre-classification
  sentences are pre-classified to ensure the presence of an action
  using, for instance, verb phrases out of parse trees as features
by using only the advantageous causal relations according to the annotation task

Size of the Knowledge Base

Weak points:
  skewed distribution of sentences
  number of sentences per category too low

Means to increase the number of sentences:
  revising the search phrases
    adding further phrases
    expanding present phrases (WordNet)
  using the Yahoo! BOSS API to retrieve more results per submitted query (now restricted to 500)
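Expanding present phrases with synonyms might work roughly as sketched below. To keep the example self-contained, a small hand-made table stands in for WordNet; the table entries and the name `expand_phrase` are hypothetical, and a real lookup would need filtering for word sense.

```python
from itertools import product
from typing import Dict, List

# Hypothetical stand-in for a WordNet lookup: word -> alternative words.
SYNONYMS: Dict[str, List[str]] = {
    "avoid": ["prevent"],
    "wrinkles": ["lines"],
}


def expand_phrase(phrase: str) -> List[str]:
    """Return the phrase plus every variant obtained by swapping in synonyms.

    The original phrase always comes first in the result.
    """
    # For each word, the options are the word itself followed by its synonyms.
    options = [[word] + SYNONYMS.get(word, []) for word in phrase.split()]
    return [" ".join(choice) for choice in product(*options)]


if __name__ == "__main__":
    for variant in expand_phrase("avoid wrinkles"):
        print(variant)
```

Because the number of variants grows multiplicatively with phrase length, some pruning would probably be needed before submitting the variants as queries.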

Discussion

How could we identify actions that are relevant for a certain category?
  Example for the search phrase "in order to age well":
  "Cork has been used for over 400 years, and many winemakers today still believe that in order to age well, wine needs gradual exposure to oxygen"
  Heuristics vs. automatic approach

How important is the corpus from which we acquire the actions?
  Are other corpora (Yahoo! Answers, Wikipedia) better suited?
  To what extent does the difference in vocabulary (web vs. political speeches) influence the profile generation?

Thank you for your attention!

References

[Chulef01] Chulef, A. S.; Read, S. J. & Walsh, D. A. (2001), 'A Hierarchical Taxonomy of Human Goals', Motivation and Emotion 25(3), 191-232.
[Quirk85] Quirk, R.; Greenbaum, S.; Leech, G. & Svartvik, J. (1985), A Comprehensive Grammar of the English Language, Longman, London.

Verb/noun bigram example

The sentence:
  "In order to look young, people are willing to undergo surgeries and enhancement procedures that cost a lot of time and money."
would produce the following bigrams:
  undergo surgeries
  undergo enhancement
  undergo procedures
  cost time
  cost money
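One possible heuristic that reproduces these bigrams is sketched below. The slides do not describe the extraction rule, so everything here is an assumption: the sentence is tagged by hand with Penn Treebank style tags instead of running a tagger, each noun is paired with the most recent verb in the same clause, and quantity nouns such as "lot" are skipped.

```python
from typing import List, Optional, Tuple

# Hand-tagged example sentence (Penn Treebank style tags, assigned manually).
TAGGED_SENTENCE: List[Tuple[str, str]] = [
    ("In", "IN"), ("order", "NN"), ("to", "TO"), ("look", "VB"),
    ("young", "JJ"), (",", ","), ("people", "NNS"), ("are", "VBP"),
    ("willing", "JJ"), ("to", "TO"), ("undergo", "VB"),
    ("surgeries", "NNS"), ("and", "CC"), ("enhancement", "NN"),
    ("procedures", "NNS"), ("that", "WDT"), ("cost", "VBP"), ("a", "DT"),
    ("lot", "NN"), ("of", "IN"), ("time", "NN"), ("and", "CC"),
    ("money", "NN"), (".", "."),
]

# Assumption: nouns that only express quantity are not useful as objects.
QUANTITY_NOUNS = {"lot"}
# Assumption: punctuation tags end the scope of the current verb.
CLAUSE_BREAKS = {",", ".", ";", ":"}


def verb_noun_bigrams(tagged: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Pair each noun with the most recent verb seen in the same clause."""
    bigrams: List[Tuple[str, str]] = []
    current_verb: Optional[str] = None
    for word, tag in tagged:
        if tag.startswith("VB"):
            current_verb = word.lower()
        elif tag in CLAUSE_BREAKS:
            current_verb = None
        elif (tag.startswith("NN") and current_verb is not None
              and word.lower() not in QUANTITY_NOUNS):
            bigrams.append((current_verb, word.lower()))
    return bigrams


if __name__ == "__main__":
    for verb, noun in verb_noun_bigrams(TAGGED_SENTENCE):
        print(verb, noun)
```

On real text the tags would come from a part-of-speech tagger, and the clause and quantity-noun rules would likely need tuning.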

Finding Actions: Examples

Search phrase: "in order to avoid wrinkles"

Sentences extracted from web content:
  "You need to moisturize inside and out, in order to avoid wrinkles."
  "But the biggest reason women have such high risk of vitamin D deficit, according to Holick, is that women are encouraged to avoid all sunlight in order to avoid wrinkles and skin cancer."
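A very simple heuristic for pulling an action candidate out of such a sentence is to take the text in front of the search phrase. The sketch below illustrates only that idea; the function name `action_candidate` is hypothetical and the slides do not say that the knowledge base was built this way.

```python
import re
from typing import Optional


def action_candidate(sentence: str, search_phrase: str) -> Optional[str]:
    """Return the text preceding the search phrase as a candidate action.

    Returns None if the phrase does not occur or nothing precedes it.
    Matching is case-insensitive.
    """
    match = re.search(re.escape(search_phrase), sentence, flags=re.IGNORECASE)
    if match is None:
        return None
    # Drop whitespace and separating punctuation around the candidate.
    candidate = sentence[:match.start()].strip(" ,;:")
    return candidate or None


if __name__ == "__main__":
    sentence = "You need to moisturize inside and out, in order to avoid wrinkles."
    print(action_candidate(sentence, "in order to avoid wrinkles"))
```

The heuristic fails when the action follows the phrase, as in the cork example on the Discussion slide ("in order to age well, wine needs gradual exposure to oxygen"), which is one reason the choice between heuristics and an automatic approach is raised as an open question.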