2014/02/13 Sphinx Lunch



Similar documents
Leveraging Frame Semantics and Distributional Semantic Polieties

UNSUPERVISED INDUCTION AND FILLING OF SEMANTIC SLOTS FOR SPOKEN DIALOGUE SYSTEMS USING FRAME-SEMANTIC PARSING

Combining Slot-based Vector Space Model for Voice Book Search

THE THIRD DIALOG STATE TRACKING CHALLENGE

Phase 2 of the D4 Project. Helmut Schmid and Sabine Schulte im Walde

Clustering Connectionist and Statistical Language Processing

Specialty Answering Service. All rights reserved.

Comparative Error Analysis of Dialog State Tracking

D2.4: Two trained semantic decoders for the Appointment Scheduling task

Christian Leibold CMU Communicator CMU Communicator. Overview. Vorlesung Spracherkennung und Dialogsysteme. LMU Institut für Informatik

ABSTRACT 2. SYSTEM OVERVIEW 1. INTRODUCTION. 2.1 Speech Recognition

Open-Source, Cross-Platform Java Tools Working Together on a Dialogue System

Using the Amazon Mechanical Turk for Transcription of Spoken Language

Unsupervised Data Mining (Clustering)

LABEL PROPAGATION ON GRAPHS. SEMI-SUPERVISED LEARNING. ----Changsheng Liu

DATA ANALYSIS II. Matrix Algorithms

Recognition. Sanja Fidler CSC420: Intro to Image Understanding 1 / 28

ADVANCED MACHINE LEARNING. Introduction

Natural Language to Relational Query by Using Parsing Compiler

Summarizing Online Forum Discussions Can Dialog Acts of Individual Messages Help?

VoiceXML-Based Dialogue Systems

Search Engines. Stephen Shaw 18th of February, Netsoc

Geometric-Guided Label Propagation for Moving Object Detection

Web page creation using VoiceXML as Slot filling task. Ravi M H

Distance Metric Learning in Data Mining (Part I) Fei Wang and Jimeng Sun IBM TJ Watson Research Center

A Survey on Product Aspect Ranking

ONLINE RESUME PARSING SYSTEM USING TEXT ANALYTICS

Analytics on Big Data

How To Use A Webmail On A Pc Or Macodeo.Com

Term extraction for user profiling: evaluation by the user

dm106 TEXT MINING FOR CUSTOMER RELATIONSHIP MANAGEMENT: AN APPROACH BASED ON LATENT SEMANTIC ANALYSIS AND FUZZY CLUSTERING

Spoken Dialog Challenge 2010: Comparison of Live and Control Test Results

Emotion Detection from Speech

Machine Learning using MapReduce

Experiments in Web Page Classification for Semantic Web

German Speech Recognition: A Solution for the Analysis and Processing of Lecture Recordings

Classifying Manipulation Primitives from Visual Data

Fast Matching of Binary Features

A Semi-Supervised Clustering Approach for Semantic Slot Labelling

AVoiceportal Enhanced by Semantic Processing and Affect Awareness

NATURAL LANGUAGE QUERY PROCESSING USING PROBABILISTIC CONTEXT FREE GRAMMAR

Music Mood Classification

Language Technology II Language-Based Interaction Dialogue design, usability,evaluation. Word-error Rate. Basic Architecture of a Dialog System (3)

Natural Language Dialogue in a Virtual Assistant Interface

Domain Adaptive Relation Extraction for Big Text Data Analytics. Feiyu Xu

A GrAF-compliant Indonesian Speech Recognition Web Service on the Language Grid for Transcription Crowdsourcing

Robustness of a Spoken Dialogue Interface for a Personal Assistant

A System for Labeling Self-Repairs in Speech 1

1. Introduction to Spoken Dialogue Systems

Search and Data Mining: Techniques. Text Mining Anya Yarygina Boris Novikov

Micro blogs Oriented Word Segmentation System

FUZZY CLUSTERING ANALYSIS OF DATA MINING: APPLICATION TO AN ACCIDENT MINING SYSTEM

Interactive person re-identification in TV series

Speech Processing Applications in Quaero

. Learn the number of classes and the structure of each class using similarity between unlabeled training patterns

31 Case Studies: Java Natural Language Tools Available on the Web

Using Speech to Reply to SMS Messages While Driving: An In-Car Simulator User Study

Clustering Big Data. Anil K. Jain. (with Radha Chitta and Rong Jin) Department of Computer Science Michigan State University November 29, 2012

CS 6740 / INFO Ad-hoc IR. Graduate-level introduction to technologies for the computational treatment of information in humanlanguage

Analysis of Internet Topologies: A Historical View

Metadata for Corpora PATCOR and Domotica-2

ALIAS: A Tool for Disambiguating Authors in Microsoft Academic Search

Which universities lead and lag? Toward university rankings based on scholarly output

Big Data and Scripting map/reduce in Hadoop

Building A Vocabulary Self-Learning Speech Recognition System

Neovision2 Performance Evaluation Protocol

6.2.8 Neural networks for data mining

: Introduction to Machine Learning Dr. Rita Osadchy

Survey: Retrieval of Video Using Content (Speech &Text) Information

Effective Self-Training for Parsing

Music Genre Classification

Search Result Optimization using Annotators

English Grammar Checker

1342 IEEE TRANSACTIONS ON MULTIMEDIA, VOL. 10, NO. 7, NOVEMBER 2008

IMPROVING DATA INTEGRATION FOR DATA WAREHOUSE: A DATA MINING APPROACH

Presentation Video Retrieval using Automatically Recovered Slide and Spoken Text

SPEAKER IDENTIFICATION FROM YOUTUBE OBTAINED DATA

Module Catalogue for the Bachelor Program in Computational Linguistics at the University of Heidelberg

Conclusions and Future Directions

How To Cluster

CENG 734 Advanced Topics in Bioinformatics

Data Mining Project Report. Document Clustering. Meryem Uzun-Per

Robust Methods for Automatic Transcription and Alignment of Speech Signals

Soft Clustering with Projections: PCA, ICA, and Laplacian

A Method for Automatic De-identification of Medical Records

MA2823: Foundations of Machine Learning

Comparison of K-means and Backpropagation Data Mining Algorithms

Bing Liu. Web Data Mining. Exploring Hyperlinks, Contents, and Usage Data. With 177 Figures. ~ Spring~r

p. 3 p. 4 p. 93 p. 111 p. 119

The Enron Corpus: A New Dataset for Classification Research

Social Media Mining. Data Mining Essentials

Document Image Retrieval using Signatures as Queries

SVM Based Learning System For Information Extraction

Statistics for BIG data

Automatic Speech Recognition and Hybrid Machine Translation for High-Quality Closed-Captioning and Subtitling for Video Broadcast

USING SPECTRAL RADIUS RATIO FOR NODE DEGREE TO ANALYZE THE EVOLUTION OF SCALE- FREE NETWORKS AND SMALL-WORLD NETWORKS

SPEAKER IDENTITY INDEXING IN AUDIO-VISUAL DOCUMENTS

Special Topics in Computer Science

INF5820, Obligatory Assignment 3: Development of a Spoken Dialogue System

Transcription:

2014/02/13 Sphinx Lunch Best Student Paper Award @ 2013 IEEE Workshop on Automatic Speech Recognition and Understanding Dec. 9-12, 2013 Unsupervised Induction and Filling of Semantic Slot for Spoken Dialogue Systems Using Frame-Semantic Parsing Yun-Nung (Vivian) Chen, William Yang Wang, and Alexander I. Rudnicky Language Technologies Institute School of Computer Science Carnegie Mellon University

Overview Introduction Approach Experiments Conclusion 2

Overview Introduction Approach Experiments Conclusion 3

In 2011 I have long been looking for good knowledgeable student to take on the task of getting conversational agents to move beyond prerecorded interaction (or clunky spoken dialog systems) Wow! Sounds like they know how to achieve the ultimate machine intelligence at CMU 4

Spoken Language Understanding (SLU) SLU in dialogue systems SLU maps NL inputs to semantic forms I would like to go to Shadyside Tuesday. location: Shadyside date: Tuesday Semantic frames, slots, & values often manually defined by domain experts or developers. What are the problems? 5

Problems with Predefined Slots Generalization: may not generalize to real-world users. Bias propagation: can bias subsequent data collection and annotation. Maintenance: when new data comes in, developers need to start a new round of annotation to analyze the data and update the grammar. Efficiency: time consuming, and high costs. Can we automatically induce semantic slots only using raw audios? 6

Overview Introduction Approach Experiments Conclusion 7

Probabilistic Frame-Semantic Parsing (Das et al., 2010; 2013) on ASR-transcribed utterances. First Step 8

FrameNet FrameNet (Baker et al., 1998) a linguistically-principled semantic resource, based on the frame-semantics theory. Example Frame (food): contains words referring to items of food. Frame Element: a descriptor indicates the characteristic of food. low fat milk milk evokes the food frame; low fat fills the descriptor FE. 9

Frame-Semantic Parsing SEMAFOR (Das et al., 2010; 2013) a state-of-the-art frame-semantics parser, trained on manually annotated FrameNet sentences. 10

The Panacea? Unfortunately... Bad! Good! can i have a cheap restaurant Good! Frame: capability FT LU: can FE LU: i Frame: expensiveness FT LU: cheap Frame: locale by use FT/FE LU: restaurant Task: adapting generic frames to task-specific settings in dialogue systems. 11

As A Ranking Problem Main idea Ranking domain-specific concepts higher than generic semantic concepts can i have a cheap restaurant Frame: capability FT LU: can FE LU: i Frame: expensiveness FT LU: cheap Frame: locale by use FT/FE LU: restaurant slot candidate slot filler 12

Slot Ranking Model (1/3) Rank the slot candidates by integrating two scores the frequency of the slot candidate in the SEMAFOR-parsed corpus slots with higher frequency may be more important the coherence of slot fillers domain-specific concepts should focus on fewer topics and be similar to each other slot: quantity a three one all slot: expensiveness cheap inexpensive expensive lower coherence in topic space higher coherence in topic space 13

Slot Ranking Model (2/3) Measure coherence by word-level context clustering For each slot, slot candidate: expensiveness We have corresponding cluster vector corresponding value vectors: cheap, not expensive (from the utterances with s i in the parsing results) the frequency of words in v j clustered into cluster k Measure coherence measure by pair-wised cosine similarity The slot with higher h(s i ) usually focuses on fewer topics, more specific, and preferable for slots of SDS. 14

Slot Ranking Model (3/3) Spectral clustering For each word r i = 1 when w occurs in the i-th utterance r i = 0 otherwise Assume that the words in the same utterance are related to each other The approach can be summarized in five steps: 1. Calculate the distance matrix 2. Derive the affinity matrix 3. Generate the graph Laplacian 4. Eigen decomposition of L 5. Perform K-means clustering of eigenvectors Reasons why spectral clustering: 1) can be solved efficiently by standard linear algebra 2) invariant to the shapes and densities of each cluster 3) projects the manifolds within data into solvable space and often outperform others 15

Overview Introduction Approach Experiments Conclusion 16

Experiments Dataset Cambridge University SLU corpus (Henderson, 2012) restaurant recommendation in an in-car setting in Cambridge WER = 37% vocabulary size = 1868 2,166 dialogues 15,453 utterances dialogue slot (total # = 10): addr, area, food, name, phone, postcode, price range, signature, task, type 17

Slot Induction Evaluation MAP of the slot ranking model measure the quality of induced slots based on induced and reference slots via the mapping table Approach ASR MAP Manual Frequency 67.31 59.41 K-Means 68.45 59.76 Spectral Clustering 69.36 61.86 The mapping table between induced and reference slots Induced Slot Speak on topic Part orientational Direction Locale Part inner outer Food origin (NULL) Contacting Sending Commerce scenario Expensiveness Range (NULL) Reference Slot Addr Area Food Name Phone Postcode Price range Signature The majority of the reference slots used in a real world SDS can be induced automatically in an unsupervised way. seeking Desiring Locating Locale by use building Task Type 18

Slot Filling Evaluation MAP of the slot ranking model For each slot, we compute F1by comparing the lists of extracted and reference slot fillers The top-5 F1-measure slot-filling corresponding to matched slot mapping for ASR SEMAFOR Slot Locale by use Speak on topic Expensiveness Origin Direction Reference Slot Type Addr Price range Food Area F1-Hard 89.75 88.86 62.05 36.00 29.81 F1-Soft 89.96 88.86 62.35 43.48 29.81 F1-Hard: the values of two slot fillers are exactly the same F1-Soft: the values of two slot fillers both contain at least one overlapping words 19

Slot Induction and Filling Evaluation MAP-F1-Hard / MAP-F1-Soft weight the MAP score with F1-Hard / F1-Soft scores Approach MAP-F1-Hard MAP-F1-Soft ASR Manual ASR Manual Frequency 26.96 27.84 27.29 28.68 K-Means 27.38 27.99 27.67 28.83 Spectral Clustering 30.52 28.40 30.85 29.22 When the induced slot mismatches the reference slot, all the slot fillers will be judged as incorrect fillers. 20

Overview Introduction Approach Experiments Conclusion 21

Conclusion We propose an unsupervised approach for automatic induction and filling of semantic slots. Our work makes use of a state-of-the-art semantic parser, and adapts the linguistically principled generic FrameNetstyle outputs to the target semantic space corresponding to a domain-specific SDS setting. Our experiments show that automatically induced semantic slots align well with the reference slots created by domain experts. 22

Thanks for your attention! Q & A 23