Final Project Presentation. By Amritaansh Verma




Introduction I am making a Virtual Voice Assistant that understands and reacts to emotions. The emotions I am targeting are Sarcasm, Happiness, Anger/Aggression, and Sadness/Boredom.

Why is it interesting? Most current virtual assistants, like Apple's Siri, Samsung's S Voice, and Vlingo, disregard the prosodic information in the user's speech. Including this capability in virtual assistants will make them more lifelike and would help them gain more widespread acceptance.

Is it really required? Do users really direct anger or sarcasm towards virtual assistants? Sometimes it is an inevitable, spontaneous, natural human reaction.

Project Description Two main components: 1. An emotion detection module (openEAR). 2. A simple voice assistant.

Emotion Detection Module: the openEAR Toolkit, the Munich Open-Source Emotion and Affect Recognition Toolkit. Open source and free. Provides efficient feature extraction algorithms implemented in C++, classifiers, and models pre-trained on well-known emotion corpora. Behaves as an API: it takes user utterances as input and gives their classification into the four basic emotional categories as output.

openEAR is ready to use. Four ready-to-use classifier model sets are provided for recognition of basic emotion categories and interest level. I am planning to collect some speech data myself, covering the four basic emotional categories that arise in a typical interaction with a virtual assistant (Anger, Happiness, Sadness, and Sarcasm), to train and test my module on.

Built-in classifier model sets:
Berlin Speech Emotion Database (EMO-DB), containing seven classes of basic emotions (Anger, Fear, Happiness, Disgust, Boredom, Sadness, Neutral).
eNTERFACE corpus, with six emotion categories (Anger, Disgust, Fear, Happiness, Sadness, and Surprise).
ABC corpus, with the classes Aggressive, Cheerful, Intoxicated, Nervous, Neutral, and Tired.
Audio-Visual Interest Corpus (AVIC), with labels for three levels of interest (-1: disinterest, 0: normal, 1: high interest).

How does it work? The voice assistant records user speech (or pipelines real-time speech data for faster incremental processing) and forwards it to the openEAR API to detect its emotional content. Upon getting back the prosody information, the virtual assistant responds appropriately, based on both the user utterance and the prosody information.
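As a rough sketch of this pipeline in Java (assuming openEAR is driven through its SMILExtract command-line tool; the config file name, output line format, and parsing below are my assumptions for illustration, since openEAR itself is C++ and offers no Java bindings):

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;

// Minimal sketch: the assistant hands a recorded WAV file to openEAR's
// SMILExtract binary and reads back an emotion label. The config path and
// the "RESULT:" output convention are illustrative assumptions.
public class EmotionPipeline {

    // Shell out to SMILExtract with an emotion-recognition config.
    static String detectEmotion(String wavPath) throws Exception {
        ProcessBuilder pb = new ProcessBuilder(
                "SMILExtract",
                "-C", "config/emobase_live.conf",  // assumed model config
                "-I", wavPath);
        pb.redirectErrorStream(true);
        Process p = pb.start();

        String label = "neutral";
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(p.getInputStream()))) {
            String line;
            while ((line = r.readLine()) != null) {
                // Assumed output convention: a line like "RESULT: anger"
                if (line.startsWith("RESULT:")) {
                    label = line.substring("RESULT:".length()).trim();
                }
            }
        }
        p.waitFor();
        return label;
    }

    public static void main(String[] args) throws Exception {
        String emotion = detectEmotion("utterance.wav");
        System.out.println("Detected emotion: " + emotion);
    }
}
```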

How it works? (Diagram: the user's utterance goes to the Voice Assistant; the Voice Assistant forwards the utterance to openEAR; openEAR returns prosody information; the Voice Assistant sends a response back to the user.)

Some Use Cases: Sarcasm Example
Virtual Assistant: Did you mean (says something which it interpreted incorrectly)?
User: (Sarcastically/Angrily) Yes, that's exactly what I meant!
Virtual Assistant (appropriate response): I am sorry that I misunderstood; can you please provide me with that information again?

Some Use Cases: Happiness Example
User: (Happily) Remind me to meet Roy tomorrow at 6.
Virtual Assistant: I have set a reminder for 6 tomorrow. By the way, you seem happy today, what's the good news?

Some Use Cases: Anger Example
Virtual Assistant: Would you like me to call (wrong person's name)?
User: (Angrily) NO!
Virtual Assistant: (Opens up the phonebook for manual selection, rather than repeating the question, as a response to the detected anger)

Some Use Cases: Sadness Example
User: (Sad/Bored) What's the weather like today? (or any other task)
Virtual Assistant: The weather is ... You sound sad today, do you want me to tell you a joke to cheer you up?
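A toy version of the response policy behind these four use cases might look like the following sketch (the intent handling is stubbed out, and the pairing of emotion labels with replies simply mirrors the examples above):

```java
// Toy response policy mirroring the four use cases: the assistant picks a
// reply from both what was said (intent) and how it was said (emotion).
public class ResponsePolicy {

    enum Emotion { SARCASM, HAPPINESS, ANGER, SADNESS, NEUTRAL }

    static String respond(String utterance, Emotion emotion) {
        switch (emotion) {
            case SARCASM:
                // A sarcastic "yes" after a misrecognition: apologize and re-prompt.
                return "I am sorry that I misunderstood; "
                     + "can you please provide that information again?";
            case ANGER:
                // An angry "NO!" to a wrong suggestion: stop re-asking and
                // fall back to manual selection.
                return "[opening phonebook for manual selection]";
            case HAPPINESS:
                return handleIntent(utterance)
                     + " By the way, you seem happy today. What's the good news?";
            case SADNESS:
                return handleIntent(utterance)
                     + " You sound sad today. Do you want me to tell you a joke?";
            default:
                return handleIntent(utterance);
        }
    }

    // Stub: a real assistant would run speech recognition and intent
    // parsing here before producing a task-specific answer.
    static String handleIntent(String utterance) {
        return "[result of handling: \"" + utterance + "\"]";
    }

    public static void main(String[] args) {
        System.out.println(respond("Remind me to meet Roy tomorrow at 6.",
                                   Emotion.HAPPINESS));
    }
}
```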

Evaluation I am planning a hands-on approach: testing the application on a number of utterances (from a corpus of real users) belonging to different categories of emotions, and seeing whether it can properly classify and respond to them. The users would be given a usability questionnaire to restrict the domain of utterances to those most relevant.

Demonstration: openEAR toolkit; Virtual Assistant (Java application)

Lessons Learnt from the Course The papers on recognition and understanding of prosody helped me with the basics. Current systems are lacking in prosody recognition and response; hence the motivation.

Future Work:
Fine-tune/debug the integration of my voice assistant and openEAR.
Figure out the right choice for the virtual assistant: NPCEditor, an AIML chatbot, or VoiceXML; right now I am testing with a simple Java application.
Collect data on typical interactions of humans with virtual assistants and train openEAR on it for better accuracy.
Develop more use cases and implement them.
Port the application to an Android device (tricky?).

Open Questions Some good follow-up projects might be: extending the openEAR implementation itself to detect new emotions; and making this a pluggable module that can be integrated into existing virtual assistants to give them the capability to recognize emotions.