Vocal Emotion Recognition

State-of-the-Art in Classification of Real-Life Emotions
October 26, 2010
Stefan Steidl
International Computer Science Institute (ICSI) at Berkeley, CA

Overview
1 Different Perspectives on Emotion Recognition
2 FAU Aibo Emotion Corpus
3 Own Results on Emotion Classification
4 INTERSPEECH 2009 Emotion Challenge

1 Different Perspectives on Emotion Recognition
Topics: Psychology of Emotion; Computer Science

Facial Expressions of Emotion

Universal Basic Emotions
- Paul Ekman postulates the existence of 6 basic emotions: anger, fear, disgust, surprise, joy, sadness
- other emotions are mixed or blended emotions
- universal facial expressions

Terminology
Different affective states [1]:
- types of affective state: emotion, mood, interpersonal stances, attitudes, personality traits
- design features: intensity, duration, synchronization, event focus, appraisal elicitation, rapidity of change, behavioral impact
- rating scale: low, medium, high, very high (a dash indicates a range)
[Table: rating of each affective state on each design feature; values not preserved]

[1] K. R. Scherer: Vocal communication of emotion: A review of research paradigms, Speech Communication, Vol. 40, 2003

Terminology (cont.)
Definition of emotion (Scherer): episodes of coordinated changes in several components (including at least neurophysiological activation, motor expression, and subjective feeling, but possibly also action tendencies and cognitive processes) in response to external or internal events of major significance to the organism.

Vocal Expression of Emotion
Results from studies in Psychology of Emotion:
- states compared: stress, anger/rage, fear/panic, sadness, joy/elation, boredom
- acoustic parameters: intensity, F0 floor/mean, F0 variability, F0 range, sentence contour, high frequency energy, speech and articulation rate
- note 1 (F0 range): Banse and Scherer found a decrease in F0 range
- note 2 (high frequency energy; speech and articulation rate): inconclusive evidence
[Table: direction of change of each parameter for each state; values not preserved]

Goal: classification of the subject's actual emotional state (some sort of lie detector for emotions)

Human-Computer Interaction (HCI)
Emotion-related user states:
- naturally occurring states of users in human-machine communication
- emotions in a broader sense
- coordinated changes in several components NOT required
- classification of the perceived emotional state, not necessarily the actual emotion of the speaker

Pattern Recognition
Pattern recognition point of view:
- classification task: choose 1 of n given classes
- discrimination of classes rather than classification
- definition of good features
- machine classification
Actually not needed:
- definition of the term "emotion"
- information on how specific features change

Emotional Speech Corpora
Acted data:
- based on Basic Emotions theory
- suited for studying prototypical emotions
- corpora easy to create (inexpensive, no labeling process)
- high audio quality
- balanced classes
- neutral linguistic content (focus on acoustics only)
- high recognition results

Emotional Speech Corpora (cont.)
Popular corpora:
- Emotional Prosody Speech and Transcript corpus (LDC): 15 classes
- Berlin Emotional Speech Database (EmoDB): 7 classes; 89.9 % accuracy (speaker independent LOSO evaluation, speaker adaptation, feature selection) [2]
- Danish Emotional Speech Corpus: 5 classes; 74.5 % accuracy (10-fold SCV, feature selection) [3]

[2] B. Vlasenko et al.: Combining Frame and Turn-Level Information for Robust Recognition of Emotions within Speech, INTERSPEECH 2007
[3] Schuller et al.: Emotion Recognition in the Noise Applying Large Acoustic Feature Sets, Speech Prosody 2006

Emotional Speech Corpora (cont.)
Naturally occurring emotions:
- states that actually appear in HCI (real applications)
- difficult to create (appropriate scenario needed, ethical concerns, need to label data)
- low emotional intensity; in general 80 % neutral
- low audio quality (reverberation, noise, far-distance microphones)
- needed for machine classification (because conditions between training and test must not differ too much)
- research on both acoustic and linguistic features possible
- new research questions: optimal emotion unit
- almost no corpora large enough for machine classification available (do not exist or are not available for research)

2 FAU Aibo Emotion Corpus
Topics: Scenario; Labeling of User States; Data-driven Dimensions of Emotion; Units of Analysis; Sparse Data Problem

The FAU Aibo Emotion Corpus
- 51 children (30 f, 21 m) from the age of 10
- hours of spontaneous speech (mainly short commands)
- 48,401 words in 13,642 audio files

FAU Aibo Emotion Corpus (cont.)
- data base for CEICES and the INTERSPEECH 2009 Emotion Challenge
- available for scientific, non-commercial use

[4] S. Steidl: Automatic Classification of Emotion-Related User States in Spontaneous Children's Speech, Logos Verlag, Berlin (available online)

Emotion-Related User States
Categories (chosen by prior inspection of the data before labeling): joyful, surprised, motherese, neutral, bored, emphatic, helpless, touchy/irritated, reprimanding, angry, other
- motherese: the way mothers/parents address their babies, either because Aibo is well-behaving or because the child wants Aibo to obey; positive equivalent to reprimanding
- emphatic: pronounced, accentuated, sometimes hyper-articulated way of speaking, but without showing any emotion
- reprimanding: the child is reproachful, reprimanding, "wags the finger"

Labeling of User States
Labeling:
- 5 students of linguistics
- holistic labeling on the word level
- majority vote
[Table: number and percentage of words per majority-vote category: angry (A), touchy (T), reprimanding (R), emphatic (E), neutral (N), motherese (M), joyful (J), and further categories; 48,401 words in total; per-category values not preserved]
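The majority vote over the five labelers can be illustrated with a minimal sketch. The slides do not say how ties or weak pluralities were handled, so the threshold of three agreeing labelers (`min_votes`) and the toy labels below are assumptions made for illustration only.

```python
from collections import Counter

def majority_label(labels, min_votes=3):
    """Return the label given by at least `min_votes` labelers, else None.

    min_votes=3 (an absolute majority of 5 labelers) is an assumption,
    not a rule taken from the slides.
    """
    label, votes = Counter(labels).most_common(1)[0]
    return label if votes >= min_votes else None

# Hypothetical word-level decisions of five labelers (A, T, R, E, N, M, J).
word_labels = {
    "stopp": ["E", "E", "A", "E", "N"],
    "fein": ["M", "M", "M", "J", "M"],
    "Aibo": ["N", "A", "E", "T", "N"],
}
majority = {word: majority_label(votes) for word, votes in word_labels.items()}
# "Aibo" receives no label here because no category reaches three votes.
```

Words without a majority label would have to be discarded or assigned to a rest class; which of the two the corpus uses is not stated in this part of the slides.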

Labeling of User States (cont.)
[Table: confusion matrix of the labels against the majority vote for the categories angry (A), touchy (T), reprimanding (R), emphatic (E), neutral (N), motherese (M), joyful (J); values not preserved]

Data-driven Dimensions of Emotions
Non-metric dimensional scaling:
- arranging the emotion categories in the 2-dimensional space
- states that are often confused are close to each other
[Figure: 2-dimensional arrangement of motherese, reprimanding, touchy, neutral, emphatic, angry, joyful; axes: valence (negative to positive) and interaction (-interaction to +interaction)]

Units of Analysis
[Figure: the utterance "stopp Aibo g'radeaus fein machst du das stopp sitz" segmented on the word level, the chunk level, and the turn level (turns Ohm_18_342 and Ohm_18_343)]
Advantages/disadvantages of larger units:
+ more information
- less emotional homogeneity

Sparse Data Problem
Super classes:
- Anger: angry, touchy/irritated, reprimanding
- Emphatic
- Neutral
- Motherese
[Figure: 2-dimensional scaling plots of the original categories (joyful, motherese, neutral, angry, emphatic, touchy, reprimanding) and of the super classes (Motherese, Neutral, Anger, Emphatic); one plot reports S = 0.32 and RSQ = 0.73, the values of the other are not preserved]

Sparse Data Problem (cont.)
Data subsets:

data set          # words   # chunks   # turns
Aibo corpus        48,401     18,216    13,642
Aibo word set       6,070      4,543     3,996
Aibo chunk set     13,217      4,543     3,996
Aibo turn set      17,618      6,413     3,996

3 Own Results on Emotion Classification
Topics: Results for different Units of Analysis; Machine vs. Human; Feature Types and their Relevance

Most Appropriate Unit of Analysis
Classification:
- complete set of features
- classification with Linear Discriminant Analysis (LDA)
- 51-fold speaker-independent cross-validation

unit of analysis   number of features   number of samples   average recall
word level                        265   6,070 words         67.2 %
chunk level                       700   4,543 chunks        68.9 %
turn level                        700   3,996 turns         63.2 %

Chunks: best compromise between
- length of the segment
- homogeneity of the emotional state within the segment

Machine Classifier vs. Human Labeler
Entropy based measure:
- [Figure: example label distributions over the classes A, E, N, M for the human labelers and for the decoder; H_dec = 1.41]
- implicit weighting of classification errors depending on the word that is classified

Machine Classifier vs. Human Labeler (cont.)
Classification: Aibo word set
[Figure: relative frequency [%] over entropy for the average human labeler and the machine classifier]

[5] S. Steidl, M. Levit, A. Batliner, E. Nöth, H. Niemann: Of All Things the Measure is Man: Classification of Emotions and Inter-Labeler Consistency, ICASSP

Evaluation of Different Types of Features
Types of features:
- acoustic features: prosodic features, spectral features, voice quality features
- linguistic features
Evaluation:
- Artificial Neural Networks (ANN)
- 51-fold speaker-independent cross-validation
- combination by early or late fusion
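With 51 children in the corpus, a 51-fold speaker-independent cross-validation presumably holds out one speaker per fold. The following sketch shows such a split under that assumption; the function name and the toy speaker IDs are illustrative and not taken from the slides.

```python
def leave_one_speaker_out(speaker_ids):
    """Yield (held_out_speaker, train_indices, test_indices), one fold per speaker.

    speaker_ids[i] is the speaker of sample i, so no speaker ever appears
    in both the training and the test part of a fold.
    """
    for held_out in sorted(set(speaker_ids)):
        train = [i for i, s in enumerate(speaker_ids) if s != held_out]
        test = [i for i, s in enumerate(speaker_ids) if s == held_out]
        yield held_out, train, test

# Hypothetical example: six chunks spoken by three children.
speakers = ["child_01", "child_02", "child_01", "child_03", "child_02", "child_01"]
for held_out, train, test in leave_one_speaker_out(speakers):
    print(held_out, "train:", train, "test:", test)
```

Splitting by speaker rather than by sample keeps the evaluation from rewarding a classifier that has merely memorized individual voices.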

Acoustic Features: Prosody
Prosody: suprasegmental characteristics such as
- pitch contour
- energy contour
- temporal shortening/lengthening of words
- duration of pauses between words

Acoustic Features: Prosody (cont.)
Classification results: Aibo chunk set
[Figure: average recall [%] for the feature groups pauses (16), duration (37), energy (25), F0 (29), and all prosodic features; the number of features is given in parentheses; values not preserved]

Acoustic Features: Spectral Characteristics (cont.)
Classification results: Aibo chunk set
[Figure: average recall [%] for prosody (107), MFCC (24), formants (16), HNR (2), jitter/shimmer (4), TEO (64), and the best combination; values not preserved]

Acoustic Features: Voice Quality
Classification results: Aibo chunk set
[Figure: average recall [%] for prosody (107), MFCC (24), formants (16), HNR (2), jitter/shimmer (4), TEO (64), and the best combination; values not preserved]

Acoustic Features: Combination
Classification results: Aibo chunk set
[Figure: average recall [%] for prosody (107), MFCC (24), formants (16), jitter/shimmer (4), HNR (2), TEO (64), and the best combination; values not preserved]

Linguistic Features
Types of linguistic features:
- word characteristics: average word length (number of letters, phonemes, syllables), proportion of word fragments, average number of repetitions
- part-of-speech features
- unigram models
- bag-of-words

Linguistic Features (cont.)
Part-of-Speech (POS) features:
- only 6 coarse POS categories
- can be annotated without considering context
- categories cover: nouns, proper names; inflected adjectives; not inflected adjectives; present/past participles; (other) verbs, infinitives; auxiliaries; articles, pronouns, particles, interjections
[Figure: share of each POS category in % of total for Anger, Joyful, Neutral, Emphatic, Motherese, Other; values not preserved]

Linguistic Features (cont.)
Unigram models:

u(w, e) = log10( P(e | w) / P(e) )

Anger                  P(A | w)    Emphatic         P(E | w)
böser (bad)            29.2 %      stopp (stop)     30.5 %
stehenbleiben (stop)   18.9 %      halt (halt)      29.3 %
nein (no)              17.0 %      links (left)     20.5 %
aufstehen (get up)     12.3 %      rechts (right)   18.9 %
Aibo (Aibo)            10.1 %      nein (no)        17.6 %

Neutral                P(N | w)    Motherese        P(M | w)
okay (okay)            98.6 %      fein (fine)      57.5 %
und (and)              98.5 %      ganz (very)      41.9 %
Stück (bit)            98.5 %      braver (good)    36.0 %
in (in)                98.2 %      sehr (very)      23.5 %
noch (still)           96.2 %      brav (good)      21.7 %
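The unigram score u(w, e) compares how likely an emotion is given a word with how likely it is overall, so positive values mark words that are typical for that emotion. A minimal sketch of the computation from labeled words follows; estimating the probabilities by simple relative frequencies and the toy data are assumptions, since the slides do not describe the estimation.

```python
import math
from collections import Counter

def unigram_scores(labeled_words):
    """Compute u(w, e) = log10(P(e | w) / P(e)) for every observed (word, emotion) pair.

    Probabilities are plain relative frequencies (an assumption); pairs that
    never occur are omitted because their score would be log10(0).
    """
    labeled_words = list(labeled_words)
    total = len(labeled_words)
    pair_counts = Counter(labeled_words)
    word_counts = Counter(word for word, _ in labeled_words)
    emotion_counts = Counter(emotion for _, emotion in labeled_words)

    scores = {}
    for (word, emotion), n in pair_counts.items():
        p_e_given_w = n / word_counts[word]
        p_e = emotion_counts[emotion] / total
        scores[(word, emotion)] = math.log10(p_e_given_w / p_e)
    return scores

# Hypothetical toy data: (word, majority-vote emotion) pairs.
data = [("stopp", "E"), ("stopp", "N"), ("okay", "N"), ("okay", "N")]
for pair, score in sorted(unigram_scores(data).items()):
    print(pair, round(score, 3))
```

In this toy data "stopp" is labeled emphatic half of the time while emphatic makes up a quarter of all words, so u("stopp", E) = log10(2), roughly 0.3.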

Linguistic Features (cont.)
Bag-of-words:
- utterance: "Aibo, geh nach links!" (Aibo, move to the left!)
- [Figure: count vector of the utterance over vocabulary entries such as Aibo, allen, geh, nach, links, Aibolein]
- representation of the linguistic content
- word order getting lost
- various dimensionality reduction techniques

Linguistic Features (cont.)
Classification results: Aibo chunk set
[Figure: average recall [%] for word statistics (6), POS (6), unigram models (16), BOW (254 → 50), and the best combination; values not preserved]
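The bag-of-words representation of the example utterance can be sketched as a vector of word counts over a fixed vocabulary. The six-word vocabulary below reuses the entries visible on the slide; the tokenization rule (lower-casing and splitting on non-word characters) is an assumption.

```python
import re

def bag_of_words(utterance, vocabulary):
    """Return one count per vocabulary entry; word order is discarded."""
    tokens = re.findall(r"\w+", utterance.lower())
    return [tokens.count(word.lower()) for word in vocabulary]

# Tiny illustrative vocabulary; the slides mention 254 dimensions before reduction.
vocabulary = ["Aibo", "Aibolein", "allen", "geh", "links", "nach"]

vector = bag_of_words("Aibo, geh nach links!", vocabulary)
print(vector)  # [1, 0, 0, 1, 1, 1]

# A reordered utterance maps to the same vector, which is the loss of word order.
print(bag_of_words("links nach geh Aibo", vocabulary) == vector)  # True
```

Because most entries of such vectors are zero for short commands, a dimensionality reduction step as mentioned on the slide is a natural follow-up.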

Combination of Acoustic and Linguistic Features
Classification results: Aibo chunk set
[Figure: average recall [%] for the best combination of acoustic features (late fusion, ANN), the best combination of linguistic features (late fusion, ANN), and the combination of both (late fusion, ANN, and early fusion, LDA); values not preserved]

Similar Results within CEICES
CEICES: Combining Efforts for Improving Automatic Classification of Emotional User States
- collaboration of various research groups within the European Network of Excellence HUMAINE
- state-of-the-art feature set with 4,000 features
- SVM (linear kernel), 3-fold speaker-independent cross-validation
- selection of 150 features (SFFS): surviving feature types?
- only chunk based features, no information outside
- Aibo chunk set

[6] A. Batliner, S. Steidl, B. Schuller, D. Seppi, T. Vogt, J. Wagner, L. Devillers, L. Vidrascu, V. Aharonson, L. Kessous, N. Amir: Whodunnit: Searching for the Most Important Feature Types Signalling Emotion-Related User States in Speech, Computer Speech and Language, Vol. 25, Issue 1 (January 2011), pp. 4-28

Similar Results within CEICES (cont.)
[Table: for each feature type (duration, energy, F0, spectrum, cepstrum, voice quality, wavelets, all acoustic; BOW, POS, higher semantics, varia, all linguistic; all): total number of features, number selected by SFFS, F-measure, share, and portion; values not preserved]

4 INTERSPEECH 2009 Emotion Challenge

INTERSPEECH 2009 Emotion Challenge
New goals:
- challenge with standardized test conditions
- open microphone: using the complete corpus, which means highly unbalanced classes, including all observed emotional categories, and including chunks with low inter-labeler agreement

INTERSPEECH 2009 Emotion Challenge (cont.)
Speaker independent training and test sets:
- 2-class problem: NEGative vs. IDLe
- 5-class problem: Anger, Emphatic, Neutral, Positive, Rest
[Table: number of chunks per class in the train and test sets for both problems; values not preserved]

INTERSPEECH 2009 Emotion Challenge (cont.)
Sub-Challenges:
1 Feature Sub-Challenge: optimisation of feature extraction/selection; classifier settings fixed
2 Classifier Sub-Challenge: optimisation of classification techniques; feature set given
3 Open Performance Sub-Challenge: optimisation of feature extraction/selection and classification techniques

INTERSPEECH 2009 Emotion Challenge (cont.)
Participants:
[Table: number of participants in the Open Performance, Classifier, and Feature Sub-Challenges, each for the 2-class and the 5-class problem; values not preserved]

[7] B. Schuller, A. Batliner, S. Steidl, D. Seppi: Recognising Realistic Emotions and Affect in Speech: State of the Art and Lessons Learnt from the First Challenge, Speech Communication, Special Issue "Sensing Emotion and Affect - Facing Realism in Speech Processing", to appear

INTERSPEECH 2009 Emotion Challenge (cont.)
2-class problem: NEGative vs. IDLe
[Figure: unweighted and weighted average recall [%] for Majority voting, Dumouchel et al., Vlasenko et al., Kockmann et al., Baseline, Barra-Chicote et al., Vogt et al., Bozkurt et al., Polzehl et al., Luengo et al.; values not preserved]

INTERSPEECH 2009 Emotion Challenge (cont.)
5-class problem: Anger, Emphatic, Neutral, Positive, Rest
[Figure: unweighted and weighted average recall [%] for Lee et al., Vlasenko et al., Luengo et al., Planet et al., Dumouchel et al., Vogt et al., Barra-Chicote et al., Baseline, Majority voting, Kockmann et al., Bozkurt et al.; values not preserved]

State-of-the-Art: Summary
Berlin Emotion Speech Database:
- 7-class problem: hot anger, disgust, fear/panic, happiness, sadness/sorrow, boredom, neutral
- balanced classes
- 90 % accuracy

FAU Aibo Emotion Corpus:
- 4-class problem (Anger, Emphatic, Neutral, Motherese): subset with roughly balanced classes (Aibo chunk set); 69 % unweighted average recall
- 5-class problem (Anger, Emphatic, Neutral, Positive, Rest): highly unbalanced classes, complete corpus; 44 % unweighted average recall
- 2-class problem (NEGative vs. IDLe): highly unbalanced classes, complete corpus; 71 % unweighted average recall
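The results on the unbalanced Aibo data are reported as unweighted average recall, and the challenge plots contrast it with weighted average recall. The sketch below shows the difference on made-up labels: the unweighted average gives every class the same weight, while the weighted average weights each class by its frequency and therefore equals overall accuracy. The function name and the toy data are illustrative.

```python
from collections import Counter

def average_recalls(y_true, y_pred):
    """Return (per_class_recall, unweighted_average_recall, weighted_average_recall)."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    per_class = {c: hits[c] / totals[c] for c in totals}
    uar = sum(per_class.values()) / len(per_class)  # every class counts equally
    war = sum(hits.values()) / len(y_true)          # classes weighted by frequency
    return per_class, uar, war

# Hypothetical unbalanced 2-class data: 8 IDLe chunks, 2 NEGative chunks.
y_true = ["IDL"] * 8 + ["NEG"] * 2
# A trivial classifier that always predicts the most frequent class.
y_pred = ["IDL"] * 10

per_class, uar, war = average_recalls(y_true, y_pred)
print(per_class)  # {'IDL': 1.0, 'NEG': 0.0}
print(uar, war)   # 0.5 0.8
```

The trivial classifier looks acceptable under the weighted measure but drops to chance level under the unweighted one, which is why the unweighted average is the more informative figure for highly unbalanced classes.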

Emotion Detection from Speech

Emotion Detection from Speech Emotion Detection from Speech 1. Introduction Although emotion detection from speech is a relatively new field of research, it has many potential applications. In human-computer or human-human interaction

More information

Automatic Emotion Recognition from Speech

Automatic Emotion Recognition from Speech Automatic Emotion Recognition from Speech A PhD Research Proposal Yazid Attabi and Pierre Dumouchel École de technologie supérieure, Montréal, Canada Centre de recherche informatique de Montréal, Montréal,

More information

Recognition of Emotions in Interactive Voice Response Systems

Recognition of Emotions in Interactive Voice Response Systems Recognition of Emotions in Interactive Voice Response Systems Sherif Yacoub, Steve Simske, Xiaofan Lin, John Burns HP Laboratories Palo Alto HPL-2003-136 July 2 nd, 2003* E-mail: {sherif.yacoub, steven.simske,

More information

MODELING OF USER STATE ESPECIALLY OF EMOTIONS. Elmar Nöth. University of Erlangen Nuremberg, Chair for Pattern Recognition, Erlangen, F.R.G.

MODELING OF USER STATE ESPECIALLY OF EMOTIONS. Elmar Nöth. University of Erlangen Nuremberg, Chair for Pattern Recognition, Erlangen, F.R.G. MODELING OF USER STATE ESPECIALLY OF EMOTIONS Elmar Nöth University of Erlangen Nuremberg, Chair for Pattern Recognition, Erlangen, F.R.G. email: noeth@informatik.uni-erlangen.de Dagstuhl, October 2001

More information

Automatic Evaluation Software for Contact Centre Agents voice Handling Performance

Automatic Evaluation Software for Contact Centre Agents voice Handling Performance International Journal of Scientific and Research Publications, Volume 5, Issue 1, January 2015 1 Automatic Evaluation Software for Contact Centre Agents voice Handling Performance K.K.A. Nipuni N. Perera,

More information

EMOTION IN SPEECH: RECOGNITION AND APPLICATION TO CALL CENTERS

EMOTION IN SPEECH: RECOGNITION AND APPLICATION TO CALL CENTERS EMOTION IN SPEECH: RECOGNITION AND APPLICATION TO CALL CENTERS VALERY A. PETRUSHIN Andersen Consulting 3773 Willow Rd. Northbrook, IL 60062 petr@cstar.ac.com ABSTRACT The paper describes two experimental

More information

Final Project Presentation. By Amritaansh Verma

Final Project Presentation. By Amritaansh Verma Final Project Presentation By Amritaansh Verma Introduction I am making a Virtual Voice Assistant that understands and reacts to emotions The emotions I am targeting are Sarcasm, Happiness, Anger/Aggression

More information

SALIENT FEATURES FOR ANGER RECOGNITION IN GERMAN AND ENGLISH IVR PORTALS

SALIENT FEATURES FOR ANGER RECOGNITION IN GERMAN AND ENGLISH IVR PORTALS Chapter 1 SALIENT FEATURES FOR ANGER RECOGNITION IN GERMAN AND ENGLISH IVR PORTALS Tim Polzehl Quality and Usability Lab, Technischen Universität Berlin / Deutsche Telekom Laboratories, Ernst-Reuter-Platz

More information

An Extension to the Sammon Mapping for the Robust Visualization of Speaker Dependencies

An Extension to the Sammon Mapping for the Robust Visualization of Speaker Dependencies An Extension to the Sammon Mapping for the Robust Visualization of Speaker Dependencies Andreas Maier, Julian Exner, Stefan Steidl, Anton Batliner, Tino Haderlein, and Elmar Nöth Universität Erlangen-Nürnberg,

More information

Voice Analytics for Dementia Assessment

Voice Analytics for Dementia Assessment Alex Sorin, IBM Research - Haifa Voice Analytics for Dementia Assessment DemAAL, September 2013, Chania, Greece Acknowledgements The experimental work presented below is supported by Dem@Care FP7 project

More information

1 Introduction. An Emotion-Aware Voice Portal

1 Introduction. An Emotion-Aware Voice Portal An Emotion-Aware Voice Portal Felix Burkhardt*, Markus van Ballegooy*, Roman Englert**, Richard Huber*** T-Systems International GmbH*, Deutsche Telekom Laboratories**, Sympalog Voice Solutions GmbH***

More information

You Seem Aggressive! Monitoring Anger in a Practical Application

You Seem Aggressive! Monitoring Anger in a Practical Application You Seem Aggressive! Monitoring Anger in a Practical Application Felix Burkhardt Deutsche Telekom Laboratories, Berlin, Germany Felix.Burkhardt@telekom.de Abstract A monitoring system to detect emotional

More information

THE VOICE OF LOVE. Trisha Belanger, Caroline Menezes, Claire Barboa, Mofida Helo, Kimia Shirazifard

THE VOICE OF LOVE. Trisha Belanger, Caroline Menezes, Claire Barboa, Mofida Helo, Kimia Shirazifard THE VOICE OF LOVE Trisha Belanger, Caroline Menezes, Claire Barboa, Mofida Helo, Kimia Shirazifard University of Toledo, United States tbelanger@rockets.utoledo.edu, Caroline.Menezes@utoledo.edu, Claire.Barbao@rockets.utoledo.edu,

More information

Sentiment analysis: towards a tool for analysing real-time students feedback

Sentiment analysis: towards a tool for analysing real-time students feedback Sentiment analysis: towards a tool for analysing real-time students feedback Nabeela Altrabsheh Email: nabeela.altrabsheh@port.ac.uk Mihaela Cocea Email: mihaela.cocea@port.ac.uk Sanaz Fallahkhair Email:

More information

Psychological Motivated Multi-Stage Emotion Classification Exploiting Voice Quality Features

Psychological Motivated Multi-Stage Emotion Classification Exploiting Voice Quality Features 22 Psychological Motivated Multi-Stage Emotion Classification Exploiting Voice Quality Features Marko Lugger and Bin Yang University of Stuttgart Germany Open Access Database www.intechweb.org 1. Introduction

More information

Automatic Detection of Laughter and Fillers in Spontaneous Mobile Phone Conversations

Automatic Detection of Laughter and Fillers in Spontaneous Mobile Phone Conversations Automatic Detection of Laughter and Fillers in Spontaneous Mobile Phone Conversations Hugues Salamin, Anna Polychroniou and Alessandro Vinciarelli University of Glasgow - School of computing Science, G128QQ

More information

Technische Universität München. Speech Processing When it gets Emotional

Technische Universität München. Speech Processing When it gets Emotional Speech Processing When it gets Emotional One Hour of: Intro Emotion & Speech Some Hot Topics And next? Björn Schuller 2 Intro Application Natural Interaction Media Retrieval Monitoring Editing Encoding

More information

Prosodic Characteristics of Emotional Speech: Measurements of Fundamental Frequency Movements

Prosodic Characteristics of Emotional Speech: Measurements of Fundamental Frequency Movements Prosodic Characteristics of Emotional Speech: Measurements of Fundamental Frequency Movements Authors: A. Paeschke, W. F. Sendlmeier Technical University Berlin, Germany ABSTRACT Recent data on prosodic

More information

Emotion and Its Triggers in Human Spoken Dialogue: Recognition and Analysis

Emotion and Its Triggers in Human Spoken Dialogue: Recognition and Analysis Emotion and Its Triggers in Human Spoken Dialogue: Recognition and Analysis Nurul Lubis, Sakriani Sakti, Graham Neubig, Tomoki Toda, Ayu Purwarianti, and Satoshi Nakamura Abstract Human communication is

More information

Word Completion and Prediction in Hebrew

Word Completion and Prediction in Hebrew Experiments with Language Models for בס"ד Word Completion and Prediction in Hebrew 1 Yaakov HaCohen-Kerner, Asaf Applebaum, Jacob Bitterman Department of Computer Science Jerusalem College of Technology

More information

APPLYING MFCC-BASED AUTOMATIC SPEAKER RECOGNITION TO GSM AND FORENSIC DATA

APPLYING MFCC-BASED AUTOMATIC SPEAKER RECOGNITION TO GSM AND FORENSIC DATA APPLYING MFCC-BASED AUTOMATIC SPEAKER RECOGNITION TO GSM AND FORENSIC DATA Tuija Niemi-Laitinen*, Juhani Saastamoinen**, Tomi Kinnunen**, Pasi Fränti** *Crime Laboratory, NBI, Finland **Dept. of Computer

More information

Carla Simões, t-carlas@microsoft.com. Speech Analysis and Transcription Software

Carla Simões, t-carlas@microsoft.com. Speech Analysis and Transcription Software Carla Simões, t-carlas@microsoft.com Speech Analysis and Transcription Software 1 Overview Methods for Speech Acoustic Analysis Why Speech Acoustic Analysis? Annotation Segmentation Alignment Speech Analysis

More information

AMMON: A Speech Analysis Library for Analyzing Affect, Stress, and Mental Health on Mobile Phones

AMMON: A Speech Analysis Library for Analyzing Affect, Stress, and Mental Health on Mobile Phones AMMON: A Speech Analysis Library for Analyzing Affect, Stress, and Mental Health on Mobile Phones Keng-hao Chang, Drew Fisher, John Canny Computer Science Division, University of California at Berkeley

More information

The Effect of Long-Term Use of Drugs on Speaker s Fundamental Frequency

The Effect of Long-Term Use of Drugs on Speaker s Fundamental Frequency The Effect of Long-Term Use of Drugs on Speaker s Fundamental Frequency Andrey Raev 1, Yuri Matveev 1, Tatiana Goloshchapova 2 1 Speech Technology Center, St. Petersburg, RUSSIA {raev, matveev}@speechpro.com

More information

Gender Identification using MFCC for Telephone Applications A Comparative Study

Gender Identification using MFCC for Telephone Applications A Comparative Study Gender Identification using MFCC for Telephone Applications A Comparative Study Jamil Ahmad, Mustansar Fiaz, Soon-il Kwon, Maleerat Sodanil, Bay Vo, and * Sung Wook Baik Abstract Gender recognition is

More information

engin erzin the use of speech processing applications is expected to surge in multimedia-rich scenarios

engin erzin the use of speech processing applications is expected to surge in multimedia-rich scenarios engin erzin Associate Professor Department of Computer Engineering Ph.D. Bilkent University http://home.ku.edu.tr/ eerzin eerzin@ku.edu.tr Engin Erzin s research interests include speech processing, multimodal

More information

SPEAKER IDENTIFICATION FROM YOUTUBE OBTAINED DATA

SPEAKER IDENTIFICATION FROM YOUTUBE OBTAINED DATA SPEAKER IDENTIFICATION FROM YOUTUBE OBTAINED DATA Nitesh Kumar Chaudhary 1 and Shraddha Srivastav 2 1 Department of Electronics & Communication Engineering, LNMIIT, Jaipur, India 2 Bharti School Of Telecommunication,

More information

Big Data and Opinion Mining: Challenges and Opportunities

Big Data and Opinion Mining: Challenges and Opportunities Big Data and Opinion Mining: Challenges and Opportunities Dr. Nikolaos Korfiatis Director Frankfurt Big Data Lab JW Goethe University Frankfurt, Germany /~nkorf Agenda Opinion Mining and Sentiment Analysis

More information

Nonverbal Communication Human Communication Lecture 26

Nonverbal Communication Human Communication Lecture 26 Nonverbal Communication Human Communication Lecture 26 Mar-14-11 Human Communication 1 1 Nonverbal Communication NVC can be communicated through gestures and touch (Haptic communication), by body language

More information

Sentiment analysis of Twitter microblogging posts. Jasmina Smailović Jožef Stefan Institute Department of Knowledge Technologies

Sentiment analysis of Twitter microblogging posts. Jasmina Smailović Jožef Stefan Institute Department of Knowledge Technologies Sentiment analysis of Twitter microblogging posts Jasmina Smailović Jožef Stefan Institute Department of Knowledge Technologies Introduction Popularity of microblogging services Twitter microblogging posts

More information

AUTOMATIC DETECTION OF CONTRASTIVE ELEMENTS IN SPONTANEOUS SPEECH

AUTOMATIC DETECTION OF CONTRASTIVE ELEMENTS IN SPONTANEOUS SPEECH AUTOMATIC DETECTION OF CONTRASTIVE ELEMENTS IN SPONTANEOUS SPEECH Ani Nenkova University of Pennsylvania nenkova@seas.upenn.edu Dan Jurafsky Stanford University jurafsky@stanford.edu ABSTRACT In natural

More information

Emotion in Speech: towards an integration of linguistic, paralinguistic and psychological analysis

Emotion in Speech: towards an integration of linguistic, paralinguistic and psychological analysis Emotion in Speech: towards an integration of linguistic, paralinguistic and psychological analysis S-E.Fotinea 1, S.Bakamidis 1, T.Athanaselis 1, I.Dologlou 1, G.Carayannis 1, R.Cowie 2, E.Douglas-Cowie

More information

INFERRING SOCIAL RELATIONSHIPS IN A PHONE CALL FROM A SINGLE PARTY S SPEECH Sree Harsha Yella 1,2, Xavier Anguera 1 and Jordi Luque 1

INFERRING SOCIAL RELATIONSHIPS IN A PHONE CALL FROM A SINGLE PARTY S SPEECH Sree Harsha Yella 1,2, Xavier Anguera 1 and Jordi Luque 1 INFERRING SOCIAL RELATIONSHIPS IN A PHONE CALL FROM A SINGLE PARTY S SPEECH Sree Harsha Yella 1,2, Xavier Anguera 1 and Jordi Luque 1 1 Telefonica Research, Barcelona, Spain 2 Idiap Research Institute,

More information

Appositions versus Double Subject Sentences what Information the Speech Analysis brings to a Grammar Debate

Appositions versus Double Subject Sentences what Information the Speech Analysis brings to a Grammar Debate Appositions versus Double Subject Sentences what Information the Speech Analysis brings to a Grammar Debate Horia-Nicolai Teodorescu 1,2 and Diana Trandabăţ 1,3 1 Institute for Computes Science, Romanian

More information

Robust Methods for Automatic Transcription and Alignment of Speech Signals

Robust Methods for Automatic Transcription and Alignment of Speech Signals Robust Methods for Automatic Transcription and Alignment of Speech Signals Leif Grönqvist (lgr@msi.vxu.se) Course in Speech Recognition January 2. 2004 Contents Contents 1 1 Introduction 2 2 Background

More information

Author Gender Identification of English Novels

Author Gender Identification of English Novels Author Gender Identification of English Novels Joseph Baena and Catherine Chen December 13, 2013 1 Introduction Machine learning algorithms have long been used in studies of authorship, particularly in

More information

Thirukkural - A Text-to-Speech Synthesis System

Thirukkural - A Text-to-Speech Synthesis System Thirukkural - A Text-to-Speech Synthesis System G. L. Jayavardhana Rama, A. G. Ramakrishnan, M Vijay Venkatesh, R. Murali Shankar Department of Electrical Engg, Indian Institute of Science, Bangalore 560012,

More information

Establishing the Uniqueness of the Human Voice for Security Applications

Establishing the Uniqueness of the Human Voice for Security Applications Proceedings of Student/Faculty Research Day, CSIS, Pace University, May 7th, 2004 Establishing the Uniqueness of the Human Voice for Security Applications Naresh P. Trilok, Sung-Hyuk Cha, and Charles C.

More information

Speech: A Challenge to Digital Signal Processing Technology for Human-to-Computer Interaction

Speech: A Challenge to Digital Signal Processing Technology for Human-to-Computer Interaction : A Challenge to Digital Signal Processing Technology for Human-to-Computer Interaction Urmila Shrawankar Dept. of Information Technology Govt. Polytechnic, Nagpur Institute Sadar, Nagpur 440001 (INDIA)

More information
