Technische Universität München. Speech Processing When it gets Emotional

1 Speech Processing When it gets Emotional

2 One Hour of: Intro Emotion & Speech Some Hot Topics And next? Björn Schuller 2

3 Intro

4 Application: Natural Interaction, Media Retrieval, Monitoring, Editing, Encoding, Entertainment / Coaching. [Figure: Encoded Audio (kbit/s): Acoustic, Parametric, Acoustic/Symbolic, Symbolic, Semantic]

5 Application: ASC-Inclusion. ASC-Inclusion: Interactive Emotion Games for Social Inclusion of Children with Autism Spectrum Conditions, IDGEI

6 Speech Recognition vs. Speaker Classification [timeline]
Speech Recognition: 1950 single speaker, digits; words; several 1000 words; 1990 trained dictation; 2000 robust, million words; 2010 real-life recognition.
Speaker Classification: speaker identification; 1990 single speaker, emotion; 2013 few isolated states/traits; 2018 full real-life classification.

7 Paralinguistics. Computational Paralinguistics, Wiley

8 Checking Taxonomies. Rating scale between two poles: either (sure / rather), so-so or n.a., or (rather / sure). Pole pairs: trait vs. state; acted vs. spontaneous; complex vs. simple; measured vs. assessed; categorical vs. continuous; felt vs. perceived; intentional vs. instinctual; consistent vs. discrepant; private vs. social; prototypical vs. peripheral; universal vs. culture-specific; uni-modal vs. multi-modal

9 Relationships: biological trait primitives, cultural trait primitives, personality, emotion/affect, social signals, deviant speech, discrepant communication. Computational Paralinguistics, Wiley

10 Speech & Emotion

11 Recognition Processing Flow: Front End, Back End. Intelligent Audio Analysis, Springer

12 Modelling: "generic" mapping; "empirical", data-driven mapping. Figures from: Scherer 2000: Emotion; from: Batliner

13 Collecting Data. Example: Speech in Minimally Invasive Surgery
Collection: 29 operations, 37.4 h. Segmentation: 16% speech. Annotation: emotion, text, noise by type; 4 annotators.
Emotion [m:s] / #Turns / [%]: Neutral 235: ; Joy 34: ; Anger 22: ; Impatience 29: ; Confusion 27: ; Total 350:01, 9,
Emotional factors in speech based human-machine interaction in the operating room, International Journal of Computer Assisted Radiology and Surgery, 5(1)

14 Data: INTERSPEECH & AVEC Challenges. The Computational Paralinguistics Challenge, IEEE Signal Processing Magazine, 29(4): 2-6
Columns: Corpus, Speaker state & trait Sub-Challenge, TT [h], # Chunks, # Subjects, # Annotators, Type, Lang., Audio, kHz
SPC: Personality; Type SP; Lang. FR; Audio FM; 8 kHz
SLD: Likability; Type PR; Lang. DE; Audio TEL; 8 kHz
NCSC: Pathology; Type PR; Lang. NL; Audio LAB; 16 kHz
ALC: Intoxication; Type PR; Lang. DE; Audio LAB; 16 kHz
SLC: Sleepiness; Type PR; Lang. DE; Audio LAB; 16 kHz
aGender: Age, Gender; Type PR; Lang. DE; Audio TEL; 8 kHz
TUM AVIC: Affect; Type SP; Lang. UK; Audio LAB; 44 kHz
FAU AEC: Emotion; Type SP; Lang. DE; Audio LAB; 16 kHz
AVEC: Emotion; Type SP; Lang. UK; Audio LAB; 48 kHz

15 Data: Interspeech ComParE
SSPNet Vocalisation Corpus (SVC): sec clips, 60 phone calls, 120 subjects (63 female). Subjects mutually unknown, jointly solve the Winter Survival Task. Cell phone audio; frame-wise classification: laughter / filler / garbage.
SSPNet Conflict Corpus (SC²): second clips from Canal9 Corpus; 45 Swiss political debates (in French), 138 subjects (23 female); ~550 assessors via Amazon Mechanical Turk; conflict in [-10, 10].
Child Pathologic Speech Database (CPSD): 2.5k instances; imitation of 26 prompted French phrases; different modalities / varying intonation; 99 children (6-18 years), out of which 35 with Pervasive Developmental Disorder: Autism: 10m/2f, Dysphasia: 10m/3f, Non-Otherwise Specified: 9m/1f.

16 Interspeech ComParE. Geneva Multimodal Emotion Portrayals (GEMEP): 1.2 k instances of emotional speech; 10 professional actors (5 f); enacted; nonsensical phrases; sustained vowels. The INTERSPEECH 2013 Computational Paralinguistics Challenge: Social Signals, Conflict, Emotion, Autism, Interspeech

17 Chunking: linguistically blind segmentations: frames, fractions of time units, pauses

18 Chunking. Example: Gain by Supra-segmental Features. TUM AVIC corpus; Turn vs. Frame (MI-)SVM. Recognising Interest in Conversational Speech - Comparing Bag of Frames and Supra-segmental Features, Interspeech

19 Preprocessing. Example: openBliSSART. Input signal: V (signal spectrogram). Factorisation V ≈ W H, with W (component spectra) and H (activations). Initialised (supervised NMF) / online estimation (unsupervised NMF); optimisation (EM). Output: source spectrograms, source signals. Optimization and Parallelization of Monaural Source Separation Algorithms in the openBliSSART Toolkit, Journal of Signal Processing Systems, Springer
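
The factorisation V ≈ W H on this slide can be illustrated with a small sketch. It is not the openBliSSART implementation and it does not use the EM optimisation named on the slide; it swaps in the widely used multiplicative update rules for the Euclidean cost, written in plain Python so it runs without dependencies. The function names `nmf`, `matmul`, `transpose` and `frobenius_error`, the toy matrix and all parameter values are assumptions made for illustration.

```python
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def frobenius_error(v, w, h):
    """Squared Euclidean distance between V and the product W H."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))

def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factorise a non-negative matrix V (rows x cols) into W (rows x rank)
    and H (rank x cols) with multiplicative updates (illustrative choice)."""
    rng = random.Random(seed)
    rows, cols = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(rows)]
    h = [[rng.random() + eps for _ in range(cols)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(cols)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(rows)]
    return w, h

# Toy "spectrogram": 4 frequency bins x 3 frames, built from two components.
v = [[1.0, 0.0, 2.0], [2.0, 0.0, 4.0], [0.0, 3.0, 1.0], [1.0, 3.0, 3.0]]
w, h = nmf(v, rank=2)
print("reconstruction error:", frobenius_error(v, w, h))
```

In a supervised setting one would keep pre-trained columns of W fixed and update only H; the sketch updates both, which corresponds to the unsupervised case.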

20 Continuous Descriptors: Pitch Detection. PDA in Time Domain: determination of 1st partial; simplification of structure; analysis of time signal. PDA by Short Time Principle: correlation; maximum likelihood; analysis in frequency domain
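
As one instance of the correlation branch listed above, the following minimal sketch estimates F0 for a single frame by picking the strongest autocorrelation peak within a plausible lag range. The function name `estimate_f0_autocorr`, the search range of 50 to 500 Hz and the synthetic test tone are assumptions for illustration; a practical pitch detector would add voicing decisions, normalisation and smoothing.

```python
import math

def estimate_f0_autocorr(frame, sample_rate, f0_min=50.0, f0_max=500.0):
    """Return an F0 estimate in Hz for one frame, or 0.0 if no peak is found."""
    lag_min = int(sample_rate / f0_max)
    lag_max = min(int(sample_rate / f0_min), len(frame) - 1)
    best_lag, best_value = 0, 0.0
    for lag in range(lag_min, lag_max + 1):
        # Unnormalised autocorrelation at this lag.
        value = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    return sample_rate / best_lag if best_lag else 0.0

# Synthetic 200 Hz tone, 50 ms at 8 kHz.
sample_rate = 8000
frame = [math.sin(2 * math.pi * 200.0 * n / sample_rate) for n in range(400)]
f0 = estimate_f0_autocorr(frame, sample_rate)
print(f0)
```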

21 Continuous Descriptors: Pitch (FAU Aibo Corpus). 67.9% voiced frames; ~6% erroneous pitch (>10% deviation); ~2.0% loss in recognition accuracy (duration features less affected). The Impact of F0 Extraction Errors on the Classification of Prominence and Emotion, ICPhS

22 Feature Brute-Forcing: openSMILE (Speech & Music Interpretation by Large Space Extraction). Low-level descriptors; (hierarchical) functionals; standard feature sets; multithreading; memory efficient; fully configurable. #features / RTF: 10k k.03. openSMILE - The Munich Versatile and Fast Open-Source Audio Feature Extractor, ACM Multimedia (3rd place, ACM MM Open Source Software Competition)

23 Features: Low-Level Descriptors and Functionals (Acoustics / Linguistics)
Low-Level Descriptors, acoustics: Intonation ((multiple) pitch, ...); Intensity (energy, Teager, ...); Linear Prediction (LPCC, PLP, ...); Cepstral Coefficients (MFCC, HFCC, ...); Formants (amplitude, position, ...); Spectrum (PCP, CHROMA, ...); TF-Transformation (wavelets, Gabor, ...); Harmonicity (HNR, NHR, ...); Perturbation (jitter, shimmer, ...).
Low-Level Descriptors, linguistics: Linguistics (phonemes, words, ...); Non-Linguistics (laughter, sighs, ...); Disfluencies (pauses, ...).
Processing of descriptors, acoustics: Deriving (raw LLD, deltas, regression coefficients, correlation coefficients, ...); Filtering (smoothing, normalising, ...); Chunking (absolute, relative, syntactic, semantic, ...).
Processing of descriptors, linguistics: Deriving (raw, stemmed, POS-, semantic-, tagging, ...); Tokenizing (n-grams, ...).
Functionals, acoustics: Extremes (min, max, range, ...); Mean (arithmetic, absolute, ...); Percentiles (quartiles, ranges, ...); Higher Moments (std. dev., kurtosis, ...); Peaks (number, distances, ...); Segments (number, duration, ...); Regression (coefficients, error, ...); Spectral (DCT coefficients, ...); Temporal (durations, positions, ...).
Functionals, linguistics: Vector Space Modelling (bag-of-words, ...); Look-Up (word lists, concepts, ...); Statistical (salience, info gain, ...).
Processing of functionals: Deriving (raw functionals, hierarchical, cross-LLD, cross-chunking, contextual, ...); Filtering (smoothing, normalising, ...).
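
The step from low-level descriptors to functionals can be sketched as follows: a variable-length contour (for example, frame-wise pitch within one chunk) is mapped to a fixed-length set of statistics. This is a minimal illustration using the Python standard library, not openSMILE's feature extraction; the function name `apply_functionals`, the selection of functionals and the example contour are assumptions.

```python
import statistics

def apply_functionals(contour):
    """Map a variable-length LLD contour to a fixed-length feature dict."""
    q1, median, q3 = statistics.quantiles(contour, n=4)
    return {
        "mean": statistics.fmean(contour),
        "stddev": statistics.pstdev(contour),
        "min": min(contour),
        "max": max(contour),
        "range": max(contour) - min(contour),
        "quartile1": q1,
        "median": median,
        "quartile3": q3,
        "iqr": q3 - q1,
    }

# Hypothetical pitch contour in Hz for one chunk.
pitch_contour = [182.0, 190.5, 201.3, 215.0, 208.4, 195.1, 187.6]
features = apply_functionals(pitch_contour)
for name, value in features.items():
    print(f"{name}: {value:.2f}")
```

Applying the same functionals to every descriptor and to its deltas is what makes brute-forced feature sets grow into the thousands.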

24 Baseline Features. Generation: brute-force; systematic; not optimised; size can become an issue; provide extraction scripts

25 The Bigger the Better? Optimisation and feature set definition. Example: Intoxication Sub-Challenge IS 2011. You

26 Feature Selection. Feature compression. Example: bottlenecks; integrate space reduction; Interspeech 2010 Challenge. Acoustic-Linguistic Recognition of Interest in Speech with Bottleneck-BLSTM Nets, Interspeech

27 Classification & Regression: Modelling. Affective & social signals are sequential: context + dynamics. Long-short term modelling: Recurrent Neural Network; Long Short-Term Memory RNN. [Figure: two networks unrolled over time with inputs i, hidden states h and outputs o, one over steps t-3 to t and one over steps t-3 to t+2] Combining Long Short-Term Memory and Dynamic Bayesian Networks for Incremental Emotion-Sensitive Artificial Listening, IEEE Journal of Selected Topics in Signal Processing, 4(5)

28 Classification & Regression: LSTM Cell. Linear unit: auto weight 1, error carousel (EC). Non-linear gates: input (I) / output (O) / forget (F); multiplicative; open / shutdown. Peep-hole connection
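
The cell described above can be written out for a single scalar unit. This is a minimal sketch of one forward step in the common LSTM formulation, assuming scalar weights and omitting the peep-hole connections mentioned on the slide; the function names and the parameter keys (`wi`, `ui`, `bi`, and so on) are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, p):
    """One forward step of a scalar LSTM cell (no peep-hole connections)."""
    i = sigmoid(p["wi"] * x + p["ui"] * h_prev + p["bi"])    # input gate
    f = sigmoid(p["wf"] * x + p["uf"] * h_prev + p["bf"])    # forget gate
    o = sigmoid(p["wo"] * x + p["uo"] * h_prev + p["bo"])    # output gate
    g = math.tanh(p["wg"] * x + p["ug"] * h_prev + p["bg"])  # candidate input
    # Linear unit with self-connection of weight 1 (the error carousel),
    # scaled multiplicatively by the forget gate.
    c = f * c_prev + i * g
    h = o * math.tanh(c)
    return h, c

# Forget gate open, input gate shut: the stored state survives the step.
params = dict(wi=0.0, ui=0.0, bi=-20.0, wf=0.0, uf=0.0, bf=20.0,
              wo=0.0, uo=0.0, bo=0.0, wg=1.0, ug=0.0, bg=0.0)
h, c = lstm_step(x=5.0, h_prev=0.0, c_prev=1.0, p=params)
print(h, c)
```

Because the gates multiply rather than add, a shut input gate shields the stored value from the current input, which is what lets the cell carry context over many time steps.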

29 Classification & Regression. [Figure: network with an input layer, hidden layers 1 to N built from LSTM cells (gates I, F, O around the error carousel EC), and an output layer]

30 Classification & Regression. Example: Exploit temporal context by LSTM: > 2 past & future instances. [Figure: gain derivative [%] over time step t; gain derivative for t=16] Context-Sensitive Learning for Enhanced Audiovisual Emotion Classification, IEEE Transactions on Affective Computing, 3(2)

31 ASR & Emotion. Training and adapting models: AM, LM, both; word accuracy; significance; adaptation of neutral ASR engine. On the Impact of Children's Emotional Speech on Acoustic and Language Models, EURASIP Journal on Audio, Speech, and Music Processing

32 The Words: Predicting User Tags. TED-Talk Corpus: 843 talks of at most 18 min each; Bag of Words + SVM. Words that Fascinate the Listener: Predicting Affective Ratings of On-Line Lectures, International Journal of Distance Education Technologies, 11(2). UA
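
The bag-of-words representation named on this slide can be sketched in a few lines: each transcript becomes a vector of term counts over a shared vocabulary, which a classifier such as an SVM would then consume. The sketch covers only the vectorisation step; the function names and the two example sentences are assumptions, and real systems would typically add stemming, stop-word handling and term weighting.

```python
from collections import Counter

def build_vocabulary(documents):
    """Assign an index to every distinct lower-cased token, in sorted order."""
    tokens = sorted({word for doc in documents for word in doc.lower().split()})
    return {word: index for index, word in enumerate(tokens)}

def bag_of_words(document, vocabulary):
    """Return the term-count vector of one document over the vocabulary."""
    counts = Counter(document.lower().split())
    return [counts.get(word, 0) for word in vocabulary]

docs = ["Fascinating talk fascinating ideas", "Boring talk"]
vocab = build_vocabulary(docs)
vectors = [bag_of_words(doc, vocab) for doc in docs]
print(list(vocab))
print(vectors)
```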

33 ASR & SER: ASR influence; salience. Emotion Challenge 2-class task; reduced beam-width. Emotion Recognition using Imperfect Speech Recognition, Interspeech

34 Interspeech ComParE. The Computational Paralinguistics Challenge, IEEE Signal Processing Magazine, 29(4): 2-6
Columns: task, # classes, UAR / *UAAUC / +CC [%]
2013: Social Signals (2x2 classes, 83.3*); Conflict; Emotion; Autism
Personality (5x); Likability; Intelligibility
Intoxication; Sleepiness
Age; Gender; Interest ([-1,1])
Emotion; Negativity
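
Unweighted average recall (UAR), the main measure in this table, is the mean of the per-class recalls, so every class counts equally regardless of how many instances it has. The sketch below contrasts it with plain (weighted) accuracy on an imbalanced toy example; the function names and the label sequences are assumptions for illustration.

```python
def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls; each class contributes equally."""
    recalls = []
    for label in sorted(set(y_true)):
        indices = [k for k, t in enumerate(y_true) if t == label]
        hits = sum(1 for k in indices if y_pred[k] == label)
        recalls.append(hits / len(indices))
    return sum(recalls) / len(recalls)

def weighted_accuracy(y_true, y_pred):
    """Fraction of correctly classified instances."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)

# Imbalanced 2-class toy data: always predicting the majority class.
y_true = ["IDL"] * 8 + ["NEG"] * 2
y_pred = ["IDL"] * 10
print(weighted_accuracy(y_true, y_pred))          # 0.8
print(unweighted_average_recall(y_true, y_pred))  # 0.5
```

A majority-class predictor looks acceptable under accuracy but sits at chance level under UAR, which is why UAR suits tasks with skewed class distributions.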

35 AVEC: Audiovisual Recognition. AVEC 2011/12: 50.4k turns; LBP + 2k acoustics. Dimensions (low / high): Activation, Expectation, Power, Valence; measures: WA [%], CC. LSTM-Modeling of Continuous Emotions in an Audiovisual Affect Recognition Framework, Image and Vision Computing, 31(2) (best result AVEC Challenge)

36 AVEC: Fully Continuous Sub-Challenge. 3 x 500,000 frame / label pairs. AVEC The Continuous Audio/Visual Emotion and Recognition Challenge, ACM ICMI

37 AVEC Data
Audio-visual depressive language corpus (AVDLC): 340 video clips of subjects performing an HCI task, (mean 25) min; total duration 240 hours; one person per clip, 292 subjects, (mean 31 +/- 12) years.
Depression Recognition Sub-Challenge: continuous Beck Depression Inventory (BDI) score.
Affect Recognition Sub-Challenge: continuous arousal & valence ratings.
Competition measure / baseline features: Root Mean Square Error over all sessions; LPQ + 2k acoustics.
AVEC The Continuous Audio/Visual Emotion and Depression Recognition Challenge, ACM Multimedia
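
The competition measure named above, root mean square error over all sessions, can be sketched as follows. The function name `rmse` and the score values are made up for illustration and are not results from the challenge.

```python
import math

def rmse(predictions, gold):
    """Root mean square error between predicted and gold scores."""
    squared = [(p - g) ** 2 for p, g in zip(predictions, gold)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical per-session depression scores: predicted vs. self-reported.
predicted = [12.0, 20.0, 7.0, 30.0]
reported = [10.0, 25.0, 7.0, 22.0]
print(rmse(predicted, reported))
```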

38 Experience ComParE / AVEC
# Participating teams: Interspeech ComParE: 33, 32, 34, 52, 65; AVEC: 12-17.
Some observations: Fusion of the best participant results exceeds the individual winners throughout. These multiple-site results are so far not reached by individual attempts. Supra-segmental modelling of information prevails by far. Dynamic algorithms such as HMMs or DBNs are used only sparsely. Crossing disciplines seems to be successful. Great submissions outside the competition, e.g., perception studies.

39 Some Hot Topics

40 Data. Crowd sourcing: Amazon Mechanical Turk? Data synthesis. Example: Emotion in Speech: cross-corpus testing, 3 levels of valence, 6 databases; training with real speech / synthesized speech; test with real speech. Train / % WA: Human 64.8; Synth; Human + Synth. Learning with Synthesized Speech for Automatic Emotion Recognition, ICASSP (pending European patent)

41 Active Learning. Goal: select the most informative instances from a large amount of unlabelled data; ask for human labelling aid. Typical: high class imbalance. Example: FAU Aibo Emotion Corpus (2009 Emotion Challenge), 2-class task: NEGative / IDLe. Active Learning by Sparse Instance Tracking and Classifier Confidence in Acoustic Emotion Recognition, Interspeech

42 AL: Sparse Instance Tracking. Imbalanced 2-class task. Loop: train model on labelled utterances; classify unlabelled utterances; sparse instance tracking: select the samples predicted as sparse class, i.e., NEG; add manually labelled utterances. Stop when no data is likely to be sparse class or a certain UA is achieved; result: final model
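
The loop on this slide can be sketched with a deliberately simple stand-in classifier. The code below is an illustration of the selection idea only, not the method from the cited paper: the nearest-class-mean classifier on one-dimensional features, the `oracle` function that plays the human labeller, the `max_rounds` cap and all names are assumptions. It implements the "no data likely to be sparse class" stop criterion; the UA-based criterion is left out.

```python
def train_nearest_mean(labelled):
    """Toy 1-D classifier: store the mean feature value of each class."""
    sums = {}
    for x, y in labelled:
        total, count = sums.get(y, (0.0, 0))
        sums[y] = (total + x, count + 1)
    return {y: total / count for y, (total, count) in sums.items()}

def predict(model, x):
    """Return the class whose mean is closest to x."""
    return min(model, key=lambda y: abs(x - model[y]))

def sparse_instance_tracking(labelled, unlabelled, oracle,
                             sparse_label="NEG", max_rounds=10):
    """Query a human label only for samples predicted as the sparse class."""
    labelled, pool = list(labelled), list(unlabelled)
    for _ in range(max_rounds):
        model = train_nearest_mean(labelled)
        picked = [x for x in pool if predict(model, x) == sparse_label]
        if not picked:  # no data likely to be sparse class: stop
            break
        labelled += [(x, oracle(x)) for x in picked]
        pool = [x for x in pool if x not in picked]
    return train_nearest_mean(labelled), labelled

# Hypothetical data: feature values above 0.5 are truly NEG.
oracle = lambda x: "NEG" if x > 0.5 else "IDL"
seed_set = [(0.1, "IDL"), (0.9, "NEG")]
pool = [0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.55]
model, labelled = sparse_instance_tracking(seed_set, pool, oracle)
print(len(labelled), model)
```

Only four of the seven pooled samples are sent to the labeller here, and all of them belong to the sparse class, which is the intended effect on imbalanced data.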

43 AL: Sparse Instance Tracking. SI-AL: select samples predicted as NEG. UA improved by 5.0% absolute; 95.0% reduction in transcribed training data

44 AL: Confidence. With instance upsampling: for imbalanced task. 2-class task. Loop: upsample labelled utterances; train model; classify unlabelled utterances; query medium confidence scores: min |C_i - 0.5|; add manually labelled utterances. Stop when a certain UA is achieved; result: final model
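
The query rule min |C_i - 0.5| and the upsampling step can be sketched as two small helpers. This is an illustration under assumptions: the function names, the use of random duplication for upsampling and the example confidence values are not taken from the cited work.

```python
import random

def select_medium_confidence(confidences, n_queries):
    """Return indices of the samples whose confidence is closest to 0.5."""
    ranked = sorted(range(len(confidences)),
                    key=lambda i: abs(confidences[i] - 0.5))
    return ranked[:n_queries]

def upsample_minority(labelled, rng):
    """Duplicate randomly chosen samples until all classes are equally large."""
    by_class = {}
    for x, y in labelled:
        by_class.setdefault(y, []).append((x, y))
    target = max(len(items) for items in by_class.values())
    balanced = []
    for items in by_class.values():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        balanced += items + extra
    return balanced

# Hypothetical classifier confidences for five unlabelled utterances.
confidences = [0.95, 0.52, 0.10, 0.45, 0.70]
queries = select_medium_confidence(confidences, n_queries=2)
print(queries)  # [1, 3]

labelled = [(0.1, "IDL"), (0.2, "IDL"), (0.3, "IDL"), (0.9, "NEG")]
balanced = upsample_minority(labelled, random.Random(0))
print(len(balanced))  # 6
```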

45 AL: Confidence & Upsampling. MCS-AL: select the middle of the prediction score ranking. UA improved by 1.0% absolute; 75.0% reduction in transcribed training data

46 Semi-Supervised Learning. Semi-supervised parameters: min. confidence level; # iterations. Co-training: provide different views, e.g., feature partitions. Examples: Interspeech Challenges: Age, Sleepiness, Emotion. Co-Training Succeeds in Computational Paralinguistics, IEEE ICASSP

47 Confidence Measures: Human Confidence. Learning human labeller agreement per class. Confidence Measures for Speech Emotion Recognition: a Start, IEEE/ITG Speech Communication

48 Confidence Measures: Human Confidence. Human labeller agreement: evaluation. Confidence Measures for Speech Emotion Recognition: a Start, IEEE/ITG Speech Communication

49 Confidence Measures: External Confidence. Re-labelling by dimensional approach; learning behaviour of engine. Confidence Measures in Speech Emotion Recognition Based on Semi-supervised Learning, Interspeech

50 Confidence Measures: External Confidence. Adaptation to target domain: semi-supervised learning. Confidence Measures in Speech Emotion Recognition Based on Semi-supervised Learning, Interspeech

51 Fusion of Engines. Fusion: late fusion (vote); ask for confidences; determining optimal N. Usually a long-standing upper benchmark. Message: Together we're best
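
Late fusion by voting can be sketched as a per-instance majority vote over the label sequences of several engines. The function name `majority_vote` and the three example engines are assumptions; ties are resolved in favour of the label seen first among the tied ones, which is an arbitrary choice of this sketch.

```python
from collections import Counter

def majority_vote(predictions_per_engine):
    """Fuse label sequences from several engines by per-instance majority."""
    fused = []
    for votes in zip(*predictions_per_engine):
        fused.append(Counter(votes).most_common(1)[0][0])
    return fused

# Three hypothetical engines, each wrong on a different instance.
engines = [
    ["NEG", "IDL", "IDL", "NEG"],
    ["NEG", "NEG", "IDL", "NEG"],
    ["IDL", "IDL", "IDL", "NEG"],
]
print(majority_vote(engines))  # ['NEG', 'IDL', 'IDL', 'NEG']
```

The vote corrects each engine's single error here because the errors fall on different instances, which is the condition under which fusion helps.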

52 Fusion of Engines: Why it works. Q-statistics: pairwise measure of whether participants commit the same errors on the test set. Medium Term Speaker Traits A Review on Intoxication, Sleepiness, and the First Challenge, Computer Speech & Language
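
A common form of the pairwise Q-statistic is Yule's Q, computed from how often two classifiers are both right, both wrong, or disagree. The sketch below assumes that form; whether the cited review uses exactly this definition is not stated on the slide. The function name and the convention of returning 0.0 for an undefined ratio are also assumptions.

```python
def q_statistic(correct_a, correct_b):
    """Yule's Q for two classifiers, given per-instance correctness flags.

    Q near 1: the classifiers tend to err on the same instances.
    Q near 0 or below: their errors differ, so fusion has room to help.
    """
    n11 = sum(1 for a, b in zip(correct_a, correct_b) if a and b)
    n00 = sum(1 for a, b in zip(correct_a, correct_b) if not a and not b)
    n10 = sum(1 for a, b in zip(correct_a, correct_b) if a and not b)
    n01 = sum(1 for a, b in zip(correct_a, correct_b) if not a and b)
    denominator = n11 * n00 + n01 * n10
    if denominator == 0:
        return 0.0  # undefined case; 0.0 is a convention of this sketch
    return (n11 * n00 - n01 * n10) / denominator

same_errors = q_statistic([1, 1, 0, 0], [1, 1, 0, 0])
opposite_errors = q_statistic([1, 1, 0, 0], [0, 0, 1, 1])
print(same_errors, opposite_errors)  # 1.0 -1.0
```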

53 Cumulative Evidence: Session Level Fusion. Combine participant fusion and cumulative evidence. Medium Term Speaker Traits A Review on Intoxication, Sleepiness, and the First Challenge, Computer Speech & Language

54 Distributed Recognition. Reduce computational load on end-user devices. Allow easy integration into new applications (cf. Google API). Solve data scarcity problem: semi-supervised learning etc. Two main issues: bandwidth / storage cost; privacy. orig. 12 MFCC

55 Distributed Recognition. Client-server model (cf. ETSI ES V1.1.5). Sub-Vector Quantization (SVQ)

56 Distributed Recognition. FAU AIBO, 9 bit, 32 subvectors: 67.4% UA. [Figure: training set; clustering; codebook; clustering; codebook L; clustering; codebook M] Distributed Recognition of Emotion in Speech, IEEE ISCCSP
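
Sub-vector quantization can be sketched as follows: the client splits a feature vector into sub-vectors, replaces each by the index of its nearest codeword, and sends only the indices; the server looks the codewords up again. The codebooks here are tiny hand-written examples rather than clusters learned from a training set, and the function names are assumptions. For scale, an index of 9 bits would address 2^9 = 512 codewords.

```python
def split_vector(vector, n_sub):
    """Split a vector into n_sub equally sized sub-vectors."""
    size = len(vector) // n_sub
    return [vector[i * size:(i + 1) * size] for i in range(n_sub)]

def nearest_codeword(codebook, sub_vector):
    """Index of the codeword with the smallest squared Euclidean distance."""
    return min(range(len(codebook)),
               key=lambda k: sum((c - s) ** 2
                                 for c, s in zip(codebook[k], sub_vector)))

def encode(vector, codebooks):
    """Client side: one codeword index per sub-vector."""
    subs = split_vector(vector, len(codebooks))
    return [nearest_codeword(cb, sub) for cb, sub in zip(codebooks, subs)]

def decode(indices, codebooks):
    """Server side: rebuild an approximate vector from the indices."""
    restored = []
    for index, codebook in zip(indices, codebooks):
        restored.extend(codebook[index])
    return restored

# Two toy codebooks with two codewords each (1 bit per sub-vector).
codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],
    [[0.0, 0.0], [5.0, 5.0]],
]
features = [0.9, 1.2, 4.0, 6.0]
indices = encode(features, codebooks)
print(indices)                     # [1, 1]
print(decode(indices, codebooks))  # [1.0, 1.0, 5.0, 5.0]
```

The client transmits two small integers instead of four floating-point values, trading some precision for bandwidth, and the raw features never leave the device.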

57 And Next?

58 Going Web. Example: Audiovisual Sentiment Polarity. Multi-Modal Movie Opinion Database: 370 videos from YouTube & ExpoTV; amateur movie reviews. YouTube Movie Reviews: In, Cross, and Open-domain Sentiment Analysis in an Audiovisual Context, IEEE Intelligent Systems Magazine

59 Going Web: Sentiment Polarity. Multi-Modal Movie Opinion Database: in-domain, cross-domain, open-domain; effect of ASR; boost by adding A/V. YouTube Movie Reviews: In, Cross, and Open-domain Sentiment Analysis in an Audiovisual Context, IEEE Intelligent Systems Magazine

60 Going Non-aware. Graz Real-Life Affect in the Street & Supermarket (GRAS²) Corpus: 6-channel audio + video + eye tracking + EDA + temperature + 2x 3D motion. InterSPAR tasks: search for Sauerkraut, Ovomaltine; ask for SPAR chocolate, a specific calculator, a typical Austrian product, Turkish Ayran, denture adhesive, athlete's foot cream

61 Transfer Learning. Exploit similar data: related data, e.g., cross-corpus, cross-lingual, cross-trait; related feature vectors; related tasks. Methodology: highly specific, e.g., new models from old, new labels from old. Example: feature adaptation with a sparse autoencoder: SUSAS, AIBO AEC; just 50 target instances. Sparse Autoencoder-based Feature Transfer Learning for Speech Emotion Recognition, IEEE/HUMAINE ACII

62 Transfer Learning: Cross-Audio. Speech: spontaneous (VAM), enacted (GEMEP); Music: NTWICM; Sound: EmoFindSound. Dimensions: arousal, valence. Task-adapted features. On the Acoustics of Emotion in Audio: What Speech, Music, and Sound have in Common, Frontiers in Psychology, 4(5): 1-12

63 Context Exploitation: Interdependency. Text; truthfulness; phonetic/linguistic; multi-modal; immediate situational; heart rate; intoxication; relatedness; body size/shape; role; emotion/affect; sleepiness; general time & space; age; health; personality; gender; ethnicity; ID

64 Wrap-Up. Some hot topics: fighting data sparseness (synthesis, active & weakly supervised learning); confidence measures; fusion & cumulative evidence; distribution. What's next? Fighting data sparseness (web & transfer); higher-level features; context; usability analysis; real-world products

65 Merci Beaucoup! Bedankt! Thank You. time for broad application!

Emotion Detection from Speech

Emotion Detection from Speech Emotion Detection from Speech 1. Introduction Although emotion detection from speech is a relatively new field of research, it has many potential applications. In human-computer or human-human interaction

More information

Recent advances in Digital Music Processing and Indexing

Recent advances in Digital Music Processing and Indexing Recent advances in Digital Music Processing and Indexing Acoustics 08 warm-up TELECOM ParisTech Gaël RICHARD Telecom ParisTech (ENST) www.enst.fr/~grichard/ Content Introduction and Applications Components

More information

Vocal Emotion Recognition

Vocal Emotion Recognition Vocal Emotion Recognition State-of-the-Art in Classification of Real-Life Emotions October 26, 2010 Stefan Steidl International Computer Science Institute (ICSI) at Berkeley, CA Overview 2 / 49 1 Different

More information

SENTIMENT analysis, particularly the automatic analysis

SENTIMENT analysis, particularly the automatic analysis IEEE INTELLIGENT SYSTEMS, VOL. X, NO. X, MONTH, YEAR 1 YouTube Movie Reviews: In, Cross, and Open-domain Sentiment Analysis in an Audiovisual Context Martin Wöllmer, Felix Weninger, Tobias Knaup, Björn

More information

Introduction to Machine Learning. Speaker: Harry Chao Advisor: J.J. Ding Date: 1/27/2011

Introduction to Machine Learning. Speaker: Harry Chao Advisor: J.J. Ding Date: 1/27/2011 Introduction to Machine Learning Speaker: Harry Chao Advisor: J.J. Ding Date: 1/27/2011 1 Outline 1. What is machine learning? 2. The basic of machine learning 3. Principles and effects of machine learning

More information

Speech Signal Processing: An Overview

Speech Signal Processing: An Overview Speech Signal Processing: An Overview S. R. M. Prasanna Department of Electronics and Electrical Engineering Indian Institute of Technology Guwahati December, 2012 Prasanna (EMST Lab, EEE, IITG) Speech

More information

Machine Learning: Overview

Machine Learning: Overview Machine Learning: Overview Why Learning? Learning is a core of property of being intelligent. Hence Machine learning is a core subarea of Artificial Intelligence. There is a need for programs to behave

More information

Sense Making in an IOT World: Sensor Data Analysis with Deep Learning

Sense Making in an IOT World: Sensor Data Analysis with Deep Learning Sense Making in an IOT World: Sensor Data Analysis with Deep Learning Natalia Vassilieva, PhD Senior Research Manager GTC 2016 Deep learning proof points as of today Vision Speech Text Other Search & information

More information

INFERRING SOCIAL RELATIONSHIPS IN A PHONE CALL FROM A SINGLE PARTY S SPEECH Sree Harsha Yella 1,2, Xavier Anguera 1 and Jordi Luque 1

INFERRING SOCIAL RELATIONSHIPS IN A PHONE CALL FROM A SINGLE PARTY S SPEECH Sree Harsha Yella 1,2, Xavier Anguera 1 and Jordi Luque 1 INFERRING SOCIAL RELATIONSHIPS IN A PHONE CALL FROM A SINGLE PARTY S SPEECH Sree Harsha Yella 1,2, Xavier Anguera 1 and Jordi Luque 1 1 Telefonica Research, Barcelona, Spain 2 Idiap Research Institute,

More information

engin erzin the use of speech processing applications is expected to surge in multimedia-rich scenarios

engin erzin the use of speech processing applications is expected to surge in multimedia-rich scenarios engin erzin Associate Professor Department of Computer Engineering Ph.D. Bilkent University http://home.ku.edu.tr/ eerzin eerzin@ku.edu.tr Engin Erzin s research interests include speech processing, multimodal

More information

APPLYING MFCC-BASED AUTOMATIC SPEAKER RECOGNITION TO GSM AND FORENSIC DATA

APPLYING MFCC-BASED AUTOMATIC SPEAKER RECOGNITION TO GSM AND FORENSIC DATA APPLYING MFCC-BASED AUTOMATIC SPEAKER RECOGNITION TO GSM AND FORENSIC DATA Tuija Niemi-Laitinen*, Juhani Saastamoinen**, Tomi Kinnunen**, Pasi Fränti** *Crime Laboratory, NBI, Finland **Dept. of Computer

More information

SPEAKER TRAIT CHARACTERIZATION IN WEB VIDEOS: UNITING SPEECH, LANGUAGE, AND FACIAL FEATURES

SPEAKER TRAIT CHARACTERIZATION IN WEB VIDEOS: UNITING SPEECH, LANGUAGE, AND FACIAL FEATURES SPEAKER TRAIT CHARACTERIZATION IN WEB VIDEOS: UNITING SPEECH, LANGUAGE, AND FACIAL FEATURES Felix Weninger 1, Claudia Wagner 2, Martin Wöllmer 1,3, Björn Schuller 1,2, and Louis-Philippe Morency 4 1 Machine

More information

Network Machine Learning Research Group. Intended status: Informational October 19, 2015 Expires: April 21, 2016

Network Machine Learning Research Group. Intended status: Informational October 19, 2015 Expires: April 21, 2016 Network Machine Learning Research Group S. Jiang Internet-Draft Huawei Technologies Co., Ltd Intended status: Informational October 19, 2015 Expires: April 21, 2016 Abstract Network Machine Learning draft-jiang-nmlrg-network-machine-learning-00

More information

Establishing the Uniqueness of the Human Voice for Security Applications

Establishing the Uniqueness of the Human Voice for Security Applications Proceedings of Student/Faculty Research Day, CSIS, Pace University, May 7th, 2004 Establishing the Uniqueness of the Human Voice for Security Applications Naresh P. Trilok, Sung-Hyuk Cha, and Charles C.

More information

Social Media Mining. Data Mining Essentials

Social Media Mining. Data Mining Essentials Introduction Data production rate has been increased dramatically (Big Data) and we are able store much more data than before E.g., purchase data, social media data, mobile phone data Businesses and customers

More information

Automatic slide assignation for language model adaptation

Automatic slide assignation for language model adaptation Automatic slide assignation for language model adaptation Applications of Computational Linguistics Adrià Agustí Martínez Villaronga May 23, 2013 1 Introduction Online multimedia repositories are rapidly

More information

Annotated bibliographies for presentations in MUMT 611, Winter 2006

Annotated bibliographies for presentations in MUMT 611, Winter 2006 Stephen Sinclair Music Technology Area, McGill University. Montreal, Canada Annotated bibliographies for presentations in MUMT 611, Winter 2006 Presentation 4: Musical Genre Similarity Aucouturier, J.-J.

More information

Introduction to Machine Learning Lecture 1. Mehryar Mohri Courant Institute and Google Research mohri@cims.nyu.edu

Introduction to Machine Learning Lecture 1. Mehryar Mohri Courant Institute and Google Research mohri@cims.nyu.edu Introduction to Machine Learning Lecture 1 Mehryar Mohri Courant Institute and Google Research mohri@cims.nyu.edu Introduction Logistics Prerequisites: basics concepts needed in probability and statistics

More information

Automatic Detection of Laughter and Fillers in Spontaneous Mobile Phone Conversations

Automatic Detection of Laughter and Fillers in Spontaneous Mobile Phone Conversations Automatic Detection of Laughter and Fillers in Spontaneous Mobile Phone Conversations Hugues Salamin, Anna Polychroniou and Alessandro Vinciarelli University of Glasgow - School of computing Science, G128QQ

More information

Automatic Speech Recognition and Hybrid Machine Translation for High-Quality Closed-Captioning and Subtitling for Video Broadcast

Automatic Speech Recognition and Hybrid Machine Translation for High-Quality Closed-Captioning and Subtitling for Video Broadcast Automatic Speech Recognition and Hybrid Machine Translation for High-Quality Closed-Captioning and Subtitling for Video Broadcast Hassan Sawaf Science Applications International Corporation (SAIC) 7990

More information

A Sound Analysis and Synthesis System for Generating an Instrumental Piri Song

A Sound Analysis and Synthesis System for Generating an Instrumental Piri Song , pp.347-354 http://dx.doi.org/10.14257/ijmue.2014.9.8.32 A Sound Analysis and Synthesis System for Generating an Instrumental Piri Song Myeongsu Kang and Jong-Myon Kim School of Electrical Engineering,

More information

Active Learning SVM for Blogs recommendation

Active Learning SVM for Blogs recommendation Active Learning SVM for Blogs recommendation Xin Guan Computer Science, George Mason University Ⅰ.Introduction In the DH Now website, they try to review a big amount of blogs and articles and find the

More information

Fast Labeling and Transcription with the Speechalyzer Toolkit

Fast Labeling and Transcription with the Speechalyzer Toolkit Fast Labeling and Transcription with the Speechalyzer Toolkit Felix Burkhardt Deutsche Telekom Laboratories, Berlin, Germany Felix.Burkhardt@telekom.de Abstract We describe a software tool named Speechalyzer

More information

A Partially Supervised Metric Multidimensional Scaling Algorithm for Textual Data Visualization

A Partially Supervised Metric Multidimensional Scaling Algorithm for Textual Data Visualization A Partially Supervised Metric Multidimensional Scaling Algorithm for Textual Data Visualization Ángela Blanco Universidad Pontificia de Salamanca ablancogo@upsa.es Spain Manuel Martín-Merino Universidad

More information

Big Data: Image & Video Analytics

Big Data: Image & Video Analytics Big Data: Image & Video Analytics How it could support Archiving & Indexing & Searching Dieter Haas, IBM Deutschland GmbH The Big Data Wave 60% of internet traffic is multimedia content (images and videos)

More information

Learning to Recognize Talkers from Natural, Sinewave, and Reversed Speech Samples

Learning to Recognize Talkers from Natural, Sinewave, and Reversed Speech Samples Learning to Recognize Talkers from Natural, Sinewave, and Reversed Speech Samples Presented by: Pankaj Rajan Graduate Student, Department of Computer Sciences. Texas A&M University, College Station Agenda

More information

You Seem Aggressive! Monitoring Anger in a Practical Application

You Seem Aggressive! Monitoring Anger in a Practical Application You Seem Aggressive! Monitoring Anger in a Practical Application Felix Burkhardt Deutsche Telekom Laboratories, Berlin, Germany Felix.Burkhardt@telekom.de Abstract A monitoring system to detect emotional

More information

NAVIGATING SCIENTIFIC LITERATURE A HOLISTIC PERSPECTIVE. Venu Govindaraju

NAVIGATING SCIENTIFIC LITERATURE A HOLISTIC PERSPECTIVE. Venu Govindaraju NAVIGATING SCIENTIFIC LITERATURE A HOLISTIC PERSPECTIVE Venu Govindaraju BIOMETRICS DOCUMENT ANALYSIS PATTERN RECOGNITION 8/24/2015 ICDAR- 2015 2 Towards a Globally Optimal Approach for Learning Deep Unsupervised

More information

Crowdfunding Support Tools: Predicting Success & Failure

Crowdfunding Support Tools: Predicting Success & Failure Crowdfunding Support Tools: Predicting Success & Failure Michael D. Greenberg Bryan Pardo mdgreenb@u.northwestern.edu pardo@northwestern.edu Karthic Hariharan karthichariharan2012@u.northwes tern.edu Elizabeth

More information

Automatic Emotion Recognition from Speech

Automatic Emotion Recognition from Speech Automatic Emotion Recognition from Speech A PhD Research Proposal Yazid Attabi and Pierre Dumouchel École de technologie supérieure, Montréal, Canada Centre de recherche informatique de Montréal, Montréal,

More information

Emotion and Its Triggers in Human Spoken Dialogue: Recognition and Analysis

Emotion and Its Triggers in Human Spoken Dialogue: Recognition and Analysis Emotion and Its Triggers in Human Spoken Dialogue: Recognition and Analysis Nurul Lubis, Sakriani Sakti, Graham Neubig, Tomoki Toda, Ayu Purwarianti, and Satoshi Nakamura Abstract Human communication is

More information

A Supervised Approach To Musical Chord Recognition

A Supervised Approach To Musical Chord Recognition Pranav Rajpurkar Brad Girardeau Takatoki Migimatsu Stanford University, Stanford, CA 94305 USA pranavsr@stanford.edu bgirarde@stanford.edu takatoki@stanford.edu Abstract In this paper, we present a prototype

More information

Speech: A Challenge to Digital Signal Processing Technology for Human-to-Computer Interaction

Speech: A Challenge to Digital Signal Processing Technology for Human-to-Computer Interaction : A Challenge to Digital Signal Processing Technology for Human-to-Computer Interaction Urmila Shrawankar Dept. of Information Technology Govt. Polytechnic, Nagpur Institute Sadar, Nagpur 440001 (INDIA)

More information

Automatic Evaluation Software for Contact Centre Agents voice Handling Performance

Automatic Evaluation Software for Contact Centre Agents voice Handling Performance International Journal of Scientific and Research Publications, Volume 5, Issue 1, January 2015 1 Automatic Evaluation Software for Contact Centre Agents voice Handling Performance K.K.A. Nipuni N. Perera,
