Multi Modal Affective Data Analytics Mykola Pechenizkiy SDAD 2012 @ ECMLPKDD2012 2 September 2012 Bristol, UK http://www.win.tue.nl/stressatwork
Affective data: social media. Social media leads to masses of affective data related to people's emotions, sentiments and opinions. In the recent past it was used mainly for marketing needs (web analytics, social media). Whatever the incentive was to study this, sentiment classification has become much more accurate.
Multilingual Sentiment Classification
Rule-based polarity detection. Rule-based emission model: 8 kinds of rules. [Figure: emission]
SentiCorr: how much positive and negative content do we read or write?
Mobile SentiCorr App. User reactions: "What a fantastic idea, now if"; "Great idea! Get it on iOS soon" (anonymous); "This app is designed to make someone else (or a computer) read our e-mails for and protect us from WHAT??? How lazy can we get? Like someone commented on CNN reactions, if we are getting upset by the tones and/or scoldings in e-mails, we certainly have bigger issues that need to be dealt with. C'mon, guys, go invent something useful. Not to mention, does it detect irony? Will it weed out the liars? Pleeeeezzzeeee, what a WASTE OF SOMEONE'S COLLEGIATE TIME AND ENERGY. Don't we have houses to clean and poor people to feed and old folks to help with their shopping? Go do something useful with your time, inventors of this app!!!" Quote: "Stress is often made worse by anticipation of an unpleasant event and actually dissipated once you tackle the problem directly" (Pamela Briggs, British Psychological Society).
OLAP-Style Exploration of Data Summaries
Exploration of Individual Cases, e.g. E-Mails
Sentiment vs. Fact Classification. News in media or business is considered to be sentiment-neutral, but it often contains positive or negative information, e.g. "You will be fired in 3 months because of the serious budget cuts": no sentiment, but negative information. Similarly, work-related correspondence can contain stress-inducing information. How can we identify it?
Sentiment discovery: state of the art. Sentiment analysis/classification is mature! Commercial products, free services, open source, a variety of apps; it evolves in many directions. Several great overviews: "Sentiment Analysis in Practice", ICDM 2011 tutorial by Tiger Zhang (eBay Research Labs), http://web.cs.dal.ca/~yongzhen/publication/paper/icdm2011_sentimentanalysisinpracticetutorial.pdf; "Modeling Opinions and Beyond in Social Media" by Bing Liu (UIC), http://kdd2012.sigkdd.org/sites/images/summerschool/bing Liu.pptx
Outline: framework for Stress Analytics (data management, OLAP support); shape-based query by example; stress detection from speech and GSR; predictive features and classification; from controlled experiments to real life.
What is stress? Is it a bad thing?
Stress in NL according to Coosto.nl: not really job-related.
Impact of Stress at Work. WHO: by 2020 the top 5 diseases will be stress-related. USA: health care expenditures are ~50% greater for workers who report high levels of stress at work (J. Occup. Env. Med, 40:843-854). The Netherlands (TNO, 2006; TU/e Cursor 2012): the direct costs of stress are 4 billion euro per year; every year 150,000 to 300,000 employees become ill because of stress at work; 1/7 disabled because of stress at work. At TU Delft, 53% of surveyed students indicated that they experienced huge stress during their studies.
What do organizations try (not) to do? Reduce workload (33%); discuss psychological load (28%); change work processes (17%); improve work/life balance (14%); improve managers' skills (13%); extend regulations (9%). Source: TNO, dossier Werkdruk.
What can go wrong? They are not always aware of the problem or don't know the exact cause. People do not always want to share what they experience with others. Not always timely enough. Expensive to organize meetings with psychologists and interventions. The individual causes are different and not always well understood. Giving practical advice is not trivial.
Types of Stress and Stressors. Different types of stress: survival stress (a response to a physical danger); environmental stress (noise, crowding, pressure from work or family); internal stress (worrying about things we can't control; putting ourselves in situations we know will cause us stress; being addicted to stress, e.g. expanding the to-do list with more and more conference deadlines); fatigue and overwork (in a long-term perspective). Stress affects both body and mind.
Types of Stress and Stressors. Three kinds of stress: acute (caused by an acute short-term stress factor); episodic acute (occurs more frequently and periodically); chronic (caused by long-term stress factors; harmful). Factors causing stress@work: long work hours, work overload, time pressure, difficult, demanding or complex tasks, high responsibility, lack of breaks, lack of training, conflicts, underpromotion, job insecurity, lack of variety, and poor physical work conditions (limited space, temperature and lighting conditions).
Concept: Be-eep!
StressAnalytics: make people aware of their stress and stressors. Overview of stressors; exploration of relations; access to evidence, i.e. annotated, measured stress. Empowerment by awareness (+ implicit/explicit advice).
Our approach to StressAnalytics: What, When, Where, with Whom; physiological signs; pattern mining; OLAP cube.
Our approach to Stress Analytics. Make a person aware of what is happening: how they spend their time, and when and from where the stress comes in. Provide valuable input for pattern mining/knowledge discovery: much richer data sources. Visual analytics: interactive exploration of stress-related data. Collecting subjective data/labels from a person through the interaction.
GUI: exploration, interaction, visual analytics. OLAP: zoom in & out, slice & dice. Pattern mining, prediction, query by example. Data mining: feature extraction, peak/change detection, classification. Raw data, objective evidence: external environment (temperature, lighting, noise, air conditioning); external user-related data (KPI, e-mail, calendar, social media, news); physiological signs (GSR, temperature, voice, heart rate, facial expressions).
Evidence: physiological signals & external sources: GSR, temperature, speech, facial expressions, sentiment in text.
Alignment of Information Sources. What a person reads and writes: SentiCorr. What a person does in general according to the agenda. Environment context (lighting, noise, temperature, etc.). Annotated data from video, sound, text processing, and vital signs. What a person does with the computer: http://wakoopa.com/. Different aspects with pre-processing, storing, managing.
Stress Data Cube/OLAP: quick data summaries w.r.t. predefined dimensions.
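A quick summary over predefined dimensions can be sketched as a roll-up (aggregate away some dimensions) and a slice (fix one dimension to a value). The dimension names (day, activity, location), the stress measure and the sample rows below are hypothetical illustrations, not the schema used in the project.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical fact rows: (day, activity, location, stress_level).
DIMENSIONS = ("day", "activity", "location")
FACTS = [
    ("Mon", "meeting", "office", 0.8),
    ("Mon", "e-mail", "office", 0.4),
    ("Tue", "meeting", "office", 0.6),
    ("Tue", "e-mail", "home", 0.2),
]


def roll_up(facts, keep):
    """Average the measure over every dimension not listed in `keep`."""
    idx = [DIMENSIONS.index(d) for d in keep]
    groups = defaultdict(list)
    for row in facts:
        groups[tuple(row[i] for i in idx)].append(row[-1])
    return {key: mean(values) for key, values in groups.items()}


def slice_cube(facts, dimension, value):
    """Keep only the rows where `dimension` equals `value`."""
    i = DIMENSIONS.index(dimension)
    return [row for row in facts if row[i] == value]


by_activity = roll_up(FACTS, ["activity"])
monday_only = slice_cube(FACTS, "day", "Mon")
```

Zooming in corresponds to calling `roll_up` with more dimensions in `keep`; zooming out corresponds to fewer.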
Stress Analytics Visualization. OLAP-style exploration: selecting multiple dimensions, zoom in, zoom out. Navigating to the evidence, i.e. raw data: GSR, skin temperature, speech, and e-mail. Shape-based time series similarity search: state-of-the-art UCR Suite (Keogh et al.). Demo: http://www.win.tue.nl:8080/saw_analytics/stress_visualization.jsp
OLAP system, a Star Schema
Shape-Based Query by Example. Query: a subsequence s of a GSR time series. Result: time series subsequences with a shape similar to s.
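As a rough illustration of query by example, the sketch below slides the query over a longer series and returns the best-matching position under z-normalized Euclidean distance. This is a brute-force stand-in, not the UCR Suite; the function names and the sample data are assumptions made for the example.

```python
import math


def z_normalize(seq):
    """Rescale to zero mean and unit variance so only the shape matters."""
    m = sum(seq) / len(seq)
    sd = math.sqrt(sum((x - m) ** 2 for x in seq) / len(seq))
    if sd == 0:
        return [0.0] * len(seq)  # flat window: no shape information
    return [(x - m) / sd for x in seq]


def best_match(series, query):
    """Return (start_index, distance) of the window most similar to `query`."""
    q = z_normalize(query)
    best_pos, best_dist = -1, math.inf
    for start in range(len(series) - len(query) + 1):
        window = z_normalize(series[start:start + len(query)])
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(window, q)))
        if dist < best_dist:
            best_pos, best_dist = start, dist
    return best_pos, best_dist


# A small peak and a larger peak of the same shape; both match the query shape.
series = [0, 0, 0, 1, 3, 1, 0, 0, 0, 2, 6, 2, 0]
position, distance = best_match(series, [1, 3, 1])
```

Because of the z-normalization, a peak of twice the height matches just as well as the query itself, which is the point of searching by shape rather than by raw value.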
Shape-based QBE: Euclidean distance; Dynamic Time Warping (DTW); state-of-the-art UCR Suite (Keogh et al.).
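The difference between the two distances is that DTW lets one series stretch or compress in time before points are compared. A minimal textbook sketch of unconstrained DTW follows (quadratic time, no warping window or lower bounds, so far slower than the UCR Suite); the function name is an assumption.

```python
import math


def dtw_distance(a, b):
    """Dynamic time warping distance with squared point-wise cost."""
    n, m = len(a), len(b)
    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            cost[i][j] = d + min(cost[i - 1][j],      # a[i-1] repeats
                                 cost[i][j - 1],      # b[j-1] repeats
                                 cost[i - 1][j - 1])  # both advance
    return math.sqrt(cost[n][m])


# The second series is the first with one sample held longer.
same_shape = dtw_distance([0, 1, 2, 1, 0], [0, 1, 1, 2, 1, 0])
```

Euclidean distance cannot even be computed for these two series because their lengths differ, while DTW reports them as identical in shape.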
How to measure stress: determine the stress level based on observed sweat production.
Detection and Categorization of Stress: based on GSR data alone, not as easy as the following figure may suggest.
Challenges in Stress Detection. All kinds of noise, e.g. losing contact with the skin. Activity (exercising), environment (cold/hot) context and personal differences may impact the GSR we observe.
Interpretation isn't straightforward
Detection as Classification. GSR features: mean, SD, min and max of GSR; mean, min and max of peak height; total number of GSR responses; sum of GSR amplitudes; sum of response rising times; sum of response energies.
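A hedged sketch of how most of these features could be computed for one GSR window is given below. The slide does not define a "response", so this example assumes it is a rise from a local minimum (onset) to the next local maximum (peak) of at least `min_amplitude`, and treats peak height as that rise. The energy feature is left out because its definition is not given. All names and the threshold are hypothetical.

```python
from statistics import mean, pstdev


def find_responses(gsr, min_amplitude=0.05):
    """Return (onset_index, peak_index) pairs for each sufficiently large rise."""
    responses, onset, rising = [], 0, False
    for i in range(1, len(gsr)):
        if gsr[i] > gsr[i - 1]:
            if not rising:
                onset, rising = i - 1, True
        elif gsr[i] < gsr[i - 1] and rising:
            if gsr[i - 1] - gsr[onset] >= min_amplitude:
                responses.append((onset, i - 1))
            rising = False
    if rising and gsr[-1] - gsr[onset] >= min_amplitude:
        responses.append((onset, len(gsr) - 1))  # window ends while still rising
    return responses


def gsr_features(gsr, sample_rate_hz=1.0, min_amplitude=0.05):
    """Summary statistics plus response-based features for one window."""
    responses = find_responses(gsr, min_amplitude)
    heights = [gsr[p] - gsr[o] for o, p in responses]
    rise_times = [(p - o) / sample_rate_hz for o, p in responses]
    return {
        "mean": mean(gsr), "sd": pstdev(gsr), "min": min(gsr), "max": max(gsr),
        "peak_height_mean": mean(heights) if heights else 0.0,
        "peak_height_min": min(heights) if heights else 0.0,
        "peak_height_max": max(heights) if heights else 0.0,
        "n_responses": len(responses),
        "amplitude_sum": sum(heights),
        "rise_time_sum": sum(rise_times),
    }


features = gsr_features([1.0, 1.0, 1.5, 2.0, 1.2, 1.1, 1.6, 1.3])
```

The resulting dictionary is one feature vector; computing it per window yields the instances fed to a classifier.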
Adding more data to disambiguate: skin and room temperature, noise, accelerometer, voice, face, ...
E.g. activity recognition can help: writing vs. typing vs. walking vs. teaching vs. ... Analyzing accelerometer data only (wrist band).
Uncontrolled and semi-controlled: Philips Research employees wearing the device during their working hours; students taking written and multiple-choice exams; students presenting demos/posters with course project results. More to come via HumanCapitalCare.
Experiment demo
Measuring GSR in (un)controlled settings: Philips prototype; self-made, with the LEGO Mindstorms NXT.
Multi-Source Affective Data Classification: stress/emotion classification from text, GSR & speech; facial expression analysis; GSR & other sensors.
Automatic Stress Detection. [Figure: four pipelines. Speech model: speech → speech features → classification. GSR model: GSR → GSR features → classification. Feature enrichment: speech features and GSR features → combine features → classification. Ensemble learning: speech features and GSR features → separate classification → ensemble.]
Stress and Skin Conductance. Stress → changes in the Autonomic Nervous System (ANS) → activation of sweat glands → changes in the amount of sweat produced and in skin conductance. Relaxed: skin is drier, skin conductance is lower. Stressed: sweat increases, skin conductance is higher.
GSR features: mean, SD, min and max of GSR; mean, min and max of peak height; total number of GSR responses; sum of GSR amplitudes; sum of response rising times; sum of response energies.
Change detection approach: online settings.
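In an online setting the detector sees one sample at a time and must decide immediately. The sketch below is a simple thresholding detector against an exponentially smoothed baseline; it is an assumption about what such a detector might look like, and the parameter values are hypothetical.

```python
def detect_changes(stream, threshold=0.5, alpha=0.1):
    """Yield the index of every sample that departs from the running baseline.

    threshold: absolute deviation that counts as a change (hypothetical value).
    alpha: smoothing factor of the exponential moving average baseline.
    """
    baseline = None
    for i, x in enumerate(stream):
        if baseline is None:
            baseline = x  # first sample initializes the baseline
            continue
        if abs(x - baseline) > threshold:
            yield i
            baseline = x  # restart the baseline at the new level
        else:
            baseline = (1 - alpha) * baseline + alpha * x


# A level shift from 1.0 to 2.0 at index 5.
change_points = list(detect_changes([1.0] * 5 + [2.0] * 5))
```

Because it is a generator, the same function works on a live sensor feed as well as on a recorded list.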
Preprocessing steps
Stress and Speech. Stress: respiration rate increases; increased pitch; increased subglottal pressure. Voice is a good indicator of stress [Scherer, 1986].
Speech Features Voiced and unvoiced speech
Speech Features Pitch / Fundamental frequency
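One common way to estimate the fundamental frequency of a voiced frame is to find the lag at which the frame best correlates with a shifted copy of itself. The sketch below assumes this autocorrelation method; the slides do not say which pitch tracker was used, and the search range of 75 to 400 Hz is a hypothetical choice.

```python
import math


def estimate_pitch(frame, sample_rate, fmin=75.0, fmax=400.0):
    """Return the estimated fundamental frequency in Hz, or None if no lag correlates."""
    min_lag = max(1, int(sample_rate / fmax))
    max_lag = min(int(sample_rate / fmin), len(frame) - 1)
    best_lag, best_score = None, 0.0
    for lag in range(min_lag, max_lag + 1):
        # Unnormalized autocorrelation at this lag.
        score = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else None


# Synthetic 200 Hz tone, 50 ms at 8 kHz.
tone = [math.sin(2 * math.pi * 200 * t / 8000) for t in range(400)]
pitch = estimate_pitch(tone, 8000)
```

Returning None when no lag correlates positively gives a crude voiced/unvoiced decision for free.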
Speech Features. Mel-Frequency Cepstral Coefficients (MFCCs) are coefficients that approximate the human auditory response. Pipeline: audio (temporal) → FFT → frequency → Mel-scale filter → filtered frequency → log power → DCT → MFCCs; store the first coefficients of the DCT representation.
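The MFCC pipeline can be sketched for a single frame using only the standard library. This is an illustrative simplification: it uses a naive DFT instead of an FFT, omits windowing and pre-emphasis, and the filter count and coefficient count are hypothetical defaults. The Hz-to-Mel formula is the widely used 2595 log10(1 + f/700) variant.

```python
import cmath
import math


def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Naive O(n^2) DFT power spectrum for bins 0..n/2."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame))) ** 2 / n
            for k in range(n // 2 + 1)]


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters with centers equally spaced on the Mel scale."""
    top = hz_to_mel(sample_rate / 2)
    mels = [top * i / (n_filters + 1) for i in range(n_filters + 2)]
    bins = [int((n_fft + 1) * mel_to_hz(m) / sample_rate) for m in mels]
    bank = []
    for j in range(1, n_filters + 1):
        left, center, right = bins[j - 1], bins[j], bins[j + 1]
        filt = [0.0] * (n_fft // 2 + 1)
        for k in range(left, center):
            filt[k] = (k - left) / (center - left)    # rising edge
        for k in range(center, right):
            filt[k] = (right - k) / (right - center)  # falling edge
        bank.append(filt)
    return bank


def mfcc(frame, sample_rate, n_filters=20, n_coeffs=12):
    """Power spectrum -> Mel filterbank -> log -> DCT-II, keep first coefficients."""
    spec = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    # Small constant avoids log(0) for empty filters.
    log_e = [math.log(sum(f * p for f, p in zip(filt, spec)) + 1e-10)
             for filt in bank]
    return [sum(e * math.cos(math.pi * c * (j + 0.5) / n_filters)
                for j, e in enumerate(log_e))
            for c in range(n_coeffs)]


frame = [math.sin(2 * math.pi * 440 * t / 8000) for t in range(128)]
coeffs = mfcc(frame, 8000)
```

Keeping only the first coefficients discards fine spectral detail and retains the overall spectral envelope, which is what the "store the first coefficients" step refers to.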
Classification Methods. Support Vector Machine (SVM): state of the art. Decision tree classifier. K-means using Vector Quantization (VQ): chosen as a baseline. Gaussian Mixture Model (GMM): works well for the speaker recognition task. Change detectors: ADWIN, thresholding.
Stress Dataset: three types of GSR patterns. [Figure: first type, second type, third type]
Aligning of data sources: GSR and speech are each split into aligned 60-second instances (Instance 1, Instance 2, Instance 3, ...).
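The 60-second alignment can be sketched as cutting both streams into fixed-length windows and pairing windows with the same index. The sketch assumes both recordings start at the same moment and that incomplete trailing windows are dropped; the function names and sample rates are hypothetical.

```python
def split_into_instances(samples, sample_rate_hz, window_s=60):
    """Cut a stream into consecutive, non-overlapping windows of `window_s` seconds."""
    size = int(sample_rate_hz * window_s)
    return [samples[i:i + size]
            for i in range(0, len(samples) - size + 1, size)]


def align(gsr, gsr_rate_hz, speech, speech_rate_hz, window_s=60):
    """Pair the i-th GSR window with the i-th speech window."""
    gsr_windows = split_into_instances(gsr, gsr_rate_hz, window_s)
    speech_windows = split_into_instances(speech, speech_rate_hz, window_s)
    # zip stops at the shorter list, so unmatched windows are dropped.
    return list(zip(gsr_windows, speech_windows))


# Three minutes of GSR at 1 Hz and slightly more of speech at 8 Hz (toy rates).
instances = align(list(range(180)), 1, list(range(1500)), 8)
```

Each pair can then be turned into one GSR feature vector and one speech feature vector describing the same minute.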
Stress Dataset: Speech Features
Stress Model using GSR features: 10-times 10-fold CV (not subject independent). [Figure: bar chart of accuracy (percent) for k-means, GMM, SVM and decision tree on three tasks: recovery vs. workloads, recovery vs. heavy workload, light vs. heavy workload; bar values range from 46.12 to 80.72.] SVM outperformed the other methods. Distinguishing light from heavy workload is harder than distinguishing recovery from heavy workload.
Stress Model using speech features. [Figure: bar chart of accuracy (percent) for k-means, GMM, SVM and decision tree on four feature sets: pitch, MFCC, MFCC with pitch, RASTA; bar values range from 49.17 to 92.56.] SVM outperforms the other classifiers. K-means and GMM do not perform well for speech. MFCC is a good indicator for stress detection.
1-subject-leave-out cross-validation (subject-independent model). [Figure: two bar charts of accuracy (percent) comparing 10-times 10-fold CV with 1-subject-leave-out CV; one panel for GSR tasks (recovery vs. workloads, recovery vs. heavy workload, light vs. heavy workload), one for speech features (pitch, MFCC, MFCC with pitch, RASTA PLP).] It is better to address the problem of stress detection using a subject-dependent model.
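The subject-independent protocol holds out all instances of one subject per fold, so the model is never tested on a person it was trained on. A minimal sketch of the split, with hypothetical subject identifiers:

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices) for each subject."""
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test


# One subject id per instance in the dataset.
folds = list(leave_one_subject_out(["a", "a", "b", "c", "c"]))
```

In ordinary 10-fold CV, instances of the same subject can land in both training and test sets, which is why that setting is described as not subject independent.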
Fusion Approaches: feature enrichment; ensemble learning.
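The two fusion approaches differ in where the modalities meet: feature enrichment joins the feature vectors before a single classifier, while ensemble learning joins the outputs of per-modality classifiers. The sketch below illustrates the distinction only. For the ensemble it uses a weighted average of scores as a stand-in; the experiments in this talk use logistic regression as the meta-learner instead. All names, weights and the threshold are hypothetical.

```python
def enrich_features(speech_features, gsr_features):
    """Feature enrichment: one joint vector for a single classifier."""
    return list(speech_features) + list(gsr_features)


def ensemble_predict(speech_score, gsr_score, weights=(0.5, 0.5), threshold=0.5):
    """Ensemble stand-in: weighted average of per-modality stress scores in [0, 1]."""
    combined = weights[0] * speech_score + weights[1] * gsr_score
    return combined >= threshold


joint_vector = enrich_features([0.1, 0.2], [0.3])
is_stressed = ensemble_predict(0.9, 0.3)
```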
Fusion of GSR and Speech (light vs. heavy workload, balanced data). [Figure: bar chart of accuracy (percent), two bars per feature combination (enriching feature space; logistic regression as meta-learner). MFCC and GSR: 90.73, 92.43; MFCC, pitch and GSR: 91.34, 92.47; pitch and GSR: 69.04, 70.17.]
Kappa Agreement for Classifiers. Measure agreement between two models using Cohen's kappa. Kappa = 1: complete agreement. Kappa = 0: agreement no better than chance (negative values indicate systematic disagreement).
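Cohen's kappa compares the observed agreement p_o between two label sequences with the agreement p_e expected by chance from their label frequencies: kappa = (p_o - p_e) / (1 - p_e). A minimal sketch, with a hypothetical function name and toy predictions:

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two equally long label sequences."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement from the marginal label frequencies of each model.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both models always output the same single label
    return (observed - expected) / (1 - expected)


speech_pred = [0, 0, 1, 1]
gsr_pred = [0, 1, 0, 1]
kappa = cohens_kappa(speech_pred, gsr_pred)
```

A value near zero for two classifiers means their errors are largely unrelated, which is the kind of diversity an ensemble can exploit.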
Stress detection summary. Speech is more reliable (in lab settings) than GSR, but more subject-dependent. SVM performs better on both GSR and speech signals. ADWIN and thresholding detectors do well on GSR. Combining GSR and speech is not trivial: speech and GSR predictions are highly independent (low kappa value); this diversity may be exploited with dynamic integration methods.
Further directions. Extend the notion of stress (positive and negative) in the stress analytics framework: stress analytics → affective data analytics. Collect more data to enable the OLAP and KDD parts of the framework. Combine with other signals, such as facial expression, heart rate, nutrition. Long path from lab setting to real-life situation, but both are needed.
Is Acute Stress Good or Bad?
What Is Relaxation Then?
Is Normal Condition Good or Bad? What if someone's pattern looks like NNNNNNNNNNNNNNNN?
Summary. The fun parts come from: the fact that not much is known about stress; playing with heterogeneous/multi-modal data; being multi-disciplinary (data collection, data management, data mining, visual analytics); an engineering approach to data mining. How to show the utility, i.e. that what we do helps to better understand stress as a phenomenon and the stressors, and how it helps people in the end.
Take-home messages. Lab settings vs. real world. Availability and quality of the signal: voice recorded; someone else's voice recorded; noise and missing data, uncertainty; a person cannot speak (during a meeting while someone else is speaking). Ground truth, labels, subjective vs. objective. A large problem space. If you know how to help us with any part of StressAnalytics, talk to me.