How To Find Out If You Are Stressed
|
|
|
- Heather Rich
- 5 years ago
- Views:
Transcription
1 Stress Detection from Speech and Galvanic Skin esponse Signals Hindra Kurniawan 1, lexandr V. Maslov 1,2, Mykola Pechenizkiy 1 1 Department of Computer Science, TU Eindhoven, the Netherlands [email protected], [email protected] 2 Department of MIT, University of Jyväskylä, Finland [email protected] bstract The problem of stress-management has been receiving an increasing attention in related research communities due to a wider recognition of potential problems caused by chronic stress and due to the recent developments of technologies providing non-intrusive ways of collecting continuously objective measurements to monitor person s stress level. Experimental studies have shown already that stress level can be judged based on the analysis of Galvanic Skin esponse () and signals. In this paper we investigate how techniques can be used to automatically determine periods of acute stress relying on information contained in and/or of a person. 1 Introduction Chronic stress has become a serious problem affecting different life situations and carrying a wide range of health-related diseases, including cardiovascular disease, cerebrovascular disease, diabetes, and immune deficiencies [5]. In health-care and well-being research the problem of stress-management has been receiving an increasing attention [1, 2, 8]. Technologies that automatically recognize stress can become a powerful tool to motivate people to adjust their behavior and lifestyle to achieve a better stress balance. Technology such as sensors can be used for obtaining an objective measurement of stress level, which afterwards, machine learning technique is employed to build a model for stress detection and recognition. Change detection [3] and [16] are two general types of approaches allowing to indicate whether person was stressed over a particular time period. In this paper we study how techniques can be used to automatically determine periods of acute stress in the offline settings. For this purpose we constructed a benchmark dataset by collecting and voice recordings from ten people during a controlled study. Then we trained classifiers using feature extracted from and/or signals. The results of our experiments show that for the constructed benchmark 1) SVM classifiers reach 92% stress detection generalization accuracy, 2) extracted from have higher predictive power than but at the same time are more person dependent, i.e. they do not allow classifiers to generalize well to those people for whom no data were available during the model training phase, and 3) combining information from both signals together is not trivial, i.e. learning a classifier or an ensemble of classifiers that use extracted from and from does not improve the generalization accuracy significantly. The rest of this chapter is organized as follows: we review the previous studies on stress detection in and physiological signals (Section 2), discuss the benchmark dataset construction, feature engineering from and signals, and approaches for stress detection (Section 3), summarize the main results on the constructed benchmark (Section 4). 2 The problem of stress detection The concept of stress remains elusive, very broad and used differently in a number of domains. Two models which are commonly used today come from Selye and Lazarus. Selye [17] proposed General daptation Syndrome (GS) model, which identifies various stages of stress response. Selye argues that stress is a disruption of homeostatis by physical (noise, excessive heat or cold) or psychological (extreme emotion, frustration or sleep deprivation) stimuli which alters the internal equilibrium of the body which causes a stress response. Lazarus s model [12] emphasises on two central concepts: appraisal, i.e. individual s evaluation and interpretation of their circumstances, and coping, i.e. individual s efforts to handle and solve the situation. Lazarus argues that neither the stressor nor one s response is sufficient for defining stress, rather it would be one s perception and appraisal of the stressor that determines if it creates stress.
2 We focus on how the stress shows itself in and physiological signals. Stress detection in. The effect of stress in relation to production has been well studied over the past decades. espiration has been found to correlate with certain emotional situations. When an individual experiences a stressful situation, his respiration rate increases. This presumably will increase subglottal pressure during, which is known to increase pitch (or fundamental frequency) during voiced section [14]. Moreover, an increase in the respiration rate causes a shorter duration of between breaths which, in turn, affects the articulation rate. nother controversial model for detecting stress in is Voice Stress nalysis [23], i.e. measures fluctuations in the physiological microtremors present in every muscle in the body, including vocal cords. Stress detection from physiological signals. utonomic Nervous Systems (NS) of a human controls the organs of our body such as the heart, stomach and intestines. NS can be divided into two divisions, sympathetic and parasympathetic nervous systems. The parasympathetic nervous system is responsible for nourishing, calming the nerves to return to the regular function, healing, and regeneration. On the contrary, the sympathetic nervous system is accountable for activating the glands and organs for defending the body from the threat. The activation of sympathetic nervous system might be accompanied by many bodily reactions, such as an increase in the heart rate, rapid blood flow to the muscle, activation of sweat glands, and increase in the respiration rate. These physiological changes, such as the electrical activity on the scalp, Blood Pressure (BP), Blood Volume Pulse (BVP), Galvanic Skin esponse () etc., can be measured objectively by using modern technology sensors. These psychological variables can be monitored in noninvasive ways and have been investigated extensively over the past decades. 3 Experimental study Since there is no publicly available benchmark for studying stress detection from multi-modal sensor measurements we discuss how we constructed one collecting data in controlled settings. Then we discuss the feature extraction from and on and signals. 3.1 Benchmark construction Psychological stress elicitation. There have been numerous methods proposed for stress elicitation studies, including the Stroop Color-Word Interference Test [18], the Trier Social Stress Test (TSST) [11] and Trier Mental Challenge Test (TMCT) that we used for benchmark construction. Stroop found that it took a longer time to read the words printed in a different color than the same words printed in black. This task has been widely utilized as a cognitive stressor able to induce a heightened level of physiological arousal. TSST and TMCT correspond to the public speaking and mental arithmetic stressors that induce considerable changes in the concentration of adrenocorticotropin (CTH), cortisol, prolactin and heart-rate in independent studies. Data Collection. During the experiment, several signals were recorded including, facial expression, and skin conductance. The was sampled at a sampling rate of 44,100 Hz by using two channels. Facial expression was recorded using Handycam Camcorders with High Definition (HD) resolution at 1, 440 1, 080 pixels. To make a sensor measuring the changes in skin conductance we used the LEGO Mindstorms NXT 1 and an CX wire connector sensor, which converts the analog reading to digital raw values in the range of 0 to 1,023 (Figure 1). The measurements were sampled with 2Hz frequency by using LEGO NXT. Next, the raw value was sent in real time to the computer by means of Bluetooth s connection. We used LEJOS - Java for LEGO Mindstorms 2 open source framework for handling this connection. CX connector Velcro (a) luminum Foil Copper wires Figure 1. (a) The modified CX wire connector sensor; (b) device based on LEGO NXT Mindstorms. We attempted to eliminate the effect of known artifacts into the measurement. Thus, the room temperature where ten subjects participated in the experiment one by one was made constant by means of an air conditioner; no type of physical stressor like noise was present; was collected from the right hand, second phalanx of index and middle fingers. Subjects were of similar age, BMI and were healthy (b)
3 The stress experiment lasted for approximately one hour and consisted of three sessions, including baseline (filling the questionnaire and relaxation for 10 minutes), light workload (Stroop-Word congruent color test, an easy mental arithmetic test, and an easy mental subtraction test) and heavy workload triggering stress (Stroop-Word incongruent color test, a hard mental arithmetic test, and a hard mental subtraction test). 3.2 Feature Engineering Speech. There are different ways to capture information contained in. We considered several subsets including the smoothed energy computed using the overlapping time frames (Figure 2), voiced and unvoiced (Figure 3), pitch (i.e. the fundamental frequency of the signal), Mel Frequency Cepstral Coefficients (MFCCs) that represent human audio perception (and are the most widely used spectral representation of in many applications, including and speaker recognition [7] and emotion recognition), and elative Spectral Transform - Perceptual Linear Perception (ST-PLP) [10] that were shown to be robust to static noise possibly contained in the signal. We used VOICEBOX 3 processing toolbox containing the obust lgorithm for Pitch Tracking (PT), proposed by Talkin [19] for computing pitch (and shown to perform better than simple auto-correlation function), and MFCCs extractor and Matlab toolbox for ST-PLP 4. (a) (b) (c) Figure 3. (a) Speech signal of the utterance bookstore, (b) Waveform of the voiced sound (red region in (a)), and (c) Waveform of the unvoiced sound (green region in (a)).. In general, has a typical startle response (Fig. 4), which is a fast change of the signal in response to a sudden stimulus, characterized by the amplitude and rising time of the signal. (a) (b) Figure 2. (a) Speech signal (utterance of bookstore ), (b) Energy of each sample, and (c) verage energy for each frame (c) Figure 4. Four startle responses. The peak which is detected by the algorithm is marked with x and the onset is marked with o. The amplitude and rising time of the response are denoted by and respectively. Boucsein has demonstrated that skin conductance is subject to inter-person variability, with differences in age, gender, ethnicity, and hormonal cycle contributing to individual differences [4]. Due to these differences; we normalized the skin conductance signals by subtracting the baseline (values
4 of the when the user is supposed to be relaxed) minimum and dividing by baseline range [20]. We considered mean, minimum, maximum and standard deviation of skin conductance and peak height; and the total number and the cumulative amplitude, rising time and energy of startle responses in segment. These were found useful in earlier studies [9]. We performed automatic detection of responses using ED toolbox Classification Methods We experimented with four different techniques; K- Means using vector quantization as a baseline, decision tree, Gaussian Mixture Model (GMM) [15], which has been shown to work well for speaker recognition, and the Support Vector Machine (LibSVM tool [6]; BF kernel with cross-validation to choose the best-performing parameters), as a popular general purpose state-of-art technique. We also used logistic regression method for combining the result from two different models (i.e. and ) into one single regression value. The rationale behind choosing this method is that it is simple, and has been shown to work well for combining two different models in stress detection tasks [13]. Logistic regression requires that the and data are available at the same time instance and have been aligned with each other. We segmented the sensor data streams using a oneminute non-overlapping window. and data were aligned manually in offline settings. Then, the instances which only contained the voiced sound of the subject were stored. We have compared three binary tasks: distinguishing recovery vs. workloads (light and heavy), recovery vs. heavy workload and light workload vs. heavy workload. There are 100 instances of recovery class, 110 instances of light workload and 120 instances of heavy workloads (thus there are = 230 instances of workload). The corresponding baseline majority class accuracies are 230/340 or 67.6%, 120/220 or 54.5%, and 120/230 or 52.2%. 4 Classification results First we present subject dependent results (the data from the same subject can be present both in the training and testing sets). Then we also provide the comparative results with 1-subject-leave-out cross-validation to show the corresponding performance for the subject independent case. 5 Table 1 depicts the average classifier accuracies using 10-times 10-fold cross-validation. GMM, SVM and decision tree method outperformed the baseline (K-Means). The SVM outperforms the other classifiers and can reach accuracy around 80.72% for classifying recovery versus heavy workload session. s expected, differentiating the stress level in the light vs. heavy workload setting is harder than in the recovery vs. heavy workload setting. Table 1. accuracies. K-Means GMM SVM DTree ecov-work 46.1± ± ± ±1.3 ecov-heavy 55.5± ± ± ±1.3 Light-heavy 53.2± ± ± ±1.8 Speech. We consider here on light vs. heavy workload (110 positive and 110 negative instances) results as the most difficult case. The which are investigated include pitch, MFCC, MFCC-Pitch and ST PLP. total of 12 pitch are used, including mean, minimum, maximum, median, standard deviation, range (maximum - minimum) of pitch and its first derivation. We used 144 MFCC including mean, variance, minimum and maximum of the first 12 cepstral coefficients (excluding the 0-th coefficient), delta coefficients (the first derivative of coefficients) and delta-delta coefficient (the second derivative of coefficients). The feature MFCC-Pitch represents the direct concatenation of MFCC and pitch. In total, 108 ST PLP were used including mean, variance, minimum and maximum of ST PLP coefficients, and first two derivatives. The experimental results using these with 10-times 10-fold cross-validation are depicted in Table 2. Table 2. Speech accuracies. K-Means GMM SVM DTree Pitch 49.7± ± ± ±2.8 MFCC 55.4± ± ± ±3.1 MFCC-Pitch 49.2± ± ± ±1.3 ST PLP 50.6± ± ± ±3.0 K-means results in the lowest accuracy and is unsuitable for stress detection. The GMM classifier outperforms K- Means (baseline) with insignificant differences. In general, the GMM accuracies using are lower than using. In contrast, the SVM method, which is not based on the distribution fit technique, gives a higher accuracy in this setting. This is most likely due to the high dimensionality of (e.g. up to 144 dimensions for MFCC) which makes the Gaussian distribution sparse, thus, making the Expectation-Maximization algorithm fail
5 ccuracy (percent) ccuracy (percent) utomatic Stress Detection combine ensemble a) model b) model c) feature enrichment d) ensemble learning Figure 5. Experimental comparison of four stress approaches. to cluster and fit the data. The decision tree classifier, despite its simplicity, performs relative quite well in this setting. SVMs trained on MFCC and MFCC result in highest accuracy above 92%. Pitch alone is not a good indicator for stress. Fusion of and. There are many approaches for combining the and. One of them is by using feature enrichment. Both from and are combined together. fterwards, the classifier is utilized to predict the class output. nother approach is by using ensemble learning such as the logistic regression technique. First, an individual model is built separately. fterwards, the result of both models is combined is some way. Figure 5 illustrates four stress approaches that we compared experimentally. We used logistic regression to combine two individual and classifiers. In these experiments we used SVM classifiers as they shown the most promising performance. The corresponding accuracy results are shown in Table 3. Table 3. Fusion of and. Features fusion Models fusion MFCC and 90.73± ±0.77 MFCC-Pitch and 91.34± ±1.37 Pitch and 69.04± ±2.36 The logistic regression slightly (and not statistically significantly) outperforms feature enrichment yet does not exceed the accuracy of SVM based on signal alone. Thus, two straightforward methods for combining and signals seem not to provide any advantage over individual models. Nevertheless, high values for Cohen s Kappa disagreement statistic computed for the and classifiers suggest that and signals can complement each other. Thus, methods such as dynamic integration of classifiers that can better utilize the diversity of individual models [21] and that consider evolving settings [22] may give a more promising outcome. Subject (in)dependent models. The 1-subject-leave-out cross-validation approach was studied to evaluate the model performance for the subject independent case. Figure 6 shows a comparison of 10-times 10-fold crossvalidation accuracy results against 1-subject-leave-out cross-validation. It is obvious from this graph that the accuracies of the classifiers drop (dramatically in case of using ) when using the subject independent model. Hence, if possible, it is better to address the stress problem as a subject dependent model ecovery vs workloads ecovery vs heavy workload Light vs heavy workload Tasks times 10-fold CV 1-Subject-Leave-Out CV Pitch MFCC MFCC-Pitch ST PLP Speech Features Figure times 10x10-fold 10-fold CV 1-subject-leave-out CV (left bar) CV and 1- subject-leave-out CV (right bar) accuracies for (top) and (bottom) signals.
6 5 Conclusion Modern sensor technologies enable the objective measurement of stress. We investigated different models for detecting stress using four classifiers: K-means, GMM, SVM and decision tree. The SVM outperformed the other classifiers on the considered benchmark reaching 92% accuracy by using only. Speech is indeed a good indicator for determining stress in controlled setting. However, our experiments show that we need to have some labeled data from a person for whom we make predictions; otherwise, accuracies may drop substantially. ccuracies of classifiers trained on were much lower, at most 70% for differentiating between light and heavy workload. Furthermore, it has been demonstrated in [3] that the signal is varied not only from person to person but also e.g. from one day to another for the same person. Therefore, including other measurements (instead of relying only on -based stress detection) is needed for obtaining more reliable performance. Unfortunately, straightforward approaches for combining both and signals using feature enrichment or ensemble learning with logistic regression did not improve the performance significantly. Further research in this direction is needed. The results presented in this work correspond to the data collected in controlled laboratory settings. Detecting stress levels in the real-life and/or online settings will be a more challenging and more difficult task. There is no guarantee that the signals will be free of noise and artifacts. For instance, the may contain background noise or voices of other people, while the signals may contain artifacts owing to the exact placement of the sensor and the physical activity. lso, the and may not be available at the same time. Hence one of the directions of further work is in designing mechanisms that can recognize and address all of such practical issues. eferences [1] Y. yzenberg, J. H. ivera, and. Picard. Feel: frequent eda and event logging a mobile social interaction stress monitoring system. In CHI 12 Extended bstracts on Human Factors in Computing Systems, pages , New York, NY, US, CM. [2] J. Bakker, L. Holenderski,. Kocielnik, M. Pechenizkiy, and N. Sidorova. Stess@work: From measuring stress to its understanding, prediction and handling with personalized coaching. In Proceedings of CM SIGHIT Int. Health Informatics Symposium, pages CM Press, [3] J. Bakker, M. Pechenizkiy, and N. Sidorova. What s your current stress level? Detection of stress patterns from sensor data. In Proceedings of ICDM Workshops., pages , [4] W. Boucsein. Electrodermal ctivity. The Springer series in behavioral psychophysiology and medicine. Springer, [5] J. Cacioppo, L. Tassinary, and G. Berntson. Handbook of Psychophysiology. Cambridge University Press, [6] C.-C. Chang and C.-J. Lin. LIBSVM: library for support vector machines. CM Transactions on Intelligent Systems and Technology, 2:27:1 27:27, [7] S. B. Davis and P. Mermelstein. eadings in recognition. chapter Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences, pages Morgan Kaufmann, [8] Y. Deng, D. Hsu, Z. Wu, and C.-H. Chu. Feature selection and combination for stress identification using correlation and diversity. In 12th Int. Symp. on Pervasive Systems, lgorithms and Networks (ISPN), pages 37 43, [9] J. Healey and. Picard. Detecting stress during real-world driving tasks using physiological sensors. IEEE Trans. on Intelligent Transportation Systems, 6(2): , [10] H. Hermansky, N. Morgan,. Bayya, and P. Kohn. astaplp analysis, [11] C. Kirschbaum, K. M. Pirke, and D. H. Hellhammer. The Trier Social Stress Test a tool for investigating psychobiological stress responses in a laboratory setting. Neuropsychobiology, 28:76 81, [12]. Lazarus and S. Folkman. Stress, ppraisal, and Coping. Springer Series. Springer Publishing Company, [13] I. Lefter, L. J. M. othkrantz, D.. van Leeuwen, and P. Wiggers. utomatic stress detection in emergency (telephone) calls. IJIDSS, 4(2): , [14] P. ajasekaran, G. Doddington, and J. Picone. ecognition of under stress and in noise. In coustics, Speech, and Signal Processing, IEEE International Conference on ICSSP 86., volume 11, pages , apr [15] D.. eynolds. Gaussian mixture models. pages , [16] J. H. ivera,. Morris, and. Picard. Call center stress recognition with person-specific models. In ffective Computing and Intelligent Interaction, volume 6974 of Lecture Notes in Computer Science, pages Springer, [17] H. Selye. Stress and The General daptation Syndrome. The British Medical Journal, 1(4667), June [18] J. Stroop. Interference in serial verbal reactions. J. Exp. Psychol, 18: , [19] D. Talkin. robust algorithm for pitch tracking (rapt). Speech coding and synthesis, 495: , [20] L. G. Tassinary. Inferring psychological significance from physiological signals. merican Psychologist, 45:16 28, [21]. Tsymbal, M. Pechenizkiy, and P. Cunningham. Diversity in search strategies for ensemble feature selection. Information Fusion, 6(1):83 98, [22]. Tsymbal, M. Pechenizkiy, P. Cunningham, and S. Puuronen. Dynamic integration of classifiers for handling concept drift. Information Fusion, Special Issue on pplications of Ensemble Methods, 9(1):56 68, [23] J. Z. Zhang, N. Mbitiru, P. C. Tay, and. D. dams. nalysis of stress in using adaptive empirical mode decomposition. In Proc. of 43rd silomar Conference on Signals, Systems and Computers, pages IEEE Press, 2009.
Multi Modal Affective Data Analytics
Multi Modal Affective Data Analytics Mykola Pechenizkiy SDAD 2012 @ ECMLPKDD2012 2 September 2012 Bristol, UK http://www.win.tue.nl/stressatwork Affective data Social media Social media leads to masses
Emotion Detection from Speech
Emotion Detection from Speech 1. Introduction Although emotion detection from speech is a relatively new field of research, it has many potential applications. In human-computer or human-human interaction
Predict the Popularity of YouTube Videos Using Early View Data
000 001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041 042 043 044 045 046 047 048 049 050
Managing, Mining and Visualizing Multi-Modal Data for Stress Awareness
Eindhoven University of Technology Department of Mathematics and Computer Science Managing, Mining and Visualizing Multi-Modal Data for Stress Awareness Master Thesis Hindra Kurniawan Supervisor: dr. Mykola
Predict Influencers in the Social Network
Predict Influencers in the Social Network Ruishan Liu, Yang Zhao and Liuyu Zhou Email: rliu2, yzhao2, [email protected] Department of Electrical Engineering, Stanford University Abstract Given two persons
Music Mood Classification
Music Mood Classification CS 229 Project Report Jose Padial Ashish Goel Introduction The aim of the project was to develop a music mood classifier. There are many categories of mood into which songs may
Developing an Isolated Word Recognition System in MATLAB
MATLAB Digest Developing an Isolated Word Recognition System in MATLAB By Daryl Ning Speech-recognition technology is embedded in voice-activated routing systems at customer call centres, voice dialling
Artificial Neural Network for Speech Recognition
Artificial Neural Network for Speech Recognition Austin Marshall March 3, 2005 2nd Annual Student Research Showcase Overview Presenting an Artificial Neural Network to recognize and classify speech Spoken
Image Normalization for Illumination Compensation in Facial Images
Image Normalization for Illumination Compensation in Facial Images by Martin D. Levine, Maulin R. Gandhi, Jisnu Bhattacharyya Department of Electrical & Computer Engineering & Center for Intelligent Machines
Establishing the Uniqueness of the Human Voice for Security Applications
Proceedings of Student/Faculty Research Day, CSIS, Pace University, May 7th, 2004 Establishing the Uniqueness of the Human Voice for Security Applications Naresh P. Trilok, Sung-Hyuk Cha, and Charles C.
BIOINF 585 Fall 2015 Machine Learning for Systems Biology & Clinical Informatics http://www.ccmb.med.umich.edu/node/1376
Course Director: Dr. Kayvan Najarian (DCM&B, [email protected]) Lectures: Labs: Mondays and Wednesdays 9:00 AM -10:30 AM Rm. 2065 Palmer Commons Bldg. Wednesdays 10:30 AM 11:30 AM (alternate weeks) Rm.
Automatic Evaluation Software for Contact Centre Agents voice Handling Performance
International Journal of Scientific and Research Publications, Volume 5, Issue 1, January 2015 1 Automatic Evaluation Software for Contact Centre Agents voice Handling Performance K.K.A. Nipuni N. Perera,
Beating the MLB Moneyline
Beating the MLB Moneyline Leland Chen [email protected] Andrew He [email protected] 1 Abstract Sports forecasting is a challenging task that has similarities to stock market prediction, requiring time-series
BLIND SOURCE SEPARATION OF SPEECH AND BACKGROUND MUSIC FOR IMPROVED SPEECH RECOGNITION
BLIND SOURCE SEPARATION OF SPEECH AND BACKGROUND MUSIC FOR IMPROVED SPEECH RECOGNITION P. Vanroose Katholieke Universiteit Leuven, div. ESAT/PSI Kasteelpark Arenberg 10, B 3001 Heverlee, Belgium [email protected]
Research on physiological signal processing
Research on physiological signal processing Prof. Tapio Seppänen Biosignal processing team Department of computer science and engineering University of Oulu Finland Tekes 10-12.9.2013 Research topics Research
AUTONOMIC NERVOUS SYSTEM AND HEART RATE VARIABILITY
AUTONOMIC NERVOUS SYSTEM AND HEART RATE VARIABILITY Introduction The autonomic nervous system regulates, among other things, the functions of the vascular system. The regulation is fast and involuntary.
L9: Cepstral analysis
L9: Cepstral analysis The cepstrum Homomorphic filtering The cepstrum and voicing/pitch detection Linear prediction cepstral coefficients Mel frequency cepstral coefficients This lecture is based on [Taylor,
Wireless Remote Monitoring System for ASTHMA Attack Detection and Classification
Department of Telecommunication Engineering Hijjawi Faculty for Engineering Technology Yarmouk University Wireless Remote Monitoring System for ASTHMA Attack Detection and Classification Prepared by Orobh
Speech Signal Processing: An Overview
Speech Signal Processing: An Overview S. R. M. Prasanna Department of Electronics and Electrical Engineering Indian Institute of Technology Guwahati December, 2012 Prasanna (EMST Lab, EEE, IITG) Speech
SPEAKER IDENTIFICATION FROM YOUTUBE OBTAINED DATA
SPEAKER IDENTIFICATION FROM YOUTUBE OBTAINED DATA Nitesh Kumar Chaudhary 1 and Shraddha Srivastav 2 1 Department of Electronics & Communication Engineering, LNMIIT, Jaipur, India 2 Bharti School Of Telecommunication,
Separation and Classification of Harmonic Sounds for Singing Voice Detection
Separation and Classification of Harmonic Sounds for Singing Voice Detection Martín Rocamora and Alvaro Pardo Institute of Electrical Engineering - School of Engineering Universidad de la República, Uruguay
Thirukkural - A Text-to-Speech Synthesis System
Thirukkural - A Text-to-Speech Synthesis System G. L. Jayavardhana Rama, A. G. Ramakrishnan, M Vijay Venkatesh, R. Murali Shankar Department of Electrical Engg, Indian Institute of Science, Bangalore 560012,
Technical Note Series
Technical Note Series SKIN CONDUCTANCE SENSOR (SA9309M) S TN0 0 0 8-0 0 S k i n C o n d u c t a n c e S e n s o r Page 2 IMPORTANT OPERATION INFORMATION WARNING Type BF Equipment Internally powered equipment
A TOOL FOR TEACHING LINEAR PREDICTIVE CODING
A TOOL FOR TEACHING LINEAR PREDICTIVE CODING Branislav Gerazov 1, Venceslav Kafedziski 2, Goce Shutinoski 1 1) Department of Electronics, 2) Department of Telecommunications Faculty of Electrical Engineering
Call Center Stress Recognition with Person-Specific Models
Call Center Stress Recognition with Person-Specific Models Javier Hernandez, Rob R. Morris, and Rosalind W. Picard Media Lab, Massachusetts Institute of Technology, Cambridge, USA {javierhr,rmorris,picard}@media.mit.edu
AUTOMATIC PHONEME SEGMENTATION WITH RELAXED TEXTUAL CONSTRAINTS
AUTOMATIC PHONEME SEGMENTATION WITH RELAXED TEXTUAL CONSTRAINTS PIERRE LANCHANTIN, ANDREW C. MORRIS, XAVIER RODET, CHRISTOPHE VEAUX Very high quality text-to-speech synthesis can be achieved by unit selection
Doppler. Doppler. Doppler shift. Doppler Frequency. Doppler shift. Doppler shift. Chapter 19
Doppler Doppler Chapter 19 A moving train with a trumpet player holding the same tone for a very long time travels from your left to your right. The tone changes relative the motion of you (receiver) and
Colour Image Segmentation Technique for Screen Printing
60 R.U. Hewage and D.U.J. Sonnadara Department of Physics, University of Colombo, Sri Lanka ABSTRACT Screen-printing is an industry with a large number of applications ranging from printing mobile phone
The effect of mismatched recording conditions on human and automatic speaker recognition in forensic applications
Forensic Science International 146S (2004) S95 S99 www.elsevier.com/locate/forsciint The effect of mismatched recording conditions on human and automatic speaker recognition in forensic applications A.
FEEL: Frequent EDA and Event Logging A Mobile Social Interaction Stress Monitoring System
FEEL: Frequent EDA and Event Logging A Mobile Social Interaction Stress Monitoring System Yadid Ayzenberg MIT Media Lab 75 Amherst St. Cambridge, MA 02142, USA [email protected] Javier Hernandez MIT Media
Electroencephalography Analysis Using Neural Network and Support Vector Machine during Sleep
Engineering, 23, 5, 88-92 doi:.4236/eng.23.55b8 Published Online May 23 (http://www.scirp.org/journal/eng) Electroencephalography Analysis Using Neural Network and Support Vector Machine during Sleep JeeEun
APPLYING MFCC-BASED AUTOMATIC SPEAKER RECOGNITION TO GSM AND FORENSIC DATA
APPLYING MFCC-BASED AUTOMATIC SPEAKER RECOGNITION TO GSM AND FORENSIC DATA Tuija Niemi-Laitinen*, Juhani Saastamoinen**, Tomi Kinnunen**, Pasi Fränti** *Crime Laboratory, NBI, Finland **Dept. of Computer
Embodied decision making with animations
Embodied decision making with animations S. Maggi and S. I. Fabrikant University of Zurich, Department of Geography, Winterthurerstr. 19, CH 857 Zurich, Switzerland. Email: {sara.maggi, sara.fabrikant}@geo.uzh.ch
Broadband Networks. Prof. Dr. Abhay Karandikar. Electrical Engineering Department. Indian Institute of Technology, Bombay. Lecture - 29.
Broadband Networks Prof. Dr. Abhay Karandikar Electrical Engineering Department Indian Institute of Technology, Bombay Lecture - 29 Voice over IP So, today we will discuss about voice over IP and internet
This document is downloaded from DR-NTU, Nanyang Technological University Library, Singapore.
This document is downloaded from DR-NTU, Nanyang Technological University Library, Singapore. Title Transcription of polyphonic signals using fast filter bank( Accepted version ) Author(s) Foo, Say Wei;
Ericsson T18s Voice Dialing Simulator
Ericsson T18s Voice Dialing Simulator Mauricio Aracena Kovacevic, Anna Dehlbom, Jakob Ekeberg, Guillaume Gariazzo, Eric Lästh and Vanessa Troncoso Dept. of Signals Sensors and Systems Royal Institute of
Laboratory Guide. Anatomy and Physiology
Laboratory Guide Anatomy and Physiology TBME04, Fall 2010 Name: Passed: Last updated 2010-08-13 Department of Biomedical Engineering Linköpings Universitet Introduction This laboratory session is intended
Chapter 6: HRV Measurement and Interpretation
Chapter 6: HRV Measurement and Interpretation Heart-rhythm analysis is more than a measurement of heart rate; it is a much deeper measurement of the complex interactions between the brain, the heart and
Classifying Manipulation Primitives from Visual Data
Classifying Manipulation Primitives from Visual Data Sandy Huang and Dylan Hadfield-Menell Abstract One approach to learning from demonstrations in robotics is to make use of a classifier to predict if
Supervised and unsupervised learning - 1
Chapter 3 Supervised and unsupervised learning - 1 3.1 Introduction The science of learning plays a key role in the field of statistics, data mining, artificial intelligence, intersecting with areas in
Extend Table Lens for High-Dimensional Data Visualization and Classification Mining
Extend Table Lens for High-Dimensional Data Visualization and Classification Mining CPSC 533c, Information Visualization Course Project, Term 2 2003 Fengdong Du [email protected] University of British Columbia
Comparison of Non-linear Dimensionality Reduction Techniques for Classification with Gene Expression Microarray Data
CMPE 59H Comparison of Non-linear Dimensionality Reduction Techniques for Classification with Gene Expression Microarray Data Term Project Report Fatma Güney, Kübra Kalkan 1/15/2013 Keywords: Non-linear
Lecture 2, Human cognition
Human Cognition An important foundation for the design of interfaces is a basic theory of human cognition The information processing paradigm (in its most simple form). Human Information Processing The
JPEG compression of monochrome 2D-barcode images using DCT coefficient distributions
Edith Cowan University Research Online ECU Publications Pre. JPEG compression of monochrome D-barcode images using DCT coefficient distributions Keng Teong Tan Hong Kong Baptist University Douglas Chai
Feature vs. Classifier Fusion for Predictive Data Mining a Case Study in Pesticide Classification
Feature vs. Classifier Fusion for Predictive Data Mining a Case Study in Pesticide Classification Henrik Boström School of Humanities and Informatics University of Skövde P.O. Box 408, SE-541 28 Skövde
The Effects of Moderate Aerobic Exercise on Memory Retention and Recall
The Effects of Moderate Aerobic Exercise on Memory Retention and Recall Lab 603 Group 1 Kailey Fritz, Emily Drakas, Naureen Rashid, Terry Schmitt, Graham King Medical Sciences Center University of Wisconsin-Madison
Tracking and Recognition in Sports Videos
Tracking and Recognition in Sports Videos Mustafa Teke a, Masoud Sattari b a Graduate School of Informatics, Middle East Technical University, Ankara, Turkey [email protected] b Department of Computer
2014/02/13 Sphinx Lunch
2014/02/13 Sphinx Lunch Best Student Paper Award @ 2013 IEEE Workshop on Automatic Speech Recognition and Understanding Dec. 9-12, 2013 Unsupervised Induction and Filling of Semantic Slot for Spoken Dialogue
The Scientific Data Mining Process
Chapter 4 The Scientific Data Mining Process When I use a word, Humpty Dumpty said, in rather a scornful tone, it means just what I choose it to mean neither more nor less. Lewis Carroll [87, p. 214] In
EM Clustering Approach for Multi-Dimensional Analysis of Big Data Set
EM Clustering Approach for Multi-Dimensional Analysis of Big Data Set Amhmed A. Bhih School of Electrical and Electronic Engineering Princy Johnson School of Electrical and Electronic Engineering Martin
Making Sense of the Mayhem: Machine Learning and March Madness
Making Sense of the Mayhem: Machine Learning and March Madness Alex Tran and Adam Ginzberg Stanford University [email protected] [email protected] I. Introduction III. Model The goal of our research
Data Quality Mining: Employing Classifiers for Assuring consistent Datasets
Data Quality Mining: Employing Classifiers for Assuring consistent Datasets Fabian Grüning Carl von Ossietzky Universität Oldenburg, Germany, [email protected] Abstract: Independent
Coping With Stress and Anxiety
Coping With Stress and Anxiety Stress and anxiety are the fight-and-flight instincts that are your body s way of responding to emergencies. An intruder crawling through your bedroom window in the dark
Active Learning SVM for Blogs recommendation
Active Learning SVM for Blogs recommendation Xin Guan Computer Science, George Mason University Ⅰ.Introduction In the DH Now website, they try to review a big amount of blogs and articles and find the
Music Genre Classification
Music Genre Classification Michael Haggblade Yang Hong Kenny Kao 1 Introduction Music classification is an interesting problem with many applications, from Drinkify (a program that generates cocktails
Investigations on Error Minimizing Training Criteria for Discriminative Training in Automatic Speech Recognition
, Lisbon Investigations on Error Minimizing Training Criteria for Discriminative Training in Automatic Speech Recognition Wolfgang Macherey Lars Haferkamp Ralf Schlüter Hermann Ney Human Language Technology
Maschinelles Lernen mit MATLAB
Maschinelles Lernen mit MATLAB Jérémy Huard Applikationsingenieur The MathWorks GmbH 2015 The MathWorks, Inc. 1 Machine Learning is Everywhere Image Recognition Speech Recognition Stock Prediction Medical
Annotated bibliographies for presentations in MUMT 611, Winter 2006
Stephen Sinclair Music Technology Area, McGill University. Montreal, Canada Annotated bibliographies for presentations in MUMT 611, Winter 2006 Presentation 4: Musical Genre Similarity Aucouturier, J.-J.
KNOWLEDGE-BASED IN MEDICAL DECISION SUPPORT SYSTEM BASED ON SUBJECTIVE INTELLIGENCE
JOURNAL OF MEDICAL INFORMATICS & TECHNOLOGIES Vol. 22/2013, ISSN 1642-6037 medical diagnosis, ontology, subjective intelligence, reasoning, fuzzy rules Hamido FUJITA 1 KNOWLEDGE-BASED IN MEDICAL DECISION
The Artificial Prediction Market
The Artificial Prediction Market Adrian Barbu Department of Statistics Florida State University Joint work with Nathan Lay, Siemens Corporate Research 1 Overview Main Contributions A mathematical theory
RESEARCH ON SPOKEN LANGUAGE PROCESSING Progress Report No. 29 (2008) Indiana University
RESEARCH ON SPOKEN LANGUAGE PROCESSING Progress Report No. 29 (2008) Indiana University A Software-Based System for Synchronizing and Preprocessing Eye Movement Data in Preparation for Analysis 1 Mohammad
Introduction to Machine Learning Lecture 1. Mehryar Mohri Courant Institute and Google Research [email protected]
Introduction to Machine Learning Lecture 1 Mehryar Mohri Courant Institute and Google Research [email protected] Introduction Logistics Prerequisites: basics concepts needed in probability and statistics
Data Mining for Wearable Sensors in Health Monitoring Systems: A Review of Recent Trends and Challenges
Sensors 2013, 13, 17472-17500; doi:10.3390/s131217472 OPEN ACCESS sensors ISSN 1424-8220 www.mdpi.com/journal/sensors Review Data Mining for Wearable Sensors in Health Monitoring Systems: A Review of Recent
Sample Size and Power in Clinical Trials
Sample Size and Power in Clinical Trials Version 1.0 May 011 1. Power of a Test. Factors affecting Power 3. Required Sample Size RELATED ISSUES 1. Effect Size. Test Statistics 3. Variation 4. Significance
A secure face tracking system
International Journal of Information & Computation Technology. ISSN 0974-2239 Volume 4, Number 10 (2014), pp. 959-964 International Research Publications House http://www. irphouse.com A secure face tracking
A Comparison of Speech Coding Algorithms ADPCM vs CELP. Shannon Wichman
A Comparison of Speech Coding Algorithms ADPCM vs CELP Shannon Wichman Department of Electrical Engineering The University of Texas at Dallas Fall 1999 December 8, 1999 1 Abstract Factors serving as constraints
Impact of Boolean factorization as preprocessing methods for classification of Boolean data
Impact of Boolean factorization as preprocessing methods for classification of Boolean data Radim Belohlavek, Jan Outrata, Martin Trnecka Data Analysis and Modeling Lab (DAMOL) Dept. Computer Science,
Microsoft Azure Machine learning Algorithms
Microsoft Azure Machine learning Algorithms Tomaž KAŠTRUN @tomaz_tsql [email protected] http://tomaztsql.wordpress.com Our Sponsors Speaker info https://tomaztsql.wordpress.com Agenda Focus on explanation
Physiological Anxiety Responses with Cell Phone Separation and. Subsequent Contact
Physiological Anxiety Responses with Cell Phone Separation and Subsequent Contact Alexa DeBoth, Jackelyn Meyer, Natalie Trueman, Anjoli Zejdlik, Physiology 435, Lab 601, Group 14 University of Wisconsin
II. RELATED WORK. Sentiment Mining
Sentiment Mining Using Ensemble Classification Models Matthew Whitehead and Larry Yaeger Indiana University School of Informatics 901 E. 10th St. Bloomington, IN 47408 {mewhiteh, larryy}@indiana.edu Abstract
Using Support Vector Machines for Automatic Mood Tracking in Audio Music
Audio Engineering Society Convention Paper Presented at the 130th Convention 2011 May 13 16 London, UK The papers at this Convention have been selected on the basis of a submitted abstract and extended
Data Mining for Manufacturing: Preventive Maintenance, Failure Prediction, Quality Control
Data Mining for Manufacturing: Preventive Maintenance, Failure Prediction, Quality Control Andre BERGMANN Salzgitter Mannesmann Forschung GmbH; Duisburg, Germany Phone: +49 203 9993154, Fax: +49 203 9993234;
Introducing diversity among the models of multi-label classification ensemble
Introducing diversity among the models of multi-label classification ensemble Lena Chekina, Lior Rokach and Bracha Shapira Ben-Gurion University of the Negev Dept. of Information Systems Engineering and
What Alcohol Does to the Body. Chapter 25 Lesson 2
What Alcohol Does to the Body Chapter 25 Lesson 2 Short-Term Effects of Drinking The short-term term effects of alcohol on the body depend on several factors including: amount of alcohol consumed, gender,
Using Data Mining for Mobile Communication Clustering and Characterization
Using Data Mining for Mobile Communication Clustering and Characterization A. Bascacov *, C. Cernazanu ** and M. Marcu ** * Lasting Software, Timisoara, Romania ** Politehnica University of Timisoara/Computer
Comparing the Results of Support Vector Machines with Traditional Data Mining Algorithms
Comparing the Results of Support Vector Machines with Traditional Data Mining Algorithms Scott Pion and Lutz Hamel Abstract This paper presents the results of a series of analyses performed on direct mail
Social Media Mining. Data Mining Essentials
Introduction Data production rate has been increased dramatically (Big Data) and we are able store much more data than before E.g., purchase data, social media data, mobile phone data Businesses and customers
A Study to Predict No Show Probability for a Scheduled Appointment at Free Health Clinic
A Study to Predict No Show Probability for a Scheduled Appointment at Free Health Clinic Report prepared for Brandon Slama Department of Health Management and Informatics University of Missouri, Columbia
CS 2750 Machine Learning. Lecture 1. Machine Learning. http://www.cs.pitt.edu/~milos/courses/cs2750/ CS 2750 Machine Learning.
Lecture Machine Learning Milos Hauskrecht [email protected] 539 Sennott Square, x5 http://www.cs.pitt.edu/~milos/courses/cs75/ Administration Instructor: Milos Hauskrecht [email protected] 539 Sennott
From Concept to Production in Secure Voice Communications
From Concept to Production in Secure Voice Communications Earl E. Swartzlander, Jr. Electrical and Computer Engineering Department University of Texas at Austin Austin, TX 78712 Abstract In the 1970s secure
Workshop Perceptual Effects of Filtering and Masking Introduction to Filtering and Masking
Workshop Perceptual Effects of Filtering and Masking Introduction to Filtering and Masking The perception and correct identification of speech sounds as phonemes depends on the listener extracting various
Building a Simulink model for real-time analysis V1.15.00. Copyright g.tec medical engineering GmbH
g.tec medical engineering GmbH Sierningstrasse 14, A-4521 Schiedlberg Austria - Europe Tel.: (43)-7251-22240-0 Fax: (43)-7251-22240-39 [email protected], http://www.gtec.at Building a Simulink model for real-time
ANALYZING A CONDUCTORS GESTURES WITH THE WIIMOTE
ANALYZING A CONDUCTORS GESTURES WITH THE WIIMOTE ICSRiM - University of Leeds, School of Computing & School of Music, Leeds LS2 9JT, UK [email protected] www.icsrim.org.uk Abstract This paper presents
L25: Ensemble learning
L25: Ensemble learning Introduction Methods for constructing ensembles Combination strategies Stacked generalization Mixtures of experts Bagging Boosting CSCE 666 Pattern Analysis Ricardo Gutierrez-Osuna
Recognition. Sanja Fidler CSC420: Intro to Image Understanding 1 / 28
Recognition Topics that we will try to cover: Indexing for fast retrieval (we still owe this one) History of recognition techniques Object classification Bag-of-words Spatial pyramids Neural Networks Object
Multiple Kernel Learning on the Limit Order Book
JMLR: Workshop and Conference Proceedings 11 (2010) 167 174 Workshop on Applications of Pattern Analysis Multiple Kernel Learning on the Limit Order Book Tristan Fletcher Zakria Hussain John Shawe-Taylor
Stress Psychophysiology. Introduction. The Brain. Chapter 2
Stress Psychophysiology Chapter 2 Introduction This chapter covers the process & structures activated during the physiological response to stress. Two stress pathways are available; one for short term
FUNCTIONAL EEG ANALYZE IN AUTISM. Dr. Plamen Dimitrov
FUNCTIONAL EEG ANALYZE IN AUTISM Dr. Plamen Dimitrov Preamble Autism or Autistic Spectrum Disorders (ASD) is a mental developmental disorder, manifested in the early childhood and is characterized by qualitative
Beating the NCAA Football Point Spread
Beating the NCAA Football Point Spread Brian Liu Mathematical & Computational Sciences Stanford University Patrick Lai Computer Science Department Stanford University December 10, 2010 1 Introduction Over
Assessment. Presenter: Yupu Zhang, Guoliang Jin, Tuo Wang Computer Vision 2008 Fall
Automatic Photo Quality Assessment Presenter: Yupu Zhang, Guoliang Jin, Tuo Wang Computer Vision 2008 Fall Estimating i the photorealism of images: Distinguishing i i paintings from photographs h Florin
FPGA Implementation of Human Behavior Analysis Using Facial Image
RESEARCH ARTICLE OPEN ACCESS FPGA Implementation of Human Behavior Analysis Using Facial Image A.J Ezhil, K. Adalarasu Department of Electronics & Communication Engineering PSNA College of Engineering
Mode and Patient-mix Adjustment of the CAHPS Hospital Survey (HCAHPS)
Mode and Patient-mix Adjustment of the CAHPS Hospital Survey (HCAHPS) April 30, 2008 Abstract A randomized Mode Experiment of 27,229 discharges from 45 hospitals was used to develop adjustments for the
TOWARDS SIMPLE, EASY TO UNDERSTAND, AN INTERACTIVE DECISION TREE ALGORITHM
TOWARDS SIMPLE, EASY TO UNDERSTAND, AN INTERACTIVE DECISION TREE ALGORITHM Thanh-Nghi Do College of Information Technology, Cantho University 1 Ly Tu Trong Street, Ninh Kieu District Cantho City, Vietnam
