Multi Modal Affective Data Analytics

1 Multi Modal Affective Data Analytics. Mykola Pechenizkiy. SDAD, ECML PKDD, September 2012, Bristol, UK

2 Affective data: social media. Social media leads to masses of affective data related to people's emotions, sentiments and opinions. In the recent past it was used mainly for marketing needs (web analytics, social media). Whatever the incentive was to study this, sentiment classification has become much more accurate.

3 Multilingual Sentiment Classification

4 Rule based polarity detection. Rule based emission model: 8 kinds of rules.

5 SentiCorr: How much positive and negative content do we read or write?

6 Mobile SentiCorr App. User reactions: "What a fantastic idea, now if ..."; "Great idea! Get it on iOS soon" (anonymous); "This app is designed to make someone else (or a computer) read our e-mails for and protect us from WHAT??? How lazy can we get? Like someone commented on CNN reactions, if we are getting upset by the tones and/or scoldings in e-mails, we certainly have bigger issues that need to be dealt with. C'mon, guys, go invent something useful. Not to mention, does it detect irony? Will it weed out the liars? Pleeeeezzzeeee, what a WASTE OF SOMEONE'S COLLEGIATE TIME AND ENERGY. Don't we have houses to clean and poor people to feed and old folks to help with their shopping? Go do something useful with your time, inventors of this app!!!" Pamela Briggs, British Psychological Society: "Stress is often made worse by anticipation of an unpleasant event and actually dissipated once you tackle the problem directly."

7 OLAP Style Exploration of Data Summaries

8 Exploration of Individual Cases, e.g. e-mails

9 Sentiment vs. Fact Classification. News in media or business is considered to be sentiment neutral, but it often contains positive or negative information, e.g. "You will be fired in 3 months because of the serious budget cuts.": no sentiment, but negative information. Similarly, work related correspondence can contain stressing information. How can we identify it?

10 Sentiment discovery: State of the art. Sentiment analysis/classification is mature! Commercial products, free services, open source, a variety of apps; it evolves in many directions. Several great overviews: "Sentiment Analysis in Practice", ICDM 2011 tutorial by Tiger Zhang (eBay Research Labs), mentanalysisinpracticetutorial.pdf; "Modeling Opinions and Beyond in Social Media" by Bing Liu (UIC), Liu.pptx

11 Outline. Framework for Stress Analytics: data management, OLAP support, shape based query by example. Stress detection from speech and GSR: predictive features and classification. From controlled experiments to real life.

12 What is stress? Is it a bad thing?

13 Stress in NL according to Coosto.nl Not really job related

14 Impact of Stress at Work. WHO: by 2020 the top 5 diseases will be stress related. USA: health care expenditures are ~50% greater for workers who report high levels of stress at work (J. Occup. Env. Med, 40: ). The Netherlands (TNO, 2006; TU/e Cursor 2012): the direct costs of stress are 4 billion Euro per year. Every year employees become ill because of stress at work. 1/7 disabled because of stress at work. At TU Delft, 53% of surveyed students indicated that they experienced huge stress during their studies.

15 What do organizations try (not) to do? Reduce workload (33%); discuss psychological load (28%); change work processes (17%); improve work/life balance (14%); improve managers' skills (13%); extend regulations (9%). Source: TNO, dossier Werkdruk.

16 What can go wrong? They are not always aware of the problem or don't know the exact cause. People do not always want to share what they experience with others. Not always timely enough. Expensive to organize meetings with psychologists and interventions. The individual causes are different and not always well understood. Giving practical advice is not trivial.

17 Types of Stress and Stressors. Different types of stress. Survival stress: a response to a physical danger. Environmental stress: noise, crowding, pressure from work or family. Internal stress: worrying about things we can't control; putting ourselves in situations we know will cause us stress (addicted to stress: expanding the todo list with more and more conference deadlines). Fatigue and overwork: in a long term perspective. Stress affects both body and mind.

18 Types of Stress and Stressors. Three kinds of stress. Acute: caused by an acute short term stress factor. Episodic acute: occurs more frequently and periodically. Chronic: caused by long term stress factors; harmful. Factors causing stress: long work hours, work overload, time pressure, difficult, demanding or complex tasks, high responsibility, lack of breaks, lack of training, conflicts, underpromotion, job insecurity, lack of variety, and poor physical work conditions (limited space, temperature and lighting conditions).

19 Concept. Beeep!

20 StressAnalytics. Make people aware of their stress and stressors: overview of stressors; exploration of relations; access to evidence, i.e. annotated, measured stress. Empowerment by awareness (+ implicit/explicit advice).

21 Our approach to StressAnalytics. What, When, Where, with Whom; physiological signs; pattern mining; OLAP cube.

22 Our approach to Stress Analytics. Make a person aware of what is happening: how they spend their time, and when and from where the stress comes in. Provide valuable input for pattern mining/knowledge discovery: much richer data sources. Visual analytics: interactive exploration of stress related data. Collecting subjective data/labels from a person through the interaction.

23 GUI: exploration, interaction, visual analytics. OLAP: zoom in & out, slice & dice; pattern mining, prediction, query by example. Data mining: feature extraction, peak/change detection, classification. Raw data, objective evidence: external environment (temperature, lighting, noise, air conditioning); external user related data (KPI, e-mail, calendar, social media, news); physiological signs (GSR, temperature, voice, heart rate, facial expressions).

24 Evidence: physiological signals & external sources. GSR, temperature, speech, facial expressions, sentiment in text.

25 Alignment of Information Sources. What the person reads and writes: SentiCorr. What the person does in general, according to the agenda. Environment context (lighting, noise, temperature etc.). Annotate data from video, sound, text processing, and vital signs. What the person does with the computer. Different aspects of pre-processing, storing and managing.

26 Stress Data Cube/OLAP: quick data summaries with respect to predefined dimensions.
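
A roll-up over predefined dimensions can be sketched as below. This is only an illustration of the idea, not the StressAnalytics implementation: the record fields (`day`, `place`, `activity`, `stress`), the values, and the function name `roll_up` are all hypothetical.

```python
from collections import defaultdict

def roll_up(records, dims, measure="stress"):
    """Average `measure` for every combination of the chosen dimensions."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for rec in records:
        key = tuple(rec[d] for d in dims)
        sums[key] += rec[measure]
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}

# Hypothetical fact table: one row per annotated time slot.
records = [
    {"day": "Mon", "place": "office", "activity": "meeting", "stress": 0.8},
    {"day": "Mon", "place": "office", "activity": "e-mail", "stress": 0.4},
    {"day": "Tue", "place": "home", "activity": "e-mail", "stress": 0.2},
    {"day": "Tue", "place": "office", "activity": "meeting", "stress": 0.6},
]

by_activity = roll_up(records, ["activity"])       # zoom out to one dimension
by_day_place = roll_up(records, ["day", "place"])  # drill down to two dimensions
```

Choosing a shorter or longer `dims` list corresponds to zooming out or in; filtering `records` before the call corresponds to slicing.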

27 Stress Analytics Visualization. OLAP style exploration: selecting multiple dimensions, zoom in, zoom out. Navigating to the evidence, i.e. raw data: GSR, skin temperature, speech. Shape based time series similarity search: state of the art UCR Suite (Keogh et al.). Demo: isualization.jsp

28 OLAP system, a Star Schema

29 Shape Based Query by Example. Query: given a subsequence s of a GSR time series. Result: find time series with a shape similar to s.

30 Shape based QBE. Distance measures: Euclidean distance, Dynamic Time Warping (DTW). State of the art: UCR Suite (Keogh et al.).
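
The two distance measures and a brute-force query by example can be sketched as follows. This is a naive illustration, not the UCR Suite, which adds normalisation, lower bounds and early abandoning to make the search fast; the function names and the sample values are made up.

```python
import math

def euclidean(a, b):
    """Point-by-point distance; both sequences must have equal length."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dtw(a, b):
    """Dynamic Time Warping distance with squared point cost and no window constraint."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            # Extend the cheapest of the three admissible warping steps.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return math.sqrt(cost[n][m])

def best_match(query, series, distance=dtw):
    """Slide a window of len(query) over series; return (start index, distance) of the closest window."""
    w = len(query)
    candidates = ((distance(query, series[s:s + w]), s) for s in range(len(series) - w + 1))
    d, start = min(candidates)
    return start, d

query = [0, 1, 3, 1, 0]                   # a peak-shaped GSR subsequence (made-up values)
series = [0, 0, 0, 0, 1, 3, 1, 0, 0, 0]   # a longer recording (made-up values)
start, dist = best_match(query, series)
```

Unlike the Euclidean distance, DTW tolerates local stretching in time: `dtw([0, 1, 1, 2], [0, 1, 2])` is zero although the sequences differ in length.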

31 How to measure stress: determine stress level based on observed sweat production.

32 Detection and Categorization of Stress. Based on GSR data alone, this is not as easy as the following figure may suggest.

33 Challenges in Stress Detection. All kinds of noise, e.g. losing contact with the skin. Activity (exercising), environment (cold/hot) context and personal differences may impact the GSR we observe.

34 Interpretation isn't straightforward

35 Detection as Classification. GSR features: mean, SD, min and max of GSR; mean, min and max of peak height; total number of GSR responses; the sum of GSR response amplitudes; the sum of response rising times; the sum of response energies.
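
The feature list can be turned into code roughly as follows. The slides do not define a GSR response, so this sketch makes assumptions: a response is any rise from a local minimum to the next local maximum of at least `min_amplitude`, and its energy is approximated as half the amplitude times the rise time. Function names, parameters and sample values are hypothetical.

```python
import statistics

def find_responses(gsr, min_amplitude=0.05):
    """Return (onset_index, peak_index) pairs, one per rise of at least min_amplitude."""
    responses = []
    onset = 0
    rising = False
    for i in range(1, len(gsr)):
        if gsr[i] > gsr[i - 1]:
            if not rising:
                onset = i - 1
                rising = True
        elif rising:  # the signal stopped rising, so the previous sample was a peak
            rising = False
            if gsr[i - 1] - gsr[onset] >= min_amplitude:
                responses.append((onset, i - 1))
    if rising and gsr[-1] - gsr[onset] >= min_amplitude:
        responses.append((onset, len(gsr) - 1))
    return responses

def gsr_features(gsr, sample_rate_hz=1.0, min_amplitude=0.05):
    """Compute the statistical and response-based features for one GSR window."""
    responses = find_responses(gsr, min_amplitude)
    heights = [gsr[p] for _, p in responses]
    amplitudes = [gsr[p] - gsr[o] for o, p in responses]
    rise_times = [(p - o) / sample_rate_hz for o, p in responses]
    energies = [0.5 * a * t for a, t in zip(amplitudes, rise_times)]  # assumed definition
    return {
        "mean": statistics.fmean(gsr),
        "sd": statistics.pstdev(gsr),
        "min": min(gsr),
        "max": max(gsr),
        "peak_height_mean": statistics.fmean(heights) if heights else 0.0,
        "peak_height_min": min(heights, default=0.0),
        "peak_height_max": max(heights, default=0.0),
        "n_responses": len(responses),
        "sum_amplitude": sum(amplitudes),
        "sum_rise_time": sum(rise_times),
        "sum_energy": sum(energies),
    }

window = [1.0, 1.0, 1.5, 2.0, 1.5, 1.0, 1.5, 1.0]  # made-up conductance samples
features = gsr_features(window)
```

For the sample window the sketch finds two responses, with peaks at 2.0 and 1.5.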

36 Adding more data to disambiguate: skin and room temperature, noise, accelerometer, voice, face, ...

37 E.g. activity recognition can help: writing vs. typing vs. walking vs. teaching vs. ... Analyzing accelerometer data only (wrist band).

38 Uncontrolled and semi controlled. Philips Research employees wearing the device during their working hours. Students taking the written and multiple choice exams. Students presenting demos/posters with course project results. More to come via HumanCapitalCare.

39 Experiment demo

40

41 Measuring GSR in (un)controlled settings: Philips prototype; self made, with the LEGO Mindstorms NXT.

42 Multi Source Affective Data Classification. Stress/emotion classification from text, GSR & speech. Facial expression analysis. GSR & other sensors.

43 Automatic Stress Detection. Four pipelines: speech model (speech -> speech features -> classification); GSR model (GSR -> GSR features -> classification); feature enrichment (speech features and GSR features -> combine features -> classification); ensemble learning (speech and GSR classified separately -> ensemble).

44 Stress and Skin Conductance. Stress -> changes in the Autonomic Nervous System (ANS) -> activation of sweat glands -> changes in the amount of produced sweat -> changes of skin conductance. Relaxed: skin is drier, skin conductance is lower. Stressed: sweat increases, skin conductance is higher.

45 GSR features: mean, SD, min and max of GSR; mean, min and max of peak height; total number of GSR responses; the sum of GSR response amplitudes; the sum of response rising times; the sum of response energies.

46 Change detection approach Online settings
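
A very simple online detector is sketched below as a stand-in: it flags a sample when it exceeds the mean of the preceding samples by a fixed threshold. The slide gives no detail on the detectors, so the window size, the threshold and the sample stream are assumptions, and adaptive-window detectors such as ADWIN are considerably more elaborate than this.

```python
def detect_changes(stream, window=5, threshold=0.5):
    """Yield indices where a sample exceeds the mean of the previous `window` samples by `threshold`."""
    recent = []
    for i, x in enumerate(stream):
        if len(recent) == window:
            baseline = sum(recent) / window
            if x - baseline > threshold:
                yield i
            recent.pop(0)  # forget the oldest sample
        recent.append(x)

# Made-up GSR stream with a jump to a higher level at index 6.
gsr_stream = [1.0, 1.0, 1.1, 1.0, 1.0, 1.0, 2.0, 2.1, 2.0, 2.0, 2.0, 2.0, 2.0]
change_points = list(detect_changes(gsr_stream))
```

The detector fires at indices 6, 7 and 8 and then goes quiet once the baseline has caught up with the new level.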

47 Preprocessing steps

48 Stress and Speech. Under stress: respiration rate increases, pitch increases, subglottal pressure increases. Voice is a good indicator of stress [Scherer, 1986].

49 Speech Features Voiced and unvoiced speech

50 Speech Features Pitch / Fundamental frequency

51 Speech Features. Mel Frequency Cepstral Coefficients (MFCCs) are coefficients that approximate the response of human auditory perception. Pipeline: audio (temporal) -> FFT -> frequency representation -> Mel scale filter -> filtered frequencies -> log power -> DCT -> MFCCs. Store the first coefficients of the DCT representation.
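
The pipeline can be sketched for a single frame in plain Python. This is a didactic version under assumptions: a naive DFT stands in for the FFT, the mel formula is the common 2595 * log10(1 + f/700) variant, and the number of filters (20) and of stored coefficients (12) are illustrative values, not taken from the slides.

```python
import cmath
import math

def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def power_spectrum(frame):
    """Naive DFT power spectrum for bins 0..N/2 (acceptable for a short demo frame)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        s = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        spectrum.append(abs(s) ** 2 / n)
    return spectrum

def mel_filterbank(n_filters, n_bins, sample_rate):
    """Triangular filters spaced evenly on the mel scale between 0 Hz and Nyquist."""
    max_mel = hz_to_mel(sample_rate / 2.0)
    edges_hz = [mel_to_hz(max_mel * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bin_hz = [k * (sample_rate / 2.0) / (n_bins - 1) for k in range(n_bins)]
    bank = []
    for i in range(1, n_filters + 1):
        lo, mid, hi = edges_hz[i - 1], edges_hz[i], edges_hz[i + 1]
        row = []
        for f in bin_hz:
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))
            else:
                row.append(0.0)
        bank.append(row)
    return bank

def mfcc(frame, sample_rate, n_filters=20, n_coeffs=12):
    power = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(power), sample_rate)
    # Log of the energy in each mel band; the small constant avoids log(0).
    log_energies = [math.log(sum(w * p for w, p in zip(row, power)) + 1e-10) for row in bank]
    # DCT-II of the log energies; keep only the first coefficients.
    return [
        sum(e * math.cos(math.pi * c * (m + 0.5) / n_filters) for m, e in enumerate(log_energies))
        for c in range(n_coeffs)
    ]

sample_rate = 8000
frame = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(256)]  # synthetic 440 Hz tone
coeffs = mfcc(frame, sample_rate)
```

A practical front end would also apply pre-emphasis and a window function per frame; those steps are omitted here because the slide does not list them.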

52 Classification Methods. Support Vector Machine (SVM): state of the art. Decision Tree classifier. K-means using Vector Quantization (VQ): chosen as a baseline. Gaussian Mixture Model (GMM): works well for the speaker recognition task. Change detectors: ADWIN, thresholding.

53 Stress Dataset. Three types of GSR patterns: first type, second type, third type.

54 Aligning of data sources. GSR and speech are split into aligned instances (Instance 1, Instance 2, Instance 3, ...) of 60 seconds each.
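
Cutting both streams into paired 60-second instances might look like this. The sketch assumes that both recordings start at the same moment and that each has a fixed sampling rate; the rates, the helper names and the dummy data are hypothetical.

```python
def to_instances(samples, sample_rate_hz, window_s=60):
    """Split a stream into consecutive, non-overlapping windows of `window_s` seconds."""
    size = int(sample_rate_hz * window_s)
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, size)]

def align(gsr, gsr_rate_hz, speech, speech_rate_hz, window_s=60):
    """Pair the k-th GSR window with the k-th speech window."""
    gsr_windows = to_instances(gsr, gsr_rate_hz, window_s)
    speech_windows = to_instances(speech, speech_rate_hz, window_s)
    return list(zip(gsr_windows, speech_windows))  # zip drops unmatched trailing windows

# Dummy streams: 3.5 minutes of GSR at 2 Hz and of speech frames at 10 Hz.
gsr = [0.0] * (210 * 2)
speech = [0.0] * (210 * 10)
instances = align(gsr, 2, speech, 10)
```

The incomplete last half minute is discarded, so the example yields three instances, each holding 120 GSR samples and 600 speech frames.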

55 Stress Dataset: Speech Features

56 Stress Model using GSR features. 10-times 10-fold CV (not subject independent). [Bar chart: accuracy (percent) of k-means, GMM, SVM and Decision Tree for recovery vs workloads, recovery vs heavy workload, and light vs heavy workload.] SVM outperformed the other methods. Recognizing light vs heavy workload is harder than recovery vs heavy workload.

57 Stress Model using speech features. [Bar chart: accuracy (percent) of k-means, GMM, SVM and Decision Tree for the feature sets Pitch, MFCC, MFCC Pitch, and RASTA.] SVM outperforms the other classifiers. K-means and GMM do not perform well for speech. MFCC is a good indicator for stress detection.

58 1 subject leave out cross validation (subject independent model). [Two bar charts: accuracy (percent) under 10 times 10 fold CV vs 1 subject leave out CV, for the GSR tasks (recovery vs workloads, recovery vs heavy workload, light vs heavy workload) and for the speech features (Pitch, MFCC, MFCC Pitch, RASTA PLP).] It is better to address the problem of stress detection using a subject dependent model.
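
The difference between the two evaluation protocols lies in how instances are split. A minimal sketch of the subject independent split follows; the subject ids and the function name are hypothetical.

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices), one fold per distinct subject."""
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test

subject_ids = ["s1", "s1", "s2", "s2", "s2", "s3"]  # one id per instance
folds = list(leave_one_subject_out(subject_ids))
```

In ordinary 10-fold CV, instances of the same person can land in both the training and the test part, which is why that protocol is not subject independent.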

59 Fusion Approaches Feature enrichment Ensemble learning

60 Fusion of GSR and Speech. [Bar chart: accuracy (percent) for MFCC and GSR, MFCC Pitch and GSR, and Pitch and GSR, comparing enriching the feature space with logistic regression as meta-learner.] Light vs. heavy workload, balanced data.
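
The two fusion styles can be illustrated as follows. This is a toy sketch, not the experimental setup: the base-model outputs are invented probabilities, and the meta-learner is a small logistic regression trained by plain gradient descent with assumed learning rate and epoch count.

```python
import math

def enrich(speech_features, gsr_features):
    """Feature enrichment: one joint feature vector per instance."""
    return [s + g for s, g in zip(speech_features, gsr_features)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_meta_learner(base_probs, labels, lr=0.5, epochs=5000):
    """Logistic regression on base-classifier outputs, trained by batch gradient descent."""
    n_inputs = len(base_probs[0])
    weights = [0.0] * n_inputs
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_inputs
        grad_b = 0.0
        for x, y in zip(base_probs, labels):
            err = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) - y
            grad_b += err
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
        bias -= lr * grad_b / len(labels)
        weights = [w - lr * g / len(labels) for w, g in zip(weights, grad_w)]
    return weights, bias

def predict(weights, bias, x):
    return 1 if sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) >= 0.5 else 0

# Invented stress probabilities from a speech model and a GSR model (1 = heavy workload).
base_probs = [[0.9, 0.6], [0.8, 0.3], [0.7, 0.8], [0.2, 0.4], [0.3, 0.1], [0.1, 0.7]]
labels = [1, 1, 1, 0, 0, 0]
weights, bias = fit_meta_learner(base_probs, labels)
predictions = [predict(weights, bias, x) for x in base_probs]

joint = enrich([[1.0, 2.0]], [[0.5]])  # speech and GSR features side by side
```

With feature enrichment a single classifier sees the concatenated vector; with the meta-learner each modality keeps its own classifier and only their outputs are combined.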

61 Kappa Agreement for Classifiers. Measure agreement between two models using Cohen's kappa. Kappa = 1: complete agreement. Kappa = 0: agreement no better than chance. Kappa < 0: less agreement than expected by chance.
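
Cohen's kappa compares the observed agreement with the agreement expected by chance, kappa = (p_o - p_e) / (1 - p_e). A small sketch with invented predictions:

```python
from collections import Counter

def cohens_kappa(pred_a, pred_b):
    """Cohen's kappa for two equally long lists of class predictions."""
    n = len(pred_a)
    observed = sum(a == b for a, b in zip(pred_a, pred_b)) / n
    freq_a, freq_b = Counter(pred_a), Counter(pred_b)
    # Chance agreement: probability that both models pick the same class independently.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both models always output the same single class
    return (observed - expected) / (1.0 - expected)

speech_pred = [1, 1, 0, 0, 1, 0, 1, 0]  # invented predictions of a speech model
gsr_pred = [1, 0, 0, 1, 1, 0, 0, 1]     # invented predictions of a GSR model
kappa = cohens_kappa(speech_pred, gsr_pred)
```

In this example the two models agree on half of the instances, which is exactly what chance would produce for balanced predictions, so kappa is 0.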

62 Stress detection summary. Speech is more reliable (in lab settings) than GSR, but more subject dependent. SVM performs better on both GSR and speech signals. ADWIN & thresholding detectors do well on GSR. Combining GSR and speech is not trivial: speech and GSR predictions are highly independent (low kappa value). This diversity may be exploited with dynamic integration methods.

63 Further directions. Extend the notion of stress (positive and negative) in the stress analytics framework. Stress analytics -> affective data analytics. Collect more data to enable the OLAP/KDD part of the framework. Combine with other signals, such as facial expression, heart rate, nutrition. Long path from lab setting to real life situation; but both are needed.

64 Is Acute Stress Good or Bad?

65 What is the Relaxation Then?

66 Is Normal Condition Good or Bad? What if someone's pattern looks like NNNNNNNNNNNNNNNN?

67 Summary. The fun parts come from: the fact that not much is known about stress; playing with heterogeneous/multi modal data; multi disciplinary work (data collection, data management, data mining, visual analytics); an engineering approach to data mining. How to show the utility, i.e. that what we do helps to better understand stress as a phenomenon, and the stressors, and how it helps people in the end.

68 Take home messages. Lab settings vs. real world. Availability and quality of the signal: voice recorded; someone else's voice recorded; noise and missing data, uncertainty; a person cannot speak (during a meeting while someone else is speaking). Ground truth, labels, subjective vs. objective. A large problem space. If you know how to help us with any part of StressAnalytics, talk to me.


More information

Big Data Text Mining and Visualization. Anton Heijs

Big Data Text Mining and Visualization. Anton Heijs Copyright 2007 by Treparel Information Solutions BV. This report nor any part of it may be copied, circulated, quoted without prior written approval from Treparel7 Treparel Information Solutions BV Delftechpark

More information

MHI3000 Big Data Analytics for Health Care Final Project Report

MHI3000 Big Data Analytics for Health Care Final Project Report MHI3000 Big Data Analytics for Health Care Final Project Report Zhongtian Fred Qiu (1002274530) http://gallery.azureml.net/details/81ddb2ab137046d4925584b5095ec7aa 1. Data pre-processing The data given

More information

Knowledge Discovery and Data Mining. Structured vs. Non-Structured Data

Knowledge Discovery and Data Mining. Structured vs. Non-Structured Data Knowledge Discovery and Data Mining Unit # 2 1 Structured vs. Non-Structured Data Most business databases contain structured data consisting of well-defined fields with numeric or alphanumeric values.

More information

Lecture 2, Human cognition

Lecture 2, Human cognition Human Cognition An important foundation for the design of interfaces is a basic theory of human cognition The information processing paradigm (in its most simple form). Human Information Processing The

More information

Information Management course

Information Management course Università degli Studi di Milano Master Degree in Computer Science Information Management course Teacher: Alberto Ceselli Lecture 01 : 06/10/2015 Practical informations: Teacher: Alberto Ceselli ([email protected])

More information

The policy also aims to make clear the actions required when faced with evidence of work related stress.

The policy also aims to make clear the actions required when faced with evidence of work related stress. STRESS MANAGEMENT POLICY 1.0 Introduction Stress related illness accounts for a significant proportion of sickness absence in workplaces in the UK. Stress can also be a contributing factor to a variety

More information

Increase System Efficiency with Condition Monitoring. Embedded Control and Monitoring Summit National Instruments

Increase System Efficiency with Condition Monitoring. Embedded Control and Monitoring Summit National Instruments Increase System Efficiency with Condition Monitoring Embedded Control and Monitoring Summit National Instruments Motivation of Condition Monitoring Impeller Contact with casing and diffuser vanes Bent

More information

Introduction to Machine Learning Lecture 1. Mehryar Mohri Courant Institute and Google Research [email protected]

Introduction to Machine Learning Lecture 1. Mehryar Mohri Courant Institute and Google Research mohri@cims.nyu.edu Introduction to Machine Learning Lecture 1 Mehryar Mohri Courant Institute and Google Research [email protected] Introduction Logistics Prerequisites: basics concepts needed in probability and statistics

More information

Protein Protein Interaction Networks

Protein Protein Interaction Networks Functional Pattern Mining from Genome Scale Protein Protein Interaction Networks Young-Rae Cho, Ph.D. Assistant Professor Department of Computer Science Baylor University it My Definition of Bioinformatics

More information

Concept and Applications of Data Mining. Week 1

Concept and Applications of Data Mining. Week 1 Concept and Applications of Data Mining Week 1 Topics Introduction Syllabus Data Mining Concepts Team Organization Introduction Session Your name and major The dfiiti definition of dt data mining i Your

More information

Data Catalogs for Hadoop Achieving Shared Knowledge and Re-usable Data Prep. Neil Raden Hired Brains Research, LLC

Data Catalogs for Hadoop Achieving Shared Knowledge and Re-usable Data Prep. Neil Raden Hired Brains Research, LLC Data Catalogs for Hadoop Achieving Shared Knowledge and Re-usable Data Prep Neil Raden Hired Brains Research, LLC Traditionally, the job of gathering and integrating data for analytics fell on data warehouses.

More information

A Demonstration of a Robust Context Classification System (CCS) and its Context ToolChain (CTC)

A Demonstration of a Robust Context Classification System (CCS) and its Context ToolChain (CTC) A Demonstration of a Robust Context Classification System () and its Context ToolChain (CTC) Martin Berchtold, Henning Günther and Michael Beigl Institut für Betriebssysteme und Rechnerverbund Abstract.

More information

SPATIAL DATA CLASSIFICATION AND DATA MINING

SPATIAL DATA CLASSIFICATION AND DATA MINING , pp.-40-44. Available online at http://www. bioinfo. in/contents. php?id=42 SPATIAL DATA CLASSIFICATION AND DATA MINING RATHI J.B. * AND PATIL A.D. Department of Computer Science & Engineering, Jawaharlal

More information

Product Review: James F. Koopmann Pine Horse, Inc. Quest Software s Foglight Performance Analysis for Oracle

Product Review: James F. Koopmann Pine Horse, Inc. Quest Software s Foglight Performance Analysis for Oracle Product Review: James F. Koopmann Pine Horse, Inc. Quest Software s Foglight Performance Analysis for Oracle Introduction I ve always been interested and intrigued by the processes DBAs use to monitor

More information

Structural Health Monitoring Tools (SHMTools)

Structural Health Monitoring Tools (SHMTools) Structural Health Monitoring Tools (SHMTools) Getting Started LANL/UCSD Engineering Institute LA-CC-14-046 c Copyright 2014, Los Alamos National Security, LLC All rights reserved. May 30, 2014 Contents

More information

Data Mining Algorithms Part 1. Dejan Sarka

Data Mining Algorithms Part 1. Dejan Sarka Data Mining Algorithms Part 1 Dejan Sarka Join the conversation on Twitter: @DevWeek #DW2015 Instructor Bio Dejan Sarka ([email protected]) 30 years of experience SQL Server MVP, MCT, 13 books 7+ courses

More information

The multilayer sentiment analysis model based on Random forest Wei Liu1, Jie Zhang2

The multilayer sentiment analysis model based on Random forest Wei Liu1, Jie Zhang2 2nd International Conference on Advances in Mechanical Engineering and Industrial Informatics (AMEII 2016) The multilayer sentiment analysis model based on Random forest Wei Liu1, Jie Zhang2 1 School of

More information

Making Sense of the Mayhem: Machine Learning and March Madness

Making Sense of the Mayhem: Machine Learning and March Madness Making Sense of the Mayhem: Machine Learning and March Madness Alex Tran and Adam Ginzberg Stanford University [email protected] [email protected] I. Introduction III. Model The goal of our research

More information

SPEAKER IDENTIFICATION FROM YOUTUBE OBTAINED DATA

SPEAKER IDENTIFICATION FROM YOUTUBE OBTAINED DATA SPEAKER IDENTIFICATION FROM YOUTUBE OBTAINED DATA Nitesh Kumar Chaudhary 1 and Shraddha Srivastav 2 1 Department of Electronics & Communication Engineering, LNMIIT, Jaipur, India 2 Bharti School Of Telecommunication,

More information

Recent advances in Digital Music Processing and Indexing

Recent advances in Digital Music Processing and Indexing Recent advances in Digital Music Processing and Indexing Acoustics 08 warm-up TELECOM ParisTech Gaël RICHARD Telecom ParisTech (ENST) www.enst.fr/~grichard/ Content Introduction and Applications Components

More information

CS 229, Autumn 2011 Modeling the Stock Market Using Twitter Sentiment Analysis

CS 229, Autumn 2011 Modeling the Stock Market Using Twitter Sentiment Analysis CS 229, Autumn 2011 Modeling the Stock Market Using Twitter Sentiment Analysis Team members: Daniel Debbini, Philippe Estin, Maxime Goutagny Supervisor: Mihai Surdeanu (with John Bauer) 1 Introduction

More information

How can we discover stocks that will

How can we discover stocks that will Algorithmic Trading Strategy Based On Massive Data Mining Haoming Li, Zhijun Yang and Tianlun Li Stanford University Abstract We believe that there is useful information hiding behind the noisy and massive

More information

Keywords Big Data; OODBMS; RDBMS; hadoop; EDM; learning analytics, data abundance.

Keywords Big Data; OODBMS; RDBMS; hadoop; EDM; learning analytics, data abundance. Volume 4, Issue 11, November 2014 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Analytics

More information

BIDM Project. Predicting the contract type for IT/ITES outsourcing contracts

BIDM Project. Predicting the contract type for IT/ITES outsourcing contracts BIDM Project Predicting the contract type for IT/ITES outsourcing contracts N a n d i n i G o v i n d a r a j a n ( 6 1 2 1 0 5 5 6 ) The authors believe that data modelling can be used to predict if an

More information

Clustering. Adrian Groza. Department of Computer Science Technical University of Cluj-Napoca

Clustering. Adrian Groza. Department of Computer Science Technical University of Cluj-Napoca Clustering Adrian Groza Department of Computer Science Technical University of Cluj-Napoca Outline 1 Cluster Analysis What is Datamining? Cluster Analysis 2 K-means 3 Hierarchical Clustering What is Datamining?

More information

Facility & Property Management Solution

Facility & Property Management Solution Facility & Property Management Solution Center Mine Ltd. Innovative software solutions Center Mine is the global business-to-business software services company founded by the UK hightech investment fund

More information

Understanding Agile Project Management

Understanding Agile Project Management Understanding Agile Project Management Author Melanie Franklin Director Agile Change Management Limited Overview This is the transcript of a webinar I recently delivered to explain in simple terms what

More information

BIOINF 585 Fall 2015 Machine Learning for Systems Biology & Clinical Informatics http://www.ccmb.med.umich.edu/node/1376

BIOINF 585 Fall 2015 Machine Learning for Systems Biology & Clinical Informatics http://www.ccmb.med.umich.edu/node/1376 Course Director: Dr. Kayvan Najarian (DCM&B, [email protected]) Lectures: Labs: Mondays and Wednesdays 9:00 AM -10:30 AM Rm. 2065 Palmer Commons Bldg. Wednesdays 10:30 AM 11:30 AM (alternate weeks) Rm.

More information

Applying Data Science to Sales Pipelines for Fun and Profit

Applying Data Science to Sales Pipelines for Fun and Profit Applying Data Science to Sales Pipelines for Fun and Profit Andy Twigg, CTO, C9 @lambdatwigg Abstract Machine learning is now routinely applied to many areas of industry. At C9, we apply machine learning

More information

Knowledge Discovery from patents using KMX Text Analytics

Knowledge Discovery from patents using KMX Text Analytics Knowledge Discovery from patents using KMX Text Analytics Dr. Anton Heijs [email protected] Treparel Abstract In this white paper we discuss how the KMX technology of Treparel can help searchers

More information

Cleaned Data. Recommendations

Cleaned Data. Recommendations Call Center Data Analysis Megaputer Case Study in Text Mining Merete Hvalshagen www.megaputer.com Megaputer Intelligence, Inc. 120 West Seventh Street, Suite 10 Bloomington, IN 47404, USA +1 812-0-0110

More information