Sentiment Analysis and Opinion Mining in Collections of Qualitative Data


Sergej Zerr, Nam Khanh Tran, Kerstin Bischoff, and Claudia Niederée
Leibniz Universität Hannover / Forschungszentrum L3S, Hannover, Germany
zerr@l3s.de, NTran@L3S.de, bischoff@l3s.de, niederee@l3s.de

Abstract. In the social sciences, a tremendous body of data is collected by observing or interviewing people. Such qualitative data forms a valuable source for later secondary research. One major challenge, though, is preserving the privacy of the interviewees, even after long periods of archival storage. Modern sentiment analysis techniques could help to judge the sensitivity of particular textual content and help the data provider to withhold sensitive data from unauthorized eyes, thus reducing manual processing of large collections of primary material. In addition, mining opinions enables enhanced data access, e.g., by finding negative attitudes about a topic. In this paper we describe properties of qualitative social science data with respect to sentiment analysis. We compare it to datasets used in the literature, identify the main challenges, and provide directions for solving them. By discussing how state-of-the-art techniques can be exploited for the (secondary) exploration of archived qualitative data, we hope to foster interdisciplinary dialogue.

Keywords: digital humanities, qualitative data, sentiment analysis

1 Introduction

The Sociological Research Institute (SOFI) in Göttingen (Germany) carried out a number of studies observing the working situation in the German automobile and shipyard industries after the rapid economic growth in post-World War II Germany, the so-called German economic miracle. The findings of these studies had a significant impact on the working situation in German industry. Intelligent access to this data would turn such a data collection into a valuable source for secondary research, e.g., for longitudinal (meta-)analysis or historical investigations.
Within the scope of the project Gute Arbeit ("Good Work") we are developing tools that enable rich exploratory access to this data for secondary research. Reusing such sources is a challenging and time-consuming task, e.g., regarding the selection of an appropriate subset, capturing context, etc. Moreover, behind each document there is a particular person whose privacy needs to be respected by the data provider and the secondary analyst, and technically preserved by the data provider. Modern sentiment analysis (or opinion mining) techniques could help to judge the sensitivity of a particular document, paragraph, or even sentence and help the data provider to withhold extremely sensitive data from unauthorized eyes. For example, a highly negative statement about one's own employer may be problematic if it can somehow be traced back to its author, in particular once the interviewee has climbed up the hierarchy in the very same enterprise. Moreover, since the company is usually also assured non-disclosure, overly critical statements may be especially harmful (besides, of course, confidential information). Second, for the secondary researcher those techniques could help to automatically find passages with interesting points of view on a particular subject and reduce manual processing of large collections of primary material. For example, our project is interested in how people's concepts of good work have evolved over the last decades. However, our literature analysis revealed that, due to the specificity of qualitative data, straightforward application of state-of-the-art sentiment analysis tools is not always feasible, even after modification.

2 Data

Our corpus consists of qualitative data, in German, from studies of the Sociological Research Institute (SOFI) in Göttingen. The data consists of a variety of (case) studies, typically including worker interviews and observation at the workplace. It was collected within about 50 projects during a period of over 40 years, starting in the 1960s (i.e., the Volkswagen and German dockyard studies). One of the latest studies contains 41 interviews with individuals and groups at the vehicle manufacturing company Auto 5000, which was set up inside the Volkswagen complex in Wolfsburg, Germany. This lower-cost model company was set up with the aim of keeping manufacturing jobs in Germany instead of moving production to other areas of Europe. Interviews include, for example, the employment history of the formerly unemployed workers and engineers as well as topics like shift work, team work, or relations between regular Volkswagen and Auto 5000 employees.
For comparison and illustration, we also use an English dataset, namely the case study on Changing Organizational Forms and the Re-shaping of Work [1]. Each case (examples are airlines, a ceramics manufacturer, hotel services, etc.) has transcriptions or summaries of in-depth face-to-face interviews conducted in England and Scotland starting in 1999. Participants were managers and employees at all levels, sometimes also union representatives. The examples below are taken from these interviews.

3 Related Work and Challenges

Dealing with qualitative interview data, we face the general challenges of sentiment analysis (see, e.g., [2]) but also find some peculiarities. For example, it is typically assumed that the subject (e.g., a YouTube video or market item) is known and that the sentiment can be estimated quite well using simple vocabulary-based techniques. In our dataset, however, indirect sentiment expressions dominate, and the vocabulary is less explicit and considerably less aggressive than in the Web materials widely used in the literature. Instead, employees are dependent on their company and thus tend to express criticism rather subtly, or deliberately decide not to mention certain problematic topics, or use reported speech. Often the sentiment can only be estimated after careful analysis of which aspects of the subject are highlighted, rather than from the adjectives used to describe them.

Fig. 1: The structure of opinions

Fig. 1 summarizes the pattern structures we plan to detect in our dataset. The Object is in our case an interviewee who expresses specific opinions about a number of Subjects (also called opinion targets). A subject could be a person, a specific item like a particular instrument, an abstract concept, or an event. Each subject receives an opinion expression which can be either positive or negative (presented as +/-). In this section we identify and discuss some of the challenges to be faced when extracting the patterns described above.

Detection of Subjective Expressions: User-generated content is the major data source in the literature on opinion mining. One property of such data is that a particular Web user is often hidden behind a virtual identity and behaves more freely than she would in real life. Generally, Web users are rarely concerned about careful selection of words and expressions. High precision in positive/negative sentiment analysis on such datasets is achieved not least due to explicit emotional adjectives (for example "ugly", "idiot" vs. "perfect", "favorite" [3]). In our studies the interviews were recorded face-to-face and the sentiment is often obscured. In the following, we use example sentences extracted from the English dataset described in Section 2. There are a number of seemingly neutral expressions that actually carry a hidden positive or negative sentiment: "The text (company rules) says it should be achievable but again the reality, the experience from some people has been otherwise."
Sometimes an expression only appears subjective with respect to its vocabulary without actually being so (here the term "good" does not carry sentiment value): "We are here to give them a service, clean their aircraft. It's got to have a good standard and quality of clean."

Subject Identification: Typically, state-of-the-art approaches assume that the document contains opinions on one main subject expressed by the author of the document (e.g., a product review, a YouTube video, etc.). In our case the subject(s) first have to be detected. For example, the interviewee in one document can express opinions about multiple subjects such as colleagues, boss, company, family, government, etc. Moreover, a subject may be complex, having different aspects. The author of [4] addressed the problem of target detection for French telephone surveys and forum entries by developing a grammar using linguistic patterns like "Target + state verb + adjective" (e.g., "My boss is great"). User opinions on events and the impact of opinions in the social Web over time were considered in [5]; similarly, in our project we are interested in event descriptions and the analysis of how opinions develop over time.

Context Dependency: The expression "It was cold" would have completely different polarity in the contexts of skiing weather and restaurant food [6]. Similarly, the same terms may also differ in their degree of sentiment from case to case. The latter was considered in [7].

Indirect Sentiment: A vocabulary of positive/negative examples alone would not be sufficient for judging opinions. Sometimes the sentiment depends less on the terms used and more on which attributes of the subject are highlighted. Although direct and indirect attributes are distinguished in the literature [8], the impact of highlighting and omitting particular attributes has not yet been considered. In order to stay polite, people often speak mainly about positive aspects (e.g., of the work) even if they are less important.
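To make the "Target + state verb + adjective" pattern mentioned under Subject Identification concrete, the following is a minimal Python sketch. It is not the grammar developed in [4]: the regular expression, the list of state verbs, the intensifiers, and the tiny polarity lexicon are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical toy lexicon; a real system would need a domain-specific one.
POLARITY = {"great": "+", "helpful": "+", "fair": "+",
            "unfair": "-", "terrible": "-", "rude": "-"}

# Assumed surface pattern: determiner, one-word target, state verb,
# optional intensifier, adjective. Real interview text is far more varied.
PATTERN = re.compile(
    r"\b(?:my|our|the)\s+(?P<target>\w+)\s+(?:is|are|was|were)\s+"
    r"(?:(?:very|really|quite)\s+)?(?P<adj>\w+)",
    re.IGNORECASE,
)

def extract_opinions(text):
    """Return (target, adjective, polarity) triples for lexicon adjectives."""
    opinions = []
    for match in PATTERN.finditer(text):
        adjective = match.group("adj").lower()
        if adjective in POLARITY:
            opinions.append(
                (match.group("target").lower(), adjective, POLARITY[adjective])
            )
    return opinions

if __name__ == "__main__":
    print(extract_opinions("Our schedule was really unfair and my boss is great."))
```

Such a pattern only covers explicit statements. The seemingly neutral example about company rules quoted above contains neither a state verb followed by an adjective nor a lexicon term, so the sketch returns nothing for it, which illustrates why vocabulary-based matching alone falls short on interview data.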
Opinion Order: Although the expressions "The work is hard, but the salary is high" and "The salary is high, but the work is hard" share their terms as well as their topic, they are quite different in terms of sentiment.

4 Approach

Step 1 - Rich Annotation Editor: In contrast to most datasets used in the literature, our dataset lacks any definite features, like favorite assignments or (dis-)likes in Web 2.0, that could directly be used for estimating the degree of sensitivity of a document. This makes manual annotation of the dataset a necessity. To capture as many important properties as possible, we are developing an annotation editor for gathering high-quality gold standard data. The annotator can read the source text in the left panel of the editor, Fig. 2 (1). Selecting a piece of text and pressing "new topic/concept" (2) creates a new selection section. Here four buttons are present, which become active as soon as the annotator selects some text in the left panel. Clicking on "instance" (3) adds the current text selection as a new instance of the active subject (e.g., "My Chef" and "My Boss") and its particular aspect (4). Clicking on the positive or negative button (5) adds the selection as support for the corresponding sentiment. The corpora will be annotated by the social scientists who collected and worked with the data. The manual assessment of sentiment will serve as a gold standard/ground truth, which will be used as a training corpus for deriving models for automatic identification.

Fig. 2: Annotation Editor

Step 2 - NLP Analysis: First, the annotated set will be manually analyzed by the social scientists, and a set of formal rules describing sentiment-bearing and neutral expressions will be defined. We will continue the analysis using NLP tools and extract a set of further feature candidates such as part of speech, parse tree structure, typical idioms, etc. Finally, we will conduct classification experiments and plot precision-recall curves to evaluate the feature selection. We are especially interested in finding out to what degree we can automatically answer questions like "What is the sentiment value of a particular document?" and "Are there sensitive documents in the given set?". Aggregation of polarity over different aspects and subject-level granularity are particularly interesting issues.

Step 3 - Tools Development: Finally, our goal is to implement a toolbox for estimating sentiment in qualitative data, apply it to our dataset, and open parts of the archive to secondary research. Further analysis could provide insights into the average situation at the workplace with respect to sentiment expressions at given points in time and make it comparable to other times and workplaces.
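The precision-recall evaluation mentioned in Step 2 can be illustrated with a small, dependency-free Python sketch. It assumes a classifier that assigns each document a sensitivity score and a gold standard that marks each document as sensitive (1) or not (0). The function name, the scores, and the labels are hypothetical and stand in for whatever model and annotations the project eventually produces.

```python
def precision_recall_points(scores, labels):
    """Compute (threshold, precision, recall) for every distinct score.

    scores: classifier confidence that a document is sensitive (assumed input)
    labels: gold-standard annotations, 1 = sensitive, 0 = not sensitive
    """
    total_positive = sum(labels)
    points = []
    # Use each observed score as a decision threshold, from strict to lenient.
    for threshold in sorted(set(scores), reverse=True):
        predicted = [score >= threshold for score in scores]
        true_pos = sum(1 for p, y in zip(predicted, labels) if p and y)
        false_pos = sum(1 for p, y in zip(predicted, labels) if p and not y)
        # At least one document is predicted positive at every threshold,
        # so the precision denominator is never zero.
        precision = true_pos / (true_pos + false_pos)
        recall = true_pos / total_positive if total_positive else 0.0
        points.append((threshold, precision, recall))
    return points

if __name__ == "__main__":
    # Hypothetical scores for four documents and their gold labels.
    for point in precision_recall_points([0.9, 0.8, 0.4, 0.2], [1, 0, 1, 0]):
        print(point)
```

Plotting recall against precision for these points gives the curve. Comparing curves obtained with different feature sets is one way to judge which features help to identify sensitive documents.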

5 Conclusion

In this paper we describe directions for tackling the problem of sentiment analysis within corpora of qualitative research data. The first challenge is to detect subjective expressions in the absence of explicit, clearly sentiment-bearing vocabulary. In the next step the corresponding subjects need to be identified. Finally, the relation between the degree of sentiment of the expressed opinions and the sensitivity of the documents needs to be analyzed. We plan to develop and evaluate corresponding tools as well as to apply them to an existing set of qualitative interviews within the German project Gute Arbeit.

Acknowledgments

The work was supported by the project Gute Arbeit nach dem Boom (Re-SozIT), funded by the German Federal Ministry of Education and Research (BMBF) under grant number 01UG1249C within the eHumanities line of funding, as well as by the European project ARCOMEM (GA270239).

References

1. Marchington, M., Rubery, J., Willmott, H.: Changing organizational forms and the re-shaping of work: Case study interviews [computer file] (2004)
2. Pang, B., Lee, L.: Opinion mining and sentiment analysis. Found. Trends Inf. Retr. 2(1-2) (January 2008)
3. Siersdorfer, S., Chelaru, S., Nejdl, W., San Pedro, J.: How useful are your comments?: Analyzing and predicting YouTube comments and comment ratings. In: Proceedings of the 19th International Conference on World Wide Web. WWW '10, New York, NY, USA, ACM (2010)
4. Goujon, B.: Text mining for opinion target detection. In: Intelligence and Security Informatics Conference (EISIC), 2011 European. (2011)
5. Maynard, D., Bontcheva, K., Rout, D.: Challenges in developing opinion mining tools for social media. In: @NLP can u tag #usergeneratedcontent?! Workshop at LREC 2012, Istanbul, Turkey (2012)
6. Krestel, R., Siersdorfer, S.: Generating contextualized sentiment lexica based on latent topics and user ratings. In: Proceedings of the 24th ACM Conference on Hypertext and Social Media. HT '13, New York, NY, USA, ACM (2013)
7. Stylios, G., Tsolis, D., Christodoulakis, D.: Mining and estimating users' opinion strength in forum texts regarding governmental decisions. In Iliadis, L., Maglogiannis, I., Papadopoulos, H., Karatzas, K., Sioutas, S., eds.: Artificial Intelligence Applications and Innovations. Volume 382 of IFIP Advances in Information and Communication Technology. Springer Berlin Heidelberg (2012)
8. Xiao, R.: Corpus creation. In: Handbook of Natural Language Processing. (2010)


More information

How To Use Data Mining For Knowledge Management In Technology Enhanced Learning

How To Use Data Mining For Knowledge Management In Technology Enhanced Learning Proceedings of the 6th WSEAS International Conference on Applications of Electrical Engineering, Istanbul, Turkey, May 27-29, 2007 115 Data Mining for Knowledge Management in Technology Enhanced Learning

More information

Training Management System for Aircraft Engineering: indexing and retrieval of Corporate Learning Object

Training Management System for Aircraft Engineering: indexing and retrieval of Corporate Learning Object Training Management System for Aircraft Engineering: indexing and retrieval of Corporate Learning Object Anne Monceaux 1, Joanna Guss 1 1 EADS-CCR, Centreda 1, 4 Avenue Didier Daurat 31700 Blagnac France

More information

ST. PETER S CHURCH OF ENGLAND (VOLUNTARY AIDED) PRIMARY SCHOOL SOUTH WEALD. Modern Foreign Language Policy

ST. PETER S CHURCH OF ENGLAND (VOLUNTARY AIDED) PRIMARY SCHOOL SOUTH WEALD. Modern Foreign Language Policy ST. PETER S CHURCH OF ENGLAND (VOLUNTARY AIDED) PRIMARY SCHOOL SOUTH WEALD Modern Foreign Language Policy January 2013 ST PETER S MODERN FOREIGN LANGUAGE (MFL) POLICY RATIONALE In the knowledge society

More information

Automated vs. manual methods of coding and analysing free text survey responses

Automated vs. manual methods of coding and analysing free text survey responses Automated vs. manual methods of coding and analysing free text survey responses Dr Kathy Seymour, Director, Seymour Research Ltd Free text data General Specific Please use the space below to provide any

More information

Provalis Research Text Analytics and the Victory Index

Provalis Research Text Analytics and the Victory Index point Provalis Research Text Analytics and the Victory Index Fern Halper, Ph.D. Fellow Daniel Kirsch Senior Analyst Provalis Research Text Analytics and the Victory Index Unstructured data is everywhere

More information

Text Opinion Mining to Analyze News for Stock Market Prediction

Text Opinion Mining to Analyze News for Stock Market Prediction Int. J. Advance. Soft Comput. Appl., Vol. 6, No. 1, March 2014 ISSN 2074-8523; Copyright SCRG Publication, 2014 Text Opinion Mining to Analyze News for Stock Market Prediction Yoosin Kim 1, Seung Ryul

More information

Semi-structured interviews

Semi-structured interviews Semi-structured interviews 3 rd December 2014 Prof. Edwin van Teijlingen BU Graduate School Aim of this session: introduce various ways of conducting interviews with qualitative methods; outline strength

More information

Random Forest Based Imbalanced Data Cleaning and Classification

Random Forest Based Imbalanced Data Cleaning and Classification Random Forest Based Imbalanced Data Cleaning and Classification Jie Gu Software School of Tsinghua University, China Abstract. The given task of PAKDD 2007 data mining competition is a typical problem

More information

The Power of 32/X 1 4300.250. 30 levels 287.1 287.1 5.0 3420.330 36.14 23.55. x21. 24hr security 32/X 1. Potential. x21. 30 storey x21. 36 sq ft.

The Power of 32/X 1 4300.250. 30 levels 287.1 287.1 5.0 3420.330 36.14 23.55. x21. 24hr security 32/X 1. Potential. x21. 30 storey x21. 36 sq ft. The Power of Technology IN CRE Data and Analytics 7800 42 0.2 3420.330 175lbs 32/X 1 4300.250 x21 36.14 3420.330 32/X 1 Potential 21.56 25x100 growth is an organization s future ability to generate larger

More information

Co-Creation of Models and Metamodels for Enterprise. Architecture Projects.

Co-Creation of Models and Metamodels for Enterprise. Architecture Projects. Co-Creation of Models and Metamodels for Enterprise Architecture Projects Paola Gómez pa.gomez398@uniandes.edu.co Hector Florez ha.florez39@uniandes.edu.co ABSTRACT The linguistic conformance and the ontological

More information

Syllabus. Dr. Calderón connects instructional practice with the Common Core State Standards, and backs up her recommendations with research:

Syllabus. Dr. Calderón connects instructional practice with the Common Core State Standards, and backs up her recommendations with research: Syllabus Course: Teaching Reading and Comprehension to English Language Learners, K-5 Presenter: Margarita Calderón Number of Credits: 3 Required ebook: Teaching Reading & Comprehension to English Learners,

More information

BILINGUALISM AND LANGUAGE ATTITUDES IN NORTHERN SAMI SPEECH COMMUNITIES IN FINLAND PhD thesis Summary

BILINGUALISM AND LANGUAGE ATTITUDES IN NORTHERN SAMI SPEECH COMMUNITIES IN FINLAND PhD thesis Summary Duray Zsuzsa BILINGUALISM AND LANGUAGE ATTITUDES IN NORTHERN SAMI SPEECH COMMUNITIES IN FINLAND PhD thesis Summary Thesis supervisor: Dr Bakró-Nagy Marianne, University Professor PhD School of Linguistics,

More information

Organizational Social Network Analysis Case Study in a Research Facility

Organizational Social Network Analysis Case Study in a Research Facility Organizational Social Network Analysis Case Study in a Research Facility Wolfgang Schlauch 1, Darko Obradovic 2, and Andreas Dengel 1,2 1 University of Kaiserslautern, Germany 2 German Research Center

More information

Improving SAS Global Forum Papers

Improving SAS Global Forum Papers Paper 3343-2015 Improving SAS Global Forum Papers Vijay Singh, Pankush Kalgotra, Goutam Chakraborty, Oklahoma State University, OK, US ABSTRACT Just as research is built on existing research, the references

More information

Where do new product ideas come from?

Where do new product ideas come from? Steps in the Opportunity Identification Phase Where do new product ideas come from? 1. Defining the New Product Strategy Product Innovation Charter 2. Market Definition Understanding Market structure from

More information

Delivering Smart Answers!

Delivering Smart Answers! Companion for SharePoint Topic Analyst Companion for SharePoint All Your Information Enterprise-ready Enrich SharePoint, your central place for document and workflow management, not only with an improved

More information

ONTOLOGY FOR MOBILE PHONE OPERATING SYSTEMS

ONTOLOGY FOR MOBILE PHONE OPERATING SYSTEMS ONTOLOGY FOR MOBILE PHONE OPERATING SYSTEMS Hasni Neji and Ridha Bouallegue Innov COM Lab, Higher School of Communications of Tunis, Sup Com University of Carthage, Tunis, Tunisia. Email: hasni.neji63@laposte.net;

More information

Holly. Anubhav. Patrick

Holly. Anubhav. Patrick Holly. Anubhav. Patrick Origins of Field Research Anthropology Ethnographic field work: The study of native cultures by learning the native language, observing and taking part in native life, originated

More information

Appendix B Data Quality Dimensions

Appendix B Data Quality Dimensions Appendix B Data Quality Dimensions Purpose Dimensions of data quality are fundamental to understanding how to improve data. This appendix summarizes, in chronological order of publication, three foundational

More information

EXAMS Leaving Certificate English

EXAMS Leaving Certificate English EXAMS Leaving Certificate English Theme Language focus Learning focus Learning Support Language Support Exams: English Key vocabulary for exam questions, type and structure of questions. Understanding

More information

www.coremedia.com CoreMedia 6

www.coremedia.com CoreMedia 6 COREMEDIA 6 PRODUCT BROCHURE www.coremedia.com CoreMedia 6 COREMEDIA 6 PRODUCT BROCHURE CoreMedia 6: Because contextualization is about people CoreMedia 6 empowers your Marketing, Business and IT teams,

More information

WEGOV ANALYSIS TOOLS TO CONNECT POLICY MAKERS WITH CITIZENS ONLINE

WEGOV ANALYSIS TOOLS TO CONNECT POLICY MAKERS WITH CITIZENS ONLINE WEGOV ANALYSIS TOOLS TO CONNECT POLICY MAKERS WITH CITIZENS ONLINE Timo Wandhöfer, GESIS Leibniz Institute for the Social Sciences, Knowledge Technologies for the Social Sciences, Unter Sachsenhausen 6-8,

More information

A Framework for the Delivery of Personalized Adaptive Content

A Framework for the Delivery of Personalized Adaptive Content A Framework for the Delivery of Personalized Adaptive Content Colm Howlin CCKF Limited Dublin, Ireland colm.howlin@cckf-it.com Danny Lynch CCKF Limited Dublin, Ireland colm.howlin@cckf-it.com Abstract

More information

Introduction to Data Mining

Introduction to Data Mining Introduction to Data Mining Jay Urbain Credits: Nazli Goharian & David Grossman @ IIT Outline Introduction Data Pre-processing Data Mining Algorithms Naïve Bayes Decision Tree Neural Network Association

More information

Good Data Practices VIREC Cyber Seminar Series. September 9 13, 2013

Good Data Practices VIREC Cyber Seminar Series. September 9 13, 2013 Good Data Practices VIREC Cyber Seminar Series September 9 13, 2013 1 Acknowledgements Initial concept development work by Michael Berbaum, PhD, University of Illinois at Chicago Review of existing public

More information

Lausanne 2008. Procedure of Speech- and Text Analysis at BAMF Office/Germany (BAMF = Federal Office for Migration and Refugees)

Lausanne 2008. Procedure of Speech- and Text Analysis at BAMF Office/Germany (BAMF = Federal Office for Migration and Refugees) Lausanne 2008 Procedure of Speech- and Text Analysis at BAMF Office/Germany (BAMF = Federal Office for Migration and Refugees) 1. Introduction In the processing of asylum applications both in Germany and

More information

Using Semantic Data Mining for Classification Improvement and Knowledge Extraction

Using Semantic Data Mining for Classification Improvement and Knowledge Extraction Using Semantic Data Mining for Classification Improvement and Knowledge Extraction Fernando Benites and Elena Sapozhnikova University of Konstanz, 78464 Konstanz, Germany. Abstract. The objective of this

More information

Sentiment analysis for news articles

Sentiment analysis for news articles Prashant Raina Sentiment analysis for news articles Wide range of applications in business and public policy Especially relevant given the popularity of online media Previous work Machine learning based

More information

Usability Evaluation with Users CMPT 281

Usability Evaluation with Users CMPT 281 Usability Evaluation with Users CMPT 281 Outline Usability review Observational methods Interview methods Questionnaire methods Usability ISO 9241-11: The extent to which a product can be used by specified

More information

An Open Platform for Collecting Domain Specific Web Pages and Extracting Information from Them

An Open Platform for Collecting Domain Specific Web Pages and Extracting Information from Them An Open Platform for Collecting Domain Specific Web Pages and Extracting Information from Them Vangelis Karkaletsis and Constantine D. Spyropoulos NCSR Demokritos, Institute of Informatics & Telecommunications,

More information

TextGrid Research Infrastructure for the e-humanities

TextGrid Research Infrastructure for the e-humanities TMS - Text Mining Services Leipzig, 25.03.2009 TextGrid Research Infrastructure for the e-humanities Martina Kerzel Goettingen State and University Library Research & Development Department kerzel@sub.uni-goettingen.de

More information

Why Enterprises Need a Social Media

Why Enterprises Need a Social Media Why Enterprises Need a Social Media Management System Introduction As social media continues to evolve, businesses are incorporating new cutting-edge technologies and applications into their online marketing

More information

AWERProcedia Information Technology & Computer Science

AWERProcedia Information Technology & Computer Science AWERProcedia Information Technology & Computer Science Vol 03 (2013) 1157-1162 3 rd World Conference on Information Technology (WCIT-2012) Webification of Software Development: General Outline and the

More information

Comparative Analysis on the Armenian and Korean Languages

Comparative Analysis on the Armenian and Korean Languages Comparative Analysis on the Armenian and Korean Languages Syuzanna Mejlumyan Yerevan State Linguistic University Abstract It has been five years since the Korean language has been taught at Yerevan State

More information

Data Catalogs for Hadoop Achieving Shared Knowledge and Re-usable Data Prep. Neil Raden Hired Brains Research, LLC

Data Catalogs for Hadoop Achieving Shared Knowledge and Re-usable Data Prep. Neil Raden Hired Brains Research, LLC Data Catalogs for Hadoop Achieving Shared Knowledge and Re-usable Data Prep Neil Raden Hired Brains Research, LLC Traditionally, the job of gathering and integrating data for analytics fell on data warehouses.

More information

Identifying Thesis and Conclusion Statements in Student Essays to Scaffold Peer Review

Identifying Thesis and Conclusion Statements in Student Essays to Scaffold Peer Review Identifying Thesis and Conclusion Statements in Student Essays to Scaffold Peer Review Mohammad H. Falakmasir, Kevin D. Ashley, Christian D. Schunn, Diane J. Litman Learning Research and Development Center,

More information

Online Student Engagement as Formative Assessment

Online Student Engagement as Formative Assessment Online Student Engagement as Formative Assessment Ricardo Kawase 1 and Antigoni Parmaxi 2 1 L3S Research Center, Leibniz University Hannover, Germany kawase@l3s.de 2 Cyprus University of Technology, Limassol,

More information

Domain Independent Knowledge Base Population From Structured and Unstructured Data Sources

Domain Independent Knowledge Base Population From Structured and Unstructured Data Sources Proceedings of the Twenty-Fourth International Florida Artificial Intelligence Research Society Conference Domain Independent Knowledge Base Population From Structured and Unstructured Data Sources Michelle

More information

72. Ontology Driven Knowledge Discovery Process: a proposal to integrate Ontology Engineering and KDD

72. Ontology Driven Knowledge Discovery Process: a proposal to integrate Ontology Engineering and KDD 72. Ontology Driven Knowledge Discovery Process: a proposal to integrate Ontology Engineering and KDD Paulo Gottgtroy Auckland University of Technology Paulo.gottgtroy@aut.ac.nz Abstract This paper is

More information

Workshop Series on Open Source Research Methodology in Support of Non-Proliferation

Workshop Series on Open Source Research Methodology in Support of Non-Proliferation The International Centre for Security Analysis The Policy Institute at King s King s College London Workshop Series on Open Source Research Methodology in Support of Non-Proliferation Workshop 1: Exploiting

More information

To download the script for the listening go to: http://www.teachingenglish.org.uk/sites/teacheng/files/learning-stylesaudioscript.

To download the script for the listening go to: http://www.teachingenglish.org.uk/sites/teacheng/files/learning-stylesaudioscript. Learning styles Topic: Idioms Aims: - To apply listening skills to an audio extract of non-native speakers - To raise awareness of personal learning styles - To provide concrete learning aids to enable

More information

Top 4 Ways Social Media is Helping to Reshape Marketing

Top 4 Ways Social Media is Helping to Reshape Marketing Top 4 Ways Social Media is Helping to Reshape Marketing How implementing social media into your business strategy can position your brand for the better Inside, you ll find information on: The ever-changing

More information

Enterprise Resource Planning Analysis of Business Intelligence & Emergence of Mining Objects

Enterprise Resource Planning Analysis of Business Intelligence & Emergence of Mining Objects Enterprise Resource Planning Analysis of Business Intelligence & Emergence of Mining Objects Abstract: Build a model to investigate system and discovering relations that connect variables in a database

More information

CHAPTER FIVE: SUMMARY AND CONCLUSIONS, DISCUSSION, AND RECOMMENDATIONS. 5.1. Summary and Conclusions

CHAPTER FIVE: SUMMARY AND CONCLUSIONS, DISCUSSION, AND RECOMMENDATIONS. 5.1. Summary and Conclusions Event Marketing in IMC 93 CHAPTER FIVE: SUMMARY AND CONCLUSIONS, DISCUSSION, AND RECOMMENDATIONS 5.1. Summary and Conclusions In the face of the increasing value of event marketing as a tool of Integrated

More information

Using Requirements Traceability Links At Runtime A Position Paper

Using Requirements Traceability Links At Runtime A Position Paper Using Requirements Traceability Links At Runtime A Position Paper Alexander Delater, Barbara Paech University of Heidelberg, Institute of omputer Science Im Neuenheimer Feld 326, 69120 Heidelberg, Germany

More information

A Comparative Study on Sentiment Classification and Ranking on Product Reviews

A Comparative Study on Sentiment Classification and Ranking on Product Reviews A Comparative Study on Sentiment Classification and Ranking on Product Reviews C.EMELDA Research Scholar, PG and Research Department of Computer Science, Nehru Memorial College, Putthanampatti, Bharathidasan

More information

A Business Process Services Portal

A Business Process Services Portal A Business Process Services Portal IBM Research Report RZ 3782 Cédric Favre 1, Zohar Feldman 3, Beat Gfeller 1, Thomas Gschwind 1, Jana Koehler 1, Jochen M. Küster 1, Oleksandr Maistrenko 1, Alexandru

More information

THE HUMAN TOUCH FOR TECH TALENT EMPLOYEE RETENTION COULD BE AS SIMPLE AS THANK YOU

THE HUMAN TOUCH FOR TECH TALENT EMPLOYEE RETENTION COULD BE AS SIMPLE AS THANK YOU THE HUMAN TOUCH FOR TECH TALENT EMPLOYEE RETENTION COULD BE AS SIMPLE AS THANK YOU EXECUTIVE SUMMARY Top talent, particularly in the tech industry, remains increasingly difficult to attract, recruit, and

More information