How much does word sense disambiguation help in sentiment analysis of micropost data?



Similar documents
VCU-TSA at Semeval-2016 Task 4: Sentiment Analysis in Twitter

Sentiment Lexicons for Arabic Social Media

A Comparative Study on Sentiment Classification and Ranking on Product Reviews

Kea: Expression-level Sentiment Analysis from Twitter Data

Sentiment Analysis: a case study. Giuseppe Castellucci castellucci@ing.uniroma2.it

Sentiment Analysis of Microblogs

Improving Twitter Sentiment Analysis with Topic-Based Mixture Modeling and Semi-Supervised Training

Sentiment analysis: towards a tool for analysing real-time students feedback

Microblog Sentiment Analysis with Emoticon Space Model

Sentiment Analysis. D. Skrepetos 1. University of Waterloo. NLP Presenation, 06/17/2015

IIIT-H at SemEval 2015: Twitter Sentiment Analysis The good, the bad and the neutral!

Rich Lexical Features for Sentiment Analysis on Twitter

Sentiment Analysis of Twitter Data

Robust Sentiment Detection on Twitter from Biased and Noisy Data

Combining Social Data and Semantic Content Analysis for L Aquila Social Urban Network

Approaches for Sentiment Analysis on Twitter: A State-of-Art study

Twitter Stock Bot. John Matthew Fong The University of Texas at Austin

Chapter 8. Final Results on Dutch Senseval-2 Test Data

Sentiment analysis for news articles

Impact of Financial News Headline and Content to Market Sentiment

Search and Data Mining: Techniques. Text Mining Anya Yarygina Boris Novikov

SENTIMENT ANALYSIS USING BIG DATA FROM SOCIALMEDIA

NILC USP: A Hybrid System for Sentiment Analysis in Twitter Messages

DiegoLab16 at SemEval-2016 Task 4: Sentiment Analysis in Twitter using Centroids, Clusters, and Sentiment Lexicons

A comparison of Lexicon-based approaches for Sentiment Analysis of microblog posts

Building the Multilingual Web of Data: A Hands-on tutorial (ISWC 2014, Riva del Garda - Italy)

Semantic Sentiment Analysis of Twitter

Bing Liu. Web Data Mining. Exploring Hyperlinks, Contents, and Usage Data. With 177 Figures. ~ Spring~r

Effect of Using Regression on Class Confidence Scores in Sentiment Analysis of Twitter Data

Twitter Sentiment Analysis of Movie Reviews using Machine Learning Techniques.

A Method for Automatic De-identification of Medical Records

Data Mining on Social Networks. Dionysios Sotiropoulos Ph.D.

Sentiment Analysis of Short Informal Texts

Fine-grained German Sentiment Analysis on Social Media

Sentiment Analysis Tool using Machine Learning Algorithms

Twitter Sentiment Analysis

Sentiment Analysis for Movie Reviews

SENTIMENT ANALYSIS: A STUDY ON PRODUCT FEATURES

Sentiment Analysis: Incremental learning to build domain models

CIRGIRDISCO at RepLab2014 Reputation Dimension Task: Using Wikipedia Graph Structure for Classifying the Reputation Dimension of a Tweet

Towards SoMEST Combining Social Media Monitoring with Event Extraction and Timeline Analysis

Sentiment analysis of Twitter microblogging posts. Jasmina Smailović Jožef Stefan Institute Department of Knowledge Technologies

Lexical and Machine Learning approaches toward Online Reputation Management

Micro blogs Oriented Word Segmentation System

Forecasting stock markets with Twitter

Knowledge-Based WSD on Specific Domains: Performing Better than Generic Supervised WSD

Table of Contents. Chapter No. 1 Introduction 1. iii. xiv. xviii. xix. Page No.

Doctoral Consortium 2013 Dept. Lenguajes y Sistemas Informáticos UNED

Some Experiments on Modeling Stock Market Behavior Using Investor Sentiment Analysis and Posting Volume from Twitter

Identifying Focus, Techniques and Domain of Scientific Papers

A Joint Model of Feature Mining and Sentiment Analysis for Product Review Rating

Sentiment Classification on Polarity Reviews: An Empirical Study Using Rating-based Features

Effectiveness of term weighting approaches for sparse social media text sentiment analysis

Emoticon Smoothed Language Models for Twitter Sentiment Analysis

CI6227: Data Mining. Lesson 11b: Ensemble Learning. Data Analytics Department, Institute for Infocomm Research, A*STAR, Singapore.

Spam Detection A Machine Learning Approach

SU-FMI: System Description for SemEval-2014 Task 9 on Sentiment Analysis in Twitter

Credit Card Fraud Detection and Concept-Drift Adaptation with Delayed Supervised Information

EFFICIENTLY PROVIDE SENTIMENT ANALYSIS DATA SETS USING EXPRESSIONS SUPPORT METHOD

Sentiment analysis on news articles using Natural Language Processing and Machine Learning Approach.

SentiCircles for Contextual and Conceptual Semantic Sentiment Analysis of Twitter

Customer Intentions Analysis of Twitter Based on Semantic Patterns

ONLINE RESUME PARSING SYSTEM USING TEXT ANALYTICS

A Sentiment Analysis Model Integrating Multiple Algorithms and Diverse. Features. Thesis

DIY Social Sentiment Analysis in 3 Steps

Content vs. Context for Sentiment Analysis: a Comparative Analysis over Microblogs

II. RELATED WORK. Sentiment Mining

Web Mining. Margherita Berardi LACAM. Dipartimento di Informatica Università degli Studi di Bari

TweetAlert: Semantic Analytics in Social Networks for Citizen Opinion Mining in the City of the Future

Web Document Clustering

UNED Online Reputation Monitoring Team at RepLab 2013

Web Information Mining and Decision Support Platform for the Modern Service Industry

Social Media Data Mining and Inference system based on Sentiment Analysis

Sentiment Analysis of Movie Reviews and Twitter Statuses. Introduction

Context Aware Predictive Analytics: Motivation, Potential, Challenges

Data Mining Yelp Data - Predicting rating stars from review text

University of Glasgow Terrier Team / Project Abacá at RepLab 2014: Reputation Dimensions Task

Data Mining for Tweet Sentiment Classification

End-to-End Sentiment Analysis of Twitter Data

Overview of RepLab 2013: Evaluating Online Reputation Monitoring Systems

Sentiment analysis on tweets in a financial domain

Combining Lexicon-based and Learning-based Methods for Twitter Sentiment Analysis

Terminology Extraction from Log Files

Semi-Supervised Learning for Blog Classification

Sentiment Analysis on Twitter with Stock Price and Significant Keyword Correlation. Abstract

Transcription:

How much does word sense disambiguation help in sentiment analysis of micropost data? Chiraag Sumanth PES Institute of Technology India Diana Inkpen University of Ottawa Canada 6th Workshop on Computational Approaches to Subjectivity, Sentiment & Social Media Analysis

OUTLINE Introduction What is Word Sense Disambiguation Word Sense Disambiguation and Sentiment Analysis Dataset System Description Results Error Analysis Conclusion

INTRODUCTION This short paper describes a sentiment analysis system for micro-post data that includes analysis of tweets from Twitter and Short Messaging Service (SMS) text messages. Use Word Sense Disambiguation techniques in sentiment analysis at the message level, where the entire tweet or SMS text was analysed to determine its dominant sentiment. Use of Word Sense Disambiguation alone has resulted in an improved sentiment analysis system that outperforms systems built without incorporating Word Sense Disambiguation.

What is Word Sense Disambiguation (WSD)? Any natural language processing system encounters the problem of lexical ambiguity, be it syntactic or semantic. The resolution of a word s syntactic ambiguity has largely been by part-of-speech taggers which with high levels of accuracy. The problem is that words often have more than one meaning, sometimes fairly similar and sometimes completely different. The meaning of a word in a particular usage can only be determined by examining its context. Word Sense Disambiguation (WSD) is the process of identifying the sense of such words, called polysemic words.

Word Sense Disambiguation and Sentiment Analysis Akkaya et al. (2009) showed that Subjectivity Word Sense Disambiguation (SWSD) improves contextual opinion analysis. By applying SWSD to contextual polarity classification, an accuracy improvement of 3 percentage points was observed over the original classifier (Wilson et al., 2005a) calculated on the SenMPQA dataset. Rentoumi et al. (2009) showed that WSD is valuable in polarity classification of sentences containing figurative expressions. The primary focus of our work was to throw light on the influence of WSD in the sentiment analysis of micropost or social-media text.

Dataset Dataset from Conference on Semantic Evaluation Exercises (SemEval-2013) for Task 2: Sentiment Analysis in Twitter The training dataset contained only annotated tweets and no training data for SMS text. No additional training data was used by our system. The test dataset contained both tweets and SMS texts.

System Description Single lexical Resource used: SentiWordNet (Baccianella et al., 2010) is a lexical resource for opinion mining. SentiWordNet assigns to each synset of WordNet three sentiment scores: positivity, negativity, and objectivity, totally adding up to 1.0. Babelfy is a unified, multilingual, graph-based approach to Entity Linking and Word Sense Disambiguation, based on the BabelNet 3.0 multilingual semantic network. Babelfy was used for WSD in our system as experiments on six gold-standard datasets show its state-of-the-art performance, as well as its robustness across languages. Preprocessing tweets: Tokenization: Carnegie Mellon University (CMU) Twitter NLP tokenizer tool Stop-word removal & word stemming: Tools provided by the NLTK. Word-segmentation for hashtags (starting with #) and user-ids (starting with @) reinserted them after segmentation.

System Description For each term in the pre-processed text, retrieve the SentiWordNet scores for the matching sense of that term word, after disambiguating the sense using Babelfy. Each tweet or SMS text was represented as a vector made up of three features: The total positive score for the entire text: Aggregate the SentiWordNet Positive (P) scores for every term and divide by the total length of the text. The total negative score for the entire text: Aggregate the SentiWordNet Negative (N) scores for every term and divide by the total length of the text. The total Neutral/Objective score for the entire text: Aggregate the SentiWordNet Objective (O) scores for every term and divide by the total length of the text. This phase of the system is unsupervised where the unlabeled tweets and SMS text messages are processed and the three-featured vector representation is formed.

System Description The subsequent phase uses supervised learning to make the system learn how to combine these three numeric features, representing each text, and reach a decision on the sentiment of that text. We used the Random Forest classifier of Weka for this purpose. It corrects the decision trees habit of overfitting to their training set. Advantages of this approach: Large amounts of unlabelled data can be processed and the three-featured vector representation for that dataset can be constructed without any supervision or training required (in the unsupervised phase). We use only three features (P, N and O scores) in the supervised training, and also do not use datasetspecific features such as bag of words, and therefore the system is not prone to the so-called concept drift phenomenon

RESULTS The Precision, Recall and F-score metrics for the Twitter test data are shown. Class Precision Recall F-Score Positive 69.40 54.30 60.90 Negative 57.50 31.50 40.60 Neutral 60.00 81.30 69.10 The Precision, Recall and F-score metrics for the SMS test data are shown. Class Precision Recall F-Score Positive 52.60 62.80 57.30 Negative 67.50 30.40 41.90 Neutral 73.30 81.30 77.10 The official metric used for evaluating system performance was average F-score for the positive and negative class.

RESULTS Comparison of Average F-scores for positive/negative classes [All scores reported are for the test datasets] Dataset Tweets SMS Baseline 1 29.19 19.03 Baseline 2 34.65 29.75 NRC-Canada 39.61 39.29 Our System 50.75 49.60 Baseline 1: A majority classifier that always predicts the most frequent class as output. The second most frequent class, the positive class was chosen since the neutral class is not considered for F-score evaluation. Baseline 2: We do not disambiguate word senses, and instead the reported SentiWordNet scores of first sense of the word for the correct part-of-speech. NRC-Canada: All unigram-features only results of the system developed by NRC-Canada that was ranked first in the same task in the SemEval 2013 competition. Features included punctuation, upper-case words, POS tags, hashtags, unigram-only emotion and sentiment lexicons, emoticon detection, elongated words, and negation detection.

RESULTS Report an improvement over the all-unigram features score of NRC Canada: 11.14 percentage points for tweets 10.31 percentage points for SMS text messages Report a considerable improvement over the Baseline 2 (First sense for correct POS used): 16.10 percentage points for tweets 19.85 percentage points for SMS text messages We chose SemEval 2013 data and not data from the more recent editions of SemEval, because unigram-features-only score of the best scoring system (NRC-Canada) was reported in their SemEval 2013 submission. There has been no reported changes or improvements for the allunigram-features only model in the recent editions.

Error Analysis In both the cases of tweets and SMS text messages, the Precision and Recall for the Negative class is relatively lower. Considerably lesser samples of negative tweets in training data comprises only 15% of the training dataset. Therefore, the trained model maybe biased towards the more frequent classes, that is Positive and Neutral classes. No polarity or sentiment lexicons were used. Removal of such lexicons was reported to have the highest negative pact on performance, as observed in the NRC-Canada system Not used word n-grams or character n-grams in our system as features and this may have had a detrimental impact on performance. Our system does not feature any negation detection or encoding-detection, such as emoticons, punctuations, or upper-case letters. However, using such few features has also helped determine that the considerable improvement in performance reported can be primarily attributed only to WSD. There are no other features used in our system that can claim to have contributed to the improved performance.

CONCLUSION Our system that throws light on the positive influence that WSD can have when it comes to analyzing social-media and micropost data. Used only three numeric-feature vectors to represent the data for training our system and no additional features. Therefore, the considerable improvement in performance reported over NRC-Canada and the other baselines without WSD can be primarily attributed only to Word Sense Disambiguation and the P, N and O scores that are determined from the SentiWordNet lexicon as a result of disambiguating the text. The system can also work well for future data, since we are not using bag of words features, and our system is not prone to performance degradation due to concept drift. Our future work will explore the addition of several other features to the current system, in addition to the existing WSD-aided features to further improve system performance.

THANK YOU