Rich Lexical Features for Sentiment Analysis on Twitter
|
|
|
- Thomasina Hensley
- 9 years ago
- Views:
Transcription
1 Rich Lexical Features for Sentiment Analysis on Twitter Sabih Bin Wasi, Rukhsar Neyaz, Houda Bouamor, Behrang Mohit Carnegie Mellon University in Qatar {sabih, rukhsar, hbouamor, Abstract In this paper, we describe our system for the Sentiment Analysis of Twitter shared task in SemEval Our system uses an SVM classifier along with rich set of lexical features to detect the sentiment of a phrase within a tweet (Task-A) and also the sentiment of the whole tweet (Task- B). We start from the lexical features that were used in the 2013 shared tasks, we enhance the underlying lexicon and also introduce new features. We focus our feature engineering effort mainly on Task- A. Moreover, we adapt our initial framework and introduce new features for Task- B. Our system reaches weighted score of 87.11% in Task-A and 64.52% in Task-B. This places us in the 4th rank in the Task- A and 15th in the Task-B. 1 Introduction With more than 500 million tweets sent per day, containing opinions and messages, Twitter 1 has become a gold-mine for organizations to monitor their brand reputation. As more and more users post about products and services they use, Twitter becomes a valuable source of people s opinions and sentiments: what people can think about a product or a service, how positive they can be about it or what would people prefer the product to be like. Such data can be efficiently used for marketing. However, with the increasing amount of tweets posted on a daily basis, it is challenging and expensive to manually analyze them and locate the meaningful ones. There has been a body of recent work to automatically learn the public sen- This work is licensed under a Creative Commons Attribution 4.0 International Licence. Page numbers and proceedings footer are added by the organisers. Licence details: timents from tweets using natural language processing techniques (Pang and Lee, 2008; Jansen et al., 2009; Pak and Paroubek, 2010; Tang et al., 2014). However, the task of sentiment analysis of tweets in their free format is harder than that of any well-structured document. Tweet messages usually contain different kinds of orthographic errors such as the use of special and decorative characters, letter or word duplication, extra punctuation, as well as the use of special abbreviations. In this paper, we present our machine learning based system for sentiment analysis of Twitter shared task in SemEval Our system takes as input an arbitrary tweet and assigns it to one of the following classes that best reflects its sentiment: positive, negative or neutral. While positive and negative tweets are subjective, neutral class encompasses not only objective tweets but also subjective tweets that does not contain any polar emotion. Our classifier was developed as an undergrad course project but later pursued as a research topic. Our training, development and testing experiments were performed on data sets published in SemEval 2013 (Nakov et al., 2013). Motivated with its performance, we participated in SemEval 2014 Task 9 (Rosenthal et al., 2014). Our approach includes an extensive usage of offthe-shelf resources that have been developed for conducting NLP on social media text. Our original aim was enhancement of the task-a. Moreover, we adapted our framework and introduced new features for task-b and participated in both shared tasks. We reached an F-score of 83.3% in Task-A and an F-score of 65.57% in Task-B. That placed us in the 4th rank in the task-a and 15th rank in the task-b. Our approach includes an extensive usage of off-the-shelf resources that have been developed for conducting NLP on social media text. That includes the Twitter Tokenizer and also the Twitter POS tagger, several sentiment analysis lexica 186 Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014), pages , Dublin, Ireland, August 23-24, 2014.
2 and finally our own enhanced resources for special handling of Twitter-specific text. Our original aim in introducing and evaluating many of the features was enhancement of the task-a. Moreover, we adapted our framework and introduced new features for task-b and participated in both shared tasks. We reached an F-score of 83.3% in Task-A and an F-score of 65.57% in Task-B. That placed us in the 4th rank in the task-a and 15th rank in the task-b. 2 System Overview We participate in tasks A and B. We use threeway classification framework in which we design and use a rich feature representation of the Twitter text. In order to process the tweets, we start with a pre-processing step, followed by feature extraction and classifier training. 2.1 Data Pre-processing Before the tweet is fed to the system, it goes through pre-processing phase that breaks tweet string into words (tokenization), attaches more information to each word (POS tagging), and other treatments. Tokenization: We use CMU ARK Tokenizer (Owoputi et al., 2013) to tokenize each tweet. This tokenizer is developed to tokenize not only space-separated text but also tokens that need to be analyzed separately. POS tagging: We use CMU ARK POS Tagger (Owoputi et al., 2013) to assign POS tags to the different tokens. In addition to the grammatical tags, this tagger assigns also twitter-specific tags mentions, hash tags, etc. This information is used later for feature extraction. Other processing: In order to normalize the different tokens and convert them into a correct English, we find acronyms in the text and add their expanded forms at the end of the list. We decide to keep both the acronym and the new word to ensure that if the token without its expansion was the word the user meant, then we are not losing any information by getting its acronym. We extend the NetLingo 2 top 50 Internet acronym list to add some missing acronyms. In order to reduce inflectional forms of a word to a common base form we use WordNetlemmatizer in NLTK (Bird et al., 2 popular-text-terms.php Tweet This is so #excited Bag of Words This :1, is :1, so :1, awesome :1, :D :1,!, :1, #excited :1 POS features numhashtags:1, numadverb:1, numadjective:1 Polarity features positivewords:1, negwords:0, avgscore: Task-B specific numcapswords:0, numemoticons:1, features numurls:0 Table 1: Set of Features demonstrated on a sample tweet for Task-B. 2009) 3. This could be useful for the feature extraction, to get as much matches as possible between the train and test data (e.g., for bag-of-words feature). 2.2 Feature Extraction Assigning a sentiment to a single word, phrase or a full tweet message requires a rich set of features. For this, we adopt a forward selection approach (Ladha and Deepa, 2011) to select the features that characterize to the best the different sentiments and help distinguishing them. In this approach, we incrementally add the features one by one and test whether this boosts the development results. We heavily rely on a binary feature representation (Heinly et al., 2012) to ensure the efficiency and robustness of our classifier. The different features used are illustrated in the example given in Table 1. Bag-of-words feature: indicates whether a given token is present in the phrase. Morpho-syntactic feature: we use the POS and twitter-specific tags extracted for each token. We count the number of adjectives, adverbs and hashtags present in the focused part of the tweet message (entire tweet or phrase). We tried adding other POS based features (e.g., number of possessive pronouns, etc.), but only the aforementioned tags increased the result figures for both tasks. Polarity-based features: we use freely available sentiment resources to explicitly define the polarity at a token-level. We define three feature categories, based on the lexicon used: html
3 Task-A Task-B Dev Train Test Dev Train Test Positive % 62.06% 59.49% 34.76% 37.59% 39.01% Negative 37.89% 33.01% 35.31% 20.56% 15.06% 17.15% Neutral 5.02% 4.93% 5.21% 44.68% 47.36% 43.84% All 1,135 9,451 10,681 1,654 9,684 8,987 Table 2: Class size distribution for all the three sets for both Task-A and Task-B. Subjectivity: number of words mapped to positive from the MPQA Subjectivity lexicon (Wilson et al., 2005). Hybrid Lexicon: We combine the Sentiment140 lexicon (Mohammad et al., 2013) with the Bing Liu s bag of positive and negative words (Hu and Liu, 2004) to create a dictionary in which each token is assigned a sentiment. Token weight: we use the SentiWordNet lexicon (Baccianella et al., 2010) to define this feature. SentiWordNet contains positive, negative and objective scores between 0 and 1 for all senses in WordNet. Based on this sense level annotation, we first map each token to its weight in this lexicon and then the sum of all these weights was used as the tweet weight. Furthermore, in order to take into account the presence of negative words, which modify the polarity of the context within which they are invoked, we reverse the polarity score of adjectives or adverbs that come within 1-2 token distance after a negative word. Task specific features: In addition to the features described above, we also define some taskspecific ones. For example, we indicate the number of capital letters in the phrase as a feature in Task-A. This could help in this task, since we are focusing on short text. For Task-B we indicate instead the number of capital words. This relies on the intuition that polarized tweets would carry more (sometimes all) capital words than the neutral or objective ones. We also added the number of emoticons and number of URL links as features for Task-B. Here, the goal is to segregate fact-containing objective tweets from emotioncontaining subjective tweets. 2.3 Classifier We use a Support Vector Machine (SVM) classifier (Chang and Lin, 2011) to which we provide the rich set of features described in the previous section. We use a linear kernel and tune its parameter C separately for the two tasks. Task-A system was bound tight to the development set with C=0.18 whereas in Task-B the system was given freedom by setting C=0.55. These values were optimized during the development using a bruteforce mechanism. Task-A Task-B LiveJournal SMS Twitter Twitter Sarcasm Weighted average Table 3: F1 measures and final results of the system for Task-A and Task-B for all the data sets including the weighted average of the sets. 3 Experiments and Results In this section, we explain details of the data and the general settings for the different experiments we conducted. We train and evaluate our classifier for both tasks with the training, development and testing datasets provided for the SemEval 2014 shared task. The size of the three datasets we use as well as their class distributions are illustrated in Table 2. It is important to note that the total dataset size for training and development set (10,586) is about the same as test set making the learning considerably challenging for correct predictions. Positive instances covered more than half of each dataset for Task-A while Neutral were the most popular class for Task-B. The class distribution of training set is the same as the test set. 188
4 Task-A Task-B all features all-preprocessing 80.79(-6.32) 59.20(-5.32) all-ark tokenization 83.69(-3.42) 60.61(-3.91) all-other treatments 85.06(-2.05) 62.19(-2.33) only BOW 81.69(-5.42) 57.85(-6.67) all-bow 82.05(-5.06) 52.04(-12.48) all-pos 86.92(-0.19) 64.31(-0.21) all-polarity based features 81.80(-5.31) 57.95(-6.57) all-svm tuning 80.82(-6.29) 21.41(-43.11) all-svm c= (-2.91) 59.87(-4.65) all-svm c=selected 87.11(0.00) 64.52(0.00) all-svm c= (-0.72) 62.51(-2.01) Table 4: F-scores obtained on the test sets with the specific feature removed. The test dataset is composed of five different sets: Twitter2013 a set of tweets collected for the SemEval2013 test set, Twitter2014, tweets collected for this years version, LiveJournal2014 consisting of formal tweets, SMS2013, a collection of sms messages,twittersarcasm, a collection of sarcastic tweets. The results of our system are shown in Table 3. The top five rows shows the results by the SemEval scorer for all the data sets used by them. This scorer took the average of F1- score of only positive and negative classes. The last row shows the weighted average score of all the scores for Task A and B from the different data sets. Our scores for Task-A and Task-B were and respectively for Twitter Our system performed better on Twitter and SMS test sets from This was reasonable since we tuned our system on these datasets. On the other hand, the system performed worst on sarcasm test set. This drop is extremely evident in Task-B where the results were dropped by 25%. To analyze the effects of each step of our system, we experimented with our system using different configurations. The results are shown in Table 4 and our analysis is described in the following subsections. The results were scored by SemEval 2014 scorer and we took the weighted average of all data sets to accurately reflect the performance of our system. We show the polarities values assigned to each token of a tweet by our classifier, in Table 5. Tokens POS Tags Sentiments Polarity This O Neutral Is V Neutral So R Neutral Awesome A - - #excited # Positive 1.84 Table 5: Polarity assigned using our classifier to each word of a Tweet message. 3.1 Preprocessing Effects We compared the effects of basic tokenization (based on white space) against the richer ARK Twitter tokenizer. The scores dropped by 3.42% and 3.91% for Task-A and Task-B, respectively. Other preprocessing enhancements like lemmatization and acronym additions also gave our system performance a boost. Again, the effects were more visible for Task-B than for Task-A. Overall, the system performance was boosted by 6.32% for Task-A and 5.32% for Task-B. Considering the overall score for Task-B, this is a significant change. 3.2 Feature Engineering Effects To analyze the effect of feature extraction process, we ran our system with different kind of features disabled - one at a time. For Task-A, unigram model and polarity based features were equally important. For Task-B, bag of words feature easily outperformed the effects of any other feature. However, polarity based features were second important class of features for our system. These suggest that if more accurate, exhaustive 189
5 and social media representative lexicons are made, it would help both tasks significantly. POS based features were not directly influential in our system. However, these tags helped us find better matches in lexicons where words are further identified with their POS tag. 3.3 Classifier Tuning We also analyzed the significance of SVM tuning to our system. Without setting any parameter to SVMutil library (Chang and Lin, 2011), we noticed a drop of 6.29% to scores of Task-A and a significant drop of 43.11% to scores of Task-B. Since the library use poly kernel by default, the results were drastically worse for Task-B due to large feature set. We also compared the performance with SVM kernel set to C=1. In this restricted setting, the results were slightly lower than the result obtained for our final system. 4 Discussion During this work, we found that two improvements to our system would have yielded better scores. The first would be lexicons: Since the lexicons like Sentiment140 Lexicon are automatically generated, we found that they contain some noise. As we noticed a drop of that our results were critically dependent on these lexicons, this noise would have resulted in incorrect predictions. Hence, more accurate and larger lexicons are required for better classification, especially for the tweet-level task. Unlike SentiWordNet these lexicons should contain more informal words that are common in social media. Additionally, as we can see our system was not able to confidently predict sarcasm tweets on both expression and message level, special attention is required to analyze the nature of sarcasm on Twitter and build a feature set that can capture the true sentiment of the tweet. 5 Conclusion We demonstrated our classification system that could predict sentiment of an input tweet. Our system performed more accurately in expressionlevel prediction than on entire tweet-level prediction. Our system relied heavily on bag-of-words feature and polarity based features which in turn relied on correct part-of-speech tagging and thirdparty lexicons. With this system, we ranked 4th in SemEval 2014 expression-level prediction task and 15th in tweet-level prediction task. Acknowledgment We would like to thank Kemal Oflazer and the shared task organizers for their support throughout this work. References Stefano Baccianella, Andrea Esuli, and Fabrizio Sebastiani SentiWordNet 3.0: An Enhanced Lexical Resource for Sentiment Analysis and Opinion Mining. In Proceedings of the Seventh conference on International Language Resources and Evaluation (LREC 10), pages , Valletta, Malta. Steven Bird, Ewan Klein, and Edward Loper Natural Language Processing with Python. O Reilly Media, Inc. Chih-Chung Chang and Chih-Jen Lin LIB- SVM: A Library for Support Vector Machines. ACM Transactions on Intelligent Systems and Technology, 2:27:1 27:27. Jared Heinly, Enrique Dunn, and Jan-Michael Frahm Comparative Evaluation of Binary Features. In Proceedings of the 12th European Conference on Computer Vision, pages , Firenze, Italy. Minqing Hu and Bing Liu Mining and Summarizing Customer Reviews. In Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages , Seattle, WA, USA. Bernard J Jansen, Mimi Zhang, Kate Sobel, and Abdur Chowdury Twitter Power: Tweets as Electronic Word of Mouth. Journal of the American society for information science and technology, 60(11): L. Ladha and T. Deepa Feature Selection Methods and Algorithms. International Journal on Computer Science and Engineering (IJCSE), 3: Saif Mohammad, Svetlana Kiritchenko, and Xiaodan Zhu NRC-Canada: Building the State-ofthe-Art in Sentiment Analysis of Tweets. In Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013), pages , Atlanta, Georgia, USA. Preslav Nakov, Sara Rosenthal, Zornitsa Kozareva, Veselin Stoyanov, Alan Ritter, and Theresa Wilson SemEval-2013 Task 2: Sentiment Analysis in Twitter. In Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013), pages , Atlanta, Georgia, USA. 190
6 Olutobi Owoputi, Brendan O Connor, Chris Dyer, Kevin Gimpel, Nathan Schneider, and Noah A. Smith Improved Part-of-Speech Tagging for Online Conversational Text with Word Clusters. In Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages , Atlanta, Georgia. Alexander Pak and Patrick Paroubek Twitter as a Corpus for Sentiment Analysis and Opinion Mining. In Proceedings of the Seventh conference on International Language Resources and Evaluation (LREC 10), pages , Valletta, Malta. Bo Pang and Lillian Lee Opinion Mining and Sentiment Analysis, volume 2. Now Publishers Inc. Sara Rosenthal, Preslav Nakov, Alan Ritter, and Veselin Stoyanov SemEval-2014 Task 9: Sentiment Analysis in Twitter. In Proceedings of the Eighth International Workshop on Semantic Evaluation (SemEval 14), Dublin, Ireland. Duyu Tang, Furu Wei, Nan Yang, Ming Zhou, Ting Liu, and Bing Qin Learning Sentiment- Specific Word Embedding for Twitter Sentiment Classification. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, pages , Baltimore, Maryland. Theresa Wilson, Janyce Wiebe, and Paul Hoffmann Recognizing Contextual Polarity in Phraselevel Sentiment Analysis. In Proceedings of the conference on human language technology and empirical methods in natural language processing, pages
IIIT-H at SemEval 2015: Twitter Sentiment Analysis The good, the bad and the neutral!
IIIT-H at SemEval 2015: Twitter Sentiment Analysis The good, the bad and the neutral! Ayushi Dalmia, Manish Gupta, Vasudeva Varma Search and Information Extraction Lab International Institute of Information
Kea: Expression-level Sentiment Analysis from Twitter Data
Kea: Expression-level Sentiment Analysis from Twitter Data Ameeta Agrawal Computer Science and Engineering York University Toronto, Canada [email protected] Aijun An Computer Science and Engineering
Improving Twitter Sentiment Analysis with Topic-Based Mixture Modeling and Semi-Supervised Training
Improving Twitter Sentiment Analysis with Topic-Based Mixture Modeling and Semi-Supervised Training Bing Xiang * IBM Watson 1101 Kitchawan Rd Yorktown Heights, NY 10598, USA [email protected] Liang Zhou
VCU-TSA at Semeval-2016 Task 4: Sentiment Analysis in Twitter
VCU-TSA at Semeval-2016 Task 4: Sentiment Analysis in Twitter Gerard Briones and Kasun Amarasinghe and Bridget T. McInnes, PhD. Department of Computer Science Virginia Commonwealth University Richmond,
NILC USP: A Hybrid System for Sentiment Analysis in Twitter Messages
NILC USP: A Hybrid System for Sentiment Analysis in Twitter Messages Pedro P. Balage Filho and Thiago A. S. Pardo Interinstitutional Center for Computational Linguistics (NILC) Institute of Mathematical
A comparison of Lexicon-based approaches for Sentiment Analysis of microblog posts
A comparison of Lexicon-based approaches for Sentiment Analysis of microblog posts Cataldo Musto, Giovanni Semeraro, Marco Polignano Department of Computer Science University of Bari Aldo Moro, Italy {cataldo.musto,giovanni.semeraro,marco.polignano}@uniba.it
Sentiment Lexicons for Arabic Social Media
Sentiment Lexicons for Arabic Social Media Saif M. Mohammad 1, Mohammad Salameh 2, Svetlana Kiritchenko 1 1 National Research Council Canada, 2 University of Alberta [email protected], [email protected],
Sentiment Analysis: a case study. Giuseppe Castellucci [email protected]
Sentiment Analysis: a case study Giuseppe Castellucci [email protected] Web Mining & Retrieval a.a. 2013/2014 Outline Sentiment Analysis overview Brand Reputation Sentiment Analysis in Twitter
SU-FMI: System Description for SemEval-2014 Task 9 on Sentiment Analysis in Twitter
SU-FMI: System Description for SemEval-2014 Task 9 on Sentiment Analysis in Twitter Boris Velichkov, Borislav Kapukaranov, Ivan Grozev, Jeni Karanesheva, Todor Mihaylov, Yasen Kiprov, Georgi Georgiev,
DiegoLab16 at SemEval-2016 Task 4: Sentiment Analysis in Twitter using Centroids, Clusters, and Sentiment Lexicons
DiegoLab16 at SemEval-016 Task 4: Sentiment Analysis in Twitter using Centroids, Clusters, and Sentiment Lexicons Abeed Sarker and Graciela Gonzalez Department of Biomedical Informatics Arizona State University
Fine-Grained Sentiment Analysis for Movie Reviews in Bulgarian
Fine-Grained Sentiment Analysis for Movie Reviews in Bulgarian Borislav Kapukaranov Faculty of Mathematics and Informatics Sofia University Bulgaria [email protected] Preslav Nakov Qatar Computing
Effect of Using Regression on Class Confidence Scores in Sentiment Analysis of Twitter Data
Effect of Using Regression on Class Confidence Scores in Sentiment Analysis of Twitter Data Itir Onal *, Ali Mert Ertugrul, Ruken Cakici * * Department of Computer Engineering, Middle East Technical University,
Sentiment Analysis of Movie Reviews and Twitter Statuses. Introduction
Sentiment Analysis of Movie Reviews and Twitter Statuses Introduction Sentiment analysis is the task of identifying whether the opinion expressed in a text is positive or negative in general, or about
TUGAS: Exploiting Unlabelled Data for Twitter Sentiment Analysis
TUGAS: Exploiting Unlabelled Data for Twitter Sentiment Analysis Silvio Amir +, Miguel Almeida, Bruno Martins +, João Filgueiras +, and Mário J. Silva + + INESC-ID, Instituto Superior Técnico, Universidade
End-to-End Sentiment Analysis of Twitter Data
End-to-End Sentiment Analysis of Twitter Data Apoor v Agarwal 1 Jasneet Singh Sabharwal 2 (1) Columbia University, NY, U.S.A. (2) Guru Gobind Singh Indraprastha University, New Delhi, India [email protected],
Sentiment analysis: towards a tool for analysing real-time students feedback
Sentiment analysis: towards a tool for analysing real-time students feedback Nabeela Altrabsheh Email: [email protected] Mihaela Cocea Email: [email protected] Sanaz Fallahkhair Email:
Sentiment Analysis of Twitter Data
Sentiment Analysis of Twitter Data Apoorv Agarwal Boyi Xie Ilia Vovsha Owen Rambow Rebecca Passonneau Department of Computer Science Columbia University New York, NY 10027 USA {apoorv@cs, xie@cs, iv2121@,
Sentiment Classification on Polarity Reviews: An Empirical Study Using Rating-based Features
Sentiment Classification on Polarity Reviews: An Empirical Study Using Rating-based Features Dai Quoc Nguyen and Dat Quoc Nguyen and Thanh Vu and Son Bao Pham Faculty of Information Technology University
Microblog Sentiment Analysis with Emoticon Space Model
Microblog Sentiment Analysis with Emoticon Space Model Fei Jiang, Yiqun Liu, Huanbo Luan, Min Zhang, and Shaoping Ma State Key Laboratory of Intelligent Technology and Systems, Tsinghua National Laboratory
Robust Sentiment Detection on Twitter from Biased and Noisy Data
Robust Sentiment Detection on Twitter from Biased and Noisy Data Luciano Barbosa AT&T Labs - Research [email protected] Junlan Feng AT&T Labs - Research [email protected] Abstract In this
Sentiment Analysis of Short Informal Texts
Journal of Artificial Intelligence Research 50 (2014) 723 762 Submitted 12/13; published 08/14 Sentiment Analysis of Short Informal Texts Svetlana Kiritchenko Xiaodan Zhu Saif M. Mohammad National Research
S-Sense: A Sentiment Analysis Framework for Social Media Sensing
S-Sense: A Sentiment Analysis Framework for Social Media Sensing Choochart Haruechaiyasak, Alisa Kongthon, Pornpimon Palingoon and Kanokorn Trakultaweekoon Speech and Audio Technology Laboratory (SPT)
Effectiveness of term weighting approaches for sparse social media text sentiment analysis
MSc in Computing, Business Intelligence and Data Mining stream. MSc Research Project Effectiveness of term weighting approaches for sparse social media text sentiment analysis Submitted by: Mulluken Wondie,
Sentiment Analysis of Microblogs
Thesis for the degree of Master in Language Technology Sentiment Analysis of Microblogs Tobias Günther Supervisor: Richard Johansson Examiner: Prof. Torbjörn Lager June 2013 Contents 1 Introduction 3 1.1
Neuro-Fuzzy Classification Techniques for Sentiment Analysis using Intelligent Agents on Twitter Data
International Journal of Innovation and Scientific Research ISSN 2351-8014 Vol. 23 No. 2 May 2016, pp. 356-360 2015 Innovative Space of Scientific Research Journals http://www.ijisr.issr-journals.org/
Sentiment analysis of Twitter microblogging posts. Jasmina Smailović Jožef Stefan Institute Department of Knowledge Technologies
Sentiment analysis of Twitter microblogging posts Jasmina Smailović Jožef Stefan Institute Department of Knowledge Technologies Introduction Popularity of microblogging services Twitter microblogging posts
SENTIMENT ANALYSIS USING BIG DATA FROM SOCIALMEDIA
SENTIMENT ANALYSIS USING BIG DATA FROM SOCIALMEDIA 1 VAISHALI SARATHY, 2 SRINIDHI.S, 3 KARTHIKA.S 1,2,3 Department of Information Technology, SSN College of Engineering Old Mahabalipuram Road, alavakkam
Improving Sentiment Analysis in Twitter Using Multilingual Machine Translated Data
Improving Sentiment Analysis in Twitter Using Multilingual Machine Translated Data Alexandra Balahur European Commission Joint Research Centre Via E. Fermi 2749 21027 Ispra (VA), Italy [email protected]
Prediction of Stock Market Shift using Sentiment Analysis of Twitter Feeds, Clustering and Ranking
382 Prediction of Stock Market Shift using Sentiment Analysis of Twitter Feeds, Clustering and Ranking 1 Tejas Sathe, 2 Siddhartha Gupta, 3 Shreya Nair, 4 Sukhada Bhingarkar 1,2,3,4 Dept. of Computer Engineering
Approaches for Sentiment Analysis on Twitter: A State-of-Art study
Approaches for Sentiment Analysis on Twitter: A State-of-Art study Harsh Thakkar and Dhiren Patel Department of Computer Engineering, National Institute of Technology, Surat-395007, India {harsh9t,dhiren29p}@gmail.com
Package syuzhet. February 22, 2015
Type Package Package syuzhet February 22, 2015 Title Extracts Sentiment and Sentiment-Derived Plot Arcs from Text Version 0.2.0 Date 2015-01-20 Maintainer Matthew Jockers Extracts
CS 229, Autumn 2011 Modeling the Stock Market Using Twitter Sentiment Analysis
CS 229, Autumn 2011 Modeling the Stock Market Using Twitter Sentiment Analysis Team members: Daniel Debbini, Philippe Estin, Maxime Goutagny Supervisor: Mihai Surdeanu (with John Bauer) 1 Introduction
Fine-grained German Sentiment Analysis on Social Media
Fine-grained German Sentiment Analysis on Social Media Saeedeh Momtazi Information Systems Hasso-Plattner-Institut Potsdam University, Germany [email protected] Abstract Expressing opinions
Towards SoMEST Combining Social Media Monitoring with Event Extraction and Timeline Analysis
Towards SoMEST Combining Social Media Monitoring with Event Extraction and Timeline Analysis Yue Dai, Ernest Arendarenko, Tuomo Kakkonen, Ding Liao School of Computing University of Eastern Finland {yvedai,
Lexical and Machine Learning approaches toward Online Reputation Management
Lexical and Machine Learning approaches toward Online Reputation Management Chao Yang, Sanmitra Bhattacharya, and Padmini Srinivasan Department of Computer Science, University of Iowa, Iowa City, IA, USA
Using Social Media for Continuous Monitoring and Mining of Consumer Behaviour
Using Social Media for Continuous Monitoring and Mining of Consumer Behaviour Michail Salampasis 1, Giorgos Paltoglou 2, Anastasia Giahanou 1 1 Department of Informatics, Alexander Technological Educational
Impact of Financial News Headline and Content to Market Sentiment
International Journal of Machine Learning and Computing, Vol. 4, No. 3, June 2014 Impact of Financial News Headline and Content to Market Sentiment Tan Li Im, Phang Wai San, Chin Kim On, Rayner Alfred,
The multilayer sentiment analysis model based on Random forest Wei Liu1, Jie Zhang2
2nd International Conference on Advances in Mechanical Engineering and Industrial Informatics (AMEII 2016) The multilayer sentiment analysis model based on Random forest Wei Liu1, Jie Zhang2 1 School of
Particular Requirements on Opinion Mining for the Insurance Business
Particular Requirements on Opinion Mining for the Insurance Business Sven Rill, Johannes Drescher, Dirk Reinel, Jörg Scheidt, Florian Wogenstein Institute of Information Systems (iisys) University of Applied
Sentiment analysis on news articles using Natural Language Processing and Machine Learning Approach.
Sentiment analysis on news articles using Natural Language Processing and Machine Learning Approach. Pranali Chilekar 1, Swati Ubale 2, Pragati Sonkambale 3, Reema Panarkar 4, Gopal Upadhye 5 1 2 3 4 5
Twitter sentiment vs. Stock price!
Twitter sentiment vs. Stock price! Background! On April 24 th 2013, the Twitter account belonging to Associated Press was hacked. Fake posts about the Whitehouse being bombed and the President being injured
Evaluating Sentiment Analysis Methods and Identifying Scope of Negation in Newspaper Articles
Evaluating Sentiment Analysis Methods and Identifying Scope of Negation in Newspaper Articles S Padmaja Dept. of CSE, UCE Osmania University Hyderabad Prof. S Sameen Fatima Dept. of CSE, UCE Osmania University
Semantic Sentiment Analysis of Twitter
Semantic Sentiment Analysis of Twitter Hassan Saif, Yulan He & Harith Alani Knowledge Media Institute, The Open University, Milton Keynes, United Kingdom The 11 th International Semantic Web Conference
Efficient Techniques for Improved Data Classification and POS Tagging by Monitoring Extraction, Pruning and Updating of Unknown Foreign Words
, pp.290-295 http://dx.doi.org/10.14257/astl.2015.111.55 Efficient Techniques for Improved Data Classification and POS Tagging by Monitoring Extraction, Pruning and Updating of Unknown Foreign Words Irfan
Text Opinion Mining to Analyze News for Stock Market Prediction
Int. J. Advance. Soft Comput. Appl., Vol. 6, No. 1, March 2014 ISSN 2074-8523; Copyright SCRG Publication, 2014 Text Opinion Mining to Analyze News for Stock Market Prediction Yoosin Kim 1, Seung Ryul
Sentiment Analysis and Topic Classification: Case study over Spanish tweets
Sentiment Analysis and Topic Classification: Case study over Spanish tweets Fernando Batista, Ricardo Ribeiro Laboratório de Sistemas de Língua Falada, INESC- ID Lisboa R. Alves Redol, 9, 1000-029 Lisboa,
Some Experiments on Modeling Stock Market Behavior Using Investor Sentiment Analysis and Posting Volume from Twitter
Some Experiments on Modeling Stock Market Behavior Using Investor Sentiment Analysis and Posting Volume from Twitter Nuno Oliveira Centro Algoritmi Dep. Sistemas de Informação Universidade do Minho 4800-058
Joint POS Tagging and Text Normalization for Informal Text
Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence (IJCAI 2015) Joint POS Tagging and Text Normalization for Informal Text Chen Li and Yang Liu University of Texas
Twitter Analytics for Insider Trading Fraud Detection
Twitter Analytics for Insider Trading Fraud Detection W-J Ketty Gann, John Day, Shujia Zhou Information Systems Northrop Grumman Corporation, Annapolis Junction, MD, USA {wan-ju.gann, john.day2, shujia.zhou}@ngc.com
Combining Lexicon-based and Learning-based Methods for Twitter Sentiment Analysis
Combining Lexicon-based and Learning-based Methods for Twitter Sentiment Analysis Lei Zhang, Riddhiman Ghosh, Mohamed Dekhil, Meichun Hsu, Bing Liu HP Laboratories HPL-2011-89 Abstract: With the booming
Twitter Sentiment Analysis of Movie Reviews using Machine Learning Techniques.
Twitter Sentiment Analysis of Movie Reviews using Machine Learning Techniques. Akshay Amolik, Niketan Jivane, Mahavir Bhandari, Dr.M.Venkatesan School of Computer Science and Engineering, VIT University,
Emoticon Smoothed Language Models for Twitter Sentiment Analysis
Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence Emoticon Smoothed Language Models for Twitter Sentiment Analysis Kun-Lin Liu, Wu-Jun Li, Minyi Guo Shanghai Key Laboratory of
A Comparative Study on Sentiment Classification and Ranking on Product Reviews
A Comparative Study on Sentiment Classification and Ranking on Product Reviews C.EMELDA Research Scholar, PG and Research Department of Computer Science, Nehru Memorial College, Putthanampatti, Bharathidasan
SENTIMENT ANALYSIS: A STUDY ON PRODUCT FEATURES
University of Nebraska - Lincoln DigitalCommons@University of Nebraska - Lincoln Dissertations and Theses from the College of Business Administration Business Administration, College of 4-1-2012 SENTIMENT
ANNLOR: A Naïve Notation-system for Lexical Outputs Ranking
ANNLOR: A Naïve Notation-system for Lexical Outputs Ranking Anne-Laure Ligozat LIMSI-CNRS/ENSIIE rue John von Neumann 91400 Orsay, France [email protected] Cyril Grouin LIMSI-CNRS rue John von Neumann 91400
A Joint Model of Feature Mining and Sentiment Analysis for Product Review Rating
A Joint Model of Feature Mining and Sentiment Analysis for Product Review Rating Jorge Carrillo de Albornoz, Laura Plaza, Pablo Gervás, and Alberto Díaz Universidad Complutense de Madrid, Departamento
Web Information Mining and Decision Support Platform for the Modern Service Industry
Web Information Mining and Decision Support Platform for the Modern Service Industry Binyang Li 1,2, Lanjun Zhou 2,3, Zhongyu Wei 2,3, Kam-fai Wong 2,3,4, Ruifeng Xu 5, Yunqing Xia 6 1 Dept. of Information
Can Twitter provide enough information for predicting the stock market?
Can Twitter provide enough information for predicting the stock market? Maria Dolores Priego Porcuna Introduction Nowadays a huge percentage of financial companies are investing a lot of money on Social
Sentiment analysis on tweets in a financial domain
Sentiment analysis on tweets in a financial domain Jasmina Smailović 1,2, Miha Grčar 1, Martin Žnidaršič 1 1 Dept of Knowledge Technologies, Jožef Stefan Institute, Ljubljana, Slovenia 2 Jožef Stefan International
Adapting Sentiment Lexicons using Contextual Semantics for Sentiment Analysis of Twitter
Adapting Sentiment Lexicons using Contextual Semantics for Sentiment Analysis of Twitter Hassan Saif, 1 Yulan He, 2 Miriam Fernandez 1 and Harith Alani 1 1 Knowledge Media Institute, The Open University,
Entity-centric Sentiment Analysis on Twitter data for the Potuguese Language
Entity-centric Sentiment Analysis on Twitter data for the Potuguese Language Marlo Souza 1, Renata Vieira 2 1 Instituto de Informática Universidade Federal do Rio Grande do Sul - UFRGS Porto Alegre RS
Data Mining for Tweet Sentiment Classification
Master Thesis Data Mining for Tweet Sentiment Classification Author: Internal Supervisors: R. de Groot dr. A.J. Feelders [email protected] prof. dr. A.P.J.M. Siebes External Supervisors: E. Drenthen
Evaluation Datasets for Twitter Sentiment Analysis
Evaluation Datasets for Twitter Sentiment Analysis A survey and a new dataset, the STS-Gold Hassan Saif 1, Miriam Fernandez 1, Yulan He 2 and Harith Alani 1 1 Knowledge Media Institute, The Open University,
EFFICIENTLY PROVIDE SENTIMENT ANALYSIS DATA SETS USING EXPRESSIONS SUPPORT METHOD
EFFICIENTLY PROVIDE SENTIMENT ANALYSIS DATA SETS USING EXPRESSIONS SUPPORT METHOD 1 Josephine Nancy.C, 2 K Raja. 1 PG scholar,department of Computer Science, Tagore Institute of Engineering and Technology,
ARABIC SENTENCE LEVEL SENTIMENT ANALYSIS
The American University in Cairo School of Science and Engineering ARABIC SENTENCE LEVEL SENTIMENT ANALYSIS A Thesis Submitted to The Department of Computer Science and Engineering In partial fulfillment
SemEval-2015 Task 10: Sentiment Analysis in Twitter
SemEval-2015 Task 10: Sentiment Analysis in Twitter Sara Rosenthal Columbia University [email protected] Preslav Nakov Qatar Computing Research Institute [email protected] Svetlana Kiritchenko National
Sentiment analysis for news articles
Prashant Raina Sentiment analysis for news articles Wide range of applications in business and public policy Especially relevant given the popularity of online media Previous work Machine learning based
PREDICTING MARKET VOLATILITY FEDERAL RESERVE BOARD MEETING MINUTES FROM
PREDICTING MARKET VOLATILITY FROM FEDERAL RESERVE BOARD MEETING MINUTES Reza Bosagh Zadeh and Andreas Zollmann Lab Advisers: Noah Smith and Bryan Routledge GOALS Make Money! Not really. Find interesting
Integrating NLTK with the Hadoop Map Reduce Framework 433-460 Human Language Technology Project
Integrating NLTK with the Hadoop Map Reduce Framework 433-460 Human Language Technology Project Paul Bone [email protected] June 2008 Contents 1 Introduction 1 2 Method 2 2.1 Hadoop and Python.........................
Sentiment Analysis. D. Skrepetos 1. University of Waterloo. NLP Presenation, 06/17/2015
Sentiment Analysis D. Skrepetos 1 1 Department of Computer Science University of Waterloo NLP Presenation, 06/17/2015 D. Skrepetos (University of Waterloo) Sentiment Analysis NLP Presenation, 06/17/2015
Reputation Management System
Reputation Management System Mihai Damaschin Matthijs Dorst Maria Gerontini Cihat Imamoglu Caroline Queva May, 2012 A brief introduction to TEX and L A TEX Abstract Chapter 1 Introduction Word-of-mouth
A Sentiment Analysis Model Integrating Multiple Algorithms and Diverse. Features. Thesis
A Sentiment Analysis Model Integrating Multiple Algorithms and Diverse Features Thesis Presented in Partial Fulfillment of the Requirements for the Degree Master of Science in the Graduate School of The
Multilanguage sentiment-analysis of Twitter data on the example of Swiss politicians
Multilanguage sentiment-analysis of Twitter data on the example of Swiss politicians Lucas Brönnimann University of Applied Science Northwestern Switzerland, CH-5210 Windisch, Switzerland Email: [email protected]
SentiCircles for Contextual and Conceptual Semantic Sentiment Analysis of Twitter
SentiCircles for Contextual and Conceptual Semantic Sentiment Analysis of Twitter Hassan Saif, 1 Miriam Fernandez, 1 Yulan He 2 and Harith Alani 1 1 Knowledge Media Institute, The Open University, United
Subjectivity and Sentiment Analysis of Arabic Twitter Feeds with Limited Resources
Subjectivity and Sentiment Analysis of Arabic Twitter Feeds with Limited Resources Eshrag Refaee and Verena Rieser Interaction Lab, Heriot-Watt University, EH144AS Edinburgh, Untied Kingdom. [email protected],
Bing Liu. Web Data Mining. Exploring Hyperlinks, Contents, and Usage Data. With 177 Figures. ~ Spring~r
Bing Liu Web Data Mining Exploring Hyperlinks, Contents, and Usage Data With 177 Figures ~ Spring~r Table of Contents 1. Introduction.. 1 1.1. What is the World Wide Web? 1 1.2. ABrief History of the Web
Gated Neural Networks for Targeted Sentiment Analysis
Gated Neural Networks for Targeted Sentiment Analysis Meishan Zhang 1,2 and Yue Zhang 2 and Duy-Tin Vo 2 1. School of Computer Science and Technology, Heilongjiang University, Harbin, China 2. Singapore
An Introduction to Data Mining
An Introduction to Intel Beijing [email protected] January 17, 2014 Outline 1 DW Overview What is Notable Application of Conference, Software and Applications Major Process in 2 Major Tasks in Detail
FEATURE SELECTION AND CLASSIFICATION APPROACH FOR SENTIMENT ANALYSIS
FEATURE SELECTION AND CLASSIFICATION APPROACH FOR SENTIMENT ANALYSIS Gautami Tripathi 1 and Naganna S. 2 1 PG Scholar, School of Computing Science and Engineering, Galgotias University, Greater Noida,
Sentiment Analysis for Movie Reviews
Sentiment Analysis for Movie Reviews Ankit Goyal, [email protected] Amey Parulekar, [email protected] Introduction: Movie reviews are an important way to gauge the performance of a movie. While providing
