Sentiment Lexicons for Arabic Social Media
|
|
|
- Rodger Hawkins
- 9 years ago
- Views:
Transcription
1 Sentiment Lexicons for Arabic Social Media Saif M. Mohammad 1, Mohammad Salameh 2, Svetlana Kiritchenko 1 1 National Research Council Canada, 2 University of Alberta [email protected], [email protected], [email protected] Abstract Existing Arabic sentiment lexicons have low coverage only a few thousand entries. In this paper, we present several large sentiment lexicons that were automatically generated using two different methods: (1) by using distant supervision techniques on Arabic tweets, and (2) by translating English sentiment lexicons into Arabic using a freely available statistical machine translation system. We compare the usefulness of new and old sentiment lexicons in the downstream application of sentence-level sentiment analysis. Our baseline sentiment analysis system uses numerous surface form features. Nonetheless, the system benefits from using additional features drawn from sentiment lexicons. The best result is obtained using the automatically generated Dialectal Hashtag Lexicon and the Arabic translation of the NRC Emotion Lexicon (accuracy of 66.6%). Finally, we describe a qualitative study of the automatic translations of English sentiment lexicons into Arabic, which shows that about 88% of the automatically translated entries are valid for Arabic as well. Close to 10% of the invalid entries are the result of gross mistranslation, close to 40% are due to translation into a related word, and about 50% are due to differences in how the word is used in Arabic. Keywords: Arabic sentiment lexicons, sentiment analysis, social media, Arabic texts, valence, opinion, translation 1. Introduction Sentiment lexicons are lists of positive and negative words, optionally with a score indicating the degree of polarity. Sentiment analysis systems, including the best performing ones such as the NRC-Canada system (Mohammad et al., 2013; Kiritchenko et al., 2014a; Zhu et al., 2014), use sentiment lexicons to obtain significant improvements (Wilson et al., 2013; Pontiki et al., 2014; Rosenthal et al., 2015; Mohammad et al., 2016a). However, much of the past work has focused on English texts and English sentiment lexicons. Arabic sentiment analysis has benefited from recent work (Farra et al., 2010; Abdul-Mageed et al., 2011; Badaro et al., 2014; Refaee and Rieser, 2014), but a resource that is particularly lacking is a large sentiment lexicon. Existing lexicons contain only a few hundred to a few thousand entries. Further, most do not indicate the degree of association between a word and positive (or negative) sentiment. Kiritchenko et al. (2016) created a manually annotated Arabic sentiment lexicon with real-valued sentiment scores; however, that lexicon too has only around one thousand entries. In this paper, we describe how we created Arabic sentiment lexicons with tens of thousands of entries. These include new automatically generated ones as well as translations of existing English lexicons. We use the lexicons (old and new) for sentiment classification of Arabic social media posts. Our baseline system uses numerous surface form features. Nonetheless, the system benefits from using additional features drawn from sentiment lexicons. We also show the extent to which various sentiment lexicons are effective in sentiment analysis. The best result was obtained using both the Dialectal Hashtag Lexicon and the translated NRC Emotion Lexicon (an accuracy of 66.6%). Finally, we present a study that qualitatively examines the automatically generated Arabic translations of entries in an English sentiment lexicon. A native speaker of Arabic determined whether the automatic translations were appropriate. An appropriate entry is an Arabic translation that has the same sentiment association as its English source word. Translated entries that were deemed incorrect were further classified into coarse error categories. The study showed that about 88% of the automatically translated entries are valid for Arabic as well. Close to 10% of the errors were caused by gross mistranslations, close to 40% by translations into a related word, and about 50% by differences in how the word is used in Arabic. For further analysis of how translation of words and sentences alters their sentiment, we refer the reader to Mohammad et al. (2016b) and Salameh et al. (2015). All of the lexicons we created are made freely available Generating Arabic Sentiment Lexicons We created Arabic sentiment lexicons automatically using two different methods: (1) by using distant supervision techniques on Arabic tweets, and (2) by translating English sentiment lexicons into Arabic using Google Translate a freely available statistical machine translation system. 2 Even though Google Translate is a phrase-based statistical machine translation system that is primarily designed to translate sentences, it can also provide one-word translations. These translations are often the word representing the predominant sense of the word in the source language. Table 1 lists the number of entries in some of the existing Arabic sentiment lexicons. Table 2 lists the number of entries in each of the lexicons we created. Note that the manually created Kiritchenko et al. (2016) lexicon, and the automatically generated Arabic lexicons (Table 2 - a.i., a.ii., a.iii.) have real-valued sentiment association scores. For the purposes of these tables, terms with scores less than 0 are considered negative and those with scores greater than or equal to 0 are considered positive New Arabic Sentiment Lexicons The emoticons and hashtag words in a tweet can often act as sentiment labels for the rest of the tweet. We use this idea, commonly referred to as distant supervision (Go et al., 2009), to generate three Arabic sentiment lexicons: Google Translate: 33
2 Arabic Emoticon Lexicon: We collected close to one million Arabic tweets that had emoticons ( :) or :( ). For the purposes of generating a sentiment lexicon, :) was considered a positive label (pos) and :( was considered a negative label (neg). For each word w, that occurred at least five times in these tweets, a sentiment score was calculated using the formula shown below (proposed earlier in Mohammad et al. (2013) and Kiritchenko et al. (2014b)): SentimentScore(w) = P MI(w, pos) P MI(w, neg) (1) where PMI stands for Pointwise Mutual Information. We refer to the resulting entries as the Arabic Emoticon Lexicon. Arabic Hashtag Lexicon: The NRC-Canada system used 77 positive and negative seed words to generate the English NRC Hashtag Sentiment Lexicon (Mohammad et al., 2013; Kiritchenko et al., 2014b). We translated these English seeds into Arabic using Google Translate. Among the translations provided, we chose words that were less ambiguous and tended to have strong sentiment in Arabic texts. We polled the Twitter API to collect tweets that included these seed words as hashtags. For the purposes of generating a sentiment lexicon, a positive seed hashtag was considered a positive label (pos) and a negative seed hashtag was considered a negative label (neg). For each word w that occurred at least five times in these tweets, we calculated a sentiment score using Equation 1. We will refer to this lexicon as the Arabic Hashtag Lexicon. Arabic Hashtag Lexicon (Dialectal): Refaee and Rieser (2014) manually created a small sentiment lexicon of 483 dialectal Arabic sentiment words from tweets. We used these words as seeds to collect tweets that contain them, and generated a PMI-based sentiment lexicon just as described above. We refer to this lexicon as the Dialectal Arabic Hashtag Lexicon or Arabic Hashtag Lexicon (dialectal). In Section 3, we show how we used these lexicons for sentiment analysis Generating Arabic Translations of English Sentiment Lexicons We used Google Translate to translate into Arabic the words in each of the following English sentiment lexicons: AFINN (Nielsen, 2011), Bing Liu Lexicon (Hu and Liu, 2004), MPQA Subjectivity Lexicon (Wilson et al., 2005), NRC Emotion Lexicon (Mohammad and Turney, 2010; Mohammad and Turney, 2013), NRC Emoticon Lexicon aka Sentiment140 Lexicon (Mohammad et al., 2013; Kiritchenko et al., 2014b), and NRC Hashtag Sentiment Lexicon (Mohammad et al., 2013; Kiritchenko et al., 2014b). Note that Google Translate was unable to translate some words in these lexicons. Table 2 gives the number of words translated (total) as well as a break down by sentiment category (positive, negative, and neutral). In Section 4, we present a study that manually examines a subset of the automatic translations. 3. Arabic Sentiment Analysis To determine the usefulness of the Arabic sentiment lexicons, we apply them in a sentence-level sentiment analysis system. We build an Arabic sentence-level sentiment analysis system by reconstructing the NRC-Canada English system (Mohammad et al., 2013; Kiritchenko et al., 2014b) to deal with Arabic text. A linear-kernel Support Vector Machine (Chang and Lin, 2011) classifier is trained on the available training data. The classifier leverages a variety of surface-form and sentiment lexicon features. The surface form features include the presence/absence of word and character ngrams, all-cap words, hashtags, and punctuation marks. The sentiment lexicon features are derived from lexicons described in Section 2. They include the number of sentiment words with non-zero sentiment score, the sum of sentiment scores of positive words (and separately negative words), and the sentiment score of the last token. We preprocess Arabic text by tokenizing with the CMU Twitter NLP tool to deal with specific tokens such as URLs, usernames, and emoticons. Then we use MADA to generate lemmas (Habash et al., 2009). Finally, we normalize different forms of Alif and Ya to bare Alif and dotless Ya. We chose, for our experiments, an existing Arabic social media dataset the BBN Arabic Dialectal Text (Zbib et al., 2012). 3 It contains sentences from the blog posts, with a mixture of expressions from the Levantine dialect of Arabic as well as Modern Standard Arabic. We randomly selected a subset of 1200 sentences, which we will refer to as the BBN posts or BBN dataset, and annotated them for sentiment on the crowdsourcing platform CrowdFlower. 4 Table 3 shows ten-fold cross-validation accuracies obtained on the BBN dataset by our Arabic sentiment analysis system. Row a. shows results obtained using the various surface-form features as baseline. The rows within b. and c. show accuracies obtained by adding to the baseline system features derived from the Arabic sentiment lexicons and Arabic translations of English lexicons, respectively. Observe that existing manually created sentiment lexicons (Abdul-Mageed et al. (2011) lexicon, Refaee and Rieser (2014) lexicon, and Kiritchenko et al. (2016) lexicon) provide only small improvements. While some of the automatically generated Arabic sentiment lexicons provide similar gains to the manual ones (accuracies around 63%), the Arabic Hashtag Lexicon (dialectal) helps obtain marked improvements in accuracy (65.3%). Arabic translations of English sentiment lexicons also improve accuracies over the baseline, but not to the extent of the Dialectal Arabic Hashtag Lexicon. This is probably because the dialectal lexicon has dialectal Arabic terms, which are commonly used in social media texts. The dialectal lexicon also has a much larger set of terms than any of the manually created lexicons. Further, translation of English lexicon entries can lead to errors. The best results obtained with translations of English lexicons are from using the translated NRC Emotion Lexicon. Using both the Dialectal Hashtag Lexicon and the translated NRC Emotion Lexicon gives us the best results overall (66.6%)
3 Resource a. Arabic sentiment lexicons Number of instances positive negative neutral total i. Abdul-Mageed et al. (2011) Lexicon ,982 ii. Refaee and Rieser (2014) Lexicon iii. Kiritchenko et al. (2016) Lexicon ,168 Table 1: Existing Arabic sentiment lexicons. Resource a. Arabic sentiment lexicons Number of instances positive negative neutral total i. Arabic Emoticon Lexicon 22,962 20,342-43,304 ii. Arabic Hashtag Lexicon 13,118 8,846-21,964 iii. Arabic Hashtag Lexicon (dialectal) 11,941 8,179-20,128 b. English lexicons translated into Arabic i. AFINN 878 1,598-2,476 ii. Bing Liu s Lexicon 2,006 4,783-6,789 iii. MPQA Subjectivity Lexicon 2,718 4, ,199 iv. NRC Emotion Lexicon 2,317 3,338 8,527 14,182 v. NRC Emoticon Lexicon 15,210 11,530-26,740 vi. NRC Hashtag Lexicon 18,341 14,241-32,582 Table 2: New Arabic sentiment lexicons created as part of this project. System Accuracy (in percentage) a. Baseline (uses word ngrams and other surface form features) 62.0 b. Baseline + Arabic lexicon i. Abdul-Mageed et al. (2011) Lexicon 62.2 ii. Refaee and Rieser (2014) Lexicon 63.0 iii. Kiritchenko et al. (2016) Lexicon 62.7 iv. the Arabic Emoticon Lexicon 62.4 v. the Arabic Hashtag Lexicon 63.0 vi. the Arabic Hashtag Lexicon (dialectal) 65.3 vii. lexicon features from iv., v., and vi c. Baseline + Arabic translation of English lexicon i. English lexicon: AFINN 63.4 ii. English lexicon: Bing Liu Lexicon 63.0 iii. English lexicon: MPQA 61.9 iv. English lexicon: NRC Emotion Lexicon 63.5 v. English lexicon: NRC Emoticon Lexicon 62.4 vi. English lexicon: NRC Hashtag Lexicon 61.7 d. Baseline + Arabic Hashtag Lexicon (dialectal) + Arabic translation of NRC Emotion Lexicon 66.6 Table 3: Sentiment classification accuracies on the BBN sentences. Highest scores in b., c., and d. are shown in bold. 35
4 Before Translation After Translation # English Entries # positive # negative # neutral # changed positive (15.0%) negative (08.0%) neutral (12.0%) All (11.7%) Table 4: Annotations of NRC Emotion Lexicon s sentiment association entries after automatic translation into Arabic. Percentage of Error categories total errors 1. Mistranslated Translated to a related word Translation correct, but 3a., 3b., or 3c a. Different dominant sense b. Cultural differences c. Other reasons 0.0 Table 5: Percentage of erroneous entries assigned to each error category. 4. A Manual Study of Automatically Translated Sentiment Entries As shown above, lexicons created by translating existing ones in other languages can be beneficial for automatic sentiment analysis. However, the above experiments do not explicitly quantify the extent to which such translated entries are appropriate, and how translation alters the sentiment of the source word. We conducted a small manual annotation study of 300 entries from the NRC Emotion Lexicon to determine the percentage of entries that were appropriate even after automatic translation into the focus language (Arabic). An appropriate entry is an Arabic translation that has the same sentiment association as its English source word. Additionally, translated entries that were deemed incorrect for Arabic were classified into coarse error categories. A list of pre-decided error categories was presented to the annotator, but the annotator was also encouraged to create new error categories as appropriate. The error categories provided are shown below: 1. The word is completely mistranslated. 2. The translation is not perfect, but the English word is translated into a word related to the correct translation. The Arabic word provided has a different sentiment than the English source word. 3. The translation is correct, but the Arabic word has a different sentiment than the English source word. (a) The dominant sense of the Arabic word is different from the dominant sense of the English source word, and they have different sentiments. (b) Cultural and life style differences between Arabic and English speakers lead to different sentiment associations of the English word and its translation. (c) Some other reason (give reason if you can). The annotator was a native speaker of Arabic, who was also fluent in English. We chose the NRC Emotion Lexicon for the study because it was manually created and because it has neutral terms as well (in addition to positive and negative terms). Since manual annotation is tedious, for this study, we randomly selected 100 positive words, 100 negative words, and 100 neutral words from the lexicon. Table 4 shows the results of the human annotation study. Of the 100 positive entries examined, 85 entries were marked as appropriate in Arabic as well. Nine of the translations were marked as being negative in Arabic, and six were marked as neutral. Similarly, 92% of the translated negative entries and 88% of the translated neutral entries were marked appropriate in Arabic. Overall, 11.7% of the translated entries were deemed incorrect for Arabic. Table 5 gives the percentage of erroneous entries assigned to each error category. Observe that close to 10% of the errors are caused by gross mistranslations, close to 40% of the errors are caused by translations into a related word, and about 50% of the errors are caused, not by bad translation, but by differences in how the word is used in Arabic either because of different sense distributions (29%) or because of cultural differences (22.6%). 5. Conclusions We created new Arabic sentiment lexicons using techniques of distant supervision, and showed their usefulness in the sentiment analysis of social media posts. We also translated existing English sentiment lexicons (four manually created ones and two that were created automatically) into Arabic using Google Translate. We showed that the lexicons improve performance over and above a competitive baseline classifier that uses various surface-form features. The Arabic Dialectal Hashtag Lexicon was especially useful, but adding features from translated lexicons further improved classification accuracy. Finally, we analyzed a subset of the automatically translated sentiment lexicon entries to show the extent to which sentiment is preserved after translation. We also identified the different reasons that can lead to erroneous entries in the translated lexicon. All of our lexicons are made freely available. 36
5 6. Bibliographical References Abdul-Mageed, M., Diab, M. T., and Korayem, M. (2011). Subjectivity and sentiment analysis of Modern Standard Arabic. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics, pages , Portland, OR. Badaro, G., Baly, R., Hajj, H., Habash, N., and El-Hajj, W. (2014). A large scale Arabic sentiment lexicon for Arabic opinion mining. In Proceedings of the EMNLP Workshop on Arabic Natural Language Processing (ANLP), pages , Doha, Qatar. Chang, C.-C. and Lin, C.-J. (2011). LIBSVM: A library for support vector machines. ACM Transactions on Intelligent Systems and Technology, 2(3):27:1 27:27. Farra, N., Challita, E., Assi, R. A., and Hajj, H. (2010). Sentence-level and document-level sentiment mining for Arabic texts. In Proceedings of the IEEE International Conference on Data Mining Workshops, pages IEEE. Go, A., Bhayani, R., and Huang, L. (2009). Twitter sentiment classification using distant supervision. Technical report, Stanford University. Habash, N., Rambow, O., and Roth, R. (2009). MADA+TOKAN: A toolkit for Arabic tokenization, diacritization, morphological disambiguation, POS tagging, stemming and lemmatization. In Proceedings of the 2nd International Conference on Arabic Language Resources and Tools, pages , Cairo, Egypt. Hu, M. and Liu, B. (2004). Mining and summarizing customer reviews. In Proceedings of the 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 04, pages , New York, NY, USA. Kiritchenko, S., Zhu, X., Cherry, C., and Mohammad, S. (2014a). NRC-Canada-2014: Detecting aspects and sentiment in customer reviews. In Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014), pages , Dublin, Ireland, August. Kiritchenko, S., Zhu, X., and Mohammad, S. M. (2014b). Sentiment analysis of short informal texts. Journal of Artificial Intelligence Research, 50: Kiritchenko, S., Mohammad, S. M., and Salameh, M. (2016). SemEval-2016 Task 7: Determining sentiment intensity of English and Arabic phrases. In Proceedings of the International Workshop on Semantic Evaluation, SemEval 16, San Diego, California. Mohammad, S. M. and Turney, P. D. (2010). Emotions evoked by common words and phrases: Using Mechanical Turk to create an emotion lexicon. In Proceedings of the NAACL-HLT Workshop on Computational Approaches to Analysis and Generation of Emotion in Text, pages 26 34, LA, California. Mohammad, S. M. and Turney, P. D. (2013). Crowdsourcing a word-emotion association lexicon. Computational Intelligence, 29(3): Mohammad, S. M., Kiritchenko, S., and Zhu, X. (2013). NRC-Canada: Building the state-of-the-art in sentiment analysis of tweets. In Proceedings of the 7th International Workshop on Semantic Evaluation Exercises, SemEval 13, Atlanta, Georgia, USA, June. Mohammad, S. M., Kiritchenko, S., Sobhani, P., Zhu, X., and Cherry, C. (2016a). SemEval-2016 Task 6: Detecting stance in tweets. In Proceedings of the International Workshop on Semantic Evaluation, SemEval 16, San Diego, California, June. Mohammad, S. M., Salameh, M., and Kiritchenko, S. (2016b). How translation alters sentiment. Journal of Artificial Intelligence Research, 55: Nielsen, F. Å. (2011). A new ANEW: Evaluation of a word list for sentiment analysis in microblogs. In Proceedings of the ESWC2011 Workshop on Making Sense of Microposts : Big things come in small packages, pages Pontiki, M., Galanis, D., Pavlopoulos, J., Papageorgiou, H., Androutsopoulos, I., and Manandhar, S. (2014). SemEval-2014 Task 4: Aspect based sentiment analysis. In Proceedings of the International Workshop on Semantic Evaluation, SemEval 14, pages 27 35, Dublin, Ireland, August. Refaee, E. and Rieser, V. (2014). An Arabic Twitter corpus for subjectivity and sentiment analysis. In Proceedings of the 9th International Conference on Language Resources and Evaluation, Reykjavik, Iceland. Rosenthal, S., Nakov, P., Kiritchenko, S., Mohammad, S. M., Ritter, A., and Stoyanov, V. (2015). SemEval Task 10: Sentiment analysis in Twitter. In Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015), pages , Denver, Colorado, June. Salameh, M., Mohammad, S. M., and Kiritchenko, S. (2015). Sentiment after translation: A case-study on Arabic social media posts. In Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages , Denver, Colorado. Wilson, T., Wiebe, J., and Hoffmann, P. (2005). Recognizing contextual polarity in phrase-level sentiment analysis. In Proceedings of the Conference on Human Language Technology and Empirical Methods in Natural Language Processing, HLT 05, pages , Stroudsburg, PA, USA. Wilson, T., Kozareva, Z., Nakov, P., Rosenthal, S., Stoyanov, V., and Ritter, A. (2013). SemEval-2013 Task 2: Sentiment analysis in Twitter. In Proceedings of the International Workshop on Semantic Evaluation, SemEval 13, Atlanta, Georgia, USA, June. Zbib, R., Malchiodi, E., Devlin, J., Stallard, D., Matsoukas, S., Schwartz, R., Makhoul, J., Zaidan, O. F., and Callison-Burch, C. (2012). Machine translation of Arabic dialects. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages Zhu, X., Kiritchenko, S., and Mohammad, S. (2014). NRC-Canada-2014: Recent improvements in the sentiment analysis of tweets. In Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014), pages , Dublin, Ireland, August. 37
Improving Twitter Sentiment Analysis with Topic-Based Mixture Modeling and Semi-Supervised Training
Improving Twitter Sentiment Analysis with Topic-Based Mixture Modeling and Semi-Supervised Training Bing Xiang * IBM Watson 1101 Kitchawan Rd Yorktown Heights, NY 10598, USA [email protected] Liang Zhou
Subjectivity and Sentiment Analysis of Arabic Twitter Feeds with Limited Resources
Subjectivity and Sentiment Analysis of Arabic Twitter Feeds with Limited Resources Eshrag Refaee and Verena Rieser Interaction Lab, Heriot-Watt University, EH144AS Edinburgh, Untied Kingdom. [email protected],
IIIT-H at SemEval 2015: Twitter Sentiment Analysis The good, the bad and the neutral!
IIIT-H at SemEval 2015: Twitter Sentiment Analysis The good, the bad and the neutral! Ayushi Dalmia, Manish Gupta, Vasudeva Varma Search and Information Extraction Lab International Institute of Information
Kea: Expression-level Sentiment Analysis from Twitter Data
Kea: Expression-level Sentiment Analysis from Twitter Data Ameeta Agrawal Computer Science and Engineering York University Toronto, Canada [email protected] Aijun An Computer Science and Engineering
VCU-TSA at Semeval-2016 Task 4: Sentiment Analysis in Twitter
VCU-TSA at Semeval-2016 Task 4: Sentiment Analysis in Twitter Gerard Briones and Kasun Amarasinghe and Bridget T. McInnes, PhD. Department of Computer Science Virginia Commonwealth University Richmond,
Package syuzhet. February 22, 2015
Type Package Package syuzhet February 22, 2015 Title Extracts Sentiment and Sentiment-Derived Plot Arcs from Text Version 0.2.0 Date 2015-01-20 Maintainer Matthew Jockers Extracts
An Arabic Twitter Corpus for Subjectivity and Sentiment Analysis
An Arabic Twitter Corpus for Subjectivity and Sentiment Analysis Eshrag Refaee and Verena Rieser Interaction Lab, Heriot-Watt University, EH144AS Edinburgh, Untied Kingdom. [email protected], [email protected]
CMUQ@Qatar:Using Rich Lexical Features for Sentiment Analysis on Twitter
CMUQ@Qatar:Using Rich Lexical Features for Sentiment Analysis on Twitter Sabih Bin Wasi, Rukhsar Neyaz, Houda Bouamor, Behrang Mohit Carnegie Mellon University in Qatar {sabih, rukhsar, hbouamor, behrang}@cmu.edu
Benchmarking Machine Translated Sentiment Analysis for Arabic Tweets
Benchmarking Machine Translated Sentiment Analysis for Arabic Tweets Eshrag Refaee Interaction Lab, Heriot-Watt University EH144AS Edinburgh, UK [email protected] Verena Rieser Interaction Lab, Heriot-Watt
Systematic Comparison of Professional and Crowdsourced Reference Translations for Machine Translation
Systematic Comparison of Professional and Crowdsourced Reference Translations for Machine Translation Rabih Zbib, Gretchen Markiewicz, Spyros Matsoukas, Richard Schwartz, John Makhoul Raytheon BBN Technologies
SU-FMI: System Description for SemEval-2014 Task 9 on Sentiment Analysis in Twitter
SU-FMI: System Description for SemEval-2014 Task 9 on Sentiment Analysis in Twitter Boris Velichkov, Borislav Kapukaranov, Ivan Grozev, Jeni Karanesheva, Todor Mihaylov, Yasen Kiprov, Georgi Georgiev,
End-to-End Sentiment Analysis of Twitter Data
End-to-End Sentiment Analysis of Twitter Data Apoor v Agarwal 1 Jasneet Singh Sabharwal 2 (1) Columbia University, NY, U.S.A. (2) Guru Gobind Singh Indraprastha University, New Delhi, India [email protected],
Sentiment analysis: towards a tool for analysing real-time students feedback
Sentiment analysis: towards a tool for analysing real-time students feedback Nabeela Altrabsheh Email: [email protected] Mihaela Cocea Email: [email protected] Sanaz Fallahkhair Email:
Sentiment analysis for news articles
Prashant Raina Sentiment analysis for news articles Wide range of applications in business and public policy Especially relevant given the popularity of online media Previous work Machine learning based
Sentiment Analysis of Short Informal Texts
Journal of Artificial Intelligence Research 50 (2014) 723 762 Submitted 12/13; published 08/14 Sentiment Analysis of Short Informal Texts Svetlana Kiritchenko Xiaodan Zhu Saif M. Mohammad National Research
Sentiment Analysis: a case study. Giuseppe Castellucci [email protected]
Sentiment Analysis: a case study Giuseppe Castellucci [email protected] Web Mining & Retrieval a.a. 2013/2014 Outline Sentiment Analysis overview Brand Reputation Sentiment Analysis in Twitter
NILC USP: A Hybrid System for Sentiment Analysis in Twitter Messages
NILC USP: A Hybrid System for Sentiment Analysis in Twitter Messages Pedro P. Balage Filho and Thiago A. S. Pardo Interinstitutional Center for Computational Linguistics (NILC) Institute of Mathematical
Robust Sentiment Detection on Twitter from Biased and Noisy Data
Robust Sentiment Detection on Twitter from Biased and Noisy Data Luciano Barbosa AT&T Labs - Research [email protected] Junlan Feng AT&T Labs - Research [email protected] Abstract In this
Sentiment Analysis of Twitter Data
Sentiment Analysis of Twitter Data Apoorv Agarwal Boyi Xie Ilia Vovsha Owen Rambow Rebecca Passonneau Department of Computer Science Columbia University New York, NY 10027 USA {apoorv@cs, xie@cs, iv2121@,
Sentiment analysis on tweets in a financial domain
Sentiment analysis on tweets in a financial domain Jasmina Smailović 1,2, Miha Grčar 1, Martin Žnidaršič 1 1 Dept of Knowledge Technologies, Jožef Stefan Institute, Ljubljana, Slovenia 2 Jožef Stefan International
Fine-Grained Sentiment Analysis for Movie Reviews in Bulgarian
Fine-Grained Sentiment Analysis for Movie Reviews in Bulgarian Borislav Kapukaranov Faculty of Mathematics and Informatics Sofia University Bulgaria [email protected] Preslav Nakov Qatar Computing
Microblog Sentiment Analysis with Emoticon Space Model
Microblog Sentiment Analysis with Emoticon Space Model Fei Jiang, Yiqun Liu, Huanbo Luan, Min Zhang, and Shaoping Ma State Key Laboratory of Intelligent Technology and Systems, Tsinghua National Laboratory
Automatic Identification of Arabic Language Varieties and Dialects in Social Media
Automatic Identification of Arabic Language Varieties and Dialects in Social Media Fatiha Sadat University of Quebec in Montreal, 201 President Kennedy, Montreal, QC, Canada [email protected] Farnazeh
Semantic Sentiment Analysis of Twitter
Semantic Sentiment Analysis of Twitter Hassan Saif, Yulan He & Harith Alani Knowledge Media Institute, The Open University, Milton Keynes, United Kingdom The 11 th International Semantic Web Conference
The Arabic Online Commentary Dataset: an Annotated Dataset of Informal Arabic with High Dialectal Content
The Arabic Online Commentary Dataset: an Annotated Dataset of Informal Arabic with High Dialectal Content Omar F. Zaidan and Chris Callison-Burch Dept. of Computer Science, Johns Hopkins University Baltimore,
Sentiment analysis of Twitter microblogging posts. Jasmina Smailović Jožef Stefan Institute Department of Knowledge Technologies
Sentiment analysis of Twitter microblogging posts Jasmina Smailović Jožef Stefan Institute Department of Knowledge Technologies Introduction Popularity of microblogging services Twitter microblogging posts
TUGAS: Exploiting Unlabelled Data for Twitter Sentiment Analysis
TUGAS: Exploiting Unlabelled Data for Twitter Sentiment Analysis Silvio Amir +, Miguel Almeida, Bruno Martins +, João Filgueiras +, and Mário J. Silva + + INESC-ID, Instituto Superior Técnico, Universidade
S-Sense: A Sentiment Analysis Framework for Social Media Sensing
S-Sense: A Sentiment Analysis Framework for Social Media Sensing Choochart Haruechaiyasak, Alisa Kongthon, Pornpimon Palingoon and Kanokorn Trakultaweekoon Speech and Audio Technology Laboratory (SPT)
Sentiment Analysis and Topic Classification: Case study over Spanish tweets
Sentiment Analysis and Topic Classification: Case study over Spanish tweets Fernando Batista, Ricardo Ribeiro Laboratório de Sistemas de Língua Falada, INESC- ID Lisboa R. Alves Redol, 9, 1000-029 Lisboa,
Twitter Stock Bot. John Matthew Fong The University of Texas at Austin [email protected]
Twitter Stock Bot John Matthew Fong The University of Texas at Austin [email protected] Hassaan Markhiani The University of Texas at Austin [email protected] Abstract The stock market is influenced
Fine-grained German Sentiment Analysis on Social Media
Fine-grained German Sentiment Analysis on Social Media Saeedeh Momtazi Information Systems Hasso-Plattner-Institut Potsdam University, Germany [email protected] Abstract Expressing opinions
Evaluation Datasets for Twitter Sentiment Analysis
Evaluation Datasets for Twitter Sentiment Analysis A survey and a new dataset, the STS-Gold Hassan Saif 1, Miriam Fernandez 1, Yulan He 2 and Harith Alani 1 1 Knowledge Media Institute, The Open University,
ASTD: Arabic Sentiment Tweets Dataset
ASTD: Arabic Sentiment Tweets Dataset Mahmoud Nabil [email protected] Mohamed Aly [email protected] Amir F. Atiya [email protected] Abstract This paper introduces ASTD, an Arabic social sentiment
Twitter sentiment vs. Stock price!
Twitter sentiment vs. Stock price! Background! On April 24 th 2013, the Twitter account belonging to Associated Press was hacked. Fake posts about the Whitehouse being bombed and the President being injured
CS 229, Autumn 2011 Modeling the Stock Market Using Twitter Sentiment Analysis
CS 229, Autumn 2011 Modeling the Stock Market Using Twitter Sentiment Analysis Team members: Daniel Debbini, Philippe Estin, Maxime Goutagny Supervisor: Mihai Surdeanu (with John Bauer) 1 Introduction
EFFICIENTLY PROVIDE SENTIMENT ANALYSIS DATA SETS USING EXPRESSIONS SUPPORT METHOD
EFFICIENTLY PROVIDE SENTIMENT ANALYSIS DATA SETS USING EXPRESSIONS SUPPORT METHOD 1 Josephine Nancy.C, 2 K Raja. 1 PG scholar,department of Computer Science, Tagore Institute of Engineering and Technology,
Sentiment Analysis of Microblogs
Thesis for the degree of Master in Language Technology Sentiment Analysis of Microblogs Tobias Günther Supervisor: Richard Johansson Examiner: Prof. Torbjörn Lager June 2013 Contents 1 Introduction 3 1.1
Impact of Financial News Headline and Content to Market Sentiment
International Journal of Machine Learning and Computing, Vol. 4, No. 3, June 2014 Impact of Financial News Headline and Content to Market Sentiment Tan Li Im, Phang Wai San, Chin Kim On, Rayner Alfred,
Reputation Management System
Reputation Management System Mihai Damaschin Matthijs Dorst Maria Gerontini Cihat Imamoglu Caroline Queva May, 2012 A brief introduction to TEX and L A TEX Abstract Chapter 1 Introduction Word-of-mouth
A Comparative Study on Sentiment Classification and Ranking on Product Reviews
A Comparative Study on Sentiment Classification and Ranking on Product Reviews C.EMELDA Research Scholar, PG and Research Department of Computer Science, Nehru Memorial College, Putthanampatti, Bharathidasan
Sentiment Classification on Polarity Reviews: An Empirical Study Using Rating-based Features
Sentiment Classification on Polarity Reviews: An Empirical Study Using Rating-based Features Dai Quoc Nguyen and Dat Quoc Nguyen and Thanh Vu and Son Bao Pham Faculty of Information Technology University
Sentiment Analysis of Movie Reviews and Twitter Statuses. Introduction
Sentiment Analysis of Movie Reviews and Twitter Statuses Introduction Sentiment analysis is the task of identifying whether the opinion expressed in a text is positive or negative in general, or about
Sentiment Analysis Tool using Machine Learning Algorithms
Sentiment Analysis Tool using Machine Learning Algorithms I.Hemalatha 1, Dr. G. P Saradhi Varma 2, Dr. A.Govardhan 3 1 Research Scholar JNT University Kakinada, Kakinada, A.P., INDIA 2 Professor & Head,
A comparison of Lexicon-based approaches for Sentiment Analysis of microblog posts
A comparison of Lexicon-based approaches for Sentiment Analysis of microblog posts Cataldo Musto, Giovanni Semeraro, Marco Polignano Department of Computer Science University of Bari Aldo Moro, Italy {cataldo.musto,giovanni.semeraro,marco.polignano}@uniba.it
Using Text and Data Mining Techniques to extract Stock Market Sentiment from Live News Streams
2012 International Conference on Computer Technology and Science (ICCTS 2012) IPCSIT vol. XX (2012) (2012) IACSIT Press, Singapore Using Text and Data Mining Techniques to extract Stock Market Sentiment
DiegoLab16 at SemEval-2016 Task 4: Sentiment Analysis in Twitter using Centroids, Clusters, and Sentiment Lexicons
DiegoLab16 at SemEval-016 Task 4: Sentiment Analysis in Twitter using Centroids, Clusters, and Sentiment Lexicons Abeed Sarker and Graciela Gonzalez Department of Biomedical Informatics Arizona State University
Approaches for Sentiment Analysis on Twitter: A State-of-Art study
Approaches for Sentiment Analysis on Twitter: A State-of-Art study Harsh Thakkar and Dhiren Patel Department of Computer Engineering, National Institute of Technology, Surat-395007, India {harsh9t,dhiren29p}@gmail.com
Sentiment analysis on news articles using Natural Language Processing and Machine Learning Approach.
Sentiment analysis on news articles using Natural Language Processing and Machine Learning Approach. Pranali Chilekar 1, Swati Ubale 2, Pragati Sonkambale 3, Reema Panarkar 4, Gopal Upadhye 5 1 2 3 4 5
Emoticon Smoothed Language Models for Twitter Sentiment Analysis
Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence Emoticon Smoothed Language Models for Twitter Sentiment Analysis Kun-Lin Liu, Wu-Jun Li, Minyi Guo Shanghai Key Laboratory of
Gated Neural Networks for Targeted Sentiment Analysis
Gated Neural Networks for Targeted Sentiment Analysis Meishan Zhang 1,2 and Yue Zhang 2 and Duy-Tin Vo 2 1. School of Computer Science and Technology, Heilongjiang University, Harbin, China 2. Singapore
The multilayer sentiment analysis model based on Random forest Wei Liu1, Jie Zhang2
2nd International Conference on Advances in Mechanical Engineering and Industrial Informatics (AMEII 2016) The multilayer sentiment analysis model based on Random forest Wei Liu1, Jie Zhang2 1 School of
SENTIMENT ANALYSIS: A STUDY ON PRODUCT FEATURES
University of Nebraska - Lincoln DigitalCommons@University of Nebraska - Lincoln Dissertations and Theses from the College of Business Administration Business Administration, College of 4-1-2012 SENTIMENT
Particular Requirements on Opinion Mining for the Insurance Business
Particular Requirements on Opinion Mining for the Insurance Business Sven Rill, Johannes Drescher, Dirk Reinel, Jörg Scheidt, Florian Wogenstein Institute of Information Systems (iisys) University of Applied
SZTE-NLP: Aspect Level Opinion Mining Exploiting Syntactic Cues
ZTE-NLP: Aspect Level Opinion Mining Exploiting yntactic Cues Viktor Hangya 1, Gábor Berend 1, István Varga 2, Richárd Farkas 1 1 University of zeged Department of Informatics {hangyav,berendg,rfarkas}@inf.u-szeged.hu
Using Social Media for Continuous Monitoring and Mining of Consumer Behaviour
Using Social Media for Continuous Monitoring and Mining of Consumer Behaviour Michail Salampasis 1, Giorgos Paltoglou 2, Anastasia Giahanou 1 1 Department of Informatics, Alexander Technological Educational
Towards SoMEST Combining Social Media Monitoring with Event Extraction and Timeline Analysis
Towards SoMEST Combining Social Media Monitoring with Event Extraction and Timeline Analysis Yue Dai, Ernest Arendarenko, Tuomo Kakkonen, Ding Liao School of Computing University of Eastern Finland {yvedai,
Effectiveness of term weighting approaches for sparse social media text sentiment analysis
MSc in Computing, Business Intelligence and Data Mining stream. MSc Research Project Effectiveness of term weighting approaches for sparse social media text sentiment analysis Submitted by: Mulluken Wondie,
Some Experiments on Modeling Stock Market Behavior Using Investor Sentiment Analysis and Posting Volume from Twitter
Some Experiments on Modeling Stock Market Behavior Using Investor Sentiment Analysis and Posting Volume from Twitter Nuno Oliveira Centro Algoritmi Dep. Sistemas de Informação Universidade do Minho 4800-058
Improving Sentiment Analysis in Twitter Using Multilingual Machine Translated Data
Improving Sentiment Analysis in Twitter Using Multilingual Machine Translated Data Alexandra Balahur European Commission Joint Research Centre Via E. Fermi 2749 21027 Ispra (VA), Italy [email protected]
CIRGIRDISCO at RepLab2014 Reputation Dimension Task: Using Wikipedia Graph Structure for Classifying the Reputation Dimension of a Tweet
CIRGIRDISCO at RepLab2014 Reputation Dimension Task: Using Wikipedia Graph Structure for Classifying the Reputation Dimension of a Tweet Muhammad Atif Qureshi 1,2, Arjumand Younus 1,2, Colm O Riordan 1,
How To Analyze Sentiment On A Microsoft Microsoft Twitter Account
Sentiment Analysis on Hadoop with Hadoop Streaming Piyush Gupta Research Scholar Pardeep Kumar Assistant Professor Girdhar Gopal Assistant Professor ABSTRACT Ideas and opinions of peoples are influenced
ARABIC SENTENCE LEVEL SENTIMENT ANALYSIS
The American University in Cairo School of Science and Engineering ARABIC SENTENCE LEVEL SENTIMENT ANALYSIS A Thesis Submitted to The Department of Computer Science and Engineering In partial fulfillment
Effect of Using Regression on Class Confidence Scores in Sentiment Analysis of Twitter Data
Effect of Using Regression on Class Confidence Scores in Sentiment Analysis of Twitter Data Itir Onal *, Ali Mert Ertugrul, Ruken Cakici * * Department of Computer Engineering, Middle East Technical University,
Combining Lexicon-based and Learning-based Methods for Twitter Sentiment Analysis
Combining Lexicon-based and Learning-based Methods for Twitter Sentiment Analysis Lei Zhang, Riddhiman Ghosh, Mohamed Dekhil, Meichun Hsu, Bing Liu HP Laboratories HPL-2011-89 Abstract: With the booming
Twitter Sentiment Analysis of Movie Reviews using Machine Learning Techniques.
Twitter Sentiment Analysis of Movie Reviews using Machine Learning Techniques. Akshay Amolik, Niketan Jivane, Mahavir Bhandari, Dr.M.Venkatesan School of Computer Science and Engineering, VIT University,
Research on Sentiment Classification of Chinese Micro Blog Based on
Research on Sentiment Classification of Chinese Micro Blog Based on Machine Learning School of Economics and Management, Shenyang Ligong University, Shenyang, 110159, China E-mail: [email protected] Abstract
SENTIMENT ANALYSIS USING BIG DATA FROM SOCIALMEDIA
SENTIMENT ANALYSIS USING BIG DATA FROM SOCIALMEDIA 1 VAISHALI SARATHY, 2 SRINIDHI.S, 3 KARTHIKA.S 1,2,3 Department of Information Technology, SSN College of Engineering Old Mahabalipuram Road, alavakkam
CSE 598 Project Report: Comparison of Sentiment Aggregation Techniques
CSE 598 Project Report: Comparison of Sentiment Aggregation Techniques Chris MacLellan [email protected] May 3, 2012 Abstract Different methods for aggregating twitter sentiment data are proposed and three
Adapting Sentiment Lexicons using Contextual Semantics for Sentiment Analysis of Twitter
Adapting Sentiment Lexicons using Contextual Semantics for Sentiment Analysis of Twitter Hassan Saif, 1 Yulan He, 2 Miriam Fernandez 1 and Harith Alani 1 1 Knowledge Media Institute, The Open University,
Micro blogs Oriented Word Segmentation System
Micro blogs Oriented Word Segmentation System Yijia Liu, Meishan Zhang, Wanxiang Che, Ting Liu, Yihe Deng Research Center for Social Computing and Information Retrieval Harbin Institute of Technology,
Folksonomies versus Automatic Keyword Extraction: An Empirical Study
Folksonomies versus Automatic Keyword Extraction: An Empirical Study Hend S. Al-Khalifa and Hugh C. Davis Learning Technology Research Group, ECS, University of Southampton, Southampton, SO17 1BJ, UK {hsak04r/hcd}@ecs.soton.ac.uk
Crowdfunding Support Tools: Predicting Success & Failure
Crowdfunding Support Tools: Predicting Success & Failure Michael D. Greenberg Bryan Pardo [email protected] [email protected] Karthic Hariharan [email protected] tern.edu Elizabeth
Neuro-Fuzzy Classification Techniques for Sentiment Analysis using Intelligent Agents on Twitter Data
International Journal of Innovation and Scientific Research ISSN 2351-8014 Vol. 23 No. 2 May 2016, pp. 356-360 2015 Innovative Space of Scientific Research Journals http://www.ijisr.issr-journals.org/
Efficient Techniques for Improved Data Classification and POS Tagging by Monitoring Extraction, Pruning and Updating of Unknown Foreign Words
, pp.290-295 http://dx.doi.org/10.14257/astl.2015.111.55 Efficient Techniques for Improved Data Classification and POS Tagging by Monitoring Extraction, Pruning and Updating of Unknown Foreign Words Irfan
Keywords social media, internet, data, sentiment analysis, opinion mining, business
Volume 5, Issue 8, August 2015 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Real time Extraction
Twitter Sentiment Classification using Distant Supervision
Twitter Sentiment Classification using Distant Supervision Alec Go Stanford University Stanford, CA 94305 [email protected] Richa Bhayani Stanford University Stanford, CA 94305 [email protected]
Turker-Assisted Paraphrasing for English-Arabic Machine Translation
Turker-Assisted Paraphrasing for English-Arabic Machine Translation Michael Denkowski and Hassan Al-Haj and Alon Lavie Language Technologies Institute School of Computer Science Carnegie Mellon University
Sentiment Analysis. D. Skrepetos 1. University of Waterloo. NLP Presenation, 06/17/2015
Sentiment Analysis D. Skrepetos 1 1 Department of Computer Science University of Waterloo NLP Presenation, 06/17/2015 D. Skrepetos (University of Waterloo) Sentiment Analysis NLP Presenation, 06/17/2015
