# Emoticon Smoothed Language Models for Twitter Sentiment Analysis


## Transcription

TSA is actually a classification problem. To adapt the LM to TSA, we concatenate all the tweets from the same class to form one synthetic document. Hence, for the polarity classification problem, one document is constructed from the positive training tweets, and the other document is constructed from the negative training tweets. We then learn two LMs, one for the positive class and the other for the negative class. The LM learning procedure for subjectivity classification is similar. During the test phase, we treat each test tweet as a query and use the likelihoods to rank the classes. The class with the highest likelihood is chosen as the label of the test tweet.

We use $c_1$ and $c_2$ to denote the two language models. In polarity classification, $c_1$ is the language model for positive tweets and $c_2$ is for negative tweets. In subjectivity classification, $c_1$ is for the subjective class and $c_2$ is for the objective (neutral) class. In order to classify a tweet $t$ as $c_1$ or $c_2$, we need to estimate the tweet likelihoods $P(t \mid c_1)$ and $P(t \mid c_2)$. By using the common unigram assumption, we get:

$$P(t \mid c) = \prod_{i=1}^{n} P(w_i \mid c),$$

where $n$ is the number of words in tweet $t$ and $P(w_i \mid c)$ is a multinomial distribution estimated from the LM of class $c$. This probability simulates the generative process of the test tweet. First, the first word ($w_1$) is generated by following the multinomial distribution $P(w_i \mid c)$. After that, the second word is generated independently of the previous word by following the same distribution. This process continues until all the words in the tweet have been generated.

One commonly used method to estimate the distributions is the maximum likelihood estimate (MLE), which computes the probability as follows:

$$P_a(w_i \mid c) = \frac{N_{i,c}}{N_c},$$

where $N_{i,c}$ is the number of times word $w_i$ appears in the training data of class $c$ and $N_c$ is the total number of words in the training data of class $c$. In general, the vocabulary is determined by the training set.
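The classification procedure above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names (`train_mle`, `log_likelihood`, `classify`) and the toy tweets are assumptions made for the example, and tweets are assumed to be already tokenized. Log-probabilities are summed instead of multiplying probabilities, which leaves the ranking of classes unchanged.

```python
import math
from collections import Counter

def train_mle(tweets):
    """Concatenate the tokenized tweets of one class and return the MLE
    unigram model P_a(w | c) = N_{w,c} / N_c as a dict."""
    counts = Counter(word for tweet in tweets for word in tweet)
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}

def log_likelihood(tweet, lm):
    """log P(t | c) under the unigram assumption.
    Returns -inf if any word is unseen, since plain MLE assigns it zero."""
    score = 0.0
    for word in tweet:
        p = lm.get(word, 0.0)
        if p == 0.0:
            return float("-inf")
        score += math.log(p)
    return score

def classify(tweet, lms):
    """Return the class whose language model gives the highest likelihood."""
    return max(lms, key=lambda c: log_likelihood(tweet, lms[c]))

# Toy training data (hypothetical, for illustration only).
positive = [["good", "day"], ["good", "movie"]]
negative = [["bad", "day"], ["bad", "bad", "movie"]]
lms = {"positive": train_mle(positive), "negative": train_mle(negative)}

print(classify(["good", "movie"], lms))
```

In this toy setting the word "good" never occurs in the negative training tweets, so the unsmoothed negative model assigns the test tweet zero probability. This is the situation that smoothing is meant to address.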
When classifying tweets in the test set, it is very common to encounter words that do not appear in the training set, especially when there are not enough training data or the words are not well-formed. In these cases, smoothing (Zhai and Lafferty 2004) plays a very important role in language models because it avoids assigning zero probability to unseen words. Furthermore, smoothing can make the model more accurate and robust. Representative smoothing methods include Dirichlet smoothing and Jelinek-Mercer (JM) smoothing (Zhai and Lafferty 2004). Although the original JM smoothing method linearly interpolates the MLE model with the collection model (Zhai and Lafferty 2004), in this paper we use JM smoothing to linearly interpolate the MLE model with the emoticon model.

### Emoticon Model

From the emoticon data, we can also build the LMs for different classes. We propose a very effective and efficient way to estimate the emoticon LM $P_u(w_i \mid c)$ from the Twitter Search API. The Twitter Search API is a dedicated API for running searches against the real-time index of recent tweets. Its index includes between 6 and 9 days of tweets. Given a query which consists of one or several words, the API returns up to 1500 relevant tweets and their posting times.

#### Polarity Classification

To get $P_u(w_i \mid c_1)$, the probability of $w_i$ in the positive class, we assume that all tweets containing ":)" are positive. We build the query "$w_i$ :)" and submit it to the Search API, which returns tweets containing both $w_i$ and ":)" together with their posting times. After summarization, we get the number of tweets $n_{w_i}$ and the time range of these tweets $t_{w_i}$. Then we build another query ":)" and get the number of returned tweets $n_s$ and the time range $t_s$. Some estimates show that a tweet contains 15 words on average. Assume that the tweets on Twitter are uniformly distributed with respect to time.
Similar to the rule for computing $P_a(w_i \mid c)$, we can estimate $P_u(w_i \mid c_1)$ with the following rule:

$$P_u(w_i \mid c_1) = \frac{n_{w_i} / t_{w_i}}{(n_s / t_s) \cdot 15} = \frac{n_{w_i} \, t_s}{15 \, t_{w_i} \, n_s}.$$

The term $n_{w_i} / t_{w_i}$ is roughly the number of times word $w_i$ appears in class $c$ per unit time, and the term $(n_s / t_s) \cdot 15$ is roughly the total number of words in class $c$ per unit time. Let $F_u = \sum_{j=1}^{V} P_u(w_j \mid c)$ be the normalization factor, where $V$ is the size of the vocabulary containing both seen and unseen words. Then each estimated $P_u(w_i \mid c)$ should be normalized so that the estimates sum to one:

$$P_u(w_i \mid c) := \frac{P_u(w_i \mid c)}{F_u} = \frac{P_u(w_i \mid c)}{\sum_{j=1}^{V} P_u(w_j \mid c)} = \frac{\dfrac{n_{w_i} \, t_s}{15 \, t_{w_i} \, n_s}}{\sum_{j=1}^{V} \dfrac{n_{w_j} \, t_s}{15 \, t_{w_j} \, n_s}} = \frac{n_{w_i} / t_{w_i}}{\sum_{j=1}^{V} n_{w_j} / t_{w_j}}.$$

We can see that there is no need to obtain $t_s$ and $n_s$, because $P_u(w_i \mid c)$ is determined by $n_{w_i}$ and $t_{w_i}$ alone.

For the LM of the negative class, we assume that the negative tweets are those containing ":(". The estimation procedure for $P_u(w_i \mid c_2)$ is similar to that for $P_u(w_i \mid c_1)$. The only difference is that the query should be changed to "$w_i$ :(".

#### Subjectivity Classification

For subjectivity classification, the two classes are subjective and objective. Tweets containing ":)" or ":(" are assumed to carry the subjectivity of their users, so we build the query "$w_i$ :) OR :(" for the subjective class.

As for the objective LM, getting $P_u(w_i \mid c_2)$, the probability of $w_i$ in the objective class, is much more challenging than for the subjective class. To the best of our knowledge, no general assumption for objective tweets has been reported by researchers. We tried the strategy which treats tweets without emoticons as objective but the experiments showed that
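The normalized emoticon estimate for polarity classification, and the way it might be combined with the MLE model, can be sketched as follows. This is an illustration under stated assumptions, not the authors' code. The per-word counts and time ranges are made-up numbers standing in for Search API results, and no API call is made. The interpolation form $\alpha P_a + (1 - \alpha) P_u$ is also an assumption: the paper's equation (1) does not appear in this excerpt, but this form agrees with the description of JM smoothing above and with the statement that $\alpha = 0$ uses only emoticon data while $\alpha = 1$ is the fully supervised case. The names `emoticon_lm` and `interpolate` are hypothetical.

```python
def emoticon_lm(stats):
    """stats maps word -> (n_w, t_w): number of returned tweets and their
    time range for the query "<word> :)". Returns the normalized estimate
    P_u(w | c) = (n_w / t_w) / sum_j (n_wj / t_wj)."""
    rates = {w: n / t for w, (n, t) in stats.items() if t > 0}
    total = sum(rates.values())
    return {w: r / total for w, r in rates.items()}

def interpolate(p_a, p_u, alpha):
    """Assumed JM-style interpolation: alpha * P_a + (1 - alpha) * P_u,
    taken over the union of both vocabularies."""
    vocab = set(p_a) | set(p_u)
    return {w: alpha * p_a.get(w, 0.0) + (1 - alpha) * p_u.get(w, 0.0)
            for w in vocab}

# Hypothetical query results: (tweet count, time range in hours).
stats = {"happy": (1500, 2.0), "rain": (300, 6.0), "meh": (100, 4.0)}
p_u = emoticon_lm(stats)

# Hypothetical MLE model from a small labeled set; "meh" is unseen there.
p_a = {"happy": 0.5, "rain": 0.5}

smoothed = interpolate(p_a, p_u, alpha=0.6)
print(smoothed["meh"] > 0.0)
```

Because only the ratios $n_{w_i} / t_{w_i}$ enter the normalized estimate, the sketch never needs $n_s$, $t_s$, or the 15-words-per-tweet constant. The interpolated model also gives nonzero probability to "meh", a word absent from the labeled data.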

Figure 1: Effect of emoticons on accuracy of polarity classification.

Figure 2: Effect of emoticons on F-score of polarity classification.

Figure 3: Effect of emoticons on accuracy of subjectivity classification.

Figure 4: Effect of emoticons on F-score of subjectivity classification.

different numbers of manually labeled training data, respectively. The results are similar to those for polarity classification, which once again verifies the effectiveness of our ESLAM in utilizing the noisy emoticon data. The good performance of ESLAM also verifies that our URL-link-based method is effective in finding objective tweets, which is a big challenge for most existing distantly supervised methods.

### Effect of Manually Labeled Data

We compare our ESLAM method to the distantly supervised LM to verify whether the manually labeled data can provide extra useful information. Please note that the distantly supervised LM uses only the noisy emoticon data for training, while ESLAM integrates both manually labeled data and the emoticon data for training.

Figure 5 and Figure 6 illustrate the accuracy and F-score of the two methods on polarity classification with different numbers of manually labeled training data, respectively. The blue line corresponds to the performance of the distantly supervised LM, which also corresponds to the case of zero manually labeled data. The red line shows the results of ESLAM. We can see that ESLAM achieves better performance than the distantly supervised LM. As the amount of manually labeled data increases, the performance gap between them becomes larger and larger. This verifies our claim that it is not enough to use only the data with noisy labels for training.

Figure 5: Effect of manually labeled data on accuracy of polarity classification.

Figure 6: Effect of manually labeled data on F-score of polarity classification.

Figure 7 and Figure 8 illustrate the accuracy and F-score of the two methods on subjectivity classification with different numbers of manually labeled training data, respectively.
The results are similar to those for polarity classification.

Figure 7: Effect of manually labeled data on accuracy of subjectivity classification.

Figure 8: Effect of manually labeled data on F-score of subjectivity classification.

### Sensitivity to Parameters

The parameter α in (1) plays a critical role in controlling the relative contribution of the manually labeled information and the noisy labeled information. To show the effect of this parameter in detail, we try different values for polarity classification. Figure 9 and Figure 10 show the accuracy of ESLAM with 128 and 512 labeled training tweets, respectively. The case α = 0 means only noisy emoticon data are used and α = 1 is the fully supervised case. The results in the figures clearly show that the best strategy is to integrate both manually labeled data and noisy data into training. We also notice that with 512 labeled training data ESLAM achieves its best performance with a relatively larger α than in the case of 128 labeled data, which is reasonable. Furthermore, we find that ESLAM is not sensitive to small variations in the value of α, because the range of α values that achieve good performance is large.

Figure 9: Effect of the smoothing parameter α with 128 labeled training tweets.

Figure 10: Effect of the smoothing parameter α with 512 labeled training tweets.

### Conclusion

Existing methods use either manually labeled data or noisy labeled data for Twitter sentiment analysis, but few of them utilize both for training. In this paper, we propose a novel model, called the emoticon smoothed language model (ESLAM), to seamlessly integrate these two kinds of data into the same probabilistic framework. Experiments on real data sets show that our ESLAM method can effectively integrate both kinds of data to outperform those methods using only one of them. Our ESLAM method is general enough to integrate other kinds of noisy labels for model training, which will be pursued in our future work.

### Acknowledgments

This work is supported by the NSFC (No ) and the 863 Program of China (No. 2011AA01A202, No. 2012AA011003).

### References

Barbosa, L., and Feng, J. Robust sentiment detection on twitter from biased and noisy data. In COLING.

Bermingham, A., and Smeaton, A. F. Classifying sentiment in microblogs: is brevity an advantage? In CIKM.

Davidov, D.; Tsur, O.; and Rappoport, A. Enhanced sentiment learning using twitter hashtags and smileys. In COLING.

Go, A.; Bhayani, R.; and Huang, L. Twitter sentiment classification using distant supervision. Technical report.

Guerra, P. H. C.; Veloso, A.; Jr., W. M.; and Almeida, V. From bias to opinion: a transfer-learning approach to real-time sentiment analysis. In KDD.

Jansen, B. J.; Zhang, M.; Sobel, K.; and Chowdury, A. Twitter power: Tweets as electronic word of mouth. JASIST 60(11).

Jiang, L.; Yu, M.; Zhou, M.; Liu, X.; and Zhao, T. Target-dependent twitter sentiment classification. In ACL.

Kouloumpis, E.; Wilson, T.; and Moore, J. Twitter sentiment analysis: The good the bad and the omg! In ICWSM.

Liu, X.; Li, K.; Zhou, M.; and Xiong, Z. 2011a. Collective semantic role labeling for tweets with clustering. In IJCAI.

Liu, X.; Li, K.; Zhou, M.; and Xiong, Z. 2011b. Enhancing semantic role labeling for tweets using self-training. In AAAI.

Liu, X.; Zhang, S.; Wei, F.; and Zhou, M. 2011c. Recognizing named entities in tweets. In ACL.

Manning, C. D.; Raghavan, P.; and Schütze, H. An Introduction to Information Retrieval. Cambridge University Press.

Pang, B., and Lee, L. Opinion mining and sentiment analysis. Foundations and Trends in Information Retrieval 2(1-2).

Pang, B.; Lee, L.; and Vaithyanathan, S. Thumbs up? sentiment classification using machine learning techniques. In EMNLP.

Ponte, J. M., and Croft, W. B. A language modeling approach to information retrieval. In SIGIR.

Silva, I. S.; Gomide, J.; Veloso, A.; Jr., W. M.; and Ferreira, R. Effective sentiment stream analysis with self-augmenting training and demand-driven projection. In SIGIR.

Tan, C.; Lee, L.; Tang, J.; Jiang, L.; Zhou, M.; and Li, P. User-level sentiment analysis incorporating social networks. In KDD.

Zhai, C., and Lafferty, J. D. 2004. A study of smoothing methods for language models applied to information retrieval. ACM Trans. Inf. Syst. 22(2).

### Emoticon Smoothed Language Models for Twitter Sentiment Analysis

Emoticon Smoothed Language Models for Twitter Sentiment Analysis Kun-Lin Liu, Wu-Jun Li, Minyi Guo Shanghai Key Laboratory of Scalable Computing and Systems Department of Computer Science and Engineering,


Project Plan: Sentiment Analysis on Tweets and Facebook Status Updates and The Effect of Homophily on the Diffusion of Opinions Introduction This natural language processing project was largely motivated

### Network Big Data: Facing and Tackling the Complexities Xiaolong Jin

Network Big Data: Facing and Tackling the Complexities Xiaolong Jin CAS Key Laboratory of Network Data Science & Technology Institute of Computing Technology Chinese Academy of Sciences (CAS) 2015-08-10

### A CRF-based approach to find stock price correlation with company-related Twitter sentiment

POLITECNICO DI MILANO Scuola di Ingegneria dell Informazione POLO TERRITORIALE DI COMO Master of Science in Computer Engineering A CRF-based approach to find stock price correlation with company-related

### Social Media Mining. Data Mining Essentials

Introduction Data production rate has been increased dramatically (Big Data) and we are able store much more data than before E.g., purchase data, social media data, mobile phone data Businesses and customers

### Challenges of Cloud Scale Natural Language Processing

Challenges of Cloud Scale Natural Language Processing Mark Dredze Johns Hopkins University My Interests? Information Expressed in Human Language Machine Learning Natural Language Processing Intelligent

### SEARCH ENGINE OPTIMIZATION USING D-DICTIONARY

SEARCH ENGINE OPTIMIZATION USING D-DICTIONARY G.Evangelin Jenifer #1, Mrs.J.Jaya Sherin *2 # PG Scholar, Department of Electronics and Communication Engineering(Communication and Networking), CSI Institute

### Data Source for Sentiment analysis

Sentiment Analysis Introduction Data Source for Sentiment analysis Sentiment Analysis: Problem definition Sentiment Analysis Tools Current Research Problems Introduction Sentiment Analysis: Sentiment analysis

### 01 Opinion mining, sentiment analysis

01 Opinion mining, sentiment analysis IA161 Advanced Techniques of Natural Language Processing Z. Nevěřilová NLP Centre, FI MU, Brno September 21, 2016 Z. Nevěřilová IA161 Advanced NLP 01 Opinion mining,

Classifying Business Types from Twitter Posts Using Active Learning Chanattha Thongsuk, Choochart Haruechaiyasak, Phayung Meesad Department of Information Technology Faculty of Information Technology King

### Web Database Integration

Web Database Integration Wei Liu School of Information Renmin University of China Beijing, 100872, China gue2@ruc.edu.cn Xiaofeng Meng School of Information Renmin University of China Beijing, 100872,

### Concept Term Expansion Approach for Monitoring Reputation of Companies on Twitter

Concept Term Expansion Approach for Monitoring Reputation of Companies on Twitter M. Atif Qureshi 1,2, Colm O Riordan 1, and Gabriella Pasi 2 1 Computational Intelligence Research Group, National University

### RRSS - Rating Reviews Support System purpose built for movies recommendation

RRSS - Rating Reviews Support System purpose built for movies recommendation Grzegorz Dziczkowski 1,2 and Katarzyna Wegrzyn-Wolska 1 1 Ecole Superieur d Ingenieurs en Informatique et Genie des Telecommunicatiom

### Designing Ranking Systems for Consumer Reviews: The Impact of Review Subjectivity on Product Sales and Review Quality

Designing Ranking Systems for Consumer Reviews: The Impact of Review Subjectivity on Product Sales and Review Quality Anindya Ghose, Panagiotis G. Ipeirotis {aghose, panos}@stern.nyu.edu Department of

### YouTube Comment Classifier. mandaku2, rethina2, vraghvn2

YouTube Comment Classifier mandaku2, rethina2, vraghvn2 Abstract Youtube is the most visited and popular video sharing site, where interesting videos are uploaded by people and corporations. People also

### ReelTalk: An Interactive Sentiment Analysis Application

ReelTalk: An Interactive Sentiment Analysis Application Anne Flinchbaugh anne.flinchbaugh@gmail.com Eric Latimer e@utexas.edu Abstract In this report, we present the implementation and application of a

### NAIVE BAYES ALGORITHM FOR TWITTER SENTIMENT ANALYSIS AND ITS IMPLEMENTATION IN MAPREDUCE. A Thesis. Presented to. The Faculty of the Graduate School

NAIVE BAYES ALGORITHM FOR TWITTER SENTIMENT ANALYSIS AND ITS IMPLEMENTATION IN MAPREDUCE A Thesis Presented to The Faculty of the Graduate School At the University of Missouri In Partial Fulfillment Of

### A Hybrid Method for Opinion Finding Task (KUNLP at TREC 2008 Blog Track)

A Hybrid Method for Opinion Finding Task (KUNLP at TREC 2008 Blog Track) Linh Hoang, Seung-Wook Lee, Gumwon Hong, Joo-Young Lee, Hae-Chang Rim Natural Language Processing Lab., Korea University (linh,swlee,gwhong,jylee,rim)@nlp.korea.ac.kr

### Simple Language Models for Spam Detection

Simple Language Models for Spam Detection Egidio Terra Faculty of Informatics PUC/RS - Brazil Abstract For this year s Spam track we used classifiers based on language models. These models are used to