Twitter sentiment vs. stock price




Background

On April 23, 2013, the Twitter account of the Associated Press was hacked, and fake posts claimed that the White House had been bombed and the President injured. This led to a loss of about 1% on the Dow Jones. On May 6, 2010, a poorly written algorithm triggered a selling spree that caused a 9.2% drop in the Dow Jones. Text mining is commonly used as part of trading algorithms, and more incidents like these have happened (e.g. fake news about American Airlines going bankrupt once made the stock price fall quickly).

Aim

Inspired by this, I wanted to look into the following: is it possible to collect posts from Twitter (known as tweets) that mention a specified stock ticker (Apple Inc. uses AAPL), calculate a sentiment score for these tweets, and find a visual relationship between this score and the stock's current price? By a visual relationship I mean that the score and the price are plotted side by side and a relationship between them can be seen by eye. More on this later.

Method - High level perspective

The general idea is to get all tweets for a specific hour, calculate the average sentiment score of these tweets, and plot it next to the closing price of the stock for that hour. But what is a sentiment score? Two options were considered:

1. Find (or create) a corpus of tweets that are classified as positive or negative, create features, and use them in a naïve Bayes classifier (using the probability distribution rather than the label as the score).
2. Use a lexicon of sentiment-tagged words (e.g. "bad" could be negative and "super" could be positive). For each tweet, count the number of positive and negative words and create a score from these counts.

Approach 1

The first approach built upon what we have seen in the labs: creating features and using a naïve Bayes classifier. I found a corpus of 1 600 000 tweets that were labelled as positive or negative. Based on these I wanted to create features and use them in a naïve Bayes classifier. I created unigram, bigram and trigram features. Furthermore, I created a TF-IDF index over these tweets and used it as a feature. I also partially used the second approach (a lexicon of sentiment words, more on this later).

However, after a few days of trying to coerce my code into running in reasonable time, I failed. Since each run took very long, I decided to save the tokenized and cleaned tweets, along with their features (and the TF-IDF index), to disk. But when I tried to serialize the class structure I had created, Python's pickle module used more than 5 GB of RAM to check for cycles in the objects being saved, and it blew up every time with a MemoryError. So I had a choice between fixing this (and saving to an SQL database rather than to a flat file) or finding another approach. I decided to use another approach.

Approach 2

I found three lexicons, each consisting of words with a positive or negative label attached. One of them also included the part of speech (POS) of the word.

First lexicon (8221 words), example entries:
word1=agony pos1=noun priorpolarity=negative
word1=agree pos1=verb priorpolarity=positive

Second lexicon (3642 words): two files, one with positive words (shield, shiny, ...) and one with negative words (worst, wreck, ...).

Third lexicon (6787 words): two files, one with positive words (fine, flashy, ...) and one with negative words (spooky, sporadic, ...).
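Entries in the format of the first lexicon can be read with a few lines of Python. This is a minimal sketch, assuming one entry per line made of space-separated key=value pairs as in the example entries; the function name parse_lexicon_line and the (word, POS) dictionary layout are hypothetical and not taken from the original code.

```python
def parse_lexicon_line(line):
    """Parse one 'key=value key=value ...' entry into (word, pos, polarity)."""
    fields = dict(pair.split("=", 1) for pair in line.split() if "=" in pair)
    return fields["word1"], fields["pos1"], fields["priorpolarity"]


entries = [
    "word1=agony pos1=noun priorpolarity=negative",
    "word1=agree pos1=verb priorpolarity=positive",
]

# Key the lexicon on (word, POS) so the POS tag can be used at lookup time.
lexicon = {}
for entry in entries:
    word, pos, polarity = parse_lexicon_line(entry)
    lexicon[(word, pos)] = polarity
```

Keying on the (word, POS) pair keeps entries apart when the same word appears with different parts of speech.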

These lexicons were parsed and placed into one large lexicon (duplicates were allowed and not removed). I then downloaded 7945 tweets that contained the word AAPL (the stock ticker for Apple Inc.). Each tweet was processed as follows: lowercased; all http:// and other URL structures removed; all usernames (i.e. @username) removed; runs of whitespace collapsed to a single space; #word replaced with word; repeated letters reduced to two (e.g. yeeeeeehaaaaa became yeehaa); all words that did not start with a letter removed (e.g. 3am was removed); punctuation (! ? . ,) stripped.
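The cleaning steps can be sketched with regular expressions. This is an illustrative sketch, not the original code: the function name clean_tweet, the exact patterns, and the order of the steps are assumptions (punctuation is stripped before the token filter here, so that "3am!!!" is recognised as "3am").

```python
import re


def clean_tweet(text):
    """Apply the cleaning steps described above to one tweet."""
    text = text.lower()
    text = re.sub(r"(https?://|www\.)\S+", " ", text)  # remove URLs
    text = re.sub(r"@\w+", " ", text)                  # remove usernames
    text = re.sub(r"#(\w+)", r"\1", text)              # "#word" -> "word"
    text = re.sub(r"([a-z])\1{2,}", r"\1\1", text)     # 3+ repeated letters -> 2
    text = re.sub(r"[!?.,]", " ", text)                # strip punctuation
    # Drop tokens that do not start with a letter, e.g. "3am".
    tokens = [tok for tok in text.split() if tok[0].isalpha()]
    # Joining on single spaces also collapses multiple whitespace.
    return " ".join(tokens)
```

Note that under this filter a cashtag such as $aapl is dropped as well, since it does not start with a letter; whether that is desirable depends on how the ticker should be treated.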

The next step was to create the actual sentiment score. For each tweet I wanted to look up the tokens in my lexicon to decide whether each token was positive or negative. Since one of my lexicons also contained the POS of the word, each tweet was subjected to POS tagging. Each token of a tweet was sent to the lexicon (along with its POS tag) and a sentiment was returned. I then did a simple count of the positive and negative words.

Since multiple lexicons were included in my larger lexicon, I needed a way of deciding which lexicon to trust for a given word (there was some overlap between the lexicons). The following algorithm was created to resolve conflicts:

1. If only one lexicon contains the word, then this lexicon wins.
2. If the token and POS match the first lexicon, then this lexicon wins.
3. If all lexicons agree on the sentiment, then all win.
4. If the lexicons disagree, then count (i.e. if one lexicon says positive and the other two say negative, then negative wins).
5. If it is still a tie, then return neutral.
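The five rules can be sketched as a single lookup function. The data layout is an assumption made for illustration: pos_lexicon maps (word, POS) to a polarity, each entry of plain_lexicons maps a word to a polarity, and the POS-aware lexicon only takes part when both token and POS match. The slides do not say how a word that appears in the first lexicon under a different POS was treated, so that case is simply ignored here.

```python
from collections import Counter


def resolve_sentiment(token, pos, pos_lexicon, plain_lexicons):
    """Return 'positive', 'negative' or 'neutral' for one token."""
    # Rule 2: an exact (token, POS) match in the POS-aware lexicon wins.
    if (token, pos) in pos_lexicon:
        return pos_lexicon[(token, pos)]

    # Collect one vote from every remaining lexicon that contains the token.
    votes = Counter(lex[token] for lex in plain_lexicons if token in lex)

    # Rules 1, 3 and 4 all reduce to a majority vote:
    # a single vote or unanimous votes trivially form a majority.
    if votes["positive"] > votes["negative"]:
        return "positive"
    if votes["negative"] > votes["positive"]:
        return "negative"
    # Rule 5: tie (or token unknown to every lexicon) -> neutral.
    return "neutral"
```

With only two word-list lexicons voting, any disagreement is a tie, so rule 4 only changes the outcome when three or more lexicons take part in the vote.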

So for each tweet there now exists a positive count (p), a negative count (n), and the total number of tokens (N). The following two scores were then associated with each tweet:

Sentiment diff: p - n
Positive score: p / N

But I was not satisfied with this, because I felt that some words must be more negative than others, and some words must be more positive than others.
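As a small worked sketch of the two raw scores: the function name sentiment_scores and the list-of-labels input are assumptions, and returning 0.0 for an empty tweet is a choice made here to avoid dividing by zero.

```python
def sentiment_scores(polarities):
    """polarities: one label per token of a tweet.

    Returns (sentiment diff, positive score) = (p - n, p / N).
    """
    n_tokens = len(polarities)
    p = polarities.count("positive")
    n = polarities.count("negative")
    positive_score = p / n_tokens if n_tokens else 0.0
    return p - n, positive_score
```

For a four-token tweet with two positive words and one negative word, this gives a sentiment diff of 1 and a positive score of 0.5.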

The idea was then to create a TF-IDF index using the tokens in the lexicon (approx. 8000 unique tokens) and 2000 of the downloaded AAPL tweets. This TF-IDF index was created (and since it was of a reasonable size it could be serialized to disk). The issue then arose that the index was only really useful for the 2000 tweets used to create it: when incoming tweets were to be processed, they did not belong to the index.

So, since ignorance is bliss, I invented the average TF-IDF weight: I calculated the average TF-IDF for each token in the index, saved this value, and threw away all the other values in the index, creating a very compact index of average TF-IDF values. For any token (regardless of which tweet it came from) I could then get an average weight. For example, "good" could have weight 0.008 and "awesome" could have weight 0.1.
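One way to build such a compact index is sketched below. The slides do not state which TF-IDF variant was used, so the choices here are assumptions: term frequency is the count divided by tweet length, inverse document frequency is the natural log of N / df, and the average is taken over the tweets that contain the token. The name average_tfidf is hypothetical.

```python
import math
from collections import Counter, defaultdict


def average_tfidf(tweets, vocabulary):
    """tweets: list of token lists. Returns {token: mean TF-IDF weight}."""
    n_docs = len(tweets)
    # Document frequency: number of tweets containing each vocabulary token.
    df = Counter(
        tok for tweet in tweets for tok in set(tweet) if tok in vocabulary
    )
    weights = defaultdict(list)
    for tweet in tweets:
        counts = Counter(tok for tok in tweet if tok in vocabulary)
        for tok, count in counts.items():
            tf = count / len(tweet)
            idf = math.log(n_docs / df[tok])
            weights[tok].append(tf * idf)
    # Keep only the per-token mean; the per-tweet values are discarded.
    return {tok: sum(vals) / len(vals) for tok, vals in weights.items()}
```

Under these assumptions a token that occurs in every tweet gets weight 0, and rarer tokens get larger weights, which matches the intuition that "awesome" should weigh more than "good".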

Armed with the average TF-IDF index, I continued my sentiment scoring. Instead of counting the positive and negative words, I looked them up in the average TF-IDF index and summed the weights. A weighted positive count (wp) and a weighted negative count (wn) gave the following scores:

Weighted sentiment diff: wp - wn
Weighted positive score: wp / N
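The weighted scores follow the same pattern as the raw counts, with each word contributing its average TF-IDF weight instead of 1. In this sketch the function name weighted_scores, the (token, polarity) input, and the default weight of 0.0 for tokens missing from the index are all assumptions.

```python
def weighted_scores(tagged_tokens, avg_tfidf, default_weight=0.0):
    """tagged_tokens: list of (token, polarity) pairs for one tweet.

    Returns (weighted sentiment diff, weighted positive score)
    = (wp - wn, wp / N).
    """
    n_tokens = len(tagged_tokens)
    wp = sum(
        avg_tfidf.get(tok, default_weight)
        for tok, polarity in tagged_tokens
        if polarity == "positive"
    )
    wn = sum(
        avg_tfidf.get(tok, default_weight)
        for tok, polarity in tagged_tokens
        if polarity == "negative"
    )
    return wp - wn, (wp / n_tokens if n_tokens else 0.0)
```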

Plotting

The 7945 downloaded tweets were grouped by hour, so all tweets posted between 11:01 AM and 12:00 PM were considered to belong to 12:00 PM. For each group, the sentiment score of each individual tweet was calculated (using all four sentiment scores discussed). The total sentiment score for the group was simply the average score. Hourly closing prices for AAPL were downloaded from Google Finance (this means that at 11:00 AM, the latest price AAPL was sold for is the closing price for that hour).
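The hourly grouping amounts to rounding each timestamp up to the next full hour and averaging within each bucket. This is a sketch under that reading; hour_bucket and hourly_average are hypothetical names, and timestamps are assumed to be datetime objects in the exchange's time zone.

```python
from collections import defaultdict
from datetime import timedelta


def hour_bucket(ts):
    """Round a timestamp up to the next full hour (12:00 stays 12:00)."""
    floored = ts.replace(minute=0, second=0, microsecond=0)
    return floored if ts == floored else floored + timedelta(hours=1)


def hourly_average(scored_tweets):
    """scored_tweets: iterable of (datetime, score) pairs.

    Returns {hour: mean score of the tweets assigned to that hour}.
    """
    buckets = defaultdict(list)
    for ts, score in scored_tweets:
        buckets[hour_bucket(ts)].append(score)
    return {hour: sum(vals) / len(vals) for hour, vals in sorted(buckets.items())}
```

The resulting dictionary can be joined with the hourly closing prices on the hour key before plotting.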

Plots: sentiment difference (raw counts)

[Figure: hourly price and sentiment score between the 21st and 27th of May.]

At first glance visually useless; however, it is worth noting that the maximum of each oscillation increases. Note: the flat horizontal lines are created while the stock market is closed.

Plots: p / N (raw counts)

[Figure: hourly price and positive score between the 21st and 27th of May.]

Difficult to find anything visually appealing about this.

Plots: wp / N (weighted sum)

[Figure: hourly price and weighted sum between the 21st and 27th of May.]

Just as bad as the positive score without the TF-IDF weighting.

Plots: wp - wn (weighted difference)

Chartists are investors who mainly look at charts of price and volume, rather than at the fundamental data about a company, and look for trends in those charts. One of the classical ways of finding a trend is to find higher lows. The support lines drawn in the charts show that both the price and the sentiment are creating higher lows, indicating that the stock and the sentiment are entering (or are already in) a period of upward trend.

Results

It is easy to conclude that most results were useless; however, it is interesting to see some similarity in trend (in the chartist's sense of the word) between the price and the weighted diff.

One obvious flaw in the process could be that I averaged the sentiment score of each hour. If the scores had been kept raw, hours with a lot of positive tweets would possibly outweigh other hours more clearly, and some of the oscillation might disappear.

When the sentiment scores were compared against the already labelled tweets from approach 1 (the 1 600 000 tweets), their accuracy was low (almost as low as randomly guessing the sentiment).

An attractive feature of the sentiment scoring approach is that it needs no labelled corpus (the lexicons can be reused).

Average TF-IDF

1.6 Task 1(c)

1.6.1 Method

Using the selected processors (Table 4), the naive Bayes classifier was run again, this time with some added feature generators. We included the 1000 most frequent bigrams (creating has bigram(word1, word2) features) for each document. A feature was added that tells the classifier whether the average document word length is greater than, less than, or equal to the corpus average word length. Furthermore, a 10-bin feature with cutoffs based on the lexical diversity of the document was created.

1.6.2 Results

Table 5: Results from Task 1(c) (without average TF-IDF features)

Processors | Features | Accuracy | Pre(P) | Rec(P) | F-M(P) | Pre(N) | Rec(N) | F-M(N)
PunctuationProcessor, StemmingProcessor, LemmatizerProcessor | HWFG, BFG, LDG, AWLG | 0.79 | 0.84 | 0.75 | 0.79 | 0.75 | 0.84 | 0.79
LowerProcessor, StopWordProcessor, StemmingProcessor, LemmatizerProcessor | HWFG, BFG, LDG, AWLG | 0.79 | 0.84 | 0.75 | 0.79 | 0.75 | 0.84 | 0.79
LowerProcessor, PunctuationProcessor, StemmingProcessor, LemmatizerProcessor | HWFG, BFG, LDG, AWLG | 0.79 | 0.84 | 0.75 | 0.79 | 0.75 | 0.84 | 0.79
LowerProcessor, NumberProcessor, PunctuationProcessor, StopWordProcessor, StemmingProcessor, LemmatizerProcessor | HWFG, BFG, LDG, AWLG | 0.79 | 0.84 | 0.75 | 0.79 | 0.75 | 0.84 | 0.79

1.6.3 Conclusions

Adding the feature generators changes the column values more than trying different combinations of processors does. However, the choice of processors makes no difference.

1.7 Task 1(d)

1.7.1 Method

The idea was to include tf-idf as a binary feature. This has been done by calculating the average tf-idf weight for each term in the entire corpus, and then setting the feature [tfidf(word) > Avg] or [tfidf(word) <= Avg] for each frequent term in each document. As before, only the 1000 most frequent terms have been used. The TFIDF feature generator was added to the generators in Sec 1.6, using only a selection of the processors.

1.7.2 Results

Table 6: Results from Task 1(d) (with average TF-IDF features)

Processors | Features | Accuracy | Pre(P) | Rec(P) | F-M(P) | Pre(N) | Rec(N) | F-M(N)
PunctuationProcessor, StemmingProcessor, LemmatizerProcessor | HWFG, BFG, LDG, AWLG, TFIDF | 0.80 | 0.84 | 0.76 | 0.80 | 0.76 | 0.84 | 0.80
LowerProcessor, StopWordProcessor, StemmingProcessor, LemmatizerProcessor | HWFG, BFG, LDG, AWLG, TFIDF | 0.80 | 0.84 | 0.76 | 0.80 | 0.76 | 0.84 | 0.80
LowerProcessor, Punc... | HWFG, ... | 0.80 | 0.84 | 0.76 | 0.80 | 0.76 | 0.84 | 0.80
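The binary feature described for Task 1(d) can be sketched as a comparison of each frequent term's tf-idf weight in a document against its corpus-wide average. The function name tfidf_binary_features, the dictionary inputs, and the feature-name strings are assumptions for illustration; the report's actual feature generator is not shown in the slides.

```python
def tfidf_binary_features(doc_tfidf, corpus_avg, frequent_terms):
    """Build one boolean feature per frequent term present in the document.

    doc_tfidf:      {term: tf-idf weight of the term in this document}
    corpus_avg:     {term: average tf-idf weight over the entire corpus}
    frequent_terms: the terms to generate features for (e.g. the top 1000)
    """
    return {
        f"tfidf({term}) > Avg": doc_tfidf[term] > corpus_avg[term]
        for term in frequent_terms
        if term in doc_tfidf and term in corpus_avg
    }
```

A value of False corresponds to the [tfidf(word) <= Avg] case, so a single boolean per term covers both settings of the feature.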

... give a different result. It would be interesting to expand this lexicon further to include more words, and also to try it on text that is not as random as tweets are. The average TF-IDF index used here is not very large (it only used 1000 tweets); expanding it could possibly also increase its usefulness.

Sources

References

[1] A. Nagar and M. Hahsler, "Using text and data mining techniques to extract stock market sentiment from live news streams."
[2] N. Godbole, M. Srinivasaiah, and S. Skiena, "Large-scale sentiment analysis for news and blogs," in Proceedings of the International Conference on Weblogs and Social Media (ICWSM), vol. 2, 2007.
[3] V. Sehgal and C. Song, "SOPS: stock prediction using web sentiment," in Seventh IEEE International Conference on Data Mining Workshops (ICDM Workshops 2007). IEEE, 2007, pp. 21-26.
[4] W. Zhang and S. Skiena, "Trading strategies to exploit blog and news sentiment," in Proc. of the Fourth International AAAI Conference on Weblogs and Social Media, 2010, pp. 375-378.
[5] M. Hu and B. Liu, "Mining and summarizing customer reviews," in Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2004), 2004.

Lexicons

MPQA Subjectivity Lexicon: http://mpqa.cs.pitt.edu/lexicons/
tm.plugin.tags: an R package that contains positive and negative words
Opinion Mining, Sentiment Analysis and Opinion Spam Detection: http://www.cs.uic.edu/~liub/fbs/sentiment-analysis.html

www.liu.se