1 Nowcasting the Bitcoin Market with Twitter Signals JERMAIN KAMINSKI, MIT Media Lab & Witten/Herdecke University 1 PETER A. GLOOR, MIT Center for Collective Intelligence 1. INTRODUCTION Bitcoin is a peer-to-peer electronic cash system [Nakamoto 2009] and a leading global open-source cryptocurrency [Kroll et al. 2013]. It uses cryptography to control the creation and transfer of money, while transactions are broadcasted as digitally signed messages to the shared public network, the block chain. Bitcoins can be obtained by mining 2 or in exchange for products, services, or other currencies. According to The Wall Street Journal [2013], the commercial use of bitcoin still seems comparable small, mainly as a result of high and risky price volatilities. However, there are signals of traction in retail business and elsewhere, where bitcoins are increasingly accepted in transactions. While the value of all bitcoin is currently totaling about $12.5 billion, 3 and steadily increasing, news coverage in the recent months pointed towards the legitimate use of the virtual currency, its taxation and the circumstance it might support the trade of illicit goods and services [The Economist 2013]. Noteworthy, a recent breakdown of the mayor exchange platform Mt.Gox on February 25 th 2013 marked a setback for Bitcoin traders and observers. While behavioral economics suggest that emotions can affect individual behavior and decision-making [Akerlof and Shiller 2009; Scott and Loewenstein 2010], the microblogging platform Twitter has drawn more and more attention from different disciplines as a laboratory to study large sets of social and economic data. Particularly interesting is the influence of Twitter users and information propagation [Ye and Wu 2010]. For example, Antweiler and Frank [2004] determine the correlation between activity in Internet message boards and stock volatility and trading volume, while similar methods on analyzing web communications are applied by Choudhury et al. [2008], Gloor et al. [2009] and Gilbert and Karahalios [2010]. Zhang et al. [2011], Oh and Sheng [2011], Bollen et al. [2011], Jaimes et al. [2012], Sprenger et al. [2013] and Si et al. [2013] conducted work on analyzing Twitter (or microblogging) data in correlation with financial time series. For instance, the authors Bollen et al. [2011] claim an accuracy of 87.6 % in predicting the daily up and down changes in the closing values of the Dow Jones Industrial Average (DJIA) by using a Google-Profile of Mood States (GPOMS), covering the emotional dimensions of Calm, Alert, Sure, Vital, Kind and Happy. 1 Corresponding author: jermain@mit.edu 2 For an explanation of the term mining see New York Times [2013] 3 http://bitcoincharts.com, March 31st, 2014.
1:2 J. Kaminski & Peter A. Gloor This paper is an early approach to apply the presently known practice of Twitter sentiment analysis for the case of Bitcoin and may be among the first approaches in this field. Analyzing the Bitcoin market seems particularly interesting, as it is a global and decentralized 24-hour trading market, with tweets and other information signals that are simultaneously provided, in real-time. 2. METHODOLOGY By using an automated online collector with access to the Twitter API, 4 a total of 161,200 tweets has been collected within the timeframe from November 23 rd 2013 until March 7 th 2014. 5 Queries have been updated every hour on a daily basis. According to the collector s database, the collected tweets created in sum about 300 million impressions, where impressions are the total number of times that the tweets have been delivered to Twitter streams of users. So far, it seems reasonable that even such a small proportion of captured tweets might entail multiplier effects. In order to fetch the relevant emotional tweets, following queries have been used: bitcoin AND feel OR happy OR great OR love OR awesome OR lucky OR good OR sad OR bad OR upset OR unhappy OR nervous -bot and bitcoin AND hope OR fear OR worry. By doing so, tweets that contain the word bitcoin and one of the other terms (or more) could be archived. The biggest challenge for analyzing Twitter data and preventing statistical bias is undistorded data. Therefore, some data cleaning has been applied in the next step and (for example) the combinations happy birthday and not bad have not been counted for happy or bad respectively, just as repetitive bot-created content was filtered out as far as possible. As a result, four blocks of information can be pooled: (1) sum positive tweets ( 131,117): Sum of tweets on Bitcoin containing positive signals (feel, happy, great, love, awesome, lucky, good). (2) sum negative tweets ( 19,179): Sum of tweets on Bitcoin containing negative signals (sad, bad, upset, unhappy, nervous). (3) sum emotions ( 150,296): Sum of positive and negative tweets. (4) sum hopefearworry ( 10,222): Sum of tweets on Bitcoin containing signals of uncertainty (hope, fear or worry), cf. Zhang et al. [2011]. Further, market data from the four leading Bitcoin indices was fetched, namely BitStamp, Bitfinex, BTC-e and BTC China. 6 For each index, following price data has been considered: Open, Close, High (intraday), Low (intraday), Volume (BTC), Volume (currency), Intraday Spread (High-Low), overall Intraday Return (Open-Close) and δ Price (Close Price day+2 - Close Price day0 ). In the following, Figure 1, 2 and 3 visualize the data collection results: 4 http://tweetarchivist.com 5 N=104 6 Together, BitStamp (34%), Bitfinex (26%), BTC-e (16%) and BTC China (10%) account for about 86% of the overall Bitcoin market volume. Cf. http://bitcoincharts.com/charts/volumepie/
Nowcasting the Bitcoin Market with Twitter Signals 1:3 80000000 70000000 60000000 Volume (Currency / $) 50000000 40000000 30000000 20000000 10000000 0 1 6 11 16 21 26 31 36 41 46 51 56 61 66 71 76 81 86 91 96 101 BTC-e BitFinex BitStamp Fig. 1 Bitcoin markets and their Volume (by currency). 1400 140000 1200 120000 1000 100000 Volume (BTC) 80000 60000 800 600 BitStamp Price 40000 400 20000 200 0 1 6 11 16 21 26 31 36 41 46 51 56 61 66 71 76 81 86 91 96 101 0 Volume (BTC) Intraday Spread Fig. 2 BitStamp trading volume and price indicators. Squares signal the Close price, while bars (green) reflect the Intraday Spread (High-Low).
1:4 J. Kaminski & Peter A. Gloor 160000.00 1150 140000.00 950 120000.00 750 Volume (BTC) 100000.00 80000.00 60000.00 550 350 Tweets, BTC-e indicators 40000.00 150 20000.00-50 0.00 1 6 11 16 21 26 31 36 41 46 51 56 61 66 71 76 81 86 91 96 101-250 sum_positive_tweets/4 sum_negative_tweets/2 sum_hopefearworry/2 bitstamp_closeprice bitstamp_intradayreturn Fig. 3 Tweets and BitStamp price indicators. Remark: sum positive tweets, sum negative tweets and sum hoepfearworry have been adjusted (/x) to fit the dimensions of the graph. Prominent peaks on day 94-97 can be explained by emotions on Mt.Gox breakdown (February 25 th 2014). 3. RESULTS Table I. Correlation of Twitter sentiments and Bitcoin markets bitstamp closeprice bitfinex closeprice btce closeprice btcn closeprice sum positive tweets -.085 -.093 -.054 -.068 sum negative tweets -.262** -.261** -.278**.230** sum emotions -.131 -.138 -.109 -.110 sum hopefearworry -.275** -.249** -.271** -.259** sum ratio positive to negative.227*.228*.282**.170 ** Correlation is significant at the 0.01 level (2-tailed) * Correlation is significant at the 0.05 level (2-tailed) We can see that the biggest bitcoin exchange platforms (by trading volume) seem to be most sensitive for negative tweets and signals of uncertainty (sum hopefearworry) on Bitcoin. A day with a low amount of negative tweets will most likely end with a higher close price.
Nowcasting the Bitcoin Market with Twitter Signals 1:5 It can complementary be noted that a higher ratio of positive to negative tweets (=more positive than negative tweets) is accompanied by a higher close price. 7 To the contrary, positive tweets and the sum of emotions do not seem to influence the close price of BitStamp significantly. The previously introduced sentiment signals will sequentially be tested with indicators for BitStamp, the currently biggest exchange market: Table II. Correlation of Twitter sentiments and BitStamp market indicators bitst closeprice bitst volumebtc bitst intraday spread bitst intraday return sum positive tweets -.085.393**.127 -.064 sum negative tweets -.262**.566**.286** -.151 sum emotions -.131.452**.171 -.088 sum hopefearworry -.257**.459**.194* -.033 ratio positive to negative.227* -.426** -.339**.124 ** Correlation is significant at the 0.01 level (2-tailed) * Correlation is significant at the 0.05 level (2-tailed) Following results can be taken from Table II: (1) Emotions, especially negative tweets, positively correlate with the BitStamp trading volume. Furthermore, signals of uncertainty (hope, fear, worry) seem to support a vivid market with high trading volumes. (2) Emotions and signals of uncertainty also fuel intraday price volatilities, such as the intraday spread, reflecting the difference between the highest and lowest intraday trading price of a Bit- Stamp bitcoin. (3) The lower the ratio of positive to negative tweets, the higher the trading volume. This implies a balanced ratio (i.e. equilibrium) of positive and negative tweets significantly fuels trading. (4) The more negative emotions and signals of uncertainty appear on a given trading day, the more likely is a lower close price. Mixed with signals of uncertainty, negative sentiments may be a sign of dissatisfaction or pessimism by traders (and their observers). The current data suggests that the term of Nowcasting (predicting the present) [Choi and Varian 2012] might be applicable to Bitcoin intraday market development. In order to find out more about the predictive value of Twitter sentiments for the development of the BitStamp market, another calculation will be conducted. For this purpose, we will project the BitStamp close price each -2 and +2 days into the past and future respectively, while applying a moving average for related sentiment data. 7 It may be noted that the number of positive tweets is much higher than that of negative ones, more than 10 times higher on average. So far, the assumption by [Zhang et al. 2011] that people prefer optimistic to pessimistic words can be confirmed.
1:6 J. Kaminski & Peter A. Gloor Table III. Correlation of Twitter sentiments and BitStamp close price (± 2 days) -2d bitst close -1d bitst close bitst close +1d bitst close +2d bitst close sum positive tweets -.022 -.083 -.130 -.175 -.208* sum negative tweets -.208* -.269** -.318** -.341** -.338** sum emotions -.066 -.130 -.180 -.222* -.248* sum hopefearworry -.262** -.298** -.274** -.256** -.269** ratio positive to negative.199*.229*.263**.281**.280** ** Correlation is significant at the 0.01 level (2-tailed) * Correlation is significant at the 0.05 level (2-tailed) Table III shows that negative tweets are negatively correlated with the BitStamp close price within a 48-hour timespan. Given the increasing significance with the passage of time, a lagging negative effect for negative tweets on the BitStamp price might be observed. The sum of emotions and positive tweets both show a negative correlation with regard to the future close price. Further data suggests that signals of uncertainty (hope, fear, worry) do not only amplify trading volumes (cf. Table IV) but also the close price: The higher the amount of uncertainty signals, the lower the BitStamp close price within ± 2days. As far as other signals indicate, an equilibrium in the amount of positive and negative signals positively correlates with a higher close price within the next two days. Table IV. Correlation of Twitter sentiments and BitStamp trading volume, price δ (± 2 days) -2d volume -1d volume +1d volume +2d volume -2d δ price +2d δ price sum positive tweets.010.302**.002 -.045 -.174 -.105 sum negative tweets.237*.505**.115.107 -.180.015 sum emotions.063.364**.028 -.012 -.183 -.082 sum hopefearworry.362**.546**.025 -.036 -.021 -.028 ratio positive to negative -.366** -.447** -.238* -.177.014 -.104 ** Correlation is significant at the 0.01 level (2-tailed) * Correlation is significant at the 0.05 level (2-tailed) Table IV demonstrates the correlation between the BitStamp (intraday) trading volume as well as the intraday price spread (High - Low) δ within a ±48-hour timespan. As far as data enables an interpretation, a high amount of emotions and signals of uncertainty (in the present) correlates with high trading volumes within the last 24 hours. Especially, a high amount of negative signals is a key influencer for trading volume within the past 24 hours. Again, a balanced ratio of positive and negative sentiments also contributes to a higher trading volume, while for intraday price spreads (δ), the current data does not support any significant influence by Twitter sentiments.
Nowcasting the Bitcoin Market with Twitter Signals 1:7 Now, following results can be drawn from this paper: (1) Intraday measurements of Twitter sentiments may contribute to determine the movement of Bitcoin close price and volume. While the predictive validity of this work is very limited, it may at least confirm the possibility of Nowcasting. (2) Static intraday data suggests that an equilibrium of positive and negative tweets correlates with a higher trading volume, while a higher amount of positive tweets leads to a higher close-price on average. (3) In terms of time series and lagging effects, especially negative sentiments positively correlate with the intraday trading volume within the last 48 hours. Also, signals of uncertainty positively correlate with the trading volume within the last 48 hours. This observation can be translated as follows: When the trading volume was (and is) high, emotions fly high on Twitter. The correlation of past trading volumes with current emotions may already suggest a yet unproven causality. (4) The positive correlation of a balanced ratio in emotional sentiments and the trading volume could designate Twitter as place that reflects the speculative momentum within the market. (5) Summarizing, the microblogging platform Twitter may be interpreted as a virtual trading floor that heats up with virtual emotions and reflects market movements. 4. OUTLOOK The underlying question of this early research is how far real world emotions influence real financial markets, like Bitcoin. What can be told from an analysis of the BitStamp market is that sentiments on Twitter could be a valuable source of information on a Bitcoin trading desk. While the current data of only 104 days already looks promising, a longitudinal analysis of about 6 months or more might provide a better quality of scientific expressiveness and point towards possible predictors, especially in view of the fact that we currently observe a very volatile market. Particularly, events such as the breakdown of Mt.Gox can be considered as (both internal and external) market shocks that essentially influence the considered data. Most important, the question of causality has to be considered, as it seems still a little unclear, if emotional tweets rather follow a vivid market (as suggested) or if they rather co-create it. Technically, correlations in this analysis follow the simple and assumption that the relation between variables is linear, which is hardly satisfied in the often appearing random walk of financial market movements. Here, a multivariate Granger (1969) causality test might contribute towards a model that may better identify the cause and effect relationship of Twitter sentiments and Bitcoin market indicators. Equal attention should be paid to data and sentiment quality, which is very limited in our current methodology. A better linguistic model (and equally broader queries) might contribute to more valuable information. Therefore, we cordially invite fellow researchers to keep a close eye on Twitter and the Bitcoin market and to improve the outlined approach.
1:8 J. Kaminski & Peter A. Gloor REFERENCES George A. Akerlof and Robert J. Shiller. 2009. Animal Spirits: How Human Psychology Drives the Economy, and Why It Matters for Global Capitalism. Princeton University Press. Werner Antweiler and Murray Z. Frank. 2004. Is all that talk just noise? The information content of Internet stock message boards. Journal of Finance 59, 3 (Juna 2004), 1259 1295. Hohan Bollen, Huina Mao, and Xiao-Jun Zeng. 2011. Twitter mood predicts the stock market. Journal of Computational Science 2, 1 (March 2011), 1 8. Hyunyoung Choi and Hal Varian. 2012. Predicting the Present with Google Trends. Economic Record 88 (June 2012), 2 9. Munmun De Choudhury, Hari Sundaram, Ajita John, and Dorée D. Seligmann. 2008. Can blog communication dynamics be correlated with stock market activity?. In In HT 08: Proceedings of the nineteenth ACM conference on Hypertext and hypermedia (2008). 55 60. Eric Gilbert and Karrie Karahalios. 2010. Widespread worry and the stock market. In Proceedings of the International Conference on Weblogs and Social. Peter A. Gloor, Jonas S. Krauss, Stefan Nann, Kai Fischbach, and Detlef Schoder. 2009. Web Science 2.0: Identifying Trends Through Semantic Social Network Analysis. In International Conference on Computational Science and Engineering, Vol. 4. Alejandro Jaimes, Eduardo Ruiz, Vagelis Hristidis, Carlos Castillo, and Aristides Gionis. 2012. Correlating Financial Time Series with Micro-Blogging Data. (2012). http://labs.yahoo.com/publication/ correlating-financial-time-series-with-micro-blogging-data/ Joshua A. Kroll, Ian C. Davey, and Edward W. Felten. 2013. The Economics of Bitcoin Mining, The Economics of Bitcoin Mining, or Bitcoin in the Presence of Adversaries. (November 2013). Satoshi Nakamoto. 2009. Bitcoin: A Peer-to-Peer Electronic Cash System. (2009). New York Times. 2013. Into the Bitcoin Mines. (2013). http://dealbook.nytimes.com/2013/12/21/into-the-bitcoin-mines/ Chong Oh and Olivia Sheng. 2011. Investigating Predictive Power of Stock Micro Blog Sentiment in Forecasting Future Stock Price Directional Movement. Proceedings of the International Conference on Information Systems, ICIS 2011 (January 2011). Rick Scott and George Loewenstein. 2010. Handbook of Emotions. Guilford Press, Chapter The Role of Emotion in Economic Behaviour, 138 158. Jianfeng Si, Arjun Mukherjee, Bing Liu, Qing Li, Huayi Li, and Xiaotie Deng. 2013. Exploiting Topic based Twitter Sentiment for Stock Prediction. Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, ACL 2013 2 (2013), 24 29. Timm O. Sprenger, Andranik Tumasjan, Philipp G. Sandner, and Isabell M. Welpe. 2013. Tweets and trades: The information content of stock microblogs. European Financial Management (May 2013). The Economist. 2013. Bitcoin under pressure. (November 2013). http://www.economist.com/news/technology-quarterly/ 21590766-virtual-currency-it-mathematically-elegant-increasingly-popular-and-highly The Wall Street Journal. 2013. Beware the Risks of the Bitcoin: Winklevii Outline the Downside. (July 2013). http://blogs.wsj. com/moneybeat/2013/07/02/beware-the-risks-of-the-bitcoin-winklevii-outline-the-downside/ Shaozhi Ye and Felix S. Wu. 2010. Measuring Message Propagation and Social Influence on Twitter.com. In Proceedings of the Second international conference on Social informatics. 216 231. Xue Zhang, Hauke Fuehres, and Peter A. Gloor. 2011. Predicting Stock Market Indicators Through Twitter I hope it is not as bad as I fear. Procedia - Social and Behavioral Sciences 26 (2011), 55 62.