Linked Data-based Social Media Analysis for Stock Market Tracking


Priyanka Dank
University of Bonn
Bonn NRW, Germany

Simon Scerri
Fraunhofer IAIS
Schloss Birlinghoven, Sankt Augustin NRW, Germany

Ali Khalili
Dept. of Computer Science, VU University Amsterdam
De Boelelaan 1081, 1081 HV Amsterdam, The Netherlands

Abstract

The rising demand for social media and the ensuing activities provide a rich data source for opinion mining. The efforts reported in this paper analyze a popular micro-blogging social network for one important application domain, i.e., stock market prediction. The novel approach developed uses Semantic Web technologies to interpret social media streams more accurately. The algorithm closely examines the visible public mood vis-a-vis an organization and related entities, namely products and affiliated persons, and considers the effect of these parameters in calculating the sentiment. The entities and their relationships to the organization of interest are retrieved from one of the largest Linked Data repositories available online. Furthermore, we assess the algorithm by collecting public sentiment as well as intra-day stock prices during our experiments, followed by statistical tests on the collected data. The evaluation takes into account different company profiles according to our heuristics to study any detected correlation in detail. The exercise is repeated with the benchmark method, which disregards the two additional entity types we consider. The evaluation considers whether the fluctuating sentiment is reflected in stock prices, and the most suitable time-lag.

I. INTRODUCTION

Seminal theories of stock market prediction suggest that stock price fluctuations are principally governed by a random walk, i.e., movements cannot be predicted [2], [3], [4].
However, a large number of recent studies have shown that prediction through social media content analysis is, to an extent or another, possible [1], [14], [24]. Social media has become an important tool in application areas concerned with shifting public opinions. Most opinion mining and sentiment analysis methods involve the use of some level of natural language processing in order to determine the polarity of the text, which is then attached as a label (e.g., positive, negative or neutral) or a number within a range [19]. In this paper, we also investigate whether social media can be used to predict stock market movements. To try achieving more significant results than related efforts, we introduce a novel method that exploits the existence of domain Linked Open Data (LOD) in order to more accurately gauge public online sentiment related to a primary entity (i.e., a company, organisation or enterprise). We also utilise social media streams from one of the most popular platforms, Twitter 1. In order to carry out this investigation, we apply a standard sentiment analysis method to the Twitter data stream to calculate and track changes in the online sentiment. Due to the concise and informal nature of tweets, sentiment analysis is known to be more challenging than when applied on other forms of text content [9]. We opt for an existent sentiment analysis method to determine public online sentiment at regular intervals, and identify whether any correlation exists between the resulting fluctuations and the actual stock markets for a given company. The decision to perform correlation tests lies in the assumption that public trust in a company, as well as (or based on) its products and key people, is the lurking variable or common factor between both i) changes in sentiment expression in the social media and ii) stock market values, related to the company. 
1 Current records indicate over 284 million monthly active users generating over 500 million tweets per day.

Besides correlation tests for concurrent data available for both variables, we also perform these tests repeatedly with different time-lags, to identify the most reliable time lapse between online sentiment expression and any corresponding trends in the stock market, provided that a correlation is identified. Furthermore, we advance the state of the art by attempting to improve the method for determining periodic sentiment for a given entity. Specifically, we investigate a novel method that gauges not only the sentiment surrounding a primary entity, but also that surrounding secondary entities (i.e., a company's products and key figures or personnel) that form part of an extended context. We thus refer to a benchmark and our novel alternative approach:

Benchmark (Keyword-based): Sentiment analysis is applied on tweets which directly mention the primary entity.
Alternative (Context expansion): Sentiment analysis is applied on tweets mentioning the primary entity (direct), associated entities (indirect), or a combination of both.

To enable this alternative approach, we first perform context expansion to return a list of secondary entities related to a primary entity. Context expansion is achieved by retrieving entities that i) conform to a specific semantic type and ii)

are semantically linked to the primary entity with a specific relationship. The results are obtained by executing pre-defined SPARQL queries over DBpedia. In the second step, the sentiment analysis task is performed on the tweets with extended results collectively, based on a system of weights that determine the relevance of the primary and secondary entities to the overall score. To evaluate the effectiveness of both benchmark and alternative methods, correlation tests are then performed on data collected over the course of at least 1 week, and the results are presented and contrasted. Before providing the details of the two approaches in Section 3, and the results of the correlation tests in Section 4, in Section 2 we report on related efforts in the area. After concluding in Section 5, we also suggest future avenues for this research.

II. RELATED WORK

Studies performed over the last decade have indicated that, rather than following a random walk, stock market changes are influenced by public sentiment [17]. In particular, strong correlations have been identified [24] between hope, fear and worry indices expressed in Twitter posts and stock market indicators such as the Dow Jones Industrial Average (DJIA), the National Association of Securities Dealers Automated Quotations (NASDAQ) and Standard & Poor's (S&P). The researchers behind the same experiment also observed that emotional outbursts in the social media are a reliable predictor of stock market levels on the following day. Additional studies [1], [5] have similarly concluded that financial decisions can be linked to the universal public mood, and that stock market movements can therefore be predicted by monitoring public mood swings in the social media. Other efforts developed algorithms that combine social media sentiment analysis with recent stock prices in order to predict stock market movements [13].
Our approach is also based on the assumption that public opinion reflects or even precedes stock market movements for a listed company, and we also rely on social media as an instant reflection of public mood worldwide. Regardless of the affirmative indications reported in previous work, the first experiments performed in the evaluation of our approach also investigate the presence and extent of the correlation between the two variables.

A large number of related efforts also consider entity-dependent approaches to social media-based stock market prediction. The approach described in [6] retrieves tweets based on a primary keyword. Target-dependent features are then identified by considering the grammatical structure and syntax of the sentence. Alternatively, efforts such as [22] rely on the selection of one of the S&P500 stock symbols as the entity of interest when retrieving tweets for analysis. The efforts presented in [15] also follow this approach but go one step further by collecting data from other sources in addition to Twitter, including 11 online message boards such as Yahoo! Finance as well as various news articles. Other approaches apply active learning by training classifiers based on manual, company-focussed labelling of tweets [23] to consider the collective impact of tweets on a specific company, its products and its stock value. However, our approach differs, since none of the approaches discussed above consider secondary entities that are closely related to the company of interest when retrieving social media posts on which to perform the sentiment analysis.

Sentiment analysis and opinion mining are extensive research fields in their own right, and in this paper we do not seek to compare the state of the art in the area. Instead, we adopt an existing method which we deem most suitable for our approach. However, we compare the way we apply the chosen method to other approaches.
Most of the more advanced efforts consider the detection of key words or phrases expressing an opinion in the vicinity of a respective entity, rather than in the entire text [11], [20]. This is even more crucial when considering the brevity of social media posts, which tend to be rather short (or very short, in the case of tweets, which are also known as microposts). Thus, we also consider sentiment around identified entities of interest. To calculate such a sentiment score, efforts such as [8] consider different lengths of sentiment regions, or token windows. We similarly consider a token window of ±3. The application of ontologies or Linked Data for this usecase is not novel. Efforts like [9] describe ontology-based techniques to calculate individual and more reliable sentiment scores for each distinct entity in a post. The authors of [21] describe an approach that adds semantic concepts as additional features in training sets for sentiment analysis. An evaluation concludes that semantic features produce a better F-score only when considering negative sentiment. Our context expansion proposal is similar to semantic query expansion, whereby an initial query related to a keyword or entity is expanded to cover additional entities. This method has been investigated for different use-cases. In [18], for example, lexico-semantic expansion is used to enhance Twitter-based event detection. Semantic relationships between hashtags, inferred from correlation statistics, are investigated to expand the original hashtags and retrieve further tweets. In contrast to this kind of approach, we do not need to manually create or infer new concept maps in order to perform our company-centric entity set extraction. Instead, we rely on up-to-date open data provided by the international community. Specifically, we take advantage of the vast information available in the DBpedia knowledge repository, a machine-processable interpretation of Wikipedia and a major node in the LOD Cloud. 
Related efforts such as [10] have successfully used DBPedia-based semantic networks and spreading activation methods to generate more extended and more accurate user interest profiles. Similarly, we use DBPedia to derive ontological relationships between a company and its products and key personnel. The reliance on a LOD repository also guarantees that we do not need to manually update the concept maps when facts change (e.g. key personnel change, new products are released), since in Wikipedia and subsequently DBpedia this tends to be done

quite instantly by individuals with an interest or stake in the described concepts.

III. LINKED DATA-BASED COMPANY SENTIMENT TRACKING

Fig. 1 illustrates the implemented company sentiment tracking process, which is composed of three main steps:

1) Linked Data-based Context Expansion
2) Linked Data-based Stream Processing
3) Periodic Sentiment Approximation

The methodology behind the above stages is described in detail in the sections below. A reference implementation of the approach has been developed as a plugin to our Realtime Semantic Analyzer (ReSA). ReSA is an extension of the context platform [7], which fetches real-time tweets referring to an entity of interest (a string) after this is resolved to a DBpedia entity using DBpedia Spotlight [12]. Fig. 2 shows a screenshot of the ReSA SentiTracker plugin in use.

Fig. 1. Process Flow Diagram

Fig. 2. ReSA SentiTrack in use.

A. Linked Data-based Context Expansion

The decision to expand a primary entity of interest (a company) into a set of entities is based on the assumption that the strategy planning done by key personnel affiliated with that company, as well as the quality and reliability of the products and services it offers, influence public opinion about the company, and can ultimately also affect its stocks. Thus, for our objective it is also worth considering microposts mentioning products and/or key personnel, even when the company in question is not. In addition, a tweet containing two or more relevant entities can be considered to be more representative when approximating public sentiment.

After a company is matched to a DBpedia representation, the URI is used to retrieve secondary entities by executing a pre-defined set of SPARQL queries over the same repository (using the sparql-client NodeJS package and the DBpedia SPARQL endpoint). The retrieval is not straightforward due to the lack of structure and consistency exhibited by DBpedia. For example, although at times key linked people were available as instances of foaf:person, in other cases they were merely expressed as a string. In addition, a number of DBpedia properties are more or less arbitrarily used to link a company with its products and key people. Following a domain-dependent survey of DBpedia content, the list of properties shown in Table I was identified. These were then embedded in a pre-defined series of nested SPARQL queries to obtain the two types of relevant entities.

B. Linked Data-based Stream Processing

The set of entities returned by the SPARQL queries is stored and their labels are fed to the Twitter Streaming API for filtering purposes. Public tweets referring to any number of these labels are retrieved and subjected to a number of operations, as outlined in the following subsections.

1) Entity Recognition and Annotation: In this step, DBpedia Spotlight is invoked to automatically annotate mentions of DBpedia resources in the micropost text. Before applying the Named Entity Recognition (NER) tool, we use a language detector tool to detect the language of the tweet. Based on the detected language, we dispatch the NER task to the DBpedia Spotlight instance tailored to that specific language. This improves the quality of NER by employing better indexes for natural language processing and entity spotting. The output of the NER task is a set of quads consisting of a URI, the matching surface form, the entity type and the

offset in the given text. Amongst the returned results, the ones matching the entities of interest are instantly identified by matching the returned URI to the ones returned by the SPARQL queries. Spotlight, however, is not fully adequate to tackle our requirements. First, initial trials indicated that Spotlight on occasion fails to identify all matching surface forms. For this reason, we implemented a second procedure to ensure that all occurrences of SPARQL-retrieved entities in the text are recognised and annotated. Second, for our sentiment analysis task we do not require all identified DBpedia entities, but only the ones in the expanded context. Therefore, entities which do not match are discarded by employing a SPARQL ASK query.

TABLE I
PROPERTIES EMBEDDED IN THE SPARQL QUERIES

Products:
dbpedia.org/ontology/developer
dbpedia.org/ontology/manufacturer
dbpedia.org/ontology/designer
dbpedia.org/ontology/builder
dbpedia.org/ontology/product
dbpedia.org/ontology/owner
dbpedia.org/property/parent
dbpedia.org/property/parentcompany
dbpedia.org/property/constructors
dbpedia.org/property/engine
dbpedia.org/property/manufacturer
dbpedia.org/property/manufacturer
dbpedia.org/property/developer
dbpedia.org/property/owner
dbpedia.org/property/currentowner

Key People:
dbpedia.org/ontology/keyperson
dbpedia.org/ontology/occupation
dbpedia.org/ontology/board
dbpedia.org/ontology/employer
dbpedia.org/ontology/knownfor
dbpedia.org/ontology/division
dbpedia.org/ontology/institution
dbpedia.org/ontology/foundedby
dbpedia.org/property/keypeople
dbpedia.org/property/ceo
dbpedia.org/property/editor
dbpedia.org/property/chiefeditor
dbpedia.org/property/publisher
dbpedia.org/property/employer
dbpedia.org/property/team
dbpedia.org/property/teams
dbpedia.org/property/sbkmanufacturers
dbpedia.org/property/office
dbpedia.org/property/workinstitutions
dbpedia.org/property/title
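The way properties such as those in Table I might be embedded in a retrieval query can be sketched as follows. This is a minimal illustration in Python, not the authors' NodeJS implementation: the function name `build_expansion_query`, the UNION-based query shape, the matching in both directions and the English-label filter are all assumptions, since the paper does not reproduce its actual queries.

```python
from typing import Iterable

# A few of the Table I product properties, written as full URIs.
# The complete list and the real query shape are not given in the paper.
PRODUCT_PROPERTIES = [
    "http://dbpedia.org/ontology/manufacturer",
    "http://dbpedia.org/ontology/developer",
    "http://dbpedia.org/property/manufacturer",
]


def build_expansion_query(company_uri: str, properties: Iterable[str]) -> str:
    """Build a SPARQL query for entities linked to company_uri via any property.

    Assumption: a property may point from the entity to the company or from
    the company to the entity, so both directions are matched.
    """
    branches = []
    for prop in properties:
        branches.append(f"{{ ?entity <{prop}> <{company_uri}> }}")
        branches.append(f"{{ <{company_uri}> <{prop}> ?entity }}")
    pattern = "\n    UNION ".join(branches)
    return (
        "SELECT DISTINCT ?entity ?label WHERE {\n"
        f"    {pattern}\n"
        "    ?entity <http://www.w3.org/2000/01/rdf-schema#label> ?label .\n"
        '    FILTER (lang(?label) = "en")\n'
        "}"
    )


query = build_expansion_query("http://dbpedia.org/resource/BMW", PRODUCT_PROPERTIES)
print(query)
```

The resulting string could be sent to any SPARQL endpoint; entities that DBpedia stores only as plain strings would not carry a label and would need a separate query branch.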
A post-filtering step is then executed to exclude any resulting tweets which have no relevant entities annotated. Table II illustrates an example of results obtained following this step, showing the identified entity URIs and their known type(s).

TABLE II
ANNOTATED TWEET EXAMPLES

Company: Apple Inc.
Filtered tweet: 10 Quotes That Will Make You Love Apple CEO Tim Cook $AAPL
Entity: [DBpedia:Agent, Schema:Organization, DBpedia:Organisation, DBpedia:Company]; Cook [DBpedia:Agent, Schema:Person, Foaf:Person, DBpedia:Person].

Company: Google
Filtered tweet: Google accidentally leaked hundreds of thousands of customers' personal details and didn't notice for 2 years
Entity: [DBpedia:Company, DBpedia:Organisation, DBpedia:Agent, Schema:Organization].

Company: Microsoft
Filtered tweet: Microsoft in positive talks with all publishers for Xbox One's backward compatibility. com/1e9cuxy
Entity: [DBpedia:Agent, Schema:Organization, DBpedia:Organisation, DBpedia:Company]; one [Schema:CreativeWork, DBpedia:Work, DBpedia:Software].

2) Entity-based Weight Assignment: Expanding one initial entity of interest into a set which includes secondary entities introduced the need to consider the different impact that each entity type has on the computed tweet sentiment score. For this purpose, we devised a series of interest profiles with varying entity-type weights. For our main use-case, we noted that for some kinds of companies, tweets expressing public sentiment tend to be more product-oriented, whereas for others they are more personnel-oriented. Based on this observation, we devised the heuristic shown in Table III. All three profiles consider company references as the most influential (half of the total weight), but differ with respect to product and personnel weights. If the company is unknown at execution stage, profile A is assigned by default. The assigned weights are used in conjunction with the computed sentiment scores in the weighted sentiment score calculation step, which is explained in section III-B4.
In direct comparison to this weighting scheme, the benchmark method (which only considers the presence of company references) would only have a fixed weight of 0.5 for each company reference. However, in order to determine whether factoring in additional related entities improves sentiment tracking, in the evaluation no such weight is established for the benchmark, i.e., a neutral weight of 1 is assumed for each company reference.

TABLE III
ENTITY-BASED WEIGHTS

Columns: Profile; Fitting Company Examples; Company Weight; Product Weight; Person Weight
Company A: Apple Inc., Google
Company B: BMW, Unilever
Company C: Goldman Sachs, Morgan Stanley

3) Entity-centric Sentiment Calculation: Our sentiment calculation follows a lexicon-based approach, using the AFINN-111 wordlist, which contains 2477 manually-rated English words and phrases [16]. We employ the NodeJS Sentiment module, which uses the above wordlist to assign scores to a given text. However, the end-score assigned by this module is a cumulative score of the identified words in the entire text. Since we want to consider the sentiment around detected entities of interest individually, this was not an ideal solution. Therefore, we only use the Sentiment module to identify positive and negative matches against the AFINN-111 wordlist. These are assigned scores of ±1. We then calculate sentiment around each relevant entity by considering the sentiment lexicon entries present in its immediate vicinity. Given the 140-character limitation of tweets, looking at the 3 tokens before and after the entity is considered appropriate. The entity-centric score is calculated by summing up the scores of any identified sentiment lexicon entries in this range. In addition, following the weight assignment strategy introduced in Table III we respectively sum up company, product

and person sentiment scores so that each micropost analysis results in three sentiment scores (one for each entity-type). Table IV shows sample results for the tweets in Table II, including identified relevant entities and entries in the sentiment lexicon.

TABLE IV
TWEET SENTIMENT SCORES

Columns: No.; Filtered Tweets; cs; prs; kps
#1: 10 Quotes That Will Make You Love(+) Apple CEO Tim Cook $AAPL
#2: Google accidentally leaked hundreds of thousands of customers' personal details and didn't notice for 2 years
#3: Microsoft in positive(+) talks with all publishers for Xbox One's backward compatibility.

4) Weighted Sentiment Score Calculation: In this stage, the assigned entity-based weight is combined with the entity-centric sentiment score to obtain a weighted tweet sentiment score, $t_{ws}$. The calculation is expressed by Equation 1:

$$t_{ws} = (c_s \times c_w) + (pr_s \times pr_w) + (kp_s \times kp_w) \qquad (1)$$

where $c_s$, $pr_s$ and $kp_s$ are respectively the individual sentiment scores calculated for each company, product(s) and key person(s) referenced in the text, and $c_w$, $pr_w$ and $kp_w$ are the profile-defined company, product and key personnel weights. Table V shows the result for the three example tweets based on the intermediary results shown in Table IV and the relevant weight profile introduced in Table III:

TABLE V
WEIGHTED TWEET SENTIMENT CALCULATION

Columns: No.; Profile; Entity-type Sentiment (cs, prs, kps); Entity-type Weights ($c_w$, $pr_w$, $kp_w$); $t_{ws}$
#1: Profile A
#2: Profile A
#3: Profile A

C. Periodic Sentiment Approximation

The method so far is concerned with calculating sentiment for individual tweets in continuous social media streams. The next decision was how the resulting stream of sentiment scores can be interpreted to approximate the general public sentiment expressed about an entity in more discrete time periods. This is required to enable comparative studies with other information streams, such as stock levels for our stock market tracking ambitions.
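As a minimal sketch of the per-tweet scoring of Section III-B (the ±3 token window and Equation 1), the Python fragment below may help. It is not the authors' NodeJS code. The tiny `LEXICON` stands in for AFINN-111, the product and person weights in `WEIGHTS` are hypothetical (only the company weight of 0.5 is stated in the text), and each entity is reduced to a single token position for simplicity.

```python
from typing import Dict, List

# Stand-in lexicon with ±1 scores; the paper matches words against AFINN-111.
LEXICON = {"love": 1, "positive": 1, "leaked": -1}

# Hypothetical profile: only the company weight (0.5) is stated in the text,
# the product and person weights here are made up for illustration.
WEIGHTS = {"company": 0.5, "product": 0.3, "person": 0.2}


def entity_score(tokens: List[str], index: int, window: int = 3) -> int:
    """Sum the lexicon scores found within `window` tokens of tokens[index]."""
    start = max(0, index - window)
    end = min(len(tokens), index + window + 1)
    return sum(
        LEXICON.get(tokens[i].lower(), 0) for i in range(start, end) if i != index
    )


def weighted_tweet_score(
    tokens: List[str], entities: Dict[int, str], weights: Dict[str, float] = WEIGHTS
) -> float:
    """Equation 1: per-type sentiment sums combined with the profile weights."""
    per_type = {"company": 0, "product": 0, "person": 0}
    for index, entity_type in entities.items():
        per_type[entity_type] += entity_score(tokens, index, )
    return sum(per_type[kind] * weights[kind] for kind in per_type)


tweet = "10 Quotes That Will Make You Love Apple CEO Tim Cook $AAPL".split()
# Token 7 is "Apple" (company); token 9 is "Tim" (person, first token only).
score = weighted_tweet_score(tweet, {7: "company", 9: "person"})
print(score)
```

With these illustrative weights, "Love" falls inside the window of both entities, so the company and person scores each contribute once. A fuller version would treat multi-token entities such as "Tim Cook" as a single span.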
We approach the above requirement by calculating a moving average that is representative of the general sentiment expressed in a specific time period. Equation 2 defines the dynamic moving average sentiment score $S_p$ calculated during a time period $p$. When the time period has elapsed, the final $S_p$ value represents the average detected sentiment in that time period.

$$S_p = \frac{T_{ws}}{T} \qquad (2)$$

where $T$ is the total number of (relevant) tweets for which a sentiment score is calculated within time period $p$ and $T_{ws}$ is the sum of these scores in the same period (footnote 7). In order to smooth out the evolving $S_p$ values and ensure a value in the range of ±1, we also introduce the following normalisation function $n(S_p)$:

if $S_p > 0$:
$$n(S_p) = \left(1 - \frac{1}{1 + S_p}\right) \qquad (3)$$
else:
$$n(S_p) = \left(\frac{1}{1 - S_p} - 1\right) \qquad (4)$$

In order to demonstrate the behaviour of the process and the described functions, we extend the examples shown to obtain the first three values of the normalised moving average sentiment $n(S_p)$ for the three tweets shown in the running example (footnote 8).

TABLE VI
MOVING AVERAGE SENTIMENT CALCULATION

Columns: Total Tweets $T$; Incoming Tweet $t_{ws}$; Total Sentiment $T_{ws}$; Moving Average $S_p$; Normalised $n(S_p)$

IV. EXPERIMENTS AND DISCUSSION

The main hypothesis behind our contributions, in the context of the selected use-case, is that social media could reflect or even predict stock market movements. To evaluate our proposed method and investigate this hypothesis, we establish the following three research questions:

1) Can any correlation between periodical sentiment approximations and corresponding stock movements be identified? As discussed in the introduction, one can interpret positive correlation as an indication that both expressed social media sentiment and stock market values are influenced by a third common factor: public trust in a company (and its key people and products).
7 By definition a moving average evolves indefinitely. However, periodical sentiment scores are easier to define and work with. In particular, for the use-case targeted in this paper it is more useful to compare an average sentiment score that is calculated on an hourly, daily, weekly, etc., basis.

8 In reality the consideration of these tweets in conjunction is not valuable for our use-case, since they refer to different companies. However, we stick to the same examples for the sake of simplicity.

2) If a positive correlation is identified, can social media be used to predict stock movements, i.e., does the expressed general sentiment precede stock changes, and

for which time interval is the observed correlation most significant?

3) Does our context-expansion approach (company, products, key persons) offer any observable value over the benchmark approach (company only)?

A. Experimental Setup

To answer the above questions, we organised the following experimental setup. The implemented approach was deployed on two workstations, which were configured to consume, filter and classify live tweets, as well as stock market movements, over a week. We considered 6 companies corresponding to the different company profiles in Table III. As per Equations 2 and 3, we then set a time period of 1 week, and executed the processing only during those hours in which the stock markets are active. To address the first question, we then submit the two corresponding value pairs to correlation tests. To address the second question, we perform these tests repeatedly with different time-lags to identify the most promising time lapse between online sentiment expression and any corresponding trends in the stock market. Besides the default real-time comparison (lag 0, executed every 60 minutes on the hour), we submit pairs of values for correlation tests with incremental hourly lags (lag of 1 to 12 hours). To address the third question, the above experiment was parallelised to execute both Benchmark and Alternative approaches at the same time, so as to enable a direct comparison.

In addition to the above basic setup and procedure, we introduced a second experiment. In this variant, the number of tweets was reset every 24 hours. This arrangement translates to a fluctuating moving average sentiment score calculated on a daily basis, rather than a continuously calculated score over the entire span of one week. This means that the relevance of older sentiment scores expires at a steeper rate in favour of recent ones. In addition, in this variant products and/or key personnel are only factored in when either of them is present.
In terms of the chosen weighting system, it means that if no product or person was recognised in a relevant tweet, the company weight was raised to a full 100%. Thus, only when other entity types are detected is the relevance of company-centric sentiment reduced to also account for sentiment detected around secondary entities. This last condition only applies to the alternative approach, since the benchmark approach only considers one primary entity. The introduction of this variant produces four different results, which we refer to below as Benchmark-Basic and Alternative-Basic and their variants Benchmark-Var and Alternative-Var.

B. Results

In this section, we interpret and discuss the data collected in the week-long experiment. To illustrate the data collected for each of the chosen 6 companies, Fig. 3 shows a sample (37 observations) of the results collected over the entire period for one company: BMW. The stock market movements observed are shown (values on the right y-axis) together with the four calculated moving average sentiments based on the four methods under consideration (values on the left y-axis). From the plot, it can instantly be observed that the decision to reset the number of tweets when calculating the moving average sentiment in the variants produces a behaviour that is less smooth (dashed and dotted lines) than that of their basic counterparts (solid lines).

Fig. 3. BMW Moving Average Sentiment scores against Stock Market values for

TABLE VII
INSTANT CORRELATION RESULTS (NO TIME LAPSE) FOR DIFFERENT EXPERIMENTS

Columns: Company; Benchmark (Basic, Var); Alternative (Basic, Var)
Rows: Apple Inc; Google; Average (Profile I); BMW; UniLever; Average (Profile II); Morgan Stanley; Goldman Sachs; Average (Profile III); AVERAGE; MEDIAN

Another visually observable result is the indication that there could indeed be some degree of correlation between the social media-based scores and the stock market movements.
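The periodic scoring of Section III-C and the lagged comparisons of the experimental setup can be sketched as follows. This is an illustrative Python fragment under stated assumptions: the normalisation is taken to have the form s / (1 + |s|), which keeps values within ±1, and Pearson's coefficient is used as the correlation test, although the paper does not name the specific test. The function names are hypothetical.

```python
from math import sqrt
from typing import List, Sequence


def moving_average(scores: Sequence[float]) -> List[float]:
    """Running average S_p after each incoming tweet score (Equation 2)."""
    averages, total = [], 0.0
    for count, score in enumerate(scores, start=1):
        total += score
        averages.append(total / count)
    return averages


def normalise(s_p: float) -> float:
    """Squash S_p into (-1, 1); assumed form s / (1 + |s|)."""
    return s_p / (1.0 + abs(s_p))


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient (assumed test; no zero-variance check)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (dev_x * dev_y)


def lagged_correlation(
    sentiment: Sequence[float], prices: Sequence[float], lag: int
) -> float:
    """Correlate sentiment at hour t with the stock value at hour t + lag."""
    if lag:
        sentiment, prices = sentiment[:-lag], prices[lag:]
    return pearson(sentiment, prices)


hourly_sentiment = [normalise(s) for s in moving_average([1.0, -1.0, 3.0, 0.5, 2.0])]
hourly_prices = [100.0, 101.5, 100.2, 102.0, 101.1]
for lag in range(0, 3):
    print(lag, lagged_correlation(hourly_sentiment, hourly_prices, lag))
```

The price series above is invented purely to make the loop runnable. Resetting the running total every 24 hours, as in the variant experiment, would amount to calling `moving_average` once per day of scores.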
To examine this possibility, we submitted the computed moving average sentiment values (for each of the four described methods) and the corresponding observable stock market values to correlation tests. The experiment was repeated for the six identified companies and the 24 distinct results are shown in Table VII. In order to interpret the results collectively, averages are also calculated for each of the three company profiles (intermediate rows) and for all six companies collectively (last row). The latter is supplemented by the median. By observing these values we can draw the following conclusions:

1) On average, there is low to moderate correlation (with coefficients between -1 and 1) between the sentiment scores and

the stock values. The average result is not completely random (supported by the tests for the median), and only one of the six companies exhibits a negative correlation (Unilever).

2) The results suggest that the basic method performs generally better than the variant, for both benchmark and alternative.

3) The results suggest that our proposed alternative method does not perform significantly better or worse than the benchmark. A two-tailed t-test at the 0.05 level of significance fails to show a difference.

The correlation values between the two variables in Table VII were computed at the end of each hour in the testing period. The next series of experiments sought to investigate whether higher correlation levels are observed when the two variables are not considered at the same instant. The main purpose of this experiment is to determine which time interval is most suitable for predicting market movements from the expressed online sentiment. Fig. 4 shows the correlation coefficients calculated for each of the six selected companies when shifting the comparisons by 12 incremental hourly intervals. The first value shown in each graph (time lag = 0) is equivalent to the results shown in Table VII. Similarly, the correlation results are produced for all four alternative methods.

Fig. 4. Correlation values for 12 different time intervals

From these graphs, it can be observed that there is no general rule for determining the best time interval. In three of the experiments, the basic benchmark and alternative methods perform best instantly (Google, BMW, Goldman Sachs). In the other three, higher correlations are observed using time intervals 6 (Unilever) and 12 (Apple Inc., Morgan Stanley). Considering only the two basic methods (benchmark-basic, alternative-basic), the averages for the 6 result sets in Fig. 4 were calculated to interpret the results further. The result, shown in Fig. 5, suggests that instant correlations tend to be stronger.

Fig. 5.
Average correlation values for 12 different time intervals

The above results provide a systematic answer to all three targeted questions. Although larger-scale experiments are required to confirm the answer to the first question, based on our experiments there is a strong indication that there is a common factor influencing both social media sentiment and stock market values related to a company. Although longer experiments can be organised and planned as future work, the data provided by the public Twitter API might not be sufficient to provide the required data test bed. Our answer to the second question is that, if it is possible to predict stock market movements based on sentiment expressed in social media, the prediction is more or less instant and the likelihood of a correct prediction starts to drop sharply if interpreted within an hour's timeframe, or longer.

The answer to the third question, which is also the main hypothesis targeted in this paper, proved somewhat disappointing. Although our alternative method is more complex than the benchmark, and also considers tweets that refer to the secondary object of interest even when the primary one is unmentioned, the same correlation levels are observed. Following an in-depth manual investigation of the results, we observed that although our extended method is of value, the majority of tweets retrieved when performing the sentiment scoring refer to the primary object of interest, and thus the effect of the other tweets remains negligible. Secondly, we also observed that the biggest value of our linked data-based method relies on the proper identification of multiple entities of interest in one tweet, and on the success rate of the sentiment analysis. Unfortunately, the two existing libraries we used for entity recognition and sentiment analysis do not always perform satisfactorily. For example, although the tweet Google Maps for Android got a smashing new feature

8 that will make iphone users jealous should be recognised and rated as positive based on our alternative method (when considering Google ), it fails both our filtering stage and the sentiment scoring. DBpedia Spotlight fails to recognise Android ; and in general the library has been observed to omit recognition of some entities seemingly randomly. In addition, although the positively-connotated word smashing is within the region of the recongised company product, it is not in the selected wordlist used by the selected sentiment module. V. CONCLUSIONS AND FUTURE WORK The presented novel method monitors sentiment expressed in the Social Media with the intent of predicting stock market movements. It advances contemporary studies by performing context expansion based on a primary entity and semantic knowledge extracted from LOD. In our implementation, we relied on Twitter and DBPedia respectively as the social media stream and knowledge repository of choice. Linked Data-based context expansion replaces the need to manually create and maintain concept maps when monitoring a single main entity of interest is not sufficient. In our use-case, we simply adopt existing ontological relationships that link a company to its products and key personnel. We investigated different methods for monitoring evolving public sentiment for a chosen company, ahead of comparing the results against its intra-day stock prices. The two variables were subjected to a series of correlation tests to identify the potential and most suitable time interval for stock market prediction. At the same time, we also compared our context expansion method against the company-centric benchmark. The results indicate low to moderate correlation, confirming that social media analysis can contribute to stock market prediction, especially when interpreted within a 1-hour time frame. 
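The lagged correlation tests summarised above can be illustrated with a minimal sketch. It assumes a Pearson correlation coefficient (this section does not name the statistic used) and assumes that a lag of k hours pairs the sentiment score at hour t with the stock price at hour t + k. The function names `pearson` and `lagged_correlations` and the toy series are hypothetical; they are not taken from the implemented tracker or from the experimental data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def lagged_correlations(sentiment, prices, max_lag=12):
    """Correlate sentiment at hour t with the price at hour t + lag.

    Assumption: both series are sampled hourly and aligned at index 0.
    Returns a dict mapping each lag (0..max_lag) to its coefficient.
    """
    result = {}
    for lag in range(max_lag + 1):
        paired_sentiment = sentiment[: len(sentiment) - lag]
        paired_prices = prices[lag:]
        result[lag] = pearson(paired_sentiment, paired_prices)
    return result


# Toy hourly series, made up for illustration only: each price is a linear
# function of the sentiment score observed one hour earlier.
sentiment = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0, 2.0, 7.0]
prices = [20.0] + [2 * s + 5 for s in sentiment[:-1]]

by_lag = lagged_correlations(sentiment, prices, max_lag=3)
best_lag = max(by_lag, key=lambda lag: abs(by_lag[lag]))
print(by_lag)
print("strongest correlation at lag", best_lag)
```

Selecting the lag with the largest absolute coefficient is only one possible reading of "most suitable time interval"; with short hourly series, longer lags leave fewer paired observations, so the coefficients for different lags are not equally reliable.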
Although our alternative method did not offer any immediate advantage, the analysis of individual results indicates a higher potential when it is used to track companies whose products or key personnel are also widely and individually discussed. The implemented tracker, which is available online, is also provided as open source, enabling extension by the interested community.

We will strive to improve the implemented method in three main ways. First, we will attempt to improve entity-centric scoring through the adoption or development of a sentiment analysis module that is better suited to the stock market domain. Second, the selection of LOD as a vessel for semantic expansion has its limitations; for example, DBpedia has an error rate of 5-10%. Considerable effort continues to be invested in improving data curation technology and guaranteeing the veracity of these datasets. Being at the forefront of these efforts through projects such as LOD2 and Diachron, we will immediately apply the most recent best practices. Third, we will extend social media stream coverage to include other social networks such as Facebook and LinkedIn, as well as reliable product review fora.

ACKNOWLEDGEMENT

We thank the CEOs of Stockpulse GmbH, Mr. Stefan Nann and Mr. Jonas Krauss, for their collaboration in our efforts.

REFERENCES

[1] Johan Bollen, Huina Mao, and Xiaojun Zeng. Twitter mood predicts the stock market. Journal of Computational Science, 2(1):1-8.
[2] Maria Rosa Borges. Efficient market hypothesis in european stock markets. The European Journal of Finance, 16(7).
[3] Paul H Cootner. The random character of stock market prices.
[4] Eugene F Fama. Efficient capital markets: II. The Journal of Finance, 46(5).
[5] Eric Gilbert and Karrie Karahalios. Widespread worry and the stock market. In ICWSM, pages 59-65.
[6] Long Jiang, Mo Yu, Ming Zhou, Xiaohua Liu, and Tiejun Zhao. Target-dependent twitter sentiment classification. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1. ACL.
[7] Ali Khalili, Sören Auer, and Axel-Cyrille Ngonga Ngomo. conTEXT - lightweight text analytics using linked data. In 11th Extended Semantic Web Conference (ESWC 2014). Springer.
[8] Soo-Min Kim and Eduard Hovy. Determining the sentiment of opinions. In Proceedings of the 20th International Conference on Computational Linguistics. Association for Computational Linguistics.
[9] Efstratios Kontopoulos, Christos Berberidis, Theologos Dergiades, and Nick Bassiliades. Ontology-based sentiment analysis of twitter posts. Expert Systems with Applications, 40(10).
[10] Nicolas Marie, Olivier Corby, Fabien Gandon, and Myriam Ribière. Composite interests exploration thanks to on-the-fly linked data spreading activation. In Hypertext 2013, Paris, France, May.
[11] Justin Martineau, Akshay Java, Pranam Kolari, Tim Finin, Anupam Joshi, and James Mayfield. Blogvox: Learning sentiment classifiers. In Proceedings of the 22nd National Conference on Artificial Intelligence - Volume 2, AAAI'07. AAAI Press.
[12] Pablo N. Mendes, Max Jakob, Andrés García-Silva, and Christian Bizer. DBpedia Spotlight: shedding light on the web of documents. In Proceedings of the 7th International Conference on Semantic Systems, I-Semantics'11, pages 1-8, New York, USA. ACM.
[13] Anshul Mittal and Arpit Goel. Stock prediction using twitter sentiment analysis. Stanford University, CS229 (stanford.edu/proj2011/goelmittal-StockMarketPredictionUsingTwitterSentimentAnalysis.pdf).
[14] Helen Susannah Moat, Chester Curme, Adam Avakian, Dror Y Kenett, H Eugene Stanley, and Tobias Preis. Quantifying wikipedia usage patterns before stock market moves. Scientific Reports, 3.
[15] Stefan Nann, Jonas Krauss, and Detlef Schoder. Predictive analytics on public data - the case of stock markets. In ECIS, page 102.
[16] Finn Årup Nielsen. A new ANEW: Evaluation of a word list for sentiment analysis in microblogs. CEUR Workshop Proceedings.
[17] John R Nofsinger. Social mood and financial economics. The Journal of Behavioral Finance, 6(3).
[18] Ozer Ozdikis, Pinar Senkul, and Halit Oguztuzun. Semantic expansion of hashtags for enhanced event detection in twitter. In Proceedings of the 1st International Workshop on Online Social Systems. Citeseer.
[19] Bo Pang and Lillian Lee. Opinion mining and sentiment analysis. Foundations and Trends in Information Retrieval, 2(1-2):1-135.
[20] Ana-Maria Popescu and Oren Etzioni. Extracting product features and opinions from reviews. In Natural Language Processing and Text Mining. Springer.
[21] Hassan Saif, Yulan He, and Harith Alani. Semantic sentiment analysis of twitter. In The Semantic Web - ISWC 2012. Springer.
[22] Jianfeng Si, Arjun Mukherjee, Bing Liu, Qing Li, Huayi Li, and Xiaotie Deng. Exploiting topic based twitter sentiment for stock prediction. In ACL (2), pages 24-29.
[23] Jasmina Smailović, Miha Grčar, Nada Lavrač, and Martin Žnidaršič. Stream-based active learning for sentiment analysis in the financial domain. Information Sciences, 285.
[24] Xue Zhang, Hauke Fuehres, and Peter A Gloor. Predicting stock market indicators through twitter "I hope it is not as bad as I fear". Procedia - Social and Behavioral Sciences, 26:55-62, 2011.


More information

Prediction of changes in the stock market using twitter and sentiment analysis

Prediction of changes in the stock market using twitter and sentiment analysis Prediction of changes in the stock market using twitter and sentiment analysis Iulian Vlad Serban, David Sierra González, and Xuyang Wu University College London Abstract Twitter is an online social networking

More information

Towards SoMEST Combining Social Media Monitoring with Event Extraction and Timeline Analysis

Towards SoMEST Combining Social Media Monitoring with Event Extraction and Timeline Analysis Towards SoMEST Combining Social Media Monitoring with Event Extraction and Timeline Analysis Yue Dai, Ernest Arendarenko, Tuomo Kakkonen, Ding Liao School of Computing University of Eastern Finland {yvedai,

More information

JamiQ Social Media Monitoring Software

JamiQ Social Media Monitoring Software JamiQ Social Media Monitoring Software JamiQ's multilingual social media monitoring software helps businesses listen, measure, and gain insights from conversations taking place online. JamiQ makes cutting-edge

More information

TWITTER AND FINANCIAL MARKETS

TWITTER AND FINANCIAL MARKETS The 215 WEI International Academic Conference Proceedings TWITTER AND FINANCIAL MARKETS Muktamala Chakrabarti, Asim Kumar Pal, Ashok Banerjee Indian Institute of Management Calcutta D.H. Road, Joka, Kolkata,

More information

Analysis of Tweets for Prediction of Indian Stock Markets

Analysis of Tweets for Prediction of Indian Stock Markets Analysis of Tweets for Prediction of Indian Stock Markets Phillip Tichaona Sumbureru Department of Computer Science and Engineering, JNTU College of Engineering Hyderabad, Kukatpally, Hyderabad-500 085,

More information

Vendor briefing Business Intelligence and Analytics Platforms Gartner 15 capabilities

Vendor briefing Business Intelligence and Analytics Platforms Gartner 15 capabilities Vendor briefing Business Intelligence and Analytics Platforms Gartner 15 capabilities April, 2013 gaddsoftware.com Table of content 1. Introduction... 3 2. Vendor briefings questions and answers... 3 2.1.

More information

PRIMETERMINAL. Your personalised financial desktop

PRIMETERMINAL. Your personalised financial desktop PRIMETERMINAL Your personalised financial desktop YOUR PERSONALISED FINANCIAL DESKTOP PRIMETERMINAL DECISION SUPPORT FOR INVESTMENT ADVISORS AND ASSET MANAGERS With the volume and complexity of investment

More information

MLg. Big Data and Its Implication to Research Methodologies and Funding. Cornelia Caragea TARDIS 2014. November 7, 2014. Machine Learning Group

MLg. Big Data and Its Implication to Research Methodologies and Funding. Cornelia Caragea TARDIS 2014. November 7, 2014. Machine Learning Group Big Data and Its Implication to Research Methodologies and Funding Cornelia Caragea TARDIS 2014 November 7, 2014 UNT Computer Science and Engineering Data Everywhere Lots of data is being collected and

More information

Initial Report. Predicting association football match outcomes using social media and existing knowledge.

Initial Report. Predicting association football match outcomes using social media and existing knowledge. Initial Report Predicting association football match outcomes using social media and existing knowledge. Student Number: C1148334 Author: Kiran Smith Supervisor: Dr. Steven Schockaert Module Title: One

More information

Big Data Analysis and the Advantages of Organizational Sustainability Modeling

Big Data Analysis and the Advantages of Organizational Sustainability Modeling The Big Data Analysis for Measuring Popularity in the Mobile Cloud Victor Chang School of Computing, Creative Technologies and Engineering, Leeds Metropolitan University, Headinley, Leeds LS6 3QR, U.K.

More information

European Parliament elections on Twitter

European Parliament elections on Twitter Analysis of Twitter feeds 6 June 2014 Outline The goal of our project is to investigate any possible relations between public support towards two Polish major political parties - Platforma Obywatelska

More information

On the Predictability of Stock Market Behavior using StockTwits Sentiment and Posting Volume

On the Predictability of Stock Market Behavior using StockTwits Sentiment and Posting Volume On the Predictability of Stock Market Behavior using StockTwits Sentiment and Posting Volume Abstract. In this study, we explored data from StockTwits, a microblogging platform exclusively dedicated to

More information

GRAPHICAL USER INTERFACE, ACCESS, SEARCH AND REPORTING

GRAPHICAL USER INTERFACE, ACCESS, SEARCH AND REPORTING MEDIA MONITORING AND ANALYSIS GRAPHICAL USER INTERFACE, ACCESS, SEARCH AND REPORTING Searchers Reporting Delivery (Player Selection) DATA PROCESSING AND CONTENT REPOSITORY ADMINISTRATION AND MANAGEMENT

More information

Applying Machine Learning to Stock Market Trading Bryce Taylor

Applying Machine Learning to Stock Market Trading Bryce Taylor Applying Machine Learning to Stock Market Trading Bryce Taylor Abstract: In an effort to emulate human investors who read publicly available materials in order to make decisions about their investments,

More information

How To Predict Stock Price With Mood Based Models

How To Predict Stock Price With Mood Based Models Twitter Mood Predicts the Stock Market Xiao-Jun Zeng School of Computer Science University of Manchester x.zeng@manchester.ac.uk Outline Introduction and Motivation Approach Framework Twitter mood model

More information

IT services for analyses of various data samples

IT services for analyses of various data samples IT services for analyses of various data samples Ján Paralič, František Babič, Martin Sarnovský, Peter Butka, Cecília Havrilová, Miroslava Muchová, Michal Puheim, Martin Mikula, Gabriel Tutoky Technical

More information

WEGOV ANALYSIS TOOLS TO CONNECT POLICY MAKERS WITH CITIZENS ONLINE

WEGOV ANALYSIS TOOLS TO CONNECT POLICY MAKERS WITH CITIZENS ONLINE WEGOV ANALYSIS TOOLS TO CONNECT POLICY MAKERS WITH CITIZENS ONLINE Timo Wandhöfer, GESIS Leibniz Institute for the Social Sciences, Knowledge Technologies for the Social Sciences, Unter Sachsenhausen 6-8,

More information

Concept Term Expansion Approach for Monitoring Reputation of Companies on Twitter

Concept Term Expansion Approach for Monitoring Reputation of Companies on Twitter Concept Term Expansion Approach for Monitoring Reputation of Companies on Twitter M. Atif Qureshi 1,2, Colm O Riordan 1, and Gabriella Pasi 2 1 Computational Intelligence Research Group, National University

More information

Research Article 2015. International Journal of Emerging Research in Management &Technology ISSN: 2278-9359 (Volume-4, Issue-4) Abstract-

Research Article 2015. International Journal of Emerging Research in Management &Technology ISSN: 2278-9359 (Volume-4, Issue-4) Abstract- International Journal of Emerging Research in Management &Technology Research Article April 2015 Enterprising Social Network Using Google Analytics- A Review Nethravathi B S, H Venugopal, M Siddappa Dept.

More information

VCU-TSA at Semeval-2016 Task 4: Sentiment Analysis in Twitter

VCU-TSA at Semeval-2016 Task 4: Sentiment Analysis in Twitter VCU-TSA at Semeval-2016 Task 4: Sentiment Analysis in Twitter Gerard Briones and Kasun Amarasinghe and Bridget T. McInnes, PhD. Department of Computer Science Virginia Commonwealth University Richmond,

More information

Optimization of Search Results with Duplicate Page Elimination using Usage Data A. K. Sharma 1, Neelam Duhan 2 1, 2

Optimization of Search Results with Duplicate Page Elimination using Usage Data A. K. Sharma 1, Neelam Duhan 2 1, 2 Optimization of Search Results with Duplicate Page Elimination using Usage Data A. K. Sharma 1, Neelam Duhan 2 1, 2 Department of Computer Engineering, YMCA University of Science & Technology, Faridabad,

More information

Text Analytics with Ambiverse. Text to Knowledge. www.ambiverse.com

Text Analytics with Ambiverse. Text to Knowledge. www.ambiverse.com Text Analytics with Ambiverse Text to Knowledge www.ambiverse.com Version 1.0, February 2016 WWW.AMBIVERSE.COM Contents 1 Ambiverse: Text to Knowledge............................... 5 1.1 Text is all Around

More information

Sentiment Analysis and Time Series with Twitter Introduction

Sentiment Analysis and Time Series with Twitter Introduction Sentiment Analysis and Time Series with Twitter Mike Thelwall, School of Technology, University of Wolverhampton, Wulfruna Street, Wolverhampton WV1 1LY, UK. E-mail: m.thelwall@wlv.ac.uk. Tel: +44 1902

More information

LiDDM: A Data Mining System for Linked Data

LiDDM: A Data Mining System for Linked Data LiDDM: A Data Mining System for Linked Data Venkata Narasimha Pavan Kappara Indian Institute of Information Technology Allahabad Allahabad, India kvnpavan@gmail.com Ryutaro Ichise National Institute of

More information

Towards a Sales Assistant using a Product Knowledge Graph

Towards a Sales Assistant using a Product Knowledge Graph Towards a Sales Assistant using a Product Knowledge Graph Haklae Kim, Jungyeon Yang, and Jeongsoon Lee Samsung Electronics Co., Ltd. Maetan dong 129, Samsung-ro, Yeongtong-gu, Suwon-si, Gyeonggi-do 443-742,

More information