The Use of Twitter Activity as a Stock Market Predictor


National College of Ireland
Higher Diploma in Science in Data Analytics 2013/2014
Robert Coyle

Table of Contents

ABSTRACT
DEFINITIONS, ACRONYMS, AND ABBREVIATIONS
INTRODUCTION
RELATED WORK
SYSTEMS AND DATASETS
  DESIGN AND ARCHITECTURE
    Brief description of work carried out
  DATASETS
    Gathering of Twitter Data
    Gathering of Stock Price Data
    Data Preparation
  REQUIREMENTS
    Data requirements
    User requirements
    Usability requirements
    Functional Requirements
TESTING AND EVALUATION
  SYSTEMS TESTING
    Apple Stock
    Microsoft Stock
    Tesla Stock
  FORMULA FOR PREDICTING STOCK MOVEMENT
    Formula Used
    Apple Stock Prediction
    Microsoft Stock Prediction
    Tesla Stock Prediction
CONCLUSION
FURTHER DEVELOPMENT
BIBLIOGRAPHY
APPENDIX
  Project Materials
PROJECT PROPOSAL
  INTRODUCTION
  BACKGROUND
  TECHNICAL APPROACH
  SPECIAL RESOURCES REQUIRED
  PROJECT PLAN
  TECHNICAL DETAILS
  SYSTEMS/DATASETS
  EVALUATION/TEST AND ANALYSIS
  CONSULTATION WITH SPECIALIZATION PERSONS
REQUIREMENTS SPECIFICATION
  DOCUMENT CONTROL
    REVISION HISTORY
    DISTRIBUTION LIST
    RELATED DOCUMENTS
  INTRODUCTION
    PURPOSE
    PROJECT SCOPE (In Scope, Out of Scope)
    DOCUMENT SCOPE
    DEFINITIONS, ACRONYMS, AND ABBREVIATIONS
  USER REQUIREMENTS DEFINITION
  USER CHARACTERISTICS
  REQUIREMENTS SPECIFICATION
  FUNCTIONAL REQUIREMENTS
    USE CASE DIAGRAM
    OVERALL FUNCTIONAL REQUIREMENTS
    REQUIREMENT 1: ACQUIRE DATA 1 AND (Description & Priority, Use Case, Scope, Description, Use Case Diagram, Flow Description)
    REQUIREMENT 2: CLEAN DATA 1 AND (Description & Priority, Use Case, Scope, Description, Use Case Diagram, Flow Description)
    REQUIREMENT 2: ANALYZE DATA (Description & Priority, Use Case, Scope, Description, Use Case Diagram, Flow Description)
    REQUIREMENT 2: PUBLISH DATA (Description & Priority, Use Case, Scope, Description, Use Case Diagram, Flow Description)
  NON-FUNCTIONAL REQUIREMENTS
    Availability: Must Have
    Storage Requirements: Must Have
    Connection Reliability: Must Have
    Connection Speed: Must Have
    Backup and Recovery: Must Have
    Program to clean data: Must Have
    Software Analysis tools: Must Have
    Communication Requirements: Must Have
    Security: Must Have
    Data Validation: Must Have
  INTERFACE REQUIREMENTS
    GUI
    An example of an analysis of tweets
    Examples of tweets analyzed on Microsoft Excel and Geo Flow
    Analysis of tweets using R language
    Example of Excel Data for intro to Regression
    Example of analysis completed on R Studio
ANALYSIS EVOLUTION
PROGRESS MANAGEMENT REPORT
  DOCUMENT LOCATION, REVISION HISTORY, APPROVALS, DISTRIBUTION
  PURPOSE OF DOCUMENT, DATE OF REPORT, PERIOD COVERED
  SCHEDULE STATUS (Updated Gantt chart)
  DEFINITIONS, ACRONYMS, AND ABBREVIATIONS
  PRODUCTS COMPLETED DURING THIS PERIOD
  PROBLEMS (ACTUAL, POTENTIAL)
  RAID LOG (Risks, Assumptions, Issues, Dependency)
  PRODUCTS DUE FOR COMPLETION
  PROJECT ISSUES STATUS
  CONCLUSION
PROGRESS MANAGEMENT REPORT
  DOCUMENT LOCATION, REVISION HISTORY, APPROVALS, DISTRIBUTION
  PURPOSE OF DOCUMENT, DATE OF REPORT, PERIOD COVERED
  SCHEDULE STATUS (Updated Gantt chart)
  DEFINITIONS, ACRONYMS, AND ABBREVIATIONS
  PRODUCTS COMPLETED DURING THIS PERIOD
  PROBLEMS (ACTUAL, POTENTIAL)
  RAID LOG (Risks, Assumptions, Issues, Dependency)
  PRODUCTS DUE FOR COMPLETION
  CONCLUSION
PROGRESS MANAGEMENT REPORT
  DOCUMENT LOCATION, REVISION HISTORY, APPROVALS, DISTRIBUTION
  PURPOSE OF DOCUMENT, DATE OF REPORT, PERIOD COVERED
  SCHEDULE STATUS (Updated Gantt chart)
  DEFINITIONS, ACRONYMS, AND ABBREVIATIONS
  PRODUCTS COMPLETED DURING THIS PERIOD
  PROBLEMS (ACTUAL, POTENTIAL)
  RAID LOG (Risks, Assumptions, Issues, Dependency)
  PRODUCTS DUE FOR COMPLETION
  CONCLUSION
REFERENCES

Abstract

This thesis investigates the possibility of predicting stock market movement using Twitter activity. The analysis uses data mining applications, data analysis techniques, and correlation and regression modelling. Twitter feeds were mined using the Twitter API and Java code to search for and download tweets containing the words Apple, Microsoft and Tesla. These files were then processed using Amazon Web Services and TextWrangler. The analysis was carried out in RStudio and Microsoft Excel. Correlation and regression models were built, and the Granger causality test was run, in RStudio. Visualisations produced in Microsoft Excel and RStudio showed some trends in the data. A formula for stock market prediction for commercial use was created. Because the data set gathered from Twitter was not large enough, and the tweets were not specifically about the companies' stock, noisy data may have corrupted the analysis. No sentiment analysis was carried out on the tweets.

Definitions, Acronyms, and Abbreviations

Term: Definition
API: Application programming interface.
AWS: Amazon Web Services.
Causative: A form that indicates that a subject causes something else to do something, or causes a change in state of a non-volitional event.
GPOMS: Google Profile of Mood States, an algorithm that classifies public sentiment into six categories (Calm, Alert, Sure, Vital, Kind and Happy).
Granger causality test: A statistical hypothesis test for determining whether one time series is useful in predicting another.
NASDAQ: National Association of Securities Dealers Automated Quotations.
Noisy data: Meaningless data.
POMS: Profile of Mood States.
Sentiment analysis: The use of natural language processing, text analysis and computational linguistics to identify and extract subjective information from source materials.
TextWrangler: Text editor for Mac OS X.
Tweet: A message posted on the Twitter website.

Introduction

The stock market is an essential way for companies to raise money. By being publicly traded and selling shares of ownership, companies can raise additional financial capital to expand their business. Historically, share prices are known to have a major influence on economic activity and can be an indicator of social mood. Stock market movement has always been a rich and interesting subject, with so many factors to analyse that for a long time it was considered unpredictable. Over the past few decades, new computerised mathematical methods developed by companies such as Merrill Lynch and other financial management firms have produced models that aim to maximise returns while minimising risk. Stock market prediction has been around for years, but the rise of social media has given it a new method of prediction.

The objective of this project is to analyse Twitter feeds for activity and trends associated with a brand, and to see how the brand's share price is related to, and affected by, that Twitter activity. The analysis looks at the number of tweets for three specific brands on the NASDAQ: Apple, Microsoft and Tesla. As an additional exploration of stock conversation on Twitter, the returned tweets were also searched for each company's NASDAQ symbol. These brands were chosen because they are innovative technology companies listed on the same stock exchange, so the gathering of the Twitter data was not time zone dependent.

Stock market data was collected from the Yahoo Finance website, which provides historical data for the NASDAQ. Java code was used to acquire the tweets through Twitter's API service. The tweets for each brand were then counted using Amazon Web Services and TextWrangler. The counted tweets were subsequently analysed in RStudio, where correlation and regression models were built and the Granger causality test was performed. The data was then visualised in Excel and RStudio, and the creation of a formula for commercial use was attempted.

Related Work

In the previous study, "Stock Market Prediction Using Twitter", I reviewed papers on sentiment analysis of social media for predicting stock market movement. The social medium in question was Twitter. The investigation looked at the correlation between public mood and stock market movement, and at how it can be used to predict stock prices. Sentiment analysis was used to translate the tweets into moods using algorithms such as Google Profile of Mood States. Applying sentiment analysis to the tweets proved to be an accurate way of analysing the data. Analysing Twitter activity alone does not sufficiently capture investors' behavioural attitudes, so an accurate prediction of stock movement cannot be ascertained from it. Sentiment analysis gives the investigation an insight into public attitude. The more detailed the sentiment analysis of the Twitter data, and the more reliable the stock data, the more accurate the results. Twitter activity alone might not give a stockbroker the insight needed to make challenging decisions about buying or selling shares.

Systems and Datasets

Design and Architecture

Brief description of work carried out
The system was designed to acquire Twitter and stock market data and to compare the two data sets for a relationship. For the Twitter data, Java code, an AWS script and TextWrangler were used to clean the data. The financial data was acquired from the Yahoo Finance website. It was downloaded in Excel format and then saved as a CSV file. The results from the cleaned Twitter data were then placed alongside the cleaned financial data in Excel. The Granger causality test was run in RStudio to find out whether the Twitter time series was useful for forecasting the stock price time series. A correlation model was built to confirm the relation between the two data types. Excel was then used to visualise and confirm the relation.

Datasets
There were two forms of dataset. The first dataset acquired was the Twitter feeds. Obtaining historical tweets proved difficult, since Twitter had sold its information on to external parties. These companies, such as DataSift, offer analysis of historical data. While this would have benefited the original project proposal, the budget of the project was zero. Twitter launched a Historical Data Grant scheme, which allowed academic students to send in a proposal to gain access to Twitter's historical data.

A proposal on behalf of this project was sent to the Data Grant scheme, but Twitter's reply arrived far too late in the project. Live tweets were therefore collected instead, and the historical stock market data for the same dates was subsequently gathered from Yahoo Finance.

Gathering of Twitter Data
The Java code was acquired with the approval of Dr. Brian Mac Namee, a Principal Investigator with CeADAR and a lecturer in the School of Computing at the Dublin Institute of Technology. The code was used in conjunction with the Twitter API. To use the Twitter API, a user must first sign up for a developer account and create an application; the user can then obtain the API keys needed to run the code. The code was run on my behalf at a friend's home, since my own Internet connection was not suitable and a disconnection would have produced an unreliable time series.

Figure 1.1: Example of the application used in Twitter. (Dev.twitter.com, 2014)

Figure 1.2: Example of the Java code used for downloading the Twitter feeds.
Figure 1.3: Demonstrates where the unique keys were entered into the Java code.
Figure 1.4: Demonstrates where the keywords were entered into the Java code.

Java code issues
Because the Java code returned tweets so frequently, and to guard against a system crash, the data was saved into text files daily. The data sets retrieved from Twitter ranged from 60 to 100 megabytes, with over 400,000 lines of tweets per day. Five sets of text files were obtained, representing Monday to Friday, the days on which the NASDAQ is open.

Figure 1.5: Example of the acquired Twitter feeds from the Java code in a text file.

On one of the days the code stopped running, leaving a gap with no tweets from 3am until 8am. Because of this, only tweets published during NASDAQ trading hours were used. NASDAQ trading hours are 09:30 to 16:00 Eastern Time, Monday to Friday. In Irish local time that is 14:30 to 21:00.

Counting the Tweets
Next, the tweets had to be counted. To do this I initially proposed using Amazon Web Services because of the size of the data sets. A word count script from the AWS website was used to count all the specific words in each tweet.
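The trading-hours restriction described above can be sketched in Python. This is only an illustration, not the procedure used in the project: it assumes each tweet carries a UTC timestamp, and it uses a fixed UTC-4 offset for US Eastern Daylight Time, which applies in April. The function name and the sample timestamps are hypothetical.

```python
from datetime import datetime, time, timedelta, timezone

# Assumption: US Eastern Daylight Time (UTC-4), the offset in force during April.
EDT = timezone(timedelta(hours=-4), "EDT")
MARKET_OPEN = time(9, 30)
MARKET_CLOSE = time(16, 0)


def in_trading_hours(tweet_time_utc: datetime) -> bool:
    """Return True if a UTC timestamp falls on a weekday between 09:30 and 16:00 EDT."""
    local = tweet_time_utc.astimezone(EDT)
    is_weekday = local.weekday() < 5  # Monday is 0, Friday is 4
    return is_weekday and MARKET_OPEN <= local.time() < MARKET_CLOSE


# Hypothetical timestamps, not taken from the collected tweets.
during_session = datetime(2014, 4, 7, 14, 0, tzinfo=timezone.utc)  # 10:00 EDT, Monday
before_open = datetime(2014, 4, 7, 8, 0, tzinfo=timezone.utc)      # 04:00 EDT, Monday
weekend = datetime(2014, 4, 12, 15, 0, tzinfo=timezone.utc)        # Saturday

print(in_trading_hours(during_session), in_trading_hours(before_open), in_trading_hours(weekend))
```

A fixed offset keeps the sketch free of time zone database dependencies; a version covering other months would need proper daylight-saving handling.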

Figure 1.6: Example of the acquired Python script file from the AWS website. (Aws.amazon.com, 2014)

A folder named project was created in the S3 bucket. All necessary files, such as the Python scripts and the tweet files, were uploaded there. An Elastic MapReduce cluster was then created.

Figure 1.7: Example of a successful cluster from the AWS website. (Aws.amazon.com, 2014)

Figure 1.8: Example of a text file returned from AWS.

Word counting issues
The drawback of this script is that it counted every occurrence of a word, so a tweet that mentioned a keyword twice was counted twice, which made the results inaccurate.
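One way to avoid the double counting is to count each tweet at most once per keyword. The sketch below is a minimal illustration under stated assumptions: each tweet is a single string (for example one line of a daily text file), matching is case-insensitive on whole words, and the function name and sample tweets are hypothetical rather than taken from the project.

```python
import re
from collections import Counter

KEYWORDS = ("apple", "microsoft", "tesla")


def count_tweets_per_keyword(tweets, keywords=KEYWORDS):
    """Count how many tweets mention each keyword, at most once per tweet."""
    patterns = {
        keyword: re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
        for keyword in keywords
    }
    counts = Counter({keyword: 0 for keyword in keywords})
    for tweet in tweets:
        for keyword, pattern in patterns.items():
            # search() stops at the first match, so repeated mentions
            # inside one tweet still add only 1 to the count.
            if pattern.search(tweet):
                counts[keyword] += 1
    return counts


# Made-up sample tweets for illustration only.
sample = [
    "Apple announces a new Apple store",
    "Tesla and Microsoft in the news",
    "nothing relevant here",
]
result = count_tweets_per_keyword(sample)
print(dict(result))
```

The same function could be called with the NASDAQ symbols as keywords to obtain the symbol counts.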

Figure 1.9: Example of a tweet with Apple mentioned twice in TextWrangler. (Mac App Store, 2014)

What was needed was a way to count the number of tweets that mentioned a keyword. A tweet could contain all three keywords (Apple, Microsoft and Tesla) or any one of them on its own. TextWrangler was used to search the individual text files for the frequency of tweets containing each keyword separately, but it had the same problem of counting the number of times the word occurred.

Figure 1.10: Example of tweets from Monday with Tesla mentioned, 3866 occurrences in TextWrangler. (Mac App Store, 2014)

For this reason there will be some inaccuracy in my analysis results, because tweets that mention a keyword twice contribute extra counts.

Table columns: Date, Apple, AAPL, Microsoft, MSFT, Tesla, TSLA, with one row per day from 07/04 to 11/04.
Figure 1.11: Displays the keywords and their occurrences per day.

The original keywords were Apple, Microsoft and Tesla. I decided to also search for each company's NASDAQ symbol. In previous research into Twitter mining and stock prediction, researchers searched for the company symbols, as this would return a more accurate tweet count where people were tweeting about the actual stock of the company.

Gathering of Stock Price Data
Once the Twitter feeds had been gathered, the financial data could be downloaded. The historical stock prices had to cover the same dates as the Twitter feeds. The data was downloaded in Excel format and then saved as a CSV file for analysis in R. Historical stock prices can only be obtained from Yahoo Finance at daily granularity; anything finer would have to be streamed directly from the NASDAQ website, which I did not have access to. Ideally, hourly stock prices would have been used so that the time series could be matched more closely with the Twitter feeds.

Data sets of stock prices were collected from the Yahoo Finance website for all three companies. Each set had seven columns: Date, Open, High, Low, Close, Volume and Adjusted Close.
Date is the day of trading.
Open is the opening price of the stock at the start of the day's trading.
High is the highest price of the stock on that day.
Low is the lowest price of the stock on that day.
Close is the closing price of the stock at the end of the day's trading.
Volume is the number of shares traded that day.
Adjusted Close is the closing price adjusted for corporate actions such as stock splits and dividends.

Figure 1.6: Demonstrates the acquired historical Apple stock prices for the month of April 2014 from the Yahoo Finance website. (Finance.yahoo.com, 2014)

The closing price is the value on which this analysis focused.

Data Preparation
The results from the cleaned Twitter data were placed alongside the cleaned financial data in Excel.

Table columns: Date, Open, High, Low, Close, Volume, Adj Close, Apple, AAPL.
Figure 4.2: Displays the keywords and their occurrences per day with the stock prices for Apple.

This was repeated for all three companies.
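The data preparation step, which was done by hand in Excel, amounts to joining the two data sets on the date. A minimal Python sketch of that join is given below. All values are made up for illustration and are not the project data; the column layout follows the seven Yahoo Finance columns listed earlier, and the function name is hypothetical.

```python
import csv
import io

# Made-up rows in the seven-column layout described above; not the thesis data.
STOCK_CSV = """Date,Open,High,Low,Close,Volume,Adj Close
2014-04-07,100.0,102.0,99.0,101.0,1000,101.0
2014-04-08,101.0,103.0,100.0,102.5,1200,102.5
2014-04-09,102.5,104.0,101.5,103.0,900,103.0
"""

# Made-up daily tweet counts keyed by date (keyword count and symbol count).
TWEET_COUNTS = {
    "2014-04-07": {"Apple": 500, "AAPL": 20},
    "2014-04-08": {"Apple": 650, "AAPL": 35},
}


def merge_by_date(stock_csv_text, tweet_counts):
    """Attach the tweet counts to each stock row that has a matching date."""
    merged = []
    for row in csv.DictReader(io.StringIO(stock_csv_text)):
        counts = tweet_counts.get(row["Date"])
        if counts is None:
            continue  # no tweets were collected for this trading day
        merged.append({**row, **counts})
    return merged


rows = merge_by_date(STOCK_CSV, TWEET_COUNTS)
for row in rows:
    print(row["Date"], row["Close"], row["Apple"], row["AAPL"])
```

Days without tweet data are dropped rather than filled in, so gaps in collection do not introduce invented counts.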

Requirements

The requirements have remained mostly the same as in the original Requirements Specification, except for the use of live data rather than historical Twitter data. Using historical Twitter data proved impracticable because it had to be purchased and the project had no budget.

Data requirements
DR# | Category | Description | MoSCoW | Status
DR1 | Use of Information | The information produced must be of use to the user. | S | M
DR2 | Availability | Information generated must not be previously available to the user. | S | L
DR3 | Access | The user must have access to this information. | M | H

User requirements
UR# | Category | Description | MoSCoW | Status
UR1 | Analysis outcome | The analysis will provide Apple, Microsoft and Tesla with a better insight into the effectiveness of their advertising campaign strategy from data acquired from the Twitter feeds and the stock market. | S | M
UR2 | User outcome | This information must be of assistance to these companies. | M | M

Usability requirements

Functional Requirements
FR# | Category | Description | MoSCoW | Status
FR1 | Acquire Data 1 | The project will gather and store all necessary data from live Twitter feeds using Java code in conjunction with the Twitter API. | M | H
FR2 | Acquire Data 2 | The project will gather and store all necessary historical stock market data for the brand from the Yahoo Finance website, corresponding to the dates of the acquired Twitter data. | M | H
FR3 | Clean Data 1 | The correct programs will be acquired and used to clean and retrieve Twitter data relating to keywords and hashtags of the brand on certain dates. | M | H
FR4 | Clean Data 2 | The correct programs will be acquired and used to clean and retrieve historical stock market share prices for the brand over the same time series as the Twitter feed data. | M | H
FR5 | Analyse 1 | The cleaned Twitter data is then analysed and compared. | M | H
FR6 | Analyse 2 | The cleaned stock market data is then analysed and compared. | M | H
FR7 | Publish Data | The analysis will then be published and made available to the customer. | M | H

Testing and Evaluation

Systems Testing

Correlation
The correlation coefficient measures the linear relationship between two variables. It is also known as the Pearson Product-Moment Correlation Coefficient. Correlation values range from +1, for a very strong positive relationship, to -1, for a strong negative relationship.

Regression
Regression is used to estimate or predict the relationship between one quantitative variable and another quantitative variable.

Granger Causality
Granger causality is a statistical hypothesis test for determining whether one time series is useful in predicting another.

Steps in testing stage
1. Check for correlation in RStudio.
2. Compose a regression model.
3. Use the Granger causality test to check whether one time series is useful for forecasting another.
4. Change the time series to adjust for lag.
5. Use Excel and RStudio to visualise and confirm any relation.

Data sets
The data sets used are the counts from the keyword searches (Apple, Microsoft and Tesla) from the AWS returns. The counts of the NASDAQ symbols for each company (AAPL, MSFT and TSLA) within those initial counts are also used as an additional investigation.

Apple Stock

1. Check for correlation
Figure 4.3: Displays the file AprilAAPL imported into RStudio.
First the data is imported into RStudio.
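For readers without R, the Pearson coefficient defined above can be computed directly from its definition. The following Python sketch is illustrative only; the function name and the sample numbers are hypothetical and are not the project's tweet counts or prices.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Made-up values: a perfectly increasing and a perfectly decreasing relationship.
rising = pearson_r([1, 2, 3, 4], [10, 20, 30, 40])
falling = pearson_r([1, 2, 3, 4], [40, 30, 20, 10])
print(round(rising, 6), round(falling, 6))
```

The two sample calls sit at the ends of the scale described above, one at +1 and one at -1.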

Figure 4.4: Displays the correlation output in R.
The correlation output shows a moderate relation between Close and the counts of the keyword Apple.

2. Regression Model
Figure 4.5: Displays the regression model output in R. lm(formula = Apple ~ Close, data = AprilAAPL)
Does the Apple tweet count have an effect on the Close price? The Multiple R-squared shows that the regression model returned a poor result, with only 4.8% of the variation explained. The process was repeated for the AAPL count.
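The regression step can be illustrated with a small least-squares fit in Python. This is a sketch with made-up numbers, not a reproduction of the R output. One point worth noting: the R formula as written, Apple ~ Close, treats the Apple count as the response and Close as the predictor, whereas the question asked is whether the count affects Close. For a regression with a single predictor the Multiple R-squared is the same in either direction, which the sketch checks.

```python
def fit_line(xs, ys):
    """Least-squares fit of ys = intercept + slope * xs.

    Returns (intercept, slope, r_squared).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R-squared = 1 - (residual sum of squares / total sum of squares).
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return intercept, slope, 1 - ss_res / ss_tot


# Made-up daily tweet counts and closing prices, for illustration only.
counts = [510, 640, 580, 700, 620]
closes = [101.0, 102.5, 101.8, 103.9, 102.2]

_, _, r2_close_on_count = fit_line(counts, closes)  # Close as the response
_, _, r2_count_on_close = fit_line(closes, counts)  # count as the response
print(round(r2_close_on_count, 4), round(r2_count_on_close, 4))
```

The slope and intercept do differ between the two directions, so the direction still matters when the fitted line is used for prediction.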

Figure 4.6: Displays the regression model output in R. lm(formula = AAPL ~ Close, data = AprilAAPL)
Does the AAPL tweet count have an effect on the Close price? The regression model returned a similarly poor result, with only 0.07% of the variation explained.

3. Granger Causality Test
Close is the dependent variable and Apple is the independent variable. Is Apple the cause of the effect on Close? Does Apple Granger-cause Close?

Figure 4.7: Displays the Granger Causality Test output in R for Closing price and Apple word count.
From the result above, the P value after a one-day lag is more than the significance level of 5%. Therefore the null hypothesis cannot be rejected, meaning the Apple word count does not predict the closing price one day later.

Figure 4.8: Displays the Granger Causality Test output in R for Closing price and AAPL word count.
A similar test was performed using the keyword AAPL as the independent variable and Close as the dependent variable. The results were slightly better, but AAPL still did not Granger-cause Close: the P value of 24% is greater than 5%. Since the data set was small, a lag of two days could not be tested.

Figure 4.9: Displays the unsuccessful Granger Causality Test outputs.
The above image shows the unsuccessful outputs of the Granger Causality test using more than one day's lag. The error occurs because the data set was too small.
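Why a two-day lag fails on so small a data set can be seen by counting degrees of freedom. The Python sketch below assumes five daily observations (Monday to Friday) and the usual two-variable form of the test, in which the unrestricted regression has an intercept plus `lag` lagged values of each series. These are assumptions for illustration; the exact checks performed by the R function used in the project may differ.

```python
def granger_residual_df(n_obs: int, lag: int) -> int:
    """Residual degrees of freedom of the unrestricted Granger regression.

    Assumed model: y_t = a + sum(b_i * y_{t-i}) + sum(c_i * x_{t-i}), i = 1..lag.
    """
    usable_rows = n_obs - lag  # the first `lag` days have no lagged values
    n_params = 1 + 2 * lag     # intercept + lags of y + lags of x
    return usable_rows - n_params


# Assumption: five trading days of data, as in one week of collection.
for lag in (1, 2):
    print(lag, granger_residual_df(5, lag))
```

With one lag a single residual degree of freedom is left, which is the bare minimum for an F test; with two lags there are more parameters than usable rows, so the model cannot be estimated.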

4. Visualisation
Figure: The relationship between the Apple count and Close price.
From the above graph it is possible to see the positive relationship that the keyword Apple has with the Close price of Apple stock. As the Apple count rises, there is a rise in the closing stock price.

Figure: The relationship between the AAPL count and Close price.

From the above graph it is possible to see the negative relationship that the keyword AAPL has with the Close price of Apple stock. As the AAPL count rises, there is a decline in the closing stock price. This supports the negative results from the correlation and regression models. AAPL was not a keyword in the Java code; it was a search within the tweets returned for the keyword Apple.

Figure: The relationship between the Apple count and Close price (line chart "Apple count and Close Price", series Close and Apple).
As the above chart shows, the Close price line follows a similar trend to the Apple count line, about a day later.

Figure: The relationship between the AAPL count and Close price (line chart "AAPL count and Close Price", series Close and AAPL).
Unfortunately, the above chart shows that the Close price did not follow the AAPL count; instead, the AAPL word count follows the Close price. This is probably why the correlation between the two was so low. It also suggests that the investor community that uses the keyword AAPL (the Apple stock symbol) is discussing the rise in Apple stock after it happens.

Microsoft Stock
The process was started again, this time using the Microsoft data set.

1. Check for correlation
Figure: The correlation between the Microsoft and MSFT word counts and Close price.
The correlation this time is much better, with both keywords returning a moderate correlation with Close price.

2. Regression Model
Figure: The regression model with Microsoft word count as the independent variable.
Figure: The regression model with MSFT word count as the independent variable.
The two figures show the regression outputs from R with Close stock price as the dependent variable. The first displays a Multiple R-squared value of 0.96% explaining Close price. The second displays a Multiple R-squared value of 12.6% explaining Close price.

The normality plot
If the residuals fall on a straight line, the normality condition is met.
Figure: Normality plot of Microsoft and Close price. The normality condition is met.
Figure: Normality plot of MSFT and Close price. The normality condition is met.
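The points behind a normality plot can be generated by pairing the sorted residuals with the quantiles expected from a standard normal distribution. The Python sketch below is an illustration under assumptions: the plotting position (i + 0.5) / n is one common convention and may not match the one used by the plotting function in R, and the residual values are made up.

```python
from statistics import NormalDist


def normal_qq_points(residuals):
    """Pair each sorted residual with its theoretical standard-normal quantile."""
    n = len(residuals)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Plotting positions (i + 0.5) / n lie strictly between 0 and 1.
    theoretical = [std_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, sorted(residuals)))


# Made-up residuals for illustration only.
points = normal_qq_points([0.3, -1.2, 0.8, -0.1, 0.2])
for expected, observed in points:
    print(round(expected, 3), observed)
```

If the printed pairs fall roughly on a straight line when plotted, the residuals are consistent with a normal distribution; with only a handful of points, as here, the check is weak.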

3. Granger Causality Test
Figure: The Granger Causality Test output.
Again, the Granger causality test could not be run with a lag greater than one day. Both tests returned P values greater than the significance level of 5%.

4. Visualisation
Figure: The relationship between the Microsoft count and Close price.

Figure: The relationship between the MSFT count and Close price.

Figure: The relationship between the Microsoft count and Close price on a line chart (chart "Microsoft and Close Price", 4/7/14 to 4/11/14, series Close and Microsoft).
As the above chart shows, the Close price line follows a similar trend to the Microsoft count line, about a day later.

Figure: The relationship between the MSFT count and Close price on a line chart (chart "MSFT and Close Price", 4/7/14 to 4/11/14, series Close and MSFT).

Previous results with a one-day lag
Figure: The relationship between the Microsoft count and Close price on a line chart with a one-day lag (chart "Microsoft and Close Price with 1 day lag", 4/8/14 to 4/11/14, series Close and Microsoft).

Figure: The relationship between the MSFT count and Close price on a line chart with a one-day lag (chart "MSFT and Close Price with 1 day lag", 4/8/14 to 4/11/14, series Close and MSFT).

The decision was made to perform a manual lag in Excel by moving the dates of the Microsoft counts forward by one day, to see whether the lines in the chart would match up. With this lag, each day's tweet count about Microsoft is lined up against the following day's closing price. The two graphs show that, visually, there is a relationship between the word counts and the Close stock price. Correlation and regression models were built again using the lagged data.

1. Correlation
Figure: The correlation between the Microsoft and MSFT word counts and Close price with a lag of one day.
The correlation output showed a strong correlation for both word counts, so a regression model was produced.
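The manual lag performed in Excel can be expressed as a simple shift that pairs each day's tweet count with the closing price one day later. The Python sketch below uses made-up numbers and a hypothetical function name; it only illustrates the alignment, not the project's figures.

```python
def lag_pairs(counts, closes, lag=1):
    """Pair the tweet count of day t with the closing price of day t + lag."""
    if len(counts) != len(closes):
        raise ValueError("both series must cover the same days")
    if lag < 0 or lag >= len(counts):
        raise ValueError("lag must be between 0 and len(series) - 1")
    if lag == 0:
        return list(zip(counts, closes))
    # Drop the last `lag` counts and the first `lag` closes so the rest line up.
    return list(zip(counts[:-lag], closes[lag:]))


# Made-up values for five trading days.
counts = [510, 640, 580, 700, 620]
closes = [39.8, 39.9, 40.5, 39.4, 39.2]

pairs = lag_pairs(counts, closes, lag=1)
print(pairs)
```

Each lag removes one pair, so five days of data leave only four points for the lagged correlation and regression, which is worth bearing in mind when reading very high correlation values.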

2. Regression Model
Figure: The regression model with Microsoft word count as the independent variable, using data with a one-day lag.

Figure: The regression model with MSFT word count as the independent variable, using data with a one-day lag.
The two regression models returned a high Multiple R-squared value of 98%, explaining Close price. The high correlation and regression results indicate a relation between the tweet counts and the closing stock price. The results were very high, most likely because of the very small data set that was used.

Tesla Stock
The process was started again, this time using the Tesla data set. Correlation and regression were performed, with similar results to the previous data sets.

Figure: The correlation between the Tesla and TSLA word counts and Close price.
Figure: The correlation between the Tesla and TSLA word counts and Close price with a one-day lag.
The keyword Tesla showed a strong correlation with the Tesla closing stock price in the lagged data set. TSLA still displayed a moderate correlation.

Figure: The regression model with Tesla word count as the independent variable, using data with a one-day lag.
Again, the regression with the lagged data set showed a huge improvement over the non-lagged Tesla data.

Figure: The relationship between the Tesla word count and Close price on a line chart (chart "Tesla Count and Close Price", 4/7/14 to 4/11/14, series Close and Tesla).

Figure: The relationship between the Tesla word count and Close price on a line chart with a one-day lag (chart "Tesla Count and Close Price with one day lag", 4/8/14 to 4/11/14, series Close and Tesla).
The two line charts demonstrate the difference between the non-lagged and the lagged data sets. The lagged chart shows that the one-day lag does make a difference to the results: it demonstrates the close relationship the Tesla count has with the Close price.

Formula For Predicting Stock Movement

The creation of a formula for commercial use was attempted. The small data set had an impact on this work, since a lag of between two and three days was desired. Previous research, "Stock Market Prediction Using Twitter", found that tweets would predict stock movement two to three days after the message was tweeted. Knowing a company's tweet volumes for two consecutive days, the percentage movement in tweets between those two days should in turn allow us to predict the movement in the company's share price within a two- or three-day lag.

Formula Used
The percentage difference between two numbers:

\[ \text{Percentage difference} = \frac{|V_1 - V_2|}{(V_1 + V_2)/2} \times 100 \]

V1 = total company tweets on day one.
V2 = total company tweets on day two.
The formula was used to find the percentage difference in the stock movement and in the tweet movement.

Apple Stock Prediction
To save time, the focus is only on the keyword count of Apple.

Calculate the percentage difference of Apple Tweets And Closing Price
Day | Difference in Apple Stock % | Difference in Tweet Activity %
Day One | 0.005% | 1.96%
Day Two | 1.31% | 27.91%
Day Three | 1.29% | 44.28%
Day Four | 0.73% | 39.09%
Figure: The difference in Stock Close price and Tweet activity between days.
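The percentage difference formula is straightforward to express in code. The Python sketch below assumes the absolute difference is intended in the numerator, which is the usual form of the symmetric percentage difference; the function name and the sample values are hypothetical and are not the project's tweet totals.

```python
def percentage_difference(v1: float, v2: float) -> float:
    """Symmetric percentage difference: |v1 - v2| divided by the mean of v1 and v2."""
    mean = (v1 + v2) / 2
    if mean == 0:
        raise ValueError("percentage difference is undefined when the mean is zero")
    return abs(v1 - v2) / mean * 100


# Made-up tweet totals for two consecutive days.
day_one, day_two = 100, 120
print(round(percentage_difference(day_one, day_two), 2))
```

Because of the absolute value, the result does not show whether activity rose or fell, so the direction of the movement has to be tracked separately when the figure is used for a projection.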

If the movements were not identical in percentage increase or decrease, the formula would need to be adjusted. The movement in tweet activity was not proportionate (pro rata) to the movement in stock price.

Figure: The formula for predicting the third day using Close stock values.

Example of the formula process
1. Subtract the tweets of Day 1 from the tweets of Day 2. The tweet volume shows an increase of 1228 tweets, which is then expressed as a percentage increase.
2. Multiply the Apple closing price of Day 1 by this percentage. This projects an increase of $10.29.
3. Add this to the Day 1 share price, giving a projected closing price of $533.7.
4. Compare the projected price with the actual closing price of Day 3. The difference between the projected and the actual price is $3.38, which represents a variance of 0.639%.
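The projection steps above can be sketched as two small Python functions. All numbers here are made up for illustration and are not the Apple figures; the function names are hypothetical, and measuring the variance relative to the actual price is an assumption, since the text does not state the denominator.

```python
def project_close(day1_close: float, tweet_pct_change: float) -> float:
    """Project a later closing price by applying the tweet percentage change 1:1."""
    return day1_close * (1 + tweet_pct_change / 100)


def variance_pct(projected: float, actual: float) -> float:
    """Absolute gap between projected and actual price, as a percentage of actual."""
    return abs(projected - actual) / actual * 100


# Made-up example: a $200.00 close on Day 1 and a 2% rise in tweet volume.
projected = project_close(200.0, 2.0)
error = variance_pct(projected, actual=202.0)
print(round(projected, 2), round(error, 3))
```

A positive percentage projects a rise and a negative one a fall, so the sign of the tweet movement has to be supplied by the caller.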

The formula used here is a straight line (1:1 ratio). The Apple share price increases at the same rate as the Twitter feeds, within an error level of just 0.639%.

Figure demonstrates the formula for predicting the fourth day using Close stock values.

The process was repeated, this time using values to predict the fourth day. Unfortunately an error of 27.9% was returned.

Figure demonstrates the formula for predicting the fifth day using Close stock values.

The process was repeated, this time using values to predict the fifth day. Unfortunately an error of 47.25% was returned. The formula did not apply to the days after the third.

Calculate the percentage difference of Apple Tweets And Low Price

Figure 4.4.1 demonstrates the formula for predicting the third, fourth and fifth day using Low stock values.

The formula was also applied to the Low stock price to see if there was a relation. The formula fitted best when predicting the third day, with an error of 1.89%.

Calculate the percentage difference of Apple Tweets And Volume

The use of Volume in the formula was also measured.

Figure demonstrates the formula for predicting the third day using the Volume values.

However, this too had a high error rate of 30.23%.

Microsoft Stock Prediction

Calculate the percentage difference of Microsoft Tweets And Closing Price

Day          Difference in Stock    Difference in Tweet Activity
Day One      0.05%                  31.60%
Day Two      1.63%                  49.74%
Day Three    2.74%                  18.94%
Day Four     0.38%                  7.04%

Figure demonstrates difference in Stock Close price and Tweet activity between days.

Projecting closing stock price Day 3

Figure demonstrates the formula for predicting the third, fourth and fifth day using the Close stock values.

The formula returned a high variance for all projected days. This indicates that the formula does not apply to any of these days using the Close stock price.

Calculate the percentage difference of Microsoft Tweets And Low Price

The formula was also applied to the Low stock price to see if there was a relation.

Tweets day1 - day2
Low stock of day 1 * difference of tweets day1 and day2
Stock low price day 1 + low stock of day 1 * difference of tweets day1 and day2
Low price of Day 3 - projected low price day 3
Difference between projected low day 3 and actual day 3 as a variance %

Figure demonstrates the formula for predicting the third day using the Low stock values.

Again, the formula did not apply to the Low stock price.

Calculate the percentage difference of Microsoft Tweets And Volume

Figure demonstrates the formula for predicting the third day using the Volume values.

The Volume data was placed into the formula, but the result shown above has a high error rate of 44.5%.

Tesla Stock Prediction

Calculate the percentage difference of Tesla Tweets And Closing Price

Figure demonstrates difference in Stock Close price and Tweet activity between days (columns: Difference in Stock, Difference in Tweet Activity; rows: Day One to Day Four).

Figure demonstrates the formula for predicting the third, fourth and fifth day using the Close stock values.

The formula had high percentage errors, except for the prediction for the fifth day, which had an error of 2.33%.

Tweets day1 - day2
Low stock of day 1 * difference of tweets day1 and day2
Stock low price day 1 + low stock of day 1 * difference of tweets day1 and day2
Low price of Day 3 - projected low price day 3
Difference between projected low day 3 and actual day 3 as a variance

Figure demonstrates the formula for predicting the third day using the Low stock values.

Tweets day1 - day2: -734
Volume day 1 * difference of tweets day1 and day2
Volume day 1 + Volume day 1 * difference of tweets day1 and day2
Volume Day 3 - projected Volume day 3
Difference between projected Volume day 3 and actual day 3 as a variance

Figure demonstrates the formula for predicting the third day using the Volume values.

When the Low stock and Volume values were placed into the formula, they also displayed high errors. Low stock had an error of over 19% and the Volume values had an error of over 10%.

Conclusion

This analysis investigated the relation between Twitter activity and the share prices of three companies on the NASDAQ over a period of one week. A Java script and Twitter's API collected the tweets that mentioned the keywords Apple, Microsoft and Tesla. Once the tweets were collected, a Python file was used in conjunction with Amazon Web Services to count the frequency of words. AWS was used because of the size of the tweet files, which were in text format and ranged from 60 to 130 megabytes. TextWrangler was also used to count the frequency of tweets containing the keywords. Since one of the data sets was missing over five hours of data due to a program failure, it was decided to use only tweets posted during NASDAQ trading hours. Stock data for the three companies was acquired from the Yahoo Finance website. Similarly, a count of the number of times the NASDAQ symbol of each company appeared was conducted and used as an additional analysis. The symbols gave the opportunity to investigate the occurrence of conversations directed at the actual company stock on the NASDAQ.

Analysis was performed in R Studio, first using a correlation model to see how strong a relation the tweet data had with the stock data of each company. A linear regression algorithm was then used to see the effect that the Twitter data had on the stock data. A Granger causality test was performed to discover whether one time series affected the other, with the result expressed as a lag in days. Since the data set was so small, only a one-day lag could be tested; this gave a significance level of over 5%, so the alternative hypothesis could not be accepted.

During visualization of the data using line graphs, it was noted that there seemed to be a relation in which the stock data followed a similar trend one day after the tweet data. A manual lag was performed in Excel by moving the tweet time series forward by one day. This showed that a trend did exist. Subsequently a correlation model was created in R Studio, and the results exhibited a strong correlation of 0.9 and over.

The creation of a formula for commercial use was attempted. The first formula was used to find the percentage difference between the stock movement and the tweet movement. On average there was a difference between the movement of the stocks and the movement of the tweets. Another formula was created to predict the close share price. Knowing the Twitter volumes of a company for two consecutive days, the percentage movement of tweets between those two days should in turn allow us to predict the movement in the company share price three days later. The formula used is a straight line (1:1 ratio). When predicting the third day for the Apple share price, an error level of just 0.639% was returned. This meant that the close share price increased at the same rate as the Twitter feeds for the keyword Apple, within an error level of 0.639%. Disappointingly, the other days predicted for the Apple Close stock price were not as suitable, returning error rates of 27.9% and 47.25%. This trend continued throughout the analysis of the closing price of the Microsoft and Tesla stock. The formula was slightly altered to accommodate other variables such as the Low stock price and Volume. Again the errors were high for each one.
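The manual one-day lag and the correlation check described in the conclusion could be reproduced along the following lines. The project used Excel and R Studio for this step; the Python sketch below is only an illustration, and the function names and sample series are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (spread_x * spread_y)


def lagged_correlation(tweets, prices, lag=1):
    """Correlate tweet counts on day t with prices on day t + lag."""
    if len(tweets) != len(prices):
        raise ValueError("series must have equal length")
    if lag < 0 or lag >= len(tweets):
        raise ValueError("lag must be between 0 and len(series) - 1")
    if lag == 0:
        return pearson(tweets, prices)
    # Drop the last `lag` tweet days and the first `lag` price days so
    # that each tweet count lines up with the price `lag` days later.
    return pearson(tweets[:-lag], prices[lag:])


if __name__ == "__main__":
    # Hypothetical series in which prices follow tweets by one day.
    tweets = [10, 30, 20, 50, 40, 60]
    prices = [0, 21, 61, 41, 101, 81]
    print(lagged_correlation(tweets, prices, lag=1))
```

With only a handful of trading days, a high coefficient from such a check should be read with caution, since each extra day of lag removes one of the few available observations.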

The main issue here is that the data set is not developed enough for this form of analysis. When acquiring the data, only tweets specifically regarding the stock of the company should have been collected. A company on Twitter is competing for public interest, while on the stock exchange it is competing for capital interest. In that respect, some of the tweets gathered in this analysis are noisy data.

Further Development

Further development of the project would include extracting tweets and stock data over a longer period of time. This would provide the analysis with a superior result from the Granger causality test.

The tweets need to be selected from a niche community, preferably the investor community who communicate through Twitter in relation to the stocks of companies. Tweets that mention the company symbols and the word stock should be gathered using those keywords.

Narrowing down the selection of companies and focusing on one would help to reduce the number of discrepancies in the tweet count.

A program script should be developed to count the lines in which a word appears, without counting the word again if it has been mentioned more than once in a tweet.

A formula could potentially be developed to take account of other variables that cause movement in stock, such as the release of company financial reports, takeover rumours, mergers or bad publicity.

Using sentiment analysis on the tweets would provide a more accurate result from the data. Analysing Twitter activity alone will not provide the analysis with any information about the behavioural attitudes of investors. Sentiment analysis would also provide a better insight into the public attitude.
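The counting script proposed under Further Development, which counts each tweet once no matter how often the keyword appears in it, might look like the sketch below. It assumes one tweet per line of text; the function name and the sample tweets are hypothetical.

```python
import re


def count_tweets_mentioning(tweets, keyword):
    """Count the tweets that mention `keyword` at least once.

    Each tweet contributes at most 1 to the total, however many times
    the keyword is repeated in it. Matching is case-insensitive and
    only matches the keyword as a whole word, so "pineapple" does not
    count as a mention of "apple".
    """
    pattern = re.compile(
        r"(?<!\w)" + re.escape(keyword) + r"(?!\w)", re.IGNORECASE
    )
    return sum(1 for tweet in tweets if pattern.search(tweet))


if __name__ == "__main__":
    # Hypothetical tweets, one per list entry.
    sample = [
        "Apple apple APPLE",
        "I like pineapple",
        "Buy $AAPL, apple is up",
    ]
    print(count_tweets_mentioning(sample, "apple"))
    print(count_tweets_mentioning(sample, "$AAPL"))
```

The same function works for the NASDAQ symbols, since `re.escape` treats the leading `$` as a literal character.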

Bibliography

Aws.amazon.com, (2014). Word Count Example : Articles & Tutorials : Amazon Web Services. [online] Available at: (Accessed 22 May 2014).
Bollen, J. and Mao, H. (2011) 'Twitter mood as a stock market predictor', Computer.
Datasift.com, (2014). Power Decisions With Social Data, DataSift. [online] Available at: (Accessed 24 May 2014).
Dev.twitter.com, (2014). Twitter Developers. [online] Available at: (Accessed 22 May 2014).
Finance.yahoo.com, (2014). AAPL Historical Prices, Apple Inc. Stock - Yahoo! Finance. [online] Available at: &g=d (Accessed 22 May 2014).
Mac App Store, (2014). TextWrangler. [online] Available at: (Accessed 22 May 2014).
Mittal, A. and Goel, A. (2012) 'Stock prediction using Twitter sentiment analysis', Stanford University, CS229 (stanford.edu/proj2011/goelmittal-stockmarketpredictionusingtwittersentimentanalysis.pdf).
Simsek, M. and Ozdemir, S. (2012) 'Analysis of the relation between Turkish twitter messages and stock market index'.
Ucd.ie, (2014). CeADAR. [online] Available at: (Accessed 26 May 2014).
Ucd.ie, (2014). Brian Mac Namee, CeADAR. [online] Available at: (Accessed 26 May 2014).

Appendix

Project Materials: usp=sharingreferences

Project Proposal

Introduction

The purpose of this project is to study and analyse the activities and trends associated with the Mobile World Congress 2014, which is being held from the 24th to the 27th of February. The Mobile World Congress is the world's largest exhibition of the mobile industry. Mobile operators, device manufacturers and technology providers are all represented at the exhibition. With a large number of manufacturers attending and many product launches, the subject can be quite broad.

The objective of this project is to analyse Twitter feeds for activities and trends associated with the top mobile manufacturers before, during and after the event, and to see how their stock market shares are connected to and affected by the Twitter feeds.

Background

As Twitter matures, top brands have realized just how relevant Twitter can be as a marketing and engagement platform. According to Useful Social Media, 98% of the top brands are on Twitter and 92% of top brands tweet daily. There are 230 million active users on Twitter; this provides brands with a global presence.

(USM) "92% of top brands Tweet at least once daily as audiences grow. Study shows Twitter's maturity as a marketing and engagement platform. 98% of all top brands are active on Twitter. The social network has matured into a valuable and necessary channel for marketing organizations." (Usefulsocialmedia.com, 2014)

Releases such as the Samsung Galaxy S5 will hopefully see a surge of Twitter activity in relation to Samsung during the event. According to Trusted Reviews, the release of the Samsung Galaxy S5 will take place during the event.

(Trusted Reviews) "The Samsung Galaxy S5 release date looks set to be held in a matter of days as the Korean manufacturer issues invites to a February 24 launch event, kicking Samsung Galaxy S5 rumours into overdrive." (Trusted Reviews, 2014)

Using the data from the Twitter feeds, I can then analyse them against the stock market shares. According to Mac Rumours, Samsung has the biggest phone market share, with Apple in second place.

(Mac Rumours) "Apple Continues to Lose Smartphone Share, Gain Mobile Phone Share in 4Q 2013" (Mac Rumours, 2014)

Similar research has been done on Twitter feeds influencing market shares, but this project will focus mainly on the Mobile World Congress in relation to the market shares of the top five mobile device manufacturers.

Technical Approach

This objective will be achieved by:
Creating the necessary Python code to use with the Twitter API for retrieving the data.
Gathering all data created on Twitter related to the mobile device brands before, during and after the event.
Gathering the stock market share prices of the mobile device brands before, during and after the event.
Cleaning all data gathered for analysis.
Analysing the Twitter activity data against the stock market share prices.
Returning the results of the analysis.

Special Resources Required

Books to be used:
Python for Data Analysis, McKinney, W. (2013)
Twitter API: Up and Running: Learn How to Build Applications with the Twitter API, Makice, K. (2009)
Writing Your Dissertation, Swetnam, D. & Swetnam, R. (2000)

Software to be used:
Python
R Studio
MySQL
Microsoft Excel
Microsoft Project
Twitter API

System storage to be used:
Twitter API

At this stage of the project I am unaware of the amount of data that I will accumulate from Twitter.

Project Plan

Technical Details

The code I will use to retrieve the data will be written in Python. R code and Microsoft Excel will then be used to analyse the data.

Systems/Datasets

All the datasets used will be collected by myself, using the online Twitter API with Python code to collect specific words and hash tags from the tweets over the duration of the event's operating time each day.

Evaluation/Test and Analysis

I am unable to state how I will test the data, because we have had only one class of Data and Web Mining, but I can list the types of analysis that we will be learning:
Classification
Regression (value estimation)
Similarity matching
Clustering
Co-occurrence grouping (frequent itemset mining)
Profiling (behaviour description)
Link prediction
Data reduction
Causal modelling

Consultation with Specialization Persons

John O'Connor, CEO of Wellclever. Wellclever is a startup company that provides media groups and content producers with keyword contextual online advertising solutions. Consulted with John for project ideas. John has over 20 years of experience in the advertising industry. (Wellclever, 2014)

Oisin Creaner, coordinator of the project for NCI. Spoke to Oisin about project ideas involving the use of Twitter APIs.

Requirements Specification

Document Control

Revision History

Date      Version   Scope of Activity   Prepared   Reviewed   Approved
20/02/              Create              RC         X          X
23/02/              Update              RC         X          X
24/02/              Update              RC         X          X

Distribution List

Name            Title                                Version
Oisin Creaner   Lecturer
Samsung         Customer
Robert Coyle    BA
Robert Coyle    System Developer
Robert Coyle    Statistician
Robert Coyle    Tester
Robert Coyle    Advertising and Marketing Division

Related Documents

Title               Comments
Proposal Document

1 Introduction

1.1 Purpose

The purpose of this project is to study and analyze the activities and trends associated with a brand's advertising campaign. The objective of this project is to analyze Twitter feeds for activities and trends associated with the brand before, during and after its advertising campaign, and to see how its stock market shares are connected to and affected by the Twitter feeds. The intended customers are the brands themselves and their marketing and PR teams.

As Twitter matures, top brands have realized just how relevant Twitter can be as a marketing and engagement platform. According to Useful Social Media, 98% of the top brands are on Twitter and 92% of top brands tweet daily. There are 230 million active users on Twitter; this provides brands with a global presence.

(USM) "92% of top brands Tweet at least once daily as audiences grow. Study shows Twitter's maturity as a marketing and engagement platform. 98% of all top brands are active on Twitter. The social network has matured into a valuable and necessary channel for marketing organizations." (Usefulsocialmedia.com, 2014)

1.2 Project Scope

This analysis will compare different advertising campaigns run by a brand on the release of a new or updated product, and how they differ from one another. It will also look at how a brand's advertising campaign affects its stock market share prices. I will be using historic Twitter feeds and historic stock market shares.

The project will look at an individual brand such as Samsung and acquire the necessary Twitter feeds associated with Samsung. Using the correct programs and scripts, the program should gather any mentions of Samsung in the tweets, including hash tags. The data will include the time series of the tweets, which can then be matched to the time series of the stock market data.

With a budget of zero, acquiring the historic Twitter feeds could be a difficult task, since my research has shown that Twitter has given or sold its data to outside companies who now sell the data for use.

In Scope

1. The analysis of an advertising campaign with the data gathered from Twitter and stock market share prices.
2. The development of Python programs for cleaning data.
3. The development of an R program and the use of Microsoft Excel for the analysis of the data.
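The matching of the tweet time series to the stock market time series mentioned in the project scope could be sketched as follows. This is an illustrative assumption about how the two series might be joined, not the project's actual code: it assumes both series have already been reduced to one value per calendar day, and the function name, dates and values are hypothetical.

```python
from datetime import date


def align_by_date(tweet_counts, closing_prices):
    """Join two daily series on the dates they have in common.

    tweet_counts:   dict mapping date -> number of tweets that day
    closing_prices: dict mapping date -> closing share price that day
    Returns a list of (date, tweets, price) tuples in date order. Days
    present in only one series (for example weekends, when the market
    is closed) are dropped.
    """
    shared_days = sorted(set(tweet_counts) & set(closing_prices))
    return [(day, tweet_counts[day], closing_prices[day]) for day in shared_days]


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    tweets = {
        date(2014, 5, 19): 100,
        date(2014, 5, 20): 120,
        date(2014, 5, 24): 80,   # a Saturday, no stock data
    }
    closes = {
        date(2014, 5, 19): 10.0,
        date(2014, 5, 20): 10.5,
        date(2014, 5, 21): 10.7,  # no tweets collected
    }
    for row in align_by_date(tweets, closes):
        print(row)
```

Dropping unmatched days keeps the two series the same length, which the later correlation and regression steps require.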

1.2.2 Out of Scope

1. The project will not provide Samsung with outside analysis of other brands' data.

1.3 Document Scope

The goal of this document is to describe the functional and non-functional requirements of the Samsung advertising campaign analysis. The stakeholder analysis was carried out prior to the requirement elicitation process.

1.4 Definitions, Acronyms, and Abbreviations

Term                   Definition
Advertising campaign   A series of messages to promote a product.
BA                     Business Analyst.
Backed-up              The process of storing information (hardware or software based).
Cloud                  Internet based service where storage, applications and servers are accessed through the internet for an organization.
Data                   Information.
Excel                  Microsoft Excel is a spreadsheet application used here for analyzing data.
GUI                    Graphical user interface.
MoSCoW                 A technique used in functional requirements: Must have, Should have, Could have, Won't have. See Functional Requirements.
Python                 Type of programming language.
R                      Programming language.

2 User Requirements Definition

2.1 User Characteristics

As part of Samsung's $14 billion advertising and marketing campaign last year (2013), the company requires an analysis of the effectiveness of the advertising campaign and of how the Twitter activity and its stock market prices were affected. According to ibtimes.co.uk, Samsung was expected to spend $14 billion on its marketing campaign.

(ibtimes.co.uk) "The South Korean company is expected to spend around $14 billion (£8.5bn, €10.3bn) on marketing and promotion of its products in 2013, which is the biggest (as a percentage of its total revenue) advertising budget of any company ever" (ibtimes, 2013)

Samsung have not yet released their annual report for

The analysis will provide Samsung with a better insight into the effectiveness of their advertising campaign strategy from data acquired from the Twitter feeds and the stock market. This information will assist Samsung in managing their advertising campaign more effectively and efficiently by directing the style and approach of the campaign towards their specific products.

3 Requirements Specification

3.1 Functional Requirements

FR#   Category         MoSCoW   Status   Description
FR1   Acquire Data 1   M        H        The project will gather and store all necessary data from historical Twitter feeds.
FR2   Acquire Data 2   M        H        The project will gather and store all necessary historical stock market data regarding the brand, corresponding to the dates of the Twitter data that was acquired.
FR3   Clean Data 1     M        H        The correct programs will be acquired and used to clean and retrieve historical Twitter data regarding key words and hash tags of the brand on certain dates.
FR4   Clean Data 2     M        H        The correct programs will be acquired and used to clean and retrieve historical stock market share prices regarding the brand for the same times and dates as the historical Twitter feed data.
FR5   Analyse 1        M        H        The cleaned Twitter data is then analysed and compared.
FR6   Analyse 2        M        H        The cleaned stock market data is then analysed and compared.
FR7   Publish Data     M        H        The analysis will then be published and made available to the customer.

3.1.1 Use Case Diagram

Overall Functional Requirements

Requirement 1: Acquire Data 1 and 2

Description & Priority

The scope of this use case is to gather all the data necessary to carry out the analysis and continue on to the next stage of the project. This requirement has a very high status and is essential for progressing to the next stage of the analysis.

Use Case Scope

The system shall source the historic Twitter and stock market data from online data resources. Define all access points. Access the data, give notification of its availability and then download the data.

Description

This use case describes the process by which the data for analysis is acquired.

Use Case Diagram

Flow Description

Precondition

The data must be online. The data system must be operational at all times.

Activation

The use case is activated when the programmer connects to the system online.

Main Flow

1. Step: 1A. Programmer and System Developer source data.
2. Step: 2A. Programmer and Business Analyst validate data with the Customer.
3. Step: 3A. Programmer accesses the data.
4. Step: 4A. Programmer notifies data availability to the System Developer.
5. Step: 5A. Programmer downloads data for cleaning.

Alternate Flow

1. Step: 1A. Programmer and System Developer source data.
2. Step: 2A. Programmer and Business Analyst validate data with the Customer.
3. Step: 2A. Customer does not validate data. Step 1A is set to recommence.
4. Step: 1A. Programmer and System Developer source data.
5. Step: 2A. Programmer and Business Analyst validate data with the Customer.
6. Step: 3A. Programmer accesses the data.
7. Step: 4A. Programmer notifies data availability to the System Developer.
8. Step: 5A. Programmer downloads data for cleaning.

Exceptional Flow

1. Step: 1A. Programmer and System Developer source data.
2. Step: 2A. Programmer and Business Analyst validate data with the Customer.
3. Step: 2A. Customer does not validate data. Data is unavailable.
4. Use case ends.

Termination

The system has gathered all necessary data. The data is then exported to the cloud storage system. This process has now been terminated.

Post Condition

All data gathered; move on to the next step.

3.1.3 Requirement 2: Clean Data 1 and 2

Description & Priority

The scope of this use case is to clean all the data gathered in the previous requirement. A programmer and tester investigate the data for any errors, such as missing data, and fix the errors. This requirement has a very high status and is essential for progressing to the next stage of the analysis.

Use Case Scope

The system shall clean all data sets gathered in the previous requirement. Define all error points. Get recommendations for fixing the errors. Fix the errors and then export the data for analysis.

Description

This use case describes the process by which the data is cleaned for analysis.

Use Case Diagram

Flow Description

Precondition

The data must be stored and available for cleaning at all times.

Activation

The use case is activated when the programmer connects to the cloud storage system and retrieves the data.

Main Flow

1. Step: 1B. Programmer and System Developer retrieve data from the cloud storage system.
2. Step: 2B. Programmer and Tester identify errors in the data set.
3. Step: 3B. Programmer receives recommendations from System Developer.
4. Step: 4B. Programmer with the help of the Tester fixes errors and notifies the System Developer.
5. Step: 5B. Programmer exports the data for analysis.

Alternate Flow

1. Step: 1B. Programmer and System Developer retrieve data from the cloud storage system.
2. Step: 2B. Programmer and Tester identify errors in the data set.
3. Step: 3B. Programmer receives recommendations from System Developer.
4. Step: 4B. Programmer with the help of the Tester fixes errors and notifies the System Developer.
5. Step: 2B. Programmer and Tester test system again and identify more errors in the data set.
6. Step: 3B. Programmer receives recommendations from System Developer.
7. Step: 4B. Programmer with the help of the Tester fixes errors and notifies the System Developer.
8. Step: 5B. Programmer exports the data for analysis.

Exceptional Flow

1. Step: 1B. Programmer and System Developer retrieve data from the cloud storage system.
2. Step: 2B. Programmer and Tester identify errors in the data set.
3. Step: 3B. Programmer receives recommendations from System Developer.
4. Step: 4B. Programmer with the help of the Tester cannot fix errors. Data is corrupt.
5. Use case ends.

Termination

The system cleaned all acquired data. The data is then saved onto the cloud storage system and exported for analysis. This process has now been terminated.

Post Condition

All data cleaned; move on to the next step.

3.1.4 Requirement 2: Analyze Data

Description & Priority

The scope of this use case is to analyze all the data gathered and cleaned in the previous requirements. A Business Analyst and Statistician examine and study the data for analysis. This requirement has a very high status and is essential for progressing to the next stage of the analysis.

Use Case Scope

This process involves the skills and management of the Statistician and Business Analyst to compare and analyze all data. The process shall calculate and prove/predict outcomes from the data with the help of graphs for visualization. Then all proven data is backed up and stored.

Description

This use case describes the process by which the data is analyzed.

Use Case Diagram

Flow Description

Precondition

The data must be available for analysis at all times.

Activation

The use case is activated when the BA and the Statistician connect to the cloud storage system and retrieve the data.

Main Flow

1. Step: 1C. BA and Statistician retrieve data from the cloud storage system.
2. Step: 2C. The Statistician and BA explore and understand the data set.
3. Step: 3C. Statistician begins the calculations.
4. Step: 4C. Statistician and BA begin to visualize the data.
5. Step: 5C. Programmer backs up and stores findings with the approval of the BA.

Alternate Flow

1. Step: 1C. BA and Statistician retrieve data from the cloud storage system.
2. Step: 2C. The Statistician and BA explore and understand the data set.
3. Step: 3C. Statistician begins the calculations.
4. Step: 4C. Statistician and BA begin to visualize the data. BA requests the data to be recalculated with a different approach.
5. Step: 3C. Statistician begins the new calculations.
6. Step: 4C. Statistician and BA begin to visualize the data.
7. Step: 5C. Programmer backs up and stores findings with the approval of the BA.

Exceptional Flow

1. Step: 1C. BA and Statistician retrieve data from the cloud storage system.
2. Step: 2C. The Statistician and BA explore and understand the data set. Statistician and BA are unable to understand the data set. BA requests new data set.
3. Use case ends.

Termination

The analysis is completed. The data is then saved onto the cloud storage system and exported for publishing. This process has now been terminated.

Post Condition

All data analyzed; move on to the next step.

Requirement 2: Publish Data

Description & Priority

The scope of this use case is to publish the findings from the analysis approved by the previous requirements. A Business Analyst consults the Customer on topics such as the proprietor of the data, the goal of the publication, the target audience/data consumer (is the data confidential and for internal use only), the media in which it is published and the release date. This requirement has a very high status.

Use Case Scope

This process involves the communication and business skills of the BA and how to handle the customer's requirements and outcomes. The process involves the Customer, BA and the Advertising/Publications Division. The process shall publicize the findings to the desired audience with the approval of the customer and the recommendations of the BA.

Description

This use case describes the process by which the data is publicized.

Use Case Diagram

Flow Description

Precondition

The data must be available for analysis at all times. Customer/Client must be available for analysis at all times.

Activation

The use case is activated when the findings are presented to the BA, Customer and Advertising/Publication Division and all three are engaged in communication.

Main Flow

1. Step: 1D. BA, Customer and Advertising/Publication Division retrieve analysis findings. Findings have acquired owner's approval.
2. Step: 2D. BA and Customer discuss the objective of the findings release.
3. Step: 3D. BA and Customer begin to agree on the target audience/data consumer.
4. Step: 4D. Customer decides the medium type/the style and method of publicizing the data, e.g. websites, newspaper, with the BA's approval and the assistance of the Advertising/Publication Division.
5. Step: 5D. BA notifies Advertising/Publication Division to publish the data.

Alternate Flow

1. Step: 1D. BA, Customer and Advertising/Publication Division retrieve analysis findings. Findings have acquired owner's approval.
2. Step: 2D. BA and Customer discuss the objective of the findings release.
3. Step: 3D. BA and Customer begin to agree on the target audience/data consumer.
4. Step: 4D. Customer decides the medium type/the style and method of publicizing the data, e.g. websites, newspaper, with the BA's approval and the assistance of the Advertising/Publication Division. Customer decides to recommence Step 3D to change the publication approach.
5. Step: 3D. BA and Customer begin to agree on a new target audience/data consumer.
6. Step: 4D. Customer decides the medium type/the style and method of publicizing the data, e.g. websites, newspaper, with the BA's approval and the assistance of the Advertising/Publication Division.
7. Step: 5D. BA notifies Advertising/Publication Division to publish the data.

Exceptional Flow
1. Step 1D. BA, Customer and Advertising/Publication Division retrieve the analysis findings. The findings have not acquired the owner's approval. The Customer decides not to publicize the findings because of their high importance and confidentiality.
2. Use case ends.

Termination
The publication of the data is completed. This process has now been terminated.

Post Condition
All data publicized, all steps completed.

3.2 Non-Functional Requirements

3.2.1 Availability: Must Have
The information must be available at all times for analysis.

3.2.2 Storage Requirements: Must Have
The data kept during and after the analysis should be stored in a secure facility. Cloud storage security protocols must be assessed. There must be enough capacity in the cloud to hold the large amount of data.

3.2.3 Connection Reliability: Must Have
The analysis must have a reliable connection at all times when retrieving, uploading and updating the data. A lost connection could result in lost data.

3.2.4 Connection Speed: Must Have
The analysis must have a fast online connection. This is needed when retrieving, uploading and updating the data. A large data set could take some time to upload.

3.2.5 Backup and Recovery: Must Have
The data must be easy to access, back up and update. There must be a system recovery procedure in case of a system failure.

3.2.6 Program to clean data: Must Have
The analysis must have the correct programs to clean the data and fix any errors in it.

3.2.7 Software Analysis tools: Must Have
The analysis must have the correct software analysis tools, which all divisions of the analysis can use.

3.2.8 Communication Requirements: Must Have
The analysis must have constant communication between all divisions/parties in the decision-making process.

3.2.9 Security: Must Have
The analysis must have high security measures, as it operates on highly confidential data. Only key divisions of the analysis may have access to the data.

3.2.10 Data Validation: Must Have
This process requires the use of external services in order to download the data. Once the data is gathered from the services (Twitter, Nasdaq) it should be validated.

5 Interface Requirements

5.1 GUI
An example of an analysis of tweets. vii comprendia
Examples of tweets analyzed in Microsoft Excel and Geo Flow.

viii powerpivotblog

Analysis of tweets using the R language. ix evolutionanalytics
Example of Excel data for an introduction to regression, using stock market data. x skilledup

Example of analysis completed in R Studio. xi datamachines

Analysis Evolution
The analysis will evolve over time to produce a much more focused outcome, differentiating itself by analysing a specific product in the Samsung product range. This can be done by changing the keywords mined from the Twitter data so that they focus on a product line such as the Galaxy range, which includes the smartphone, tablet and watch. If the customer, Samsung, required an analysis focused on the release of a specific product such as the Galaxy S4, which was released in April 2013, this could be done by narrowing the search keywords to hashtags such as #samsungs4, #SamsungGalaxyS4, #GalaxyS4 and #S4, and by narrowing the timeline to the release date of the phone.
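The narrowing described above can be sketched in Python as a simple filter. This is only an illustration under stated assumptions: the tweet records, their `date` and `text` keys, and the April 2013 date window are hypothetical, and only the four hashtags come from the text.

```python
from datetime import date

# Hashtags taken from the text; matching is case-insensitive.
PRODUCT_TAGS = {"#samsungs4", "#samsunggalaxys4", "#galaxys4", "#s4"}


def hashtags(text):
    """Return the set of lower-cased hashtags found in a tweet's text."""
    return {
        word.rstrip(".,!?:;").lower()
        for word in text.split()
        if word.startswith("#")
    }


def filter_tweets(tweets, tags, start, end):
    """Keep tweets dated within [start, end] that use at least one of `tags`."""
    return [
        t for t in tweets
        if start <= t["date"] <= end and hashtags(t["text"]) & tags
    ]


if __name__ == "__main__":
    # Hypothetical sample records, invented for illustration only.
    sample = [
        {"date": date(2013, 4, 27), "text": "Loving the new phone #GalaxyS4!"},
        {"date": date(2013, 4, 27), "text": "Great TV picture #samsung"},
        {"date": date(2013, 6, 1), "text": "Late to the party #S4"},
    ]
    # Assumed window: the release month mentioned in the text.
    kept = filter_tweets(sample, PRODUCT_TAGS, date(2013, 4, 1), date(2013, 4, 30))
    for tweet in kept:
        print(tweet["date"], tweet["text"])
```

Matching whole hashtags rather than substrings keeps a short tag such as #S4 from matching unrelated words, at the cost of missing tweets that name the product without a hashtag.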

Progress Management Report 1

Document Location
This document will be uploaded through Turnitin.

Revision History
Date of this revision: 9/03/14
Revision date: 9/03/14. Previous revision date: none recorded. Summary of changes: First Issue. Changes marked: none recorded.

Approvals
This project requires the following approvals.
Name: Robert Coyle. Signature: (blank). Title: Project Manager. Date of issue: 10/03/14. Version: 1.

Distribution
Name: Oisin Creaner. Title: Project Lecturer. Date of issue: 10/03/14. Version: 1.

Purpose of Document
To provide Oisin Creaner, the project lecturer, with a summary of the status of the project.

Date of report: 09/03/14
Period covered: 10/02/14 to 9/03/14

Schedule Status
This project is still on schedule at this interval.

Updated Gantt chart
Tasks shown: Project Proposal; Create Python codes; Data retrieval from Twitter API and ... (label truncated, shown twice); Management Progress Report 1; Management Progress Report 2. Date axis: 03-Feb, 23-Feb, 15-Mar, 04-Apr, 24-Apr.

Definitions, Acronyms, and Abbreviations
API: Application programming interface
JSON: JavaScript Object Notation
NASDAQ: National Association of Securities Dealers Automated Quotations, an American stock exchange
RSS: Rich Site Summary

Products completed during this period
Project proposal: The project proposal was completed on time. See (Coyle, 2014).
Requirements specification: The requirements specification was completed on time, with changes to the project scope. See (Coyle, 2014).

Problems

Actual
Accessing Twitter API: The Twitter API has been more difficult to access than first anticipated because of changed regulations and the updated version of Twitter. The API only supports JSON.
Acquiring free historical data: Historical feeds are proving difficult to obtain, as Twitter has sold its data to approved sites for resale. As this project has no budget, this has had a high impact on the plan. Twitter has released an online grant application form for access to its historical data.

Potential
The quality and quantity of the Twitter data: As I do not yet have the JSON code, I am not sure what data will be returned. Using a site called Twillert, I acquired some data, but the site won't gather more than the first 100 RSS feeds, which renders the service useless.
Gathering the data in the required time: Once I have a response to the Twitter developers grant application, I can determine whether the historical data can be acquired and progress to the next stage of the project.
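One way to reduce the uncertainty about what a JSON response will contain is to write the parsing step against a small hand-made payload first. The sketch below is a minimal illustration, not the project's code: the payload is invented, and the `created_at` and `text` field names and the timestamp layout are assumed to follow the tweet objects of the Twitter REST API v1.1, which should be checked against the actual response.

```python
import json
from datetime import datetime

# Invented payload for illustration; a real response carries many more fields.
SAMPLE = """[
  {"created_at": "Mon Mar 03 09:15:00 +0000 2014",
   "text": "Samsung shares look strong"},
  {"created_at": "Mon Mar 03 17:40:12 +0000 2014",
   "text": "New #Samsung phone announced"}
]"""

# Assumed timestamp layout, e.g. "Mon Mar 03 09:15:00 +0000 2014".
TIME_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def parse_tweets(payload):
    """Turn a JSON array of tweet objects into (datetime, text) pairs."""
    records = json.loads(payload)
    return [
        (datetime.strptime(r["created_at"], TIME_FORMAT), r["text"])
        for r in records
    ]


if __name__ == "__main__":
    for created, text in parse_tweets(SAMPLE):
        print(created.date().isoformat(), text)
```

Keeping only the timestamp and the text is a deliberate simplification; any further field would be added in the same list comprehension once its name is confirmed.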

RAID Log: Risks

Assumptions

Issues

Dependency

Products due for completion
By the next period the following should be accomplished.
Gathering of Twitter feeds: All Twitter data relating to Samsung, either historical or real time, should have been gathered.
Gathering of stock market data: All Nasdaq data relating to Samsung should have been gathered for the same time series as the Twitter data.
Analysis of data: Once all data has been gathered, analysis can take place.
Preliminary presentation: The preliminary presentation should have been completed.
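Putting the stock data on the same time series as the Twitter data amounts to joining the two sets on calendar date. The following Python sketch shows one possible way to do this, under assumptions: the function name `align_by_date`, the input shapes and all sample values are hypothetical, and tweets posted on non-trading days are simply left out rather than carried over to the next trading day.

```python
from collections import Counter
from datetime import date


def align_by_date(tweet_dates, closes):
    """Pair each trading day's closing price with that day's tweet count.

    tweet_dates: iterable of datetime.date, one entry per tweet.
    closes: dict mapping datetime.date (trading days only) to closing price.
    Returns a list of (date, tweet_count, close) tuples sorted by date.
    """
    counts = Counter(tweet_dates)
    return [(day, counts.get(day, 0), closes[day]) for day in sorted(closes)]


if __name__ == "__main__":
    # Invented sample values for illustration only.
    tweets = [
        date(2014, 3, 7), date(2014, 3, 7),
        date(2014, 3, 8),            # Saturday, no trading
        date(2014, 3, 10),
    ]
    prices = {date(2014, 3, 7): 100.0, date(2014, 3, 10): 101.5}
    for day, count, close in align_by_date(tweets, prices):
        print(day.isoformat(), count, close)
```

Letting the trading days drive the series keeps every row complete, which suits a later regression; whether weekend tweets should instead be added to the following Monday is a design choice the sketch leaves open.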


More information

Predicting stocks returns correlations based on unstructured data sources

Predicting stocks returns correlations based on unstructured data sources Predicting stocks returns correlations based on unstructured data sources Mateusz Radzimski, José Luis Sánchez-Cervantes, José Luis López Cuadrado, Ángel García-Crespo Departamento de Informática Universidad

More information

Introducing Bing Shopping Campaigns beta

Introducing Bing Shopping Campaigns beta Introducing Bing Shopping Campaigns beta Bing Shopping Campaigns beta // available by invite only Launches in the US this summer. Most consumers shop and buy online 90% 83% of US consumers browsed, researched

More information

Analysis of Tweets for Prediction of Indian Stock Markets

Analysis of Tweets for Prediction of Indian Stock Markets Analysis of Tweets for Prediction of Indian Stock Markets Phillip Tichaona Sumbureru Department of Computer Science and Engineering, JNTU College of Engineering Hyderabad, Kukatpally, Hyderabad-500 085,

More information

Web Extension: Financing Feedbacks and Alternative Forecasting Techniques

Web Extension: Financing Feedbacks and Alternative Forecasting Techniques 19878_09W_p001-009.qxd 3/10/06 9:56 AM Page 1 C H A P T E R 9 Web Extension: Financing Feedbacks and Alternative Forecasting Techniques IMAGE: GETTY IMAGES, INC., PHOTODISC COLLECTION In Chapter 9 we forecasted

More information

Hootsuite Best Practices

Hootsuite Best Practices + Hootsuite Best Practices + University of Waterloo Guide for University of Waterloo accounts through the social media management system, Hootsuite. + Table of Contents Content Creation and Sharing...4

More information

Introduction to Regression and Data Analysis

Introduction to Regression and Data Analysis Statlab Workshop Introduction to Regression and Data Analysis with Dan Campbell and Sherlock Campbell October 28, 2008 I. The basics A. Types of variables Your variables may take several forms, and it

More information

Nine Common Types of Data Mining Techniques Used in Predictive Analytics

Nine Common Types of Data Mining Techniques Used in Predictive Analytics 1 Nine Common Types of Data Mining Techniques Used in Predictive Analytics By Laura Patterson, President, VisionEdge Marketing Predictive analytics enable you to develop mathematical models to help better

More information

11. Analysis of Case-control Studies Logistic Regression

11. Analysis of Case-control Studies Logistic Regression Research methods II 113 11. Analysis of Case-control Studies Logistic Regression This chapter builds upon and further develops the concepts and strategies described in Ch.6 of Mother and Child Health:

More information

Sensex Realized Volatility Index

Sensex Realized Volatility Index Sensex Realized Volatility Index Introduction: Volatility modelling has traditionally relied on complex econometric procedures in order to accommodate the inherent latent character of volatility. Realized

More information

DATA MINING TECHNIQUES FOR CRM

DATA MINING TECHNIQUES FOR CRM International Journal of Scientific & Engineering Research, Volume 4, Issue 4, April-2013 509 DATA MINING TECHNIQUES FOR CRM R.Senkamalavalli, Research Scholar, SCSVMV University, Enathur, Kanchipuram

More information

Introduction To Hive

Introduction To Hive Introduction To Hive How to use Hive in Amazon EC2 CS 341: Project in Mining Massive Data Sets Hyung Jin(Evion) Kim Stanford University References: Cloudera Tutorials, CS345a session slides, Hadoop - The

More information

Project Proposal: Monitoring and Processing Stock Market Data In Real Time Using the Cyclone V FPGA

Project Proposal: Monitoring and Processing Stock Market Data In Real Time Using the Cyclone V FPGA Project Proposal: Monitoring and Processing Stock Market Data In Real Time Using the Cyclone V FPGA Alexander Gazman (ag3529), Hang Guan (hg2388), Nathan Abrams (nca2123) 1. Introduction The general premise

More information

MINING DATA FROM TWITTER. Abhishanga Upadhyay Luis Mao Malavika Goda Krishna

MINING DATA FROM TWITTER. Abhishanga Upadhyay Luis Mao Malavika Goda Krishna MINING DATA FROM TWITTER Abhishanga Upadhyay Luis Mao Malavika Goda Krishna 1 Abstract The purpose of this report is to illustrate how to data mine Twitter to anyone with a computer and Internet access.

More information

A Primer on Forecasting Business Performance

A Primer on Forecasting Business Performance A Primer on Forecasting Business Performance There are two common approaches to forecasting: qualitative and quantitative. Qualitative forecasting methods are important when historical data is not available.

More information

What is Driving Rapid Growth in the Australian Mobile Advertising Market?

What is Driving Rapid Growth in the Australian Mobile Advertising Market? What is Driving Rapid Growth in the Australian Mobile Advertising Market? Author: Phil Harpur Published: 10 Dec 2013 Key Takeaway The Australian mobile advertising market grew very strongly during 2013

More information

Do Tweets Matter for Shareholders? An Empirical Analysis

Do Tweets Matter for Shareholders? An Empirical Analysis Do Tweets Matter for Shareholders? An Empirical Analysis Brittany Cole University of Mississippi Jonathan Daigle University of Mississippi Bonnie F. Van Ness University of Mississippi We identify the 215

More information

Stock Price Prediction Using Sentiment Detection of Twitter

Stock Price Prediction Using Sentiment Detection of Twitter Stock Price Prediction Using Sentiment Detection of Twitter C. Lee Fanzilli March 18, 2015 Abstract If Amazon can predict what books we want to read, Netflix can predict what movies we want to watch, and

More information

Operationalise Predictive Analytics

Operationalise Predictive Analytics Operationalise Predictive Analytics Publish SPSS, Excel and R reports online Predict online using SPSS and R models Access models and reports via Android app Organise people and content into projects Monitor

More information

RELEVANT TO ACCA QUALIFICATION PAPER P3. Studying Paper P3? Performance objectives 7, 8 and 9 are relevant to this exam

RELEVANT TO ACCA QUALIFICATION PAPER P3. Studying Paper P3? Performance objectives 7, 8 and 9 are relevant to this exam RELEVANT TO ACCA QUALIFICATION PAPER P3 Studying Paper P3? Performance objectives 7, 8 and 9 are relevant to this exam Business forecasting and strategic planning Quantitative data has always been supplied

More information

Machine Learning and Data Mining. Fundamentals, robotics, recognition

Machine Learning and Data Mining. Fundamentals, robotics, recognition Machine Learning and Data Mining Fundamentals, robotics, recognition Machine Learning, Data Mining, Knowledge Discovery in Data Bases Their mutual relations Data Mining, Knowledge Discovery in Databases,

More information

Decision Support Optimization through Predictive Analytics - Leuven Statistical Day 2010

Decision Support Optimization through Predictive Analytics - Leuven Statistical Day 2010 Decision Support Optimization through Predictive Analytics - Leuven Statistical Day 2010 Ernst van Waning Senior Sales Engineer May 28, 2010 Agenda SPSS, an IBM Company SPSS Statistics User-driven product

More information

It has often been said that stock

It has often been said that stock Twitter Mood as a Stock Market Predictor Johan Bollen and Huina Mao Indiana University Bloomington Behavioral finance researchers can apply computational methods to large-scale social media data to better

More information