Crowd-Squared: A New Method for Improving Predictions by Crowd-sourcing Google Trends Keyword Selection

Full Paper (word count: 6058)

Erik Brynjolfsson, Sloan School of Management, Massachusetts Institute of Technology, erikb@mit.edu
Tomer Geva, Recanati Business School, Tel-Aviv University, tgeva@tau.ac.il
Shachar Reichman, Sloan School of Management, Massachusetts Institute of Technology, shachar@mit.edu

Abstract

Advances in information technologies and analytic tools have dramatically increased our ability to obtain accurate data on billions of economic decisions almost the instant that they are made. Services such as Google Trends aggregate billions of search queries and provide information about the search volume of different terms. This information from the crowd has been used successfully to predict a wide variety of events. Nevertheless, a major challenge for successful utilization of this source of information is the generation of an appropriate list of search terms that reflect the phenomena of interest. Current methods for search term generation often use proprietary search engine data, employ black-box classifiers, or require extensive computational power. We introduce Crowd-Squared, a new crowd-based method for selecting relevant search terms. We show that this method is successful in various domains and that it performs as well as, or even outperforms, selection methods used in previous studies.

Introduction

Advances in information technologies and analytic tools over recent years have dramatically increased our ability to obtain accurate data on literally billions of economic decisions almost the instant that they are made. In particular, each time a consumer uses a web search engine during the purchase process, valuable information is revealed about that individual's intentions to make an economic transaction and about the intentions of similar users. Services such as Google Trends aggregate billions of search queries and provide information about the relative volume of different search terms. Using this information from the crowd, researchers can make accurate predictions of a wide variety of future events, including product sales, claims for unemployment, and epidemic outbreaks. The ability to rapidly detect current activities and accurately predict future events has considerable business implications, covering almost all aspects of a firm's activities, such as inventory and supply chain activities, marketing activities, pricing, and others.

While early research has had considerable success, perhaps the most important challenge that has emerged is the generation of an appropriate list of search terms that reflect the phenomena of interest. Current techniques for the identification of relevant terms have various limitations, such as the need for proprietary search information; the use of an automated, black-box Google category classifier, which is not available for all domains; or reliance on manual term selection based on specialized domain knowledge or trial and error. Typically, the problem is handled as a one-person guessing game, and even when machine learning methods are used to identify the terms, the methods are still constrained by their individual design. While researchers can use their own intuition and judgment about which terms web searchers may use when seeking information about particular phenomena, these may not match well with the terms that web searchers actually choose.

In this paper, we offer a new crowd-based method for selecting relevant search terms that correspond to the underlying phenomena and facilitate accurate early detection or prediction of real-world events with the use of aggregated search data. In effect, we recruit a crowd of a few hundred participants to predict the terms employed by the much larger crowd of millions of search engine users. We show that this iterative use of the crowd is successful in various domains and that it performs as well as, or even outperforms, the search term selection methodologies used in previous studies.

Crowdsourcing has received increased interest recently and has proved to be a reliable and successful technique in many areas of business and research, including capturing new product ideas and innovations, generating accurate image tags, improving image search, and even solving scientific problems. One of the main benefits of crowdsourcing is the ability to harness human intelligence to perform small tasks that are impossible or too expensive for computers to perform appropriately. In our context, the task is finding the terms relevant to a focal item or to the occurrence of an event. We used a crowdsourcing environment and designed a game to capture people's ideas of focal phrases. We introduced an online word association game in which consumers are asked to provide five terms that come to mind when they see a specific word or phrase.

Given a specific phrase, the word association technique provides a relative index of the strength and accessibility of related words in memory. We therefore expect that this technique reflects the same keyword generation process one may perform while using a search engine.

In total, 300 participants played the game using the Amazon Mechanical Turk platform. We aggregated the results and collected aggregate search trends data for each of the most frequently mentioned terms. We used these search trends to generate predictions in three different domains: influenza epidemics, unemployment claims, and housing indexes. We then compared our results with a benchmark model in each of the domains.

We find that the use of crowd-generated terms as part of the prediction model is highly effective. Our results suggest that the integration of crowd-generated search terms with aggregated data from search engines performs as well as, or even outperforms, costly or black-box keyword generation methods.

Related Literature

The availability of search data, web activity data, and other sources of information, along with developments in analytic tools, has dramatically increased our ability to obtain accurate data on billions of economic decisions as well as on individuals' intentions to make these transactions (McAfee and Brynjolfsson, 2012). This has led to a new topic of interest: using search engine logs to forecast future events or to provide faster and more accurate means of gauging and identifying current events (also referred to as nowcasting by Choi and Varian (2012)).

Prediction models using search query volume data

Search engine logs, or search trends, have received significant attention in recent years for their ability to predict and detect a variety of economic outcomes. Search volume data has been shown to provide useful predictions in a wide range of domains, from epidemic outbreaks (Ginsberg et al. 2008) through movie box office sales and music billboard rankings (Goel et al. 2010) to automotive sales (Du and Kamakura, 2012; Choi and Varian, 2012), home sales (Wu and Brynjolfsson, 2009; Choi and Varian, 2012), and claims for unemployment (Choi and Varian, 2012).

An important component in the successful utilization of search trends data is the specification of the relevant set of search queries whose volume best reflects the phenomenon of interest. The literature reports several types of methods for selecting the relevant set of search terms.

The first approach relies on entire categories rather than specific keywords. This method uses an automatic allocation by a black-box classifier developed by Google that assigns each search query to one of several hundred predefined categories and sub-categories. Choi and Varian (2012) used this method to obtain relevant search volume data and demonstrated contemporaneous predictive capabilities in various fields, including sales of motor vehicle parts, initial claims for unemployment benefits, travel, the consumer confidence index, and automotive sales. Vosen and Schmidt (2011) used Google-categorized search data to predict private consumption. For this purpose, they used the 56 Google categories that they saw as most relevant. Subsequently, they employed a factor analysis method and used the factors with the largest eigenvalues. Wu and Brynjolfsson (2009) used Google search data pertaining to real estate categories to predict future house sales and price indices as well as home appliance sales. Their paper also raises the possibility of using search volume in one domain (such as real estate or real estate agencies) to predict sales in a different domain (home appliances). This demonstrates that search query selection procedures may be improved via a non-trivial process of finding terms or categories with seemingly indirect influence.

Overall, the advantages of using search volume based on Google's predefined categories are its ease of use and the fact that it can encompass multiple relevant search terms. However, the underlying classifier is a black box, so it is not possible to gauge its accuracy or its coverage. More importantly, the predetermined categories are applicable only to a set of popular items and do not include many possible items of interest (e.g., there is a category for the Ford automotive brand, but there is no category for the Ford Focus model). [1] In addition, for items such as housing prices or public consumption, multiple Google categories or subcategories may be relevant, and a user decision is required as to which of the categories should be used.

[1] These studies used the search volume for the entire category. However, Google Trends also allows specifying a keyword within a Google category. For example, a search for the term "Argo" under the movie category will return search queries related to movies that specifically include the word "Argo."

Another approach was taken by Ginsberg et al. (2008) for constructing an early detection system for influenza epidemics. For this purpose, they used Google's internal data concerning the 50 million most popular search terms. They then fitted a simple logistic model for each search term to explain the dependent variable. Last, they selected the top n terms according to the mean correlation with actual influenza data across nine regions. This methodology was highly successful. However, it is impossible to reproduce with the trends data that Google provides to external users (at the Google Trends website), because of strict restrictions on the number of terms that can be extracted from Google Trends (several hundred per day). In addition, this kind of analysis requires expertise and high computational power, both to collect a large portion of all queries performed online and to create the matrix of correlations with the phenomenon of interest.

Another study that used proprietary information is Goel et al. (2010), which reports on various methodological aspects of using search trends data. To demonstrate these aspects, the authors perform tasks such as predicting movie revenues, music billboard rankings, and video game sales. The methodology relied on the identification of search queries for which the Yahoo search engine returned predefined relevant webpages (e.g., IMDB). While the authors (who were affiliated with Yahoo) obtained good results using this methodology, it is virtually impossible to replicate using publicly available data, as it requires an exhaustive check of all possible search terms that return a set of specific links.

Other studies have used handpicked keywords. For instance, Seebach et al. (2011) tested various combinations of search terms that included vehicle brand names (e.g., Volkswagen or VW), vehicle model names (Golf, Passat), and various Google categories for the purpose of predicting automotive sales. They found that simply using brand-level names as search terms under the vehicle shopping category provided the best results in terms of correlation with brand-level sales. D'Amuri and Marcucci (2012) used a single, though highly relevant, search keyword, "jobs," in forecasting unemployment.

Last, Du and Kamakura (2012) report on a dynamic factor analysis method for extracting latent dynamic factors from multiple time series. They demonstrate this method using Google Trends data for U.S. automotive sales. For this purpose, they use an initial set of keywords suggested by the Google AdWords keyword tool, which is used to recommend relevant search terms for advertising purposes. While the Google AdWords tool suggests relevant terms, its selection criteria are also not publicly disclosed.

In many of the above-mentioned studies, the keyword models were based on a priori knowledge, a well-defined category, or search terms that were defined as closely related to the predicted variable. Another difficulty that may occur frequently in search term selection is that there is no prior knowledge about the queries that could be relevant to the predicted event (e.g., launching a new product) or about the search category that best corresponds to its search patterns.
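To make the correlation-based selection described above more concrete, the following is a minimal sketch of ranking candidate terms by their correlation with a target series. It is an illustration only, not the implementation of Ginsberg et al. (2008): it uses a single target series rather than the mean across nine regions, skips the per-term model fit, and the function names (`pearson`, `rank_terms`) and all data values are hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

def rank_terms(term_series, target, n):
    """Return the n terms whose search series correlate most with the target."""
    scores = {term: pearson(series, target) for term, series in term_series.items()}
    return sorted(scores, key=scores.get, reverse=True)[:n]

# Made-up weekly values, for illustration only.
target = [1.0, 2.0, 4.0, 3.0, 1.5]            # e.g. an illness rate
candidates = {
    "flu symptoms": [10, 21, 39, 31, 14],
    "fever": [5, 9, 12, 13, 6],
    "cat videos": [30, 28, 31, 29, 30],
}
top_terms = rank_terms(candidates, target, 2)
print(top_terms)
```

With tens of millions of candidate terms this scan becomes the expensive step, which is the cost that a crowd-generated shortlist of terms avoids.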

Crowdsourcing and word association game

The fundamental idea behind using search query data for prediction is that it reflects the cumulative actions of people and, as a result, captures changes in their behavior over time. Because this data originates in the crowd, it is reasonable to assume that we can use the crowd to better understand the keyword generation process that leads to search queries. Specifically, as search behavior can be used to reveal consumers' intentions (Moe and Fader, 2004), this understanding will improve the classification of search patterns for different consumption activities.

Crowdsourcing is the act of harnessing a distributed network of individuals to solve a problem or perform a function that was once performed by employees (Brabham 2008; Howe 2006). In recent years, the use of crowdsourcing has accelerated in many fields, including capturing new product ideas and innovations (Bayus 2013), generating accurate image tags (Von Ahn 2006), improving image search (Yan et al. 2010), and even solving scientific problems (Lakhani et al. 2007). The benefits of crowdsourcing stem from its scale and diversity, which provide a variety of user backgrounds, levels of expertise, and other demographics at low cost. We followed this stream of research and leveraged the crowd to generate relevant keywords for prediction and early detection of events with search volume data.

One of the challenges of crowdsourcing is how to engage the crowd in a meaningful and productive way (Boudreau et al. 2013). As noted by Von Ahn (2006), an online game environment is an effective technique for capturing crowd knowledge and may provide reliable information without any supplementary verification of the users' answers. In addition, as shown by Snow et al. (2008), aggregating results for the same task from multiple non-expert individuals can generate results at the same level as those created by experts.

In this paper, we used a crowdsourcing game environment and designed a word association game to capture people's ideas of focal phrases. We aggregated the associated terms, collected search data for each of the most frequently mentioned terms, and included them in the prediction model. To the best of our knowledge, our work is the first to study how the combination of word association with search data can improve prediction and early detection accuracy.

Methodology and Evaluation

We studied how a crowd-based word association game can improve the generation of useful search terms, thereby improving trend predictions. We employed the Amazon Mechanical Turk platform, an online marketplace for tasks that require human intelligence (that is, tasks that a human can answer easily but that are computationally costly to solve algorithmically). Workers (known as Turkers) are paid small amounts of money to complete small tasks (called HITs, Human Intelligence Tasks). The platform allows randomized assignment of tasks to multiple Turkers and provides control over the completion of each task.

Word association

We introduced a technique that uses human workers to help find relevant keywords in a game-like environment. Specifically, we implemented a word association game (also known as free association) in which workers are asked to submit related phrases. Word association is a task that requires participants to spontaneously provide a word or a phrase that is related to a presented word (known as the cue). Word association taps into one's lexical knowledge, which is based on real-world experience (Nelson et al. 2004), and has been shown to be important in predicting cued recall (Nelson et al. 1998). This task is used in everyday activities as a means of collecting thoughts (Nelson et al. 2000).

Word association provides an index of the probability that words are related to the cue term. This information was found to be consistent across different people in the same culture (Nelson et al. 1998). In the context of web search, one may use word association to determine effective search queries. Because the associated terms are represented consistently across people, they may reflect broader search patterns and therefore assist in measuring current events and predicting future activities.

Another benefit of the word association technique is that it yields a power-law distribution of term associations: most associations relate to proximal terms, and a few associations connect to more distant terms. This allows us to capture terms that are more widely spread and less correlated with each other; thus, they may have more explanatory power when combined with search data.

Keywords association game design

We designed and built an online word association website specifically for this study. The website provides a single page with short instructions and one phrase (the cue term). Five text boxes were shown for participants to fill in with their associated terms (an illustration of this game is presented in Figure 1). The appearance of the website was planned to simulate a common game environment, and participants were told neither the purpose of the game nor how the terms would be used afterward. Each Turker (participant) was shown a single phrase and was asked to provide 5 terms or phrases that come to mind when seeing this phrase. Each Turker was paid 5 cents ($0.05) for completing the game. The average duration of a game instance was 46 seconds (including answering three demographic questions).

[Figure 1: game screen showing the prompt "Please write 5 terms (one word or more) that come to mind when you see the word" followed by the cue term.]

Figure 1. An illustration of the online word association game. The word "Flu" is the focal phrase, and Turkers were asked to write 5 terms or phrases that come to mind when seeing the focal phrase.

We aggregated the game results and generated a list of the top 10 terms associated with each cue phrase (Appendix A includes the top 10 terms by cue). We used this set of terms as the list of relevant query terms that accurately reflect actual search queries. For each term, we collected its search query volume over time and included the search data in the forecasting method.

Evaluation

To validate our methodology, we applied it to the same kinds of data and prediction tasks reported in three different domains. We replicated the tasks reported in three well-known related studies: Ginsberg et al. (2008) in the influenza outbreak detection domain, Wu and Brynjolfsson (2009) in real estate market predictions, and Choi and Varian (2012) for predictions of unemployment levels. To allow an impartial comparison, we intentionally limited ourselves to the exact performance measures, data sets, and time periods that these studies used. We compared our models with the prediction models reported in each of those papers and with a baseline model when one was used in the original comparison.

It is important to note that we are not suggesting a new forecasting method but rather introducing a new technique for generating relevant input variables to be included in any forecasting model that uses search query data. If our methodology is valid, we expect it to obtain predictive accuracy that is at least as good as the predictive accuracy reported in these studies.

Influenza epidemics

The first data that we used to validate our methodology is the flu outbreak data from the CDC. This type of data was used by Ginsberg et al. (2008) for constructing an early detection system for influenza epidemics. Specifically, the dependent variable in their study was the weekly ILI (Influenza-Like Illness) factor reported by the U.S. Centers for Disease Control (CDC). To select the search terms for the prediction model, they used Google's internal data concerning the 50 million most popular search terms, from which they selected the top n terms by calculating each term's correlation with the dependent variable. Subsequently, they used the selected terms to fit a linear model that generates the predictions. Their method was highly successful for this application, reaching an out-of-sample mean correlation of 0.97 across U.S. regions. Nevertheless, it is impossible to use a similar methodology without access to Google's proprietary data, since Google does not allow external access to search trends data for more than several hundred search terms a day.

In this study, we used U.S.-level data between January 2005 and the week commencing on March 11, 2007. We validated our modeling using out-of-sample data from March 18, 2007 to May 11, 2008; this is the same out-of-sample validation period used by Ginsberg et al.

Using the word association settings described above, we asked 100 Turkers (62% female, average age 31.8) to play the online game, where the task description was "please write 5 terms that come to mind when seeing the word Flu" (see Appendix A for the top 10 list of associated words generated by the Turkers). The resulting set of distinct associated phrases was very large. However, any single phrase may represent only one person's idiosyncratic thinking rather than a common association, and so may not reflect others' search patterns. As shown by Snow et al. (2008), an aggregation of results from multiple individuals can generate high-quality results. We therefore restricted the analysis to the top 10 most popular association phrases. For each phrase, we collected the weekly search index from Google Trends. This search index is the share of searches at time t (typically a week or a month) relative to the total search volume across the time period. We limited our results to queries in the United States to match the predicted variable, flu outbreaks in the U.S. [2]

[2] We excluded data from 2003 since Google Trends provides data only from 2004.
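The aggregation step described above, tallying all participants' answers and keeping the most frequently mentioned terms, can be sketched as follows. This is a minimal illustration under stated assumptions: the paper does not say how answers were normalised or how ties were broken, so the lower-casing, the once-per-participant counting, and the alphabetical tie-break are choices made for this sketch, and the sample responses are invented.

```python
from collections import Counter

def top_terms(responses, n=10):
    """Count how many participants mentioned each term; return the n most common.

    responses: one list of free-text answers per participant.
    """
    counts = Counter()
    for answers in responses:
        # Assumption: normalise case/whitespace and count a term once per participant.
        normalised = {a.strip().lower() for a in answers if a.strip()}
        counts.update(normalised)
    # Sort by descending count, then alphabetically so ties are deterministic.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:n]

# Invented answers to the cue "Flu" from three hypothetical participants.
responses = [
    ["Sick", "fever", "cough", "shot", "virus"],
    ["fever", "sick ", "vaccine", "cold", "winter"],
    ["cold", "Fever", "sick", "cough", "bed"],
]
for term, count in top_terms(responses, n=4):
    print(term, count, round(count / len(responses), 2))  # share of participants
```

The share printed in the last column is one simple way to express association strength: the fraction of participants who produced the term for the cue.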

Specifically, we used the following prediction model:

Influenza epidemics model:

$$ ILI_t = \beta_0 + \sum_{i=1}^{10} \beta_i \, AssociatedTerm_{i,t} + \varepsilon_t \qquad (1) $$

where ILI_t is the percentage of Influenza-Like Illness at time t as reported by the Centers for Disease Control and Prevention (CDC), and AssociatedTerm_{i,t} is the search trends value at time t for the association-based term i (i = 1..10) in the aggregated results of the word association game for influenza.

We first compared the results of our model for the same time period reported by Ginsberg et al. (2008). The training set included 167 weeks from 2004 to 2007. We validated our model on untested data from March 18, 2007 to May 11, 2008. Our predictions achieved a similar level of out-of-sample correlation (0.973) in predicting the ILI (compared to 0.97 in Ginsberg et al.).

Although the results appear similar, the amount of data and computation behind each model differs enormously. Ginsberg et al. used 50 million different searches and 450 million different models to generate the final model with 45 queries. The computation of this process employed hundreds of machines using a distributed computing framework. Our method is based on 100 online users, each of whom played a game for less than one minute. Our final model was a single model that included only the top 10 search terms.

For robustness, we extended our predictions and validated our model on the most recent available influenza data, from the first week of April 2012 to the last week of March 2013. We compared our results, based on a prediction model whose latest training data is from 2007, with the flu trend early detection data provided by the Google Flu Trends website. This website provides flu outbreak detection on an ongoing basis, using the methodology suggested by Ginsberg et al. Here, our results show a significant improvement in the correlation level, [...] compared to [...] for the Google Flu Trends results.

Figure 2 shows a comparison of our model predictions with the actual reported ILI data from the CDC over the two time periods described above. Looking at the 2012-2013 period, and specifically December 2012 to February 2013, our model generated predictions that matched the actual duration of the influenza outbreak better than the Google Flu Trends model.

To summarize, these results suggest that with considerably less computation power and with a smaller set of initial candidate search query terms, association-based search terms generate equivalent or better results than the brute-force technique reported in previous papers.
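As a rough illustration of how a model of the form of equation (1) can be fitted and then scored by out-of-sample correlation, the sketch below fits an ordinary least-squares regression on a training period and correlates its predictions with held-out values. It is a sketch under assumptions, not the authors' code: it uses two terms instead of ten, solves the normal equations directly with a small hand-written solver (`solve`, `fit_ols`, `predict`, and `pearson` are hypothetical helper names), and all numbers are synthetic.

```python
from math import sqrt

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def fit_ols(rows, y):
    """Least-squares coefficients [b0, b1, ..., bk] from the normal equations."""
    X = [[1.0] + list(r) for r in rows]          # prepend the intercept column
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)

def predict(coefs, rows):
    return [coefs[0] + sum(c * v for c, v in zip(coefs[1:], r)) for r in rows]

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))

# Synthetic search-index values for two terms; not data from the paper.
train_x = [(20, 10), (40, 15), (60, 30), (80, 20), (30, 50), (50, 40)]
train_y = [1 + 0.05 * a + 0.02 * b for a, b in train_x]
test_x = [(25, 35), (70, 10), (45, 45)]
test_y = [3.05, 4.60, 4.20]                      # held-out "observed" values, also made up

coefs = fit_ols(train_x, train_y)                # fit on the training period only
r = pearson(predict(coefs, test_x), test_y)      # out-of-sample correlation
print([round(c, 3) for c in coefs], round(r, 3))
```

The key point of the design is that the coefficients are estimated once on the training weeks and never refitted on the validation weeks, so the correlation measures genuine out-of-sample agreement.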

Figure 2. A comparison of the Crowd-Squared model predictions with actual reported ILI and Ginsberg et al. (2008)/Google Flu Trends, over two separate periods.

Housing indicators

The real estate market is traditionally used as a good indicator of a country's economy. Housing activities both reflect individuals' financial situations and influence the country's economic growth by generating or eliminating real estate jobs and services. Hence, predictions of real estate indexes have become a common and important tool for policy makers and for industries that rely on these activities.

This type of data was used by Wu and Brynjolfsson (2009) for predictions of the real estate market and its complementary businesses (e.g., home appliances). The main predicted variable they used was the volume of housing sales in the U.S. [4] from the 4th quarter of 2007 to the 2nd quarter of [...]. Instead of selecting individual search terms for the prediction model, they used two predefined search categories available from Google's black-box category classifier, "Real Estate" and "Real Estate Agent," and used their search trends indexes in the prediction model. Wu and Brynjolfsson used a seasonal autoregressive model and performed an in-sample evaluation of their model using adjusted R². They compared their model with the baseline model presented in equation (2).

[4] Provided by the National Association of Realtors.

Real estate indicator models:

$$ HomeSales_{j,t} = \alpha + \beta_1 HomeSales_{j,t-1} + \beta_2 HPI_{j,t-1} + S_j + T_j + \varepsilon_{j,t} \qquad (2) $$

$$ HomeSales_{j,t} = \alpha + \beta_1 HomeSales_{j,t-1} + \beta_2 HPI_{j,t-1} + \sum_{i=1}^{10} \gamma_i \, AssociatedTerm_{i,t} + S_j + T_j + \varepsilon_{j,t} \qquad (3) $$

where HomeSales_{j,t} is the volume of home sales in state j at time t, as reported by the National Association of Realtors; HPI_{j,t-1} is the house price index of state j at time t-1, as reported by the Federal Housing Finance Agency; AssociatedTerm_{i,t} is the search trends value at time t for the association-based term i (i = 1..10) in the aggregated results of the word association game for real estate; S_j is a state-level fixed effect; and T_j is a quarter dummy variable.

We followed their forecasting method and used the autoregressive model presented in equation (3). As in the influenza epidemic predictions, we asked 100 participants (53% female, average age 30.6) to play the word association game, where the task description was "please write 5 terms that come to mind when seeing the phrase Buying a House" (see Appendix A for a list of the top 10 terms associated by participants).

The baseline model (equation 2 above) reported by Wu and Brynjolfsson displayed a good fit, with an adjusted R² of [...]. Our model resulted in an adjusted R² of [...], higher than the highest result reported for their prediction models (0.984).
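For readers unfamiliar with the in-sample fit measure used in this comparison, the sketch below computes adjusted R² from actual and fitted values using the standard textbook definition, 1 - (1 - R²)(n - 1)/(n - k - 1). The function name and the sample numbers are hypothetical, and counting k as the number of estimated regressors excluding the intercept is an assumption of this sketch rather than something the paper specifies.

```python
def adjusted_r2(actual, fitted, n_predictors):
    """Adjusted R-squared for a model with n_predictors regressors plus an intercept."""
    n = len(actual)
    mean = sum(actual) / n
    ss_res = sum((a - f) ** 2 for a, f in zip(actual, fitted))   # unexplained variation
    ss_tot = sum((a - mean) ** 2 for a in actual)                # total variation
    r2 = 1 - ss_res / ss_tot
    # Penalise R-squared for the number of regressors relative to the sample size.
    return 1 - (1 - r2) * (n - 1) / (n - n_predictors - 1)

# Made-up values, for illustration only.
actual = [1.0, 2.0, 3.0, 4.0, 5.0]
fitted = [1.1, 1.9, 3.2, 3.8, 5.0]
print(adjusted_r2(actual, fitted, n_predictors=1))
```

Because the penalty grows with the number of regressors, adding ten search-term variables to a baseline raises adjusted R² only if they explain enough additional variation to offset it.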

Initial claims for unemployment benefits

The third set of data involves early estimation of the volume of initial claims for unemployment benefits. This economic index is published by the U.S. Department of Labor each Thursday for the previous (Sunday to Saturday) week and is considered an important measure of the state of the U.S. economy. [5] Early estimation of initial claims for unemployment using search trends data has been reported by Choi and Varian (2012). Nevertheless, they also report that a simple baseline model, presented in equation (4), performs very well, to the point that linear regression estimates seem to indicate random walk (with drift) behavior.

$$ UIC_t = \beta_0 + \beta_1 UIC_{t-1} + \varepsilon_t \qquad (4) $$

where UIC_t is the logarithm of the seasonally adjusted volume of initial claims for unemployment for week t.

Choi and Varian created a prediction model that incorporated both baseline information (seasonally adjusted initial claims for the previous week) and (seasonally adjusted) search trends for the current week based on Google's predefined categories of "Jobs" and "Welfare...Unemployment," as identified by Google's automated category classifier. They evaluated this model out-of-sample using a one-week-ahead rolling prediction (i.e., using the data up until week t-1 to train a model and measuring its performance on week t) during a time period between January 2004 and July [...].

[5] Historical data is available at [...]

While their model was able to generate relatively accurate predictions of economic turning points, its overall Mean Absolute Error (MAE) was 3.68%, whereas the MAE for the strong baseline model was 3.37%. This result suggests that the search trends data, based on the predefined categories, may have contained information that mostly overlapped with the information in the previous week's claims data, in addition to some noise that may have harmed the out-of-sample predictive accuracy.

We asked 100 participants (54% female, average age 32.8) to play the word association game, where the task description was "please write 5 terms that come to mind when seeing the phrase Unemployment" (see Appendix A for a list of the top 10 terms associated by participants). We used this list of the top 10 associated terms and reran a simple linear regression model as detailed in equation (5).

$$ UIC_t = \beta_0 + \beta_1 UIC_{t-1} + \sum_{i=1}^{10} \gamma_i \, AssociatedTerm_{i,t} + \varepsilon_t \qquad (5) $$

where UIC_t is the logarithm of the seasonally adjusted volume of initial claims for unemployment for week t and AssociatedTerm_{i,t} is the search trends value for the association-based term i (i = 1..10). [6]

[6] Associated terms were seasonally adjusted by subtracting the weekly average value for each term.

We applied this model using a similar one-step-ahead prediction procedure and a similar time period as in Choi and Varian (2012) (see Figure 3 for a comparison of the model predictions and actual unemployment claims data). Our prediction model obtained an out-of-sample MAE of 3.42%. While this value is not as good as the MAE of the strong baseline model (3.37%), our predictive accuracy was better than the one reported by Choi and Varian (2012) (3.68%). This suggests that the association-based search terms contained less noise than the search volume identified by Google's automated classifier.

Figure 3. A comparison of the association-based model predictions with the actual reported claims for unemployment published by the U.S. Department of Labor.
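The one-week-ahead rolling evaluation used for this comparison can be sketched as follows for the baseline of equation (4). This is an illustration under assumptions, not the procedure of Choi and Varian (2012) or of this paper in detail: it refits a one-regressor line at every step using only past weeks, omits the search-term regressors of equation (5), and uses hypothetical helper names (`fit_line`, `rolling_mae`) and made-up log-claims values.

```python
def fit_line(xs, ys):
    """Simple least-squares fit ys = b0 + b1 * xs; returns (b0, b1)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b1 = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b1 * mx, b1

def rolling_mae(series, min_train=4):
    """Mean absolute error of one-step-ahead forecasts from a rolling AR(1) fit."""
    errors = []
    for t in range(min_train, len(series)):
        history = series[:t]                              # data up to week t-1 only
        b0, b1 = fit_line(history[:-1], history[1:])      # regress each week on its lag
        forecast = b0 + b1 * history[-1]                  # predict week t
        errors.append(abs(series[t] - forecast))
    return sum(errors) / len(errors)

# Made-up weekly values of log initial claims, for illustration only.
log_claims = [12.90, 12.94, 12.91, 12.97, 13.02, 12.99, 13.05, 13.08]
mae = rolling_mae(log_claims)
print(round(100 * mae, 2))  # on a log scale, 100 * MAE approximates a percentage error
```

Refitting at every step with only past data is what keeps the forecast for week t free of information from week t itself; extending the sketch to equation (5) would replace `fit_line` with a multiple regression that also takes the current week's term values.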

Discussion

Remarkably accurate predictions can be made by analyzing the aggregate search activities of the crowd. However, this prediction methodology has been hampered by the lack of an effective method for selecting query terms that are accurately associated with the predicted event. In this paper, we present a new crowd-based method for selecting relevant search terms that correspond to the underlying phenomenon and facilitate accurate early detection or prediction of real-world phenomena with the use of aggregated search data.

We study how to improve the keyword selection process by using a crowd-based online game. In particular, we used a word association game design to collect workers' associative thoughts about a focal phrase, a process that imitates the choice of search terms when using search engines. Thus, we use one crowd to select the terms that a larger crowd will search for when seeking information about the phenomenon we seek to predict.

We performed three online games on three different topics, asking participants to provide phrases that come to mind when they see a specific phrase. We find that a word association game can effectively generate a set of relevant keywords that, when combined with search volume data, produces better predictions, or equally good predictions at lower cost. These results show that our methodology can be successfully applied to several domains, which exemplifies its robustness.

Managerial implications

Obtaining accurate measures of current events and predictions of future activities is one of the key challenges for managers and policy makers. The use of search data has been shown to provide reliable estimates; however, its application in business has been hindered by the limitations of current term-selection methods. We argue that our new method extends the applicability of search data for predictions, especially when the exact relevant keywords are unknown. Even when some prior knowledge exists, our method can generate new related terms that can potentially improve predictive accuracy. Further, due to its simplicity and low cost, forecasts can be updated periodically to support strategic decisions.

This method can be used for both short-term and long-term decisions. For example, better measurements of current demand trends can assist in shipment routing and planning of marketing activities in the short term. It may also assist in early detection of problems in current products or services, as the relevant phrases will appear in the word association results. With respect to long-term decisions, the word association and search trends may assist in production planning and, more interestingly, may reveal consumers' needs for changes in the product or for new products. Overall, in an era of increasing volumes of big data, our approach allows for simple and low-cost filtering of relevant information that can be used in the measurement and prediction of business activities.

Limitations and future research

While the use of search volume data has been shown to improve prediction models, it is important to note that people who perform online searches are not necessarily a representative sample of the population. For instance, elderly people and people with low income tend to use the Internet less often, which could lead to inaccurate predictions in some domains. In addition, due to privacy constraints, Google makes search volume data available only when the number of searches for a specific term reaches a threshold that prevents the identity of those who performed the searches from being discerned from the aggregated data. As a result, small-scale phenomena, or events that occur in areas with low population density, will not be captured by these search tools. Similarly, crowd-based tools do not draw on a representative sample of the population and may not be suitable for areas with low population or a low level of technology adoption.

Appendix A

List of Aggregated Associated Terms

Table 1. Top 10 Associated Terms by Cue Term

Influenza                       Housing Sales                     Unemployment
Term          Association       Term            Association       Term          Association
              strength*                         strength                        strength
sick          53%               mortgage        50%               poor          20%
fever         47%               expensive       18%               money         20%
cold          19%               realtor         18%               jobless       16%
cough         18%               location        16%               depression    16%
contagious    15%               money           14%               broke         12%
germs         11%               loan            12%               homeless      12%
shot          10%               agent           8%                bills         10%
vaccine       10%               interest rate   8%                no money      10%
influenza     9%                real estate     8%                Sad           10%

* Association strength is the percentage of participants providing this word.
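The association strength defined in the table note (the percentage of participants providing a given word) can be computed along the following lines. This is an illustrative sketch rather than the authors' procedure: the function names `association_strength` and `top_terms` are hypothetical, the sample responses are made up, and the case and whitespace normalisation is an assumption about how raw answers might be cleaned.

```python
from collections import Counter

def association_strength(responses):
    """Map each term to the share of participants who gave it, highest share first.

    `responses` holds one list of terms per participant.
    """
    n = len(responses)
    counts = Counter()
    for terms in responses:
        # Normalise case and whitespace (an assumption), and count each term
        # at most once per participant.
        counts.update({t.strip().lower() for t in terms if t.strip()})
    return {term: c / n for term, c in counts.most_common()}

def top_terms(responses, k=10):
    """Return the k terms with the highest association strength."""
    return list(association_strength(responses).items())[:k]

# Made-up responses from four participants to the cue "Influenza".
responses = [
    ["sick", "fever"],
    ["Sick", "cold"],
    ["sick ", "fever"],
    ["cough"],
]
for term, share in top_terms(responses, k=2):
    print(f"{term}: {share:.0%}")
```

Counting each term once per participant keeps the measure a share of participants rather than a share of all answers, which matches the definition given in the table note.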

References

Bayus, B. L. "Crowdsourcing New Product Ideas over Time: An Analysis of the Dell IdeaStorm Community," Management Science (59:1).
Boudreau, K. J., and Lakhani, K. R. "Using the Crowd as an Innovation Partner," Harvard Business Review (91:4).
Brabham, D. C. "Crowdsourcing as a model for problem solving: an introduction and cases," Convergence: The International Journal of Research into New Media Technologies (14:1).
Choi, H., and Varian, H. 2012. "Predicting the present with Google Trends," Economic Record (88:s1), pp. 2-9.
D'Amuri, F., and Marcucci, J. "The predictive power of Google searches in forecasting unemployment," Bank of Italy Temi di Discussione (Working Paper) No. 891.
Du, R. Y., and Kamakura, W. A. "Quantitative Trendspotting," Journal of Marketing Research (49:4).
Ginsberg, J., Mohebbi, M. H., Patel, R. S., Brammer, L., Smolinski, M. S., and Brilliant, L. "Detecting influenza epidemics using search engine query data," Nature (457:7232).
Goel, S., Hofman, J. M., Lahaie, S., Pennock, D. M., and Watts, D. J. "Predicting consumer behavior with Web search," Proceedings of the National Academy of Sciences (107:41).
Howe, J. "The rise of crowdsourcing," Wired magazine (14:6), pp. 1-4.
Lakhani, K. R., Jeppesen, L. B., Lohse, P. A., and Panetta, J. A. The Value of Openness in Scientific Problem Solving, Division of Research, Harvard Business School.
McAfee, A., and Brynjolfsson, E. "Big data: the management revolution," Harvard Business Review, October 2012.
Nelson, D. L., McEvoy, C. L., and Dennis, S. "What is free association and what does it measure?," Memory & Cognition (28:6).
Nelson, D. L., McEvoy, C. L., and Schreiber, T. A. "The University of South Florida free association, rhyme, and word fragment norms," Behavior Research Methods, Instruments, & Computers (36:3).
Nelson, D. L., McKinney, V. M., Gee, N. R., and Janczura, G. A. "Interpreting the influence of implicitly activated memories on recall and recognition," Psychological Review (105:2), p. 299.
Seebach, C., Pahlke, I., and Beck, R. "Tracking the Digital Footprints of Customers: How Firms can Improve Their Sensing Abilities to Achieve Business Agility," ECIS 2011 Proceedings.
Snow, R., O'Connor, B., Jurafsky, D., and Ng, A. Y. 2008. "Cheap and fast but is it good?: evaluating non-expert annotations for natural language tasks," Proceedings of the Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics.
Von Ahn, L. "Games with a purpose," Computer (39:6).
Vosen, S., and Schmidt, T. "Forecasting private consumption: survey based indicators vs. Google trends," Journal of Forecasting (30:6).
Wu, L., and Brynjolfsson, E. "The future of prediction: how Google searches foreshadow housing prices and quantities," Proceedings of the 30th International Conference on Information Systems, paper 147, Phoenix, Arizona.
Yan, T., Kumar, V., and Ganesan, D. 2010. "Crowdsearch: exploiting crowds for accurate real-time image search on mobile phones," Proceedings of the 8th International Conference on Mobile Systems, Applications, and Services, ACM.


Get Google AdWords Traffic With Almost No Out Of Pocket Cost!

Get Google AdWords Traffic With Almost No Out Of Pocket Cost! Price: $49.00 Get Google AdWords Traffic With Almost No Out Of Pocket Cost! www.nocostpayperclick.com NOTICE: This is not a free book. It is a $49 manual that is published by www.nocostpayperclick.com,

More information

DIGITAL MARKETING SERVICES

DIGITAL MARKETING SERVICES DIGITAL MARKETING SERVICES We take a custom approach to digital marketing. Unlike high-volume agencies we recognize that every client (and every project) is different. Rather than attempt to fit your project

More information

Five Steps to Optimizing an ecommerce Site for Search Engines

Five Steps to Optimizing an ecommerce Site for Search Engines Five Steps to Optimizing an ecommerce Site for Search Engines A Systematic Approach to Implementing SEO on an ecommerce Website Whitepaper Written By: Tom Kuthy, Search Engine Optimization Expert, WSI

More information

Design of an FX trading system using Adaptive Reinforcement Learning

Design of an FX trading system using Adaptive Reinforcement Learning University Finance Seminar 17 March 2006 Design of an FX trading system using Adaptive Reinforcement Learning M A H Dempster Centre for Financial Research Judge Institute of Management University of &

More information

Best Practices for Log File Management (Compliance, Security, Troubleshooting)

Best Practices for Log File Management (Compliance, Security, Troubleshooting) Log Management: Best Practices for Security and Compliance The Essentials Series Best Practices for Log File Management (Compliance, Security, Troubleshooting) sponsored by Introduction to Realtime Publishers

More information

Impact. How to choose the right campaign for maximum effect. RKM Research and Communications, Inc., Portsmouth, NH. All Rights Reserved.

Impact. How to choose the right campaign for maximum effect. RKM Research and Communications, Inc., Portsmouth, NH. All Rights Reserved. Impact How to choose the right campaign for maximum effect RKM Research and Communications, Inc., Portsmouth, NH. All Rights Reserved. Executive summary Advertisers depend on traditional methods of testing

More information

Here s your full marketing OS. Reimagined.

Here s your full marketing OS. Reimagined. Here s your full marketing OS. Reimagined. We believe advertising should be personal across every connected device and that marketers should focus on attracting customers instead of managing the complexity

More information

INSIGHTS WHITEPAPER What Motivates People to Apply for an MBA? netnatives.com twitter.com/netnatives

INSIGHTS WHITEPAPER What Motivates People to Apply for an MBA? netnatives.com twitter.com/netnatives INSIGHTS WHITEPAPER What Motivates People to Apply for an MBA? netnatives.com twitter.com/netnatives NET NATIVES HISTORY & SERVICES Welcome to our report on using data to analyse the behaviour of people

More information

Using Artificial Intelligence to Manage Big Data for Litigation

Using Artificial Intelligence to Manage Big Data for Litigation FEBRUARY 3 5, 2015 / THE HILTON NEW YORK Using Artificial Intelligence to Manage Big Data for Litigation Understanding Artificial Intelligence to Make better decisions Improve the process Allay the fear

More information

Internet Marketing Proposal

Internet Marketing Proposal Internet Marketing Proposal Prepared For: [COMPANY NAME] Prepared By: Mike Hence CEO Zklld Zklld zklld.com info@zklld.com Our Expertise Our expertise involves the three interrelated disciplines including

More information

COMBINING THE METHODS OF FORECASTING AND DECISION-MAKING TO OPTIMISE THE FINANCIAL PERFORMANCE OF SMALL ENTERPRISES

COMBINING THE METHODS OF FORECASTING AND DECISION-MAKING TO OPTIMISE THE FINANCIAL PERFORMANCE OF SMALL ENTERPRISES COMBINING THE METHODS OF FORECASTING AND DECISION-MAKING TO OPTIMISE THE FINANCIAL PERFORMANCE OF SMALL ENTERPRISES JULIA IGOREVNA LARIONOVA 1 ANNA NIKOLAEVNA TIKHOMIROVA 2 1, 2 The National Nuclear Research

More information

COS 116 The Computational Universe Laboratory 9: Virus and Worm Propagation in Networks

COS 116 The Computational Universe Laboratory 9: Virus and Worm Propagation in Networks COS 116 The Computational Universe Laboratory 9: Virus and Worm Propagation in Networks You learned in lecture about computer viruses and worms. In this lab you will study virus propagation at the quantitative

More information

THE STATE OF Social Media Analytics. How Leading Marketers Are Using Social Media Analytics

THE STATE OF Social Media Analytics. How Leading Marketers Are Using Social Media Analytics THE STATE OF Social Media Analytics May 2016 Getting to Know You: How Leading Marketers Are Using Social Media Analytics» Marketers are expanding their use of advanced social media analytics and combining

More information

Using the Amazon Mechanical Turk for Transcription of Spoken Language

Using the Amazon Mechanical Turk for Transcription of Spoken Language Research Showcase @ CMU Computer Science Department School of Computer Science 2010 Using the Amazon Mechanical Turk for Transcription of Spoken Language Matthew R. Marge Satanjeev Banerjee Alexander I.

More information

Testing Big data is one of the biggest

Testing Big data is one of the biggest Infosys Labs Briefings VOL 11 NO 1 2013 Big Data: Testing Approach to Overcome Quality Challenges By Mahesh Gudipati, Shanthi Rao, Naju D. Mohan and Naveen Kumar Gajja Validate data quality by employing

More information

Streamline your supply chain with data. How visual analysis helps eliminate operational waste

Streamline your supply chain with data. How visual analysis helps eliminate operational waste Streamline your supply chain with data How visual analysis helps eliminate operational waste emagazine October 2011 contents 3 Create a data-driven supply chain: 4 paths to insight 4 National Motor Club

More information

Keywords Big Data; OODBMS; RDBMS; hadoop; EDM; learning analytics, data abundance.

Keywords Big Data; OODBMS; RDBMS; hadoop; EDM; learning analytics, data abundance. Volume 4, Issue 11, November 2014 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Analytics

More information

Making Sense of the Mayhem: Machine Learning and March Madness

Making Sense of the Mayhem: Machine Learning and March Madness Making Sense of the Mayhem: Machine Learning and March Madness Alex Tran and Adam Ginzberg Stanford University atran3@stanford.edu ginzberg@stanford.edu I. Introduction III. Model The goal of our research

More information

The Formula for Small Business Internet Marketing Success

The Formula for Small Business Internet Marketing Success The Formula for Small Business Internet Marketing Success Being Seen on Google, Generating Leads, Closing Business Success on Google is first being seen, second generating interest in your company in the

More information

Recommendations for Performance Benchmarking

Recommendations for Performance Benchmarking Recommendations for Performance Benchmarking Shikhar Puri Abstract Performance benchmarking of applications is increasingly becoming essential before deployment. This paper covers recommendations and best

More information

HOW TO ACCURATELY TRACK YOUR SOCIAL MEDIA BUZZ

HOW TO ACCURATELY TRACK YOUR SOCIAL MEDIA BUZZ TIP SHEET HOW TO ACCURATELY TRACK YOUR SOCIAL MEDIA BUZZ Ten years ago, marketers had to rely primarily on customer surveys and mainstream media coverage to track the buzz created by a new product launch

More information

Business Process Services. White Paper. Social Media Influence: Looking Beyond Activities and Followers

Business Process Services. White Paper. Social Media Influence: Looking Beyond Activities and Followers Business Process Services White Paper Social Media Influence: Looking Beyond Activities and Followers About the Author Vandita Bansal Vandita Bansal is a subject matter expert in Analytics and Insights

More information

IBM Social Media Analytics

IBM Social Media Analytics IBM Analyze social media data to improve business outcomes Highlights Grow your business by understanding consumer sentiment and optimizing marketing campaigns. Make better decisions and strategies across

More information