(Big) Data Analytics: From Word Counts to Population Opinions



Transcription:

(Big) Data Analytics: From Word Counts to Population Opinions
Mark Keane
Insight@University College Dublin
October 2014 ~ RSS ~ Edinburgh

September 2014/EPIC 2


Outline
- What's New About (Big) Data Analytics
- 3 Sample Cases:
  » Google Queries Predicting Epidemics
  » Networks of Influence
  » Financial Opinions in a Stock Market Bubble
- Take-Home Messages
October 2014/RSS-Edinburgh 7

What's New?

Four Vs of Big Data

What's New?
The suggestion of a brave new world of (new) data analysis that can:
- Handle vast amounts of data effortlessly, with
- Instant press-of-a-button answers, from
- Vast server farms of (almost free) computation
But there are significant issues, and there is a lot that is old (familiar).

What's Old?
Good old-fashioned data analysis:
- Many statistical ideas are very familiar
- Many research problems are familiar
- Proper collection of data is important
- Proper treatment of data is critical

What's Really New?
An approach tipping-point with:
- Very large data sets
  » from 100s to 1,000,000,000s of data points
- Unusual types of data
  » video, text, thumbs-up, unstructured data
- Non-standard data sources
  » social media (FB, tweets), news, phones
- Data that is not conventionally measured
  » the sensing devices are doing other things

In this New Big-Data World!
- Who we know says a lot about who we are
  » Facebook friends, LinkedIn network, Twitter followers
- What we write says a lot about what we think
  » text in books, news, blogs, social media and so on
- Where we are located says a lot about us
  » location-based sensing, GPS, IP addresses
- What we do says a lot about our decisions/interests
  » what we buy, websites visited, YouTube videos watched, news retweeted, items shared and so on

Three Sample Cases

Finding Flu Outbreaks

Case 1: Predicting Flu from Searches
- Google Flu Trends (GFT): aggregates search data, counting influenza keywords
- US Centers for Disease Control and Prevention (CDC): tracks influenza-like illnesses (ILIs) in outpatient data
- From 2003 to 2009, GFT showed high correlations with ILI statistics (ILINet), until the 2009 influenza virus A (H1N1) pandemic [pH1N1]
Cook, S., Conrad, C., Fowlkes, A. L., & Mohebbi, M. H. (2011). Assessing Google Flu Trends performance in the United States during the 2009 influenza virus A (H1N1) pandemic. PLoS ONE, 6, e23610.

Good Correlations (Initially)
Cook, S., Conrad, C., Fowlkes, A. L., & Mohebbi, M. H. (2011). Assessing Google Flu Trends performance in the United States during the 2009 influenza virus A (H1N1) pandemic. PLoS ONE, 6, e23610.

Hang on a sec...
In 2009, Google modified its model with new search terms

The Message
- What we do says a lot about our concerns
  » if I think I have flu, I am looking it up on Google
- Here, people's illness is being defined by their search behaviour and keywords
- Population behaviour can be predicted (in locations) by aggregating these searches
- But proper treatment of data is critical (keywords, normalising)
  » a model of what leads a user to use a certain search term
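The aggregation and normalising steps mentioned above can be sketched roughly as follows. This is only an illustration, not the Google Flu Trends method: the keyword set, the `(region, week, query_text)` record layout, and the function name `flu_query_share` are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical keyword list; the actual GFT search terms are not given in the slides.
FLU_KEYWORDS = {"flu", "influenza", "fever"}


def flu_query_share(queries):
    """Return {(region, week): share of queries that mention a flu keyword}.

    `queries` is an iterable of (region, week, query_text) tuples.
    Dividing by the total query count is the normalising step: it stops
    busy regions or busy weeks from looking sicker simply because more
    searches were made there.
    """
    totals = defaultdict(int)
    flu_hits = defaultdict(int)
    for region, week, text in queries:
        key = (region, week)
        totals[key] += 1
        # Case-insensitive whole-word match against the keyword set.
        if set(text.lower().split()) & FLU_KEYWORDS:
            flu_hits[key] += 1
    return {key: flu_hits[key] / totals[key] for key in totals}


# Toy records, invented purely to exercise the function.
sample = [
    ("Dublin", 1, "flu symptoms"),
    ("Dublin", 1, "cheap flights"),
    ("Dublin", 2, "influenza vaccine"),
    ("Dublin", 2, "Fever remedies"),
    ("Cork", 1, "weather today"),
]
shares = flu_query_share(sample)
```

Even in this toy form, the outcome depends entirely on which keywords are counted and how the counts are normalised, which is the slide's caveat.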

Networks of Influence

Case 2: Showing Networks of Influence
- Tracking news on social networks
  » terrorists release YouTube videos
  » politicians comment on Facebook
  » celebs tweet intimacies
- Who you comment on, what you comment on, and where can reveal networks of influence
- Storyful is using the Insight system to curate lists of sources and propose new ones by analysing social networks

Curated Lists of Sources (Large)
Greene, D., Sheridan, G., Smyth, B., & Cunningham, P. (2012). Aggregating content and network information to curate Twitter user lists. In Proc. 4th ACM RecSys Workshop on Recommender Systems & The Social Web.

Automated Recommendation
Greene, D., Sheridan, G., Smyth, B., & Cunningham, P. (2012). Aggregating content and network information to curate Twitter user lists. In Proc. 4th ACM RecSys Workshop on Recommender Systems & The Social Web.

Networks in Syrian Conflict
Network of Syria-related Twitter accounts active during late 2013
O'Callaghan, D., Prucha, N., Greene, D., Conway, M., Carthy, J., & Cunningham, P. (2014). Online Social Media in the Syria Conflict: Encompassing the Extremes and the In-Betweens. arXiv preprint arXiv:1401.7535.

European Parliament Networks
Data analysed for 584 MEPs on Twitter during July-Sept 2014.
Cross, J. P., & Greene, D. (2014). Tracking information flows in the Council of the European Union: A social network analysis. Under review.

Political Groupings
Data analysed for 584 MEPs on Twitter during July-Sept 2014.
Cross & Greene (2014)

The Outlier Party
Data analysed for 584 MEPs on Twitter during July-Sept 2014.
Cross & Greene (2014)

The Message
- Who we know says a lot about who we are
  » Facebook friends, LinkedIn network, Twitter followers
- I can be defined by the people I know/like/respect/follow (homophily)
- My behaviour can be predicted by assuming that like people act alike
- But accuracy of those relationships is critical
  » may not generalise from one domain to another
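The homophily idea above can be illustrated with a very small sketch: guess an unknown attribute of a person from the most common value among the people they are connected to. This is a generic neighbour-vote heuristic chosen for illustration, not the method used in the cited studies; the graph, the labels, and the function name `predict_from_neighbours` are hypothetical.

```python
from collections import Counter


def predict_from_neighbours(graph, labels, node):
    """Guess a node's label as the most common label among its neighbours.

    `graph` maps each node to a list of connected nodes; `labels` maps
    some nodes to a known label. Returns None when no neighbour has a
    known label, since there is then nothing to vote with.
    """
    votes = Counter(labels[n] for n in graph.get(node, ()) if n in labels)
    if not votes:
        return None
    return votes.most_common(1)[0][0]


# Toy network, invented for the example: "ann" follows three accounts.
graph = {"ann": ["bob", "cat", "dan"], "eve": []}
labels = {"bob": "green", "cat": "green", "dan": "blue"}

guess = predict_from_neighbours(graph, labels, "ann")
no_guess = predict_from_neighbours(graph, labels, "eve")
```

The caveat on the slide applies directly: the guess is only as good as the recorded relationships, and a follow link in one domain may not carry the same meaning in another.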

Tracking Bubble Behaviour

Case 3: Tracking Herding & Market Bubbles
- Word frequencies reveal power laws (Zipf's Law)
- A bubble would show in herd-like use of language
- Power laws change systematically with herding
- Sentiment of phrases should also be trackable
Gerow, A., & Keane, M. T. (2011, July). Mining the web for the voice of the herd to track stock market bubbles. IJCAI-2011. AAAI Press.
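As a rough illustration of the power-law idea above, the sketch below ranks words by frequency and fits a straight line to log(frequency) against log(rank); under Zipf's Law the slope is close to -1. The tokenisation and the plain least-squares fit are assumptions made for the example and are not taken from Gerow & Keane (2011); `zipf_slope` is a hypothetical helper name.

```python
import math
import re
from collections import Counter


def zipf_slope(text):
    """Least-squares slope of log(frequency) against log(rank).

    Words are lower-cased runs of letters and apostrophes, a simple
    tokenisation chosen only for this sketch.
    """
    words = re.findall(r"[a-z']+", text.lower())
    freqs = sorted(Counter(words).values(), reverse=True)
    if len(freqs) < 2:
        raise ValueError("need at least two distinct words to fit a slope")
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(freq) for freq in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Toy text built so that frequency = 12 / rank (12, 6, 4, 3 occurrences),
# i.e. an exact Zipf distribution.
toy_text = "a " * 12 + "b " * 6 + "c " * 4 + "d " * 3
slope = zipf_slope(toy_text)  # approximately -1 for this constructed text
```

Computing the slope separately for each time window of articles would give one way of watching how the distribution shifts, in the spirit of the third bullet.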

Zipf's Law & Moby Dick

Agreement between Commentators
Agreeing to Agree in Power Laws of Words


Analysing Text in News
- 17,713 finance articles (FT, NYT, BBC)
- 4 years (Jan 2006-Jan 2010), including the 2007 crash
- 10,418,266 words; we extract nouns and verbs
- Correlations for verb distributions show: DJIA (r = .79), FTSE-100 (r = .78), NIKKEI-225 (r = .73)
- NB: prediction is another matter
Gerow, A., & Keane, M. T. (2011, July). Mining the web for the voice of the herd to track stock market bubbles. IJCAI-2011. AAAI Press.
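For readers unfamiliar with the r values quoted above, here is a minimal sketch of how a correlation between a text-derived series and a market index could be computed, assuming r denotes Pearson's correlation coefficient (the slides do not say which coefficient was used). The two series are made-up placeholders, not data from the study, and `pearson_r` is a hypothetical helper name.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length, at least 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


# Placeholder series: a per-period text measure and an index level.
# These numbers are invented; they only show the call pattern.
text_measure = [1.0, 2.0, 3.0, 4.0]
index_level = [10.0, 20.0, 30.0, 40.0]
r = pearson_r(text_measure, index_level)  # perfectly linear toy data, so r is about 1
```

A high r over the same period describes co-movement only, which is the point of the slide's note that prediction is another matter.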


The Message
- What we write says a lot about what we think
  » text in books, news, blogs, social media and so on
- Here, agreement in a population is being captured by carefully treated word frequencies
- Population beliefs can be tracked by a distributional analysis of changes in words
- But proper treatment of words is critical (stop-words, syntax)
  » sentiment analysis had to be based on human judgements

Some Conclusions

In this New Big-Data World!
- Who we know says a lot about who we are
  » Facebook friends, LinkedIn network, Twitter followers
- What we write says a lot about what we think
  » text in books, news, blogs, social media and so on
- Where we are located says a lot about us
  » location-based sensing, GPS, IP addresses
- What we do says a lot about our decisions/interests
  » what we buy, websites visited, YouTube videos watched, news retweeted, items shared and so on

In this New Big-Data World!
- Who we know
  » Facebook friends, LinkedIn network, Twitter followers
- What we write
  » text in books, news, blogs, social media and so on
- Where we are located
  » location-based sensing, GPS, IP addresses
- What we do
  » what we buy, websites visited, YouTube videos watched, news retweeted, items shared and so on
NOW ROUTINELY AVAILABLE AT A SMARTPHONE NEAR YOU

Promises and Caveats
Data analytics bears promise in tracking and predicting:
- population actions, beliefs, opinions, illnesses
- changes in those actions, beliefs, opinions, illnesses
Challenges are in finding:
- the right treatment of the data: selection/collation of data is still critical; combining multiple data sources
- the right analytic methods: which, if any, are appropriate
- the right interpretations: old-fashioned exclusion of variables/interpretation

The End