Big Data Analytics for Social and Behavioral Sciences



Jaideep Srivastava
Professor, Computer Science, University of Minnesota (srivasta@cs.umn.edu)
Co-Founder, CTO, Ninja Metrics, Inc. (jaideep@ninjametrics.com, www.ninjametrics.com)
CRIS Symposium, May 2nd, 2013

Talk Outline
  Examples of social/behavioral big data
  Why study virtual worlds and games
  What social/behavioral sciences tell us
  Impact on Science: Dynamics of online trust
  Impact on Business: Loyalty and influence in CRM
  So what does Ninja Metrics do?
  Concluding remarks

Examples of Social/Behavioral Big Data

Example: Tweets for Japanese Tsunami
[Figure: global retweets of tweets coming from Japan for one hour after the earthquake. Legend: Original, Retweet.]

Example: Churn in Subscription Games
[Figure: ratio of quitters to stayers (y-axis, 0.00 to 6.00) by character level (x-axis, 1 to 69), plotted for Solo and Social players.]
  Social network                           Likelihood of quitting
  Solo players                             65.3%
  Connected to small or medium networks    34.8%
  In the biggest network                   5.7%
Isolated players are 3.5x more likely to quit (B = 1.26, p < .001).
Focus design on facilitating social interaction.
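The "3.5x" figure is consistent with reading B = 1.26 as a logistic-regression coefficient, whose exponential is an odds ratio. The slide does not name the model, so that reading is an assumption. A minimal check under that assumption:

```python
import math

# Assumption: B is a logistic-regression coefficient for the "isolated player"
# indicator. The slide reports B = 1.26 but does not state the model.
B = 1.26

# exp(B) is the multiplicative change in the odds of quitting.
odds_ratio = math.exp(B)
print(f"odds ratio = {odds_ratio:.2f}")
```

Under this reading the multiplier applies to the odds of quitting, not directly to the probability.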

Levis: Example of Social Retail
  Levis leverages its brand to ensure customers provide their social network.
  Levis can leverage predictive social analytics technology to understand the value of the customer's social network.

Opportunity, Innovation, Impact
  Companies do not understand the social graph of their customers.
    It's not just about how they relate to their customers, but also about how customers relate to each other.
    Understanding these relationships unlocks immense value.
  Innovation: Understanding the social network of customers
    Key influencers, relationship strength, ...
  Impact: Deriving actionable insights from this understanding
    Customer acquisition, retention, customer care, ...
    Social recommendation, influence-based marketing, identifying trend-setters, ...
Ninja Metrics confidential information. Copyright 2012

Why Study Games & Virtual Worlds?

Player Behavior & Revenue Model
                      Blizzard (subscription)         Zynga (free2play)
  Games               World of Warcraft               Farmville, Fishville, Mafia Wars, etc.
  Players             12 million subscribers          180 million players
  Revenue model       $15/month                       Virtual goods
  Revenue             Approx. $3 billion annual       $700 million in 2010
  Play time           4 hours a day, 7 days a week    0.5 hours a day, 7 days a week
  Audience            Hard-core gamers                Everyone
  Social acceptance   Less socially acceptable        More socially acceptable
  Analogy             Like cocaine                    Like caffeine
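A back-of-the-envelope comparison of the two revenue models, using only the figures quoted on this slide. The variable names are illustrative. Note that subscribers times fee times 12 months gives about $2.16 billion, which is below the slide's "approx. $3 billion"; the slide does not break down the difference.

```python
# Figures quoted on the slide.
wow_subscribers = 12_000_000
wow_fee_per_month = 15             # USD
zynga_players = 180_000_000
zynga_revenue_2010 = 700_000_000   # USD, from virtual goods

# Subscription-only revenue implied by subscribers x monthly fee x 12 months.
wow_subscription_revenue = wow_subscribers * wow_fee_per_month * 12

# Average yearly revenue per player under each model.
wow_per_player = wow_fee_per_month * 12
zynga_per_player = zynga_revenue_2010 / zynga_players

print(f"WoW subscription revenue: ${wow_subscription_revenue / 1e9:.2f}B per year")
print(f"WoW per subscriber:       ${wow_per_player:.2f} per year")
print(f"Zynga per player:         ${zynga_per_player:.2f} per year")
```

The per-player gap is what the comparison is about: a small paying base at a high price versus a very large base at a few dollars each.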

MMORPG Data Sets
  MMORPG: Massively Multiplayer Online Role-Playing Game
  People assume characters in a fantasy world.
  On average, each player spends 22 hours a week playing.
  World of Warcraft had 10 million subscribers as of Feb 2012.
  MMORPGs are a $20 billion industry.
  Several in-game relationships: chat, trade, mentor, and housing.
  These relationships help in understanding the social processes underlying the in-game society.

EQ2 Data Set
  Chat: communicating in-game messages and invitations with other players
    Nodes 349,654; Edges 86,948,748; Period 1 month
  Trade: exchanging, buying or selling weapons and other in-game items
    Nodes 295,055; Edges 28,594,929; Period 9 months
  Mentoring: assisting lower-level players to increase mentors' experience points
    Nodes 86,495; Edges 11,913,994; Period 9 months
  Housing (Trust): accumulating and storing in-game items; sharing a house with an in-game partner to allow the storing of in-game items
    Nodes 63,918; Edges 128,048; Period 9 months
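Because the four networks differ in size and observation period, raw edge counts are hard to compare. One rough normalization is edges per node, and edges per node per month. This is a sketch under the assumption that each listed edge is one logged interaction, so repeated contacts between the same pair are counted more than once; it is not a strict graph density.

```python
# (nodes, edges, observation period in months) as listed for each EQ2 network.
networks = {
    "chat": (349_654, 86_948_748, 1),
    "trade": (295_055, 28_594_929, 9),
    "mentoring": (86_495, 11_913_994, 9),
    "housing": (63_918, 128_048, 9),
}

def edges_per_node(nodes, edges):
    """Average number of logged edges per node over the whole period."""
    return edges / nodes

def edges_per_node_per_month(nodes, edges, months):
    """Edges per node, normalized by the length of the observation period."""
    return edges / nodes / months

summary = {
    name: (edges_per_node(n, e), edges_per_node_per_month(n, e, m))
    for name, (n, e, m) in networks.items()
}

for name, (per_node, per_month) in summary.items():
    print(f"{name:10s} {per_node:8.1f} edges/node  {per_month:8.2f} edges/node/month")
```

On these numbers chat is by far the most active relationship per node and housing the least, with roughly two edges per node over nine months.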

Multiple Networks in an Online Game
[Figure: four networks among the same players: Partnership, Instant messaging, Trade, Mail. Node color: black = male, red = female.]

One Day Snapshot of Various Networks
[Figure: four panels: Chat (k-core = 7), Trade (k-core = 2), Mentoring, Housing.]
  Node color represents the community in all graphs.
  Trade and Chat networks are filtered by k-core.
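The k-core filter mentioned here keeps only the nodes that still have at least k neighbors after all lower-degree nodes have been peeled away. The slide does not say how the filtering was implemented; the following is a minimal pure-Python sketch of the standard peeling procedure on a hypothetical toy graph, treating the graph as undirected.

```python
from collections import defaultdict

def k_core(edges, k):
    """Return the node set of the k-core of an undirected graph given as edge pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:                      # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)

    # Repeatedly remove nodes that have fewer than k remaining neighbors.
    queue = [n for n in adj if len(adj[n]) < k]
    removed = set()
    while queue:
        n = queue.pop()
        if n in removed:
            continue
        removed.add(n)
        for m in adj[n]:
            if m not in removed:
                adj[m].discard(n)
                if len(adj[m]) < k:
                    queue.append(m)
        adj[n] = set()
    return set(adj) - removed

# Hypothetical toy graph: a triangle a-b-c with a tail c-d-e.
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"), ("d", "e")]
print(sorted(k_core(toy_edges, 2)))
```

With k = 2 the tail nodes d and e are peeled off and only the triangle remains, which is the kind of thinning that makes a dense chat or trade snapshot drawable.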

In-game Relationships
  Relationship   Period of interaction   Level of trust
  CHAT           Instantaneous           Low
  TRADE          Instantaneous           Medium
  MENTORING      Long                    High
  HOUSING        Long                    Very high
Graph density runs from high (CHAT) to low (HOUSING).

Degree Distribution for Various Networks

EQ2 Graph Summary
Graphs: Mentor Network, Trading Graph, Housing Graph, Chat Graph
  Period                                         Nodes           Edges              Directed   Temporal
  9 months (01-JAN-2006 to 11-SEP-2006)          42451 (43K)     28594929 (29M)     Yes        Yes
  9 months (01-JAN-2006 to 11-SEP-2006)          54287 (55K)     1045521695 (1B)    Yes        Yes
  9 months (01-JAN-2006 to 11-SEP-2006)          62427 (63K)     1962734099 (2B)    Yes        Yes
  1 month 10 days (29-JUL-2006 to 10-SEP-2006)   349654 (350K)   86948748 (87M)     Yes        Yes
(Granularity of each network is in seconds.)

CR3 Graph Summary
Graphs: Friendship Graph, Mentoring Graph, Chat Graph, Quest Graph
  Period                                    Nodes         Edges               Directed   Temporal
  5 months (08-MAY-2010 to 30-SEP-2010)     86614 (87K)   1560303 (1.5M)      Yes        Yes
  5 months (08-MAY-2010 to 30-SEP-2010)     64003 (64K)   188002 (188K)       Yes        Yes
  1 month (11-OCT-2010 to 09-NOV-2010)      11830 (12K)   107382408 (107M)    Yes        Yes
  1 month                                   53836 (54K)   5521156 (5.6M)      No         Yes
(Granularity of each network is in seconds.)

EVE Graph Summary
  Transaction Log
    Period: 02/25/2011 to 05/26/2011
    Number of nodes: 5,000
    Number of edges: 4,975,181
    Directed: Yes, Temporal: Yes
    Granularity in minutes
  Email Log
    Period: 05/06/2003 to 07/06/2011
    Number of nodes: 5,391,685
    Number of edges: 40,680,105
    Granularity in minutes

What Social & Behavioral Sciences Tell Us?

History of Social Network Analysis
  Fields: Anthropology, Organizational Theory, Social Psychology, Epidemiology, Sociology
                 Acquaintance (links)        Knowledge (content)
  Perception     Socio-Cognitive Networks    Cognitive Knowledge Networks
  Reality        Social Networks             Knowledge Networks
  Social science networks have widespread application in various fields.
  Most of the analysis techniques have come from Sociology, Statistics and Mathematics.
  See (Wasserman and Faust, 1994) for a comprehensive introduction to social network analysis.

Why do we create and sustain networks?
  Theories of self-interest
  Theories of social and resource exchange
  Theories of mutual interest and collective action
  Theories of contagion
  Theories of balance
  Theories of homophily
  Theories of proximity
  Theories of co-evolution
Sources:
  Contractor, N. S., Wasserman, S. & Faust, K. (2006). Testing multi-theoretical multilevel hypotheses about organizational networks: An analytic framework and empirical example. Academy of Management Review.
  Monge, P. R. & Contractor, N. S. (2003). Theories of Communication Networks. New York: Oxford University Press.

Structural Signatures of Social Theories
[Figure: small network diagrams over nodes A to F, with signed (+/-) ties, one per theory: Self-interest, Exchange, Balance, Collective Action, Homophily, Contagion. Node labels in the diagrams include Government, Industry, Novice, Expert.]

Application Successes
  Numerous in social sciences
  Google PageRank
  LinkedIn: expanding your Cognitive Social Network, making you aware that you're more connected and closer than you think you are
  Expertise discovery in organizations
    Knowledge experts, authorities
    Well-connected individuals, hubs
  Rapid-response teams in emergency management
  Information flow in organizations
  Twitter: real-time information dissemination
  Etc.

Impact on Science: Dynamics of Online Trust

Trust Relationship
  Every player can carry only a limited number of items at a time.
  A player buys a house to store excess in-game items.
  The house is shared with an in-game partner until the owner revokes the permission.
  There are several levels of access permission:
    TRUSTEE   The partner can enter, and store and move items in and out of the house
    FRIEND    The partner can enter, and store and move his own items only
    VISITOR   The partner can enter and see the house
    NONE      The partner can see the house from outside
    REMOVE    The partner cannot see the house
  Do players prefer a specific trust level?
  Is there any stable trust level?
  Do players express a higher trust level more quickly than a lower one?
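The five permission levels form an ordered scale, which is what makes questions about "higher" and "lower" trust well defined. A minimal sketch of that ordering follows; the class and function names are hypothetical and are not taken from the game or its logs.

```python
from enum import IntEnum

class TrustLevel(IntEnum):
    """Housing permissions, ordered from least to most trust (names from the slide)."""
    REMOVE = 0    # partner cannot see the house
    NONE = 1      # partner can see the house from outside
    VISITOR = 2   # partner can enter and see the house
    FRIEND = 3    # partner can also store and move their own items
    TRUSTEE = 4   # partner can store and move items in and out of the house

def can_see(level):
    return level >= TrustLevel.NONE

def can_enter(level):
    return level >= TrustLevel.VISITOR

def can_move_own_items(level):
    return level >= TrustLevel.FRIEND

def can_move_any_items(level):
    return level >= TrustLevel.TRUSTEE

for level in TrustLevel:
    print(f"{level.name:8s} see={can_see(level)} enter={can_enter(level)} "
          f"own_items={can_move_own_items(level)} any_items={can_move_any_items(level)}")
```

Modelling the levels as integers lets a change of permission be classified as an increase or a reduction of trust by a simple comparison.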

Trust Dynamics
[Figure: transition matrix between trust states; rows and columns are BEG, REMOVE, NONE, VISITOR, FRIEND, TRUSTEE, END.]
  1. Frequency of Expression: People express stronger relationships more often than weaker relationships. Compare the total count of the upper triangular part with that of the lower.
  2. Stability of Trust: TRUSTEE is the predominantly preferred and most stable state compared to all other states. See BEG->TRUSTEE and TRUSTEE->END.
  3. Reduction of Trust: When people reduce their trust level, they move to REMOVE more often than to any other state. Compare the REMOVE column with the other columns.
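A transition matrix of this kind can be built by counting consecutive state pairs in each relationship's history, with BEG and END as artificial start and end markers. The sketch below uses hypothetical toy histories, not the EQ2 data, and assumes that the "upper triangular" cells correspond to moves toward a higher trust level.

```python
from collections import Counter

STATES = ["BEG", "REMOVE", "NONE", "VISITOR", "FRIEND", "TRUSTEE", "END"]
RANK = {state: i for i, state in enumerate(STATES)}

def transition_counts(histories):
    """Count (from_state, to_state) pairs over all relationship histories."""
    counts = Counter()
    for history in histories:
        path = ["BEG", *history, "END"]
        for src, dst in zip(path, path[1:]):
            counts[(src, dst)] += 1
    return counts

def escalations_and_reductions(counts):
    """Total transitions to a higher vs. a lower trust level (BEG and END excluded)."""
    trust_states = set(STATES[1:-1])
    up = down = 0
    for (src, dst), c in counts.items():
        if src in trust_states and dst in trust_states:
            if RANK[dst] > RANK[src]:
                up += c
            elif RANK[dst] < RANK[src]:
                down += c
    return up, down

# Hypothetical histories, one list of successive permission levels per relationship.
toy_histories = [
    ["TRUSTEE"],
    ["VISITOR", "FRIEND", "TRUSTEE"],
    ["FRIEND", "REMOVE"],
    ["TRUSTEE", "REMOVE"],
]

counts = transition_counts(toy_histories)
up, down = escalations_and_reductions(counts)
print(counts[("BEG", "TRUSTEE")], up, down)
```

On real logs, the three findings listed above correspond to comparing `up` with `down`, inspecting the BEG->TRUSTEE and TRUSTEE->END cells, and summing the column of transitions into REMOVE.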

Longitudinal Analysis of Trust Dynamics
  People switch to the trust state much more quickly.
  Most transitions happen during the first few days of trust establishment.
  The longer a relationship stays in its current state, the less likely it is to move out of that state.

Evolving Trust Network

Reciprocation in Granting Trust
[Figure: A grants trust to B (forward link); B responds.]
  Trust forward links           Count / total    Share
  Responses received            16904 / 72445    23.3%
  No response                   54273 / 72445    74.9%
  Second or more interaction    1268 / 72445     1.75%
Figure shows the distribution of response times for the responses received.

Reciprocation in Revoking Trust
  Forward links that received a response (8452)
    Cancelled (1053)
      Received cancel response (207)
      Received no cancel response (846)
    Never cancelled (7399)
  Backward links that responded (8452)
    Responded to cancel (207)
    Never cancelled (8245)
      Received a cancel request but did not respond (846)
      Never received a cancel request (7399)
  1053 of the 8452 forward links were cancelled (12.5%), and 207 of those cancellations received a response (19.6%).

Revoking Trust - Response Time Distribution
  Most responses arrive within the first few days.
  The mean response time for a trust response is 26.7 days, lower than the 31.9 days for a cancellation.
  People are more responsive to a trust request than to its cancellation.
  Similarly, 23.3% of trust requests receive a response, whereas only 19.6% of cancel requests do.
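The two response rates compared here can be recomputed from the counts given on the two preceding slides. The variable names are illustrative. The cancel rate comes out at roughly 19.66%, which the slide reports as 19.6%.

```python
# Counts quoted on the "Reciprocation in Granting Trust" and
# "Reciprocation in Revoking Trust" slides.
trust_links = 72_445        # trust forward links
trust_responded = 16_904    # of which a response was received
cancelled_links = 1_053     # forward links later cancelled
cancel_responded = 207      # of which the cancel was reciprocated

trust_response_rate = trust_responded / trust_links
cancel_response_rate = cancel_responded / cancelled_links

print(f"trust requests answered:  {trust_response_rate:.2%}")
print(f"cancel requests answered: {cancel_response_rate:.2%}")
```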

Trust and Socialization
  Trust is a hidden variable.
  Measurable indirectly through observable proxies.
  Social activities are strongly correlated with trust.
[Figure: positive feedback loop between social activities (measurable) and trust (not measurable).]

Socialization and Trust Granting

Is there a social hysteresis?
  Magnetic Hysteresis
    Polarity changes require equal effort.
    Ease of magnetization depends on the magnetic material.
    Depends on the strength of the magnetic field.
  Social Hysteresis
    Trust is harder to build than distrust.
    Ease of trust formation depends on the characters of the persons involved.
    Depends on the type of social interaction.

Robust Predictors of Trust Formation
  Problem: Predictive models of trust formation in social networks
  Goal: To find robust predictors of trust formation in different social networks, in environments where more than one type of social relationship exists between two actors

Features Considered
  Node-based
    Average and Difference of Avatar Age
    Average and Difference of Character Level
    Human Gender and Country Indicator
    Average and Difference of Human Age
  Topological
    Difference in degree centrality
    Sum and Difference of Node Degree
    Shortest distance
    Common neighbors
    Sum clustering index
    Salton Index
    Jaccard Index
    Sorensen Index
    Adamic-Adar Index
    Resource Allocation Index
  Cross-network
    Indicates the presence or absence of other social relationships during the training period within a prediction task
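Several of the topological features are neighborhood-overlap scores for a candidate pair of nodes. The slides do not give the formulas, so the sketch below uses the textbook definitions of each index on a hypothetical toy graph; the function name and the graph are illustrative only.

```python
import math

# Hypothetical toy graph as an undirected adjacency map.
graph = {
    "a": {"c", "d", "e"},
    "b": {"c", "d"},
    "c": {"a", "b", "d"},
    "d": {"a", "b", "c"},
    "e": {"a"},
}

def similarity_indices(g, x, y):
    """Neighborhood-overlap scores for the pair (x, y); both need at least one neighbor."""
    gx, gy = g[x], g[y]
    common = gx & gy
    kx, ky = len(gx), len(gy)
    return {
        "common_neighbors": len(common),
        "salton": len(common) / math.sqrt(kx * ky),
        "jaccard": len(common) / len(gx | gy),
        "sorensen": 2 * len(common) / (kx + ky),
        # A common neighbor always has degree >= 2, so log(degree) > 0.
        "adamic_adar": sum(1 / math.log(len(g[z])) for z in common),
        "resource_allocation": sum(1 / len(g[z]) for z in common),
    }

scores = similarity_indices(graph, "a", "b")
for name, value in scores.items():
    print(f"{name:20s} {value:.4f}")
```

Adamic-Adar and Resource Allocation both down-weight common neighbors that are themselves highly connected, which is why they can rank pairs differently from the plain common-neighbor count.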

Robust Link Predictors
  Rank   EQII feature            Generalized description       CR3 feature              Generalized description
  1      Char Level Avg          Agent's level of expertise    Avatar Age Avg           Agent's experience level
  2      Char Level Difference   Agent's level of expertise    Sum Degree               Propensity to connect
  3      Avatar Age Avg          Agent's experience level      Shortest Distance        Proximity in network
  4      Shortest Distance       Proximity in network          Avatar Age Difference    Agent's experience level
  5      Sum Degree              Propensity to connect         Sum Clustering Index     Completeness of the ego-network
  6      Adamic-Adar Index       Based on shared neighbors     Diff Degree              Propensity to connect
Key Results
  Shortest Distance is a good predictor in all types of networks.
  Propensity to connect/communicate (degree sum) is also a good predictor across all types of networks.
  In activity-oriented networks, similarity in the experience levels of two nodes is a good indicator of trust formation.

Conclusions and Future Work
  Key conclusions
    Multiplayer games provide a great crucible for studying social dynamics in a highly nuanced manner, using graph analytics.
    Longitudinal study of the trust relationship in EQ2 provides new insights into the social dynamics of how trust is formed, how it is revoked, the impact of 2-person interaction on the community, etc.
    Multiple social relationships between the same group of individuals, e.g. trust, mentoring, trade, chat, etc., provide an opportunity to study how one type of relationship impacts another.
    Structural link prediction algorithms can be made much more effective/accurate by bringing in social science knowledge.
  Future work
    Study other aspects of the trust relationship.
    Use this approach to study other relationships and inter-relationship correlations.

Concluding Remarks

Impact of radically new instrumentation
  1950s: Invention of the electron microscope fundamentally changed chemistry, from playing with colored liquids in a lab to truly understanding what's going on.
  1970s: Invention of gene sequencing fundamentally changed biology from a qualitative field to a quantitative field.
  1980s: Deployment of the Hubble (and other) space telescopes has had a fundamental impact on astronomy and astrophysics.
  2000s: Massive adoption of online social apps is fundamentally changing social science research.
  Social systems, e.g. FB, Google+, LinkedIn, Twitter, online games, etc., are the new macroscopes of human behavior.

The Virtual World Observatory
http://129.105.161.80/wp/
  Four PIs, 30+ post-docs, PhD and MS students, UGs, high-schoolers
    Noshir Contractor, Northwestern: Networks
    M. Scott Poole, Illinois Urbana-Champaign/NCSA: Groups
    Jaideep Srivastava, Minnesota: Computer Science
    Dmitri Williams, USC: Social Psychology
  Collaborators
    Castronova (Sociology, Indiana), Yee (Xerox PARC), Consalvo, Caplan (Economics, Delaware), Burt (Sociology, U of Chicago), Adamic (Info Sci, Michigan), ...
  Data, technology, funding partners
    Sony (EverQuest 2), Linden Labs (Second Life), Bungie (Halo3), Kingsoft (Chevalier's Romance), others
    Cloudera Systems (Hadoop), Microsoft (SQL Server), Weka
    NSF, DARPA, CDC, ARL, ARI, IARPA, ...

Collaborators, Sponsors, Partners
[Figure panels: Team, Financial Sponsors, Data Partners, Technology Partners.]

Thank you for your attention!