Social Prediction in Mobile Networks: Can we infer users emotions and social ties?



Similar documents
MALLET-Privacy Preserving Influencer Mining in Social Media Networks via Hypergraph

IJCSES Vol.7 No.4 October 2013 pp Serials Publications BEHAVIOR PERDITION VIA MINING SOCIAL DIMENSIONS

Collective Behavior Prediction in Social Media. Lei Tang Data Mining & Machine Learning Group Arizona State University

Distance Metric Learning in Data Mining (Part I) Fei Wang and Jimeng Sun IBM TJ Watson Research Center

CLASSIFYING NETWORK TRAFFIC IN THE BIG DATA ERA

Jure Leskovec Stanford University

Predict Influencers in the Social Network

Top 10 Algorithms in Data Mining

Top Top 10 Algorithms in Data Mining

Computer Forensics Application. ebay-uab Collaborative Research: Product Image Analysis for Authorship Identification

CS 229, Autumn 2011 Modeling the Stock Market Using Twitter Sentiment Analysis

Information Management course

Large-scale Data Mining: MapReduce and Beyond Part 2: Algorithms. Spiros Papadimitriou, IBM Research Jimeng Sun, IBM Research Rong Yan, Facebook

A survey on click modeling in web search

How To Classify Data Stream Mining

Metaheuristics in Big Data: An Approach to Railway Engineering

Social-Sensed Multimedia Computing

Big Data Analytics of Multi-Relationship Online Social Network Based on Multi-Subnet Composited Complex Network

Community-Aware Prediction of Virality Timing Using Big Data of Social Cascades

AN INTRODUCTION TO SOCIAL NETWORK DATA ANALYTICS

Exploring Big Data in Social Networks

Social Media Mining. Data Mining Essentials

An Introduction to Data Mining. Big Data World. Related Fields and Disciplines. What is Data Mining? 2/12/2015

Clustering Big Data. Anil K. Jain. (with Radha Chitta and Rong Jin) Department of Computer Science Michigan State University November 29, 2012

SOCIAL NETWORK DATA ANALYTICS

Twitter sentiment vs. Stock price!

Microblog Sentiment Analysis with Emoticon Space Model

Graph Mining and Social Network Analysis

Social Influence Analysis in Social Networking Big Data: Opportunities and Challenges. Presenter: Sancheng Peng Zhaoqing University

BIG DATA STREAM ANALYTICS FOR CORRELATED

MINIMIZING STORAGE COST IN CLOUD COMPUTING ENVIRONMENT

Classification and Prediction

Predicting Information Popularity Degree in Microblogging Diffusion Networks

Graph Processing and Social Networks

Customer Relationship Management using Adaptive Resonance Theory

How To Solve The Kd Cup 2010 Challenge

Understanding Graph Sampling Algorithms for Social Network Analysis

Cross-Validation. Synonyms Rotation estimation

Research on the UHF RFID Channel Coding Technology based on Simulink

Data Mining Yelp Data - Predicting rating stars from review text

Stock Market Forecasting Using Machine Learning Algorithms

Anti-Spam Filter Based on Naïve Bayes, SVM, and KNN model

Parallel Data Mining. Team 2 Flash Coders Team Research Investigation Presentation 2. Foundations of Parallel Computing Oct 2014

Introduction to Data Mining. Lijun Zhang

The primary goal of this thesis was to understand how the spatial dependence of

Performance Analysis of Naive Bayes and J48 Classification Algorithm for Data Classification

Protein Protein Interaction Networks

Data Mining & Data Stream Mining Open Source Tools

Active Learning SVM for Blogs recommendation

Towards Inferring Web Page Relevance An Eye-Tracking Study

Using One-Versus-All classification ensembles to support modeling decisions in data stream mining

Clustering Technique in Data Mining for Text Documents

Network Analysis For Sustainability Management

Curriculum Vitae. Summer internship in a financial company that is active in quantitative analysis or development of quantitative

Exploration and Visualization of Post-Market Data

Big Data Analytics for Healthcare

MapReduce on GPUs. Amit Sabne, Ahmad Mujahid Mohammed Razip, Kun Xu

A SURVEY OF MODELS AND ALGORITHMS FOR SOCIAL INFLUENCE ANALYSIS

PULLING OUT OPINION TARGETS AND OPINION WORDS FROM REVIEWS BASED ON THE WORD ALIGNMENT MODEL AND USING TOPICAL WORD TRIGGER MODEL

A Logistic Regression Approach to Ad Click Prediction

Graph Mining Techniques for Social Media Analysis

Contemporary Techniques for Data Mining Social Media

SOPS: Stock Prediction using Web Sentiment

CMU SCS Large Graph Mining Patterns, Tools and Cascade analysis

Performance Evaluation On Human Resource Management Of China S Commercial Banks Based On Improved Bp Neural Networks

large-scale machine learning revisited Léon Bottou Microsoft Research (NYC)

Supply Chain Forecasting Model Using Computational Intelligence Techniques

Large Scale Learning to Rank

Artificial Neural Network, Decision Tree and Statistical Techniques Applied for Designing and Developing Classifier

Jiliang Tang. 701 First Avenue Yahoo!, Voice: (408) Sunnyvale, CA, US. Contact Information

The Data Engineer. Mike Tamir Chief Science Officer Galvanize. Steven Miller Global Leader Academic Programs IBM Analytics

Bayesian Machine Learning (ML): Modeling And Inference in Big Data. Zhuhua Cai Google, Rice University

ISSN: (Online) Volume 2, Issue 10, October 2014 International Journal of Advance Research in Computer Science and Management Studies

A QoE Based Video Adaptation Algorithm for Video Conference

The Role of Size Normalization on the Recognition Rate of Handwritten Numerals

Design call center management system of e-commerce based on BP neural network and multifractal

ANALYSING THE FEATURES OF JAVA AND MAP/REDUCE ON HADOOP

Electronic Medical Record Mining. Prafulla Dawadi School of Electrical Engineering and Computer Science

Research on Sentiment Classification of Chinese Micro Blog Based on

GeoLife: A Collaborative Social Networking Service among User, Location and Trajectory

Affinity Prediction in Online Social Networks

Large-Scale Data Sets Clustering Based on MapReduce and Hadoop

Transcription:

Social Prediction in Mobile Networks: Can we infer users emotions and social ties? Jie Tang Tsinghua University, China 1 Collaborate with John Hopcroft, Jon Kleinberg (Cornell) Jinghai Rao (Nokia), Jimeng Sun (IBM TJ Watson) Tiancheng Lou, Wenbin Tang, Honglei Zhuang, Yuan Zhang (Tsinghua)

Motivation Social behavior VS. Emotion change 2

Motivation Emotion Social stimulates behavior the mind 3000 Emotion times change quicker than rational VS. thought!!! It's an emotional world we live in! Six degree vs. Three degree [Nature; BMJ] 3

Motivation: A Happy System Can we predict users activities and emotion? 4

Motivation: Inferring Social Ties From Home 08:40 From Office 11:35 From Office 15:20 Both in office 08:00 18:00 From Office 17:55 0.89 0.77 0.98 0.63 0.70 Friends Other From Outside 21:30 0.86 5

6 Motivation: RideSharing

MoodCast: Emotion Prediction via Dynamic Continuous Factor Graph Model ICDM 10, IEEE Trans. on Affective Computing 11 7

Happy System 8 Can we predict users emotion?

荷 塘? Dorm? Observations 教 室??? GYM Activity correlation Location correlation (Red-happy) KO 9

Observations (cont.) (a) Social correlation Social correlation (a) Implicit groups by emotions 10 Temporal correlation

Observations (cont.) We should not split the data into different time windows Calling (SMS) correlation 11

MoodCast: Dynamic Continuous Factor Graph Model MoodCast Social correlation g(.) Jennifer Allen Mike Temporal correlation h(.) Jennifer yesterday Neutral Allen Happy Jennifer today Happy Neutral Mike Predict Jennifer tomorrow? Attributes f(.) location call sms Our solution 1. We directly define continuous feature function; 2. Use Metropolis-Hasting algorithm to learn the factor graph model. 12

Problem Formulation Time t G t =(V, E t, X t, Y t ) Emotion: Sad Time t-1, t-2 Attributes: - Location: Lab - Activity: Working Learning Task: 13

Dynamic Continuous Factor Graph Model Time t Time t : Binary function 14

Model Learning y 5 y 4 y ' 3 y 3 y 2 y 1 Attribute Social Temporal 15

16 MH-based Learning algorithm

Experiment Data Set Baseline SVM SVM with network features Naïve Bayes Naïve Bayes with network features Evaluation Measure: Precision, Recall, F1-Measure #Users Avg. Links #Labels Other MSN 30 3.2 9,869 >36,000hr LiveJournal 469,707 49.6 2,665,166 17

18 Performance Result

Factor Contributions Mobile All factors are important for predicting user emotions 19

Inferring Social Ties in Mobile Networks PKDD 2011 (Best Paper Runnerup), WSDM 2012 20

Real social networks are complex... Nobody exists only in one social network. Public network vs. private network Business network vs. family network However, existing networks (e.g., Facebook and Twitter) are trying to lump everyone into one big network FB tries to solve this problem via lists/groups However Google+ which circle? Users do not take time to create it. 21

Even complex than we imaged! Only 16% of mobile phone users in Europe have created custom contact groups users do not take the time to create it users do not know how to circle their friends The fact is that our social network is black- 22

Problem Formulation Input: G=(V,E L,E U,R L,W) Partially Other Labeled Network?? Other Friend? V: Set of Users E L,R L : Labeled relationships E U : Unlabeled relationships 23 Input: G=(V,E L,E U,R L,W) Output: f: G R

Basic Idea V 1 V 3?? Friend V 2 User Node?? r 24 r 56 Other r 45 Relationship Node 24

Partially Labeled Pairwise Factor Graph Model (PLP-FGM) Constraint factor h 25 Input: Social Network v 2 Problem: v 4 v 3 v 5 v 1 PLP-FGM y 12 =Friend y 12 =advisor y 12 Latent Variable h (y 12, y 21 ) g (y 12, y 34 ) g (y 12,y 45 ) f(x 2,x 1,y 21 ) f(x 1,x 2,y 12 ) y y 21 =Friend =advisee 21 y 34 =? y 34 f(x 3,x 4,y 34 ) r 12 r 34 r 21 g (y 45, y 34 ) y 45 y 34 y 16 =coauthor y 16 =Other f(x 4,x 5,y 45 ) r 34 r 45 y 34 =? f(x 3,x 4,y 34 ) Correlation factor g relationships Attribute factors f For each Input relationship, identify which type Model Map has relationship the highest probability? to nodes in model Example: Call A makes frequency call to between B immediately two users? after the call to C. Partially Labeled Model

Solutions (con t) Different ways to instantiate factors We use exponential-linear functions Attribute Factor: Correlation / Constraint Factor: Log-Likelihood of labeled Data: 26

Learning Algorithm Maximize the log-likelihood of labeled relationships Expectation Computing Loopy Belief Propagation Gradient Ascent Method 27

Still Challenges? Questions: - How to obtain sufficiently training data? - Can we leverage knowledge from other network? 28

Inferring Social Ties Across Networks Input: Heterogeneous Networks Reviewer network Adam review Output: Inferred social ties in different networks Adam Bob review Product 1 Bob distrust distrust trust review Chris trust Chris Danny review Communication network Product 2 Knowledge Transfer for Inferring Social Ties Danny From Home 08:40 Both in office 08:00 18:00 Family Colleague From Office 11:35 Colleague From Office 15:20 From Outside 21:30 From Office 17:55 What is the knowledge to transfer? Friend Colleague Friend 29

Social balance theory Structural hole theory Social Theories friend A friend friend A non-friend friend A friend non-friend A non-friend B friend C B non-friend C B non-friend C B non-friend (A) (B) (C) (D) C 30

Social Theories Structural hole Social balance theory Structural hole theory Structural hole 31

Transfer Factor Graph Model 32 Coauthor network mobile y y 2 =? 4 =? y 2 y 4 h (y 3, y 4, y 5 ) TrFG model y 5 y y 1 1 =1 y 5 =1 Input: social network y 3 h (y 1, y 2, y 3 ) y 3 =0 y 6 y 6 =? f (s v 3, s 3,y 3 ) 3 5 v 6 f (u 5,s 5, y 5 ) 4 f (s 1, u 2,y 1 ) f (u 2, s 2,y 2 ) v f (u 3 4, s 4,y 4 ) 4 6 u 2, s 2 f (s 6, u 6,y 6 ) 2 v 5 (v 2, v 3 ) u 4, s 4 u v 5, s 5 2 u 1, s 1 u 3, s 3 (v 4, v 5 ) (v u 6, s 6 (v 4, v 6 ) 1 v 2, v 1 ) 1 (v 4, v 3 ) (v 6, v 5 ) Observations y y 2 =? 4 =? y 2 y 4 h (y 3, y 4, y 5 ) TrFG model y 5 y y 1 1 =1 y 5 =1 Input: social network y 3 h (y 1, y 2, y 3 ) y 3 =0 y 6 y 6 =? f (s v 3, s 3,y 3 ) 3 5 v 6 f (u 5,s 5, y 5 ) 4 f (s 1, u 2,y 1 ) f (u 2, s 2,y 2 ) v f (u 3 4, s 4,y 4 ) 4 6 u 2, s 2 f (s 6, u 6,y 6 ) 2 v 5 (v 2, v 3 ) u 4, s 4 u v 5, s 5 2 u 1, s 1 u 3, s 3 (v 4, v 5 ) (v u 6, s 6 (v 4, v 6 ) 1 v 2, v 1 ) 1 (v 4, v 3 ) (v 6, v 5 ) Observations Triad-based factor Bridge via social theories

Mathematical Formulation Features defined in different networks Triad-based features shared across networks 33

Data Sets Epinions a network of product reviewers: 131,828 nodes (users) and 841,372 edges trust relationships between users Slashdot: 82,144 users and 59,202 edges friend relationships between users Mobile: 107 mobile users and 5,436 edges to infer friendships between users 34

Results Data Set Method Prec. Rec. F1 Mobile Epinions to Mobile (40%) Slashdot to Mobile (40%) SVM 89.83 59.55 71.62 CRF 94.55 54.17 68.87 PFG 100.00 59.24 74.40 TranFG 82.39 83.44 82.91 TranFG 72.58 85.99 78.72 SVM and CRF are two baseline methods; PFG is the proposed partially-labeled factor graph model; TranFG is the proposed transfer based factor graph model. 35

Varying the percent of the labeled data Epinions-to-Mobile Slashdot-to-Mobile 36

Factor contribution analysis SH-Structural hole; SB-Social balance. 37

38 Conclusions Moodcast: emotion prediction Emotion stimulates the mind 3000 times quicker than rational though; We demonstrate that it is possible to accurately predict users emotions in mobile network. Inferring social ties different types of social ties have essentially different influence on people; By incorporating social theories, our proposed model can significantly improve (+4-14%) the inferring accuracy.

Emotion: Future Work Emotion diffusion in the mobile network; Predicting activities and emotions simultaneously. Inferring social ties: Inferring complex relationships between users, e.g., family, colleague, manager-subordinate; Active learning for inferring social ties. 39

Related Publications Jie Tang, Tiancheng Lou, and Jon Kleinberg. Inferring Social Ties across Heterogenous Networks. WSDM 12. Chi Wang, Jiawei Han, Yuntao Jia, Duo Zhang, Yintao Yu, Jie Tang, Jingyi Guo. Mining Advisor-Advisee Relationships from Research Publication Networks. KDD 10. Wenbin Tang, Honglei Zhuang, and Jie Tang. Learning to Infer Social Relationships in Large Networks. PKDD'11. (Best Student Paper Runner-up) Jie Tang, Yuan Zhang, Jimeng Sun, Jinghai Rao, Wenjing Yu, Yiran Chen, and ACM Fong. Quantitative Study of Individual Emotional States in Social Networks. IEEE Transactions on Affective Computing. 2011 Yuan Zhang, Jie Tang, Jimeng Sun, Yiran Chen, and Jinghai Rao. MoodCast: Emotion Prediction via Dynamic Continuous Factor Graph Model. ICDM'10. Chenhao Tan, Jie Tang, Jimeng Sun, Quan Lin, and Fengjiao Wang. Social Action Tracking via Noise Tolerant Time-varying Factor Graphs. KDD 10. Jie Tang, Jimeng Sun, Chi Wang, and Zi Yang. Social Influence Analysis in Largescale Networks. KDD'09. 40

Thanks! HP: http://keg.cs.tsinghua.edu.cn/jietang/ System: http://arnetminer.org 41