Social Network Mining

Similar documents
Social Network Analysis

Practical Graph Mining with R. 5. Link Analysis

Search Engine Optimization. Software Engineering October 5, 2011 Frank Takes LIACS, Leiden University

Big Data Analytics of Multi-Relationship Online Social Network Based on Multi-Subnet Composited Complex Network

Online Social Networks and Network Economics. Aris Anagnostopoulos, Online Social Networks and Network Economics

Graphs over Time Densification Laws, Shrinking Diameters and Possible Explanations

AN INTRODUCTION TO SOCIAL NETWORK DATA ANALYTICS

Business Intelligence and Process Modelling

A discussion of Statistical Mechanics of Complex Networks P. Part I

Network Analysis Basics and applications to online data

Part 1: Link Analysis & Page Rank

MINFS544: Business Network Data Analytics and Applications

Attacking Anonymized Social Network

Introduction to Networks and Business Intelligence

Graph Mining Techniques for Social Media Analysis

Healthcare Analytics. Aryya Gangopadhyay UMBC

Social Network Analysis and the illusion of gender neutral organisations

Graph Theory and Complex Networks: An Introduction. Chapter 08: Computer networks

Social Media Mining. Graph Essentials

Six Degrees of Separation in Online Society

Chapter 29 Scale-Free Network Topologies with Clustering Similar to Online Social Networks

Tutorial, IEEE SERVICE 2014 Anchorage, Alaska

Social Networks and Social Media

Search in BigData2 - When Big Text meets Big Graph 1. Introduction State of the Art on Big Data

Graph Mining and Social Network Analysis

Enhancing the Ranking of a Web Page in the Ocean of Data

Predicting Influentials in Online Social Networks

DATA ANALYSIS IN PUBLIC SOCIAL NETWORKS

Extracting Information from Social Networks

Social Networking Analytics

Systems and Algorithms for Big Data Analytics

Asking Hard Graph Questions. Paul Burkhardt. February 3, 2014

Graph Analytics in Big Data. John Feo Pacific Northwest National Laboratory

IC05 Introduction on Networks &Visualization Nov

Online Social Networks: Measurement, Analysis, and Applications to Distributed Information Systems

How To Understand The Network Of A Network

ALBERTA. Social Network Analysis for the Assessment of Learning UNIVERSITY OF. Osmar R. Zaïane Professor & Scientific Director of AICML

Measurement and Analysis of Online Social Networks

Mining Social-Network Graphs

Strong and Weak Ties

Modeling User Interactions in Online Social Networks to solve real problems

Graph Theory and Complex Networks: An Introduction. Chapter 06: Network analysis

Search engines: ranking algorithms

Complex Networks Analysis: Clustering Methods

Scaling Up HBase, Hive, Pegasus

Dynamics of information spread on networks. Kristina Lerman USC Information Sciences Institute

Social and Economic Networks: Lecture 1, Networks?

Application of Social Network Analysis to Collaborative Team Formation

MLg. Big Data and Its Implication to Research Methodologies and Funding. Cornelia Caragea TARDIS November 7, Machine Learning Group

Applying Social Network Analysis (SNA) to P2P File Sharing. Andreas Schaufelbühl Robin Stohler Benjamin Bürgisser

Protein Protein Interaction Networks

Parallelism and Cloud Computing

Developer Social Networks in Software Engineering: Construction, Analysis, and Applications

Department of Cognitive Sciences University of California, Irvine 1

Community Detection Proseminar - Elementary Data Mining Techniques by Simon Grätzer

Introduction to Graph Mining

Graph theory and network analysis. Devika Subramanian Comp 140 Fall 2008

Social Media Mining. Network Measures

Graph Algorithms and Graph Databases. Dr. Daisy Zhe Wang CISE Department University of Florida August 27th 2014

Network theory and systemic importance


Data Mining on Social Networks. Dionysios Sotiropoulos Ph.D.

Collective Behavior Prediction in Social Media. Lei Tang Data Mining & Machine Learning Group Arizona State University

Algorithms for representing network centrality, groups and density and clustered graph representation

Introduction to Social Media

Overview of the Stateof-the-Art. Networks. Evolution of social network studies

General Network Analysis: Graph-theoretic. COMP572 Fall 2009

An Alternative Web Search Strategy? Abstract

Web Graph Analyzer Tool

Mining Social Network Graphs

A Performance Evaluation of Open Source Graph Databases. Robert McColl David Ediger Jason Poovey Dan Campbell David A. Bader

A comparative study of social network analysis tools

Social Network Analysis using Graph Metrics of Web-based Social Networks

Exploring People in Social Networking Sites: A Comprehensive Analysis of Social Networking Sites

Social Influence Analysis in Social Networking Big Data: Opportunities and Challenges. Presenter: Sancheng Peng Zhaoqing University

SEO: What is it and Why is it Important?

Graph Processing and Social Networks

Subgraph Patterns: Network Motifs and Graphlets. Pedro Ribeiro

Studying Recommendation Algorithms by Graph Analysis

PERFORMANCE M edia P lacement

Graph Theory and Complex Networks: An Introduction. Chapter 06: Network analysis. Contents. Introduction. Maarten van Steen. Version: April 28, 2014

Some questions... Graphs

Information Management course

Google+ or Google-? Dissecting the Evolution of the New OSN in its First Year

Walk-Based Centrality and Communicability Measures for Network Analysis

Network-based spam filter on Twitter

Universiteit Leiden Opleiding Informatica

Scientific Collaboration Networks in China s System Engineering Subject

International Journal of Engineering Research-Online A Peer Reviewed International Journal Articles are freely available online:

USING SPECTRAL RADIUS RATIO FOR NODE DEGREE TO ANALYZE THE EVOLUTION OF SCALE- FREE NETWORKS AND SMALL-WORLD NETWORKS

Top Online Activities (Jupiter Communications, 2000) CS276A Text Information Retrieval, Mining, and Exploitation

MapReduce Algorithms. Sergei Vassilvitskii. Saturday, August 25, 12

SOCIAL MEDIA OPTIMIZATION

Searching the Social Network: Future of Internet Search?

LARGE-SCALE GRAPH PROCESSING IN THE BIG DATA WORLD. Dr. Buğra Gedik, Ph.D.

1. Write the number of the left-hand item next to the item on the right that corresponds to it.

CAS CS 565, Data Mining

Network Algorithms for Homeland Security

Option 1: empirical network analysis. Task: find data, analyze data (and visualize it), then interpret.

CS224W Project Report: Finding Top UI/UX Design Talent on Adobe Behance

Transcription:

Social Network Mining Data Mining November 11, 2013 Frank Takes (ftakes@liacs.nl) LIACS, Universiteit Leiden

Overview Social Network Analysis Graph Mining Online Social Networks Friendship Graph Semantics Example Conclusions

Social Network Analysis Social Network Analysis: the study of social networks to understand their structure and behavior. Social Network: a social structure of people, related (directly or indirectly) to each other through a common relation or interest. Social Networks!= Social Media

Social Network Analysis Social Network Analysis (SNA) Sociology Algorithms Data Mining Social Networks Real-life (explicit) Online (explicit) Derived (implicit) e-mail networks, citation networks, co-author networks, terrorist collaboration networks

SNA Research Topics Network analysis Measuring Modelling Prediction Spread of Information Trust & Authority Community Detection Semantics Anonimity & Privacy

Graph Mining

Graphs n = 5 nodes m = 6 edges A Vertex/Node C Distance d(c, E) = d(e, C) = 2 D E Relationship/Edge/Link

Data / Graph Mining Data Mining: looking for patterns in Databases id name date_of_birth email 23. Stinson 1983-08-15 goforbarney@me.com 42 T. Mosby 1983-08-15 mosbydesigns@gmail.com classification, clustering, outlier detection, Graph Mining: looking for patterns in Graphs A E D So what are we looking for? C

Graph Mining Graph models Centrality measures (local) Graph properties (global) Frequent subgraphs (pattern mining) Detecting cliques (clustering) Link prediction (classification) Various application domains

Centrality measures Degree centrality etweenness centrality Closeness centrality Graph centrality (eccentricity centrality) Eigenvector centrality Random walk centrality Hyperlink Induced Topic Search (HITS) PageRank

Centrality Who has a central position in this graph? A E F G H A E F G H D D C C Degree Centrality: C has the highest degree

Centrality Who has a central position in this graph? A E F G H A E F G H D D C C etweenness Centrality: E is part of the largest number of shortest paths

Google PageRank

Google PageRank 4 webpages A,, C and D (N = 4) Initially: PR(A) = PR() = PR(C) = PR(D) = 1/n L(A) is the outdegree of page A Now if, C and D each link to A, the simple PageRank PR(A) of a page A is equal to:

Google PageRank 0.25 0.25 A 0.458 0.083 A C D 0.25 0.25 C D 0.208 0

Google PageRank PageRank as suggested by Larry Page in 1999 N = number of pages, p i and p j are pages M(p i ) is the set of pages linking to p i L(p j ) is the outdegree of p j d = 0.85, 85% chance to follow a link, 15% chance to jump to a random page (random surfer) t = 0 t = t + 1

Frequent Subgraphs What pattern occurs frequently in this graph? A A C A A A A A A C C C Frequent Subgraph: A--C

Clustering http://bayramannakov.wordpress.com

Online Social Networks

History 1997: SixDegrees.com 2000: Friendster 2003: LinkedIn & MySpace 2004: Hyves 2005: Facebook 2006: Twitter... 2010: The Social Network (movie)

Online Social Networks User (node) has a profile Profiles have attributes (labels/annotations) Explicit links (edges) Social Links / Friendship links User groups Implicit links Social messaging Common attributes Directed vs. undirected links

2011

Example: Facebook More than 1 billion active users Average user has 130 friends Estimated 100 billion social links Over 600 million interactive objects (pages, groups and events) More than 45 billion pieces of content (web links, news stories, blog posts, notes, photo albums, etc.) shared each month

OSNA Research Topics User behavior Privacy & Anonymity Trust & Authorities Diffusion of information Sampling & Crawling Community Detection Friendship Graph

Friendship Graph

Friendship Graph

Friendship Graph Analysis Static analysis Densely connected core Fringe of low-degree nodes Few isolated communities & singletons Static properties Node degree distribution, average distance, diameter Edge/node ratio, level of symmetry Number of cliques, k-cliques, etc. Small world phenomanon

Static properties

Degree Distribution

Distance Distribution

Small World Networks Class of networks with certain properties: Sparse graphs Highly connected Short average node-to-node distance: d ~ log(n) Fat tailed power law node degree distribution Densely connected core with many (near-)cliques Existence of hubs: nodes with a very high degree Fringe of low(er)-degree nodes

Regular vs. Small World

Small World Networks Other examples of small world networks Web graphs Gene networks E-mail networks Telephone call graphs Information networks Internet topology networks Scientific co-authorship networks Corporate networks (interlocks or ownerships)

Friendship Graph Analysis Static analysis Dynamic analysis Network evolution Network modelling Network growth Link Prediction Triadic Closure Preferential Attachment Semantic Link Prediction

Link Prediction A E A E D D C C Two principles: preferential attachment and triadic closure

Triadic Closure A C A C Two principles: preferential attachment and triadic closure

Preferential Attachment Nodes with a large degree acquire new links at a faster rate. A E F J G C D

Semantics

"Call for Semantics" Structured data volumes grow rapidly "Semantic Web"

"Call for Semantics" What is the meaning of a link? What type of relation is defined by a link? Are there any wrong links? What is the strength of a link? Are link descriptions missing?

Semantic Link Prediction Deterministic sibling E sibling E? uncle parent? parent parent uncle parent C? D C cousin D

Semantic Link Prediction Predictive (learning, classification, etc.) colleague E colleague E? colleague colleague? colleague colleague colleague colleague C colleague D C colleague D

Conclusions Online social networks are an excellent domain of study for data (or graph-) miners Social Network Analysis is important for many areas of research, not only computer science Semantics within large networks are becoming increasingly more important Challenges may be found in the temporal (dynamic) analysis of social networks