A discussion of Statistical Mechanics of Complex Networks P. Part I



Similar documents
General Network Analysis: Graph-theoretic. COMP572 Fall 2009

Introduction to Networks and Business Intelligence

Big Data Analytics of Multi-Relationship Online Social Network Based on Multi-Subnet Composited Complex Network

Complex Networks Analysis: Clustering Methods

DECENTRALIZED SCALE-FREE NETWORK CONSTRUCTION AND LOAD BALANCING IN MASSIVE MULTIUSER VIRTUAL ENVIRONMENTS

Statistical mechanics of complex networks

Dmitri Krioukov CAIDA/UCSD

Network/Graph Theory. What is a Network? What is network theory? Graph-based representations. Friendship Network. What makes a problem graph-like?

Graph Theory and Complex Networks: An Introduction. Chapter 06: Network analysis

Graphs over Time Densification Laws, Shrinking Diameters and Possible Explanations

Bioinformatics: Network Analysis

How To Understand The Network Of A Network

Chapter 29 Scale-Free Network Topologies with Clustering Similar to Online Social Networks

USING SPECTRAL RADIUS RATIO FOR NODE DEGREE TO ANALYZE THE EVOLUTION OF SCALE- FREE NETWORKS AND SMALL-WORLD NETWORKS

Distance Degree Sequences for Network Analysis

Social Media Mining. Graph Essentials

Social Network Mining

Graph Theory and Complex Networks: An Introduction. Chapter 06: Network analysis. Contents. Introduction. Maarten van Steen. Version: April 28, 2014

Graph models for the Web and the Internet. Elias Koutsoupias University of Athens and UCLA. Crete, July 2003

Research Article A Comparison of Online Social Networks and Real-Life Social Networks: A Study of Sina Microblogging

GENERATING AN ASSORTATIVE NETWORK WITH A GIVEN DEGREE DISTRIBUTION

! E6893 Big Data Analytics Lecture 10:! Linked Big Data Graph Computing (II)

Complex Network Visualization based on Voronoi Diagram and Smoothed-particle Hydrodynamics

Part 2: Community Detection

The Network Structure of Hard Combinatorial Landscapes

Outline. NP-completeness. When is a problem easy? When is a problem hard? Today. Euler Circuits

Graph Mining Techniques for Social Media Analysis

Recent Progress in Complex Network Analysis. Models of Random Intersection Graphs

1 Six Degrees of Separation

Distributed Computing over Communication Networks: Topology. (with an excursion to P2P)

Social and Economic Networks: Lecture 1, Networks?

The Large-Scale Structure of Semantic Networks: Statistical Analyses and a Model of Semantic Growth

Topological Properties

Introduction to Graph Mining

Complex Network Analysis of Brain Connectivity: An Introduction LABREPORT 5

Open Source Software Developer and Project Networks

STT315 Chapter 4 Random Variables & Probability Distributions KM. Chapter 4.5, 6, 8 Probability Distributions for Continuous Random Variables

A mixture model for random graphs

SPANNING CACTI FOR STRUCTURALLY CONTROLLABLE NETWORKS NGO THI TU ANH NATIONAL UNIVERSITY OF SINGAPORE

Social Media Mining. Network Measures

Graph Theory and Complex Networks: An Introduction. Chapter 08: Computer networks

The Topology of Large-Scale Engineering Problem-Solving Networks

NETWORK SCIENCE THE SCALE-FREE PROPERTY

Degree distribution in random Apollonian networks structures

Graph Theory and Networks in Biology

arxiv:physics/ v1 6 Jan 2006

A SOCIAL NETWORK ANALYSIS APPROACH TO ANALYZE ROAD NETWORKS INTRODUCTION

Character Image Patterns as Big Data

Graph theoretic approach to analyze amino acid network

MINFS544: Business Network Data Analytics and Applications

Random graphs and complex networks

Scalable Source Routing

Extracting Information from Social Networks

Mining Social Network Graphs

Greedy Routing on Hidden Metric Spaces as a Foundation of Scalable Routing Architectures

Discrete Mathematics & Mathematical Reasoning Chapter 10: Graphs

Business Intelligence and Process Modelling

The average distances in random graphs with given expected degrees

Mining Social-Network Graphs

Load balancing in a heterogeneous computer system by self-organizing Kohonen network

Strategies for Optimizing Public Train Transport Networks in China: Under a Viewpoint of Complex Networks

Stationary random graphs on Z with prescribed iid degrees and finite mean connections

Structural constraints in complex networks

Evaluation of a New Method for Measuring the Internet Degree Distribution: Simulation Results

PUBLIC TRANSPORT SYSTEMS IN POLAND: FROM BIAŁYSTOK TO ZIELONA GÓRA BY BUS AND TRAM USING UNIVERSAL STATISTICS OF COMPLEX NETWORKS

The Basics of Graphical Models

Healthcare Analytics. Aryya Gangopadhyay UMBC


Option 1: empirical network analysis. Task: find data, analyze data (and visualize it), then interpret.

Analysis of dynamic sensor networks: power law then what?

Temporal Dynamics of Scale-Free Networks

1 Definitions. Supplementary Material for: Digraphs. Concept graphs

Information Network or Social Network? The Structure of the Twitter Follow Graph

136 CHAPTER 4. INDUCTION, GRAPHS AND TREES

Data Mining Cluster Analysis: Basic Concepts and Algorithms. Lecture Notes for Chapter 8. Introduction to Data Mining

Cluster detection algorithm in neural networks

SHARP BOUNDS FOR THE SUM OF THE SQUARES OF THE DEGREES OF A GRAPH

Transcription:

A discussion of Statistical Mechanics of Complex Networks Part I Review of Modern Physics, Vol. 74, 2002

Small Word Networks Clustering Coefficient Scale-Free Networks Erdös-Rényi model

cover only parts I, II, and IIIa (pages 1-9) more questions than answers...

Small Word Networks Clustering Coefficient Scale-Free Networks Small World Networks What is a small-world network? relatively short path between any two nodes six degrees of separation distance the shortest path between two nodes diameter of graph longest distance between any two nodes no hard definition, but diameter similar to random graph ( ln N)

Clustering Coefficient Overview Small Word Networks Clustering Coefficient Scale-Free Networks How well do your friends know each other? a metric between 0.0 and 1.0 ratio of edges (E i ) over maximum possible (complete subgraph) ( ) ki for each node : C i E i / = 2E i 2 k i (k i 1) N for graph, take average over nodes C = 1 N Can be interpreted as i=1 the probability that two neighbor nodes are connected (p = C i ) C i

Clustering Coefficient Simulations Small Word Networks Clustering Coefficient Scale-Free Networks 0.8 Clustering for 20!node graph 0.7 0.6 Clustering coefficient 0.5 0.4 0.3 0.2 0.1 0 0 10 20 30 40 50 60 70 Number of random edges

Small Word Networks Clustering Coefficient Scale-Free Networks Clustering Coefficient Observations Metric is convenient to define, but C = 1 does NOT imply that graph is completely connected C = 0 does NOT imply that all nodes are isolated C is NOT monotonic as more edges are added (!) Shortcomings doesn t understand the notion of components uses only one generation of information all-or-nothing metric may be too crude (?) Is this what we really want to measure...?

Weird Clustering Examples Small Word Networks Clustering Coefficient Scale-Free Networks a collection of isolated 3-cycles has C = 1 a n-dimensional grid has C = 0, although k = 2n example of graph where adding more edges lowers C take two disconnected subgraphs and bridge them

Clustering Examples Overview Small Word Networks Clustering Coefficient Scale-Free Networks Consider a graph with N vertices arranged in 1-d ring: k=2, d(g) = N/2, C = 0 1 2-d grid: k=4, d(g) = 2N, C = 0 3-d cube: k=6, d(g) = 3 N 1/3, C = 0... n-d hypercube: k=2n, d(g) = n N 1/n, C = 0 Is the last considered a small-world network? 1 This is why Watts-Strogatz used a 1-d ring with 4 nearest neighbors to bump up C to 3/4.

Small Word Networks Clustering Coefficient Scale-Free Networks Scale-Free Networks Degree distributions not all nodes have same degree distribution function P(k) denotes probability that random node has k edges for random graphs, this is a Poisson distribution with a peak at k with value P( k ) for some real networks, the tail of P(k) follows a power-law distribution: P(k) 1/k γ for 1 < γ < 3 other real networks exhibit exponential tails graphs with P(k) different than Poisson distribution are termed scale-free

Small Word Networks Clustering Coefficient Scale-Free Networks Issues with Scale-Free Networks no hard definition why the big fuss? Because physics has properties with power-law tails... (statistical mechanics) where does the cut-off for k take effect...? the tail has tiny portion of nodes... is it really that relevant?

Complex Network Cheat Sheet Small Word Networks Clustering Coefficient Scale-Free Networks Random Graphs Real Graphs small-world YES YES d(g) ln(n) d(g) d(g random ) clustering coeff LOW HIGH (C = p) 0.01 1.0 scale-free NO YES Poisson dist. P(k) 1/k γ for 1 < γ < 3

Complex Network Cheat Sheet Small Word Networks Clustering Coefficient Scale-Free Networks Random Graphs Real Graphs n-d lattices structure NO YES (?) YES small-world YES YES NO (?) d(g) ln(n) d(g) d(g random ) d(g) N 1/n clustering coeff LOW HIGH 0.0 (C = p) 0.01 1.0 2n neighbors scale-free NO YES YES Poisson dist. P(k) 1/k γ for 1 < γ < 3 P(2n) = 1, 0 otherwise

Cases Studied WWW Internet movie actors science collaboration STDs cellular networks ecological networks phone call network citation networks linguistic networks power grid and neural nets protein folding

WWW Studies at various levels (internet domain, site level, hyperlinks) at the hyperlink level: largest network studied (2002) directed graph, very unsymmetric (k out k in ) both Pout (k) and P in (k) how power-law tails with γout 2.5 ± 0.25 and γ in = 2.1 Adamic (1999) computed clustering coefficients by making each edge bidirectional (!) Faloutsos (1999): an edge is drawn between two domains if there is a least one route that connects them (!)

Movie Actor Collaboration Network Hooray for IMDb.com! Size: half a million actors (!) in 2000 two actors have an edge if they worked together on a film model does not take into account weighted edges (such as # of films worked on together) average distance is close to that of random graph

Observations of Table I key values: size (N) average degree ( k ) average distance (l) C C rand l l rand Things to look at: scatter plot of C vs. how dense the graph is ( k /N) scatter plot of density vs. l/l rand

Observations of Table II γ in = 2.1 is quite popular... 1 < γ < 3 for both γ in and γ out k (cut-off) seems pretty high, compared to k l power is not as good an estimator as l rand...

Random Graph Models Erdös-Rényi model Erdös-Rényi (1959) N nodes, n edges, chosen randomly from Binomial Model N nodes, every edge has p probability actual # of edges is a random variable Poisson distribution with expected value p( ( N 2 with p = n/ same? ( ) N possiblities 2 ) ) ( ) N this is similar to Erdös-Rényi, but is it the 2

Graph Enumeration Overview Erdös-Rényi model An undirected graph with N vertices, ( ) N has M = N(N 1)/2 possible edges 2 ( N(N 1) ) # of graphs with exactly n edges, is 2 = HUGE! n

Graph Enumeration Overview Erdös-Rényi model An undirected graph with N vertices, ( ) N has M = N(N 1)/2 possible edges 2 ( N(N 1) ) # of graphs with exactly n edges, is 2 = HUGE! n Given ( ) M M! n n!(m n)! Stirling s approximation: ln n! n(ln n 1), for large N

Graph Enumeration Examples Erdös-Rényi model Name Vertices k # graphs 10 1 3 10 9 100 1 10 211 1,000 1 10 3,132 10,000 1 10 41,332 math authors 70,975 3.9 10 1,200,000 movie actors 225,226 61 10 50,000,000 If every atom in the universe ( 10 80 ) was a Petaflop computer, computing since the begining of time (13 billion years ago) you would just need 10 100 such universes to enumerate the (100, 1 ) case...

Erdös-Rényi model Do we really know the landscape? HELP! we are looking at only tiny microcosm of graph space for simulations how robust are our conclusions? importance sampling (?) concern with 1-d parameterization ala Watts-Strogatz... what if we made random changes to a real network? How long before it starts losing its realness? would any of these metrics help...?

and Discussion is small-world really relevant...? (social networks rarely interact beyond three links...) not clear if current metrics really capture the right thing... given (N, C, k, γ, l) what can one say about a network? introduce new(?) metrics that better recognize components and structure cluster coefficient should be extended for weighted, bidirectional graphs power-tail distribution model needs high cut-off values for k what percentage of the available nodes is this?

...? Why is this so hard...? because we are trying to theoritize arbitrary structure...