Overlay and P2P Networks. On power law networks. Prof. Sasu Tarkoma. 28.1.2013

Course Progress

Schedule summary
13.1. Introduction. Exercises. (Exercise I published)
15.1. Exercises: Reception on questions I
16.1. Unstructured networks I
20.1. Unstructured networks II
22.1. Exercises: Answers to questions I (Exercise II published)
23.1. BitTorrent, modelling and evaluation
27.1. Freenet and intro to power law networks
29.1. Exercises: Reception on questions II
30.1. Power-law networks
3.2. Consistent hashing. Distributed Hash Tables (DHTs) I
5.2. Exercises: Answers to questions II (Exercise III published)
6.2. DHTs II
10.2. Applications I
12.2. Exercises: Reception on questions III
13.2. Applications II (also invited speakers)
17.2. Conclusions and summary
19.2. Exercises: Answers to questions III

[Figure: overview of the systems covered so far (BT, Gnutella, Skype, Freenet, DHTs) on top of the Internet (TCP/IP), classified as unstructured or structured. Mechanisms shown: piece and peer selection, tit-for-tat, tracker, centralized components; flooding (breadth first), limited flooding, layered limited flooding on super nodes, layered Bloom filters; location (node id) clustering, path folding (opennet), location swapping (darknet), depth first. Row "Leverages": random, small world, small world, N/A.]

Contents. Zipf's law and power law distributions. Robustness of networks. Scale free and small worlds. Search in small worlds. Freenet revisited.

Introduction. Network theory has many applications: connectivity of Internet routers, the World Wide Web, the electric grid, cellular networks in biology, phone systems and phone call patterns, word co-occurrence in text, collaboration graphs and social networks. The structure of the network plays a crucial role in scalability, resiliency, and efficiency.

Power-law distribution. Note the difference from the bell curve, in which most nodes have roughly the same number of links. Figure source: http://idm09.wordpress.com/2009/11/03/networks-the-power-of-hubs/

Historical examples
Pareto: income distribution, 1897
Zipf-Auerbach: city sizes, 1913/1940s
Zipf-Estoup: word frequency, 1916/1940s
Lotka: bibliometrics, 1926
Yule: species and genera, 1924
Mandelbrot: economics/information theory, 1950s
1990-: networking community
2000-: online social networks

On Zipf's distribution and power-laws. A power-law implies that small occurrences are extremely common, whereas large occurrences are extremely rare. This regularity or law is also referred to as Zipf or Pareto. Zipf is used to model rank distributions, and power-law to model frequency distributions. Examples: word popularity rank in English (Zipf), ranking of cities by population (Zipf), node degree distribution in a network (power-law).

Zipf's Law. $F \sim R^{-\beta}$, where F is the frequency, R is the rank, and the constant $\beta$ is close to one. This is a straight line on a log-log plot. The frequency of any word is inversely proportional to its rank. Thus the most frequent word appears twice as often as the second most frequent, three times as often as the third most frequent, and so on. This has implications for compression, text identification, statistical learning, parser implementation, etc.
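The rank-frequency relation can be checked numerically. The following is a minimal sketch, assuming an idealized vocabulary of 1000 ranks and β = 1; the function name `zipf_frequencies` and the vocabulary size are illustrative choices, not part of the lecture.

```python
def zipf_frequencies(num_ranks, beta=1.0):
    """Relative frequencies F(R) proportional to R**(-beta), normalised to sum to 1."""
    weights = [rank ** (-beta) for rank in range(1, num_ranks + 1)]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical vocabulary of 1000 ranked items with beta = 1.
freqs = zipf_frequencies(1000)
ratio_1_to_2 = freqs[0] / freqs[1]   # rank 1 versus rank 2
ratio_1_to_3 = freqs[0] / freqs[2]   # rank 1 versus rank 3
top_10_share = sum(freqs[:10])       # share of all occurrences held by the top 10 ranks

print(ratio_1_to_2, ratio_1_to_3, top_10_share)
```

Under these assumptions the first two ratios come out as 2 and 3, and the ten most frequent items account for roughly 39% of all occurrences, which illustrates how a few items dominate.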

Applications. The linguist George Zipf first proposed the law in 1935 in the context of word frequencies in languages. There are many applications, for example the size of cities and income distributions. Zipf's law has been used to model Web links and media file references. It therefore has profound implications for content delivery on the Internet. Large Web sites get disproportionately more traffic than smaller sites. Efficient caching relies heavily on Zipf's law to replicate a small number of immensely popular files near the users.

Source: http://www.cs.uoi.gr/~tsap/teaching/InformationNetworks/lectures/lecture3.ppt

Power Laws. Two quantities x and y are related by a power law if y is proportional to $x^{-c}$ for a constant c: $y = a x^{-c}$. The plot of log(y) versus log(x) is a straight line: $\log(y) = -c \log(x) + \log(a)$. The slope of the log-log plot is $-c$, where c is the power-law exponent (we also use α later). Power laws in real networks: (a) WWW hyperlinks, (b) co-starring in movies, (c) co-authorship of physicists, (d) co-authorship of neuroscientists. The constant c is typically between 2 and 3.
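The straight-line relation suggests a simple way to read off the exponent: fit a line to (log x, log y). The sketch below does this with ordinary least squares on synthetic, noise-free data; the helper `fit_power_law` and the sample values a = 5, c = 2.5 are assumptions made for illustration. On noisy empirical data this simple fit can be unreliable.

```python
import math

def fit_power_law(xs, ys):
    """Least-squares line through (log x, log y); returns (a, c) for y = a * x**(-c)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly))
    sxx = sum((u - mean_x) ** 2 for u in lx)
    slope = sxy / sxx                      # slope of the log-log line, equal to -c
    intercept = mean_y - slope * mean_x    # equal to log(a)
    return math.exp(intercept), -slope

# Synthetic data generated from y = 5 * x**(-2.5) (illustrative values).
xs = [1, 2, 4, 8, 16, 32]
ys = [5.0 * x ** -2.5 for x in xs]
a_hat, c_hat = fit_power_law(xs, ys)
print(a_hat, c_hat)
```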

Scale free. Power laws have the same functional form at all scales: if $P(x) = x^{-c}$, then $P(ax) = (ax)^{-c} = a^{-c} P(x)$, which is proportional to $P(x)$. Rescaling the independent variable x changes P only by a multiplicative factor. There is no typical scale, hence scale-free.

Power Laws in Networks. The degree distribution often satisfies a power law: the fraction of nodes $f_d$ of degree d is proportional to $d^{-c}$.
Degree d    Fraction f_d = 1/(2d)    Example (12 nodes)
1           1/2                      6/12
2           1/4                      3/12
3           1/6                      2/12
4           ~1/8                     1/12
Source: CSE 522 Algorithmic and Economic Aspects of the Internet. http://www.cs.washington.edu/education/courses/cse522/05au/
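The table can be reproduced from a degree list. This is a small sketch assuming the 12-node example consists of six nodes of degree 1, three of degree 2, two of degree 3 and one of degree 4, as the fractions in the table suggest; the variable names are illustrative.

```python
from collections import Counter

# Degrees of the 12 nodes in the example (assumed from the table's fractions).
degrees = [1] * 6 + [2] * 3 + [3] * 2 + [4] * 1

counts = Counter(degrees)
n = len(degrees)
observed = {d: counts[d] / n for d in sorted(counts)}   # empirical fraction f_d
model = {d: 1 / (2 * d) for d in observed}               # f_d = 1/(2d)

for d in observed:
    print(d, observed[d], model[d])
```

The observed and model fractions agree for degrees 1 to 3; for degree 4 the small sample gives 1/12 where the model predicts 1/8, which is why the table marks that entry as approximate.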

Internet Connections. The distribution of the number of connections a host has to other hosts on the Internet has been shown to follow a power-law distribution. This is still discussed in the community, and there are concerns about the traceroute data. A few nodes (the hubs) maintain the majority of the connections. Queries can therefore be sent toward hubs. High-degree nodes may make the network vulnerable to attacks.

Example: AS Connectivity. Source: http://www.hpl.hp.com/research/idl/papers/ranking/adamicglottometrics.pdf

Observations. Gnutella (v0.6+) and Freenet (v0.7+) support the formation of hubs. They are power law networks. How robust are these networks?

Robustness. Given a certain expected network structure, an interesting question is how easy it is to disrupt the network and partition it into disjoint parts. Cohen et al. have shown analytically that networks in which the vertex connectivity follows a power-law distribution with an exponent alpha < 3 are very robust in the face of random node breakdowns. In their bound, p is a probability bound on network partitioning, m is the minimum node degree, and K is the maximum node degree.

Robustness II The Internet node connectivity has been shown to follow a power-law distribution with alpha=2.5. Similar investigation has been made for the Gnutella P2P network resulting in the observation that alpha = 2.3 Both the Internet and Gnutella present a highly robust topology. They are able to tolerate random node breakdowns.

Resiliency of power-law networks

Gnutella Robustness For a maximum and fairly typical node degree of 20, the Gnutella overlay is partitioned into disjoint parts only when more than 60% of the nodes are down. Robustness is a highly desirable property in a network. The above equation is useful in understanding the robustness of power-law networks; however, it assumes that the node failures are random.
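The equation referred to above is not reproduced in this text. The sketch below assumes the form in which the Cohen et al. result is usually quoted for 2 < alpha < 3: p = 1 - 1/(kappa - 1), with kappa = ((alpha - 2)/(3 - alpha)) * m^(alpha - 2) * K^(3 - alpha). Both this form and the choice m = 1 for Gnutella are assumptions, not taken from the slides.

```python
def breakdown_threshold(alpha, m, K):
    """Fraction p of randomly failed nodes at which the network falls apart.

    Assumed form (valid for 2 < alpha < 3):
        kappa = (alpha - 2) / (3 - alpha) * m**(alpha - 2) * K**(3 - alpha)
        p     = 1 - 1 / (kappa - 1)
    """
    if not 2 < alpha < 3:
        raise ValueError("this form is only stated for 2 < alpha < 3")
    kappa = (alpha - 2) / (3 - alpha) * m ** (alpha - 2) * K ** (3 - alpha)
    if kappa <= 1:
        return 0.0  # the network has no giant component to begin with
    return 1 - 1 / (kappa - 1)

# Gnutella-like parameters: alpha = 2.3, maximum degree 20, assumed minimum degree 1.
p_gnutella = breakdown_threshold(alpha=2.3, m=1, K=20)
print(round(p_gnutella, 3))
```

With these parameters the threshold comes out at roughly 0.6, consistent with the 60% figure quoted for Gnutella.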

Orchestrated Attacks Although a power-law network tolerates random node failures well, it is still vulnerable to selective attacks against nodes. An orchestrated attack against hubs in the network may be very effective in partitioning the network.

Scale-free networks are resilient with respect to random attack. Example: Gnutella network, 20% of nodes removed. Before: 574 nodes in giant component. After: 427 nodes in giant component. Source: www-personal.umich.edu/~ladamic/.../networks/.../ppt/lecture20.ppt

Targeted attacks are effective against scale-free networks. Example: same Gnutella network, 22 most connected nodes removed (2.8% of the nodes). Before: 574 nodes in giant component. After: 301 nodes in giant component. Source: www-personal.umich.edu/~ladamic/.../networks/.../ppt/lecture20.ppt
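The contrast between random failures and targeted attacks can be reproduced in a small simulation. The sketch below does not use the Gnutella data set; it assumes a synthetic graph grown by degree-proportional attachment (2000 nodes, 2 links per new node) and removes 10% of the nodes either at random or in order of decreasing degree. All names and parameters are illustrative.

```python
import random
from collections import deque

def preferential_graph(n, m, rng):
    """Grow a graph in which each new node links to m existing nodes chosen in proportion to degree."""
    adj = {i: set() for i in range(n)}
    endpoints = []  # every node appears once per incident edge
    for u in range(m + 1):               # seed: a small clique of m + 1 nodes
        for v in range(u + 1, m + 1):
            adj[u].add(v)
            adj[v].add(u)
            endpoints += [u, v]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:           # sampling endpoints is degree-proportional
            targets.add(rng.choice(endpoints))
        for t in targets:
            adj[new].add(t)
            adj[t].add(new)
            endpoints += [new, t]
    return adj

def giant_component_size(adj, removed):
    """Size of the largest connected component after deleting the nodes in `removed`."""
    seen = set(removed)
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            u = queue.popleft()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        best = max(best, size)
    return best

rng = random.Random(1)
graph = preferential_graph(2000, 2, rng)
k = 200  # remove 10% of the nodes
random_victims = rng.sample(sorted(graph), k)
hub_victims = sorted(graph, key=lambda u: len(graph[u]), reverse=True)[:k]

giant_before = giant_component_size(graph, [])
giant_random = giant_component_size(graph, random_victims)
giant_targeted = giant_component_size(graph, hub_victims)
print(giant_before, giant_random, giant_targeted)
```

In this setup the giant component shrinks only slightly under random removal but much more when the same number of hubs is removed.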

Small Worlds: Milgram's experiment. The Small-World Problem, Milgram (1967). How many intermediaries are needed to move a letter from person A to person B through a chain of acquaintances? The experiment was designed to find out the average path length. Letter-sending experiment: starting in Nebraska/Kansas, with a target person in Boston. People forwarded the message towards the target person. Six degrees of separation.

Six degrees of separation Source: Wikipedia

Small Worlds. Small-world networks often have a power-law degree distribution. Definition: a small world network is a network with a dense local structure and a diameter comparable to that of a random graph. The term scale-free is also used for these networks. They exhibit clustering and are thus different from random networks. Most nodes have relatively few local connections to other nodes, but a small yet significant number of nodes have large, wide-ranging sets of connections.

Scale-Free Networks

Small World Graphs. The diameter of a graph is the maximum distance (number of edges) between any pair of nodes. The average distance of a graph is the average distance between any pair of nodes. The average connected distance of a graph is the average distance between any pair of connected nodes. A graph exhibits a small world phenomenon if it has low diameter or low average connected distance. Typically, the average distance of a small world graph is on the order of log n (where n is the number of nodes).
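These definitions translate directly into breadth-first search. The sketch below computes the diameter and the average connected distance of a small undirected graph given as an adjacency dictionary; the 8-node ring and the single shortcut are illustrative examples, not from the lecture.

```python
from collections import deque

def bfs_distances(adj, source):
    """Hop distances from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def diameter_and_average_connected_distance(adj):
    """Largest and mean distance over all ordered pairs of distinct, connected nodes.

    For a disconnected graph this returns the largest finite distance.
    """
    longest, total, pairs = 0, 0, 0
    for s in adj:
        for t, d in bfs_distances(adj, s).items():
            if t != s:
                longest = max(longest, d)
                total += d
                pairs += 1
    return longest, (total / pairs if pairs else 0.0)

# A ring of 8 nodes, and the same ring with one long-range shortcut 0-4.
ring = {i: [(i - 1) % 8, (i + 1) % 8] for i in range(8)}
shortcut = {i: list(nbrs) for i, nbrs in ring.items()}
shortcut[0].append(4)
shortcut[4].append(0)

ring_diam, ring_avg = diameter_and_average_connected_distance(ring)
short_diam, short_avg = diameter_and_average_connected_distance(shortcut)
print(ring_diam, ring_avg, short_diam, short_avg)
```

Adding a single shortcut already lowers the average distance of the ring, which is the effect small-world models exploit.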

Small Worlds and Scale Free Networks. The small world phenomenon explains why highly clustered graphs can have short average path lengths. It appears in natural and man-made structures (Watts and Strogatz 1998). It does not explain why this property emerges in real networks. How do power law networks emerge? Nodes connect to well-connected nodes, and the connectivity follows a power law.

Watts-Strogatz model. D. Watts and S. Strogatz proposed a generative model to explain small world properties. Build a ring of n vertices and connect each vertex with its k clockwise neighbours on the ring. Draw a random number between 0 and 1 for each edge. Rewire each edge with probability p: if the edge's random number is smaller than p, keep the source vertex of the edge fixed and choose a new target vertex uniformly at random from all other vertices. [Figure: rings with k = 2 and increasing randomness, from p = 0 to p = 1.] Source: http://www.l3s.de/~balke/lecture-p2p/vorlesung_7.pdf
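The construction steps can be sketched as follows. This is a minimal illustration, not the authors' reference implementation: it assumes n > 2k, and it additionally avoids self-loops and duplicate edges when choosing the new target, which the description above does not spell out.

```python
import random

def watts_strogatz(n, k, p, rng):
    """Ring of n vertices, each linked to its k clockwise neighbours; each edge rewired with probability p."""
    if n <= 2 * k:
        raise ValueError("this sketch assumes n > 2k")
    adj = {u: set() for u in range(n)}
    ring_edges = [(u, (u + step) % n) for u in range(n) for step in range(1, k + 1)]
    for u, v in ring_edges:
        adj[u].add(v)
        adj[v].add(u)
    for u, v in ring_edges:
        if rng.random() < p:
            # Keep the source u fixed; pick a new target among vertices not yet linked to u
            # (assumption: no self-loops, no duplicate edges).
            candidates = [w for w in range(n) if w != u and w not in adj[u]]
            if not candidates:
                continue
            w = rng.choice(candidates)
            adj[u].discard(v)
            adj[v].discard(u)
            adj[u].add(w)
            adj[w].add(u)
    return adj

regular = watts_strogatz(20, 2, 0.0, random.Random(0))      # p = 0: regular ring lattice
small_world = watts_strogatz(20, 2, 0.1, random.Random(0))  # a few shortcuts
randomised = watts_strogatz(20, 2, 1.0, random.Random(0))   # p = 1: every edge rewired
print(sorted(len(nbrs) for nbrs in small_world.values()))
```

Rewiring preserves the number of edges (n times k) while breaking the regular degree pattern as p grows.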

Network type          Clustering   Diameter   Structure
Structured network    high         large      regular
Small-world network   high         small      almost regular
Random network        small        small
Reference: Duncan J. Watts & Steven H. Strogatz, Nature 393, 440-442 (1998)

The Copying Generative Model. Proposed by R. Kumar, P. Raghavan, et al. in 2000. In each time step, randomly copy one of the existing nodes, keeping its links. Connect the original node and the copy. Randomly remove edges from both nodes with a very small probability, and give the removed edges random new target nodes. In this model, the probability of a node getting a new edge is proportional to its degree: more edges increase the probability of a neighbour being chosen.

Barabási-Albert Model. Produces scale-free networks with a power-law node degree distribution. The network grows in time. There is no random edge generation. The higher the degree of an existing vertex, the higher the probability that the new vertex will attach to it (preferential attachment). Generative model: 1. Start with a small network (a number of nodes and edges at random). 2. At every step, add a new vertex x and add m edges from x to existing vertices. Each target is drawn with the probability given by preferential attachment (proportionally to indegree).

Preferential Attachment. Consider a dynamic Web graph. Pages join one at a time. Each page has one link out. Let $X_j(t)$ be the number of pages of degree j at time t. A new page links as follows: with probability α, it links to a random page; with probability (1 - α), it links to a page chosen proportionally to indegree.
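The growth rule can be simulated in a few lines. The sketch below assumes the process starts from a single page and falls back to a uniform choice while no links exist yet; the function name `grow_web_graph`, the seeds and the page count are illustrative.

```python
import random
from collections import Counter

def grow_web_graph(num_pages, alpha, rng):
    """Pages join one at a time with one out-link each.

    With probability alpha the target is a uniformly random existing page,
    otherwise it is chosen proportionally to indegree.
    """
    indegree = [0]        # page 0 starts alone (assumption)
    link_targets = []     # one entry per link; a uniform pick from it is indegree-proportional
    for new_page in range(1, num_pages):
        if not link_targets or rng.random() < alpha:
            target = rng.randrange(new_page)
        else:
            target = rng.choice(link_targets)
        indegree[target] += 1
        link_targets.append(target)
        indegree.append(0)
    return indegree

mostly_preferential = grow_web_graph(5000, alpha=0.1, rng=random.Random(3))
purely_uniform = grow_web_graph(5000, alpha=1.0, rng=random.Random(3))

# X_j at the end of the run: number of pages with indegree j.
x_j = Counter(mostly_preferential)
print(max(mostly_preferential), max(purely_uniform), x_j[0], x_j[1])
```

With mostly preferential linking a few pages collect very large indegrees, while purely uniform linking keeps the maximum indegree small.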

Local Clustering Coefficient. The clustering coefficient C(v) of vertex v in a directed graph is the number of links between the vertices within its neighbourhood divided by the number of links that could possibly exist between them. The neighbourhood consists of the immediately connected neighbours. There are k(k-1) possible links for k vertices; for an undirected graph there are k(k-1)/2 possible links. [Figure: example neighbourhoods of a vertex v with C = 0, C = 1/3 and C = 3/3.] Network average: the average of C(v) over all vertices.
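For the undirected case the definition can be written out as below. The three example neighbourhoods (a vertex v with three neighbours and zero, one or three links among them) are chosen to give C = 0, 1/3 and 3/3; the helper names are illustrative, and taking the network average as the plain mean of C(v) is an assumption about the convention used.

```python
def local_clustering(adj, v):
    """Undirected local clustering: links among v's neighbours divided by k(k-1)/2."""
    nbrs = list(adj[v])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if nbrs[j] in adj[nbrs[i]]
    )
    return links / (k * (k - 1) / 2)

def average_clustering(adj):
    """Network average, taken here as the mean of C(v) over all vertices."""
    return sum(local_clustering(adj, v) for v in adj) / len(adj)

def neighbourhood(extra_links):
    """Vertex v with neighbours a, b, c plus the given links among the neighbours."""
    adj = {"v": {"a", "b", "c"}, "a": {"v"}, "b": {"v"}, "c": {"v"}}
    for x, y in extra_links:
        adj[x].add(y)
        adj[y].add(x)
    return adj

c_none = local_clustering(neighbourhood([]), "v")
c_one = local_clustering(neighbourhood([("a", "b")]), "v")
c_all = local_clustering(neighbourhood([("a", "b"), ("b", "c"), ("a", "c")]), "v")
print(c_none, c_one, c_all)
```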

[Figure: clustering coefficient C(p) and average path length L(p) as functions of the rewiring probability p, ranging from regular through small world to random networks.] Reference: Duncan J. Watts & Steven H. Strogatz, Nature 393, 440-442 (1998)

Kleinberg's result. Jon Kleinberg showed that it is possible to do efficient routing on grids with the small world property. The possibility of efficient routing depends on a balance between the proportions of shortcut edges of different lengths with respect to coordinates in the base grid. The key idea is to use a frequency of edges of different lengths that decreases in inverse proportion to the length.

Kleinberg Small World. A set of points on an n x n grid. Each node only has local information. Routing table creation needs to know the coordinate system. Manhattan distance is used (the sum of the horizontal and vertical components in a grid).

Kleinberg's result II. Results in an infinite family of small world network models on a grid with power-law distributed random long-range links, K(n, k, p, q, r):
p: radius of neighbours to which a node has short local links
q: number of random long-range links
k: dimension of the mesh
r: clustering exponent of the inverse power-law distribution, $\Pr[(x, y)] \propto \mathrm{dist}(x, y)^{-r}$
Note: Manhattan distance is used (the sum of the horizontal and vertical components in a grid). r determines how steeply the probability of links to far-away neighbours decreases.

Searching in the Small World. Simple greedy routing: use the routing table to find the link that takes the message closest to the target. This assumes that there is a way to associate nodes with points on the grid in order to find the closest node.

Search processing. A node makes a greedy routing decision based on the coordinates of its local and long-range contacts, the coordinates of the nodes that the message was previously routed through, and the coordinates of the target node. The message that needs to be delivered carries with it the coordinates of the target node and of the nodes already visited.

Example. Node u is connected to all its neighbours (a, b, c, and d) and has a long-range link to some randomly chosen node v with a probability proportional to $\mathrm{dist}(u, v)^{-r}$. Using only the local neighbours, reaching a destination takes O(n) hops. If the clustering exponent r is zero, the long-range links are too random. If r is large, there are too few far-reaching links. On a two-dimensional grid, r = 2 is the optimal value (links are uniformly distributed over all distance scales). This results in a logarithmic diameter for the network.

Constructing the graph. Every node i is connected to every node j within distance d. Nodes at a higher distance are connected with a probability that decreases with growing distance. For every node i, q additional edges are added. The probability that node j is selected is proportional to $d(i, j)^{-r}$. The nodes are selected by generating q random numbers, based on the distribution, that are distances on the graph, and then choosing a node at that distance.
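Construction and greedy search can be combined in a short sketch. It assumes a two-dimensional n x n grid without wrap-around, local links to the four lattice neighbours only (radius 1), and it samples each long-range contact directly with weight dist^(-r) instead of first drawing a distance; all function names and parameter values are illustrative.

```python
import random

def manhattan(a, b):
    """Lattice distance: sum of the horizontal and vertical components."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def build_kleinberg_grid(n, q, r, rng):
    """Contacts per node: the lattice neighbours plus q long-range links drawn with weight dist**(-r)."""
    nodes = [(x, y) for x in range(n) for y in range(n)]
    contacts = {}
    for u in nodes:
        x, y = u
        local = [
            (x + dx, y + dy)
            for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
            if 0 <= x + dx < n and 0 <= y + dy < n
        ]
        others = [v for v in nodes if v != u]
        weights = [manhattan(u, v) ** (-r) for v in others]
        long_range = rng.choices(others, weights=weights, k=q)
        contacts[u] = local + long_range
    return contacts

def greedy_route(contacts, source, target):
    """Always forward to the contact closest to the target, using only local information."""
    path = [source]
    current = source
    while current != target:
        # A lattice neighbour is always strictly closer, so the loop terminates.
        current = min(contacts[current], key=lambda v: manhattan(v, target))
        path.append(current)
    return path

rng = random.Random(5)
contacts = build_kleinberg_grid(n=15, q=1, r=2, rng=rng)
source, target = (0, 0), (14, 14)
path = greedy_route(contacts, source, target)
print(len(path) - 1, "hops; lattice distance", manhattan(source, target))
```

Because every hop reduces the lattice distance by at least one, the hop count never exceeds the Manhattan distance; the long-range links are what make it shorter.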

Delivery Time in Lattice Networks. For k = 2, there is a dip in time-to-search at r = 2. For low r, the graph is random, with no geographic correlation in links. For high r, it is not a small world, and there are no short paths to be found. The dip at r = 2 is observed in simulation. This corresponds to using the greedy heuristic of sending the message to the node with the least lattice distance to the goal. Expected delivery time: $O((\log n)^2)$ for r = 2 (and the special case k = r); $\Omega(n^{(2-r)/3})$ for $0 \le r < 2$; $\Omega(n^{(r-2)/(r-1)})$ for $r > 2$.

Theorem. The routing algorithm will find short paths if and only if k = r (k is the dimension, r is the clustering exponent). The idea behind the proof is that for any r < k the long-range links are too random: short paths exist, but the links carry too little geographic information for a greedy, local algorithm to find them. For r > k the long-range links are too short: there are too few far-reaching links, so the message needs many hops to cover the distance to the target.

Kleinberg's result III. Simple greedy routing can find routes in $O(\log^2 n)$ hops, where n is the size of the graph. It is decentralized: decisions are based on local information. Later work has investigated topologies other than grids (such as rings) and improving efficiency through topology information, cues, etc. Implication of the result: a greedy and local solution for building peer-to-peer overlay networks. Note: the mathematical assumptions need to hold! If they do not, efficient decentralized search is not possible.

Freenet. The original Freenet did not take the small world property into account in routing, so there was no guarantee of the existence of an efficient decentralized search. Freenet 0.7 featured the new clustering technique based on node locations. How to map to Kleinberg's model?

Freenet Idea. Assume that the network exhibits small world properties. It should then be possible to recover an embedded Kleinberg small-world graph. This is accomplished by selecting random pairs of nodes and potentially swapping them based on an objective function. The function minimizes the product of all the distances between any given node and its neighbors.

Freenet Routing Revisited. Every file has a key (derived via a hash function). A file is stored at some node with a similar key. At each peer, a request is forwarded to the node in its routing table that has the closest key to the requested one. If the request is successful, the file is sent back via the routing nodes, and each node saves the file and adds the sending node's address to its local routing table (i.e., frequently requested files are replicated). If the routing table is full, a random entry is evicted. Clustering and caching are used to achieve the small world network benefits in routing.

Freenet Mapping I. Kleinberg's model has a base lattice (and then the random network of long-range contacts). Sandberg's Freenet routing algorithm assumes that a Freenet graph corresponds to the long-range contacts in some unknown k-dimensional Kleinberg network. If we can find the underlying base lattice, we can then determine the metric for discovering short paths. Sandberg has shown that finding the unknown base grid can be cast as a problem in statistical estimation in which the lattice coordinates of the actual Freenet nodes are the parameters to be estimated.

Freenet Mapping II The aim of Freenet is to update node locations so that their assignment results in a small-world embedding in an imaginary base ring lattice This is realized by having the nodes examine their location keys periodically for a possible location swap Location swap does not alter the network topology, just the location identifiers! But they may swap data items. Location swap will affect future search requests.

Location swapping details
1. A node A randomly chooses a node B in its proximity and initiates a swap request. Both nodes share the locations of their respective neighbors and calculate $D_1(A, B)$, the product of the existing distances between A and each of A's neighbors, multiplied by the product of the existing distances between B and each of B's neighbors:
$$D_1(A, B) = \prod_{(A,n) \in E} |L(A) - L(n)| \cdot \prod_{(B,n) \in E} |L(B) - L(n)| \qquad (1)$$
2. The nodes also compute $D_2(A, B)$, the product of the products of the differences between their locations and their neighbors' locations after a potential swap:
$$D_2(A, B) = \prod_{(A,n) \in E} |L(B) - L(n)| \cdot \prod_{(B,n) \in E} |L(A) - L(n)| \qquad (2)$$
3. If the nodes find that $D_2(A, B) \le D_1(A, B)$, they swap locations; otherwise they swap locations with probability $D_1(A, B) / D_2(A, B)$. The deterministic swap always decreases the average distances of nodes with their neighbors. The probabilistic swap is used to escape local minima.
Figure 2. An example network with two nodes considering a swap. The result of the swap equation is $D_1 = .60 \cdot .65 \cdot .25 \cdot .50 = .04875$ and $D_2 = .30 \cdot .35 \cdot .05 \cdot .80 = .0042$. Since $D_1 > D_2$, they swap.
Source: N. Evans et al. Routing in the Dark: Pitch Black
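The swap decision can be sketched as below. The sketch assumes that locations lie on a circle of circumference 1, that distances are measured around that circle, and that A and B are not neighbours of each other; the function names and the example locations are hypothetical and not taken from the Freenet code base.

```python
import math
import random

def ring_distance(a, b):
    """Distance between two locations on a circle of circumference 1 (assumed metric)."""
    d = abs(a - b)
    return min(d, 1.0 - d)

def swap_products(loc_a, loc_b, nbrs_a, nbrs_b):
    """Return (D1, D2): products of neighbour distances before and after a potential swap."""
    d1 = math.prod(ring_distance(loc_a, n) for n in nbrs_a) * \
         math.prod(ring_distance(loc_b, n) for n in nbrs_b)
    d2 = math.prod(ring_distance(loc_b, n) for n in nbrs_a) * \
         math.prod(ring_distance(loc_a, n) for n in nbrs_b)
    return d1, d2

def decide_swap(d1, d2, rng=random):
    """Swap if D2 <= D1, otherwise swap with probability D1 / D2."""
    if d2 <= d1:
        return True
    return rng.random() < d1 / d2

# The distances quoted in the figure caption give a deterministic swap.
d1_fig = math.prod([0.60, 0.65, 0.25, 0.50])
d2_fig = math.prod([0.30, 0.35, 0.05, 0.80])
swap_fig = decide_swap(d1_fig, d2_fig)

# Hypothetical locations: A sits far from its neighbours, B sits far from its own.
d1, d2 = swap_products(loc_a=0.10, loc_b=0.62, nbrs_a=[0.60, 0.65], nbrs_b=[0.05, 0.15])
swap_example = decide_swap(d1, d2)
print(d1_fig, d2_fig, swap_fig, swap_example)
```

In the hypothetical example each node is much closer to the other node's neighbours than to its own, so exchanging the two location identifiers shrinks the product and the swap is accepted deterministically.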

Is Freenet a small world? There must be a scale-free power-law distribution of links within the network. Source: www.ics.forth.gr/dcs/activities/projects/p2p/ploumid-freenet.ppt

Applications of Small World Networks. There are many applications in peer-to-peer networks. The Gnutella network has been observed to exhibit the clustering and short path lengths of a small world network. Its overlay dynamics lead to a biased connectivity among peers, where each peer is more likely to be connected to peers with higher uptime. The Freenet routing algorithm is built on the small world assumption. Other applications include distributed hashing (DHTs), such as Symphony, which uses long-range contacts drawn randomly from a family of harmonic distributions.

Stages of power law network research (M. Mitzenmacher, 2003). There are 5 stages of power law network research. 1) Observe: Gather data to demonstrate power law behavior in a system. 2) Interpret: Explain the importance of this observation in the system context. 3) Model: Propose an underlying model for the observed behavior of the system. 4) Validate: Find data to validate (and if necessary specialize or modify) the model. 5) Control: Design ways to control and modify the underlying behavior of the system based on the model.

References
Barabási, Albert-László. Linked: The New Science of Networks, 2002. ISBN 0-452-28439-2.
M. Mitzenmacher. A brief history of generative models for power law and lognormal distributions. Internet Mathematics, 2003. http://www.uvm.edu/~pdodds/research/papers/others/2003/mitzenmacher2003a.pdf
D. J. Watts and S. H. Strogatz. Collective dynamics of 'small-world' networks. Nature 393:440-42, 1998.
D. J. Watts. Networks, Dynamics and Small-World Phenomenon. American Journal of Sociology, Vol. 105, Number 2, 493-527, 1999.
M. E. J. Newman. Random graphs as models of networks. In Handbook of Graphs and Networks, S. Bornholdt and H. G. Schuster (eds.), Wiley-VCH, Berlin, 2003.
J. Kleinberg. The small-world phenomenon: An algorithmic perspective. Proc. 32nd ACM Symposium on Theory of Computing, 2000.

Terminology summary
Small world network: a network with a dense local structure and a diameter comparable to that of a random graph. It typically has a power-law degree distribution, but the distribution can be something else as well.
Power-law network: a network with a power-law node degree distribution.
Scale free network: a power-law network.