Temporal Dynamics of Scale-Free Networks

Erez Shmueli, Yaniv Altshuler, and Alex Sandy Pentland
MIT Media Lab
{shmueli,yanival,sandy}@media.mit.edu

Abstract. Many social, biological, and technological networks display substantial non-trivial topological features. One well-known and much studied feature of such networks is the scale-free power-law distribution of node degrees. Several works further suggest models for generating complex networks which comply with one or more of these topological features. For example, the well-known Barabási-Albert preferential attachment model describes how to create scale-free networks. Since the main focus of these generative models is capturing one or more of the static topological features of complex networks, they are very limited in capturing the temporal dynamics of a network's evolution. Therefore, when studying real-world networks, the following question arises: what is the mechanism that governs changes in the network over time? In order to shed some light on this topic, we study two years of data that we received from eToro, the world's largest social financial trading company. We report three key findings. First, we demonstrate how the network topology may change significantly over time. More specifically, we illustrate how popular nodes may become far less popular, and emerging new nodes may become extremely popular, in a very short time. Then, we show that although the network may change significantly over time, the degrees of its nodes obey the power-law model at any given time. Finally, we observe that the magnitude of change between consecutive states of the network also presents a power-law effect.

1 Introduction

Many social, biological, and technological networks display substantial non-trivial topological features. One well-known and much studied feature of such networks is the scale-free power-law distribution of node degrees [4].
That is, the degree of nodes is distributed according to the following formula: $P[d] = c \cdot d^{-\lambda}$, where c is a normalizing constant and $\lambda$ is the power-law exponent. As the study of complex networks has continued to grow in importance and popularity, many other features have attracted attention as well. These include, among others: short path lengths and a high clustering coefficient [12, 2], assortativity or disassortativity among vertices [10], community structure [8] and hierarchical structure [11] for undirected networks, and reciprocity [7] and the triad significance profile [9] for directed networks.

Several works further suggested models for generating complex networks which comply with one or more of these topological features. For example, the well-known Barabási-Albert model [4] describes how to create scale-free networks. It incorporates two important general concepts: growth and preferential attachment. Growth means that the number of nodes in the network increases over time, and preferential attachment means that the more connected a node is, the more likely it is to receive new links. More specifically, the network begins with an initial connected network of $m_0$ nodes. New nodes are added to the network one at a time. Each new node is connected to $m \le m_0$ existing nodes, each chosen with a probability that is proportional to the number of links that the existing node already has.

More sophisticated models for creating scale-free networks exist. For example, in [6], at each time step, apart from the m new edges between the new node and the old nodes, $m_c$ new edges are created between the old nodes, where the probability that a new edge is attached to existing nodes of degrees $d_1$ and $d_2$ is proportional to $d_1 d_2$. Rewiring of edges produces a very similar effect [1]. That is, instead of creating connections between nodes in the existing network, at each time step, $m_r$ randomly chosen vertices lose one of their connections. In $m_{rr}$ of these cases, the free end is attached to a random vertex. In the remaining $m_{rp} = m_r - m_{rr}$ cases, the free end is attached to a preferentially chosen vertex.

The main focus of these generative models is capturing one or more of the static topological features of complex networks. However, these models are very limited in capturing the temporal dynamics of a network's evolution. Therefore, when studying real-world networks, the following question arises: what is the mechanism that governs changes in the network over time?

In order to shed some light on this question, we studied two years of data (from 2011/7/1 to 2013/6/30) that we received from eToro, the world's largest social financial trading company. We report three key findings. First, we demonstrate how the network topology may change significantly over time. More specifically, we illustrate how popular nodes may become far less popular, and emerging new nodes may become extremely popular, in a very short time.
Then, we show that although the network may change significantly over time, the degrees of its nodes obey the power-law model at any given time. Finally, we observe that the magnitude of change between consecutive states of the network also presents a power-law effect.

2 Datasets

Our data come from eToro, the world's largest social financial trading company (see http://www.etoro.com). eToro is an online discount retail broker for foreign exchange and commodities trading, with easy-to-use buying and short selling mechanisms as well as leverage of up to 4 times. Similarly to other trading platforms, eToro allows users to trade between currency pairs individually (see Fig. ??). In addition, eToro provides a social network platform which allows users to watch the financial trading activity of other users (displayed in a number of statistical ways) and copy their trades (see Fig. 1). More specifically, users in eToro can place three types of trades:

(1) Single trade: the user places a normal trade by himself.
(2) Copy trade: the user copies one single trade of another user.
(3) Mirror trade: the user picks a target user to copy, and eToro automatically places all trades of the target user on behalf of the user.

Our data contain over 67 million trades that were placed between 2011/7/1 and 2013/6/30. More than 53 million of these trades are automatically executed mirror trades, fewer than 25 thousand are copy trades and roughly 13 million are single trades. The total number of unique traders is roughly 275 thousand and the total number of unique mirror operations is roughly 85 thousand (one mirror operation may result in several mirror trades).

Fig. 1. The eToro platform, illustrating the trading portfolio of a single user (left) and the trading activity of all users (right).

In the remainder of this paper, we use these trades to construct snapshot networks, as we proceed to describe. Given a start time s and an end time e, the nodes of the snapshot network consist of all users that had at least one trade open at some point in time between s and e. An edge from user u to user v exists if and only if user u was mirroring user v at some point in time between s and e.

Figure 2 illustrates how the size of the eToro network grows over time, in terms of both the number of nodes and the number of edges. For each day during the two-year period, a snapshot network is constructed, and the numbers of nodes and edges of that network are counted.

Fig. 2. The size of the eToro network in terms of the number of nodes (left) and the number of edges (right) over time.
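The snapshot construction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record types `Trade` and `Mirror`, their field names, and the use of closed date intervals are hypothetical and are not taken from the eToro data format, which the paper does not describe.

```python
from datetime import date
from typing import NamedTuple


class Trade(NamedTuple):
    """Hypothetical trade record: a user and the period the trade was open."""
    user: str
    opened: date
    closed: date


class Mirror(NamedTuple):
    """Hypothetical mirror operation: follower u mirrors target v for a period."""
    follower: str
    target: str
    start: date
    end: date


def overlaps(a_start, a_end, s, e):
    """True if the closed interval [a_start, a_end] intersects [s, e]."""
    return a_start <= e and a_end >= s


def snapshot(trades, mirrors, s, e):
    """Build the snapshot network for the window [s, e].

    Nodes: users with at least one trade open at some point in [s, e].
    Edges: (u, v) whenever u was mirroring v at some point in [s, e].
    The two sets are built independently, following the definition literally.
    """
    nodes = {t.user for t in trades if overlaps(t.opened, t.closed, s, e)}
    edges = {
        (m.follower, m.target)
        for m in mirrors
        if overlaps(m.start, m.end, s, e)
    }
    return nodes, edges
```

Calling `snapshot` once per day with `s == e` would give the daily series of node and edge counts of the kind plotted in Fig. 2; longer windows give multi-month snapshots.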

3 Results

First, we examined the in-degrees of nodes in the eToro network over the entire period of two years. As can be seen in Figure 3, the degree distribution presents a strong power-law pattern. Although expected, this result is non-trivial. One might expect to see a small group of users who are mirrored by all the others, but what we actually observe is a heavy tail of users with only a few followers each. This result is consistent with the observation in [3], where the authors demonstrate by simulation that the degree distribution of social-learning networks converges to a power-law distribution, regardless of the underlying social network topology.

Fig. 3. In-degree distribution of nodes in the entire eToro network (fitted exponent γ = 1.64). The in-degree of a node is the number of traders mirroring the trader represented by that node.

Next, we investigated how the popularity of traders in eToro, in terms of the number of mirroring traders, changes over time. Fig. 4 illustrates the popularity of four traders. As can be seen in the figure, popular traders may become far less popular, and emerging new traders may become extremely popular, in a very short time. Note how this behavior differs significantly from the "rich get richer" behavior assumed by state-of-the-art models.

Fig. 4. The in-degree (number of mirroring traders) of four nodes in the evolving eToro network, depicting the popularity of the four corresponding traders over time.

To illustrate this point further, we checked how similar different snapshots of the network are. Figure 5 presents the top 50 popular nodes for four different time periods: July-September 2011 (snapshot 1), January-March 2012 (snapshot 2), July-September 2012 (snapshot 3) and January-March 2013 (snapshot 4). That is, four three-month snapshots with three-month gaps in between. As can be seen in the figure, only 11 nodes that were included in the top 50 popular nodes of snapshot 1 remained in the top 50 popular nodes of snapshot 2; only 17 nodes that were included in the top 50 popular nodes of snapshot 2 remained in the top 50 popular nodes of snapshot 3; and only 19 nodes that were included in the top 50 popular nodes of snapshot 3 remained in the top 50 popular nodes of snapshot 4. That is, the network may change significantly over time.

Fig. 5. The 50 most popular nodes in each one of the four snapshots. Green nodes represent nodes that are included in the 50 most popular nodes of the current snapshot but were not included in the previous one. Red nodes represent nodes that were included in the 50 most popular nodes of the previous snapshot but are not included in the current one. Blue nodes represent nodes that were included in both snapshots. The area of a node's circle is proportional to its popularity.

We then examined the degree distribution for each one of the four snapshots above. As can be seen in Figure 6, although the four snapshots differ significantly, the degree distribution of each one of them obeys the power-law model.

Fig. 6. Degree distribution for each one of the four snapshots that are shown in Figure 5 (fitted exponents: γ = 1.52 for snapshot 1, γ = 1.63 for snapshot 2, γ = 1.64 for snapshot 3, γ = 1.65 for snapshot 4).

Next, we studied more carefully how the eToro network changes between consecutive days. More specifically, we measured the number of added edges (i.e., edges that did not appear in the previous day and appear in the current day) and the number of removed edges (i.e., edges that appeared in the previous day and do not appear in the current day).
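The two counts just defined are plain set differences between the edge sets of two consecutive daily snapshots. A minimal sketch, assuming each snapshot's edges are available as a collection of `(follower, target)` pairs (the function name `edge_changes` is hypothetical):

```python
def edge_changes(prev_edges, curr_edges):
    """Return (added, removed) edge sets between two consecutive snapshots.

    added:   edges present in the current day but not in the previous day.
    removed: edges present in the previous day but not in the current day.
    """
    prev_set, curr_set = set(prev_edges), set(curr_edges)
    added = curr_set - prev_set
    removed = prev_set - curr_set
    return added, removed


# Example with two tiny, made-up daily edge sets.
yesterday = {("u1", "v1"), ("u2", "v1")}
today = {("u1", "v1"), ("u3", "v2")}
added, removed = edge_changes(yesterday, today)
```

The numbers of added and removed edges are then `len(added)` and `len(removed)`.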
Since the size of the eToro network grows over time (see Fig. 2), we normalized the above quantities by dividing them by the number of edges that were present in the previous day. We found that the normalized magnitude of change between each two consecutive snapshots (according to each one of the two measures) follows a power-law distribution (see Figure 7).
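The normalization and a rough exponent estimate can be sketched as follows. The exponent estimator shown is the standard continuous maximum-likelihood estimator from the power-law fitting literature cited as [5], $\hat{\alpha} = 1 + n / \sum_i \ln(x_i / x_{\min})$; whether the exponents reported here were obtained with exactly this estimator is an assumption, and the function names are hypothetical. The lower cutoff `xmin` is taken as given rather than estimated.

```python
import math


def normalized_change(change_count, prev_edge_count):
    """Divide a daily change count by the number of edges in the previous day."""
    if prev_edge_count <= 0:
        raise ValueError("previous snapshot must contain at least one edge")
    return change_count / prev_edge_count


def powerlaw_exponent_mle(values, xmin):
    """Continuous maximum-likelihood estimate of a power-law exponent.

    Only observations with x >= xmin enter the estimate:
        alpha = 1 + n / sum(ln(x_i / xmin))
    """
    tail = [x for x in values if x >= xmin]
    if not tail:
        raise ValueError("no observations at or above xmin")
    log_sum = sum(math.log(x / xmin) for x in tail)
    if log_sum == 0:
        raise ValueError("all observations equal xmin; exponent is undefined")
    return 1.0 + len(tail) / log_sum
```

A point estimate of this kind does not by itself show that a power law is a good model; the goodness-of-fit and alternative-distribution tests discussed later in this section are still needed.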

Fig. 7. Distribution of the normalized changes in the eToro network: added edges (left, γ = 2.88) and removed edges (right, γ = 2.8).

In order to better understand this finding, we tried to break down the overall network changes into two smaller components. First, we measured the changes by taking into account only the nodes that were added and removed between the two consecutive days. That is, we considered only users that were not trading in the previous day but are trading in the current day, and users that were trading in the previous day but are not trading in the current day. As can be seen in the top two subfigures of Figure 8, the normalized number of added and removed nodes also follows a power-law distribution. That is, on most days, only a small number of nodes are added to or removed from the network, but occasionally, a large number of nodes are added or removed. We repeated the same analysis, taking into account only the edges for which at least one endpoint was added or removed. As can be seen in the bottom two subfigures of Figure 8, the result was again a power-law distribution.

Then, we measured the changes by taking into account only the nodes that existed in both of the two consecutive days. That is, we considered only users that were trading in the previous day and are also trading in the current day. As can be seen in Figure 9, even when only the common nodes are considered, the normalized number of added and removed edges follows a power-law distribution.

Our results were validated using the statistical tests for power-law distributions that were suggested in [5]. First, we applied the goodness of fit test. As can be seen in Table 1, the p-values for all cases are greater than 0.1, as required. Second, we tested alternative types of distribution.
As can be seen in the table, the distribution is more likely to be truncated power-law than general power-law in all cases (the GOF value is negative), and the result is significant in three out of eight of the cases (the p-values are lower than 0.05); the distribution is more likely to be truncated power-law than exponential, and the result is significant in five out of eight of the cases; and the distribution is more likely to be truncated power-law than log-normal in all cases, and the result is significant in five out of eight of the cases.

Fig. 8. Distribution of the normalized changes in the eToro network, as reflected by the added and removed nodes: added nodes (top left, γ = 3.64), removed nodes (top right, γ = 3.24), added edges (bottom left, γ = 3.13) and removed edges (bottom right, γ = 3.).

Fig. 9. Distribution of the normalized changes in the eToro network, as reflected by the common nodes: added edges (left, γ = 2.87) and removed edges (right, γ = 2.61).

Fig.  Subfigure      xmin  alpha  Goodness  Power-Law vs.     Trunc. Power-Law  Trunc. Power-Law
                                  of Fit    Trunc. Power-Law  vs. Exponential   vs. Log-Normal
7     added edges    .24   2.88   .121      (-) .18           (+) .12           (+) .396
7     removed edges  .25   2.8    .27       (-) .12           (+) .8            (+) .
8     added nodes    .73   3.64   .613      (-) .613          (+) .93           (+) .732
8     removed nodes  .23   3.24   .111      (-) .99           (+) .16           (+) .5
8     added edges    .18   3.13   .545      (-) .544          (+) .63           (+) .411
8     removed edges  .12   3.     .11       (-) .18           (+) .159          (+) .6
9     added edges    .14   2.87   .123      (-) .39           (+) .27           (+) .32
9     removed edges  .14   2.61   .131      (-) .9            (+) .14           (+) .

Table 1. Statistical tests for power-law distributions. The numbers in the three right columns represent the p-value and the sign of the GOF value in brackets.

4 Summary and Future Work

In this paper, we investigate how scale-free networks evolve over time. Studying a real-world network, we find that: (1) the network topology may change significantly over time, (2) the degree distribution of nodes in the network obeys the power-law model at any given state and (3) the magnitude of change between consecutive states of the network also presents a power-law effect.

A better understanding of the temporal dynamics of scale-free networks would allow us to develop improved and more realistic algorithms for generating networks. Moreover, it would help us in better predicting future states of the network and estimating their probabilities. For example, it may help in bounding the probability that a given node remains popular over a certain period of time.

In future work we intend to check how the distribution of changes between consecutive states of the network influences the overall network performance. We hypothesize that in cases where the distribution of changes is closer to a power-law distribution, the overall network performance would be higher. Furthermore, we would like to investigate the mechanism that is responsible for the power-law shape of the distribution. Finally, we would like to suggest a generative model for networks based on the above findings.

References

1. Albert, R., and Barabási, A.-L. Topology of evolving networks: local events and universality. Physical Review Letters 85, 24 (2000), 5234.
2. Amaral, L. A. N., Scala, A., Barthélémy, M., and Stanley, H. E. Classes of small-world networks. Proceedings of the National Academy of Sciences 97, 21 (2000), 11149-11152.
3. Anghel, M., Toroczkai, Z., Bassler, K. E., and Korniss, G. Competition-driven network dynamics: Emergence of a scale-free leadership structure and collective efficiency. Physical Review Letters 92, 5 (2004), 058701.
4. Barabási, A.-L., and Albert, R. Emergence of scaling in random networks. Science 286, 5439 (1999), 509-512.
5. Clauset, A., Shalizi, C. R., and Newman, M. E. Power-law distributions in empirical data. SIAM Review 51, 4 (2009), 661-703.
6. Dorogovtsev, S. N., and Mendes, J. F. F. Scaling behaviour of developing and decaying networks. EPL (Europhysics Letters) 52, 1 (2000), 33.
7. Garlaschelli, D., and Loffredo, M. I. Patterns of link reciprocity in directed networks. Physical Review Letters 93, 26 (2004), 268701.
8. Girvan, M., and Newman, M. E. Community structure in social and biological networks. Proceedings of the National Academy of Sciences 99, 12 (2002), 7821-7826.
9. Milo, R., Itzkovitz, S., Kashtan, N., Levitt, R., Shen-Orr, S., Ayzenshtat, I., Sheffer, M., and Alon, U. Superfamilies of evolved and designed networks. Science 303, 5663 (2004), 1538-1542.
10. Newman, M. E. Assortative mixing in networks. Physical Review Letters 89, 20 (2002), 208701.
11. Ravasz, E., and Barabási, A.-L. Hierarchical organization in complex networks. Physical Review E 67, 2 (2003), 026112.
12. Watts, D. J., and Strogatz, S. H. Collective dynamics of "small-world" networks. Nature 393, 6684 (1998), 440-442.