Hierarchical Organization of Railway Networks




Praveen R, Animesh Mukherjee, and Niloy Ganguly
Department of Computer Science and Engineering, Indian Institute of Technology Kharagpur, India 721302
(Dated: May 21, 2008)

I. INTRODUCTION

In recent years, networks have proved to be highly useful in modeling and characterizing complex systems that occur in various branches of science. In fact, almost any large-scale system, be it natural or man-made, can be viewed as a network of interacting entities that is complex, irregular, and usually changing over time. One of the most important observations that researchers have repeatedly made is that complex networks often exhibit hierarchical organization [2, 4–7, 10]. The nodes in the network can be divided into groups, which can be further subdivided into smaller (and possibly more cohesive) groups, and so forth. In most cases these groups have been shown to correspond to functional units such as friendship communities in social networks [2], modules in biochemical networks [4, 5, 10], and sound patterns in consonant and vowel inventories [6, 7].

Railways are perhaps one of the most important means of transportation for any nation. In this context, it becomes necessary to study the structural properties of Railway Networks (henceforth RN). An RN can be conceived of as a set of nodes representing stations, with a link between two nodes representing a direct train connection between them. The number of direct trains between two stations defines the weight of the edge between the nodes representing those stations. Although there have been some attempts to explore the small-world properties of RNs for different countries [3, 9], none of them investigates the hidden structural patterns of these networks.
Since railways play a crucial role in shaping the economy of a nation, it is important to study these hidden structural patterns, which can be used not only for a more effective distribution of new trains but also for better planning of the railway budget. Therefore, the primary objective of this work is to analyze the hierarchical organization of two RNs (for which we could collect data), namely the Indian Railway Network (IRN) and the German Railway Network (GRN). Apart from measuring the standard small-world properties (clustering coefficient, average path length, and diameter), we conduct a detailed community structure analysis of the two networks in order to capture the basis of their organization. Since the networks are weighted, we employ the Modified Radicchi et al. algorithm (henceforth MRad) (see [6–8] for references) to detect the community structures. We observe that the pattern of community formation reflects the underlying economic geography of the region, with each community including within it a few hub nodes (i.e., important junctions). Furthermore, the inter-community hubs are well connected with each other.

The rest of the paper is laid out as follows. In section II we formally define the two networks, outline their construction procedure, and present their basic structural properties. In section III, we briefly review the important steps of the MRad algorithm and present the results of the community analysis for the two networks. We conclude in section IV by summarizing our contributions, pointing out some of the implications of the current work, and indicating possible future directions.

Electronic addresses: praveen.321@gmail.com; animeshm@cse.iitkgp.ernet.in; niloy@cse.iitkgp.ernet.in

II. DEFINITION, CONSTRUCTION AND STRUCTURAL PROPERTIES OF THE TWO RAILWAY NETWORKS

Definition: An RN can be represented as a graph $G = \langle V_S, E \rangle$, where $V_S$ is the set of nodes labeled by the stations and $E$ is the set of edges connecting these stations. There is an edge $e \in E$ between two nodes if and only if there exists a direct train between the stations represented by these nodes. The weight of the edge $e$ (also edge-weight) is the number of direct trains between the stations connected by $e$. We shall call this network of stations the Station-Station Network or StaNet. Consequently, the Indian RN is StaNet_IR and the German RN is StaNet_GR. Figure 1 presents a hypothetical illustration of StaNet_IR and StaNet_GR.

FIG. 1: A hypothetical illustration of the nodes and edges of StaNet_IR and StaNet_GR. The labels of the nodes denote the names of the stations. The numerical values against the edges represent their corresponding weights. For example, the number of direct trains from BANKURA to BISHNUPUR is 7.

Construction: In order to construct StaNet_IR we collected the data from http://www.indianrail.go.in, which is the official website of the Indian railways. The website hosts information on around 2764 stations and approximately 1377 trains halting at one or more of these stations. Consequently, the number of nodes in StaNet_IR is 2764. For the German railways, we could collect only a small amount of data, consisting of 80 stations (equal to the number of nodes in StaNet_GR) and around 100 trains passing through them, from the regional timetable archive distributed on compact disc as the Deutsche Bahn Electronic Timetable (see http://www.hacon.de/hafas e/ce promo.shtml for further information).

Structural Properties: The study of the standard small-world properties of the two networks, i.e., the clustering coefficient, the average geodesic path length, and the diameter, indicates that our results agree with the observations made by earlier researchers in [9] and [3], respectively.

TABLE I: Structural properties of the two networks. CC: Clustering Coefficient.

Properties           StaNet_IR    StaNet_GR
Weighted CC          0.79         0.75
Avg. Path Length     2.43         1.76
Diameter             4.00         3.00

Clustering Coefficient: The clustering coefficient for a node i is the number of links between the neighbors of i divided by the number of links that could possibly exist between them. For instance, in a friendship network it represents the probability that two friends of the person i are also friends themselves. For weighted graphs such as StaNet_IR and StaNet_GR, this definition has been suitably modified in [1]. Table I reports the values of the weighted clustering coefficient for the networks. These high values (0.79 and 0.75) imply that, in the railway systems of both nations, the neighboring stations of a station are also highly connected to each other via direct trains.

Average Geodesic Path Length and Diameter: The average length of the geodesic (or shortest) path between any two arbitrary nodes is a measure of how well the network is connected. The short average path lengths in Table I (2.43 and 1.76) imply that both StaNet_IR and StaNet_GR have very high connectivity. Furthermore, the diameter, which is the length of the longest shortest path between any two nodes, is also small for both networks (see Table I). This indicates that any arbitrary station in the network can be reached from any other arbitrary station through only a very few hops.
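As an illustration of how the quantities discussed in this section can be computed, the following is a minimal sketch and not the authors' code. It assumes that "a direct train between two stations" means both stations appear on the same train's list of halts, that path lengths are counted in hops, and that the weighted clustering coefficient follows the definition of Barrat et al. [1], $c_i^w = \frac{1}{s_i (k_i - 1)} \sum_{j,h} \frac{w_{ij} + w_{ih}}{2} a_{ij} a_{ih} a_{jh}$. The station names, the `TOY_TRAINS` data, and all function names are hypothetical.

```python
from collections import deque
from itertools import combinations

# Hypothetical toy timetable: each train is the list of stations it halts at.
TOY_TRAINS = [["A", "B", "C"], ["A", "B"], ["B", "C", "D"], ["D", "E"]]

def build_stanet(trains):
    """Weighted station graph: w[u][v] = number of trains halting at both u and v."""
    w = {}
    for stops in trains:
        for u, v in combinations(sorted(set(stops)), 2):
            w.setdefault(u, {}).setdefault(v, 0)
            w.setdefault(v, {}).setdefault(u, 0)
            w[u][v] += 1
            w[v][u] += 1
    return w

def hop_distances(w, source):
    """Breadth-first search: number of hops from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in w[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def path_length_and_diameter(w):
    """Average geodesic path length and diameter (assumes a connected graph)."""
    lengths = [d for u in w for v, d in hop_distances(w, u).items() if v != u]
    return sum(lengths) / len(lengths), max(lengths)

def weighted_clustering(w, i):
    """Weighted clustering coefficient of node i in the sense of Barrat et al. [1]."""
    nbrs = list(w[i])
    k = len(nbrs)
    if k < 2:
        return 0.0  # convention assumed here for nodes with fewer than two neighbors
    s = sum(w[i].values())  # node strength: total weight of edges at i
    total = 0.0
    for j, h in combinations(nbrs, 2):
        if h in w[j]:  # j and h close a triangle with i
            # the ordered pairs (j, h) and (h, j) each contribute (w_ij + w_ih) / 2
            total += w[i][j] + w[i][h]
    return total / (s * (k - 1))

if __name__ == "__main__":
    g = build_stanet(TOY_TRAINS)
    avg, diam = path_length_and_diameter(g)
    mean_cc = sum(weighted_clustering(g, n) for n in g) / len(g)
    print(avg, diam, mean_cc)
```

On this toy timetable the average path length is 1.5 hops and the diameter is 3. The numbers have no bearing on the values in Table I; they only show how such figures can be obtained from a list of train halts.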
The results in Table I collectively show that the RNs of both nations exhibit small-world properties, which in turn means that, in practice, a traveler has to change only a few trains to reach an arbitrary destination. Nevertheless, these results do not shed much light on the hidden structural patterns present in these networks. Therefore, we attempt to uncover these patterns through a community structure analysis of the networks. In the next section, we review the algorithm for community detection and present the results obtained by applying it to the networks.

III. THE MRAD ALGORITHM

The MRad algorithm for detecting community structures in weighted networks was introduced by us in [6]. For readability, we briefly recapitulate the idea here. The original algorithm of Radicchi et al. [8] (applied to unweighted networks) counts, for each edge, the number of loops of length three it is a part of and declares the edges with very low counts to be inter-community edges.

Modification for Weighted Network: For weighted networks, rather than simply counting the triangles (loops of length three), we need to consider the weights on the edges forming these triangles. The basic idea is that if the weights on the edges forming a triangle are comparable, then the group of stations represented by this triangle are highly and equally well connected with each other, thereby rendering a pattern of connection. In contrast, if these weights are not comparable, then there is no such pattern. In order to capture this property we define a strength metric S for each of the edges of StaNet as follows. Let the weight of the edge (u, v), where $u, v \in V_S$, be denoted by $w_{uv}$. We define S as

$$ S = \frac{w_{uv}}{\sqrt{\sum_{i \in V_S - \{u,v\}} (w_{ui} - w_{vi})^2}} \qquad (1) $$

if $\sqrt{\sum_{i \in V_S - \{u,v\}} (w_{ui} - w_{vi})^2} > 0$; otherwise $S = \infty$. The denominator in this expression essentially tries to capture whether or not the weights on the edges forming the triangles are comparable. If the weights are not comparable, then this denominator will be high, thus reducing the overall value of S. StaNet may then be partitioned into clusters or communities by retaining only those edges that have S greater than a predefined threshold η.

We apply the MRad algorithm to StaNet_IR and StaNet_GR in order to extract the communities. Some example communities obtained by varying η are presented in Table II. Note that these results are representative.

TABLE II: Some of the communities obtained from StaNet_IR and StaNet_GR.

Communities from StaNet_IR                                Region                       η
Adra Jn., Bankura, Midnapore, Purulia Jn., Bishnupur      West Bengal                  0.42
Ajmer, Beawar, Kishangarh                                 Rajasthan                    0.42
Abohar, Giddarbaha, Malout, Shri Ganganagar               Punjab                       0.50

Communities from StaNet_GR                                Region                       η
Bremen, Hamburg, Osnabrück, Münster                       North-west of Germany        0.72
Augsburg, Munich, Ulm, Stuttgart                          Extreme south of Germany     0.72
Duisburg, Düsseldorf, Dortmund                            Extreme west of Germany      1.25

The results indicate that geographic proximity is the basis of the hierarchical organization of the RNs. Geographically distant communities are connected to each other only through a set of hubs or junction stations. Some of the results are presented on the maps (or partial maps) of the two countries in Figure 2 for better visualization.

Not only do the above results help us determine the basis of the hierarchical organization of the RNs, but they can also be useful while planning the distribution of new trains. For instance, although Bharatpur is a district headquarters in West Bengal, it seems that it is not sufficiently well connected with neighboring stations like Farakka and Maldah and is therefore not within the community formed by these stations (see top left map in Figure 2).
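The strength metric of Eq. (1) and the thresholding step can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the authors' implementation. It assumes the denominator is the square root of the summed squared weight differences as in [6], that a missing edge has weight 0, and that communities are read off as the connected components of the retained edges. The `TOY_WEIGHTS` graph and the function names are hypothetical.

```python
import math

# Hypothetical toy network: two evenly weighted triangles joined by one weak edge.
TOY_WEIGHTS = {
    frozenset("AB"): 5, frozenset("AC"): 5, frozenset("BC"): 5,
    frozenset("DE"): 4, frozenset("DF"): 4, frozenset("EF"): 4,
    frozenset("CD"): 1,
}

def strength(w, u, v):
    """Strength S of edge (u, v); infinite when u and v connect identically to all other nodes."""
    nodes = {n for edge in w for n in edge}
    spread = sum(
        (w.get(frozenset((u, i)), 0) - w.get(frozenset((v, i)), 0)) ** 2
        for i in nodes
        if i not in (u, v)
    )
    return w[frozenset((u, v))] / math.sqrt(spread) if spread > 0 else math.inf

def communities(w, eta):
    """Keep edges with S > eta and return the connected components that remain."""
    nodes = sorted({n for edge in w for n in edge})
    adj = {n: set() for n in nodes}
    for edge in w:
        u, v = tuple(edge)
        if strength(w, u, v) > eta:
            adj[u].add(v)
            adj[v].add(u)
    seen, groups = set(), []
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:  # depth-first traversal of one component
            x = stack.pop()
            component.append(x)
            for y in adj[x]:
                if y not in seen:
                    seen.add(y)
                    stack.append(y)
        groups.append(sorted(component))
    return groups

if __name__ == "__main__":
    for eta in (0.1, 1.0, 4.5):
        print(eta, communities(TOY_WEIGHTS, eta))
```

In this toy graph the weak C-D edge has S of about 0.11, so η = 0.1 keeps the network in one piece while η = 1.0 separates the two triangles. Raising η to 4.5 splits D off from E and F. Sweeping η in this way yields nested groupings, which is one way to read the hierarchical organization discussed in this section.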
A similar example of an under-connected station is Hanover in Germany, which should have been connected to the community of Hamburg/Bremen (refer to the map at the bottom of Figure 2).

FIG. 2: Some of the communities shown on the maps (partial maps) of the two countries. Top left map is of West Bengal, which is an eastern state of India. At the top right is the map of Kerala, which is a state in south India. The map at the bottom shows the whole of Germany. The circles of a single colour denote those stations that are in the same community. Courtesy: http://www.mapsofindia.com and http://www.mapsofworld.com.

IV. CONCLUSION

In this paper, we analyzed the structural properties of the RNs of India and Germany. Some of our important findings are:
1. The RNs of both nations exhibit small-world properties, which is in agreement with the observations made by earlier researchers;
2. The networks exhibit hierarchical organization, and the hierarchy is formed on the basis of geographic proximity;
3. Community analysis of the RN of a nation can be helpful in planning the distribution of new trains.

A question that still remains unanswered is how two nations with completely different political and social structures can have the same pattern of organization in their transport systems. This is possibly because the transportation needs of humans (across geography and culture) are to a large extent similar. For instance, an individual travels short distances far more frequently than long distances. Furthermore, on a daily basis, a much larger share of the population travels short distances, while only a small fraction travels long distances. Therefore, there is always pressure to have more trains within a region than across regions. Thus, universal needs for transportation perhaps render universal patterns in the RNs. This argument can be verified by having a single growth model that, subject to some parameters, can accurately explain the structural properties of the empirically obtained networks. We look forward to developing such a model in the future.

[1] A. Barrat, M. Barthélemy, R. Pastor-Satorras and A. Vespignani. 2004. The architecture of complex weighted networks. PNAS 101, 3747–3752.
[2] A. Clauset, M. E. J. Newman and C. Moore. 2004. Finding community structure in very large networks. Phys. Rev. E 70, 066111.
[3] L. Cui-mei and L. Jun-wei. 2007. Small-world and the growing properties of the Chinese railway network. Frontiers of Physics in China 2(3), 364–367.
[4] R. Guimera and L. A. N. Amaral. 2005. Functional cartography of complex metabolic networks. Nature 433, 895.
[5] M. C. Lagomarsino, P. Jona, B. Bassetti and H. Isambert. 2001. Hierarchy and feedback in the evolution of the Escherichia coli transcription network. PNAS 104, 5516.
[6] A. Mukherjee, M. Choudhury, A. Basu and N. Ganguly. 2007. Modeling the co-occurrence principles of the consonant inventories: A complex network approach. International Journal of Modern Physics C 18(2), 281–295.
[7] A. Mukherjee, M. Choudhury, S. RoyChowdhury, A. Basu and N. Ganguly. (to appear). Rediscovering the co-occurrence principles of vowel inventories: A complex network approach. Advances in Complex Systems. Preprint: http://arxiv.org/abs/physics/0702056.
[8] F. Radicchi, C. Castellano, F. Cecconi, V. Loreto and D. Parisi. 2003. Defining and identifying communities in networks. PNAS 101(9), 2658–2663.
[9] P. Sen, S. Dasgupta, A. Chatterjee, P. A. Sreeram, G. Mukherjee and S. S. Manna. 2003. Small-world properties of the Indian railway network. Phys. Rev. E 67, 036106.
[10] E. Ravasz, A. L. Somera, D. A. Mongru, Z. N. Oltvai and A.-L. Barabási. 2002. Hierarchical organization of modularity in metabolic networks. Science 30, 1551.