Combining Spatial and Network Analysis: A Case Study of the GoMore Network




Kate Lyndegaard
Department of Geography, Penn State University, University Park, PA, USA
Email: kate@lyndegaard.com

1. Introduction

Complex network analyses are of interest to geographers seeking to use network structure to describe social, economic, political or other phenomena occurring throughout the geographic region of the network itself. This study employs network analysis to examine the geographic distribution of travel within Northern Europe's leading online rideshare company. A number of key network properties are examined in order to characterize the interconnectivity of GoMore travel, including measures of centrality, which are studied to determine the relative importance of travel origins and destinations and their roles as connectors, or bridges, within the network. Weighted network properties are examined in relation to regional demographic variables such as population to analyze the volume of travel observed. Finally, the use of a gravity model is explored to forecast the expected volume of travel to the largest city in each Danish kommune. The application of complex network analysis provides a new perspective on the mobility of network end-users, with the potential to support decision making and marketing efforts aimed at increasing the network's user base.

2. Network Analysis

2.1 Topological Analysis

The field of network science grants us a unique window through which to view the interconnectivity of complex systems and provides a model construct by which these relationships can be quantified. The Danish-based service GoMore (www.gomore.dk) is an example of a network designed to address the unmet needs of commuters and leisure travelers open to shared transport. The movements of these users are represented here as a network of events, each occurring between two geographic locations, a start location and an end location. This representation forms the basis upon which topological and weighted network analyses are performed.

Figure 2. Filter applied to display only those edges with greater than 6 travelers.

The GoMore network graph comprises data from the period February 2013 to October 2013. The directed network graph consists of 2302 vertices and 7004 edges, and contains no self-loops or multi-edges.

Table 1. GoMore topological properties.

Property                                GoMore Graph
Nodes                                   2302
Edges                                   7004
Maximum in degree                       283
Maximum out degree                      288
Average directed degree                 3.043
Average local clustering coefficient    0.079
Average path length                     3.385

The directionality of edges is considered because each network event represents directed movement from one network node to another. The network properties in degree, $k^{in}$, and out degree, $k^{out}$, capture the number of inbound links by which a node is connected, and the number of outbound links by which it connects to others. The average degree of a directed network, with $N$ nodes and $L$ edges, is expressed as (Barabási, Forthcoming):

\[
\langle k^{in} \rangle = \frac{1}{N} \sum_{i=1}^{N} k_i^{in} = \langle k^{out} \rangle = \frac{1}{N} \sum_{i=1}^{N} k_i^{out} = \frac{L}{N} \tag{1}
\]
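To make the average directed degree concrete, the following Python sketch computes the mean in degree and mean out degree from a directed edge list and then checks the value reported in Table 1 from the node and edge counts alone. The function name and the toy edge list are illustrative assumptions and are not part of the original study.

```python
def average_directed_degree(num_nodes, edges):
    """Return (mean in degree, mean out degree) for (source, target) pairs."""
    in_deg = {}
    out_deg = {}
    for src, dst in edges:
        out_deg[src] = out_deg.get(src, 0) + 1
        in_deg[dst] = in_deg.get(dst, 0) + 1
    # Each edge adds one to exactly one in degree and one out degree,
    # so both means reduce to L / N.
    mean_in = sum(in_deg.values()) / num_nodes
    mean_out = sum(out_deg.values()) / num_nodes
    return mean_in, mean_out


# Toy directed graph (hypothetical data): 4 nodes, 5 edges.
toy_edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "A"), ("D", "A")]
toy_mean_in, toy_mean_out = average_directed_degree(4, toy_edges)

# Counts reported in Table 1: L = 7004 edges, N = 2302 nodes.
gomore_avg_degree = round(7004 / 2302, 3)
```

The last line reproduces the 3.043 figure in Table 1, which suggests the reported average is simply the edge count divided by the node count.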

In the GoMore network, in degree values range between 0 and 283, and out degree values between 0 and 288. The average directed degree of the network is 3.043, a figure much lower than the upper in and out degree range values, and indicative of the existence of highly connected hubs.

In a network where nodes with high degree centrality connect large numbers of nodes to one another, the local clustering coefficient can provide insight into the nature of these connections. In the context of the GoMore network, it is an indication of whether network traffic funnels primarily through network hubs, or whether the hubs' peripheral cities also maintain connections to each other. The clustering coefficient is given by (Watts and Strogatz, 1998):

\[
C_i = \frac{2 L_i}{k_i (k_i - 1)} \tag{2}
\]

where $k_i$ is the degree of node $i$ and $L_i$ is the number of links among its $k_i$ neighbors.

Clustering coefficient values indicate that low degree nodes typically possess higher values than high degree nodes, reflective of the high degree nodes' embeddedness. In order to visualize the significance of the average local clustering coefficient over the network's entirety, a scatterplot is derived to display the average local clustering coefficient for all nodes with degree $k$.

Figure 1. Clustering degree correlation.

The clustering-degree correlation is expressed as:

\[
C(k) = \frac{1}{N P(k)} \sum_{i \,:\, k_i = k} C_i \tag{3}
\]

where $N P(k)$ is the number of nodes with degree $k$.
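The local clustering coefficient and its average per degree class can be sketched in a few lines of Python. This is a minimal illustration on a hypothetical toy graph, and it assumes that edge directions are ignored when counting neighbors; the study's actual tooling and handling of direction are not specified here.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Undirected adjacency sets from (u, v) pairs (direction ignored)."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return dict(adj)


def local_clustering(adj):
    """C_i = 2 * L_i / (k_i * (k_i - 1)); defined as 0 when k_i < 2."""
    coeffs = {}
    for node, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            coeffs[node] = 0.0
            continue
        nbr_list = sorted(nbrs)
        # L_i: number of links among the neighbors of this node.
        links = sum(
            1
            for idx, a in enumerate(nbr_list)
            for b in nbr_list[idx + 1:]
            if b in adj[a]
        )
        coeffs[node] = 2.0 * links / (k * (k - 1))
    return coeffs


def clustering_by_degree(adj, coeffs):
    """C(k): mean local clustering coefficient over nodes of degree k."""
    buckets = defaultdict(list)
    for node, nbrs in adj.items():
        buckets[len(nbrs)].append(coeffs[node])
    return {k: sum(vals) / len(vals) for k, vals in buckets.items()}


# Hypothetical toy graph: hub H with four neighbors, two of them linked.
toy_adj = build_adjacency(
    [("H", "A"), ("H", "B"), ("H", "C"), ("H", "D"), ("A", "B")]
)
toy_coeffs = local_clustering(toy_adj)
toy_ck = clustering_by_degree(toy_adj, toy_coeffs)
```

In this toy graph the hub has a lower coefficient than its linked neighbors A and B, the same qualitative pattern the text describes for high degree and low degree nodes.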

The correlation between node degree and the clustering coefficient demonstrates that throughout the network, nodes peripheral to highly central hubs are not well interconnected with other peripheral nodes.

Measures of centrality based on geodesic paths provide another perspective on the notion of centrality within a network. Betweenness centrality examines the number of shortest paths between nodes in order to gauge the importance of a node's role as a bridge, through which nodes connect to one another across the shortest possible distance. Betweenness centrality is expressed as (Brandes, 2001):

\[
C_B(v) = \sum_{s \neq v \neq t} \frac{\sigma_{st}(v)}{\sigma_{st}} \tag{4}
\]

where $\sigma_{st}$ is the number of shortest paths from node $s$ to node $t$, and $\sigma_{st}(v)$ is the number of those paths that pass through node $v$.

In the GoMore network, a high betweenness value could be interpreted as a node well-situated to become a meeting point on multi-destination or extended trips. Nodes having high in and out degree values are also seen to have high betweenness values, which confirms that their interconnectivity facilitates transfers along a high number of shortest paths between other nodes. Table 2 establishes that in degree, out degree and betweenness values are highly correlated.

Table 2. Pearson's r correlation between centrality measures.

               Betweenness    In-Degree      Out-Degree
Betweenness    1
In-Degree      0.955905714    1
Out-Degree     0.959936980    0.981554619    1

2.2 Weighted Network Analysis

The weighted representation of the GoMore network facilitates the analysis of traveler flows between cities. The size of the network remains unchanged, with $N = 2302$ and $L = 7004$.

Table 3. GoMore weighted graph properties.

Property               GoMore Graph
Nodes                  2302
Edges                  7004
Maximum edge weight    1,665
Minimum edge weight    2
Average strength       50.400

The weighted network assigns each edge a weight, $w_{ij}$, equal to the total number of travelers, summed over all trips that occurred between the two locations. Node strength is expressed as (Barrat et al., 2004):

\[
s_i = \sum_{j=1}^{N} a_{ij} w_{ij}
\]

where $a_{ij}$ is 1 if an edge connects nodes $i$ and $j$, and 0 otherwise.
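As a small illustration of node strength in a directed, weighted network, the Python sketch below sums traveler counts over inbound and outbound edges separately. The function name and the traveler counts are hypothetical, and splitting strength into in and out components is an assumption made here to match the directed setting of the network.

```python
from collections import defaultdict


def node_strengths(weighted_edges):
    """Return (in strength, out strength) dicts from (source, target, weight)."""
    s_in = defaultdict(int)
    s_out = defaultdict(int)
    for src, dst, weight in weighted_edges:
        s_out[src] += weight  # travelers leaving src
        s_in[dst] += weight   # travelers arriving at dst
    return dict(s_in), dict(s_out)


# Hypothetical traveler counts between three cities.
toy_trips = [
    ("Copenhagen", "Aarhus", 40),
    ("Aarhus", "Copenhagen", 35),
    ("Copenhagen", "Odense", 12),
    ("Odense", "Aarhus", 5),
]
toy_s_in, toy_s_out = node_strengths(toy_trips)
```

Here Copenhagen has an out strength of 52 (40 + 12) and an in strength of 35, showing how strength captures traffic volume rather than the number of connections.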

Node strength values in the GoMore network range between 0 and 17,502, with an average node strength of 50.4. These metrics are useful for analyzing traffic volume. The critical difference between a topological network analysis and a weighted network analysis is that in the latter, edges are differentiated by their relative importance, or weight, rather than by their presence or absence alone. In the GoMore network, weighted clustering coefficient values are higher than the topologically derived clustering coefficients: $C^w > C$. The weighted clustering coefficient is expressed as (Barrat et al., 2004):

\[
C_i^w = \frac{1}{s_i (k_i - 1)} \sum_{j,h} \frac{(w_{ij} + w_{ih})}{2} a_{ij} a_{ih} a_{jh} \tag{5}
\]

\[
C^w(k) = \frac{1}{N P(k)} \sum_{i \,:\, k_i = k} C_i^w \tag{6}
\]

This relationship indicates that triadic closure is influenced by large edge weights, and typically signifies that locations which support large volumes of traffic are interconnected. The weighted clustering-degree correlation scatterplot in Figure 3 demonstrates that relatively low degree nodes possess a range of weighted clustering coefficient values, dependent upon the strength of the nodes to which they are connected.

Figure 3. Weighted clustering degree correlation.
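The comparison between weighted and topological clustering can be illustrated with a short Python sketch of the weighted clustering coefficient of Barrat et al. (2004). It assumes an undirected weighted graph, sums over ordered neighbor pairs, and uses hypothetical weights; the helper names are not taken from the original study.

```python
def build_weighted_adjacency(weighted_edges):
    """Undirected weighted adjacency as a dict of dicts."""
    adj = {}
    for u, v, w in weighted_edges:
        adj.setdefault(u, {})[v] = w
        adj.setdefault(v, {})[u] = w
    return adj


def weighted_clustering(adj, node):
    """Barrat weighted clustering: triangles weighted by their two edges at node."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    strength = sum(nbrs.values())
    total = 0.0
    # Ordered pairs (j, h) of neighbors that are themselves connected.
    for j in nbrs:
        for h in nbrs:
            if j != h and h in adj[j]:
                total += (nbrs[j] + nbrs[h]) / 2.0
    return total / (strength * (k - 1))


def topological_clustering(adj, node):
    """Unweighted local clustering coefficient for comparison."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    closed = sum(1 for j in nbrs for h in nbrs if j != h and h in adj[j])
    return closed / (k * (k - 1))


# Hypothetical hub H: heavy edges to A and B (which are linked), light edge to C.
toy_graph = build_weighted_adjacency(
    [("H", "A", 10), ("H", "B", 10), ("H", "C", 1), ("A", "B", 5)]
)
hub_cw = weighted_clustering(toy_graph, "H")
hub_c = topological_clustering(toy_graph, "H")
```

Because the hub's only triangle sits on its two heaviest edges, the weighted coefficient (20/42, about 0.476) exceeds the topological one (1/3). With all weights equal, the two functions return the same value.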

3. The Geography of Denmark

The nation of Denmark spans an area of 42,895 km² (Statistics Denmark). The country is divided into five regions, and is further subdivided into ninety-nine kommunes. In order to contextualize the topological and weighted network analyses, administrative boundaries and their corresponding demographic data were considered. In order to provide a more granular view of GoMore travel within major urban centers, the kommunes of Copenhagen and Aarhus were further subdivided using electoral district boundaries.

Figure 4. Geographic locations of nodes displayed.

As of Q4 2013, the population of Denmark was recorded at 5,623,501 persons (Statistics Denmark). Table 4 presents a list of the eight most populated kommunes in Denmark.

Table 4. Q4 2013 population (Source: Statistics Denmark).

Danish Kommune    Population
Copenhagen        569000
Aarhus            323644
Aalborg           205614
Odense            195598
Esbjerg           115046
Vejle             109458
Frederiksberg     102751
Randers           96171

Table 5. Q4 2013 population density.

Danish Kommune    Pop. density (persons per sq. km)
Frederiksberg     12843.88
Copenhagen        7370.466
Rødovre           3076.148
Gentofte          2899.258
Gladsaxe          2668.514
Herlev            2264.876
Hvidovre          2261.397
Glostrup          1658.271

The examination of degree centrality within areal units, in this case the administrative boundaries of Danish kommunes, showed that the most populated kommunes supported the largest number of GoMore travelers. Table 5 presents a list of kommunes having the highest population density, with Frederiksberg and Copenhagen occupying the first and second positions. The next six most densely populated kommunes, all of which are immediately adjacent to Copenhagen and are effectively suburbs of the central kommune, do not appear in Table 6 as kommunes which support large volumes of traffic.

Table 6. Summed totals of node strengths.

Kommune             Sum of node strength in    Kommune             Sum of node strength out
Copenhagen Total    18516                      Copenhagen Total    18082
  Indre By          13275                        Indre By          12349
  Sundbyvester      1117                         Sundbyvester      1151
  Vesterbro         1110                         Vesterbro         1047
  Østerbro          914                          Valby             997
  Valby             812                          Østerbro          945
  Nørrebro          595                          Nørrebro          779
  Utterslev         320                          Brønshøj          353
  Brønshøj          279                          Utterslev         337
  Sundbyøster       94                           Sundbyøster       124
Aarhus Total        14550                      Aarhus Total        16953
  Aarhus Øst        8906                         Aarhus Øst        10209
  Aarhus Syd        4027                         Aarhus Syd        4387
  Aarhus Nord       1178                         Aarhus Nord       1571
  Aarhus Vest       439                          Aarhus Vest       786
Aalborg             3339                       Aalborg             3410
Odense              3199                       Odense              3105
Kolding             1263                       Kolding             1274
Vejle               1212                       Frederiksberg       1260
Frederiksberg       1045                       Vejle               1071
Esbjerg             931                        Esbjerg             897

Statistics Denmark data indicate that family size and car ownership are approximately the same for the kommune of Copenhagen and its surrounding kommunes. One potential explanation for the comparatively low number of rides in the surrounding kommunes is the inconsistency of user-generated location descriptions within the greater Copenhagen region, which prevents any definitive conclusions.

4. Spatial Analysis

Figure 5. Population density (persons per km²).

Gravity models predict the spatial interaction between origins and destinations in relation not only to their geographic proximity but also, commonly, to additional explanatory variables such as population or other social or economic variables (Haynes and Fotheringham, 1984).
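As a rough illustration of how a gravity model constrained to a total flow can be evaluated, the Python sketch below assumes one common form, $T_{ij} = k P_i P_j d_{ij}^{-\beta}$, with the constant $k$ chosen so that the predicted flows sum to the observed total. The exact specification used in the study is not shown here, so the functional form, the function name, the distances, the total flow and the value of $\beta$ are all assumptions; only the populations are taken from Table 4.

```python
def total_flow_constrained_gravity(populations, distances, total_flow, beta):
    """Predict flows T_ij = k * P_i * P_j * d_ij**(-beta), scaled to total_flow.

    populations: dict city -> population
    distances:   dict (city_i, city_j) -> distance, one entry per pair
    """
    raw = {
        (i, j): populations[i] * populations[j] * d ** (-beta)
        for (i, j), d in distances.items()
    }
    # k is fixed by the constraint that predictions sum to the observed total.
    k = total_flow / sum(raw.values())
    return {pair: k * value for pair, value in raw.items()}


# Populations from Table 4; distances (km) and total flow are hypothetical.
toy_pop = {"Copenhagen": 569000, "Aarhus": 323644, "Odense": 195598}
toy_dist = {
    ("Copenhagen", "Aarhus"): 300.0,
    ("Copenhagen", "Odense"): 165.0,
    ("Aarhus", "Odense"): 145.0,
}
toy_pred = total_flow_constrained_gravity(toy_pop, toy_dist, 1000.0, beta=1.0)
```

Comparing such predictions with observed edge weights is one way to flag pairs of locations with fewer rides than their populations would suggest. A calibrated $\beta$ near zero or below would correspond to distance not acting as a deterrent.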

Figure 6. Geographic locations of kommunes' largest cities.

Utilizing the population of the largest city in each kommune, and road distances between each city, a total flow constrained gravity model (7) was calibrated using travel volumes derived from the weighted network analysis of the GoMore network. The predicted values were measured against observed travel volumes in an effort to identify geographic locales producing a lower number of rides than would be expected based on that area's population. Interestingly, in the case of the GoMore network, it was found that rather than distance acting as a deterrent, ride-share users utilize the service primarily for long distance trips, which may be more cost-efficient to take using GoMore than the Danish public rail system. Areas of study for future consideration include the further specification of distance decay.

5. Conclusion

The GoMore network provides a unique opportunity to examine the structural properties of an online network from a geographic perspective, contextualized by demographic variables. The topological analysis revealed the existence of hubs in the network, identified as Denmark's largest urban centers. Clustering coefficient values indicated that low degree nodes typically possessed higher values than high degree nodes, reflective of the high degree nodes' embeddedness. Throughout the network, nodes peripheral to highly central hubs were not well interconnected with other peripheral nodes. An analysis of centrality measures demonstrated that in degree, out degree and betweenness values were highly correlated. Interestingly, the study revealed that nodes with high betweenness values are optimally located to provide services to end users, serving as pick-up points or as stops on multi-destination trips. The weighted network analysis confirmed that triangle density is supported by edges having large weights, signaling that high degree nodes supporting high traffic volumes tend to be interconnected. The examination of degree centrality within areal units, in this case the administrative boundaries of Danish kommunes, showed that the most populated kommunes supported the largest number of GoMore travelers. Population density was also considered; however, issues surrounding user-generated location descriptions within the greater Copenhagen region prevent any definitive conclusions.

Acknowledgements

The author would like to thank Frank Hardisty for his comments and review of this work.

References

Albert R and Barabási A L, 2000, Topology of evolving networks: local events and universality. Physical Review Letters, 85(24):5234-5237.
Amaral L A N, Scala A, Barthélémy M and Stanley H E, 2000, Classes of small-world networks. Proceedings of the National Academy of Sciences, 97(21):11149-11152.
Barabási A L, Network Science. Forthcoming.
Barthélemy M, Barrat A, Pastor-Satorras R and Vespignani A, 2005, Characterization and modeling of weighted networks. Physica A: Statistical Mechanics and its Applications, 346(1):34-43.
Barrat A, Barthélemy M, Pastor-Satorras R and Vespignani A, 2004, The architecture of complex weighted networks. Proceedings of the National Academy of Sciences, 101(11):3747-3752.
Brandes U, 2001, A faster algorithm for betweenness centrality. Journal of Mathematical Sociology, 25(2):163-177.
Bastian M, Heymann S and Jacomy M, 2009, Gephi: an open source software for exploring and manipulating networks. International AAAI Conference on Weblogs and Social Media.
Csardi G and Nepusz T, 2006, The igraph software package for complex network research. InterJournal, Complex Systems 1695, http://igraph.sf.net.
De Montis A, Barthélemy M, Chessa A and Vespignani A, 2005, The structure of inter-urban traffic: A weighted network analysis. arXiv preprint physics/0507106.
De Montis A, Caschili S and Chessa A, 2011, Spatial Complex Network Analysis and Accessibility Indicators: the Case of Municipal Commuting in Sardinia, Italy. European Journal of Transport and Infrastructure Research, 4(11):405-411.
De Montis A, Chessa A, Campagna M, Caschili S and Deplano G, 2010, Modeling commuting systems through a complex network analysis: A study of the Italian islands of Sardinia and Sicily. Journal of Transport and Land Use, 2(3):39-55.
Easley D and Kleinberg J, 2010, Networks, crowds, and markets. Cambridge University Press, Cambridge, UK.
Erdős P and Rényi A, 1960, On the evolution of random graphs. Magyar Tud. Akad. Mat. Kutató Int. Közl 5, 17-61.
Freeman L C, 1971, Centrality in social networks conceptual clarification. Social Networks, 1(3):215-239.
Gastner M T and Newman M E, 2006, The spatial structure of networks. European Physical Journal B, 49(2):247-252.
Haggett P and Chorley R, 1969, Network Analysis in Geography. St. Martin's Press, New York, USA.
Haynes K E and Fotheringham A S, 1984, Gravity and spatial interaction models. Sage Publications, Beverly Hills, USA.

Liben-Nowell D, Novak J, Kumar R, Raghavan P and Tomkins A, 2005, Geographic routing in social networks. Proceedings of the National Academy of Sciences, 102(33):11623-11628.
Danish Ministry of the Environment Geodata Agency, http://www.gst.dk.
Newman M E, 2004, Analysis of weighted networks. arXiv:cond-mat/0407503.
Patuelli R, Reggiani A, Gorman S P, Nijkamp P and Bade F J, 2007, Network analysis of commuting flows: A comparative static approach to German data. Networks and Spatial Economics, 7(4):315-331.
R Core Team, 2013, R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0, http://www.r-project.org/.
Statistics Denmark, http://www.dst.dk.
Wang J, Mo H, Wang F and Jin F, 2011, Exploring the network structure and nodal centrality of China's air transport network: A complex network approach. Journal of Transport Geography, 19:712-721.
Watts D J and Strogatz S H, 1998, Collective dynamics of small-world networks. Nature, 393(6684):440-442.

Maps throughout this book were created using ArcGIS software by Esri. ArcGIS and ArcMap are the intellectual property of Esri and are used herein under license. Copyright Esri. All rights reserved. For more information about Esri software, please visit www.esri.com