SGL: Stata graph library for network analysis


Hirotaka Miura
Federal Reserve Bank of San Francisco
Stata Conference Chicago 2011

The views presented here are my own and do not necessarily represent the views of the Federal Reserve System or any person affiliated with it.

Introduction

What is network analysis?

Network analysis is an application of network theory, which is a subfield of graph theory, and is concerned with analyzing relational data. Among the questions network analysis tries to address are how important, or how central, the actors in a network are and how concentrated the network is.

Example uses of network analysis include:
- Determining the importance of a web page using Google's PageRank.
- Examining communication networks in intelligence and computer security.
- Solving transportation problems that involve the flow of traffic or commodities.
- Addressing the too-connected-to-fail problem in financial networks.
- Analyzing social relationships between individuals in social network analysis.

Outline

- Modeling relational data
- Matrix representations
- Centrality measures
- Clustering coefficient
- Stata implementation
- Conclusion

Modeling relational data

Graph model

- A graph model representing a network $G = (V, E)$ consists of a set of vertices $V$ and a set of edges $E$.
- $|V|$ equals the number of vertices and $|E|$ equals the number of edges.
- An edge is a link between two vertices $i$ and $j$, not necessarily distinct, that has vertex $i$ on one end and vertex $j$ on the other.
- An edge may be directed or undirected. It may also be weighted, with differing edge values, or have all edge values equal to one, in which case the network is said to be unweighted.

Special cases

Some special types of vertices and edges are ones that standard graph algorithms are not designed to handle, or for which no routines exist yet. The following types of vertices and edges are therefore currently excluded from analysis:
- Isolated vertex: a vertex that is not attached to any edges.
- Parallel edges: two or more edges that connect the same pair of vertices.
- Self-loop: an edge connecting vertex $i$ to itself.
- Zero- or negative-weighted edge.

Storing relational data

A variety of storage types are available for capturing relational data:
- Adjacency matrix.
- Adjacency list. Core SGL algorithms use this structure.
- Edge list. Most suited for storing relational data in Stata, as it allows the use of the if exp and in range qualifiers.
- Others, such as the Compressed Sparse Row format for efficient storage and access.

Example storage types

[Figure: undirected unweighted network on vertices 1 to 4. Drawn using NETPLOT (Corten, 2011).]

Adjacency matrix:

    0 1 0 0
    1 0 1 1
    0 1 0 0
    0 1 0 0

Adjacency list:

    Vertex   Neighbor(s)
    1        2
    2        1 3 4
    3        2
    4        2

Edge list:

    1 2
    2 3
    4 2
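The three storage types carry the same information, so one can be derived from another. The following is a minimal Python sketch (not SGL's Mata code) that builds the adjacency list and adjacency matrix from the edge list of the example; the function names are hypothetical and chosen for illustration only.

```python
from collections import defaultdict

# Edge list of the four-vertex example: each pair is one undirected edge.
edges = [(1, 2), (2, 3), (4, 2)]

def to_adjacency_list(edges):
    """Map each vertex to the sorted list of its neighbors."""
    neighbors = defaultdict(set)
    for i, j in edges:
        neighbors[i].add(j)
        neighbors[j].add(i)  # undirected: record both directions
    return {v: sorted(ns) for v, ns in sorted(neighbors.items())}

def to_adjacency_matrix(edges):
    """Build a |V| x |V| 0/1 matrix with vertices in sorted order."""
    vertices = sorted({v for edge in edges for v in edge})
    index = {v: n for n, v in enumerate(vertices)}
    matrix = [[0] * len(vertices) for _ in vertices]
    for i, j in edges:
        matrix[index[i]][index[j]] = 1
        matrix[index[j]][index[i]] = 1  # symmetric for undirected networks
    return matrix

adjacency_list = to_adjacency_list(edges)
adjacency_matrix = to_adjacency_matrix(edges)

print(adjacency_list)
for row in adjacency_matrix:
    print(row)
```

The edge list needs one row per edge, whereas the matrix needs $|V|^2$ cells, which is one reason the edge list is convenient for storage in a dataset.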

Creating an edge list using joinby (1)

We illustrate a method of creating an edge list using the joinby command and the example datasets child.dta and parent.dta used in the [D] Data-Management Reference Manual. Directed edges can represent parent vertices providing care to child vertices.

    . use child, clear
    (Data on Children)

    . list

         family~d   child_id   x1    x2
      1.     1025          3   11   320
      2.     1025          1   12   300
      3.     1025          4   10   275
      4.     1026          2   13   280
      5.     1027          5   15   210

    . use parent, clear
    (Data on Parents)

    . list

         family~d   parent~d   x1    x3
      1.     1030         10   39   600
      2.     1025         11   20   643
      3.     1025         12   27   721
      4.     1026         13   30   760
      5.     1026         14   26   668
      6.     1030         15   32   684

Creating an edge list using joinby (2)

    . sort family_id

    . joinby family_id using child

    . list parent_id child_id

         parent~d   child_id
      1.       12          4
      2.       12          1
      3.       12          3
      4.       11          4
      5.       11          3
      6.       11          1
      7.       14          2
      8.       13          2

[Figure: network plot of the resulting edge list, with parents 11 to 14 and children 1 to 4. Drawn using NETPLOT (Corten, 2011).]
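For readers unfamiliar with joinby, the pairing it performs here can be sketched in Python: every parent is matched with every child that shares its family_id, and observations without a match are dropped. This is only an illustration of the logic on the values listed above, not a description of how Stata implements the command; join_by_family is a hypothetical helper.

```python
# (family_id, child_id) and (family_id, parent_id) pairs from the listings.
children = [(1025, 3), (1025, 1), (1025, 4), (1026, 2), (1027, 5)]
parents = [(1030, 10), (1025, 11), (1025, 12),
           (1026, 13), (1026, 14), (1030, 15)]

def join_by_family(parents, children):
    """Pair every parent with every child that has the same family_id."""
    edges = []
    for parent_family, parent_id in parents:
        for child_family, child_id in children:
            if parent_family == child_family:
                edges.append((parent_id, child_id))  # directed: parent -> child
    return edges

edge_list = join_by_family(parents, children)
for parent_id, child_id in sorted(edge_list):
    print(parent_id, child_id)
```

Families 1027 and 1030 appear in only one of the two datasets, so they contribute no edges, which matches the eight rows listed above.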

Matrix representations

Adjacency matrix

- Adjacency matrix $A$ for unweighted networks is defined as a $|V| \times |V|$ matrix with $A_{ij}$ entries being equal to one if an edge connects vertices $i$ and $j$ and zero otherwise.
- $A_{ii}$ entries are set to zero.
- Matrix $A$ is symmetric if the network is undirected.
- For directed networks, rows of matrix $A$ represent outgoing edges and columns represent incoming edges. The convention of denoting $X_{ij}$ entries as an edge from $i$ to $j$ is adopted for all matrices.
- For weighted networks, $A_{ij}$ entries are equal to the weight of the edge connecting vertices $i$ and $j$.

Distance matrix

- Distance matrix $D$ is defined as a $|V| \times |V|$ matrix with $D_{ij}$ entries being equal to the length of the shortest path between vertices $i$ and $j$.
- A path is defined as a way of reaching vertex $j$ starting from vertex $i$ using a combination of edges that does not go through any vertex more than once.
- If no path connects vertices $i$ and $j$, $D_{ij}$ is set to missing. This signifies what is sometimes referred to as an infinite path.
- $D_{ii}$ is set to zero.
- For undirected networks, matrix $D$ is symmetric.

Path matrix

- Path matrix $P$ is defined as a $|V| \times |V|$ matrix with $P_{ij}$ entries being equal to the number of shortest paths between vertices $i$ and $j$.
- If no paths exist between vertices $i$ and $j$, $P_{ij}$ is set to zero.
- $P_{ii}$ is set to one.
- Matrix $P$ is symmetric for undirected networks.

Example matrices

[Figure: undirected unweighted network on vertices 1 to 4. Drawn using NETPLOT (Corten, 2011).]

Adjacency matrix:

    0 1 0 0
    1 0 1 1
    0 1 0 0
    0 1 0 0

Distance matrix:

    0 1 2 2
    1 0 1 1
    2 1 0 2
    2 1 2 0

Path matrix:

    1 1 1 1
    1 1 1 1
    1 1 1 1
    1 1 1 1
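For an unweighted network, the distance and path matrices can both be filled in by running a breadth-first search from each vertex. The sketch below is an illustrative Python version on the four-vertex example, not SGL's implementation; bfs_rows is a hypothetical name, and None stands in for Stata's missing value.

```python
from collections import deque

# Adjacency list of the four-vertex example network.
neighbors = {1: [2], 2: [1, 3, 4], 3: [2], 4: [2]}

def bfs_rows(neighbors, source):
    """Return shortest-path lengths and shortest-path counts from source."""
    dist = {source: 0}
    count = {source: 1}  # P_ii is set to one
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in neighbors[v]:
            if w not in dist:            # first visit fixes the distance
                dist[w] = dist[v] + 1
                count[w] = 0
                queue.append(w)
            if dist[w] == dist[v] + 1:   # v precedes w on a shortest path
                count[w] += count[v]
    return dist, count

vertices = sorted(neighbors)
D, P = [], []
for i in vertices:
    dist, count = bfs_rows(neighbors, i)
    # Unreachable pairs get None in D and zero in P.
    D.append([dist.get(j) for j in vertices])
    P.append([count.get(j, 0) for j in vertices])

for row in D:
    print(row)
for row in P:
    print(row)
```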

Centrality measures

Degree centrality (1)

Undirected network

Degree centrality measures the importance of a vertex by the number of connections the vertex has if the network is unweighted, and by the aggregate of the weights of the edges connected to the vertex if the network is weighted (Freeman, 1978).

For an undirected network, degree centrality for vertex $i$ is defined as

$$\frac{1}{|V| - 1} \sum_{j (\neq i)} A_{ij} \tag{1}$$

where the leading divisor is adjusted for the exclusion of the $j = i$ term.

Degree centrality (2)

Directed network

In directed networks, vertices may have different numbers of incoming and outgoing edges, and thus we have out-degree and in-degree centrality.

Out-degree centrality for vertex $i$ is defined similarly to equation (1). For in-degree, we simply transpose the adjacency matrix:

$$\frac{1}{|V| - 1} \sum_{j (\neq i)} A'_{ij} \tag{2}$$

where $A'$ is the transposed adjacency matrix.

Example

Undirected unweighted network

[Figure: example network on seven vertices A to G, labeled with degree centrality in parentheses. Figure from Jackson (2008). Drawn using NETPLOT (Corten, 2011).]

Centrality comparison

    Centrality       A, B, F, G   C, E   D
    Degree           0.33         0.50   0.33
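Equation (1) can be checked on the example with a few lines of Python. The edge list below is an assumption: the figure itself is not reproduced in this transcript, so the edges were reconstructed as two triangles (A, B, C and E, F, G) joined through D, a structure that reproduces the scores reported on the slides. The function name is hypothetical.

```python
# Assumed edge list for the example network (reconstructed, see lead-in).
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"),
         ("D", "E"), ("E", "F"), ("E", "G"), ("F", "G")]

def degree_centrality(edges):
    """Equation (1): number of neighbors divided by |V| - 1."""
    degree = {}
    for i, j in edges:
        degree[i] = degree.get(i, 0) + 1
        degree[j] = degree.get(j, 0) + 1
    n = len(degree)
    return {v: degree[v] / (n - 1) for v in sorted(degree)}

degree_scores = degree_centrality(edges)
for vertex, score in degree_scores.items():
    print(vertex, round(score, 2))
```

With seven vertices the divisor is 6, so a vertex with two neighbors scores 2/6 = 0.33 and one with three neighbors scores 3/6 = 0.50.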

Closeness centrality

Closeness centrality gives higher centrality scores to vertices that are situated closer to the members of their component (the set of reachable vertices), using the inverse of the average shortest-path length as a measure of proximity (Freeman, 1978).

That is, closeness centrality for vertex $i$ is defined as

$$\frac{|V| - 1}{\sum_{j (\neq i)} D_{ij}} \tag{3}$$

which reflects how vertices with smaller average shortest-path lengths receive higher centrality scores than those that are situated farther away from members of their component.

Example

Undirected unweighted network

[Figure: example network on seven vertices A to G, labeled with closeness centrality in parentheses. Figure from Jackson (2008). Drawn using NETPLOT (Corten, 2011).]

Centrality comparison

    Centrality       A, B, F, G   C, E   D
    Degree           0.33         0.50   0.33
    Closeness        0.40         0.55   0.60
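A short Python sketch of equation (3) follows. It assumes a connected network, so that every $D_{ij}$ is defined, and it uses an assumed edge list for the example (two triangles joined through D, reconstructed from the reported scores because the figure is not reproduced here). Function names are hypothetical.

```python
from collections import deque

# Assumed edge list for the example network (reconstructed, see lead-in).
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"),
         ("D", "E"), ("E", "F"), ("E", "G"), ("F", "G")]

def build_neighbors(edges):
    """Adjacency list of an undirected network as a dict of sets."""
    neighbors = {}
    for i, j in edges:
        neighbors.setdefault(i, set()).add(j)
        neighbors.setdefault(j, set()).add(i)
    return neighbors

def closeness_centrality(neighbors):
    """Equation (3): (|V| - 1) over the sum of shortest-path lengths."""
    n = len(neighbors)
    scores = {}
    for source in neighbors:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from source
            v = queue.popleft()
            for w in neighbors[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        scores[source] = (n - 1) / sum(dist.values())
    return scores

closeness_scores = closeness_centrality(build_neighbors(edges))
for vertex in sorted(closeness_scores):
    print(vertex, round(closeness_scores[vertex], 2))
```

For vertex D, for example, two vertices are at distance 1 and four at distance 2, giving 6 / 10 = 0.60.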

Betweenness centrality

Betweenness centrality bestows larger centrality scores on vertices that lie on a higher proportion of the shortest paths linking vertices other than themselves.

- Let $P_{ij}$ denote the number of shortest paths from vertex $i$ to $j$.
- Let $P_{ij}(k)$ denote the number of shortest paths from vertex $i$ to $j$ that vertex $k$ lies on.

Then, following Anthonisse (1971) and Freeman (1977), the betweenness centrality measure for vertex $k$ is defined as

$$\sum_{ij:\, i \neq j,\, k \notin ij} \frac{P_{ij}(k)}{P_{ij}} \tag{4}$$

To normalize (4), divide by $(|V| - 1)(|V| - 2)$, the maximum number of paths a given vertex could lie on between pairs of other vertices.

Example

Undirected unweighted network

[Figure: example network on seven vertices A to G, labeled with normalized betweenness centrality in parentheses. Figure from Jackson (2008). Drawn using NETPLOT (Corten, 2011).]

Centrality comparison

    Centrality       A, B, F, G   C, E   D
    Degree           0.33         0.50   0.33
    Closeness        0.40         0.55   0.60
    Betweenness      0.00         0.53   0.60
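Equation (4) can be evaluated from the distance and path matrices alone: vertex $k$ lies on a shortest path from $i$ to $j$ exactly when $D_{ik} + D_{kj} = D_{ij}$, and in that case $P_{ij}(k) = P_{ik} P_{kj}$. The Python sketch below uses this identity; it is a straightforward illustration rather than the algorithm SGL uses, it assumes a connected network, and the edge list is an assumed reconstruction of the example (two triangles joined through D).

```python
from collections import deque

# Assumed edge list for the example network (reconstructed, see lead-in).
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"),
         ("D", "E"), ("E", "F"), ("E", "G"), ("F", "G")]

def build_neighbors(edges):
    neighbors = {}
    for i, j in edges:
        neighbors.setdefault(i, set()).add(j)
        neighbors.setdefault(j, set()).add(i)
    return neighbors

def shortest_paths(neighbors, source):
    """Breadth-first search returning distances and shortest-path counts."""
    dist, count = {source: 0}, {source: 1}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in neighbors[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                count[w] = 0
                queue.append(w)
            if dist[w] == dist[v] + 1:
                count[w] += count[v]
    return dist, count

def betweenness_centrality(neighbors):
    """Equation (4), normalized by (|V| - 1)(|V| - 2)."""
    n = len(neighbors)
    paths = {v: shortest_paths(neighbors, v) for v in neighbors}
    scores = {}
    for k in neighbors:
        dist_k, count_k = paths[k]
        total = 0.0
        for i in neighbors:
            dist_i, count_i = paths[i]
            for j in neighbors:
                if i == j or k in (i, j):
                    continue
                # k is on a shortest i-j path iff the distances add up.
                if dist_i[k] + dist_k[j] == dist_i[j]:
                    total += count_i[k] * count_k[j] / count_i[j]
        scores[k] = total / ((n - 1) * (n - 2))
    return scores

betweenness_scores = betweenness_centrality(build_neighbors(edges))
for vertex in sorted(betweenness_scores):
    print(vertex, round(betweenness_scores[vertex], 2))
```

The triple loop costs on the order of $|V|^3$ operations, which is fine for small examples but slow for large networks.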

Eigenvector centrality (1)

Eigenvector centrality indicates how important a vertex is: it is large if a vertex has many neighbors, important neighbors, or both (Bonacich, 1972).

For an undirected network with adjacency matrix $A$, the centrality of vertex $i$, $x_i$, can be expressed as

$$x_i = \lambda^{-1} \sum_j A_{ij} x_j \tag{5}$$

which can be rewritten as

$$\lambda x = A x. \tag{6}$$

The convention is to use the eigenvector corresponding to the dominant eigenvalue of $A$.

Eigenvector centrality (2)

For directed networks, the general concern is to obtain a centrality measure based on how often a vertex is being pointed to and on the importance of the neighbors associated with the incoming edges.

Thus, with a slight modification to equation (6), eigenvector centrality is redefined as

$$\lambda x = A' x \tag{7}$$

where $A'$ is the transposed adjacency matrix.

Eigenvector centrality has several shortcomings:
- A vertex with no incoming edges will always have a centrality of zero.
- Vertices whose neighbors all have zero incoming edges will also have zero centrality, since the sum in equation (5), $\sum_j A_{ij} x_j$, will not have any terms.

The Katz-Bonacich centrality, a variation of the eigenvector centrality, seeks to address these issues.

Example

Undirected unweighted network

[Figure: example network on seven vertices A to G, labeled with eigenvector centrality in parentheses. Figure from Jackson (2008). Drawn using NETPLOT (Corten, 2011).]

Centrality comparison

    Centrality       A, B, F, G   C, E   D
    Degree           0.33         0.50   0.33
    Closeness        0.40         0.55   0.60
    Betweenness      0.00         0.53   0.60
    Eigenvector      0.33         0.45   0.38
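One common way to obtain the dominant eigenvector in equation (6) is power iteration, sketched below in Python. This is not necessarily how SGL computes it. Two assumptions are made: the result is scaled to unit Euclidean length, which reproduces the scores reported on the slides, and the edge list is a reconstruction of the example (two triangles joined through D). The iteration multiplies by $A + I$ rather than $A$; the shift leaves the eigenvectors unchanged and avoids oscillation on bipartite networks.

```python
import math

# Assumed edge list for the example network (reconstructed, see lead-in).
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"),
         ("D", "E"), ("E", "F"), ("E", "G"), ("F", "G")]

def build_neighbors(edges):
    neighbors = {}
    for i, j in edges:
        neighbors.setdefault(i, set()).add(j)
        neighbors.setdefault(j, set()).add(i)
    return neighbors

def eigenvector_centrality(neighbors, iterations=1000, tol=1e-12):
    """Power iteration on A + I, normalized to unit Euclidean length."""
    x = {v: 1.0 for v in neighbors}
    for _ in range(iterations):
        # (A + I) x: own score plus the sum of the neighbors' scores.
        new = {v: x[v] + sum(x[w] for w in neighbors[v]) for v in neighbors}
        norm = math.sqrt(sum(value * value for value in new.values()))
        new = {v: value / norm for v, value in new.items()}
        change = max(abs(new[v] - x[v]) for v in neighbors)
        x = new
        if change < tol:
            break
    return x

eigenvector_scores = eigenvector_centrality(build_neighbors(edges))
for vertex in sorted(eigenvector_scores):
    print(vertex, round(eigenvector_scores[vertex], 2))
```

Note that D has fewer neighbors than C and E but a higher score than A, B, F, and G, because both of its neighbors are themselves well connected.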

Katz-Bonacich centrality

The additional inclusion of a free parameter (also referred to as a decay factor) and a vector of exogenous factors in equation (7):
- Avoids the exclusion of vertices with zero incoming edges.
- Allows connection values to decay over distance.

The measure is attributed to Katz (1953), Bonacich (1987), and Bonacich and Lloyd (2001).

The centrality measure is defined as a solution to the equation

$$x = \alpha A' x + \beta \tag{8}$$

where $\alpha$ is the free parameter and $\beta$ is the vector of exogenous factors, which can vary or be constant across vertices.

For the centrality measure to converge properly, the absolute value of $\alpha$ must be less than the absolute value of the inverse of the dominant eigenvalue of $A$.

A positive $\alpha$ allows vertices with important neighbors to have higher status, while a negative $\alpha$ reduces the status.

Example

Undirected unweighted network

[Figure: example network on seven vertices A to G, labeled with Katz-Bonacich centrality in parentheses. Figure from Jackson (2008). Drawn using NETPLOT (Corten, 2011).]

Centrality comparison

    Centrality       A, B, F, G   C, E   D
    Degree           0.33         0.50   0.33
    Closeness        0.40         0.55   0.60
    Betweenness      0.00         0.53   0.60
    Eigenvector      0.33         0.45   0.38
    Katz-Bonacich*   3.99         5.06   4.34

* Maximum $\alpha$ = 0.43 (0.33 used). Exogenous factors set to one for all vertices.
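When $|\alpha|$ is below the convergence bound, equation (8) can be solved by simply iterating it as a fixed-point update. The Python sketch below does this with $\alpha = 0.33$ and all exogenous factors set to one, as in the example. It is an illustration, not SGL's method (solving the linear system $(I - \alpha A') x = \beta$ directly is an alternative), and the edge list is an assumed reconstruction of the example network (two triangles joined through D). Because the network is undirected, $A' = A$.

```python
# Assumed edge list for the example network (reconstructed, see lead-in).
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"),
         ("D", "E"), ("E", "F"), ("E", "G"), ("F", "G")]

def build_neighbors(edges):
    neighbors = {}
    for i, j in edges:
        neighbors.setdefault(i, set()).add(j)
        neighbors.setdefault(j, set()).add(i)
    return neighbors

def katz_bonacich(neighbors, alpha, beta=1.0, iterations=10000, tol=1e-12):
    """Iterate x <- alpha * A x + beta until the update falls below tol."""
    x = {v: beta for v in neighbors}
    for _ in range(iterations):
        new = {v: alpha * sum(x[w] for w in neighbors[v]) + beta
               for v in neighbors}
        change = max(abs(new[v] - x[v]) for v in neighbors)
        x = new
        if change < tol:
            break
    return x

katz_scores = katz_bonacich(build_neighbors(edges), alpha=0.33)
for vertex in sorted(katz_scores):
    print(vertex, round(katz_scores[vertex], 2))
```

The dominant eigenvalue of this adjacency matrix is about 2.34, so the bound on $\alpha$ is about 1 / 2.34 = 0.43, in line with the footnote of the example.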

Clustering coefficient

Clustering coefficient

The clustering coefficient is one way of gauging how tightly connected a network is.

The general idea is to consider transitive relations:
- If vertex $j$ is connected to vertex $i$, and $i$ is connected to $k$, then $j$ is also connected to $k$.

Global clustering coefficients indicate the degree of concentration of the entire network and consist of the overall and average clustering coefficients:
- The overall clustering coefficient is equal to all observed transitive relations divided by all possible transitive relations in the network.
- The average clustering coefficient involves applying the definition of the overall clustering coefficient at the vertex level, then averaging across all the vertices.

Overall clustering coefficient

For an undirected unweighted adjacency matrix $A$, the overall clustering coefficient is defined as

$$c_o(A) = \frac{\sum_{i;\, j \neq i;\, k \neq j;\, k \neq i} A_{ji} A_{ik} A_{jk}}{\sum_{i;\, j \neq i;\, k \neq j;\, k \neq i} A_{ji} A_{ik}} \tag{9}$$

where the numerator represents the sum over $i$ of all closed triplets in which transitivity holds, and the denominator represents the sum over $i$ of all possible triplets.

Local and average clustering coefficient

With a slight modification in notation, the local clustering coefficient for vertex $i$ is defined as

$$c_i(A) = \frac{\sum_{j \neq i;\, k \neq j;\, k \neq i} A_{ji} A_{ik} A_{jk}}{\sum_{j \neq i;\, k \neq j;\, k \neq i} A_{ji} A_{ik}} \tag{10}$$

which leads to the average clustering coefficient:

$$c_a(A) = \frac{1}{|V|} \sum_i c_i(A). \tag{11}$$

By convention, $c_i(A) = 0$ if vertex $i$ has zero or only one link.

Generalized clustering coefficient

Building upon the work of Barrat et al. (2004), Opsahl and Panzarasa (2009) propose generalized methods.

Clustering coefficients for vertex $i$ based on weighted adjacency matrix $W$ and the corresponding unweighted adjacency matrix $A$ are calculated as

$$c_i(W) = \frac{\sum_{j \neq i;\, k \neq j;\, k \neq i} \omega A_{jk}}{\sum_{j \neq i;\, k \neq j;\, k \neq i} \omega} \tag{12}$$

where $\omega$ equals $(W_{ji} + W_{ik})/2$ for the arithmetic mean, $\sqrt{W_{ji} W_{ik}}$ for the geometric mean, $\max(W_{ji}, W_{ik})$ for the maximum, and $\min(W_{ji}, W_{ik})$ for the minimum.

For unweighted networks, $W = A$ and the four types of clustering coefficients are all equal.
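The effect of the four choices of $\omega$ in equation (12) is easiest to see on a tiny weighted network. The Python sketch below uses a hypothetical four-vertex network whose weights are made up for illustration. It also assumes that the sums run only over pairs $j$, $k$ that are both neighbors of $i$ (that is, over triplets centered on $i$), which is how triplet values are usually understood; treat that restriction as an assumption of the sketch rather than something the slide states.

```python
import math
from itertools import combinations

# Hypothetical weighted undirected edges (weights invented for illustration).
weights = {("A", "B"): 2.0, ("A", "C"): 4.0, ("A", "D"): 1.0, ("B", "C"): 3.0}

# The four definitions of omega in equation (12).
METHODS = {
    "arithmetic": lambda a, b: (a + b) / 2,
    "geometric": lambda a, b: math.sqrt(a * b),
    "maximum": max,
    "minimum": min,
}

def weight(weights, i, j):
    """Weight of the undirected edge between i and j, or 0 if absent."""
    return weights.get((i, j), weights.get((j, i), 0.0))

def generalized_clustering(weights, i, method):
    """Equation (12) for vertex i, summing over triplets centered on i."""
    omega = METHODS[method]
    ns = sorted({v for edge in weights for v in edge if i in edge and v != i})
    closed = total = 0.0
    for j, k in combinations(ns, 2):
        value = omega(weight(weights, j, i), weight(weights, i, k))
        total += value
        if weight(weights, j, k) > 0:  # A_jk = 1: the triplet is closed
            closed += value
    return closed / total if total else 0.0

results = {m: generalized_clustering(weights, "A", m) for m in METHODS}
for method, value in results.items():
    print(method, round(value, 3))
```

Vertex A has three triplets, of which only the one through B and C is closed. Because that triplet carries the heaviest edges, its share depends on how $\omega$ is defined: 3/7 for the arithmetic mean, 0.4 for the maximum, and 0.5 for the minimum. Setting every weight to one makes all four methods return 1/3.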

Example

Undirected unweighted network

[Figure: example network on seven vertices A to G, labeled with local clustering coefficients in parentheses. Figure from Jackson (2008). Drawn using NETPLOT (Corten, 2011).]

    Vertex       Local clustering coefficient
    A, B, F, G   1.00
    C, E         0.33
    D            0.00

Average clustering coefficient: 0.67
Overall clustering coefficient: 0.55
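Equations (9) to (11) can be checked on the example with the Python sketch below. It counts unordered neighbor pairs, whereas the equations count ordered pairs; both numerator and denominator double, so the ratios are the same. The edge list is an assumed reconstruction of the example network (two triangles joined through D), since the figure is not reproduced here, and the function names are hypothetical.

```python
from itertools import combinations

# Assumed edge list for the example network (reconstructed, see lead-in).
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"),
         ("D", "E"), ("E", "F"), ("E", "G"), ("F", "G")]

def build_neighbors(edges):
    neighbors = {}
    for i, j in edges:
        neighbors.setdefault(i, set()).add(j)
        neighbors.setdefault(j, set()).add(i)
    return neighbors

def clustering(neighbors):
    """Return local, average, and overall clustering coefficients."""
    closed, possible = {}, {}
    for i, ns in neighbors.items():
        pairs = list(combinations(sorted(ns), 2))  # triplets centered on i
        possible[i] = len(pairs)
        closed[i] = sum(1 for j, k in pairs if k in neighbors[j])
    # Convention: c_i = 0 when vertex i has fewer than two links.
    local = {i: closed[i] / possible[i] if possible[i] else 0.0
             for i in neighbors}
    average = sum(local.values()) / len(local)                 # equation (11)
    overall = sum(closed.values()) / sum(possible.values())    # equation (9)
    return local, average, overall

local, average, overall = clustering(build_neighbors(edges))
for vertex in sorted(local):
    print(vertex, round(local[vertex], 2))
print("average", round(average, 2))
print("overall", round(overall, 2))
```

The two global measures differ because the average weights every vertex equally, while the overall coefficient weights every triplet equally: 6 of the 11 triplets in this network are closed.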

Stata implementation

network and netsummarize commands

We demonstrate the use of the network and netsummarize commands on a dataset of 15th-century Florentine marriages from Padgett and Ansell (1993) to compute betweenness and eigenvector centrality measures.
- network generates vectors of betweenness and eigenvector centralities.
- netsummarize merges the vectors into the Stata dataset.

Edge list stored as Stata dataset:

    . use florentine_marriages, clear
    (15th century Florentine marriages
    > (Padgett and Ansell 1993))

    . list

         v1          v2
      1. Peruzzi     Castellan
      2. Peruzzi     Strozzi
      3. Peruzzi     Bischeri
      4. Castellan   Strozzi
      5. Castellan   Barbadori
      6. Strozzi     Ridolfi
      7. Strozzi     Bischeri
      8. Bischeri    Guadagni
      9. Barbadori   Medici
     10. Ridolfi     Medici
     11. Ridolfi     Tornabuon
     12. Medici      Tornabuon
     13. Medici      Albizzi
     14. Medici      Salviati
     15. Medici      Acciaiuol
     16. Tornabuon   Guadagni
     17. Guadagni    Lambertes
     18. Guadagni    Albizzi
     19. Albizzi     Ginori
     20. Salviati    Pazzi

Computing network centrality measures

    . // Generate betweenness centrality.
    . network v1 v2, measure(betweenness) name(b,replace)
    Breadth-first search algorithm (15 vertices)
    ----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
    ...
    Breadth-first search algorithm completed
    Betweenness centrality calculation (15 vertices)
    ----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
    ...
    Betweenness centrality calculation completed
    matrix b saved in Mata

    . netsummarize b/((rows(b)-1)*(rows(b)-2)),
    > generate(betweenness) statistic(rowsum)

    . // Generate eigenvector centrality.
    . network v1 v2, measure(eigenvector) name(e,replace)
    matrix e saved in Mata

    . netsummarize e, generate(eigenvector) statistic(rowsum)

Data description (1)

    . describe, full

    Contains data from florentine_marriages.dta
      obs:            20      15th century Florentine marriages
                              > (Padgett and Ansell 1993)
     vars:            10      10 Dec 2010 09:45
     size:         1,160 (99.9% of memory free)

    variable name        storage type   display format   variable label
    v1                   str9           %9s
    v2                   str9           %9s
    betweenness_source   float          %9.0g            rowsum of Mata matrix b/((rows(b)-1)*(rows(b)-2))
    betweenness_target   float          %9.0g            rowsum of Mata matrix b/((rows(b)-1)*(rows(b)-2))
    eigenvector_source   float          %9.0g            rowsum of Mata matrix e
    eigenvector_target   float          %9.0g            rowsum of Mata matrix e

    Sorted by:
         Note: dataset has changed since last saved

Data description (2)

    . list v1 v2 betweenness_source betweenness_target eigenvector_source eigenve
    > ctor_target

         v1          v2          betwee~e   betwee~t   eigenv~e   eigenv~t
      1. Peruzzi     Castellan   .021978    .0549451   .2757304   .2590262
      2. Peruzzi     Strozzi     .021978    .1025641   .2757304   .3559805
      3. Peruzzi     Bischeri    .021978    .1043956   .2757304   .2828001
      4. Castellan   Strozzi     .0549451   .1025641   .2590262   .3559805
      5. Castellan   Barbadori   .0549451   .0934066   .2590262   .2117053
      6. Strozzi     Ridolfi     .1025641   .1135531   .3559805   .3415526
      7. Strozzi     Bischeri    .1025641   .1043956   .3559805   .2828001
      8. Bischeri    Guadagni    .1043956   .2545788   .2828001   .2891156
      9. Barbadori   Medici      .0934066   .521978    .2117053   .4303081
     10. Ridolfi     Medici      .1135531   .521978    .3415526   .4303081
     11. Ridolfi     Tornabuon   .1135531   .0915751   .3415526   .3258423
     12. Medici      Tornabuon   .521978    .0915751   .4303081   .3258423
     13. Medici      Albizzi     .521978    .2124542   .4303081   .2439561
     14. Medici      Salviati    .521978    .1428571   .4303081   .1459172
     15. Medici      Acciaiuol   .521978    0          .4303081   .1321543
     16. Tornabuon   Guadagni    .0915751   .2545788   .3258423   .2891156
     17. Guadagni    Lambertes   .2545788   0          .2891156   .0887919
     18. Guadagni    Albizzi     .2545788   .2124542   .2891156   .2439561
     19. Albizzi     Ginori      .2124542   0          .2439561   .0749227
     20. Salviati    Pazzi       .1428571   0          .1459172   .0448134

Network visualization

[Figure: two plots of the Florentine marriage network, one labeled with betweenness centrality and one with eigenvector centrality. Centrality scores in parentheses. Drawn using NETPLOT (Corten, 2011).]

Conclusion

- Different types of matrices, centrality measures, and clustering coefficients can be generated from information retrieved from relational data.
- The command network provides access to SGL functions that generate network measures based on an edge list stored in Stata.
- The postcomputation command netsummarize allows the user to generate standard and customized network measures, which are merged into the Stata dataset.
- Future developments include:
  - More efficient algorithms.
  - Designing functions for additional network measures.
  - Optimizing SGL in Mata.

References

Anthonisse, J. M. (1971): The rush in a directed graph, Discussion paper, Stichting Mathematisch Centrum, Amsterdam, Technical Report BN 9/71.

Barrat, A., M. Barthélemy, R. Pastor-Satorras, and A. Vespignani (2004): The architecture of complex weighted networks, Proceedings of the National Academy of Sciences, 101(11), 3747-3752.

Bonacich, P. (1972): Factoring and weighting approaches to status scores and clique identification, Journal of Mathematical Sociology 2, pp. 113-120.

Bonacich, P. (1987): Power and Centrality: A Family of Measures, The American Journal of Sociology, 92(5), 1170-1182.

Bonacich, P., and P. Lloyd (2001): Eigenvector-like measures of centrality for asymmetric relations, Social Networks, 23(3), 191-201.

Corten, R. (2011): Visualization of social networks in Stata using multidimensional scaling, Stata Journal, 11(1), 52-63.

Freeman, L. C. (1977): A set of measures of centrality based on betweenness, Sociometry, 40(1), 35-41.

Freeman, L. C. (1978): Centrality in social networks: Conceptual clarification, Social Networks, (1), 215-239.

Jackson, M. O. (2008): Social and Economic Networks. Princeton University Press.

Katz, L. (1953): A New Status Index Derived from Sociometric Analysis, Psychometrika, 18, 39-43.

Opsahl, T., and P. Panzarasa (2009): Clustering in weighted networks, Social Networks, 31(2), 155-163.

Padgett, J. F., and C. K. Ansell (1993): Robust action and the rise of the Medici, 1400-1434, The American Journal of Sociology, 98(6), 1259-1319.