# SGL: Stata graph library for network analysis

Save this PDF as:

Size: px
Start display at page:

## Transcription

1 SGL: Stata graph library for network analysis Hirotaka Miura Federal Reserve Bank of San Francisco Stata Conference Chicago 2011 The views presented here are my own and do not necessarily represent the views of the Federal Reserve System or any person affiliated with it.

2 Introduction What is network analysis? Network analysis is an application of network theory, which is a subfield of graph theory, and is concerned with analyzing relational data. Some questions network analysis tries to address is how important, or how central are the actors in the network and how concentrated is the network. Example usages of network analysis include: Determining the importance of a web page using Google s PageRank. Examining communication networks in intelligence and computer security. Solving transportation problems that involve flow of traffic or commodities. Addressing the too-connected-to-fail problem in financial networks. Analyzing social relationships between individuals in social network analysis. 2 / 41

3 Introduction Outline Outline Modeling relational data Matrix representations Centrality measures Clustering coefficient Stata implementation Conclusion 3 / 41

4 Modeling relational data Definitions Graph model A graph model representing a network G = (V, E) consists of a set of vertices V and a set of edges E. V equals the number of vertices. E equals the number of edges. An edge is defined as a link between two vertices i and j, not necessarily distinct, that has vertex i on one end and vertex j on the other. An edge may be directed or undirected and may also be weighted with differing edge values or have all equal edge values of one in which case the network is said to be unweighted. 4 / 41

5 Modeling relational data Definitions Special cases Special types of vertices and edges exist for which standard graph algorithms are not designed to handle or there simply does not exist routines to accommodate such types. Thus the following types of vertices and edges are currently excluded from analysis: Isolated vertex - a vertex that is not attached to any edges. Parallel edges - two or more edges that connect the same pair of vertices. Self-loop - an edge connecting vertex i to itself. Zero or negative weighted edge. 5 / 41

6 Modeling relational data Storing data Storing relational data A variety of storage types are available for capturing relational data: Adjacency matrix Adjacency list Core SGL algorithms use this structure. Edge list Most suited for storing relational data in Stata, as it allows the use of options such as if exp and in range. Plus others such as the Compressed Sparse Row format for efficient storage and access. 6 / 41

7 Modeling relational data Storing data Example storage types Undirected unweighted network. Drawn using NETPLOT (Corten, 2011). 3 Adjacency matrix Adjacency list Edge list Vertex Neighbor(s) / 41

8 Modeling relational data Using joinby Creating an edge list using joinby (1) We illustrate a method of creating an edge list using the joinby command and example datasets used in [D] Data-Management Reference Manual, child.dta and parent.dta. Directed edges can represent parent vertices providing care to child vertices.. use child, clear (Data on Children). list family ~ d child_id x1 x use parent, clear (Data on Parents). list family ~ d parent ~ d x1 x / 41

9 Modeling relational data Using joinby Creating an edge list using joinby (2). sort family_id. joinby family_id using child. list parent_id child_id parent ~ d child_id Drawn using NETPLOT (Corten, 2011). 9 / 41

10 Matrix representations Matrix representations 10 / 41

11 Matrix representations Adjacency matrix Adjacency matrix Adjacency matrix A for unweighted networks is defined as a V V matrix with A ij entries being equal to one if an edge connects vertices i and j and zero otherwise. A ii entries are set to zero. Matrix A is symmetric if the network is undirected. For directed networks, rows of matrix A represent outgoing edges and columns represent incoming edges. The convention of denoting X ij entries as an edge from i to j is adopted for all matrices. For weighted networks, A ij entries are equal to the weight of the edge connecting vertices i and j. 11 / 41

12 Matrix representations Distance matrix Distance matrix Distance matrix D is defined as a V V matrix with D ij entries being equal to the length of the shortest path between vertices i and j. A path is defined as a way of reaching vertex j starting from vertex i using a combination of edges that do not go through a particular vertex more than once. If no path connects vertices i and j, D ij is set to missing. Signifies what is sometimes referred to as an infinite path. D ii is set to zero. For undirected networks, matrix D is symmetric. 12 / 41

13 Matrix representations Path matrix Path matrix Path matrix P is defined as a V V matrix with P ij entries being equal to the number of shortest paths between vertices i and j. If no paths exist between vertices i and j, P ij is set to zero. P ii is set to one. P matrix is symmetric for undirected networks. 13 / 41

14 Matrix representations Example Example matrices Undirected unweighted network. Drawn using NETPLOT (Corten, 2011). 3 Adjacency matrix Distance matrix Path matrix / 41

15 Centrality measures Centrality measures 15 / 41

16 Centrality measures Degree centrality Degree centrality (1) Undirected network Degree centrality measures the importance of a vertex by the number of connection the vertex has if the network is unweighted, and by the aggregate of the weights of edges connected to the vertex if the network is weighted (Freeman, 1978). For an undirected network, degree centrality for vertex i is defined as 1 V 1 A ij (1) j( i) where the leading divisor is adjusted for the exclusion of the j = i term. 16 / 41

17 Centrality measures Degree centrality Degree centrality (2) Directed network Directed networks may entail vertices having different number of incoming and outgoing edges, and thus we have out-degree and in-degree centrality. Out-degree centrality for vertex i is defined similarly to equation (1). For in-degree, we simply transpose the adjacency matrix: 1 V 1 A ij. (2) j( i) 17 / 41

18 Centrality measures Degree centrality Example Undirected unweighted network Centrality comparison G (0.33) A (0.33) Centrality Vertex E (0.50) D (0.33) C (0.50) A B F C G E D F (0.33) B (0.33) Degree Closeness Betweenness Eigenvector Katz-Bonacich Degree centrality in parentheses. Figure from Jackson (2008). Drawn using NETPLOT (Corten, 2011). 18 / 41

19 Centrality measures Closeness centrality Closeness centrality Closeness centrality provides higher centrality scores to vertices that are situated closer to members of their component, or the set of reachable vertices, by taking the inverse of the average shortest paths as a measure of proximity (Freeman, 1978). That is, closeness centrality for vertex i is defined as ( V 1) j( i) D, (3) ij which reflects how vertices with smaller average shortest path lengths receive higher centrality scores than those that are situated farther away from members of their component. 19 / 41

20 Centrality measures Closeness centrality Example Undirected unweighted network Centrality comparison G (0.40) A (0.40) Centrality Vertex E (0.55) D (0.60) C (0.55) A B F C G E D F (0.40) B (0.40) Degree Closeness Betweenness Eigenvector Katz-Bonacich Closeness centrality in parentheses. Figure from Jackson (2008). Drawn using NETPLOT (Corten, 2011). 20 / 41

21 Centrality measures Betweenness centrality Betweenness centrality Betweenness centrality bestows larger centrality scores on vertices that lie on a higher proportion of shortest paths linking vertices other than itself. Let P ij denote the number of shortest paths from vertex i to j. Let P ij (k) denote the number of shortest paths from vertex i to j that vertex k lies on. Then following Anthonisse (1971) and Freeman (1977), betweenness centrality measure for vertex k is defined as ij:i j,k ij P ij (k) P ij. (4) To normalize (4), divide by ( V 1)( V 2), the maximum number of paths a given vertex could lie on between pairs of other vertices. 21 / 41

22 Centrality measures Betweenness centrality Example Undirected unweighted network Centrality comparison G (0.00) A (0.00) Centrality Vertex E (0.53) D (0.60) C (0.53) A B F C G E D F (0.00) B (0.00) Degree Closeness Betweenness Eigenvector Katz-Bonacich Normalized betweenness centrality in parentheses. Figure from Jackson (2008). Drawn using NETPLOT (Corten, 2011). 22 / 41

23 Centrality measures Eigenvector centrality Eigenvector centrality (1) Eigenvector centrality can provide an indication on how important a vertex is by having the property of being large if a vertex has many neighbors, important neighbors, or both (Bonacich, 1972). For an undirected network with adjacency matrix A, centrality of vertex i, x i, can be expressed as x i = λ 1 j A ij x j (5) which can be rewritten as λx = Ax. (6) The convention is to use the eigenvector corresponding to the dominant eigenvalue of A. 23 / 41

24 Centrality measures Eigenvector centrality Eigenvector centrality (2) For directed networks, the general concern is in obtaining a centrality measure based on how often a vertex is being pointed to and the importance of neighbors associated with the incoming edges. Thus with a slight modification to equation (6), eigenvector centrality is redefined as λx = A x (7) where A is the transposed adjacency matrix. There are several shortcomings to the eigenvector centrality: A vertex with no incoming edges will always have centrality of zero. Vertices with neighbors that all have zero incoming edges will also have zero centrality since the sum in equation (5), j A ijx j, will not have any terms. The Katz-Bonacich centrality, a variation of the eigenvector centrality, seeks to address these issues. 24 / 41

25 Centrality measures Eigenvector centrality Example Undirected unweighted network Centrality comparison G (0.33) A (0.33) Centrality Vertex E (0.45) D (0.38) C (0.45) A B F C G E D F (0.33) B (0.33) Degree Closeness Betweenness Eigenvector Katz-Bonacich Eigenvector centrality in parentheses. Figure from Jackson (2008). Drawn using NETPLOT (Corten, 2011). 25 / 41

26 Centrality measures Katz-Bonacich centrality Katz-Bonacich centrality The additional inclusion of a free parameter (also referred to as a decay factor) and a vector of exogenous factors into equation (7): Avoids the exclusion of vertices with zero incoming edges. Allows connection values to decay over distance. Attributed to Katz (1953), Bonacich (1987), and Bonacich and Lloyd (2001). Centrality measure is defined as a solution to the equation x = αa x + β (8) where α is the free parameter and β is the vector of exogenous factors which can vary or be constant across vertices. For the centrality measure to converge properly, absolute value of α must be less than the absolute value of the inverse of the dominant eigenvalue of A. A positive α allows vertices with important neighbors to have higher status while a negative α value reduces the status. 26 / 41

27 Centrality measures Katz-Bonacich centrality Example Undirected unweighted network Centrality comparison G (3.99) A (3.99) Centrality Vertex E (5.06) D (4.34) C (5.06) A B F C G E D Degree Closeness Betweenness Eigenvector Katz-Bonacich* F (3.99) B (3.99) * Maximum α = 0.43 (0.33 used). Exogenous factors set to one for all vertices. Katz-Bonacich centrality in parentheses. Figure from Jackson (2008). Drawn using NETPLOT (Corten, 2011). 27 / 41

28 Clustering coefficient Clustering coefficient 28 / 41

29 Clustering coefficient Introduction Clustering coefficient Clustering coefficient is one way of gauging how tightly connected a network is. The general idea is to consider transitive relations: If vertex j is connected to vertex i, and i is connected to k, then j is also connected to k. Global clustering coefficients provide indication on the degree of concentration of the entire network and consists of overall and average clustering coefficients. Overall clustering coefficient is equal to all observed transitive relations divided by all possible transitive relations in the network. Average clustering coefficient involves applying the definition of overall clustering coefficient at the vertex level, then averaging across all the vertices. 29 / 41

30 Clustering coefficient Overall Overall clustering coefficient For an undirected unweighted adjacency matrix A, overall clustering coefficient is defined as A ji A ik A jk c o (A) = i;j i;k j;k i i;j i;k j;k i A ji A ik (9) where the numerator represents the sum over i of all closed triplets in which transitivity holds, and the denominator represents the sum over i of all possible triplets. 30 / 41

31 Clustering coefficient Local and average Local and average clustering coefficient With a slight modification in notation, local clustering coefficient for vertex i is defined as A ji A ik A jk c i (A) = j i;k j;k i j i;k j;k i which leads to the average clustering coefficient: c a (A) = 1 V A ji A ik (10) c i (A). (11) By convention, c i (A) = 0 if vertex i has zero or only one link. i 31 / 41

32 Clustering coefficient Generalized methods Generalized clustering coefficient Building upon the works of Barrat et al. (2004), Opsahl and Panzarasa (2009) propose generalized methods. Clustering coefficients for vertex i based on weighted adjacency matrix W and corresponding unweighted adjacency matrix A are calculated as c i (W) = j i;k j;k i j i;k j;k i ωa jk ω (12) where ω equals (W ji + W ik )/2 for arithmetic mean, W ji W ik for geometric mean, max(w ji, W ik ) for maximum, and min(w ji, W ik ) for minimum. For unweighted networks, W = A and the four types of clustering coefficients are all equal. 32 / 41

33 Clustering coefficient Generalized methods Example Undirected unweighted network G (1.00) A (1.00) E (0.33) D (0.00) C (0.33) Average clustering coefficient: 0.67 Overall clustering coefficient: 0.55 F (1.00) B (1.00) Local clustering coefficients in parentheses. Figure from Jackson (2008). Drawn using NETPLOT (Corten, 2011). 33 / 41

34 Stata implementation Stata implementation 34 / 41

35 Stata implementation Using network and netsummarize network and netsummarize commands We demonstrate the use of network and netsummarize commands on a dataset of 15th-century Florentine marriages from Padgett and Ansell (1993) to compute betweenness and eigenvector centrality measures. network generates vectors of betweenness and eigenvector centralities. netsummarize merges vectors to Stata dataset.. use florentine_marriages, clear (15th Florentine marriages 15th-century > (Padgett and Ansell 1993)) marriages Edge. list list stored as Stata dataset v1 v2 1. Peruzzi Castellan 2. Peruzzi Strozzi 3. Peruzzi Bischeri 4. Castellan Strozzi 5. Castellan Barbadori 6. Strozzi Ridolfi 7. Strozzi Bischeri 8. Bischeri Guadagni 9. Barbadori Medici 10. Ridolfi Medici 11. Ridolfi Tornabuon 12. Medici Tornabuon 13. Medici Albizzi 14. Medici Salviati 15. Medici Acciaiuol 16. Tornabuon Guadagni 17. Guadagni Lambertes 18. Guadagni Albizzi 19. Albizzi Ginori 20. Salviati Pazzi 35 / 41

36 Stata implementation Using network and netsummarize Computing network centrality measures. // Generate betweenness centrality.. network v1 v2, measure(betweenness) name(b,replace) Breadth-first search algorithm (15 vertices) Breadth-first search algorithm completed Betweenness centrality calculation (15 vertices) Betweenness centrality calculation completed matrix b saved in Mata. netsummarize b/((rows(b)-1)*(rows(b)-2)), > generate(betweenness) statistic(rowsum). // Generate eigenvector centrality.. network v1 v2, measure(eigenvector) name(e,replace) matrix e saved in Mata. netsummarize e, generate(eigenvector) statistic(rowsum) 36 / 41

37 Stata implementation Using network and netsummarize Data description (1). describe, full Contains data from florentine_marriages.dta obs: 20 15th century Florentine marriages > (Padgett and Ansell 1993) vars: Dec :45 size: 1,160 (99.9% of memory free) storage display value variable name type format label variable label v1 str9 %9s v2 str9 %9s betweenness_source float %9.0g rowsum of Mata matrix b/((rows(b)-1)*(rows(b)-2)) betweenness_target float %9.0g rowsum of Mata matrix b/((rows(b)-1)*(rows(b)-2)) eigenvector_source float %9.0g rowsum of Mata matrix e eigenvector_target float %9.0g rowsum of Mata matrix e Sorted by: Note: dataset has changed since last saved 37 / 41

38 Stata implementation Using network and netsummarize Data description (2). list v1 v2 betweenness_source betweenness_target eigenvector_source eigenve > ctor_target v1 v2 betwee ~ e betwee ~ t eigenv ~ e eigenv ~ t 1. Peruzzi Castellan Peruzzi Strozzi Peruzzi Bischeri Castellan Strozzi Castellan Barbadori Strozzi Ridolfi Strozzi Bischeri Bischeri Guadagni Barbadori Medici Ridolfi Medici Ridolfi Tornabuon Medici Tornabuon Medici Albizzi Medici Salviati Medici Acciaiuol Tornabuon Guadagni Guadagni Lambertes Guadagni Albizzi Albizzi Ginori Salviati Pazzi / 41

39 Stata implementation Using network and netsummarize Network visualization Betweenness centrality Eigenvector centrality Ginori (0.00) Ginori (0.07) Lambertes (0.00) Albizzi (0.21) Guadagni (0.25) Acciaiuol (0.00) Tornabuon (0.09) Medici (0.52) Bischeri (0.10) Salviati (0.14) Ridolfi (0.11) Pazzi (0.00) Lambertes (0.09) Albizzi (0.24) Guadagni (0.29) Acciaiuol (0.13) Tornabuon (0.33) Medici (0.43) Bischeri (0.28) Salviati (0.15) Ridolfi (0.34) Pazzi (0.04) Strozzi (0.10) Peruzzi (0.02) Barbadori (0.09) Strozzi (0.36) Peruzzi (0.28) Barbadori (0.21) Castellan (0.05) Castellan (0.26) Centrality scores in parentheses. Drawn using NETPLOT (Corten, 2011). 39 / 41

40 Conclusion Conclusion Different types of matrices, centrality measures, and clustering coefficients can be generated from information retrieved from relational data. The command network provides access to SGL functions that generate network measures based on edge list stored in Stata. The postcomputation command netsummarize allows the user to generate standard and customized network measures which are merged into Stata dataset. Future developments include: More efficient algorithms. Designing functions for additional network measures. Optimizing SGL in Mata. 40 / 41

41 References References I Anthonisse, J. M. (1971): The rush in a directed graph, Discussion paper, Stichting Mathematisch Centrum, Amsterdam, Technical Report BN 9/71. Barrat, A., M. Barthélemy, R. Pastor-Satorras, and A. Vespignani (2004): The architecture of complex weighted networks, Proceedings of the National Academy of Sciences, 101(11), Bonacich, P. (1972): Factoring and weighting approaches to status scores and clique identification, Journal of Mathematical Sociology 2, pp (1987): Power and Centrality: A Family of Measures, The American Journal of Sociology, 92(5), Bonacich, P., and P. Lloyd (2001): Eigenvector-like measures of centrality for asymmetric relations, Social Networks, 23(3), Corten, R. (2011): Visualization of social networks in Stata using multidimensional scaling, Stata Journal, 11(1), Freeman, L. C. (1977): A set of measures of centrality based on betweenness, Sociometry, 40(1), (1978): Centrality in social networks: Conceptual clarification, Social Networks, (1), Jackson, M. O. (2008): Social and Economic Networks. Princeton University Press. Katz, L. (1953): A New Status Index Derived from Sociometric Analysis, Psychometrika, 18, Opsahl, T., and P. Panzarasa (2009): Clustering in weighted networks, Social Networks, 31(2), Padgett, J. F., and C. K. Ansell (1993): Robust action and the rise of the Medici, , The American Journal of Sociology, 98(6), / 41

### Visualization of Social Networks in Stata by Multi-dimensional Scaling

Visualization of Social Networks in Stata by Multi-dimensional Scaling Rense Corten Department of Sociology/ICS Utrecht University The Netherlands r.corten@uu.nl April 12, 2010 1 Introduction Social network

### DATA ANALYSIS II. Matrix Algorithms

DATA ANALYSIS II Matrix Algorithms Similarity Matrix Given a dataset D = {x i }, i=1,..,n consisting of n points in R d, let A denote the n n symmetric similarity matrix between the points, given as where

### Eigenvector-like measures of centrality for asymmetric relations

Social Networks 23 (2001) 191 201 Eigenvector-like measures of centrality for asymmetric relations Phillip Bonacich, Paulette Lloyd Department of Sociology, University of California at Los Angeles, 2201

### Social Media Mining. Network Measures

Klout Measures and Metrics 22 Why Do We Need Measures? Who are the central figures (influential individuals) in the network? What interaction patterns are common in friends? Who are the like-minded users

### Social Network Analysis: Lecture 4-Centrality Measures

Social Network Analysis: Lecture 4-Centrality Measures Donglei Du (ddu@unb.edu) Faculty of Business Administration, University of New Brunswick, NB Canada Fredericton E3B 9Y2 Donglei Du (UNB) Social Network

### Social networks, structuralism, cohesion, brokerage, stratification, network analysis, methods, graph theory, statistical models

6.99A SOCIAL NETWORK ANALYSIS {author} {affiliation} Keywords Social networks, structuralism, cohesion, brokerage, stratification, network analysis, methods, graph theory, statistical models Contents 1

### Practical Graph Mining with R. 5. Link Analysis

Practical Graph Mining with R 5. Link Analysis Outline Link Analysis Concepts Metrics for Analyzing Networks PageRank HITS Link Prediction 2 Link Analysis Concepts Link A relationship between two entities

### Social network and smoking: a pilotstudyamonghigh-schoolstudents

2013 Italian Stata Users Group meeting Social network and smoking: a pilotstudyamonghigh-schoolstudents Bruno Federico Firenze, 14 novembre 2013 The social pattern of smoking Smoking is an important risk

### Social Media Mining. Graph Essentials

Graph Essentials Graph Basics Measures Graph and Essentials Metrics 2 2 Nodes and Edges A network is a graph nodes, actors, or vertices (plural of vertex) Connections, edges or ties Edge Node Measures

### Algorithms for representing network centrality, groups and density and clustered graph representation

COSIN IST 2001 33555 COevolution and Self-organization In dynamical Networks Algorithms for representing network centrality, groups and density and clustered graph representation Deliverable Number: D06

### USING SPECTRAL RADIUS RATIO FOR NODE DEGREE TO ANALYZE THE EVOLUTION OF SCALE- FREE NETWORKS AND SMALL-WORLD NETWORKS

USING SPECTRAL RADIUS RATIO FOR NODE DEGREE TO ANALYZE THE EVOLUTION OF SCALE- FREE NETWORKS AND SMALL-WORLD NETWORKS Natarajan Meghanathan Jackson State University, 1400 Lynch St, Jackson, MS, USA natarajan.meghanathan@jsums.edu

### Sociology and CS. Small World. Sociology Problems. Degree of Separation. Milgram s Experiment. How close are people connected? (Problem Understanding)

Sociology Problems Sociology and CS Problem 1 How close are people connected? Small World Philip Chan Problem 2 Connector How close are people connected? (Problem Understanding) Small World Are people

### Introduction to Flocking {Stochastic Matrices}

Supelec EECI Graduate School in Control Introduction to Flocking {Stochastic Matrices} A. S. Morse Yale University Gif sur - Yvette May 21, 2012 CRAIG REYNOLDS - 1987 BOIDS The Lion King CRAIG REYNOLDS

### Lesson 3. Algebraic graph theory. Sergio Barbarossa. Rome - February 2010

Lesson 3 Algebraic graph theory Sergio Barbarossa Basic notions Definition: A directed graph (or digraph) composed by a set of vertices and a set of edges We adopt the convention that the information flows

### Mining Social-Network Graphs

342 Chapter 10 Mining Social-Network Graphs There is much information to be gained by analyzing the large-scale data that is derived from social networks. The best-known example of a social network is

### Part 2: Community Detection

Chapter 8: Graph Data Part 2: Community Detection Based on Leskovec, Rajaraman, Ullman 2014: Mining of Massive Datasets Big Data Management and Analytics Outline Community Detection - Social networks -

### Applying Social Network Analysis to the Information in CVS Repositories

Applying Social Network Analysis to the Information in CVS Repositories Luis Lopez-Fernandez, Gregorio Robles, Jesus M. Gonzalez-Barahona GSyC, Universidad Rey Juan Carlos {llopez,grex,jgb}@gsyc.escet.urjc.es

### HISTORICAL DEVELOPMENTS AND THEORETICAL APPROACHES IN SOCIOLOGY Vol. I - Social Network Analysis - Wouter de Nooy

SOCIAL NETWORK ANALYSIS University of Amsterdam, Netherlands Keywords: Social networks, structuralism, cohesion, brokerage, stratification, network analysis, methods, graph theory, statistical models Contents

### Walk-Based Centrality and Communicability Measures for Network Analysis

Walk-Based Centrality and Communicability Measures for Network Analysis Michele Benzi Department of Mathematics and Computer Science Emory University Atlanta, Georgia, USA Workshop on Innovative Clustering

### Graph. Consider a graph, G in Fig Then the vertex V and edge E can be represented as:

Graph A graph G consist of 1. Set of vertices V (called nodes), (V = {v1, v2, v3, v4...}) and 2. Set of edges E (i.e., E {e1, e2, e3...cm} A graph can be represents as G = (V, E), where V is a finite and

### Network Analysis: Lecture 1. Sacha Epskamp 02-09-2014

: Lecture 1 University of Amsterdam Department of Psychological Methods 02-09-2014 Who are you? What is your specialization? Why are you here? Are you familiar with the network perspective? How familiar

### College Algebra. Barnett, Raymond A., Michael R. Ziegler, and Karl E. Byleen. College Algebra, 8th edition, McGraw-Hill, 2008, ISBN: 978-0-07-286738-1

College Algebra Course Text Barnett, Raymond A., Michael R. Ziegler, and Karl E. Byleen. College Algebra, 8th edition, McGraw-Hill, 2008, ISBN: 978-0-07-286738-1 Course Description This course provides

### USE OF EIGENVALUES AND EIGENVECTORS TO ANALYZE BIPARTIVITY OF NETWORK GRAPHS

USE OF EIGENVALUES AND EIGENVECTORS TO ANALYZE BIPARTIVITY OF NETWORK GRAPHS Natarajan Meghanathan Jackson State University, 1400 Lynch St, Jackson, MS, USA natarajan.meghanathan@jsums.edu ABSTRACT This

### 15.062 Data Mining: Algorithms and Applications Matrix Math Review

.6 Data Mining: Algorithms and Applications Matrix Math Review The purpose of this document is to give a brief review of selected linear algebra concepts that will be useful for the course and to develop

### Graph Processing and Social Networks

Graph Processing and Social Networks Presented by Shu Jiayu, Yang Ji Department of Computer Science and Engineering The Hong Kong University of Science and Technology 2015/4/20 1 Outline Background Graph

### Why? A central concept in Computer Science. Algorithms are ubiquitous.

Analysis of Algorithms: A Brief Introduction Why? A central concept in Computer Science. Algorithms are ubiquitous. Using the Internet (sending email, transferring files, use of search engines, online

### Analyzing Big Data: Social Choice & Measurement

Analyzing Big Data: Social Choice & Measurement John W. Patty Elizabeth Maggie Penn August 30, 2014 Social scientists commonly construct indices to reduce and summarize complex, multidimensional data.

### An Introduction to APGL

An Introduction to APGL Charanpal Dhanjal February 2012 Abstract Another Python Graph Library (APGL) is a graph library written using pure Python, NumPy and SciPy. Users new to the library can gain an

### STATISTICS AND DATA ANALYSIS IN GEOLOGY, 3rd ed. Clarificationof zonationprocedure described onpp. 238-239

STATISTICS AND DATA ANALYSIS IN GEOLOGY, 3rd ed. by John C. Davis Clarificationof zonationprocedure described onpp. 38-39 Because the notation used in this section (Eqs. 4.8 through 4.84) is inconsistent

### 4. MATRICES Matrices

4. MATRICES 170 4. Matrices 4.1. Definitions. Definition 4.1.1. A matrix is a rectangular array of numbers. A matrix with m rows and n columns is said to have dimension m n and may be represented as follows:

### Lecture 9. 1 Introduction. 2 Random Walks in Graphs. 1.1 How To Explore a Graph? CS-621 Theory Gems October 17, 2012

CS-62 Theory Gems October 7, 202 Lecture 9 Lecturer: Aleksander Mądry Scribes: Dorina Thanou, Xiaowen Dong Introduction Over the next couple of lectures, our focus will be on graphs. Graphs are one of

### arxiv:1201.3120v3 [math.na] 1 Oct 2012

RANKING HUBS AND AUTHORITIES USING MATRIX FUNCTIONS MICHELE BENZI, ERNESTO ESTRADA, AND CHRISTINE KLYMKO arxiv:1201.3120v3 [math.na] 1 Oct 2012 Abstract. The notions of subgraph centrality and communicability,

### Graphs and Network Flows IE411 Lecture 1

Graphs and Network Flows IE411 Lecture 1 Dr. Ted Ralphs IE411 Lecture 1 1 References for Today s Lecture Required reading Sections 17.1, 19.1 References AMO Chapter 1 and Section 2.1 and 2.2 IE411 Lecture

### IE 680 Special Topics in Production Systems: Networks, Routing and Logistics*

IE 680 Special Topics in Production Systems: Networks, Routing and Logistics* Rakesh Nagi Department of Industrial Engineering University at Buffalo (SUNY) *Lecture notes from Network Flows by Ahuja, Magnanti

### General Network Analysis: Graph-theoretic. COMP572 Fall 2009

General Network Analysis: Graph-theoretic Techniques COMP572 Fall 2009 Networks (aka Graphs) A network is a set of vertices, or nodes, and edges that connect pairs of vertices Example: a network with 5

### Long questions answer Advanced Mathematics for Computer Application If P= , find BT. 19. If B = 1 0, find 2B and -3B.

Unit-1: Matrix Algebra Short questions answer 1. What is Matrix? 2. Define the following terms : a) Elements matrix b) Row matrix c) Column matrix d) Diagonal matrix e) Scalar matrix f) Unit matrix OR

### Asking Hard Graph Questions. Paul Burkhardt. February 3, 2014

Beyond Watson: Predictive Analytics and Big Data U.S. National Security Agency Research Directorate - R6 Technical Report February 3, 2014 300 years before Watson there was Euler! The first (Jeopardy!)

### Operation Count; Numerical Linear Algebra

10 Operation Count; Numerical Linear Algebra 10.1 Introduction Many computations are limited simply by the sheer number of required additions, multiplications, or function evaluations. If floating-point

### In the following we will only consider undirected networks.

Roles in Networks Roles in Networks Motivation for work: Let topology define network roles. Work by Kleinberg on directed graphs, used topology to define two types of roles: authorities and hubs. (Each

### MATH 551 - APPLIED MATRIX THEORY

MATH 55 - APPLIED MATRIX THEORY FINAL TEST: SAMPLE with SOLUTIONS (25 points NAME: PROBLEM (3 points A web of 5 pages is described by a directed graph whose matrix is given by A Do the following ( points

### The basic unit in matrix algebra is a matrix, generally expressed as: a 11 a 12. a 13 A = a 21 a 22 a 23

(copyright by Scott M Lynch, February 2003) Brief Matrix Algebra Review (Soc 504) Matrix algebra is a form of mathematics that allows compact notation for, and mathematical manipulation of, high-dimensional

### Notes on Matrix Multiplication and the Transitive Closure

ICS 6D Due: Wednesday, February 25, 2015 Instructor: Sandy Irani Notes on Matrix Multiplication and the Transitive Closure An n m matrix over a set S is an array of elements from S with n rows and m columns.

### SECTIONS 1.5-1.6 NOTES ON GRAPH THEORY NOTATION AND ITS USE IN THE STUDY OF SPARSE SYMMETRIC MATRICES

SECIONS.5-.6 NOES ON GRPH HEORY NOION ND IS USE IN HE SUDY OF SPRSE SYMMERIC MRICES graph G ( X, E) consists of a finite set of nodes or vertices X and edges E. EXMPLE : road map of part of British Columbia

### Social and Technological Network Analysis. Lecture 3: Centrality Measures. Dr. Cecilia Mascolo (some material from Lada Adamic s lectures)

Social and Technological Network Analysis Lecture 3: Centrality Measures Dr. Cecilia Mascolo (some material from Lada Adamic s lectures) In This Lecture We will introduce the concept of centrality and

### Graph Algorithms using Map-Reduce

Graph Algorithms using Map-Reduce Graphs are ubiquitous in modern society. Some examples: The hyperlink structure of the web 1/7 Graph Algorithms using Map-Reduce Graphs are ubiquitous in modern society.

### Linear Algebra and TI 89

Linear Algebra and TI 89 Abdul Hassen and Jay Schiffman This short manual is a quick guide to the use of TI89 for Linear Algebra. We do this in two sections. In the first section, we will go over the editing

### Ranking on Data Manifolds

Ranking on Data Manifolds Dengyong Zhou, Jason Weston, Arthur Gretton, Olivier Bousquet, and Bernhard Schölkopf Max Planck Institute for Biological Cybernetics, 72076 Tuebingen, Germany {firstname.secondname

### Analyzing Big Data: Social Choice & Measurement

Analyzing Big Data: Social Choice & Measurement John W. Patty Elizabeth Maggie Penn September 27, 2014 Social scientists commonly construct indices to reduce and summarize complex, multidimensional data.

### Direct Methods for Solving Linear Systems. Linear Systems of Equations

Direct Methods for Solving Linear Systems Linear Systems of Equations Numerical Analysis (9th Edition) R L Burden & J D Faires Beamer Presentation Slides prepared by John Carroll Dublin City University

### ! E6893 Big Data Analytics Lecture 10:! Linked Big Data Graph Computing (II)

E6893 Big Data Analytics Lecture 10: Linked Big Data Graph Computing (II) Ching-Yung Lin, Ph.D. Adjunct Professor, Dept. of Electrical Engineering and Computer Science Mgr., Dept. of Network Science and

### APPM4720/5720: Fast algorithms for big data. Gunnar Martinsson The University of Colorado at Boulder

APPM4720/5720: Fast algorithms for big data Gunnar Martinsson The University of Colorado at Boulder Course objectives: The purpose of this course is to teach efficient algorithms for processing very large

### MATH 240 Fall, Chapter 1: Linear Equations and Matrices

MATH 240 Fall, 2007 Chapter Summaries for Kolman / Hill, Elementary Linear Algebra, 9th Ed. written by Prof. J. Beachy Sections 1.1 1.5, 2.1 2.3, 4.2 4.9, 3.1 3.5, 5.3 5.5, 6.1 6.3, 6.5, 7.1 7.3 DEFINITIONS

### 1 Introduction to Matrices

1 Introduction to Matrices In this section, important definitions and results from matrix algebra that are useful in regression analysis are introduced. While all statements below regarding the columns

### Bindel, Fall 2012 Matrix Computations (CS 6210) Week 8: Friday, Oct 12

Why eigenvalues? Week 8: Friday, Oct 12 I spend a lot of time thinking about eigenvalue problems. In part, this is because I look for problems that can be solved via eigenvalues. But I might have fewer

### Arithmetic and Algebra of Matrices

Arithmetic and Algebra of Matrices Math 572: Algebra for Middle School Teachers The University of Montana 1 The Real Numbers 2 Classroom Connection: Systems of Linear Equations 3 Rational Numbers 4 Irrational

### 7 Communication Classes

this version: 26 February 2009 7 Communication Classes Perhaps surprisingly, we can learn much about the long-run behavior of a Markov chain merely from the zero pattern of its transition matrix. In the

### Statistical Analysis of Complete Social Networks

Statistical Analysis of Complete Social Networks Introduction to networks Christian Steglich c.e.g.steglich@rug.nl median geodesic distance between groups 1.8 1.2 0.6 transitivity 0.0 0.0 0.5 1.0 1.5 2.0

### Graph Algorithms. Ananth Grama, Anshul Gupta, George Karypis, and Vipin Kumar

Graph Algorithms Ananth Grama, Anshul Gupta, George Karypis, and Vipin Kumar To accompany the text Introduction to Parallel Computing, Addison Wesley, 3. Topic Overview Definitions and Representation Minimum

V. Adamchik 1 Graph Theory Victor Adamchik Fall of 2005 Plan 1. Basic Vocabulary 2. Regular graph 3. Connectivity 4. Representing Graphs Introduction A.Aho and J.Ulman acknowledge that Fundamentally, computer

### The PageRank Citation Ranking: Bring Order to the Web

The PageRank Citation Ranking: Bring Order to the Web presented by: Xiaoxi Pang 25.Nov 2010 1 / 20 Outline Introduction A ranking for every page on the Web Implementation Convergence Properties Personalized

### Theorem A graph T is a tree if, and only if, every two distinct vertices of T are joined by a unique path.

Chapter 3 Trees Section 3. Fundamental Properties of Trees Suppose your city is planning to construct a rapid rail system. They want to construct the most economical system possible that will meet the

### CONTROLLABILITY. Chapter 2. 2.1 Reachable Set and Controllability. Suppose we have a linear system described by the state equation

Chapter 2 CONTROLLABILITY 2 Reachable Set and Controllability Suppose we have a linear system described by the state equation ẋ Ax + Bu (2) x() x Consider the following problem For a given vector x in

### 6.042/18.062J Mathematics for Computer Science October 3, 2006 Tom Leighton and Ronitt Rubinfeld. Graph Theory III

6.04/8.06J Mathematics for Computer Science October 3, 006 Tom Leighton and Ronitt Rubinfeld Lecture Notes Graph Theory III Draft: please check back in a couple of days for a modified version of these

### Collecting Network Data in Surveys

Collecting Network Data in Surveys Arun Advani and Bansi Malde September 203 We gratefully acknowledge funding from the ESRC-NCRM Node Programme Evaluation for Policy Analysis Grant reference RES-576-25-0042.

### A Network Approach to Define Modularity of Components in Complex Products

Manuel E. Sosa INSEAD Fontainebleau, France manuel.sosa@insead.edu Steven D. Eppinger MIT Cambridge, MA, USA eppinger@mit.edu Craig M. Rowles Pratt and Whitney East Hartford, CT, USA rowles@alum.mit.edu

### Lecture 15 An Arithmetic Circuit Lowerbound and Flows in Graphs

CSE599s: Extremal Combinatorics November 21, 2011 Lecture 15 An Arithmetic Circuit Lowerbound and Flows in Graphs Lecturer: Anup Rao 1 An Arithmetic Circuit Lower Bound An arithmetic circuit is just like

### Network Analysis Basics and applications to online data

Network Analysis Basics and applications to online data Katherine Ognyanova University of Southern California Prepared for the Annenberg Program for Online Communities, 2010. Relational data Node (actor,

### = [a ij ] 2 3. Square matrix A square matrix is one that has equal number of rows and columns, that is n = m. Some examples of square matrices are

This document deals with the fundamentals of matrix algebra and is adapted from B.C. Kuo, Linear Networks and Systems, McGraw Hill, 1967. It is presented here for educational purposes. 1 Introduction In

### 10. Graph Matrices Incidence Matrix

10 Graph Matrices Since a graph is completely determined by specifying either its adjacency structure or its incidence structure, these specifications provide far more efficient ways of representing a

### Hadoop SNS. renren.com. Saturday, December 3, 11

Hadoop SNS renren.com Saturday, December 3, 11 2.2 190 40 Saturday, December 3, 11 Saturday, December 3, 11 Saturday, December 3, 11 Saturday, December 3, 11 Saturday, December 3, 11 Saturday, December

### A SOCIAL NETWORK ANALYSIS APPROACH TO ANALYZE ROAD NETWORKS INTRODUCTION

A SOCIAL NETWORK ANALYSIS APPROACH TO ANALYZE ROAD NETWORKS Kyoungjin Park Alper Yilmaz Photogrammetric and Computer Vision Lab Ohio State University park.764@osu.edu yilmaz.15@osu.edu ABSTRACT Depending

### Relations Graphical View

Relations Slides by Christopher M. Bourke Instructor: Berthe Y. Choueiry Introduction Recall that a relation between elements of two sets is a subset of their Cartesian product (of ordered pairs). A binary

### Solving Simultaneous Equations and Matrices

Solving Simultaneous Equations and Matrices The following represents a systematic investigation for the steps used to solve two simultaneous linear equations in two unknowns. The motivation for considering

### Manifold Learning Examples PCA, LLE and ISOMAP

Manifold Learning Examples PCA, LLE and ISOMAP Dan Ventura October 14, 28 Abstract We try to give a helpful concrete example that demonstrates how to use PCA, LLE and Isomap, attempts to provide some intuition

### MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS. + + x 2. x n. a 11 a 12 a 1n b 1 a 21 a 22 a 2n b 2 a 31 a 32 a 3n b 3. a m1 a m2 a mn b m

MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS 1. SYSTEMS OF EQUATIONS AND MATRICES 1.1. Representation of a linear system. The general system of m equations in n unknowns can be written a 11 x 1 + a 12 x 2 +

### Data Structures in Java. Session 16 Instructor: Bert Huang

Data Structures in Java Session 16 Instructor: Bert Huang http://www.cs.columbia.edu/~bert/courses/3134 Announcements Homework 4 due next class Remaining grades: hw4, hw5, hw6 25% Final exam 30% Midterm

### Similar matrices and Jordan form

Similar matrices and Jordan form We ve nearly covered the entire heart of linear algebra once we ve finished singular value decompositions we ll have seen all the most central topics. A T A is positive

### Graph Theory for Articulated Bodies

Graph Theory for Articulated Bodies Alba Perez-Gracia Department of Mechanical Engineering, Idaho State University Articulated Bodies A set of rigid bodies (links) joined by joints that allow relative

### Practical statistical network analysis (with R and igraph)

Practical statistical network analysis (with R and igraph) Gábor Csárdi csardi@rmki.kfki.hu Department of Biophysics, KFKI Research Institute for Nuclear and Particle Physics of the Hungarian Academy of

### MATH 304 Linear Algebra Lecture 8: Inverse matrix (continued). Elementary matrices. Transpose of a matrix.

MATH 304 Linear Algebra Lecture 8: Inverse matrix (continued). Elementary matrices. Transpose of a matrix. Inverse matrix Definition. Let A be an n n matrix. The inverse of A is an n n matrix, denoted

### December 4, 2013 MATH 171 BASIC LINEAR ALGEBRA B. KITCHENS

December 4, 2013 MATH 171 BASIC LINEAR ALGEBRA B KITCHENS The equation 1 Lines in two-dimensional space (1) 2x y = 3 describes a line in two-dimensional space The coefficients of x and y in the equation

### Mathematics of Cryptography Modular Arithmetic, Congruence, and Matrices. A Biswas, IT, BESU SHIBPUR

Mathematics of Cryptography Modular Arithmetic, Congruence, and Matrices A Biswas, IT, BESU SHIBPUR McGraw-Hill The McGraw-Hill Companies, Inc., 2000 Set of Integers The set of integers, denoted by Z,

### Graphs. Graph G=(V,E): representation

Graphs G = (V,E) V the vertices of the graph {v 1, v 2,..., v n } E the edges; E a subset of V x V A cost function cij is the cost/ weight of the edge (v i, v j ) Graph G=(V,E): representation 1. Adjacency

### 3. INNER PRODUCT SPACES

. INNER PRODUCT SPACES.. Definition So far we have studied abstract vector spaces. These are a generalisation of the geometric spaces R and R. But these have more structure than just that of a vector space.

### CS2 Algorithms and Data Structures Note 11. Breadth-First Search and Shortest Paths

CS2 Algorithms and Data Structures Note 11 Breadth-First Search and Shortest Paths In this last lecture of the CS2 Algorithms and Data Structures thread we will consider the problem of computing distances

### Nonlinear Iterative Partial Least Squares Method

Numerical Methods for Determining Principal Component Analysis Abstract Factors Béchu, S., Richard-Plouet, M., Fernandez, V., Walton, J., and Fairley, N. (2016) Developments in numerical treatments for

### The University of Chicago Press is collaborating with JSTOR to digitize, preserve and extend access to American Journal of Sociology.

Power and Centrality: A Family of Measures Author(s): Phillip Bonacich Reviewed work(s): Source: American Journal of Sociology, Vol. 92, No. 5 (Mar., 1987), pp. 1170-1182 Published by: The University of

### SOLVING LINEAR SYSTEMS

SOLVING LINEAR SYSTEMS Linear systems Ax = b occur widely in applied mathematics They occur as direct formulations of real world problems; but more often, they occur as a part of the numerical analysis

### Linear Programming I

Linear Programming I November 30, 2003 1 Introduction In the VCR/guns/nuclear bombs/napkins/star wars/professors/butter/mice problem, the benevolent dictator, Bigus Piguinus, of south Antarctica penguins

### Mathematics of Cryptography

CHAPTER 2 Mathematics of Cryptography Part I: Modular Arithmetic, Congruence, and Matrices Objectives This chapter is intended to prepare the reader for the next few chapters in cryptography. The chapter

### Diagonal, Symmetric and Triangular Matrices

Contents 1 Diagonal, Symmetric Triangular Matrices 2 Diagonal Matrices 2.1 Products, Powers Inverses of Diagonal Matrices 2.1.1 Theorem (Powers of Matrices) 2.2 Multiplying Matrices on the Left Right by

### Identification of Influencers - Measuring Influence in Customer Networks

Submitted to Decision Support Systems manuscript DSS Identification of Influencers - Measuring Influence in Customer Networks Christine Kiss, Martin Bichler Internet-based Information Systems, Dept. of

### Factor Analysis. Chapter 420. Introduction

Chapter 420 Introduction (FA) is an exploratory technique applied to a set of observed variables that seeks to find underlying factors (subsets of variables) from which the observed variables were generated.

### Localized Bridging Centrality for Distributed Network Analysis

Localized Bridging Centrality for Distributed Network Analysis Soumendra Nanda and David Kotz Department of Computer Science and Institute for Security Technology Studies, Dartmouth College, Hanover, NH

### (67902) Topics in Theory and Complexity Nov 2, 2006. Lecture 7

(67902) Topics in Theory and Complexity Nov 2, 2006 Lecturer: Irit Dinur Lecture 7 Scribe: Rani Lekach 1 Lecture overview This Lecture consists of two parts In the first part we will refresh the definition

### Vectors. Philippe B. Laval. Spring 2012 KSU. Philippe B. Laval (KSU) Vectors Spring /

Vectors Philippe B Laval KSU Spring 2012 Philippe B Laval (KSU) Vectors Spring 2012 1 / 18 Introduction - Definition Many quantities we use in the sciences such as mass, volume, distance, can be expressed

### MyMathLab ecourse for Developmental Mathematics

MyMathLab ecourse for Developmental Mathematics, North Shore Community College, University of New Orleans, Orange Coast College, Normandale Community College Table of Contents Module 1: Whole Numbers and