Complex Network Analysis of Brain Connectivity: An Introduction LABREPORT 5 Fernando Ferreira-Santos 2012
Title: Complex Network Analysis of Brain Connectivity: An Introduction
Technical Report
Authors: Fernando Ferreira-Santos
University of Porto, Portugal and University College London, UK
Keywords: brain connectivity; network theory; graph theory.
Citation (APA 6th): Ferreira-Santos, F. (2012). Complex network analysis of brain connectivity: An introduction (LabReport No. 5). Porto: Laboratory of Neuropsychophysiology (University of Porto). Retrieved from: http://www.fpce.up.pt/labpsi/data_files/09labreports/labreport_5.pdf

LABREPORTS Series, Number 5
Scientific coordination: João Marques-Teixeira, Fernando Barbosa, Pedro R. Almeida, Fernando Ferreira-Santos

This work is licensed under the Creative Commons Attribution 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by/3.0/ or send a letter to Creative Commons, 171 Second Street, Suite 300, San Francisco, California 94105, USA.

Laboratory of Neuropsychophysiology
Faculdade de Psicologia e de Ciências da Educação da Universidade do Porto
Rua do Dr. Manuel Pereira da Silva, 4200-392 Porto
PORTUGAL
http://www.fpce.up.pt/labpsi/
http://www.fpce.up.pt/
Introduction
Network definition and properties
Network measures and types
Modelling brain connectivity and network inferences
Conclusions
References

Introduction

In the last decades there has been an exponential increase in human neuroscientific research. This is in part due to the widespread availability of non-invasive techniques for measuring brain structure and activity, such as neuroimaging (e.g., MRI, fMRI, DTI) and neurophysiological recordings (e.g., EEG, MEG), which produce large datasets of spatiotemporal data. In fact, the amount of data obtained from a single fMRI or high-density EEG experiment in the present day would have been computationally intractable a few decades ago, and it is still fairly common for researchers to analyse only a small subset of the data collected (e.g., analysing only a few selected EEG electrodes from a high-density recording).

Presently, one of the areas of technical advancement that is raising significant interest is that of brain connectivity (Bullmore & Sporns, 2009; Reijneveld, Ponten, Berendse, & Stam, 2007; Sporns, 2011; Stam & Reijneveld, 2007), and one of the promises of connectivity analyses is to make the most of the rich datasets that neuroscientists collect, by analysing the spatiotemporal dynamics present in the data.

The purpose of the present report is to introduce the basic concepts of Network Theory [1] (often enumerating different terms that are used to represent the same concept) and their application to the study of brain connectivity.

[1] Network Theory is the part of Graph Theory (a branch of Mathematics) that is specifically concerned with modelling complex real-world systems (moving away from the classical objects of Graph Theoretical analysis, namely random and regular graphs, which are poor candidates for modelling many real-world phenomena; Strogatz, 2001). Earlier references to these methods in Neuroscience tend to refer solely to Graph Theory, which is still technically correct.

Brain connectivity is a broad concept, but can be generally divided into three subtypes: structural, functional, and effective connectivity (Friston, 1994; see Horwitz [2003] for a discussion of some conceptual caveats). Structural connectivity refers to the anatomical connections between brain regions, typically corresponding to white matter tracts. Functional connectivity corresponds to the temporal correlation in the activity of two brain regions, regardless of whether they have direct anatomical links. Finally, effective connectivity consists of the directed causal influences that one brain region exerts on another (Rubinov & Sporns, 2010).

A common denominator of the different kinds of brain connectivity is the idea that, in all instances, the system can be described as a (structural or functional) network. The idea that the brain is a network-like system is far from novel in neuroscience, but attempts to examine these properties under formal mathematical network theories are a recent endeavour.

Network definition and properties

In Network Theoretical terms, a network (or graph) is a mathematical model that represents a collection of nodes (or vertices) and links (or edges, or connections) between pairs of nodes (Newman, 2003; Rubinov & Sporns, 2010) (Figure 1). Note that a complex network is an abstract model and can be used to represent brain systems at different levels (from small ensembles of neurons and synaptic connections to macroanatomical regions connected by white matter bundles) [2].

Figure 1. Graphical representation of a network (or graph). [Labels in the figure: node, link.]

The characteristics of the links define the network properties.
In the simplest case of unweighted undirected networks, each link has the same strength or length (and, as such, existing links are represented by 1 whereas the absence of a link is represented by 0) and links are bidirectional (meaning that in a link connecting nodes A and B information can travel from A to B and from B to A). If a network is weighted, the links may differ from each other, with some being stronger than others. The strength or length of each link is represented by its specific weight. Finally, a network is directed if its links are unidirectional (in a directed link from node A to node B, information can travel from A to B but not from B to A) (Figure 2).

[2] In fact, networks have been used to model complex systems across disciplines from physics to the social sciences (Newman, 2003).

Figure 2. Graphical representation of four networks illustrating the possible combinations of different properties. [Panel labels: unweighted (binary), weighted; undirected (symmetrical), directed (asymmetrical).]

As seen in Figures 1 and 2 above, networks can be represented graphically by plotting the nodes and links according to the network properties, but the most computationally useful way to represent a network is in matrix form. The adjacency matrix is an n-by-n square matrix (n being the number of nodes in the network). Usually the adjacency matrix is denoted by A and the individual link between nodes i and j by A_{i,j}. For visualization, the adjacency matrix can be coded in greyscale values (0 = black up to 1 = white) or with other colour maps.

Unweighted (binary), undirected (symmetrical):

      A    B    C    D    E
A     0    1    1    1    0
B     1    0    1    1    0
C     1    1    0    1    0
D     1    1    1    0    1
E     0    0    0    1    0

Weighted:

      A    B    C    D    E
A     0    0.8  0.5  1    0
B     0.8  0    0.2  0.5  0
C     0.5  0.2  0    0.8  0
D     1    0.5  0.8  0    0.8
E     0    0    0    0.8  0

Directed (asymmetrical):

      A    B    C    D    E
A     0    1    1    1    0
B     0    0    1    1    0
C     0    0    0    1    0
D     1    0    0    0    1
E     0    0    0    1    0

Figure 3. Graphical representation of three networks (top row) and the corresponding adjacency matrices, coded with a greyscale colour map (bottom row). The nodes are indicated by letters both in the graphical representation and in the adjacency matrix. For weighted networks the width of the links indicates connection strength. For directed networks the arrows indicate the direction of the information flow. Note that the adjacency matrices of undirected networks are necessarily symmetrical whereas for directed networks this is not the case.
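As an illustration of the matrix representation, the three adjacency matrices of Figure 3 can be written as nested lists and checked for the properties discussed above. This is only a minimal sketch in plain Python: the helper names (`is_symmetric`, `is_binary`, `links`) are hypothetical, and reading rows as the source node of a directed link is an assumed convention that the report does not state.

```python
# Adjacency matrices transcribed from Figure 3 (node order A, B, C, D, E).
NODES = ["A", "B", "C", "D", "E"]

BINARY_UNDIRECTED = [
    [0, 1, 1, 1, 0],
    [1, 0, 1, 1, 0],
    [1, 1, 0, 1, 0],
    [1, 1, 1, 0, 1],
    [0, 0, 0, 1, 0],
]

WEIGHTED_UNDIRECTED = [
    [0.0, 0.8, 0.5, 1.0, 0.0],
    [0.8, 0.0, 0.2, 0.5, 0.0],
    [0.5, 0.2, 0.0, 0.8, 0.0],
    [1.0, 0.5, 0.8, 0.0, 0.8],
    [0.0, 0.0, 0.0, 0.8, 0.0],
]

BINARY_DIRECTED = [
    [0, 1, 1, 1, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 1, 0],
    [1, 0, 0, 0, 1],
    [0, 0, 0, 1, 0],
]


def is_symmetric(matrix):
    """True if A[i][j] == A[j][i] for every pair, i.e. the network is undirected."""
    n = len(matrix)
    return all(matrix[i][j] == matrix[j][i]
               for i in range(n) for j in range(i + 1, n))


def is_binary(matrix):
    """True if every cell is 0 or 1, i.e. the network is unweighted."""
    return all(cell in (0, 1) for row in matrix for cell in row)


def links(matrix, names=NODES):
    """List (source, target, weight) for each non-zero cell.

    Assumed convention: the row is the source node and the column the target.
    """
    return [(names[i], names[j], w)
            for i, row in enumerate(matrix)
            for j, w in enumerate(row) if w != 0]


print(is_symmetric(BINARY_UNDIRECTED), is_symmetric(BINARY_DIRECTED))
print(is_binary(BINARY_UNDIRECTED), is_binary(WEIGHTED_UNDIRECTED))
```

In an undirected matrix every link appears twice (once in each triangle), so `links` lists each undirected link in both directions.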
Network measures and types

The main advantage of formal Network Theoretical analysis is the precise quantification of network parameters (or measures, or metrics) that allow the topology and efficiency of the network to be examined. There are many measures which can be calculated for a network (for a detailed review see Appendix 1 of Rubinov & Sporns, 2010), but only the core measures will be detailed below (Stam & Reijneveld, 2007) [3]. Figure 4 provides examples of some of these core measures.

Set of nodes in a network (N) and size (n): the size of a network is the number of nodes (n) in it. It corresponds to the number of rows of the adjacency matrix (or columns, given that it is a square matrix). For a network of size n, the maximum number of possible links is (n - 1)n/2 for undirected networks and (n - 1)n for directed networks (excluding the possibility of self-connections). The set of all nodes in the network is usually represented by N.

Degree (k) and Degree Distribution: the degree of a node is the number of links which connect to it, which is also the number of its neighbour nodes (i.e., nodes directly connected to it); in weighted networks, the analogous sum of link weights is referred to as strength. The degree distribution consists of the degrees of all the nodes in the network, and can be defined analytically as the probability P(k) that a node has degree k. The mean degree of the network is a measure of the density or wiring cost of the network (the larger the mean degree, the larger the number of connections).

Clustering coefficient (C): this is the main measure of the local structure of a network, and it can be calculated for individual nodes or for the entire network. The clustering coefficient c_i of node i with degree k_i can be defined as the ratio between the actual number of links between the neighbours of i (e_i) and the maximum possible number of links between those neighbours, where the neighbours of i are the nodes directly connected to it:
\[ c_i = \frac{2 e_i}{k_i (k_i - 1)} \qquad (1) \]

The clustering coefficient C of the network is the average of all individual clustering coefficients, where n is the number of nodes:

\[ C = \frac{1}{n} \sum_{i=1}^{n} c_i \qquad (2) \]

Clustering coefficients vary between 0 and 1. High clustering coefficients mean that neighbouring nodes are well interconnected. This suggests redundancy in connections, which protects the network against random error, i.e., the loss of an individual node will have little impact on the structure of the network.

[3] In Network Theory notation there is a tendency to use capital letters to indicate network measures and lower case letters to denote node measures, although there are sometimes exceptions to this. The present report follows this convention.

Characteristic path length (L): this is a network measure that indicates how integrated a network is and how easily information can flow within it. The path length or distance (or geodesic path) d_{i,j} between two nodes i and j is the smallest number of links that connect i to j. The characteristic path length L of a network is the average of the distances between all pairs of nodes:

\[ L = \frac{1}{n(n - 1)} \sum_{i,j \in N,\, i \neq j} d_{i,j} \qquad (3) \]

Top network (mean k = 2.8, C = 0.70, L = 1.3)

Adjacency matrix, with degree (k) and clustering coefficient (c) of each node:

      A    B    C    D    E      k    c
A     0    1    1    1    0      3    1
B     1    0    1    1    0      3    1
C     1    1    0    1    0      3    1
D     1    1    1    0    1      4    0.5
E     0    0    0    1    0      1    0

Distance matrix:

      A    B    C    D    E
A     0    1    1    1    2
B     1    0    1    1    2
C     1    1    0    1    2
D     1    1    1    0    1
E     2    2    2    1    0

Bottom network (mean k = 2.0, C = 0.33, L = 1.6)

Adjacency matrix, with degree (k) and clustering coefficient (c) of each node:

      A    B    C    D    E      k    c
A     0    1    1    1    0      3    0.3
B     1    0    0    0    0      1    0
C     1    0    0    1    0      2    1
D     1    0    1    0    1      3    0.3
E     0    0    0    1    0      1    0

Distance matrix:

      A    B    C    D    E
A     0    1    1    1    2
B     1    0    2    2    3
C     1    2    0    1    2
D     1    2    1    0    1
E     2    3    2    1    0

[Degree distribution plots for both networks: P(k) against degree (k).]

Figure 4. Core network measures for two networks. The k and c values indicate the degree and clustering coefficient for each individual node (A to E), respectively. Note that differences in the density of connections, which in this example can be appreciated visually, are reflected by the value of the mean degree and lead to different degree distributions. The distance matrices indicate the distance d for each pair of nodes. The top network shows more interconnections between neighbouring nodes than the bottom network, leading to a higher C and a lower L.
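The core measures can be checked against the values reported in Figure 4 with a few lines of plain Python. The sketch below is one possible implementation, not the procedure used to produce the figure: the function names are hypothetical, nodes with fewer than two neighbours are assigned c = 0 (as the figure does for node E), and the path length routine assumes a connected, unweighted, undirected network.

```python
from collections import deque

# Adjacency matrices of the two networks in Figure 4 (node order A, B, C, D, E).
TOP = [
    [0, 1, 1, 1, 0],
    [1, 0, 1, 1, 0],
    [1, 1, 0, 1, 0],
    [1, 1, 1, 0, 1],
    [0, 0, 0, 1, 0],
]
BOTTOM = [
    [0, 1, 1, 1, 0],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 1, 0],
    [1, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
]


def max_links(n, directed=False):
    """Maximum number of links without self-connections."""
    return n * (n - 1) if directed else n * (n - 1) // 2


def neighbours(adj, i):
    return [j for j, a in enumerate(adj[i]) if a]


def degrees(adj):
    return [len(neighbours(adj, i)) for i in range(len(adj))]


def clustering(adj, i):
    """Equation (1); nodes with fewer than two neighbours get 0 (assumption)."""
    nb = neighbours(adj, i)
    k = len(nb)
    if k < 2:
        return 0.0
    # e_i: links that actually exist between pairs of neighbours of i
    e = sum(adj[u][v] for pos, u in enumerate(nb) for v in nb[pos + 1:])
    return 2 * e / (k * (k - 1))


def mean_clustering(adj):
    """Equation (2): average of the node clustering coefficients."""
    n = len(adj)
    return sum(clustering(adj, i) for i in range(n)) / n


def distances_from(adj, source):
    """Breadth-first search: smallest number of links from source to each node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in neighbours(adj, u):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def characteristic_path_length(adj):
    """Equation (3); assumes the network is connected."""
    n = len(adj)
    total = 0
    for i in range(n):
        d = distances_from(adj, i)
        total += sum(d[j] for j in range(n) if j != i)
    return total / (n * (n - 1))


for name, adj in (("top", TOP), ("bottom", BOTTOM)):
    k = degrees(adj)
    print(name, k, sum(k) / len(k),
          round(mean_clustering(adj), 2), characteristic_path_length(adj))
# top [3, 3, 3, 4, 1] 2.8 0.7 1.3
# bottom [3, 1, 2, 3, 1] 2.0 0.33 1.6
```

Breadth-first search is sufficient here because every link has the same length; for weighted networks a shortest-path algorithm that takes link lengths into account would be needed instead.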
Based on the core measures presented so far, different types of networks can be distinguished [4] (Stam & Reijneveld, 2007; Figure 5):

Ordered (regular or lattice-like) networks: in these networks every node is connected to its k nearest neighbours, leading to an ordered connectivity structure. The degree of all nodes in such a network is the same. This leads to a high clustering coefficient C (high resilience to random error) and a high characteristic path length L (poor transmission of information).

Random networks: these are the opposite of ordered networks, as the links between nodes are completely random. This leads to low C and low L (i.e., information travels easily, but the network is vulnerable to the loss of single nodes).

Small-world networks: the small-world topology resembles an ordered network with a few randomly rewired links. This means that small-world networks have high C, making them resilient, but also low L, making them efficient at transmitting information. A popularized idea related to the concept of small-world networks is that of the "six degrees of separation": according to this idea, a person is only six acquaintances (or, in some versions, handshakes) away from any other person in the world (http://en.wikipedia.org/wiki/six_degrees_of_separation). This illustration captures the interesting properties of small-world networks: most people have shaken hands with other people from their community (high C), but a few people from that community, e.g., public figures or politicians, will have shaken hands with several people from other communities. These long-range links mean that the distance (in handshakes) to other people is dramatically reduced because of the specific person who functions as a hub in the network. Small-worldness can be calculated via the following formula (Humphries & Gurney, 2008):

\[ S = \frac{C / C_{rand}}{L / L_{rand}} \qquad (4) \]

Small-world networks have S > 1, because their C is much higher than C_{rand} while their L remains close to L_{rand}.
Note that C and L are the clustering coefficient and the characteristic path length of the tested network, whereas C_{rand} and L_{rand} are the corresponding measures of a random null network (see section "Modelling brain connectivity and network inferences" below).

[4] Scale-free networks are an additional type of network that will not be addressed in the present report. The degree distribution of scale-free networks follows a power law, which can arise when the network grows by preferential attachment. For details see Stam & Reijneveld (2007).
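To make the ordered case and the small-worldness ratio concrete, the sketch below builds a ring lattice (16 nodes, each linked to its 4 nearest neighbours, as in the ordered graph of Figure 5), computes its C and L, and evaluates equation (4). All function names are hypothetical, and the C_rand and L_rand values passed to `small_worldness` are placeholders chosen only to exercise the formula; in a real analysis they would come from random null networks.

```python
from collections import deque


def ring_lattice(n, k):
    """Ordered network: n nodes on a ring, each linked to its k nearest neighbours (k even)."""
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for step in range(1, k // 2 + 1):
            j = (i + step) % n
            adj[i][j] = adj[j][i] = 1
    return adj


def mean_clustering(adj):
    """Average clustering coefficient; nodes with fewer than two neighbours count as 0."""
    n = len(adj)
    total = 0.0
    for i in range(n):
        nb = [j for j in range(n) if adj[i][j]]
        k = len(nb)
        if k >= 2:
            e = sum(adj[u][v] for pos, u in enumerate(nb) for v in nb[pos + 1:])
            total += 2 * e / (k * (k - 1))
    return total / n


def characteristic_path_length(adj):
    """Mean shortest-path distance over all ordered pairs; assumes a connected network."""
    n = len(adj)
    total = 0
    for source in range(n):
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in range(n):
                if adj[u][v] and v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
    return total / (n * (n - 1))


def small_worldness(c, l, c_rand, l_rand):
    """Equation (4): S = (C / C_rand) / (L / L_rand)."""
    return (c / c_rand) / (l / l_rand)


ring = ring_lattice(16, 4)
c_ring = mean_clustering(ring)              # 0.5
l_ring = characteristic_path_length(ring)   # 2.4

# Placeholder null-network values (hypothetical, for illustration only).
print(small_worldness(c=0.5, l=2.4, c_rand=0.25, l_rand=2.0))
```

Because S is a ratio of two ratios, it exceeds 1 whenever clustering is inflated relative to the null network by more than path length is.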
Figure 5. Reprinted with permission from Stam & Reijneveld (2007, p. 6): Three basic network types in the model of Watts and Strogatz. The leftmost graph is a ring of 16 vertices (N=16), where each vertex is connected to four neighbours (k=4). This is an ordered graph which has a high clustering coefficient C and a long path length L. By choosing an edge at random, and reconnecting it to a randomly chosen vertex, graphs with increasingly random structure can be generated for increasing rewiring probability p. In the case of p=1, the graph becomes completely random, and has a low clustering coefficient and a short path length. For small values of p so-called small-world networks arise, which combine the high clustering coefficient of ordered networks with the short path length of random networks.

Modelling brain connectivity and network inferences

When considering brain networks, the network nodes should ideally represent meaningful brain regions. The use of EEG/MEG sensors as nodes is a common practice, but such results should be interpreted carefully, as the electromagnetic signals picked up by neighbouring sensors are likely to show spatial overlap. Regarding the network links, these may represent structural or functional associations. In structural networks, links should represent the anatomical connections between brain regions, and different weights may represent the size, amount or coherence of fibre tracts. For functional and effective networks, links represent some measure of correlation or causal influence (respectively) between the activity of the nodes they connect (Rubinov & Sporns, 2010).

The first step of a network analysis is to extract a network model from the raw brain connectivity data. From the raw data, one must produce a connectivity matrix that captures the strengths of the connections between the nodes under analysis (which are usually represented using continuous numeric scales).
In a structural connectivity analysis this matrix could represent white matter integrity (e.g., by considering anisotropy measures of brain structures); in a functional connectivity analysis the connectivity matrix will show some measure of association between channels or regions (e.g., temporal correlation, coherence); and in an effective connectivity analysis the connectivity matrix will be populated by measures of causal influence (e.g., Granger causality, directed transfer function).

The second step is to produce the adjacency matrix. In weighted network models, the adjacency matrix would simply be this connectivity matrix (or a normalized version of it). However, it is more common to convert the connectivity matrix to a binary (unweighted) adjacency matrix by retaining only the links that are above a certain threshold. This leads to a binary network model, where the links above the threshold are represented by 1 (presence of link) and those below it by 0 (absence of link). Finally, having produced the adjacency matrix, one can calculate all relevant network measures, as described in the sections above.

Once the model has been specified and its parameters calculated, one is typically interested in drawing inferences. In network analysis, this is usually done by comparing the measures of the tested network with those of a similar but random network, which constitutes the null network against which inferences are drawn. Randomization can take place at different steps of the modelling process (Zalesky, Fornito, & Bullmore, 2012), but it is usually accomplished by randomly shuffling the cells of the connectivity matrix prior to thresholding (Stam & Reijneveld, 2007). The random null network will be similar to the network under study in several dimensions (e.g., it will have the same mean degree), thus ensuring that inferences about specific measures are correctly drawn. As illustrated in Figure 4 above, differences in the mean degree of the network affect the clustering coefficient and path length, suggesting that meaningful interpretations of the other measures can emerge only when degree is controlled for. Figure 6 summarizes the process described.
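The modelling steps described above (connectivity matrix, thresholding, shuffled null network) can be sketched as follows. This is a hedged illustration in plain Python rather than a reference implementation: the function names are hypothetical, absolute Pearson correlation is only one possible choice of functional connectivity measure, the threshold value is arbitrary, and the input signals are synthetic random numbers rather than recorded brain activity.

```python
import random


def pearson(x, y):
    """Pearson correlation between two equally long time series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def connectivity_matrix(signals):
    """Step 1: association strength between every pair of channels.

    Assumption: absolute correlation is used, so sign is ignored.
    """
    n = len(signals)
    conn = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            conn[i][j] = conn[j][i] = abs(pearson(signals[i], signals[j]))
    return conn


def threshold(conn, t):
    """Step 2: binary adjacency matrix, 1 where the strength exceeds t."""
    n = len(conn)
    return [[1 if i != j and conn[i][j] > t else 0 for j in range(n)]
            for i in range(n)]


def shuffled_null(conn, rng):
    """Null model: shuffle the off-diagonal cells while keeping the matrix symmetric."""
    n = len(conn)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    values = [conn[i][j] for i, j in pairs]
    rng.shuffle(values)
    null = [[0.0] * n for _ in range(n)]
    for (i, j), v in zip(pairs, values):
        null[i][j] = null[j][i] = v
    return null


def mean_degree(adj):
    """Mean number of links per node in a binary adjacency matrix."""
    return sum(map(sum, adj)) / len(adj)


# Synthetic example: 6 channels of 200 random samples (not real data).
rng = random.Random(0)
signals = [[rng.gauss(0, 1) for _ in range(200)] for _ in range(6)]
conn = connectivity_matrix(signals)
adj = threshold(conn, 0.05)
null_adj = threshold(shuffled_null(conn, rng), 0.05)

# Shuffling before thresholding keeps the same set of cell values,
# so the null network has the same mean degree as the tested network.
print(mean_degree(adj), mean_degree(null_adj))
```

In practice the shuffle would be repeated many times, and the network measures of the tested network would be compared with their mean over the ensemble of null networks.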
Figure 6. Reprinted with permission from Stam & Reijneveld (2007, p. 12): Schematic illustration of graph analysis applied to multi-channel recordings of brain activity (fMRI, EEG or MEG). The first step (panel A) consists of computing a measure of correlation between all possible pairs of channels of recorded brain activity. The correlations can be represented in a correlation diagram (panel B, strength of correlation indicated with black-white scale). Next a threshold is applied, and all correlations above the threshold are considered to be edges connecting vertices (channels). Thus, the correlation matrix is converted to an unweighted graph (panel C). From this graph various measures such as the clustering coefficient C and the path length L can be computed. For comparisons, random networks can be generated by shuffling the cells of the original correlation matrix of panel B. This shuffling preserves the symmetry of the matrix, and the mean strength of the correlations (panel D). From the random matrices graphs are constructed, and graph measures are computed as before. The mean values of the graph measures for the ensemble of random networks are determined. Finally, the ratio of the graph measures of the original network and the mean values of the graph measures of the random networks can be determined (panel F).

Conclusions

The present report has provided a basic introduction to the main concepts of complex network analysis of brain connectivity data. A future report will focus on technical issues and review software tools to conduct such studies. Readers interested in learning more about this topic are referred to the works cited for additional information.

References

Bullmore, E., & Sporns, O. (2009). Complex brain networks: Graph theoretical analysis of structural and functional systems. Nature Reviews Neuroscience, 10(3), 186-198. doi:10.1038/nrn2575
Friston, K. J. (1994). Functional and effective connectivity in neuroimaging: A synthesis. Human Brain Mapping, 2(1-2), 56-78. doi:10.1002/hbm.460020107
Horwitz, B. (2003). The elusive concept of brain connectivity. NeuroImage, 19(2), 466-470. doi:10.1016/s1053-8119(03)00112-5
Humphries, M. D., & Gurney, K. (2008). Network "small-world-ness": A quantitative method for determining canonical network equivalence. PLoS ONE, 3(4), e0002051. doi:10.1371/journal.pone.0002051
Newman, M. E. J. (2003). The structure and function of complex networks. SIAM Review, 45(2), 167-256. doi:10.1137/s003614450342480
Reijneveld, J. C., Ponten, S. C., Berendse, H. W., & Stam, C. J. (2007). The application of graph theoretical analysis to complex networks in the brain. Clinical Neurophysiology, 118(11), 2317-2331. doi:10.1016/j.clinph.2007.08.010
Rubinov, M., & Sporns, O. (2010). Complex network measures of brain connectivity: Uses and interpretations. NeuroImage, 52(3), 1059-1069. doi:10.1016/j.neuroimage.2009.10.003
Sporns, O. (2011). Networks of the brain. Cambridge, MA: MIT Press.
Stam, C. J., & Reijneveld, J. C. (2007). Graph theoretical analysis of complex networks in the brain. Nonlinear Biomedical Physics, 1(3), 1-19. doi:10.1186/1753-4631-1-3
Zalesky, A., Fornito, A., & Bullmore, E. (2012). On the use of correlation as a measure of network connectivity. NeuroImage, 60(4), 2096-2106. doi:10.1016/j.neuroimage.2012.02.001

Abbreviations used:
DTI     Diffusion tensor imaging
EEG     Electroencephalography
fMRI    Functional magnetic resonance imaging
MEG     Magnetoencephalography
MRI     Magnetic resonance imaging