DATA ANALYSIS II Matrix Algorithms
Similarity Matrix Given a dataset D = {x_i}, i = 1,...,n, consisting of n points in R^d, let A denote the n × n symmetric similarity matrix between the points, given as A(i, j) = a_ij, where a_ij denotes the similarity or affinity between points x_i and x_j. We require the similarity to be symmetric and non-negative, that is, a_ij = a_ji and a_ij ≥ 0, respectively.
Weighted Adjacency Matrix The matrix A may be considered a weighted adjacency matrix of the weighted (undirected) graph G = (V, E), where each vertex is a point and each edge joins a pair of points, that is, V = {x_i | i = 1,...,n} and each edge (x_i, x_j) ∈ E carries the weight a_ij.
Degree Matrix For a vertex x_i, let d_i denote the degree of the vertex, defined as d_i = Σ_{j=1}^n a_ij. We define the degree matrix D of graph G as the n × n diagonal matrix D = diag(d_1, d_2, ..., d_n), with D(i, i) = d_i and D(i, j) = 0 for i ≠ j.
Normalized Adjacency Matrix The normalized adjacency matrix is obtained by dividing each row of the adjacency matrix by the degree of the corresponding node. Given the weighted adjacency matrix A for a graph G, its normalized adjacency matrix is defined as M = D^{-1} A, that is, M(i, j) = m_ij = a_ij / d_i.
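As a concrete illustration (a NumPy sketch; the 4-vertex weighted graph below is a hypothetical example, not one from these notes), we can build D and M directly from A:

```python
import numpy as np

# Hypothetical symmetric, non-negative 4x4 weighted adjacency matrix.
A = np.array([[0., 1., 1., 0.],
              [1., 0., 1., 1.],
              [1., 1., 0., 1.],
              [0., 1., 1., 0.]])

d = A.sum(axis=1)        # degrees d_i = sum_j a_ij
D = np.diag(d)           # degree matrix (diagonal)
M = A / d[:, None]       # normalized adjacency M = D^{-1} A

print(M.sum(axis=1))     # each row of M sums to 1
```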
Eigenvalues Because A is assumed to have non-negative elements, each element of M, namely m_ij = a_ij / d_i, is also non-negative. Consider the sum of the i-th row in M: Σ_{j=1}^n m_ij = Σ_{j=1}^n a_ij / d_i = d_i / d_i = 1. Thus, each row in M sums to 1, which implies that 1 is an eigenvalue of M, since M 1 = 1 for 1 = (1, 1, ..., 1)^T. In fact, λ1 = 1 is the largest eigenvalue of M, and the other eigenvalues satisfy |λ_i| ≤ 1. If G is connected, then the eigenvector corresponding to λ1 is u1 = (1/√n)(1, 1, ..., 1)^T = (1/√n) 1.
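These spectral claims can be checked numerically (a NumPy sketch on a hypothetical connected 4-vertex graph):

```python
import numpy as np

# Hypothetical connected 4-vertex graph; check the spectral claims about M.
A = np.array([[0., 1., 1., 0.],
              [1., 0., 1., 1.],
              [1., 1., 0., 1.],
              [0., 1., 1., 0.]])
M = A / A.sum(axis=1)[:, None]      # normalized adjacency M = D^{-1} A

evals = np.linalg.eigvals(M).real   # eigenvalues are real here, since M is
                                    # similar to the symmetric D^{-1/2} A D^{-1/2}
print(np.max(evals))                # largest eigenvalue is 1
```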
Example (graph)
Adjacency and Degree Matrices
Graph Laplacian Matrix The Laplacian matrix of a graph is defined as L = D − A, where D is the degree matrix and A the weighted adjacency matrix. L is a symmetric, positive semidefinite matrix.
Properties L has n real, non-negative eigenvalues, which can be arranged in decreasing order as follows: λ1 ≥ λ2 ≥ ··· ≥ λn ≥ 0. Each row (and, by symmetry, each column) of L sums to zero, since the diagonal entry d_i cancels the off-diagonal entries −a_ij. That is, if L_i denotes the i-th column of L, then L_1 + L_2 + ··· + L_n = 0, so the first column (and the first row) is a linear combination of the remaining columns (rows). This implies that the rank of L is at most n − 1, and the smallest eigenvalue is λn = 0, with the corresponding eigenvector given as u_n = (1/√n)(1, 1, ..., 1)^T = (1/√n) 1, provided the graph is connected. If the graph is disconnected, then the number of eigenvalues equal to zero equals the number of connected components in the graph.
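The zero-eigenvalue property can be verified numerically (a NumPy sketch; the 5-vertex graph with two components is a hypothetical example):

```python
import numpy as np

# Hypothetical graph with two components: a triangle {0,1,2} and an edge {3,4}.
A = np.array([[0., 1., 1., 0., 0.],
              [1., 0., 1., 0., 0.],
              [1., 1., 0., 0., 0.],
              [0., 0., 0., 0., 1.],
              [0., 0., 0., 1., 0.]])

L = np.diag(A.sum(axis=1)) - A       # Laplacian L = D - A
evals = np.linalg.eigvalsh(L)        # real eigenvalues, ascending (L symmetric)
n_zero = int(np.sum(np.isclose(evals, 0.0)))
print(n_zero)                        # multiplicity of eigenvalue 0
```

The count of (numerically) zero eigenvalues matches the number of connected components, two in this case.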
Eigenvector Centrality A natural extension of simple degree centrality. We can think of degree centrality as awarding one centrality point for every network neighbor a vertex has. But not all neighbors are equivalent: a vertex's importance in a network is increased by having connections to other vertices that are themselves important.
Important Neighbors Let us make some initial guess about the centrality x_i of each vertex i (e.g., x_i = 1 for all i). We then define an improved estimate x_i' as the sum of the centralities of i's neighbors: x_i' = Σ_j A_ij x_j, where A_ij is an element of the adjacency matrix.
Matrix Representation We can also write this expression in matrix notation as x' = A x, where x is the vector with elements x_i. Repeating this process to make better estimates, we have after t steps a vector of centralities x(t) given by x(t) = A^t x(0).
Eigenvectors Now let us write x(0) as a linear combination of the eigenvectors v_i of the adjacency matrix, x(0) = Σ_i c_i v_i, for some appropriate choice of constants c_i.
Then x(t) = A^t Σ_i c_i v_i = Σ_i c_i κ_i^t v_i = κ1^t Σ_i c_i (κ_i / κ1)^t v_i, where the κ_i are the eigenvalues of A, and κ1 is the largest of them. Since κ_i / κ1 < 1 for all i ≠ 1, every term in the sum other than the i = 1 term decays exponentially as t → ∞, and hence x(t) → c1 κ1^t v1.
In other words, the limiting vector of centralities is simply proportional to the leading eigenvector v1 of the adjacency matrix. Equivalently, we could say that the centrality x satisfies A x = κ1 x, so the centrality x_i of vertex i is proportional to the sum of the centralities of i's neighbors: x_i = κ1^{-1} Σ_j A_ij x_j.
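Putting this together, eigenvector centrality can be read off the leading eigenvector of A (a NumPy sketch on a hypothetical 4-vertex graph, with centralities normalized to sum to n):

```python
import numpy as np

# Hypothetical 4-vertex graph: vertices 1 and 2 have degree 3, vertices 0
# and 3 have degree 2, so we expect 1 and 2 to be more central.
A = np.array([[0., 1., 1., 0.],
              [1., 0., 1., 1.],
              [1., 1., 0., 1.],
              [0., 1., 1., 0.]])

evals, evecs = np.linalg.eigh(A)    # A symmetric: real spectrum, ascending
x = np.abs(evecs[:, -1])            # leading eigenvector, made non-negative
x *= len(x) / x.sum()               # normalize so centralities sum to n

print(x)                            # x satisfies A x = kappa_1 x
```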
Remarks The eigenvector centralities of all vertices are non-negative. To see this, consider what happens if the initial vector x(0) has only non-negative elements. Since all elements of the adjacency matrix are also non-negative, multiplication by A can never introduce any negative elements, so x(t) must have all elements non-negative for every t, and hence so does the limiting eigenvector.
Normalization We care only about which vertices have high or low centrality, not about absolute values. We can normalize the centralities by, for instance, requiring that they sum to n (which ensures that the average centrality stays constant as the network gets larger).
Largest Eigenvalue? Eigenvector centrality is an example of a quantity that can be calculated by a computer in a number of different ways, but not all of them are equally efficient. One way would be to use a standard linear algebra method to compute the complete set of eigenvectors of the adjacency matrix, and then discard all of them except the one corresponding to the largest eigenvalue. This is wasteful: we compute n eigenvectors only to throw away all but one.
Power Method If we start with essentially any initial vector x(0) and multiply it repeatedly by the adjacency matrix A, then x(t) converges to the required leading eigenvector of A as t → ∞. There is no faster method known for calculating the leading eigenvector of a matrix.
Problems If we choose all elements of our initial vector to be positive, we are guaranteed that the vector is not orthogonal to the leading eigenvector. We must also periodically renormalize the vector by dividing all its elements by the same value, which we are allowed to do since an eigenvector divided throughout by a constant is still an eigenvector. How long do we need to go on multiplying by the adjacency matrix before the result converges to the leading eigenvector? One simple way to gauge convergence is to perform the calculation in parallel for two different initial vectors and watch to see when they agree, within some prescribed tolerance.
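The procedure above can be sketched as follows (a NumPy sketch on a hypothetical 4-vertex graph; for simplicity, this version gauges convergence by agreement of successive normalized iterates rather than by running two initial vectors in parallel):

```python
import numpy as np

def power_method(A, tol=1e-10, max_iter=10_000):
    """Leading eigenvector of A by repeated multiplication."""
    n = A.shape[0]
    x = np.ones(n) / np.sqrt(n)          # all-positive start: cannot be
    for _ in range(max_iter):            # orthogonal to the leading eigenvector
        x_new = A @ x
        x_new /= np.linalg.norm(x_new)   # renormalize at every step
        if np.linalg.norm(x_new - x) < tol:  # successive iterates agree
            return x_new
        x = x_new
    return x

# Hypothetical 4-vertex graph.
A = np.array([[0., 1., 1., 0.],
              [1., 0., 1., 1.],
              [1., 1., 0., 1.],
              [0., 1., 1., 0.]])
x = power_method(A)
kappa1 = x @ A @ x        # Rayleigh quotient of the unit-norm iterate:
print(kappa1)             # an estimate of the leading eigenvalue
```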
Sources
Zaki, M. J., & Meira Jr., W. (2014). Data Mining and Analysis: Fundamental Concepts and Algorithms. Cambridge University Press. [pp. 397-401]
Newman, M. (2010). Networks: An Introduction. Oxford University Press. [pp. 169-172, 345-353]