# Lecture 9. 1 Introduction. 2 Random Walks in Graphs. 1.1 How To Explore a Graph? CS-621 Theory Gems October 17, 2012

Save this PDF as:

Size: px
Start display at page:

Download "Lecture 9. 1 Introduction. 2 Random Walks in Graphs. 1.1 How To Explore a Graph? CS-621 Theory Gems October 17, 2012"

## Transcription

1 CS-62 Theory Gems October 7, 202 Lecture 9 Lecturer: Aleksander Mądry Scribes: Dorina Thanou, Xiaowen Dong Introduction Over the next couple of lectures, our focus will be on graphs. Graphs are one of the most basic objects in computer science. They have plethora of applications essentially, almost any CS problem can be phrased in graph language (even though, of course, it is not always the best thing to do). Formally, we view a graph G = (V, E) as consisting of a set V of vertices with a set E of edges that connect pairs of vertices, i.e., we have E V V. For an edge e = (v, u), we call v and u the endpoints of the edge e. Sometimes, we consider graphs that are directed, i.e., we assume that each edge e = (v, u) is directed from its tail v to its head v. (Note that, as a result, in directed graphs the edge (u, v) is different than the edge (v, u).) To emphasize this difference, we usually call the edges of a directed graph arcs. In Figure, we see an example of an undirected and directed graph. We will usually denote the size V of a vertex set of a graph by n and the size E of its edge set by m. Finally, from time to time, we will be considering weighted graphs G = (V, E, w), in which we additionally have a weight vector w that assigns an non-negative weight w e to each edge e. (a) (b) Figure : (a) Undirected graph (b) Directed graph. How To Explore a Graph? Given a graph G = (V, E), one of the most basic graph primitives is exploration of that graph from a particular vertex s. There are many different ways to do this most popular ones are the depth-first and breath-first search and depending on applications some of them might be more preferable than the others. Today, we want to introduce yet another way of exploring the graph via so-called random walks that is fundamentally different than the most obvious ones, but turns out to provide us with a considerable insight into the structure of the underlying graphs. 2 Random Walks in Graphs Given an undirected graph G = (V, E) and some starting vertex s, a random walk in G of length t is defined as a randomized process in which, starting from the vertex s, we repeat t times a step that consists of choosing at random one of the neighbors v of the vertex v we are at and moving to it. (If

2 the graph is weighted then we transition to a neighbor v with probability that is proportional to the weight w e of the connecting edge e = (v, v ).) Note that this definition is also valid for a directed graph (we just always pick an arc that is leaving v, i.e., has v as its tail), but for reasons that will become clear later we focus our treatment on undirected case. 2. Random Walk as a Distribution Clearly, random walks as defined above are naturally viewed as trajectories determined by the outcome of randomized choices we make at each step. However, today we want to focus on a more global view on them and thus we will keep track not of these trajectories directly, but instead of the probability distribution on vertices that random walks induce. More precisely, for a given vertex v, we define p t v to be the probability that we are at vertex v after t steps of the random walk. For notational convenience, we will represent all these probabilities as a single n-dimensional vector p t R V with the v th coordinate corresponding to p t v. Clearly, if our random walks starts from some vertex s, the initial distribution p 0 of the random walk can be defined as { p 0, if v = s v = 0, otherwise. Now, we can express the evolution of the distribution p t as p t+ v = e E,e=(u,v) d(u) pt u for each v V, () where d(u) is the degree, i.e., number of adjacent edges, of vertex u. (If the graph is weighted, we want d(u) denote the weighted degree of vertex u, that is, the sum of weights of adjacent edges, and then p t+ v = w e e E,e=(u,v) d(u) pt u.) Intuitively, the evolution of the probability distribution p t as a function of t can be seen as describing a diffusion processes in the underlying graph. 2.2 Lazy Random Walk It will be sometime convenient to consider a slight variation of random walk, in which in each step, with probability /2, we stay at the current vertex and only with probability /2 we make the usual step of random walk. This variation is called lazy random walk and it can be viewed as a vanilla version of random walk in a graph in which we added d(u) self-loops to every vertex u. Formally, the evolution of the probability distribution ˆp t of such a lazy random walk is given by ˆp t+ v = 2 ˆpt Matrix Definitions e E,e=(u,v) d(u) ˆpt u for each v V. (2) As we will see soon, linear algebra is the right language for analyzing random walks. So, to enable us to use this language, we want to express the evolution of random walks in purely linear-algebraic terms. To this end, given a graph G = (V, E), let us define the adjacency matrix A of G as an n-by-n matrix with rows and columns indexed by vertices and A u,v = { if (u, v) E 0 otherwise. (3) 2

3 Now, we define the walk matrix W of G as an n-by-n matrix given by W := AD, where D is the degree matrix of G, i.e., a diagonal n-by-n matrix whose each diagonal entry D v,v is equal to the degree d(v) of vertex v. It is easy to verify that the walk matrix W can be explicitly written as W u,v = { d(v) if (u, v) E 0 otherwise. (4) As a result, we can compactly rewrite the evolution of the probability distribution p t (cf. Equation ()) as p t+ = W p t = W t p 0. (5) Similarly, in case of lazy random walk, if we define lazy walk matrix Ŵ of G to be a matrix Ŵ := 2 I + W, (6) 2 where I is the identity matrix, we can conveniently express the evolution of the distribution ˆp t (cf. Equation (2)) as ˆp t+ = Ŵ ˆpt = Ŵ t ˆp 0. (7) Finally, if the graph G = (V, E, w) is weighted then one just needs to adjust the definitions of the adjacency matrix A and degree matrix D to make A u,v := w e if e = (u, v) is an edge in G, and D v,v to be equal to the weighted degree of vertex v. 3 Stationary Distribution The main goal of today s lecture is understanding the behavior of the (lazy) random walk distributions p t and ˆp t for large t. At first, given the randomized character of the random walk and its crucial dependence on graph topology, one might expect this behavior to be very complex and hard to describe. However, somewhat strikingly, it turns out that the shape of the distributions p t and ˆp t for t is very easy to describe and depends only mildly on the topology of the graph. (As we will see later, it is actually the behavior of these distributions for moderate values of t that really carries more substantial information about the graph structure.) The key object that will be relevant here is the so-called stationary distribution π of an (undirected) graph G given by d(v) π v := (8) d(u). u V Intuitively, in this distribution, the probability of being at vertex v is proportional to its degree. The crucial property of this distribution is that W π = Ŵ π = π, (9) i.e., once the distribution p t or ˆp t is equal to π, for some t, it will remain equal for all t t. We can thus view the stationary distribution as a steady state of a random walk in G. Now, the surprising fact about random walks is that in every connected non-bipartite graph the random walk distribution p t on this graph is actually guaranteed to converge to π when t. No A stationary distribution can also be defined for directed graphs, but its form and conditions for existence are slightly more involved. 3

4 matter what the starting distribution p 0 is. (A graph G = (V, E) is bipartite if its set of vertices V can be partitioned into two disjoint sets P and Q such that all the edges in E have one endpoint in P and one in Q, i.e., E P Q.) The reason why there is no convergence when the graph is not connected is obvious. On the other hand, to see why there might be no convergence to π when the graph is bipartite, consider p 0 to be a distribution that is supported only on one side of the bipartition, i.e., only on P or only on Q. Then, in each step of random walk, the support of p t will move entirely to the other side and thus, in particular, p t will be always different for ts with different parity. It is not hard to see that the above problem with bipartite graphs is no longer present when we consider the lazy variant of random walks. Indeed, for the lazy random walk distribution ˆp t such convergence happens always even in bipartite graphs as long as the underlying graph is connected. (This is one of the main reasons why it is more convenient to work with lazy random walks instead of the vanilla ones.) 3. Spectral Matrix Theory Our goal now is to prove the convergence properties stated above. Looking at Equations (5) and (7), one can see that understanding the distributions p t and ˆp t for large t is equivalent to understanding the product of a power of the walk matrices W t and Ŵ t with the vector p 0, for large values of t. It turns out that there is a very powerful linear-algebraic machinery called spectral matrix theory that can be used for this purpose. To introduce this tool, let us define a λ R to be an eigenvalue of a square matrix M iff there exists some vector v such that Mv = λv. The vector v is called the eigenvector of M corresponding to the eigenvalue λ. Although every real n-by-n matrix M has n eigenvalues (but not necessarily real ones!), when M is symmetric, i.e., M i,j = M j,i, for all i, j, then these eigenvalues are especially nice. Namely, we have then that. M has exactly n eigenvalues λ,..., λ n (some of these λ i s can be equal) that are real. 2. One can find a collection v,..., v n of eigenvectors of M, where v i is the eigenvector corresponding to eigenvalue λ i, such that all v i s are orthogonal, i.e., (v i ) T v j = 0 if i j. Now, the fundamental importance of existence of the full set of n eigenvalues (and corresponding eigenvectors) of the matrix M lies in that we can express any power (in fact, any real function) of M in a very convenient way via its eigenvalues and eigenvectors. Namely, we have that M t = i λ t iv i (v i ) T. (More generally, for any real function f of M, we have f(m) = i f(λ i)v i (v i ) T.) So, a product of M t and any vector v is just equal to M t v = i λ t iv i (v i ) T v, which is an expression whose only parts that depend on t are the powers of eigenvalues and, as a result, understanding the evolution of M t v as a function of t boils down to understanding the eigenvalues of M. 4

5 3.2 Proof of Convergence In the light of the above discussion, we would like now to analyze the distribution p t (resp. ˆp t ) by treating it as a product of t-th power of the matrix W (resp. Ŵ ) and the vector p0, and applying to it the spectral approach we introduced. Unfortunately, the problem is that the matrices W and Ŵ might not necessarily be symmetric and thus they might, in particular, not even have real eigenvalues. It turns out, however, that there is a way to remedy this problem, provided that the graph is undirected. (When the graph is directed, it might actually really be the case that the walk matrices do not have eigenvalues and thus analyzing them becomes more challenging this is the main reason why we focus on undirected graph case.) To this end, we consider the following matrix S S := D 2 AD 2 = D 2 W D 2. It is easy to see that S is symmetric, as for undirected graphs the adjacency matrix A is symmetric. So, S has to have a set of n eigenvalues ω... ω n (for notational convenience, we numbered them in non-increasing order) and corresponding set of orthogonal eigenvectors v,..., v n. Now, the crucial observation is that, for any i, W (D 2 v i ) = D 2 SD 2 D 2 v i = D 2 Sv i = D 2 ωi v i = ω i (D 2 v i ), i.e., ω i is an eigenvalue of the walk matrix W corresponding to an eigenvector D 2 v i. This proves that W has a full set of n real eigenvalues (even though it is, in general, not symmetric!) and thus we can write p t as n p t = W t p 0 = (D 2 SD 2 ) t p 0 = D 2 (S t D 2 p 0 ) = D 2 ( ωiv t i (v i ) T D 2 p 0 ), (0) in which, again, the only terms depending on t are the powers of the eigenvalues ω i. So, we need now to bound the values of ω i to be able to understand how the limit of p t for t looks like. To this end, we prove the following lemma. Lemma For any connected graph G = (V, E), we have that ω i, for each i, ω is equal to, and π is an eigenvector of W corresponding to this eigenvalue. Furthermore, ω i < for i >. Proof The fact that ω = and π is the corresponding eigenvector follows directly from the property of the stationary distribution cf. (9). Now, to see that ω i, for all i, let y i = D 2 v i be an eigenvector of W corresponding to the eigenvalue ω i. Let u be the vertex that maximizes among all the vertices of G. By definition of ω i, we have that and thus ω i y i u ω i y i u = (w,u) E y i u d(u), (w,u) E y i w d(w) i= y i w d(w), () (w,u) E where in the second inequality we used the maximality of u. So, indeed ω i. y i u d(u) = yi u, 5

6 To show that ω i < when i >, let u be the vertex that maximizes y i u d(u), among all the vertices of G. Note that if ω i = then the only way the Equality () can hold is if yu i d(u) = yi w d(w), for all the neighbors of u in G. But, by repeating this argument iteratively and using the fact that G is connected, we can conclude that the above equality holds for any two vertices of the graph. This, in turn, means that y i = Cπ, for some C, i.e., y i has to be a scalar multiple of the stationary distribution vector π. In this way, we have shown that any eigenvector of W corresponding to the eigenvalue of has to be co-linear with π, which implies that there can be only one eigenvalue equal to. In one of the problem sets, you will be asked to prove that W has an eigenvalue equal to only if the graph G is bipartite. So, in the case of G being connected and non-bipartite, we have that ω i < for i > 2 and thus in the limit of t the powers of these eigenvalues will vanish. That is, (0) becomes n p t = W t p 0 = D 2 ( ωiv t i (v i ) T D 2 p 0 ) t = D 2 v (v ) T D 2 p 0 = Cπ = π, i= where the last two equalities follow as any eigenvector of W corresponding to ω = has to be of the form Cπ, for some C, (cf. Lemma ) and W t p 0 has to be a valid probability distribution, which means that C =. This concludes the proof of convergence for the vanilla version of the random walks. To see how the convergence for the lazy variant follows, note that by definition of the lazy walk matrix Ŵ (6), we have that ˆω i = ω i, for any i, where ˆω... ˆω n are the eigenvalues of the matrix Ŵ. Also, it must be the case that the eigenvectors of Ŵ and W coincide. So, by Lemma, we have that 0 ˆω i and, if G is connected, ˆω = is the only eigenvalue of value (with π being it corresponding eigenvector). Thus, by repeating the reasoning we applied to W above, we can conclude that again ˆp t = Ŵ t p 0 = D 2 ( n i= ˆω t iv i (v i ) T D 2 p 0 ) t = D 2 v (v ) T D 2 p 0 = π, as desired. (Note that we do not need to assume here that G is non-bipartite.) 4 The Convergence Rate Once we established the convergence result, it is natural to ask what is the rate at which this convergence happens. By analyzing Equation (0), one can establish the following lemma. (For convenience, we focus here on lazy random walk case.) 6

7 Lemma 2 For any graph G = (V, E) and any starting distribution ˆp 0, we have ˆp t d(v) π max v,u V d(u) ˆωt 2, for any t 0. For a connected graph G, the bound on convergence rate of a (lazy) random walk given by the above lemma is mainly dependent on how much smaller than is the value of ˆω 2. In particular, if we define the spectral gap ˆλ(G) of the graph G to be ˆλ(G) := ω 2 = 2( ˆω 2 ) then we will have that, for any ε > 0, the number of steps t ε needed for the lazy random walk distribution ˆp t d(g) to be within ε of the stationary distribution π is O( log ˆλ(G) ε ), where d(g) d(v) := max v,u V d(u) is the degree ratio of G. (The time t 2 is sometimes called the mixing time of the graph G.) As we will see in coming lectures, the spectral gap of the graph is a very important characteristic of the graph and captures a lot of its structural properties. So, the connection between ˆλ(G) and the mixing time of random walks implies that observing the behavior of the random walks for moderate values of t, i.e., how fast it converges to stationary distribution, provides us with a lot of information about the graph. Indeed, a lot of powerful graph algorithms is based on this random walk paradigm. 5 Examples of Mixing Times To get a feel for how different can the convergence rate (i.e., the mixing time) be in different graphs, we provide a heuristic analysis of random walk convergence on a few important examples.. Consider a path graph P n on n vertices cf. Figure 2(a), and assume that our starting vertex is in the middle. To understand the behavior of random walk in this graph, think about the random variable X t that describes the offset (from the middle) of the vertex random walk is at at step t. Clearly, X t [ n/2, n/2] and, ignoring the boundary cases, we can view X t as a sum of t independent random variables that are equally probable to be and. Now, it is easy to see that the standard deviation of the variable X t is t. So, heuristically, to have a reasonable chance of reaching one of the path endpoints (which is a a necessary condition for good mixing), we need this standard deviation to be Ω(n). This implies that the mixing time should be Θ(n 2 ) and thus the spectral gap ˆλ(P n ) has to be O(/n 2 ) (note that even a tight bound on mixing time provides only an upperbound on the spectral gap). As we will see later, even though our argument wasn t fully formal, the bounds we get are actually tight. In particular, ˆλ(P n ) is Θ(/n 2 ). 2. Consider now a full binary tree graph T n on n vertices, as depicted at Figure 2(b). Let us consider a random walk in this graph that starts from one of the leaves. Clearly, in order to mix, we need to be able to reach the root of the tree with good probability. (In fact, once we reach the root, it will take only little time to mix.) The root is at distance of only log 2 (n) away from the leaf, however, every time we climb towards the root we are twice as likely to go down rather than go further up. One can see that such a process requires Θ(2 d ) steps, in expectation, to reach a point that is d levels from the leaf. Therefore, we expect the mixing time to be Θ(2 log 2 n ) = Θ(n) and thus spectral gap to be O(/n). It turns out that the spectral gap ˆλ(T n ) is indeed Θ(/n). 3. Let us analyze now the dumbbell graph D n shown in Figure 2(c). It consists of two complete graphs K n/2 on n/2 vertices that are connected by only one bridge edge. (Complete graph K n is a graph in which every vertex is connected to every other vertex.) Consider a random walk starting at one 7

8 of the vertices that are not endpoints of the bridge edge. For this walk to mix, we need to be able to, in particular, pass through the bridge edge to reach the other complete graph. However, in each step, we have only Θ(/n) chance of moving to the endpoint of the bridge edge, and then we have only Θ(/n) chance to move to its other endpoint and thus cross the bridge. So, crossing to other side in any two consecutive steps happens with probability of Θ(/n 2 ) and thus the mixing time should be roughly Θ(n 2 ). This leads to upper bound on the spectral gap of this graph being roughly O(/n 2 ) and, again, this bound turns out to be asymptotically tight. 4. Finally, let us consider the bolas graph B n depicted in Figure 2(d). The only difference from the dumbbell graph is that in this graph we have a path graph P n/3 that connect the two complete graphs K n/3 instead of just one bridge edge. Again, our heuristic for estimation of the mixing time is to consider the expected time to cross from one complete graph to the other. By similar reasoning, we get that the probability for us to enter the path is Θ(/n 2 ), but now one can show that we have only probability of Θ(/n) of reaching the other end of the path before exiting it through the endpoint we entered. This hints at the mixing time of Θ(n 3 ) and, once more, the resulting upperbound on the spectral gap of this graph turns out to be tight, i.e., ˆλ(B n ) = Θ(/n 3 ). (As we will see later, O(n 3 ) is the worse-case bound on the number of steps (in expectation) required to explore any graph of n vertices. So, bolas graph is an example of an graph that has worst bottlenecks from the point of view of random walks.) 8

9 (a) (b) (c) (d) Figure 2: Some example graphs. (a) Path graph P n (b) Binary tree graph T n (c) Dumbbell graph D n (d) Bolas graph B n 9

### Markov Chains, part I

Markov Chains, part I December 8, 2010 1 Introduction A Markov Chain is a sequence of random variables X 0, X 1,, where each X i S, such that P(X i+1 = s i+1 X i = s i, X i 1 = s i 1,, X 0 = s 0 ) = P(X

### Spectral graph theory

Spectral graph theory Uri Feige January 2010 1 Background With every graph (or digraph) one can associate several different matrices. We have already seen the vertex-edge incidence matrix, the Laplacian

### DATA ANALYSIS II. Matrix Algorithms

DATA ANALYSIS II Matrix Algorithms Similarity Matrix Given a dataset D = {x i }, i=1,..,n consisting of n points in R d, let A denote the n n symmetric similarity matrix between the points, given as where

### Bindel, Fall 2012 Matrix Computations (CS 6210) Week 8: Friday, Oct 12

Why eigenvalues? Week 8: Friday, Oct 12 I spend a lot of time thinking about eigenvalue problems. In part, this is because I look for problems that can be solved via eigenvalues. But I might have fewer

### Markov chains and Markov Random Fields (MRFs)

Markov chains and Markov Random Fields (MRFs) 1 Why Markov Models We discuss Markov models now. This is the simplest statistical model in which we don t assume that all variables are independent; we assume

### Math 4707: Introduction to Combinatorics and Graph Theory

Math 4707: Introduction to Combinatorics and Graph Theory Lecture Addendum, November 3rd and 8th, 200 Counting Closed Walks and Spanning Trees in Graphs via Linear Algebra and Matrices Adjacency Matrices

### MATH 56A SPRING 2008 STOCHASTIC PROCESSES 31

MATH 56A SPRING 2008 STOCHASTIC PROCESSES 3.3. Invariant probability distribution. Definition.4. A probability distribution is a function π : S [0, ] from the set of states S to the closed unit interval

### Introduction to Flocking {Stochastic Matrices}

Supelec EECI Graduate School in Control Introduction to Flocking {Stochastic Matrices} A. S. Morse Yale University Gif sur - Yvette May 21, 2012 CRAIG REYNOLDS - 1987 BOIDS The Lion King CRAIG REYNOLDS

### Markov Chain Fundamentals (Contd.)

Markov Chain Fundamentals (Contd.) Recap from last lecture: - Any starting probability distribution π 0 can be represented as a linear combination of the eigenvectors including v 0 (stationary distibution)

### Lecture 3: Linear Programming Relaxations and Rounding

Lecture 3: Linear Programming Relaxations and Rounding 1 Approximation Algorithms and Linear Relaxations For the time being, suppose we have a minimization problem. Many times, the problem at hand can

### 6.896 Probability and Computation February 14, Lecture 4

6.896 Probability and Computation February 14, 2011 Lecture 4 Lecturer: Constantinos Daskalakis Scribe: Georgios Papadopoulos NOTE: The content of these notes has not been formally reviewed by the lecturer.

### princeton university F 02 cos 597D: a theorist s toolkit Lecture 7: Markov Chains and Random Walks

princeton university F 02 cos 597D: a theorist s toolkit Lecture 7: Markov Chains and Random Walks Lecturer: Sanjeev Arora Scribe:Elena Nabieva 1 Basics A Markov chain is a discrete-time stochastic process

### Why graph clustering is useful?

Graph Clustering Why graph clustering is useful? Distance matrices are graphs as useful as any other clustering Identification of communities in social networks Webpage clustering for better data management

### Homework 15 Solutions

PROBLEM ONE (Trees) Homework 15 Solutions 1. Recall the definition of a tree: a tree is a connected, undirected graph which has no cycles. Which of the following definitions are equivalent to this definition

### AN UPPER BOUND ON THE LAPLACIAN SPECTRAL RADIUS OF THE SIGNED GRAPHS

Discussiones Mathematicae Graph Theory 28 (2008 ) 345 359 AN UPPER BOUND ON THE LAPLACIAN SPECTRAL RADIUS OF THE SIGNED GRAPHS Hong-Hai Li College of Mathematic and Information Science Jiangxi Normal University

### Solutions to Exercises 8

Discrete Mathematics Lent 2009 MA210 Solutions to Exercises 8 (1) Suppose that G is a graph in which every vertex has degree at least k, where k 1, and in which every cycle contains at least 4 vertices.

### Social Media Mining. Graph Essentials

Graph Essentials Graph Basics Measures Graph and Essentials Metrics 2 2 Nodes and Edges A network is a graph nodes, actors, or vertices (plural of vertex) Connections, edges or ties Edge Node Measures

### Lecture 15 An Arithmetic Circuit Lowerbound and Flows in Graphs

CSE599s: Extremal Combinatorics November 21, 2011 Lecture 15 An Arithmetic Circuit Lowerbound and Flows in Graphs Lecturer: Anup Rao 1 An Arithmetic Circuit Lower Bound An arithmetic circuit is just like

### Summary of week 8 (Lectures 22, 23 and 24)

WEEK 8 Summary of week 8 (Lectures 22, 23 and 24) This week we completed our discussion of Chapter 5 of [VST] Recall that if V and W are inner product spaces then a linear map T : V W is called an isometry

### GRADUATE STUDENT NUMBER THEORY SEMINAR

GRADUATE STUDENT NUMBER THEORY SEMINAR MICHELLE DELCOURT Abstract. This talk is based primarily on the paper Interlacing Families I: Bipartite Ramanujan Graphs of All Degrees by Marcus, Spielman, and Srivastava

### SHARP BOUNDS FOR THE SUM OF THE SQUARES OF THE DEGREES OF A GRAPH

31 Kragujevac J. Math. 25 (2003) 31 49. SHARP BOUNDS FOR THE SUM OF THE SQUARES OF THE DEGREES OF A GRAPH Kinkar Ch. Das Department of Mathematics, Indian Institute of Technology, Kharagpur 721302, W.B.,

### (67902) Topics in Theory and Complexity Nov 2, 2006. Lecture 7

(67902) Topics in Theory and Complexity Nov 2, 2006 Lecturer: Irit Dinur Lecture 7 Scribe: Rani Lekach 1 Lecture overview This Lecture consists of two parts In the first part we will refresh the definition

### The Power Method for Eigenvalues and Eigenvectors

Numerical Analysis Massoud Malek The Power Method for Eigenvalues and Eigenvectors The spectrum of a square matrix A, denoted by σ(a) is the set of all eigenvalues of A. The spectral radius of A, denoted

### LINEAR ALGEBRA: NUMERICAL METHODS. Version: August 12,

LINEAR ALGEBRA: NUMERICAL METHODS. Version: August 12, 2000 45 4 Iterative methods 4.1 What a two year old child can do Suppose we want to find a number x such that cos x = x (in radians). This is a nonlinear

### 1 Digraphs. Definition 1

1 Digraphs Definition 1 Adigraphordirected graphgisatriplecomprisedofavertex set V(G), edge set E(G), and a function assigning each edge an ordered pair of vertices (tail, head); these vertices together

### max cx s.t. Ax c where the matrix A, cost vector c and right hand side b are given and x is a vector of variables. For this example we have x

Linear Programming Linear programming refers to problems stated as maximization or minimization of a linear function subject to constraints that are linear equalities and inequalities. Although the study

### CSL851: Algorithmic Graph Theory Semester I Lecture 1: July 24

CSL851: Algorithmic Graph Theory Semester I 2013-2014 Lecture 1: July 24 Lecturer: Naveen Garg Scribes: Suyash Roongta Note: LaTeX template courtesy of UC Berkeley EECS dept. Disclaimer: These notes have

### Markov chain mixing and coupling

Markov chain mixing and coupling Recall from Lecture 2: Definition 0.. Let S be a countable set and (X n ) a sequence of random variables taking values in S. We say that (X n ) is a Markov chain if it

### CHAPTER III - MARKOV CHAINS

CHAPTER III - MARKOV CHAINS JOSEPH G. CONLON 1. General Theory of Markov Chains We have already discussed the standard random walk on the integers Z. A Markov Chain can be viewed as a generalization of

### ON THE FIBONACCI NUMBERS

ON THE FIBONACCI NUMBERS Prepared by Kei Nakamura The Fibonacci numbers are terms of the sequence defined in a quite simple recursive fashion. However, despite its simplicity, they have some curious properties

### Linear Programming I

Linear Programming I November 30, 2003 1 Introduction In the VCR/guns/nuclear bombs/napkins/star wars/professors/butter/mice problem, the benevolent dictator, Bigus Piguinus, of south Antarctica penguins

### CS 598CSC: Combinatorial Optimization Lecture date: 2/4/2010

CS 598CSC: Combinatorial Optimization Lecture date: /4/010 Instructor: Chandra Chekuri Scribe: David Morrison Gomory-Hu Trees (The work in this section closely follows [3]) Let G = (V, E) be an undirected

### Good luck, veel succes!

Final exam Advanced Linear Programming, May 7, 13.00-16.00 Switch off your mobile phone, PDA and any other mobile device and put it far away. No books or other reading materials are allowed. This exam

### CS268: Geometric Algorithms Handout #5 Design and Analysis Original Handout #15 Stanford University Tuesday, 25 February 1992

CS268: Geometric Algorithms Handout #5 Design and Analysis Original Handout #15 Stanford University Tuesday, 25 February 1992 Original Lecture #6: 28 January 1991 Topics: Triangulating Simple Polygons

### Eigenvalues and Markov Chains

Eigenvalues and Markov Chains Will Perkins April 15, 2013 The Metropolis Algorithm Say we want to sample from a different distribution, not necessarily uniform. Can we change the transition rates in such

### 10.1 Integer Programming and LP relaxation

CS787: Advanced Algorithms Lecture 10: LP Relaxation and Rounding In this lecture we will design approximation algorithms using linear programming. The key insight behind this approach is that the closely

### 7.1 Introduction. CSci 335 Software Design and Analysis III Chapter 7 Sorting. Prof. Stewart Weiss

Chapter 7 Sorting 7.1 Introduction Insertion sort is the sorting algorithm that splits an array into a sorted and an unsorted region, and repeatedly picks the lowest index element of the unsorted region

### Lecture 7: Approximation via Randomized Rounding

Lecture 7: Approximation via Randomized Rounding Often LPs return a fractional solution where the solution x, which is supposed to be in {0, } n, is in [0, ] n instead. There is a generic way of obtaining

### Some Results on 2-Lifts of Graphs

Some Results on -Lifts of Graphs Carsten Peterson Advised by: Anup Rao August 8, 014 1 Ramanujan Graphs Let G be a d-regular graph. Every d-regular graph has d as an eigenvalue (with multiplicity equal

### 2.3 Scheduling jobs on identical parallel machines

2.3 Scheduling jobs on identical parallel machines There are jobs to be processed, and there are identical machines (running in parallel) to which each job may be assigned Each job = 1,,, must be processed

### Inner Product Spaces and Orthogonality

Inner Product Spaces and Orthogonality week 3-4 Fall 2006 Dot product of R n The inner product or dot product of R n is a function, defined by u, v a b + a 2 b 2 + + a n b n for u a, a 2,, a n T, v b,

### 1 Polyhedra and Linear Programming

CS 598CSC: Combinatorial Optimization Lecture date: January 21, 2009 Instructor: Chandra Chekuri Scribe: Sungjin Im 1 Polyhedra and Linear Programming In this lecture, we will cover some basic material

### Numerical Analysis Lecture Notes

Numerical Analysis Lecture Notes Peter J. Olver 5. Inner Products and Norms The norm of a vector is a measure of its size. Besides the familiar Euclidean norm based on the dot product, there are a number

### Graphs and Network Flows IE411 Lecture 1

Graphs and Network Flows IE411 Lecture 1 Dr. Ted Ralphs IE411 Lecture 1 1 References for Today s Lecture Required reading Sections 17.1, 19.1 References AMO Chapter 1 and Section 2.1 and 2.2 IE411 Lecture

### CS2 Algorithms and Data Structures Note 11. Breadth-First Search and Shortest Paths

CS2 Algorithms and Data Structures Note 11 Breadth-First Search and Shortest Paths In this last lecture of the CS2 Algorithms and Data Structures thread we will consider the problem of computing distances

### Part 2: Community Detection

Chapter 8: Graph Data Part 2: Community Detection Based on Leskovec, Rajaraman, Ullman 2014: Mining of Massive Datasets Big Data Management and Analytics Outline Community Detection - Social networks -

### Conductance, the Normalized Laplacian, and Cheeger s Inequality

Spectral Graph Theory Lecture 6 Conductance, the Normalized Laplacian, and Cheeger s Inequality Daniel A. Spielman September 21, 2015 Disclaimer These notes are not necessarily an accurate representation

### Markov Chains, Stochastic Processes, and Advanced Matrix Decomposition

Markov Chains, Stochastic Processes, and Advanced Matrix Decomposition Jack Gilbert Copyright (c) 2014 Jack Gilbert. Permission is granted to copy, distribute and/or modify this document under the terms

### Lecture 6: Approximation via LP Rounding

Lecture 6: Approximation via LP Rounding Let G = (V, E) be an (undirected) graph. A subset C V is called a vertex cover for G if for every edge (v i, v j ) E we have v i C or v j C (or both). In other

### Lecture notes from Foundations of Markov chain Monte Carlo methods University of Chicago, Spring 2002 Lecture 1, March 29, 2002

Lecture notes from Foundations of Markov chain Monte Carlo methods University of Chicago, Spring 2002 Lecture 1, March 29, 2002 Eric Vigoda Scribe: Varsha Dani & Tom Hayes 1.1 Introduction The aim of this

### S = {1, 2,..., n}. P (1, 1) P (1, 2)... P (1, n) P (2, 1) P (2, 2)... P (2, n) P = . P (n, 1) P (n, 2)... P (n, n)

last revised: 26 January 2009 1 Markov Chains A Markov chain process is a simple type of stochastic process with many social science applications. We ll start with an abstract description before moving

### A Sublinear Bipartiteness Tester for Bounded Degree Graphs

A Sublinear Bipartiteness Tester for Bounded Degree Graphs Oded Goldreich Dana Ron February 5, 1998 Abstract We present a sublinear-time algorithm for testing whether a bounded degree graph is bipartite

### USE OF EIGENVALUES AND EIGENVECTORS TO ANALYZE BIPARTIVITY OF NETWORK GRAPHS

USE OF EIGENVALUES AND EIGENVECTORS TO ANALYZE BIPARTIVITY OF NETWORK GRAPHS Natarajan Meghanathan Jackson State University, 1400 Lynch St, Jackson, MS, USA natarajan.meghanathan@jsums.edu ABSTRACT This

### 1 Limiting distribution for a Markov chain

Copyright c 2009 by Karl Sigman Limiting distribution for a Markov chain In these Lecture Notes, we shall study the limiting behavior of Markov chains as time n In particular, under suitable easy-to-check

### 1 Singular Value Decomposition (SVD)

Contents 1 Singular Value Decomposition (SVD) 2 1.1 Singular Vectors................................. 3 1.2 Singular Value Decomposition (SVD)..................... 7 1.3 Best Rank k Approximations.........................

### 5.1 Bipartite Matching

CS787: Advanced Algorithms Lecture 5: Applications of Network Flow In the last lecture, we looked at the problem of finding the maximum flow in a graph, and how it can be efficiently solved using the Ford-Fulkerson

### Lesson 3. Algebraic graph theory. Sergio Barbarossa. Rome - February 2010

Lesson 3 Algebraic graph theory Sergio Barbarossa Basic notions Definition: A directed graph (or digraph) composed by a set of vertices and a set of edges We adopt the convention that the information flows

### A CHARACTERIZATION OF TREE TYPE

A CHARACTERIZATION OF TREE TYPE LON H MITCHELL Abstract Let L(G) be the Laplacian matrix of a simple graph G The characteristic valuation associated with the algebraic connectivity a(g) is used in classifying

### (a) (b) (c) Figure 1 : Graphs, multigraphs and digraphs. If the vertices of the leftmost figure are labelled {1, 2, 3, 4} in clockwise order from

4 Graph Theory Throughout these notes, a graph G is a pair (V, E) where V is a set and E is a set of unordered pairs of elements of V. The elements of V are called vertices and the elements of E are called

### Approximation Algorithms: LP Relaxation, Rounding, and Randomized Rounding Techniques. My T. Thai

Approximation Algorithms: LP Relaxation, Rounding, and Randomized Rounding Techniques My T. Thai 1 Overview An overview of LP relaxation and rounding method is as follows: 1. Formulate an optimization

### Quantum Computation CMU BB, Fall 2015

Quantum Computation CMU 5-859BB, Fall 205 Homework 3 Due: Tuesday, Oct. 6, :59pm, email the pdf to pgarriso@andrew.cmu.edu Solve any 5 out of 6. [Swap test.] The CSWAP (controlled-swap) linear operator

### USING SPECTRAL RADIUS RATIO FOR NODE DEGREE TO ANALYZE THE EVOLUTION OF SCALE- FREE NETWORKS AND SMALL-WORLD NETWORKS

USING SPECTRAL RADIUS RATIO FOR NODE DEGREE TO ANALYZE THE EVOLUTION OF SCALE- FREE NETWORKS AND SMALL-WORLD NETWORKS Natarajan Meghanathan Jackson State University, 1400 Lynch St, Jackson, MS, USA natarajan.meghanathan@jsums.edu

### CS420/520 Algorithm Analysis Spring 2010 Lecture 04

CS420/520 Algorithm Analysis Spring 2010 Lecture 04 I. Chapter 4 Recurrences A. Introduction 1. Three steps applied at each level of a recursion a) Divide into a number of subproblems that are smaller

### Lecture 3: Solving Equations Using Fixed Point Iterations

cs41: introduction to numerical analysis 09/14/10 Lecture 3: Solving Equations Using Fixed Point Iterations Instructor: Professor Amos Ron Scribes: Yunpeng Li, Mark Cowlishaw, Nathanael Fillmore Our problem,

### Recurrences (CLRS )

Recurrences (CLRS 4.1-4.2) Last time we discussed divide-and-conquer algorithms Divide and Conquer To Solve P: 1. Divide P into smaller problems P 1,P 2,P 3...P k. 2. Conquer by solving the (smaller) subproblems

### Lecture 14: Fourier Transform

Machine Learning: Foundations Fall Semester, 2010/2011 Lecture 14: Fourier Transform Lecturer: Yishay Mansour Scribes: Avi Goren & Oron Navon 1 14.1 Introduction In this lecture, we examine the Fourier

### Mining Social-Network Graphs

342 Chapter 10 Mining Social-Network Graphs There is much information to be gained by analyzing the large-scale data that is derived from social networks. The best-known example of a social network is

### 6.042/18.062J Mathematics for Computer Science October 3, 2006 Tom Leighton and Ronitt Rubinfeld. Graph Theory III

6.04/8.06J Mathematics for Computer Science October 3, 006 Tom Leighton and Ronitt Rubinfeld Lecture Notes Graph Theory III Draft: please check back in a couple of days for a modified version of these

### 15.062 Data Mining: Algorithms and Applications Matrix Math Review

.6 Data Mining: Algorithms and Applications Matrix Math Review The purpose of this document is to give a brief review of selected linear algebra concepts that will be useful for the course and to develop

### Diagonalisation. Chapter 3. Introduction. Eigenvalues and eigenvectors. Reading. Definitions

Chapter 3 Diagonalisation Eigenvalues and eigenvectors, diagonalisation of a matrix, orthogonal diagonalisation fo symmetric matrices Reading As in the previous chapter, there is no specific essential

### Min-cost flow problems and network simplex algorithm

Min-cost flow problems and network simplex algorithm The particular structure of some LP problems can be sometimes used for the design of solution techniques more efficient than the simplex algorithm.

### Zachary Monaco Georgia College Olympic Coloring: Go For The Gold

Zachary Monaco Georgia College Olympic Coloring: Go For The Gold Coloring the vertices or edges of a graph leads to a variety of interesting applications in graph theory These applications include various

### A short proof of Perron s theorem. Hannah Cairns, April 25, 2014.

A short proof of Perron s theorem. Hannah Cairns, April 25, 2014. A matrix A or a vector ψ is said to be positive if every component is a positive real number. A B means that every component of A is greater

### Seminar assignment 1. 1 Part

Seminar assignment 1 Each seminar assignment consists of two parts, one part consisting of 3 5 elementary problems for a maximum of 10 points from each assignment. For the second part consisting of problems

### Solution based on matrix technique Rewrite. ) = 8x 2 1 4x 1x 2 + 5x x1 2x 2 2x 1 + 5x 2

8.2 Quadratic Forms Example 1 Consider the function q(x 1, x 2 ) = 8x 2 1 4x 1x 2 + 5x 2 2 Determine whether q(0, 0) is the global minimum. Solution based on matrix technique Rewrite q( x1 x 2 = x1 ) =

### The Goldberg Rao Algorithm for the Maximum Flow Problem

The Goldberg Rao Algorithm for the Maximum Flow Problem COS 528 class notes October 18, 2006 Scribe: Dávid Papp Main idea: use of the blocking flow paradigm to achieve essentially O(min{m 2/3, n 1/2 }

### princeton univ. F 13 cos 521: Advanced Algorithm Design Lecture 6: Provable Approximation via Linear Programming Lecturer: Sanjeev Arora

princeton univ. F 13 cos 521: Advanced Algorithm Design Lecture 6: Provable Approximation via Linear Programming Lecturer: Sanjeev Arora Scribe: One of the running themes in this course is the notion of

### Analysis of Algorithms, I

Analysis of Algorithms, I CSOR W4231.002 Eleni Drinea Computer Science Department Columbia University Thursday, February 26, 2015 Outline 1 Recap 2 Representing graphs 3 Breadth-first search (BFS) 4 Applications

### Graphs without proper subgraphs of minimum degree 3 and short cycles

Graphs without proper subgraphs of minimum degree 3 and short cycles Lothar Narins, Alexey Pokrovskiy, Tibor Szabó Department of Mathematics, Freie Universität, Berlin, Germany. August 22, 2014 Abstract

### GRAPH THEORY and APPLICATIONS. Trees

GRAPH THEORY and APPLICATIONS Trees Properties Tree: a connected graph with no cycle (acyclic) Forest: a graph with no cycle Paths are trees. Star: A tree consisting of one vertex adjacent to all the others.

### Asking Hard Graph Questions. Paul Burkhardt. February 3, 2014

Beyond Watson: Predictive Analytics and Big Data U.S. National Security Agency Research Directorate - R6 Technical Report February 3, 2014 300 years before Watson there was Euler! The first (Jeopardy!)

### Homework One Solutions. Keith Fratus

Homework One Solutions Keith Fratus June 8, 011 1 Problem One 1.1 Part a In this problem, we ll assume the fact that the sum of two complex numbers is another complex number, and also that the product

### Lecture 3: Finding integer solutions to systems of linear equations

Lecture 3: Finding integer solutions to systems of linear equations Algorithmic Number Theory (Fall 2014) Rutgers University Swastik Kopparty Scribe: Abhishek Bhrushundi 1 Overview The goal of this lecture

### Iterative Techniques in Matrix Algebra. Jacobi & Gauss-Seidel Iterative Techniques II

Iterative Techniques in Matrix Algebra Jacobi & Gauss-Seidel Iterative Techniques II Numerical Analysis (9th Edition) R L Burden & J D Faires Beamer Presentation Slides prepared by John Carroll Dublin

### Similarity and Diagonalization. Similar Matrices

MATH022 Linear Algebra Brief lecture notes 48 Similarity and Diagonalization Similar Matrices Let A and B be n n matrices. We say that A is similar to B if there is an invertible n n matrix P such that

### Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model

Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model 1 September 004 A. Introduction and assumptions The classical normal linear regression model can be written

### 2.5 Complex Eigenvalues

1 25 Complex Eigenvalues Real Canonical Form A semisimple matrix with complex conjugate eigenvalues can be diagonalized using the procedure previously described However, the eigenvectors corresponding

### Graph Algorithms. Ananth Grama, Anshul Gupta, George Karypis, and Vipin Kumar

Graph Algorithms Ananth Grama, Anshul Gupta, George Karypis, and Vipin Kumar To accompany the text Introduction to Parallel Computing, Addison Wesley, 3. Topic Overview Definitions and Representation Minimum

### (January 14, 2009) End k (V ) End k (V/W )

(January 14, 29) [16.1] Let p be the smallest prime dividing the order of a finite group G. Show that a subgroup H of G of index p is necessarily normal. Let G act on cosets gh of H by left multiplication.

### / Approximation Algorithms Lecturer: Michael Dinitz Topic: Steiner Tree and TSP Date: 01/29/15 Scribe: Katie Henry

600.469 / 600.669 Approximation Algorithms Lecturer: Michael Dinitz Topic: Steiner Tree and TSP Date: 01/29/15 Scribe: Katie Henry 2.1 Steiner Tree Definition 2.1.1 In the Steiner Tree problem the input

### Week 5 Integral Polyhedra

Week 5 Integral Polyhedra We have seen some examples 1 of linear programming formulation that are integral, meaning that every basic feasible solution is an integral vector. This week we develop a theory

### Lecture Steiner Tree Approximation- June 15

Approximation Algorithms Workshop June 13-17, 2011, Princeton Lecture Steiner Tree Approximation- June 15 Fabrizio Grandoni Scribe: Fabrizio Grandoni 1 Overview In this lecture we will summarize our recent

### 1 Basic Definitions and Concepts in Graph Theory

CME 305: Discrete Mathematics and Algorithms 1 Basic Definitions and Concepts in Graph Theory A graph G(V, E) is a set V of vertices and a set E of edges. In an undirected graph, an edge is an unordered

### Notes on Linear Recurrence Sequences

Notes on Linear Recurrence Sequences April 8, 005 As far as preparing for the final exam, I only hold you responsible for knowing sections,,, 6 and 7 Definitions and Basic Examples An example of a linear

### Inner Product Spaces

Math 571 Inner Product Spaces 1. Preliminaries An inner product space is a vector space V along with a function, called an inner product which associates each pair of vectors u, v with a scalar u, v, and

### The Classical Linear Regression Model

The Classical Linear Regression Model 1 September 2004 A. A brief review of some basic concepts associated with vector random variables Let y denote an n x l vector of random variables, i.e., y = (y 1,

### Numerical Analysis Lecture Notes

Numerical Analysis Lecture Notes Peter J. Olver 7. Iterative Methods for Linear Systems Linear iteration coincides with multiplication by successive powers of a matrix; convergence of the iterates depends

### Chapter 4: Binary Operations and Relations

c Dr Oksana Shatalov, Fall 2014 1 Chapter 4: Binary Operations and Relations 4.1: Binary Operations DEFINITION 1. A binary operation on a nonempty set A is a function from A A to A. Addition, subtraction,