Sketching as a Tool for Numerical Linear Algebra

1 Sketching as a Tool for Numerical Linear Algebra (Graph Sparsification)
David P. Woodruff, presented by Sepehr Assadi
o(n) Big Data Reading Group, University of Pennsylvania, April 2015
Sepehr Assadi (Penn), Sketching for Numerical Linear Algebra, Big Data Reading Group, 1 / 18

2 Goal
New survey by David Woodruff: Sketching as a Tool for Numerical Linear Algebra
Topics: Subspace Embeddings; Least Squares Regression; Least Absolute Deviation Regression; Low Rank Approximation; Graph Sparsification; Sketching Lower Bounds

4 Matrix Compression
Previously: compress a matrix $A \in \mathbb{R}^{n \times d}$ using linear sketches
Example: subspace embedding
Definition ($\ell_2$-subspace embedding): A $(1 \pm \varepsilon)$ $\ell_2$-subspace embedding for a matrix $A \in \mathbb{R}^{n \times d}$ is a matrix $S$ such that for all $x \in \mathbb{R}^d$:
$\|SAx\|_2^2 = (1 \pm \varepsilon)\, \|Ax\|_2^2$
Typically $SA$ is an $\tilde{O}(d^2)$-size matrix
Techniques:
Using random matrices $S$ (Gaussian, sign matrices, etc.)
Using leverage score sampling
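
The definition above can be checked numerically. The following is a minimal sketch, assuming a dense Gaussian $S$ and illustrative sizes (`n`, `d`, `s` and the seed are not from the slides). It measures the worst-case distortion over all $x$ through the singular values of $SQ$, where $Q$ is an orthonormal basis for the column space of $A$.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility

n, d, s = 2000, 5, 1000          # illustrative sizes, not from the slides
A = rng.standard_normal((n, d))  # tall matrix with n >> d

# Dense Gaussian sketch: i.i.d. N(0, 1/s) entries.
S = rng.standard_normal((s, n)) / np.sqrt(s)
SA = S @ A                       # the s x d sketch that stands in for A

# Worst-case distortion over all x: with A = QR (Q has orthonormal columns),
# ||SAx||^2 / ||Ax||^2 ranges over the squared singular values of SQ.
Q, _ = np.linalg.qr(A)
sv = np.linalg.svd(S @ Q, compute_uv=False)
eps_hat = max(sv.max() ** 2 - 1.0, 1.0 - sv.min() ** 2)

# Spot check on one random direction.
x = rng.standard_normal(d)
ratio = np.linalg.norm(SA @ x) ** 2 / np.linalg.norm(A @ x) ** 2
print(f"worst-case distortion ~ {eps_hat:.3f}, sample ratio = {ratio:.3f}")
```

Here `eps_hat` plays the role of $\varepsilon$: every direction $x$ has a ratio within `1 ± eps_hat`, so the single spot check can never be worse than it.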

5 Graph Compression
Today: compress a graph $G(V, E)$ using linear sketches
Example: sparsification
Definition (cut sparsifier): A $(1 \pm \varepsilon)$ cut sparsifier of a graph $G(V, E)$ is a weighted subgraph $H$ of $G$ such that for any $S \subseteq V$:
$W_H(S, \bar{S}) = (1 \pm \varepsilon)\, W_G(S, \bar{S})$
* $W_G(S, \bar{S})$ is the weight of the cut between $S$ and $\bar{S}$ in $G$
Typically $H$ is an $\tilde{O}(n)$-size graph

6 Graph Compression (cont.)
Laplacian matrix of a graph $G(V, E)$: $L \in \mathbb{R}^{n \times n}$
$L = D - A$, for degree matrix $D \in \mathbb{R}^{n \times n}$ and adjacency matrix $A$
$L = \sum_{e \in E} L_e$, for edge-Laplacian matrices $L_e \in \mathbb{R}^{n \times n}$
$L = B^T B$, for edge-vertex incidence matrix $B \in \mathbb{R}^{\binom{n}{2} \times n}$
For a set of vertices $S \subseteq V$ and its characteristic vector $x \in \{0, 1\}^n$:
$x^T L x = \sum_{e = (u,v) \in E} (x_u - x_v)^2 = \delta_G(S, \bar{S})$
Any cut sparsifier $H$ of $G$ has a Laplacian $\tilde{L}$ such that:
$\forall x \in \{0, 1\}^n: \; x^T \tilde{L} x = (1 \pm \varepsilon)\, x^T L x$
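
These identities are easy to verify on a toy graph. The sketch below assumes a small unweighted graph chosen for illustration. It builds $B$ with one row per edge (the zero rows that the $\binom{n}{2}$-row convention would assign to non-edges are omitted), then compares $B^T B$ with $D - A$ and the quadratic form with the cut size.

```python
import numpy as np

n = 4
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]  # toy unweighted graph

# Edge-vertex incidence matrix: one row b_e = e_u - e_v per edge.
# (The slides index rows by all C(n,2) vertex pairs; rows of non-edges
# would be zero, so they are omitted here.)
B = np.zeros((len(edges), n))
for row, (u, v) in enumerate(edges):
    B[row, u], B[row, v] = 1.0, -1.0

L = B.T @ B

# Cross-check against L = D - A.
A = np.zeros((n, n))
for u, v in edges:
    A[u, v] = A[v, u] = 1.0
D = np.diag(A.sum(axis=1))

# Characteristic vector of S = {0, 1}.
S = {0, 1}
x = np.array([1.0 if v in S else 0.0 for v in range(n)])
quad_form = x @ L @ x
cut_edges = sum((u in S) != (v in S) for u, v in edges)
print(quad_form, cut_edges)
```

For $S = \{0, 1\}$ the edges $(1,2)$, $(3,0)$ and $(0,2)$ cross the cut, so both quantities equal 3.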

7 Spectral Sparsifier
Definition (spectral sparsifier): A $(1 \pm \varepsilon)$ spectral sparsifier of a graph $G(V, E)$ is a weighted subgraph $H$ of $G$ such that for any $x \in \mathbb{R}^n$:
$x^T \tilde{L} x = (1 \pm \varepsilon)\, x^T L x$
* $L$ (resp. $\tilde{L}$) is the Laplacian of $G$ (resp. $H$)
Originally proposed by Spielman and Teng [ST11]: $\tilde{O}(m)$ construction time and $\tilde{O}(n)$ size.

8 Spectral vs Cut Sparsifiers
Difference between spectral and cut sparsifiers: (Figure from [ST11])

10 Graph vs Matrix Compression
Matrix compression:
$A \in \mathbb{R}^{n \times d}$
$A$ is a tall matrix, i.e., $n \gg d$
Compression guarantee of the form $\tilde{O}(d^2)$
Graph compression:
$L \in \mathbb{R}^{n \times n}$
$L$ is a square matrix
But... $L = B^T B$ and $B$ is tall:
$x^T L x = x^T B^T B x = \|Bx\|_2^2$
Spectral sparsification is a subspace embedding for $B$!
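
Spelled out as a short derivation (a sketch, assuming $S$ is any matrix satisfying the $\ell_2$-subspace embedding guarantee for $B$, and writing $\tilde{L} = (SB)^T (SB)$ for the sketched quadratic form):

```latex
x^T L x = x^T B^T B x = \|Bx\|_2^2,
\qquad
x^T \tilde{L} x = \|SBx\|_2^2 = (1 \pm \varepsilon)\,\|Bx\|_2^2 = (1 \pm \varepsilon)\, x^T L x
\quad \text{for all } x \in \mathbb{R}^n .
```

When $S$ only samples and rescales rows of $B$, each surviving row is a rescaled edge vector, so $\tilde{L}$ is the Laplacian of a weighted subgraph of $G$.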

11 Spectral Sparsification and Subspace Embedding
A sampling-based subspace embedding: leverage score sampling
Leverage score of the $i$-th row of $A = U \Sigma V^T$: $\ell_i = \|U_{(i)}\|_2^2$
Leverage score sampling for $A \in \mathbb{R}^{m \times d}$: $S_{s \times m} = D_{s \times m}\, \Omega_{m \times m}$
$D_{s \times m}$: rescaling matrix (according to the sampling probabilities)
$\Omega_{m \times m}$: sampling matrix (based on leverage scores)
Theorem (LS-sampling theorem): For $s = \Theta\!\left(\frac{d \log d}{\beta \varepsilon^2}\right)$, with probability 0.99, $S_{s \times m}$ is a subspace embedding matrix for $A_{m \times d}$.
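
A minimal sketch of leverage score sampling, under stated assumptions: exact leverage scores are computed from a thin SVD (which the code treats as the $\beta = 1$ case, assuming $\beta$ measures how closely the sampling probabilities track the true scores), rows are drawn i.i.d. with replacement, and the sizes and seed are illustrative. The sampled rows are formed directly instead of materializing $D$ and $\Omega$.

```python
import numpy as np

rng = np.random.default_rng(1)

m, d, s = 5000, 4, 2000            # illustrative sizes
A = rng.standard_normal((m, d))
A[0] *= 1000.0                     # one row that dominates a direction

# Leverage scores: squared row norms of U from the thin SVD A = U Sigma V^T.
U, _, _ = np.linalg.svd(A, full_matrices=False)
lev = np.sum(U ** 2, axis=1)       # sums to rank(A) = d

# Sample s rows i.i.d. with probability proportional to leverage
# (exact scores, assumed to correspond to beta = 1) and rescale so that
# E[(SA)^T (SA)] = A^T A.
p = lev / lev.sum()
idx = rng.choice(m, size=s, p=p)
scale = 1.0 / np.sqrt(s * p[idx])
SA = A[idx] * scale[:, None]

# Worst-case distortion, measured on the orthonormal basis U.
sv = np.linalg.svd(U[idx] * scale[:, None], compute_uv=False)
eps_hat = max(sv.max() ** 2 - 1.0, 1.0 - sv.min() ** 2)
print(f"leverage of row 0 = {lev[0]:.3f}, distortion ~ {eps_hat:.3f}")
```

The inflated first row receives a leverage score close to 1, so it is sampled often; uniform row sampling would usually miss it and lose the direction it spans.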

12 Spectral Sparsification and Subspace Embedding (cont.)
Theorem: Sampling and weighting $\tilde{O}(\varepsilon^{-2} n)$ edges from $G(V, E)$ according to the leverage scores of $B \in \mathbb{R}^{\binom{n}{2} \times n}$ results in a $(1 \pm \varepsilon)$ spectral sparsifier of $G$.
Proof. For any $x \in \mathbb{R}^n$, $x^T L x = \|Bx\|_2^2$; apply LS-sampling to obtain a subspace embedding of $B$.
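
The theorem can be illustrated on a toy graph. The sketch below assumes a "dumbbell" (two cliques joined by one bridge edge), i.i.d. sampling with replacement, and an illustrative sample count `s`; none of these choices come from the slides. The leverage score of edge $e$ is computed as $b_e^T L^+ b_e$, and the achieved $\varepsilon$ is measured from the eigenvalues of $L^{+/2} \tilde{L} L^{+/2}$ on the range of $L$.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy "dumbbell": two cliques of size k joined by a single bridge edge.
k = 60
n = 2 * k
edges = [(u, v) for u in range(k) for v in range(u + 1, k)]
edges += [(u + k, v + k) for u, v in edges]
edges.append((0, k))                          # the bridge
E = np.array(edges)
m = len(E)

def laplacian(E, w, n):
    """Weighted Laplacian: sum over edges of w_e * b_e b_e^T."""
    L = np.zeros((n, n))
    np.add.at(L, (E[:, 0], E[:, 0]), w)
    np.add.at(L, (E[:, 1], E[:, 1]), w)
    np.add.at(L, (E[:, 0], E[:, 1]), -w)
    np.add.at(L, (E[:, 1], E[:, 0]), -w)
    return L

L = laplacian(E, np.ones(m), n)

# Pseudo-inverse via the eigendecomposition; the graph is connected, so
# only the first eigenvalue (all-ones vector) is zero.
lam, V = np.linalg.eigh(L)
W = V[:, 1:] / np.sqrt(lam[1:])               # W @ W.T equals L^+
Lp = W @ W.T

# Leverage score of edge e (row b_e of B): b_e^T L^+ b_e.
u, v = E[:, 0], E[:, 1]
lev = Lp[u, u] + Lp[v, v] - 2.0 * Lp[u, v]    # sums to n - 1

# Draw s edges i.i.d. proportionally to leverage; each draw adds 1/(s p_e).
s = 4000
p = lev / lev.sum()
counts = np.bincount(rng.choice(m, size=s, p=p), minlength=m)
w_H = counts / (s * p)
L_H = laplacian(E, w_H, n)
kept = int(np.count_nonzero(counts))

# Achieved eps: eigenvalues of L^{+/2} L_H L^{+/2} restricted to range(L).
rel = np.linalg.eigvalsh(W.T @ L_H @ W)
eps_hat = float(np.max(np.abs(rel - 1.0)))
print(f"kept {kept} of {m} edges, eps ~ {eps_hat:.2f}, bridge weight {w_H[-1]:.2f}")
```

The bridge has leverage score 1 while each clique edge has leverage score $2/k$, so the bridge is sampled many times and keeps a total weight close to 1, which is what preserves the cut between the two cliques.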

13 Linear Sketching for Spectral Sparsification
Theorem ([KLM+14]): There exists a distribution on $\varepsilon^{-2}\,\mathrm{polylog}(n) \times \binom{n}{2}$ dimensional matrices $S$ such that, with high probability, a $(1 \pm \varepsilon)$ spectral sparsifier of $G$ can be recovered from $S \cdot B$.
Key feature: linear sketch
First single-pass spectral sparsifier for dynamic graph streams [KLM+14]
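
Linearity is what makes the sketch usable on dynamic streams: an edge insertion or deletion changes one row of $B$, so $S \cdot B$ can be updated in place without storing $B$. The sketch below demonstrates only this property; the random sign matrix is a placeholder, not the distribution from [KLM+14], and the stream is made up.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(3)

n, k = 6, 8                                  # toy sizes
pairs = list(combinations(range(n), 2))      # one row of B per vertex pair
row_of = {pair: r for r, pair in enumerate(pairs)}

# Placeholder sketching matrix (random signs). The distribution used in
# [KLM+14] is more structured; only linearity matters for this demo.
S = rng.choice([-1.0, 1.0], size=(k, len(pairs)))

def b(u, v):
    """Row of B for edge (u, v): e_u - e_v."""
    vec = np.zeros(n)
    vec[u], vec[v] = 1.0, -1.0
    return vec

# Dynamic stream of (sign, u, v): +1 inserts an edge, -1 deletes it.
stream = [(+1, 0, 1), (+1, 1, 2), (+1, 2, 5), (-1, 1, 2), (+1, 3, 4), (-1, 0, 1)]

SB = np.zeros((k, n))                        # the only state kept
for sign, u, v in stream:
    SB += sign * np.outer(S[:, row_of[(u, v)]], b(u, v))

# Reference: build B for the final graph {(2,5), (3,4)} and sketch it directly.
B_final = np.zeros((len(pairs), n))
for u, v in [(2, 5), (3, 4)]:
    B_final[row_of[(u, v)]] = b(u, v)
same = np.allclose(SB, S @ B_final)
print(same)
```

Deleted edges cancel exactly, so the maintained sketch depends only on the final graph and not on the order of updates.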

14 Introduction and Removal of Artificial Bases
Theorem ([LMP13]): Let $K$ be any PSD matrix with maximum eigenvalue $\lambda_u$ and minimum (non-zero) eigenvalue $\lambda_l$, and let $d = \log(\lambda_u / \lambda_l)$. For $l \in [d]$, define:
$\gamma(l) = \lambda_u / 2^l$
Consider the sequence of PSD matrices $K(0), \ldots, K(d)$, where:
$K(l) = K + \gamma(l)\, I$
Then:
1. $K \preceq_R K(d) \preceq_R 2K$
2. $K(l) \preceq K(l-1) \preceq 2K(l)$ for $l \geq 1$
3. $K(0) \preceq 2\gamma(0)\, I \preceq 2K(0)$
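
The three properties can be checked numerically on a small example. This sketch makes several assumptions that are not stated on the slide: $K$ is the Laplacian of a path graph, the logarithm is base 2 and rounded up so that $d$ is an integer, and $\preceq_R$ is read as the Loewner order restricted to the range of $K$.

```python
import numpy as np

def is_psd(M, tol=1e-9):
    """True if the symmetric matrix M is PSD up to a numerical tolerance."""
    return bool(np.linalg.eigvalsh(M).min() >= -tol)

# K: Laplacian of a path on n vertices (PSD with a one-dimensional null space).
n = 16
K = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
K[0, 0] = K[-1, -1] = 1.0

lam, V = np.linalg.eigh(K)
nonzero = lam > 1e-9
lam_u, lam_l = lam.max(), lam[nonzero].min()
d = int(np.ceil(np.log2(lam_u / lam_l)))   # assumption: base-2 log, rounded up

def gamma(l):
    return lam_u / 2.0 ** l

def K_level(l):
    return K + gamma(l) * np.eye(n)

I = np.eye(n)
R = V[:, nonzero]                          # orthonormal basis of range(K)

# 1. K <=_R K(d) <=_R 2K  (comparison restricted to the range of K)
prop1 = (is_psd(R.T @ (K_level(d) - K) @ R)
         and is_psd(R.T @ (2 * K - K_level(d)) @ R))
# 2. K(l) <= K(l-1) <= 2 K(l) for l >= 1
prop2 = all(
    is_psd(K_level(l - 1) - K_level(l)) and is_psd(2 * K_level(l) - K_level(l - 1))
    for l in range(1, d + 1)
)
# 3. K(0) <= 2 gamma(0) I <= 2 K(0)
prop3 = (is_psd(2 * gamma(0) * I - K_level(0))
         and is_psd(2 * K_level(0) - 2 * gamma(0) * I))
print(d, prop1, prop2, prop3)
```

The chain starts at $K(0)$, which is within a factor 2 of a scaled identity and therefore easy to handle, and each step halves $\gamma$ until the regularization is no larger than $\lambda_l$.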

15 Constructing a Spectral Sparsifier
Use the previous theorem!
$d = O(\log n)$ for Laplacian matrices
Leverage scores with respect to $K(l)$ approximate leverage scores with respect to $K(l+1)$
Proof. On the board.
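
One way to see why adjacent levels have comparable leverage scores; this is a sketch only, since the slides defer the proof to the board. Assume the leverage score of a row $b$ at level $l$ is measured as $\tau_l(b) = b^T K(l)^{-1} b$ (notation introduced here for illustration; $K(l)$ is invertible because $\gamma(l) > 0$). Property 2 of the [LMP13] theorem, applied at level $l+1$, then gives a factor-2 approximation:

```latex
K(l+1) \preceq K(l) \preceq 2K(l+1)
\;\Longrightarrow\;
\tfrac{1}{2}\, K(l+1)^{-1} \preceq K(l)^{-1} \preceq K(l+1)^{-1}
\;\Longrightarrow\;
\tfrac{1}{2}\,\tau_{l+1}(b) \;\le\; \tau_l(b) \;\le\; \tau_{l+1}(b).
```

The middle step uses the fact that inversion reverses the Loewner order on positive definite matrices.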

16 Sparse Recovery Algorithm
Theorem ([GLPS12]): There exists an algorithm $D$ and a distribution on matrices $\Phi$ of dimension $\varepsilon^{-2}\,\mathrm{polylog}(n) \times n$ such that, for any $x \in \mathbb{R}^n$, with high probability, $D(\Phi x, i)$ can detect whether $x_i = \Omega(\|x\|)$ or $x_i = o(\|x\|)$.
Heavy hitter detection!
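
To make the interface $D(\Phi x, i)$ concrete, here is a minimal sketch that swaps in a plain Count-Sketch for the [GLPS12] construction; it is not their algorithm. The sizes, the threshold of 0.5, and the helper names `sketch` and `detect` are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)

n, rows, buckets = 1000, 7, 64               # toy sizes
h = rng.integers(0, buckets, size=(rows, n)) # bucket of coordinate i in each row
sgn = rng.choice([-1.0, 1.0], size=(rows, n))

def sketch(x):
    """Phi @ x for the Count-Sketch matrix Phi defined by (h, sgn); linear in x."""
    return np.stack([
        np.bincount(h[j], weights=sgn[j] * x, minlength=buckets)
        for j in range(rows)
    ])

def detect(sk, i, threshold=0.5):
    """Stand-in for D(Phi x, i): is |x_i| at least threshold * ||x||_2 (roughly)?"""
    # Median over rows of the signed bucket that coordinate i hashes to.
    est_xi = np.median(sgn[:, i] * sk[np.arange(rows), h[:, i]])
    # Each row's squared bucket sum is an estimate of ||x||_2^2.
    est_norm = np.sqrt(np.median(np.sum(sk ** 2, axis=1)))
    return bool(abs(est_xi) >= threshold * est_norm)

x = rng.standard_normal(n)                   # many light coordinates
x[7] = 100.0                                 # one heavy coordinate
sk = sketch(x)
heavy_7, heavy_3 = detect(sk, 7), detect(sk, 3)
print(heavy_7, heavy_3)
```

The decoder sees only the 448 numbers in `sk`, not the 1000 entries of `x`, and the sketch is linear, so it can be maintained under the same kind of stream updates as $S \cdot B$.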

17 Constructing Spectral Sparsifiers via Linear Sketches
1. For $i = 1, \ldots, O(\log n)$:
(a) Maintain $\Phi D_i B$ ($\Phi$ is the sparse recovery matrix, $D_i \in \mathbb{R}^{\binom{n}{2} \times \binom{n}{2}}$ is diagonal)
2. Repeat $O(\log n)$ times
We are done!
Proof sketch.
Enough information to traverse the hierarchy of $K(0)$ to $K(d)$
At each level $l$, compute $\Phi D_i B\, K(l)\, b_e$ for every edge $e$
Run $D(\Phi D_i B\, K(l)\, b_e, e)$ to sample an edge $e$

18 Questions?

19
[GLPS12] Anna C. Gilbert, Yi Li, Ely Porat, and Martin J. Strauss. Approximate sparse recovery: Optimizing time and measurements. SIAM J. Comput., 41(2), 2012.
[KLM+14] Michael Kapralov, Yin Tat Lee, Cameron Musco, Christopher Musco, and Aaron Sidford. Single pass spectral sparsification in dynamic streams. In 55th IEEE Annual Symposium on Foundations of Computer Science, FOCS 2014, pages 561-570, 2014.
[LMP13] Mu Li, Gary L. Miller, and Richard Peng. Iterative row sampling. In 54th Annual IEEE Symposium on Foundations of Computer Science, FOCS 2013, 2013.
[ST11] Daniel A. Spielman and Shang-Hua Teng. Spectral sparsification of graphs. SIAM J. Comput., 40(4), 2011.
