Sublinear Algorithms for Big Data. Part 4: Random Topics




Sublinear Algorithms for Big Data, Part 4: Random Topics. Qin Zhang

Topic 1: Compressive sensing

Compressive sensing. The model (Candes-Romberg-Tao 04; Donoho 04). Applications: medical imaging reconstruction, the single-pixel camera, compressive sensor networks, etc.

Formalization (Lp/Lq guarantee). The goal is to acquire a signal x = [x_1, ..., x_n] (e.g., a digital image). The acquisition proceeds by computing a measurement vector Ax of dimension m << n. Then, from Ax, we want to recover a k-sparse approximation x̂ of x so that

$\|\hat{x} - x\|_q \le C \cdot \min_{\|x'\|_0 \le k} \|x' - x\|_p =: C \cdot \mathrm{Err}_p^k(x)$   (*)

Often studied: L1/L1, L1/L2 and L2/L2.

For each: given a (random) matrix A, for each signal x, (*) holds w.h.p.
For all: one matrix A works for all signals x. A stronger guarantee.
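A quick illustration of the benchmark quantity (a hypothetical helper, not from the lecture): the best k-sparse approximation of x under any Lp norm keeps the k largest-magnitude entries, so Err_p^k(x) is just the p-norm of the remaining tail.

```python
import numpy as np

def err_k(x, k, p=1):
    """Err_p^k(x): distance from x to its best k-sparse approximation.

    The minimizing k-sparse x' keeps the k largest-magnitude entries of x,
    so the error is the p-norm of the remaining (n - k) "tail" entries.
    """
    x = np.asarray(x, dtype=float)
    tail = np.sort(np.abs(x))[:-k] if k > 0 else np.abs(x)
    return np.linalg.norm(tail, ord=p)

err_k([3, -1, 0.5, 0.2], k=2, p=1)  # keeps {3, -1}; tail is 0.5 + 0.2, i.e. about 0.7
```

Note that Err_p^k(x) = 0 exactly when x is itself k-sparse, which is why the guarantee (*) implies exact recovery of k-sparse signals.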

Results up to year 2009 (table copied from Indyk's talk; not reproduced here).

Pre-history: Orthogonal Matching Pursuit

Given an n-dimensional signal vector x ∈ {0, 1}^n with k << n non-zero entries, let y = Ax, where A is random in {-1, 0, 1}^{m×n}. The following algorithm can recover x exactly from A and y with m = O(k log(n/k)), i.e., O(k log(n/k)) measurements.

Algorithm Orthogonal Matching Pursuit
Set r = y. Denote A = (A_1, ..., A_n).
For i = 1 to t do
  1. Set A_j = argmax over A_l in {A_1, ..., A_n} of |⟨r, A_l⟩|
  2. Set γ_j = argmin over γ of ‖r − A_j γ‖_2
  3. Set r = r − A_j γ_j
Return x̂ where x̂_j = γ_j.

The algorithm converges since ‖r‖_2 decreases in each step; we may stop early once ‖r‖_2 or γ is very small.
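A runnable sketch of the pursuit above. Assumed details not fixed by the slide: the refit is done by least squares over the whole current support (the usual "orthogonal" step, slightly stronger than the single-column update written above), and the demo draws A with i.i.d. entries from {-1, 0, 1}.

```python
import numpy as np

def omp(A, y, t, tol=1e-8):
    """Orthogonal Matching Pursuit (sketch): greedily pick the column most
    correlated with the residual, refit by least squares, update the residual."""
    m, n = A.shape
    r = y.astype(float)
    support, coef = [], np.zeros(0)
    for _ in range(t):
        if np.linalg.norm(r) < tol:          # stop early: residual is tiny
            break
        j = int(np.argmax(np.abs(A.T @ r)))  # step 1: best-matching column
        if j not in support:
            support.append(j)
        # steps 2-3: least-squares fit on the current support, new residual
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        r = y - A[:, support] @ coef
    x_hat = np.zeros(n)
    x_hat[support] = coef
    return x_hat

# demo: recover a 2-sparse 0/1 signal from m = 30 random {-1,0,1} measurements
rng = np.random.default_rng(0)
n, m, k = 40, 30, 2
x = np.zeros(n)
x[rng.choice(n, size=k, replace=False)] = 1.0
A = rng.choice([-1.0, 0.0, 1.0], size=(m, n))
x_hat = omp(A, A @ x, t=4 * k)
```

Running more than k iterations is harmless here: once the true support is found the residual drops below `tol` and the loop stops.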

L1 point query (recall)

Algorithm Count-Min [Cormode and Muthu 05]
Pick d (d = log(1/δ)) independent hash functions h_1, ..., h_d, where h_t : {1, ..., n} → {1, ..., w} (w = 4/ε), from a 2-universal family.
Maintain d vectors Z^1, ..., Z^d, where Z^t = (Z^t_1, ..., Z^t_w) such that $Z^t_j = \sum_{i : h_t(i) = j} x_i$.
Estimator: x̃_i = min_t Z^t_{h_t(i)}.

Theorem. We can solve the L1 point query, with approximation ε and failure probability δ, by storing O(1/ε · log(1/δ)) words.
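A minimal sketch of Count-Min as recalled above, for a non-negative stream. The 2-universal hashes are drawn as ((a·i + b) mod p) mod w for a large prime p (a standard construction; the specific prime and seeding are implementation choices, not from the slide).

```python
import math
import random
import numpy as np

class CountMin:
    """Count-Min sketch for L1 point queries on a non-negative stream."""

    def __init__(self, eps, delta, seed=0):
        rng = random.Random(seed)
        self.w = math.ceil(4 / eps)              # counters per row
        self.d = math.ceil(math.log(1 / delta))  # number of rows
        self.p = (1 << 61) - 1                   # prime for 2-universal hashing
        self.ab = [(rng.randrange(1, self.p), rng.randrange(self.p))
                   for _ in range(self.d)]
        self.Z = np.zeros((self.d, self.w))

    def _h(self, t, i):
        a, b = self.ab[t]
        return ((a * i + b) % self.p) % self.w

    def update(self, i, c=1.0):                  # process stream item: x_i += c
        for t in range(self.d):
            self.Z[t, self._h(t, i)] += c

    def query(self, i):                          # min over rows never underestimates
        return min(self.Z[t, self._h(t, i)] for t in range(self.d))

cm = CountMin(eps=0.1, delta=0.01)
cm.update(7, 5.0)
est = cm.query(7)  # >= 5.0 always; <= 5.0 + eps * ||x||_1 w.p. 1 - delta
```

Space is d·w = O(1/ε · log(1/δ)) counters, matching the theorem.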

For each (L1/L1)

The algorithm for L1 point query gives an L1/L1 sparse approximation.

Recall the L1 point query problem: given ε, after reading the whole stream, given i, report x̃_i = x_i ± ε‖x‖_1.

Set α = kε ∈ (0, 1) and δ = 1/n^2 in the L1 point query, and then return a vector x̂ consisting of the k largest (in magnitude) elements of x̃. This gives, w.p. 1 − δ,

$\|\hat{x} - x\|_1 \le (1 + 3\alpha)\,\mathrm{Err}_1^k(x)$

Total measurements: m = O(k/α · log n). (All analysis on board.)
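The step from point queries to the L1/L1 bound is deterministic once every estimate is within ±(α/k)·Err_1^k(x) of the truth (the tail-based refinement of the Count-Min error). A small simulation of just that step, with the sketch replaced by worst-case bounded noise (an assumption made purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
n, k, alpha = 100, 5, 0.5
x = rng.normal(size=n) * (rng.random(n) < 0.2)  # a sparse-ish signal

err1k = np.sort(np.abs(x))[:-k].sum()           # Err_1^k(x): L1 norm of the tail
beta = (alpha / k) * err1k                      # per-coordinate query error bound
x_tilde = x + rng.uniform(-beta, beta, size=n)  # simulated point-query answers

# keep the k largest estimates (in magnitude), zero out the rest
x_hat = np.zeros(n)
top = np.argsort(np.abs(x_tilde))[-k:]
x_hat[top] = x_tilde[top]
```

The chosen set S pays at most kβ on its own coordinates and misses at most Err_1^k(x) + 2kβ of mass off S, giving ‖x̂ − x‖_1 ≤ Err_1^k(x) + 3kβ = (1 + 3α)·Err_1^k(x).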

For all (L1/L2)

A matrix A satisfies (k, δ)-RIP (Restricted Isometry Property) if for every k-sparse vector x we have (1 − δ)‖x‖_2 ≤ ‖Ax‖_2 ≤ (1 + δ)‖x‖_2.

Theorem (Johnson-Lindenstrauss Lemma). For any x with ‖x‖_2 = 1, we have 7/8 ≤ ‖Ax‖_2 ≤ 8/7 w.p. 1 − e^{−O(m)}.

Theorem. If each entry of A is i.i.d. N(0, 1) and m = O(k log(n/k)), then A satisfies (k, 1/3)-RIP w.h.p.

Main Theorem. Suppose A satisfies (6k, 1/3)-RIP. Let x* be the solution to the LP: minimize ‖x'‖_1 subject to Ax' = Ax (x' need not be k-sparse). Then

$\|x^* - x\|_2 \le (C/\sqrt{k})\,\mathrm{Err}_1^k(x)$ for any x.

(All analysis on board.)
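The LP in the Main Theorem (basis pursuit) can be handed to any LP solver by splitting x' = u − v with u, v ≥ 0, so that ‖x'‖_1 becomes the linear objective Σu + Σv. A sketch using scipy.optimize.linprog (an assumed dependency, not part of the lecture):

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
n, m, k = 30, 15, 2
x = np.zeros(n)
x[rng.choice(n, size=k, replace=False)] = [3.0, -2.0]
A = rng.normal(size=(m, n)) / np.sqrt(m)  # i.i.d. Gaussian measurement matrix
b = A @ x

# minimize ||x'||_1  s.t.  A x' = b, as an LP over (u, v) with x' = u - v:
#   minimize sum(u) + sum(v)  s.t.  A u - A v = b,  u >= 0, v >= 0
res = linprog(c=np.ones(2 * n),
              A_eq=np.hstack([A, -A]), b_eq=b,
              bounds=[(0, None)] * (2 * n))
x_star = res.x[:n] - res.x[n:]
```

At these sizes (k = 2, m = 15, n = 30) Gaussian measurements are comfortably in the regime where l1 minimization typically recovers x exactly, illustrating that Err_1^k(x) = 0 forces x* = x.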