Sublinear Algorithms for Big Data. Part 4: Random Topics


1 Sublinear Algorithms for Big Data. Part 4: Random Topics. Qin Zhang

2 Topic 1: Compressive sensing

3 Compressive sensing. The model (Candes-Romberg-Tao 04; Donoho 04). Applications: medical imaging reconstruction, the single-pixel camera, compressive sensor networks, etc.

4–7 Formalization. The $L_p/L_q$ guarantee: the goal is to acquire a signal $x = [x_1, \ldots, x_n]$ (e.g., a digital image). The acquisition proceeds by computing a measurement vector $Ax$ of dimension $m \ll n$. Then, from $Ax$, we want to recover a $k$-sparse approximation $x^*$ of $x$ so that

$$\|x - x^*\|_q \le C \cdot \min_{\|x'\|_0 \le k} \|x - x'\|_p. \qquad (*)$$

The minimum on the right-hand side is denoted $\mathrm{Err}_p^k(x)$. Often studied: $L_1/L_1$, $L_1/L_2$ and $L_2/L_2$.

For each: given a (random) matrix $A$, for each signal $x$, $(*)$ holds w.h.p.
For all: one matrix $A$ works for all signals $x$. Stronger.
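To make the benchmark concrete, here is a minimal NumPy sketch (mine, not from the slides) of the best $k$-sparse approximation and the resulting error $\mathrm{Err}_p^k(x)$; the minimizer simply keeps the $k$ largest-magnitude entries of $x$:

```python
import numpy as np

def best_k_sparse(x, k):
    """The minimizer of ||x - x'||_p over k-sparse x' (for any p >= 1):
    keep the k largest-magnitude entries of x, zero out the rest."""
    xk = np.zeros_like(x)
    top = np.argsort(np.abs(x))[-k:]      # indices of the k largest |x_i|
    xk[top] = x[top]
    return xk

def err_k(x, k, p=1):
    """Err_p^k(x) = min over k-sparse x' of ||x - x'||_p."""
    return np.linalg.norm(x - best_k_sparse(x, k), ord=p)

x = np.array([5.0, -0.1, 3.0, 0.2, -4.0, 0.05])
print(best_k_sparse(x, 3))    # [ 5.  0.  3.  0. -4.  0.]
print(err_k(x, 3, p=1))       # 0.35  (= 0.1 + 0.2 + 0.05)
```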

8 Results. [Table of known results, up to year …, copied from Indyk's talk; not reproduced here.]

9–12 Pre-history: Orthogonal Matching Pursuit. Given a signal: an $n$-dimensional vector $x \in \{0,1\}^n$ with $k \ll n$ non-zero entries. Let $y = Ax$, where $A \in_{\mathrm{rand}} \{-1, 0, 1\}^{m \times n}$. The following algorithm can recover $x$ exactly from $A, y$ with $m = O(k \log(n/k))$ rows (i.e., $O(k \log(n/k))$ measurements).

Algorithm Orthogonal Matching Pursuit
Set $r = y$. Denote $A = (A_1, \ldots, A_n)$.
For $i = 1$ to $t$ do
  1. Set $A_j = \arg\max_{A_l \in \{A_1, \ldots, A_n\}} \langle r, A_l \rangle$
  2. Set $\gamma_j = \arg\min_{\gamma} \|r - A_j \gamma\|_2$
  3. Set $r = r - A_j \gamma_j$
Return $x^*$ where $x^*_j = \gamma_j$.
(One may stop early when $\|r\|_2$ or $\gamma$ becomes very small.)

The procedure converges since $\|r\|_2$ decreases in each step.
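A literal Python transcription of the loop above may help; this is a sketch, not the course's code, and as written the slide's loop is plain matching pursuit (full OMP would re-solve a least-squares problem over all columns selected so far in each iteration):

```python
import numpy as np

def matching_pursuit(A, y, t, tol=1e-9):
    """Greedy recovery of x from y = A @ x, following the slide's loop.
    A: m x n matrix with columns A_1..A_n; t: number of iterations."""
    n = A.shape[1]
    r = y.astype(float)
    x_hat = np.zeros(n)
    for _ in range(t):
        j = np.argmax(np.abs(A.T @ r))               # column most correlated with r
        gamma = (A[:, j] @ r) / (A[:, j] @ A[:, j])  # argmin_g ||r - g * A_j||_2
        r = r - gamma * A[:, j]                      # peel off that contribution
        x_hat[j] += gamma
        if np.linalg.norm(r) < tol:                  # "stop when ||r||_2 is very small"
            break
    return x_hat

rng = np.random.default_rng(0)
n, k, m = 200, 5, 80                   # m on the order of k log(n/k)
A = rng.choice([-1.0, 0.0, 1.0], size=(m, n))
x = np.zeros(n)
x[rng.choice(n, size=k, replace=False)] = 1.0
print(np.allclose(matching_pursuit(A, A @ x, t=10 * k), x, atol=1e-6))
# usually True for these parameters
```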

13 $L_1$ point query (recall). Algorithm Count-Min [Cormode and Muthu 05]: pick $d$ ($d = \log(1/\delta)$) independent hash functions $h_1, \ldots, h_d$, where $h_i : \{1, \ldots, n\} \to \{1, \ldots, w\}$ ($w = 4/\epsilon$), from a 2-universal family. Maintain $d$ vectors $Z^1, \ldots, Z^d$, where $Z^t = (Z^t_1, \ldots, Z^t_w)$, such that

$$Z^t_j = \sum_{i : h_t(i) = j} x_i.$$

Estimator: $\hat{x}_i = \min_t Z^t_{h_t(i)}$.

Theorem. We can solve the $L_1$ point query problem, with approximation $\epsilon$ and failure probability $\delta$, by storing $O(\frac{1}{\epsilon} \log \frac{1}{\delta})$ words.
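A compact Python sketch of the structure, assuming the standard 2-universal family $h(i) = ((a i + b) \bmod P) \bmod w$ in place of whichever family the slide intends:

```python
import math, random

class CountMin:
    """d = ceil(log2(1/delta)) rows, w = ceil(4/eps) counters per row."""
    P = (1 << 61) - 1                     # a Mersenne prime for the hash family

    def __init__(self, eps, delta, seed=None):
        rnd = random.Random(seed)
        self.w = math.ceil(4 / eps)
        self.d = math.ceil(math.log2(1 / delta))
        self.Z = [[0] * self.w for _ in range(self.d)]
        # h_t(i) = ((a*i + b) mod P) mod w, a 2-universal family for a != 0
        self.ab = [(rnd.randrange(1, self.P), rnd.randrange(self.P))
                   for _ in range(self.d)]

    def _h(self, t, i):
        a, b = self.ab[t]
        return ((a * i + b) % self.P) % self.w

    def update(self, i, c=1):             # stream update: x_i += c
        for t in range(self.d):
            self.Z[t][self._h(t, i)] += c

    def query(self, i):                   # estimator: min_t Z^t_{h_t(i)}
        return min(self.Z[t][self._h(t, i)] for t in range(self.d))
```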

14–16 For each ($L_1/L_1$). The algorithm for $L_1$ point query gives an $L_1/L_1$ sparse approximation.

Recall the $L_1$ point query problem: given $\epsilon$, after reading the whole stream, given $i$, report $\hat{x}_i = x_i \pm \epsilon \|x\|_1$.

Set $\alpha = k\epsilon \in (0, 1)$ and $\delta = 1/n^2$ in the $L_1$ point query, and then return a vector $x^*$ consisting of the $k$ largest (in magnitude) elements of the estimate $\hat{x}$. This gives, w.p. $1 - \delta$,

$$\|x - x^*\|_1 \le (1 + 3\alpha) \, \mathrm{Err}^k_1(x).$$

Total measurements: $m = O(\frac{k}{\alpha} \log n)$. (All analysis on board.)
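Reusing the CountMin class sketched above, the recovery recipe is one loop; l1_l1_approx is a hypothetical helper wired up with the slide's parameter choices $\epsilon = \alpha/k$ and $\delta = 1/n^2$:

```python
import heapq

def l1_l1_approx(stream, n, k, alpha):
    """Return x* = the k largest (in magnitude) point-query estimates,
    using eps = alpha / k and delta = 1 / n^2 as on the slide."""
    cm = CountMin(eps=alpha / k, delta=1.0 / n**2, seed=0)
    for i, c in stream:                  # stream of (coordinate, increment) pairs
        cm.update(i, c)
    est = [(i, cm.query(i)) for i in range(n)]
    top = heapq.nlargest(k, est, key=lambda iv: abs(iv[1]))
    x_star = [0] * n
    for i, v in top:
        x_star[i] = v
    return x_star
```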

17–20 For all ($L_1/L_2$). A matrix $A$ satisfies the $(k, \delta)$-RIP (Restricted Isometry Property) if for every $k$-sparse vector $x$ we have

$$(1 - \delta) \|x\|_2 \le \|Ax\|_2 \le (1 + \delta) \|x\|_2.$$

Theorem (Johnson-Lindenstrauss Lemma). For each $x$ with $\|x\|_2 = 1$, we have $7/8 \le \|Ax\|_2 \le 8/7$ w.p. $1 - e^{-O(m)}$.

Theorem. If each entry of $A$ is i.i.d. as $N(0, 1)$ and $m = O(k \log(n/k))$, then $A$ satisfies the $(k, 1/3)$-RIP w.h.p.

Main Theorem. Suppose $A$ has the $(6k, 1/3)$-RIP, and let $x^*$ be the solution to the LP: minimize $\|x'\|_1$ subject to $Ax' = Ax$. Then $\|x - x^*\|_2 \le (C/\sqrt{k}) \, \mathrm{Err}^k_1(x)$ for any $x$. (All analysis on board.)
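The Main Theorem's decoder is an ordinary linear program. Here is a sketch using scipy.optimize.linprog (an assumption for illustration: any LP solver would do), with the standard split $x' = u - v$, $u, v \ge 0$, which makes $\|x'\|_1$ linear:

```python
import numpy as np
from scipy.optimize import linprog

def l1_decode(A, y):
    """Solve: minimize ||x'||_1 subject to A x' = y.
    Split x' = u - v with u, v >= 0, so ||x'||_1 = sum(u) + sum(v)."""
    m, n = A.shape
    c = np.ones(2 * n)                       # objective: sum(u) + sum(v)
    A_eq = np.hstack([A, -A])                # A u - A v = y
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=[(0, None)] * (2 * n))
    u, v = res.x[:n], res.x[n:]
    return u - v

rng = np.random.default_rng(1)
n, k, m = 100, 3, 40                         # m on the order of k log(n/k)
A = rng.normal(size=(m, n))                  # i.i.d. Gaussian entries
x = np.zeros(n); x[[7, 42, 90]] = [1.0, -2.0, 0.5]
print(np.allclose(l1_decode(A, A @ x), x, atol=1e-5))   # True w.h.p.
```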
