The degrees of freedom of the Lasso in underdetermined linear regression models

Size: px
Start display at page:

Download "The degrees of freedom of the Lasso in underdetermined linear regression models"

Transcription

1 The degrees of freedom of the Lasso in underdetermined linear regression models C. Dossal (1), M. Kachour (2), J. Fadili (2), G. Peyré (3), C. Chesneau (4) (1) IMB, Université Bordeaux 1 (2) GREYC, ENSICAEN (3) CNRS and Ceremade Université Paris-Dauphine (4) LMNO, Université de Caen SPARS 11, June 2011 Edinburgh

2 The contents 1 Motivations 2 Some characteristics of the Lasso 3 The overdetermined case 4 Main results 5 References

3 Motivations The contents 1 Motivations 2 Some characteristics of the Lasso 3 The overdetermined case 4 Main results 5 References

4 Motivations Statistical framework : The underdetermined (or high dimensional) linear regression model estimation : y = Ax 0 + ε = θ 0 + ε, (1) where y R n, A = (a j ) j {1,,p}, with a j R n and n < p, x 0 R p, and ε N (0, σ 2 I ) Applications : Genomics Econometrics Signal processing Tomography,...

5 Motivations Strict sparsity assumption : Weak sparsity assumption : x 0 has few non-zero components x 0 can be approximated to good quality by a vector with few non-zero components Let ˆx = f (y) be an estimator of x 0 and ˆµ = Aˆx Since y N (Ax 0, σ 2 ), according to Efron [2], the degrees of freedom df of ˆµ is dened by df (ˆµ) = n i=1 cov(ˆµ i, y i ) σ 2 (2)

6 Motivations Stein's lemma [3] : Let ˆx be an estimator of x 0. Suppose that ˆµ = Aˆx is almost dierentiable. Then, we have [ ] df (ˆµ) = E df ˆ (ˆµ), where df ˆ (ˆµ) = div (ˆµ) = n i=1 ˆµ i y i (3)

7 Motivations Stein's lemma [3] : Let ˆx be an estimator of x 0. Suppose that ˆµ = Aˆx is almost dierentiable. Then, we have [ ] df (ˆµ) = E df ˆ (ˆµ), where df ˆ (ˆµ) = div (ˆµ) = n i=1 ˆµ i y i (3) We dene MSE(ˆµ) = E ˆµ θ = E A(ˆx x 0 ) 2 2 (4) SURE(ˆµ) = nσ 2 + ˆµ y σ 2 ˆ df (ˆµ) (5) So, according to Stein's lemma, we have MSE(ˆµ) = E [SURE(ˆµ)] (6)

8 Some characteristics of the Lasso The contents 1 Motivations 2 Some characteristics of the Lasso 3 The overdetermined case 4 Main results 5 References

9 Some characteristics of the Lasso The Lasso estimate [4] amounts to solving : P 1 (y, λ) : 1 min x R p 2 y Ax λ x 1 (7) Depending on λ, some coecients are set exactly equal to zero The estimator can be computed for very large p Let x = ( x j ) R p. The support of x and its cardinal are dened by I ( x) = supp( x) = {j : x j 0}, I ( x) = card(i ( x)) We note x I ( x) = ( x i ) i I ( x), A I ( x) = (a i ) i I ( x) and sign ( x I ( x) ) = ( sign ( xi) )) i I ( x) { 1, 1} I ( x)

10 Some characteristics of the Lasso x is a solution of P 1 (y, λ) if and only if 1 a k, y A x = λ, k I ( x) 2 a j, y A x λ, j I ( x) x is the unique minimizer of P 1 (y, λ) if 1 a k, y A x = λ, k I ( x) 2 a j, y A x < λ, j I ( x) 3 A I ( x) is full rank Here, we have ( ) 1 ) 1 x I ( x) = A t I ( x) A I ( x) A t I ( x) (A y λ t I ( x) A ) I ( x) sign ( xi ( x) (8) Furtheremore, there exists a nite sequence of λ's, called the transition points, so that between two transition point the support and the sign of the optimum (8) are constant with respect to λ.

11 The overdetermined case The contents 1 Motivations 2 Some characteristics of the Lasso 3 The overdetermined case 4 Main results 5 References

12 The overdetermined case Consider the overdetermined linear regression model : y = Ax 0 + ε, where y R n, x 0 R p, ε N (0, σ 2 I ), and (a j ) j {1,,p} are linearly independent, that is rank(a) = p Here, the Lasso problem P 1 (y, λ) has an unique solution x λ Let K λ R n, so that z K λ, λ is not a transition point

13 The overdetermined case Theorem (Zou and al. [5]) Let λ > 0 and y K λ. Let µ λ = A x λ (y) be the Lasso response vector. Then, we have [ ] df ( µ λ ) = E Ĩ, (9) where Ĩ is the support of x λ

14 Main results The contents 1 Motivations 2 Some characteristics of the Lasso 3 The overdetermined case 4 Main results 5 References

15 Main results Goal : Extend the previous result to the underdetermined case where the Lasso solution is not unique Remark : If x 1 and x 2 are solutions of P 1 (y, λ), then A x 1 = A x 2 Let ˆµ λ (y) be the unique image of the solutions of P 1 (y, λ)

16 Main results Notations Let I {1, 2,, p}, such that A I = (a i ) i I is full rank, A + I = (A t I A I ) 1 A t I the pseudo-inverse of A I, V I = span(a i ) i I, P VI = A I A + I, P V = I n n P VI, S { 1, 1} I I Let j {1, 2,, p} and x λ > 0. We dene H I,j,S = {u R n : P V (a j ), u = ±λ(1 a j, (A + I I ) t S )} (10)

17 Main results Notations Let I {1, 2,, p}, such that A I = (a i ) i I is full rank, A + I = (A t I A I ) 1 A t I the pseudo-inverse of A I, V I = span(a i ) i I, P VI = A I A + I, P V = I n n P VI, S { 1, 1} I I Let j {1, 2,, p} and x λ > 0. We dene H I,j,S = {u R n : P V (a j ), u = ±λ(1 a j, (A + I I ) t S )} (10) Ω = {(I, j, S) : a j V I } (11)

18 Main results Notations Let I {1, 2,, p}, such that A I = (a i ) i I is full rank, A + I = (A t I A I ) 1 A t I the pseudo-inverse of A I, V I = span(a i ) i I, P VI = A I A + I, P V = I n n P VI, S { 1, 1} I I Let j {1, 2,, p} and x λ > 0. We dene H I,j,S = {u R n : P V (a j ), u = ±λ(1 a j, (A + I I ) t S )} (10) and Ω = {(I, j, S) : a j V I } (11) G λ = R n \ H I,j,S (12) (I,j,S) Ω

19 Main results Notations Let I {1, 2,, p}, such that A I = (a i ) i I is full rank, A + I = (A t I A I ) 1 A t I the pseudo-inverse of A I, V I = span(a i ) i I, P VI = A I A + I, P V = I n n P VI, S { 1, 1} I I Let j {1, 2,, p} and x λ > 0. We dene H I,j,S = {u R n : P V (a j ), u = ±λ(1 a j, (A + I I ) t S )} (10) and Ω = {(I, j, S) : a j V I } (11) G λ = R n \ H I,j,S (12) (I,j,S) Ω Remark : K λ G λ

20 Main results Theorem : Let λ > 0, y G λ, and M y,λ the set of all the solutions of P 1 (y, λ). Let xλ be a solution of P 1(y, λ) such that A I the associated matrix to its support I is full rank. Then, I = min x λ M y,λ supp(x λ ). (13) Furthermore, there exists ε > 0 such that for all z B(y, ε), we have ˆµ λ (z) = ˆµ λ (y) + P VI (z y). (14)

21 Main results Corollary 1 : Let λ > 0 and y G λ. Then, we have div(ˆµ λ (y)) = I (15) Moreover, df = E( I ). (16) Corollary 2 : Let λ > 0 and y G λ. Suppose that P 1 (y, λ) has an unique solution ˆx λ, and let Î be its support. Then, we have df = E( Î ). (17)

22 Main results Condition (UC), Dossal [1] A matrix A satises condition (UC) if, for all subsets I {1,, p} with I n, such that (a i ) i I are linearly independent, for all indices j I and all vectors S { 1, 1} I, a j, (A + I ) t S 1. (18) Consequences Suppose that A satises condition (UC). Then, we have y G λ P 1 (y, λ) has an unique solution ˆx λ Therefore, df = E( Î ), where Î is the support of ˆx λ

23 References The contents 1 Motivations 2 Some characteristics of the Lasso 3 The overdetermined case 4 Main results 5 References

24 References Dossal, C (2007). A necessary and sucient condition for exact recovery by l1 minimization. Technical report, HAL :1. Efron, B. (1981). How biased is the apparent error rate of a prediction rule. J. Amer. Statist. Assoc. vol. 81 pp Stein, C. (1981). Estimation of the mean of a multivariate normal distribution. Ann. Statist Tibshirani, R. (1996). Regression shrinkage and selection via the Lasso. J. Roy. Statist. Soc. Ser. B 58(1) Zou, H., Hastie, T. and Tibshirani, R. (2007). On the "degrees of freedom" of the Lasso. Ann. Statist. Vol. 35, No

Degrees of Freedom and Model Search

Degrees of Freedom and Model Search Degrees of Freedom and Model Search Ryan J. Tibshirani Abstract Degrees of freedom is a fundamental concept in statistical modeling, as it provides a quantitative description of the amount of fitting performed

More information

NOTES ON LINEAR TRANSFORMATIONS

NOTES ON LINEAR TRANSFORMATIONS NOTES ON LINEAR TRANSFORMATIONS Definition 1. Let V and W be vector spaces. A function T : V W is a linear transformation from V to W if the following two properties hold. i T v + v = T v + T v for all

More information

MATH 304 Linear Algebra Lecture 18: Rank and nullity of a matrix.

MATH 304 Linear Algebra Lecture 18: Rank and nullity of a matrix. MATH 304 Linear Algebra Lecture 18: Rank and nullity of a matrix. Nullspace Let A = (a ij ) be an m n matrix. Definition. The nullspace of the matrix A, denoted N(A), is the set of all n-dimensional column

More information

160 CHAPTER 4. VECTOR SPACES

160 CHAPTER 4. VECTOR SPACES 160 CHAPTER 4. VECTOR SPACES 4. Rank and Nullity In this section, we look at relationships between the row space, column space, null space of a matrix and its transpose. We will derive fundamental results

More information

On the Degrees of Freedom of the Lasso

On the Degrees of Freedom of the Lasso On the Degrees of Freedom of the Lasso Hui Zou Trevor Hastie Robert Tibshirani Abstract We study the degrees of freedom of the Lasso in the framewor of Stein s unbiased ris estimation (SURE). We show that

More information

Solving Linear Systems, Continued and The Inverse of a Matrix

Solving Linear Systems, Continued and The Inverse of a Matrix , Continued and The of a Matrix Calculus III Summer 2013, Session II Monday, July 15, 2013 Agenda 1. The rank of a matrix 2. The inverse of a square matrix Gaussian Gaussian solves a linear system by reducing

More information

When is missing data recoverable?

When is missing data recoverable? When is missing data recoverable? Yin Zhang CAAM Technical Report TR06-15 Department of Computational and Applied Mathematics Rice University, Houston, TX 77005 October, 2006 Abstract Suppose a non-random

More information

Math 115A HW4 Solutions University of California, Los Angeles. 5 2i 6 + 4i. (5 2i)7i (6 + 4i)( 3 + i) = 35i + 14 ( 22 6i) = 36 + 41i.

Math 115A HW4 Solutions University of California, Los Angeles. 5 2i 6 + 4i. (5 2i)7i (6 + 4i)( 3 + i) = 35i + 14 ( 22 6i) = 36 + 41i. Math 5A HW4 Solutions September 5, 202 University of California, Los Angeles Problem 4..3b Calculate the determinant, 5 2i 6 + 4i 3 + i 7i Solution: The textbook s instructions give us, (5 2i)7i (6 + 4i)(

More information

These axioms must hold for all vectors ū, v, and w in V and all scalars c and d.

These axioms must hold for all vectors ū, v, and w in V and all scalars c and d. DEFINITION: A vector space is a nonempty set V of objects, called vectors, on which are defined two operations, called addition and multiplication by scalars (real numbers), subject to the following axioms

More information

Weakly Secure Network Coding

Weakly Secure Network Coding Weakly Secure Network Coding Kapil Bhattad, Student Member, IEEE and Krishna R. Narayanan, Member, IEEE Department of Electrical Engineering, Texas A&M University, College Station, USA Abstract In this

More information

Orthogonal Projections and Orthonormal Bases

Orthogonal Projections and Orthonormal Bases CS 3, HANDOUT -A, 3 November 04 (adjusted on 7 November 04) Orthogonal Projections and Orthonormal Bases (continuation of Handout 07 of 6 September 04) Definition (Orthogonality, length, unit vectors).

More information

Factorization Theorems

Factorization Theorems Chapter 7 Factorization Theorems This chapter highlights a few of the many factorization theorems for matrices While some factorization results are relatively direct, others are iterative While some factorization

More information

8. Linear least-squares

8. Linear least-squares 8. Linear least-squares EE13 (Fall 211-12) definition examples and applications solution of a least-squares problem, normal equations 8-1 Definition overdetermined linear equations if b range(a), cannot

More information

Effective Linear Discriminant Analysis for High Dimensional, Low Sample Size Data

Effective Linear Discriminant Analysis for High Dimensional, Low Sample Size Data Effective Linear Discriant Analysis for High Dimensional, Low Sample Size Data Zhihua Qiao, Lan Zhou and Jianhua Z. Huang Abstract In the so-called high dimensional, low sample size (HDLSS) settings, LDA

More information

About the inverse football pool problem for 9 games 1

About the inverse football pool problem for 9 games 1 Seventh International Workshop on Optimal Codes and Related Topics September 6-1, 013, Albena, Bulgaria pp. 15-133 About the inverse football pool problem for 9 games 1 Emil Kolev Tsonka Baicheva Institute

More information

Inner products on R n, and more

Inner products on R n, and more Inner products on R n, and more Peyam Ryan Tabrizian Friday, April 12th, 2013 1 Introduction You might be wondering: Are there inner products on R n that are not the usual dot product x y = x 1 y 1 + +

More information

Least Squares Estimation

Least Squares Estimation Least Squares Estimation SARA A VAN DE GEER Volume 2, pp 1041 1045 in Encyclopedia of Statistics in Behavioral Science ISBN-13: 978-0-470-86080-9 ISBN-10: 0-470-86080-4 Editors Brian S Everitt & David

More information

WHEN DOES A CROSS PRODUCT ON R n EXIST?

WHEN DOES A CROSS PRODUCT ON R n EXIST? WHEN DOES A CROSS PRODUCT ON R n EXIST? PETER F. MCLOUGHLIN It is probably safe to say that just about everyone reading this article is familiar with the cross product and the dot product. However, what

More information

Linear Algebra Methods for Data Mining

Linear Algebra Methods for Data Mining Linear Algebra Methods for Data Mining Saara Hyvönen, [email protected] Spring 2007 Lecture 3: QR, least squares, linear regression Linear Algebra Methods for Data Mining, Spring 2007, University

More information

Notes on Symmetric Matrices

Notes on Symmetric Matrices CPSC 536N: Randomized Algorithms 2011-12 Term 2 Notes on Symmetric Matrices Prof. Nick Harvey University of British Columbia 1 Symmetric Matrices We review some basic results concerning symmetric matrices.

More information

Solving Systems of Linear Equations

Solving Systems of Linear Equations LECTURE 5 Solving Systems of Linear Equations Recall that we introduced the notion of matrices as a way of standardizing the expression of systems of linear equations In today s lecture I shall show how

More information

Duality of linear conic problems

Duality of linear conic problems Duality of linear conic problems Alexander Shapiro and Arkadi Nemirovski Abstract It is well known that the optimal values of a linear programming problem and its dual are equal to each other if at least

More information

Further Study on Strong Lagrangian Duality Property for Invex Programs via Penalty Functions 1

Further Study on Strong Lagrangian Duality Property for Invex Programs via Penalty Functions 1 Further Study on Strong Lagrangian Duality Property for Invex Programs via Penalty Functions 1 J. Zhang Institute of Applied Mathematics, Chongqing University of Posts and Telecommunications, Chongqing

More information

Regularized Logistic Regression for Mind Reading with Parallel Validation

Regularized Logistic Regression for Mind Reading with Parallel Validation Regularized Logistic Regression for Mind Reading with Parallel Validation Heikki Huttunen, Jukka-Pekka Kauppi, Jussi Tohka Tampere University of Technology Department of Signal Processing Tampere, Finland

More information

Subspace Pursuit for Compressive Sensing: Closing the Gap Between Performance and Complexity

Subspace Pursuit for Compressive Sensing: Closing the Gap Between Performance and Complexity Subspace Pursuit for Compressive Sensing: Closing the Gap Between Performance and Complexity Wei Dai and Olgica Milenkovic Department of Electrical and Computer Engineering University of Illinois at Urbana-Champaign

More information

Methods for Finding Bases

Methods for Finding Bases Methods for Finding Bases Bases for the subspaces of a matrix Row-reduction methods can be used to find bases. Let us now look at an example illustrating how to obtain bases for the row space, null space,

More information

Discussion on the paper Hypotheses testing by convex optimization by A. Goldenschluger, A. Juditsky and A. Nemirovski.

Discussion on the paper Hypotheses testing by convex optimization by A. Goldenschluger, A. Juditsky and A. Nemirovski. Discussion on the paper Hypotheses testing by convex optimization by A. Goldenschluger, A. Juditsky and A. Nemirovski. Fabienne Comte, Celine Duval, Valentine Genon-Catalot To cite this version: Fabienne

More information

SMALL SKEW FIELDS CÉDRIC MILLIET

SMALL SKEW FIELDS CÉDRIC MILLIET SMALL SKEW FIELDS CÉDRIC MILLIET Abstract A division ring of positive characteristic with countably many pure types is a field Wedderburn showed in 1905 that finite fields are commutative As for infinite

More information

IRREDUCIBLE OPERATOR SEMIGROUPS SUCH THAT AB AND BA ARE PROPORTIONAL. 1. Introduction

IRREDUCIBLE OPERATOR SEMIGROUPS SUCH THAT AB AND BA ARE PROPORTIONAL. 1. Introduction IRREDUCIBLE OPERATOR SEMIGROUPS SUCH THAT AB AND BA ARE PROPORTIONAL R. DRNOVŠEK, T. KOŠIR Dedicated to Prof. Heydar Radjavi on the occasion of his seventieth birthday. Abstract. Let S be an irreducible

More information

Similarity and Diagonalization. Similar Matrices

Similarity and Diagonalization. Similar Matrices MATH022 Linear Algebra Brief lecture notes 48 Similarity and Diagonalization Similar Matrices Let A and B be n n matrices. We say that A is similar to B if there is an invertible n n matrix P such that

More information

α = u v. In other words, Orthogonal Projection

α = u v. In other words, Orthogonal Projection Orthogonal Projection Given any nonzero vector v, it is possible to decompose an arbitrary vector u into a component that points in the direction of v and one that points in a direction orthogonal to v

More information

MATH10212 Linear Algebra. Systems of Linear Equations. Definition. An n-dimensional vector is a row or a column of n numbers (or letters): a 1.

MATH10212 Linear Algebra. Systems of Linear Equations. Definition. An n-dimensional vector is a row or a column of n numbers (or letters): a 1. MATH10212 Linear Algebra Textbook: D. Poole, Linear Algebra: A Modern Introduction. Thompson, 2006. ISBN 0-534-40596-7. Systems of Linear Equations Definition. An n-dimensional vector is a row or a column

More information

Adaptive Online Gradient Descent

Adaptive Online Gradient Descent Adaptive Online Gradient Descent Peter L Bartlett Division of Computer Science Department of Statistics UC Berkeley Berkeley, CA 94709 bartlett@csberkeleyedu Elad Hazan IBM Almaden Research Center 650

More information

F. ABTAHI and M. ZARRIN. (Communicated by J. Goldstein)

F. ABTAHI and M. ZARRIN. (Communicated by J. Goldstein) Journal of Algerian Mathematical Society Vol. 1, pp. 1 6 1 CONCERNING THE l p -CONJECTURE FOR DISCRETE SEMIGROUPS F. ABTAHI and M. ZARRIN (Communicated by J. Goldstein) Abstract. For 2 < p

More information

6. Cholesky factorization

6. Cholesky factorization 6. Cholesky factorization EE103 (Fall 2011-12) triangular matrices forward and backward substitution the Cholesky factorization solving Ax = b with A positive definite inverse of a positive definite matrix

More information

Randomized Robust Linear Regression for big data applications

Randomized Robust Linear Regression for big data applications Randomized Robust Linear Regression for big data applications Yannis Kopsinis 1 Dept. of Informatics & Telecommunications, UoA Thursday, Apr 16, 2015 In collaboration with S. Chouvardas, Harris Georgiou,

More information

SHARP BOUNDS FOR THE SUM OF THE SQUARES OF THE DEGREES OF A GRAPH

SHARP BOUNDS FOR THE SUM OF THE SQUARES OF THE DEGREES OF A GRAPH 31 Kragujevac J. Math. 25 (2003) 31 49. SHARP BOUNDS FOR THE SUM OF THE SQUARES OF THE DEGREES OF A GRAPH Kinkar Ch. Das Department of Mathematics, Indian Institute of Technology, Kharagpur 721302, W.B.,

More information

RINGS WITH A POLYNOMIAL IDENTITY

RINGS WITH A POLYNOMIAL IDENTITY RINGS WITH A POLYNOMIAL IDENTITY IRVING KAPLANSKY 1. Introduction. In connection with his investigation of projective planes, M. Hall [2, Theorem 6.2]* proved the following theorem: a division ring D in

More information

2.3 Convex Constrained Optimization Problems

2.3 Convex Constrained Optimization Problems 42 CHAPTER 2. FUNDAMENTAL CONCEPTS IN CONVEX OPTIMIZATION Theorem 15 Let f : R n R and h : R R. Consider g(x) = h(f(x)) for all x R n. The function g is convex if either of the following two conditions

More information

T ( a i x i ) = a i T (x i ).

T ( a i x i ) = a i T (x i ). Chapter 2 Defn 1. (p. 65) Let V and W be vector spaces (over F ). We call a function T : V W a linear transformation form V to W if, for all x, y V and c F, we have (a) T (x + y) = T (x) + T (y) and (b)

More information

COMMUTATIVE RINGS. Definition: A domain is a commutative ring R that satisfies the cancellation law for multiplication:

COMMUTATIVE RINGS. Definition: A domain is a commutative ring R that satisfies the cancellation law for multiplication: COMMUTATIVE RINGS Definition: A commutative ring R is a set with two operations, addition and multiplication, such that: (i) R is an abelian group under addition; (ii) ab = ba for all a, b R (commutative

More information

Model selection in R featuring the lasso. Chris Franck LISA Short Course March 26, 2013

Model selection in R featuring the lasso. Chris Franck LISA Short Course March 26, 2013 Model selection in R featuring the lasso Chris Franck LISA Short Course March 26, 2013 Goals Overview of LISA Classic data example: prostate data (Stamey et. al) Brief review of regression and model selection.

More information

Matrix Representations of Linear Transformations and Changes of Coordinates

Matrix Representations of Linear Transformations and Changes of Coordinates Matrix Representations of Linear Transformations and Changes of Coordinates 01 Subspaces and Bases 011 Definitions A subspace V of R n is a subset of R n that contains the zero element and is closed under

More information

Mathematics Course 111: Algebra I Part IV: Vector Spaces

Mathematics Course 111: Algebra I Part IV: Vector Spaces Mathematics Course 111: Algebra I Part IV: Vector Spaces D. R. Wilkins Academic Year 1996-7 9 Vector Spaces A vector space over some field K is an algebraic structure consisting of a set V on which are

More information

Abstract: We describe the beautiful LU factorization of a square matrix (or how to write Gaussian elimination in terms of matrix multiplication).

Abstract: We describe the beautiful LU factorization of a square matrix (or how to write Gaussian elimination in terms of matrix multiplication). MAT 2 (Badger, Spring 202) LU Factorization Selected Notes September 2, 202 Abstract: We describe the beautiful LU factorization of a square matrix (or how to write Gaussian elimination in terms of matrix

More information

3. Linear Programming and Polyhedral Combinatorics

3. Linear Programming and Polyhedral Combinatorics Massachusetts Institute of Technology Handout 6 18.433: Combinatorial Optimization February 20th, 2009 Michel X. Goemans 3. Linear Programming and Polyhedral Combinatorics Summary of what was seen in the

More information

ON THE DEGREES OF FREEDOM OF THE LASSO

ON THE DEGREES OF FREEDOM OF THE LASSO The Annals of Statistics 2007, Vol. 35, No. 5, 2173 2192 DOI: 10.1214/009053607000000127 Institute of Mathematical Statistics, 2007 ON THE DEGREES OF FREEDOM OF THE LASSO BY HUI ZOU, TREVOR HASTIE AND

More information

I. GROUPS: BASIC DEFINITIONS AND EXAMPLES

I. GROUPS: BASIC DEFINITIONS AND EXAMPLES I GROUPS: BASIC DEFINITIONS AND EXAMPLES Definition 1: An operation on a set G is a function : G G G Definition 2: A group is a set G which is equipped with an operation and a special element e G, called

More information

Lasso on Categorical Data

Lasso on Categorical Data Lasso on Categorical Data Yunjin Choi, Rina Park, Michael Seo December 14, 2012 1 Introduction In social science studies, the variables of interest are often categorical, such as race, gender, and nationality.

More information

Section 6.1 - Inner Products and Norms

Section 6.1 - Inner Products and Norms Section 6.1 - Inner Products and Norms Definition. Let V be a vector space over F {R, C}. An inner product on V is a function that assigns, to every ordered pair of vectors x and y in V, a scalar in F,

More information

Variance Reduction. Pricing American Options. Monte Carlo Option Pricing. Delta and Common Random Numbers

Variance Reduction. Pricing American Options. Monte Carlo Option Pricing. Delta and Common Random Numbers Variance Reduction The statistical efficiency of Monte Carlo simulation can be measured by the variance of its output If this variance can be lowered without changing the expected value, fewer replications

More information

Lecture 5 Least-squares

Lecture 5 Least-squares EE263 Autumn 2007-08 Stephen Boyd Lecture 5 Least-squares least-squares (approximate) solution of overdetermined equations projection and orthogonality principle least-squares estimation BLUE property

More information

MATH1231 Algebra, 2015 Chapter 7: Linear maps

MATH1231 Algebra, 2015 Chapter 7: Linear maps MATH1231 Algebra, 2015 Chapter 7: Linear maps A/Prof. Daniel Chan School of Mathematics and Statistics University of New South Wales [email protected] Daniel Chan (UNSW) MATH1231 Algebra 1 / 43 Chapter

More information

1 Norms and Vector Spaces

1 Norms and Vector Spaces 008.10.07.01 1 Norms and Vector Spaces Suppose we have a complex vector space V. A norm is a function f : V R which satisfies (i) f(x) 0 for all x V (ii) f(x + y) f(x) + f(y) for all x,y V (iii) f(λx)

More information

Modern Optimization Methods for Big Data Problems MATH11146 The University of Edinburgh

Modern Optimization Methods for Big Data Problems MATH11146 The University of Edinburgh Modern Optimization Methods for Big Data Problems MATH11146 The University of Edinburgh Peter Richtárik Week 3 Randomized Coordinate Descent With Arbitrary Sampling January 27, 2016 1 / 30 The Problem

More information

Inner Product Spaces

Inner Product Spaces Math 571 Inner Product Spaces 1. Preliminaries An inner product space is a vector space V along with a function, called an inner product which associates each pair of vectors u, v with a scalar u, v, and

More information

Notes from February 11

Notes from February 11 Notes from February 11 Math 130 Course web site: www.courses.fas.harvard.edu/5811 Two lemmas Before proving the theorem which was stated at the end of class on February 8, we begin with two lemmas. The

More information

1 Introduction to Matrices

1 Introduction to Matrices 1 Introduction to Matrices In this section, important definitions and results from matrix algebra that are useful in regression analysis are introduced. While all statements below regarding the columns

More information

MATHEMATICAL METHODS OF STATISTICS

MATHEMATICAL METHODS OF STATISTICS MATHEMATICAL METHODS OF STATISTICS By HARALD CRAMER TROFESSOK IN THE UNIVERSITY OF STOCKHOLM Princeton PRINCETON UNIVERSITY PRESS 1946 TABLE OF CONTENTS. First Part. MATHEMATICAL INTRODUCTION. CHAPTERS

More information

9 Multiplication of Vectors: The Scalar or Dot Product

9 Multiplication of Vectors: The Scalar or Dot Product Arkansas Tech University MATH 934: Calculus III Dr. Marcel B Finan 9 Multiplication of Vectors: The Scalar or Dot Product Up to this point we have defined what vectors are and discussed basic notation

More information

Orthogonal Projections

Orthogonal Projections Orthogonal Projections and Reflections (with exercises) by D. Klain Version.. Corrections and comments are welcome! Orthogonal Projections Let X,..., X k be a family of linearly independent (column) vectors

More information

By choosing to view this document, you agree to all provisions of the copyright laws protecting it.

By choosing to view this document, you agree to all provisions of the copyright laws protecting it. This material is posted here with permission of the IEEE Such permission of the IEEE does not in any way imply IEEE endorsement of any of Helsinki University of Technology's products or services Internal

More information

GI01/M055 Supervised Learning Proximal Methods

GI01/M055 Supervised Learning Proximal Methods GI01/M055 Supervised Learning Proximal Methods Massimiliano Pontil (based on notes by Luca Baldassarre) (UCL) Proximal Methods 1 / 20 Today s Plan Problem setting Convex analysis concepts Proximal operators

More information

1 Sets and Set Notation.

1 Sets and Set Notation. LINEAR ALGEBRA MATH 27.6 SPRING 23 (COHEN) LECTURE NOTES Sets and Set Notation. Definition (Naive Definition of a Set). A set is any collection of objects, called the elements of that set. We will most

More information

On Mardia s Tests of Multinormality

On Mardia s Tests of Multinormality On Mardia s Tests of Multinormality Kankainen, A., Taskinen, S., Oja, H. Abstract. Classical multivariate analysis is based on the assumption that the data come from a multivariate normal distribution.

More information

Inner product. Definition of inner product

Inner product. Definition of inner product Math 20F Linear Algebra Lecture 25 1 Inner product Review: Definition of inner product. Slide 1 Norm and distance. Orthogonal vectors. Orthogonal complement. Orthogonal basis. Definition of inner product

More information

Notes V General Equilibrium: Positive Theory. 1 Walrasian Equilibrium and Excess Demand

Notes V General Equilibrium: Positive Theory. 1 Walrasian Equilibrium and Excess Demand Notes V General Equilibrium: Positive Theory In this lecture we go on considering a general equilibrium model of a private ownership economy. In contrast to the Notes IV, we focus on positive issues such

More information

1 0 5 3 3 A = 0 0 0 1 3 0 0 0 0 0 0 0 0 0 0

1 0 5 3 3 A = 0 0 0 1 3 0 0 0 0 0 0 0 0 0 0 Solutions: Assignment 4.. Find the redundant column vectors of the given matrix A by inspection. Then find a basis of the image of A and a basis of the kernel of A. 5 A The second and third columns are

More information

THE FUNDAMENTAL THEOREM OF ALGEBRA VIA PROPER MAPS

THE FUNDAMENTAL THEOREM OF ALGEBRA VIA PROPER MAPS THE FUNDAMENTAL THEOREM OF ALGEBRA VIA PROPER MAPS KEITH CONRAD 1. Introduction The Fundamental Theorem of Algebra says every nonconstant polynomial with complex coefficients can be factored into linear

More information

Package bigdata. R topics documented: February 19, 2015

Package bigdata. R topics documented: February 19, 2015 Type Package Title Big Data Analytics Version 0.1 Date 2011-02-12 Author Han Liu, Tuo Zhao Maintainer Han Liu Depends glmnet, Matrix, lattice, Package bigdata February 19, 2015 The

More information

1 Introduction. Linear Programming. Questions. A general optimization problem is of the form: choose x to. max f(x) subject to x S. where.

1 Introduction. Linear Programming. Questions. A general optimization problem is of the form: choose x to. max f(x) subject to x S. where. Introduction Linear Programming Neil Laws TT 00 A general optimization problem is of the form: choose x to maximise f(x) subject to x S where x = (x,..., x n ) T, f : R n R is the objective function, S

More information

BLOCKWISE SPARSE REGRESSION

BLOCKWISE SPARSE REGRESSION Statistica Sinica 16(2006), 375-390 BLOCKWISE SPARSE REGRESSION Yuwon Kim, Jinseog Kim and Yongdai Kim Seoul National University Abstract: Yuan an Lin (2004) proposed the grouped LASSO, which achieves

More information

MATH 590: Meshfree Methods

MATH 590: Meshfree Methods MATH 590: Meshfree Methods Chapter 7: Conditionally Positive Definite Functions Greg Fasshauer Department of Applied Mathematics Illinois Institute of Technology Fall 2010 [email protected] MATH 590 Chapter

More information

Continued Fractions and the Euclidean Algorithm

Continued Fractions and the Euclidean Algorithm Continued Fractions and the Euclidean Algorithm Lecture notes prepared for MATH 326, Spring 997 Department of Mathematics and Statistics University at Albany William F Hammond Table of Contents Introduction

More information

Sales forecasting # 2

Sales forecasting # 2 Sales forecasting # 2 Arthur Charpentier [email protected] 1 Agenda Qualitative and quantitative methods, a very general introduction Series decomposition Short versus long term forecasting

More information

THE DIMENSION OF A VECTOR SPACE

THE DIMENSION OF A VECTOR SPACE THE DIMENSION OF A VECTOR SPACE KEITH CONRAD This handout is a supplementary discussion leading up to the definition of dimension and some of its basic properties. Let V be a vector space over a field

More information

OPTIMAL DESIGN OF DISTRIBUTED SENSOR NETWORKS FOR FIELD RECONSTRUCTION

OPTIMAL DESIGN OF DISTRIBUTED SENSOR NETWORKS FOR FIELD RECONSTRUCTION OPTIMAL DESIGN OF DISTRIBUTED SENSOR NETWORKS FOR FIELD RECONSTRUCTION Sérgio Pequito, Stephen Kruzick, Soummya Kar, José M. F. Moura, A. Pedro Aguiar Department of Electrical and Computer Engineering

More information

Ideal Class Group and Units

Ideal Class Group and Units Chapter 4 Ideal Class Group and Units We are now interested in understanding two aspects of ring of integers of number fields: how principal they are (that is, what is the proportion of principal ideals

More information

FINITE DIMENSIONAL ORDERED VECTOR SPACES WITH RIESZ INTERPOLATION AND EFFROS-SHEN S UNIMODULARITY CONJECTURE AARON TIKUISIS

FINITE DIMENSIONAL ORDERED VECTOR SPACES WITH RIESZ INTERPOLATION AND EFFROS-SHEN S UNIMODULARITY CONJECTURE AARON TIKUISIS FINITE DIMENSIONAL ORDERED VECTOR SPACES WITH RIESZ INTERPOLATION AND EFFROS-SHEN S UNIMODULARITY CONJECTURE AARON TIKUISIS Abstract. It is shown that, for any field F R, any ordered vector space structure

More information

Principle Component Analysis and Partial Least Squares: Two Dimension Reduction Techniques for Regression

Principle Component Analysis and Partial Least Squares: Two Dimension Reduction Techniques for Regression Principle Component Analysis and Partial Least Squares: Two Dimension Reduction Techniques for Regression Saikat Maitra and Jun Yan Abstract: Dimension reduction is one of the major tasks for multivariate

More information

What is Linear Programming?

What is Linear Programming? Chapter 1 What is Linear Programming? An optimization problem usually has three essential ingredients: a variable vector x consisting of a set of unknowns to be determined, an objective function of x to

More information

SECRET sharing schemes were introduced by Blakley [5]

SECRET sharing schemes were introduced by Blakley [5] 206 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 52, NO. 1, JANUARY 2006 Secret Sharing Schemes From Three Classes of Linear Codes Jin Yuan Cunsheng Ding, Senior Member, IEEE Abstract Secret sharing has

More information

Linear Algebra. A vector space (over R) is an ordered quadruple. such that V is a set; 0 V ; and the following eight axioms hold:

Linear Algebra. A vector space (over R) is an ordered quadruple. such that V is a set; 0 V ; and the following eight axioms hold: Linear Algebra A vector space (over R) is an ordered quadruple (V, 0, α, µ) such that V is a set; 0 V ; and the following eight axioms hold: α : V V V and µ : R V V ; (i) α(α(u, v), w) = α(u, α(v, w)),

More information

Big Data Techniques Applied to Very Short-term Wind Power Forecasting

Big Data Techniques Applied to Very Short-term Wind Power Forecasting Big Data Techniques Applied to Very Short-term Wind Power Forecasting Ricardo Bessa Senior Researcher ([email protected]) Center for Power and Energy Systems, INESC TEC, Portugal Joint work with

More information

4: SINGLE-PERIOD MARKET MODELS

4: SINGLE-PERIOD MARKET MODELS 4: SINGLE-PERIOD MARKET MODELS Ben Goldys and Marek Rutkowski School of Mathematics and Statistics University of Sydney Semester 2, 2015 B. Goldys and M. Rutkowski (USydney) Slides 4: Single-Period Market

More information

TWO L 1 BASED NONCONVEX METHODS FOR CONSTRUCTING SPARSE MEAN REVERTING PORTFOLIOS

TWO L 1 BASED NONCONVEX METHODS FOR CONSTRUCTING SPARSE MEAN REVERTING PORTFOLIOS TWO L 1 BASED NONCONVEX METHODS FOR CONSTRUCTING SPARSE MEAN REVERTING PORTFOLIOS XIAOLONG LONG, KNUT SOLNA, AND JACK XIN Abstract. We study the problem of constructing sparse and fast mean reverting portfolios.

More information

Background: State Estimation

Background: State Estimation State Estimation Cyber Security of the Smart Grid Dr. Deepa Kundur Background: State Estimation University of Toronto Dr. Deepa Kundur (University of Toronto) Cyber Security of the Smart Grid 1 / 81 Dr.

More information

Chapter 7. Continuity

Chapter 7. Continuity Chapter 7 Continuity There are many processes and eects that depends on certain set of variables in such a way that a small change in these variables acts as small change in the process. Changes of this

More information

A Negative Result Concerning Explicit Matrices With The Restricted Isometry Property

A Negative Result Concerning Explicit Matrices With The Restricted Isometry Property A Negative Result Concerning Explicit Matrices With The Restricted Isometry Property Venkat Chandar March 1, 2008 Abstract In this note, we prove that matrices whose entries are all 0 or 1 cannot achieve

More information

(Quasi-)Newton methods

(Quasi-)Newton methods (Quasi-)Newton methods 1 Introduction 1.1 Newton method Newton method is a method to find the zeros of a differentiable non-linear function g, x such that g(x) = 0, where g : R n R n. Given a starting

More information

Row Ideals and Fibers of Morphisms

Row Ideals and Fibers of Morphisms Michigan Math. J. 57 (2008) Row Ideals and Fibers of Morphisms David Eisenbud & Bernd Ulrich Affectionately dedicated to Mel Hochster, who has been an inspiration to us for many years, on the occasion

More information

Pattern Analysis. Logistic Regression. 12. Mai 2009. Joachim Hornegger. Chair of Pattern Recognition Erlangen University

Pattern Analysis. Logistic Regression. 12. Mai 2009. Joachim Hornegger. Chair of Pattern Recognition Erlangen University Pattern Analysis Logistic Regression 12. Mai 2009 Joachim Hornegger Chair of Pattern Recognition Erlangen University Pattern Analysis 2 / 43 1 Logistic Regression Posteriors and the Logistic Function Decision

More information