Nonconvex? NP! (No Problem!) Ryan Tibshirani Convex Optimization 10-725/36-725


1 Nonconvex? NP! (No Problem!) Ryan Tibshirani Convex Optimization 10-725/36-725 1

2 Beyond the tip? 2

3 Some takeaway points
If possible, formulate the task in terms of convex optimization: typically easier to solve, easier to analyze.
Nonconvex does not necessarily mean nonscientific! However, statistically, it does typically mean high(er) variance.
In more cases than you might expect, nonconvex problems can be solved exactly (to global optimality). 3

4 What does it mean for a problem to be nonconvex? Consider a generic optimization problem:

min_x f(x)
subject to h_i(x) <= 0, i = 1, ..., m
           l_j(x) = 0, j = 1, ..., r

This is a convex problem if f and h_i, i = 1, ..., m are convex, and l_j, j = 1, ..., r are affine. A nonconvex problem is one of this form, where not all conditions are met on the functions. But trivial modifications of convex problems can lead to nonconvex formulations... so we really just consider nonconvex problems that are not trivially equivalent to convex ones 4

5 What does it mean to solve a nonconvex problem? Nonconvex problems can have local minima, i.e., there can exist a feasible x such that

f(y) >= f(x) for all feasible y with ||x - y||_2 <= R,

but x is still not globally optimal. (Note: we proved that this could not happen for convex problems.) Hence by solving a nonconvex problem, we mean finding the global minimizer. We also implicitly mean doing it efficiently, i.e., in polynomial time 5

6 Addendum This is really about putting together a list of cool problems that are surprisingly tractable... hence there will be exceptions about nonconvexity and/or requiring exact global optima. (Also, I'm sure that there are many more examples out there that I'm missing, so I invite you to give me ideas / contribute!) 6

7 Outline Rough categories for today's problems: Classic nonconvex problems, Eigen problems, Graph problems, Nonconvex proximal operators, Discrete problems, Infinite-dimensional problems, Statistical problems 7

8 Classic nonconvex problems 8

9 Linear-fractional programs A linear-fractional program is of the form

min_{x in R^n}  (c^T x + d) / (e^T x + f)
subject to Gx <= h, e^T x + f > 0, Ax = b

This is nonconvex (but quasiconvex). Provided that this problem is feasible, it is in fact equivalent to the linear program

min_{y in R^n, z in R}  c^T y + d z
subject to Gy - hz <= 0, z >= 0, Ay - bz = 0, e^T y + fz = 1
9

10 The link between the two problems is the transformation

y = x / (e^T x + f),   z = 1 / (e^T x + f)

The proof of their equivalence is simple; e.g., see B & V Chapter 4. Linear-fractional problems show up in the study of solution paths for many common statistical estimation problems. The knots in the lasso path (values of λ at which a coefficient is made nonzero) can be seen as the optimal values of linear-fractional programs. E.g., see Taylor et al. (2013), Inference in adaptive regression via the Kac-Rice formula 10
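
To make the reduction concrete, here is a small numerical sketch using scipy.optimize.linprog: the LP in (y, z) is solved and x is recovered as y/z. The problem data (c, d, e, f, G, h, A, b) below are made-up placeholders, not from the slides.

```python
# Linear-fractional program solved via its LP reformulation (sketch with toy data).
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
n = 4
c, d = rng.standard_normal(n), 0.5
e, f = rng.standard_normal(n), 5.0
G, h = np.vstack([np.eye(n), -np.eye(n)]), np.ones(2 * n)  # box constraints -1 <= x <= 1
A, b = np.ones((1, n)), np.array([1.0])                    # sum(x) = 1

# Variables w = (y, z). Objective: c^T y + d z.
obj = np.concatenate([c, [d]])
A_ub = np.hstack([G, -h[:, None]])            # G y - h z <= 0
b_ub = np.zeros(G.shape[0])
A_eq = np.vstack([np.hstack([A, -b[:, None]]),             # A y - b z = 0
                  np.concatenate([e, [f]])[None, :]])      # e^T y + f z = 1
b_eq = np.concatenate([np.zeros(A.shape[0]), [1.0]])
bounds = [(None, None)] * n + [(0, None)]     # y free, z >= 0

res = linprog(obj, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
y_opt, z_opt = res.x[:n], res.x[n]
x_opt = y_opt / z_opt                         # map back to the original variable
print(x_opt, res.fun)
```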

11 Geometric programs A monomial is a function f : R^n_{++} -> R of the form

f(x) = γ x_1^{a_1} x_2^{a_2} ... x_n^{a_n}

for γ > 0 and a_1, ..., a_n in R. A posynomial is a sum of monomials,

f(x) = Σ_{k=1}^p γ_k x_1^{a_{k1}} x_2^{a_{k2}} ... x_n^{a_{kn}}

A geometric program is of the form

min f(x)
subject to g_i(x) <= 1, i = 1, ..., m
           h_j(x) = 1, j = 1, ..., r

where f and g_i, i = 1, ..., m are posynomials and h_j, j = 1, ..., r are monomials. This is nonconvex 11

12 This is equivalent to a convex problem, via a simple transformation. Given f(x) = γ x_1^{a_1} x_2^{a_2} ... x_n^{a_n}, let y_i = log x_i and rewrite this as

γ (e^{y_1})^{a_1} (e^{y_2})^{a_2} ... (e^{y_n})^{a_n} = e^{a^T y + b}

for b = log γ. Also, a posynomial can be written as Σ_{k=1}^p e^{a_k^T y + b_k}. With this variable substitution, and after taking logs, a geometric program is equivalent to

min_y  log( Σ_{k=1}^{p_0} e^{a_{0k}^T y + b_{0k}} )
subject to  log( Σ_{k=1}^{p_i} e^{a_{ik}^T y + b_{ik}} ) <= 0, i = 1, ..., m
            c_j^T y + d_j = 0, j = 1, ..., r

This is convex, recalling the convexity of soft max functions 12
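
As a toy illustration of the transformation (my own example, not one from the slides): the GP "minimize 1/(x1 x2) subject to x1 + x2 <= 1, x > 0" becomes, in y = log x variables, the convex problem "minimize -(y1 + y2) subject to log(e^{y1} + e^{y2}) <= 0", which a generic solver handles easily.

```python
# Solve a tiny geometric program through its log-space convex reformulation.
import numpy as np
from scipy.optimize import minimize
from scipy.special import logsumexp

obj = lambda y: -(y[0] + y[1])                              # log of the monomial objective 1/(x1*x2)
con = {"type": "ineq", "fun": lambda y: -logsumexp(y)}      # log(e^y1 + e^y2) <= 0

res = minimize(obj, x0=np.array([-1.0, -1.0]), constraints=[con])
x = np.exp(res.x)
print(x)    # both entries approach 0.5, the GP optimum
```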

13 Many interesting problems are geometric programs; see Boyd et al. (2007), A tutorial on geometric programming, and also Chapter 8.8 of the B & V book.

[Figure 8.18 of B & V: Floor planning problem. Non-overlapping rectangular cells are placed in a rectangle with width W, height H, and lower left corner at (0, 0). The ith cell is specified by its width w_i, height h_i, and the coordinates of its lower left corner, (x_i, y_i).]

Extension to matrix world: Sra and Hosseini (2013), Geometric optimization on positive definite matrices with application to elliptically contoured distributions 13

14 Handling convex equality constraints Given convex f and h_i, i = 1, ..., m, the problem

min_x f(x)
subject to h_i(x) <= 0, i = 1, ..., m
           l(x) = 0

is nonconvex when l is convex but not affine. A convex relaxation of this problem is

min_x f(x)
subject to h_i(x) <= 0, i = 1, ..., m
           l(x) <= 0

If we can ensure that l(x*) = 0 at any solution x* of the relaxed problem, then the two are equivalent 14

15 From B & V Exercises 4.6 and 4.58, e.g., consider the maximum utility problem

max_{x_0, ..., x_T in R, b_1, ..., b_{T+1} in R}  Σ_{t=0}^T α_t u(x_t)
subject to b_{t+1} = b_t + f(b_t) - x_t, t = 0, ..., T
           0 <= x_t <= b_t, t = 0, ..., T

where b_0 >= 0 is fixed. Interpretation: x_t is the amount spent of your total available money b_t at time t; concave function u gives utility, concave function f measures investment return. This is not a convex problem, because of the equality constraint; but we can relax to

b_{t+1} <= b_t + f(b_t) - x_t, t = 0, ..., T

without changing the solution (think about throwing out money) 15
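
A hedged cvxpy sketch of the relaxed problem, with my own concave choices u(x) = sqrt(x) and f(b) = sqrt(b), an arbitrary horizon, and an arbitrary initial wealth; none of these specifics come from the slides.

```python
# Relaxed utility-maximization problem as a convex program (sketch).
import cvxpy as cp
import numpy as np

T, b0, alpha = 10, 10.0, 0.95
x = cp.Variable(T + 1, nonneg=True)   # spending x_0, ..., x_T
b = cp.Variable(T + 2, nonneg=True)   # available money b_0, ..., b_{T+1}

constraints = [b[0] == b0, x <= b[:T + 1]]
for t in range(T + 1):
    # relaxed budget dynamics: b_{t+1} <= b_t + f(b_t) - x_t
    constraints.append(b[t + 1] <= b[t] + cp.sqrt(b[t]) - x[t])

objective = cp.Maximize(cp.sum(cp.multiply(alpha ** np.arange(T + 1), cp.sqrt(x))))
cp.Problem(objective, constraints).solve()
print(x.value)   # at a solution the relaxed budget constraints hold with equality
```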

16 Problems with two quadratic functions Consider the problem involving two quadratics

min_{x in R^n}  x^T A_0 x + 2 b_0^T x + c_0
subject to x^T A_1 x + 2 b_1^T x + c_1 <= 0

Here A_0, A_1 need not be positive definite, so this is nonconvex. The dual problem can be cast as

max_{u in R, v in R}  u
subject to [[ A_0 + v A_1, b_0 + v b_1 ], [ (b_0 + v b_1)^T, c_0 + v c_1 - u ]] is positive semidefinite
           v >= 0

and (as always) is convex. Furthermore, strong duality holds. See Appendix B of B & V, see also Beck and Eldar (2006), Strong duality in nonconvex quadratic optimization with two quadratic constraints 16

17 Eigen problems 17

18 Principal component analysis Given a matrix X in R^{n x p}, consider the nonconvex problem

min_{R in R^{n x p}} ||X - R||_F^2  subject to rank(R) = k

for some fixed k. The solution here is given by the singular value decomposition of X: if X = U D V^T, then

ˆR = U_k D_k V_k^T,

where U_k, V_k are the first k columns of U, V, and D_k is the first k diagonal elements of D. I.e., ˆR is the reconstruction of X from its first k principal components. This is often called the Eckart-Young Theorem, established in 1936, but was probably known even earlier; see Stewart (1992), On the early history of the singular value decomposition 18
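
A quick numpy check of this statement: the rank-k SVD truncation attains the optimal Frobenius error, which equals the sum of squared trailing singular values. The data here are random.

```python
# Eckart-Young: best rank-k approximation via truncated SVD.
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 20))
k = 5

U, d, Vt = np.linalg.svd(X, full_matrices=False)
R_hat = U[:, :k] @ np.diag(d[:k]) @ Vt[:k, :]    # rank-k reconstruction

print(np.linalg.matrix_rank(R_hat))              # k
print(np.linalg.norm(X - R_hat, "fro") ** 2)     # optimal error ...
print(np.sum(d[k:] ** 2))                        # ... equals sum of squared trailing singular values
```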

19 Fantope Another characterization of the SVD is via the following nonconvex problem, given X in R^{n x p}:

min_{Z in S^p} ||X - XZ||_F^2  subject to rank(Z) = k, Z is a projection

or equivalently

max_{Z in S^p} ⟨X^T X, Z⟩  subject to rank(Z) = k, Z is a projection

The solution here is Ẑ = V_k V_k^T, where the columns of V_k in R^{p x k} give the first k eigenvectors of X^T X. This is equivalent to a convex problem. Express the constraint set C as

C = { Z in S^p : rank(Z) = k, Z is a projection }
  = { Z in S^p : λ_i(Z) in {0, 1} for i = 1, ..., p, tr(Z) = k }
19

20 Now consider the convex hull F_k = conv(C):

F_k = { Z in S^p : λ_i(Z) in [0, 1], i = 1, ..., p, tr(Z) = k }
    = { Z in S^p : 0 ⪯ Z ⪯ I, tr(Z) = k }

This is called the Fantope of order k. Further, the convex problem

max_{Z in S^p} ⟨X^T X, Z⟩  subject to Z in F_k

admits the same solution as the original one, i.e., Ẑ = V_k V_k^T. See Fan (1949), On a theorem of Weyl concerning eigenvalues of linear transformations, and Overton and Womersley (1992), On the sum of the largest eigenvalues of a symmetric matrix. Sparse PCA extension: Vu et al. (2013), Fantope projection and selection: near-optimal convex relaxation of sparse PCA 20
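
A small numpy sketch verifying, on random data, that Ẑ = V_k V_k^T lies in the Fantope F_k and that ⟨X^T X, Ẑ⟩ equals the sum of the top k eigenvalues of X^T X.

```python
# Check the Fantope characterization numerically.
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 10))
k = 3

S = X.T @ X
evals, evecs = np.linalg.eigh(S)          # ascending order
Vk = evecs[:, -k:]                        # top-k eigenvectors
Z_hat = Vk @ Vk.T

# membership in the Fantope: eigenvalues in [0, 1] and trace = k
z_evals = np.linalg.eigvalsh(Z_hat)
print(np.all(z_evals > -1e-8), np.all(z_evals < 1 + 1e-8), np.trace(Z_hat))
print(np.sum(S * Z_hat), np.sum(evals[-k:]))   # <X^T X, Z_hat> = sum of top-k eigenvalues
```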

21 Classical multidimensional scaling Let x_1, ..., x_n in R^p, and define similarities S_ij = (x_i - x̄)^T (x_j - x̄). For fixed k, classical multidimensional scaling or MDS solves the nonconvex problem

min_{z_1, ..., z_n in R^k} Σ_{i,j} ( S_ij - (z_i - z̄)^T (z_j - z̄) )^2

[Figure: MDS example from Hastie et al. (2009), The elements of statistical learning] 21

22 Let S be the similarity matrix (entries S_ij = (x_i - x̄)^T (x_j - x̄)). The classical MDS problem has an exact solution in terms of the eigendecomposition S = U D^2 U^T: ẑ_1, ..., ẑ_n are the rows of U_k D_k, where U_k is the first k columns of U, and D_k the first k diagonal entries of D. Note: other very similar forms of MDS are not convex, and not directly solvable, e.g., least squares scaling, with d_ij = ||x_i - x_j||_2:

min_{z_1, ..., z_n in R^k} Σ_{i,j} ( d_ij - ||z_i - z_j||_2 )^2

See Hastie et al. (2009), Chapter 14 22
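
A short numpy sketch of the exact classical MDS solution via the eigendecomposition of the centered similarity matrix; the data and the choice k = 2 are arbitrary.

```python
# Classical MDS solved exactly through the eigendecomposition of S.
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((30, 5))
k = 2

Xc = X - X.mean(axis=0)           # center the points
S = Xc @ Xc.T                     # similarities S_ij = (x_i - xbar)^T (x_j - xbar)

evals, evecs = np.linalg.eigh(S)
idx = np.argsort(evals)[::-1][:k]              # indices of the top-k eigenvalues
Z = evecs[:, idx] * np.sqrt(evals[idx])        # rows of U_k D_k are the embeddings z_1, ..., z_n
print(Z.shape)
```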

23 Generalized eigenvalue problems Given B, W in S^p with B, W ⪰ 0, consider the nonconvex problem

max_{v in R^p}  (v^T B v) / (v^T W v)

This is a generalized eigenvalue problem, with exact solution given by the top eigenvector of W^{-1} B. This is important, e.g., in Fisher's discriminant analysis, where B is the between-class covariance matrix, and W the within-class covariance matrix. See Hastie et al. (2009), Chapter 4 23
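
A sketch with scipy.linalg.eigh(B, W), which solves the generalized problem B v = λ W v directly; B and W below are random positive (semi)definite placeholders, not covariance matrices from any real data.

```python
# Maximizing the generalized Rayleigh quotient via a generalized eigenproblem.
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(0)
p = 5
MB, MW = rng.standard_normal((p, p)), rng.standard_normal((p, p))
B = MB @ MB.T                      # PSD
W = MW @ MW.T + p * np.eye(p)      # positive definite

evals, evecs = eigh(B, W)          # solves B v = lambda W v, eigenvalues ascending
v = evecs[:, -1]                   # eigenvector for the largest generalized eigenvalue
print(evals[-1], (v @ B @ v) / (v @ W @ v))   # the two values agree
```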

24 Graph problems 24

25 Min cut Given a graph G = (V, E) with V = {1, ..., n}, two nodes s, t in V, and costs c_ij >= 0 on edges (i, j) in E. Min cut problem:

min_{b in R^{|E|}, x in R^{|V|}}  Σ_{(i,j) in E} b_ij c_ij
subject to b_ij >= x_j - x_i
           b_ij, x_i, x_j in {0, 1} for all i, j
           x_s = 0, x_t = 1

Think of b_ij as the indicator that the edge (i, j) traverses the cut from s to t; think of x_i as an indicator that node i is grouped with t. This nonconvex problem can be solved exactly using max flow (max flow/min cut theorem) 25

26 A relaxation of min cut:

min_{b in R^{|E|}, x in R^{|V|}}  Σ_{(i,j) in E} b_ij c_ij
subject to b_ij >= x_j - x_i for all i, j
           b >= 0
           x_s = 0, x_t = 1

This is an LP; recall that it is the dual of the max flow LP:

max_{f in R^{|E|}}  Σ_{(s,j) in E} f_sj
subject to f_ij >= 0, f_ij <= c_ij for all (i, j) in E
           Σ_{(i,k) in E} f_ik = Σ_{(k,j) in E} f_kj for all k in V \ {s, t}

The max flow min cut theorem tells us that the relaxed min cut is tight 26
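
For a concrete check of the max flow / min cut equivalence, here is a sketch using networkx on a small made-up directed graph: the minimum cut value and the maximum flow value coincide.

```python
# Min cut solved exactly through max flow (networkx sketch on a toy graph).
import networkx as nx

G = nx.DiGraph()
edges = [("s", "a", 3.0), ("s", "b", 2.0), ("a", "b", 1.0),
         ("a", "t", 2.0), ("b", "t", 3.0)]
G.add_weighted_edges_from(edges, weight="capacity")

cut_value, (s_side, t_side) = nx.minimum_cut(G, "s", "t", capacity="capacity")
flow_value, _ = nx.maximum_flow(G, "s", "t", capacity="capacity")
print(cut_value, flow_value)   # equal, by the max flow / min cut theorem
print(s_side, t_side)          # the two sides of an optimal cut
```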

27 Shortest paths Given a graph G = (V, E), with edge costs c_e, e in E, consider the shortest path problem between two nodes s, t in V:

min over paths P = (e_1, ..., e_r)  Σ_{e in P} c_e
subject to e_{1,1} = s, e_{r,2} = t
           e_{i,2} = e_{i+1,1}, i = 1, ..., r - 1

Dijkstra's algorithm solves this problem (and more); from Dijkstra (1959), A note on two problems in connexion with graphs. Clever implementations run in O(|E| log |V|) time; e.g., see Kleinberg and Tardos (2005), Algorithm design, Chapter 5 27
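
A compact heap-based Dijkstra sketch on a made-up toy graph, in the spirit of the O(|E| log |V|)-style implementations mentioned above.

```python
# Dijkstra's algorithm with a binary heap.
import heapq

def dijkstra(adj, s):
    """adj: dict node -> list of (neighbor, nonnegative cost). Returns distances from s."""
    dist = {s: 0.0}
    heap = [(0.0, s)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue                      # stale heap entry
        for v, c in adj.get(u, []):
            nd = d + c
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

adj = {"s": [("a", 1.0), ("b", 4.0)], "a": [("b", 2.0), ("t", 6.0)], "b": [("t", 1.0)]}
print(dijkstra(adj, "s")["t"])   # 4.0, via s -> a -> b -> t
```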

28 Nonconvex proximal operators 28

29 Hard-thresholding One of the simplest nonconvex problems, given y in R^n:

min_{β in R^n} Σ_{i=1}^n (y_i - β_i)^2 + Σ_{i=1}^n λ_i 1{β_i ≠ 0}

The solution is given by hard-thresholding y,

β_i = y_i if y_i^2 > λ_i, and β_i = 0 otherwise,   i = 1, ..., n,

and can be seen by inspection. In the special case λ_i = λ, i = 1, ..., n, the problem is

min_{β in R^n} ||y - β||_2^2 + λ ||β||_0

Compare to soft-thresholding, the prox operator for the l1 penalty. Note: changing the loss to ||y - Xβ||_2^2 gives best subset selection, which is NP hard for general X 29
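
A minimal numpy sketch of hard-thresholding, with soft-thresholding alongside for comparison; the test vector and λ are arbitrary.

```python
# Hard- vs soft-thresholding for the squared-error loss sum_i (y_i - beta_i)^2.
import numpy as np

def hard_threshold(y, lam):
    # exact solution of the l0-penalized problem: keep y_i when y_i^2 > lam
    return np.where(y ** 2 > lam, y, 0.0)

def soft_threshold(y, lam):
    # solution of the l1-penalized problem with the same (unhalved) loss
    return np.sign(y) * np.maximum(np.abs(y) - lam / 2, 0.0)

y = np.array([2.0, -0.3, 0.9, -1.5, 0.1])
print(hard_threshold(y, lam=1.0))
print(soft_threshold(y, lam=1.0))
```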

30 l0 segmentation Consider the nonconvex l0 segmentation problem

min_{β in R^n} Σ_{i=1}^n (y_i - β_i)^2 + λ Σ_{i=1}^{n-1} 1{β_i ≠ β_{i+1}}

This can be solved exactly using dynamic programming, in two ways: Bellman (1961), On the approximation of curves by line segments using dynamic programming, and Johnson (2013), A dynamic programming algorithm for the fused lasso and L0-segmentation. Johnson: more efficient, Bellman: more general. Worst-case O(n^2), but with practical performance more like O(n) 30

31 Tree-leaves projection Given target u in R^n, tree g on R^n, and label y in {0, 1}, consider

min_{z in R^n} ||u - z||_2^2 + λ 1{g(z) ≠ y}

Interpretation: find z close to u, whose label under g is not unlike y. Argue directly that the solution is either ẑ = u or ẑ = P_S(u), where S = g^{-1}(y) = {z : g(z) = y}, the set of leaves of g assigned label y. We simply compute both options for ẑ and compare costs. Therefore the problem reduces to computing P_S(u), the projection onto a set of tree leaves, a highly nonconvex set. This appears as a subroutine of a broader algorithm for nonconvex optimization; see Carreira-Perpinan and Wang (2012), Distributed optimization of deeply nested systems 31

32 The set S is a union of axis-aligned boxes; projection onto any one box is fast, O(n) operations 32

33 To project onto S, we could just scan through all boxes and take the closest. Faster: decorate each node of the tree with the labels of its leaves, and a bounding box. Perform depth-first search, pruning nodes that do not contain a leaf labeled y, or whose bounding box is farther away than the current closest box 33

34 Discrete problems 34

35 Binary graph segmentation Given y in R^n, and a graph G = (V, E), V = {1, ..., n}, consider binary graph segmentation:

min_{β in {0,1}^n} Σ_{i=1}^n (y_i - β_i)^2 + Σ_{(i,j) in E} λ_ij 1{β_i ≠ β_j}

Simple manipulation brings this problem to the form

max_{A ⊆ {1,...,n}}  Σ_{i in A} a_i + Σ_{j in A^c} b_j - Σ_{(i,j) in E : |A ∩ {i,j}| = 1} λ_ij

which is a segmentation problem that can be solved exactly using min cut/max flow. E.g., Kleinberg and Tardos (2005), Algorithm design, Chapter 7 35

36 E.g., apply recursively to get a version of graph hierarchical clustering (divisive). E.g., take the graph as a 2d grid for image segmentation (image from ac.kr) 36

37 Discrete l0 segmentation Now consider discrete l0 segmentation:

min_{β in {b_1,...,b_k}^n} Σ_{i=1}^n (y_i - β_i)^2 + λ Σ_{i=1}^{n-1} 1{β_i ≠ β_{i+1}}

where {b_1, ..., b_k} is some fixed discrete set. This can be efficiently solved using classic (discrete) dynamic programming. The key insight is that the 1-dimensional structure allows us to exactly solve and store

ˆβ_1(β_2) = argmin_{β_1 in {b_1,...,b_k}} (y_1 - β_1)^2 + λ 1{β_1 ≠ β_2}   [call the criterion f_1(β_1, β_2)]
ˆβ_2(β_3) = argmin_{β_2 in {b_1,...,b_k}} f_1(ˆβ_1(β_2), β_2) + (y_2 - β_2)^2 + λ 1{β_2 ≠ β_3}
...
37

38 Algorithm: Make a forward pass over β_1, ..., β_{n-1}, keeping a look-up table; also keep a look-up table for the optimal partial criterion values f_1, ..., f_{n-1}. Solve exactly for β_n. Make a backward pass over β_{n-1}, ..., β_1, reading off the look-up table.

[Look-up tables: one for ˆβ_1, ..., ˆβ_{n-1} and one for f_1, ..., f_{n-1}, each indexed by the candidate levels b_1, ..., b_k]

Requires O(nk) operations 38
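
Below is a direct O(nk) implementation of this forward/backward dynamic program, written from the recursion above; the synthetic data, the candidate levels, and λ are my own choices.

```python
# Dynamic program for discrete l0 segmentation (forward pass + backtracking).
import numpy as np

def discrete_l0_segment(y, levels, lam):
    y, levels = np.asarray(y, float), np.asarray(levels, float)
    n, k = len(y), len(levels)
    f = np.zeros((n, k))                    # f[i, j]: best cost of y[:i+1] with beta_i = levels[j]
    argmin_prev = np.zeros((n, k), dtype=int)
    f[0] = (y[0] - levels) ** 2
    for i in range(1, n):
        best = f[i - 1].argmin()
        # either stay at the same level (no penalty) or jump from the overall best level (+ lam)
        stay, jump = f[i - 1], f[i - 1, best] + lam
        take_jump = jump < stay
        argmin_prev[i] = np.where(take_jump, best, np.arange(k))
        f[i] = (y[i] - levels) ** 2 + np.where(take_jump, jump, stay)
    # backward pass: read off the optimal levels
    beta_idx = np.empty(n, dtype=int)
    beta_idx[-1] = f[-1].argmin()
    for i in range(n - 2, -1, -1):
        beta_idx[i] = argmin_prev[i + 1, beta_idx[i + 1]]
    return levels[beta_idx]

rng = np.random.default_rng(0)
y = np.concatenate([np.full(20, 0.1), np.full(20, 0.9)]) + 0.05 * rng.standard_normal(40)
print(discrete_l0_segment(y, levels=np.linspace(0, 1, 11), lam=0.5))
```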

39 Infinite-dimensional problems 39

40 Smoothing splines Given pairs (x_i, y_i) in R x R, i = 1, ..., n, smoothing splines solve

min_f Σ_{i=1}^n ( y_i - f(x_i) )^2 + λ ∫ ( f^{((k+1)/2)}(t) )^2 dt

for a fixed odd k. The domain of minimization here is all functions f for which ∫ (f^{((k+1)/2)}(t))^2 dt < ∞. This is an infinite-dimensional problem, but convex (in function space). Can show that the solution ˆf to the above problem is unique, and given by a natural spline of order k, with knots at x_1, ..., x_n. This means we can restrict our attention to functions

f = Σ_{j=1}^n θ_j η_j

where η_1, ..., η_n are natural spline basis functions 40

41 Plugging in f = Σ_{j=1}^n θ_j η_j, transform the smoothing spline problem into finite-dimensional form:

min_{θ in R^n} ||y - Nθ||_2^2 + λ θ^T Ω θ

where N_ij = η_j(x_i), and Ω_ij = ∫ η_i^{((k+1)/2)}(t) η_j^{((k+1)/2)}(t) dt. The solution is explicitly given by

ˆθ = (N^T N + λΩ)^{-1} N^T y

and the fitted function is ˆf = Σ_{j=1}^n ˆθ_j η_j. With a proper choice of basis functions (B-splines), calculation of ˆθ is O(n). See, e.g., Wahba (1990), Spline models for observational data; Green and Silverman (1994), Nonparametric regression and generalized linear models; Hastie et al. (2009), Chapter 5 41
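
The computation is just a penalized least-squares solve. As a self-contained stand-in for constructing the natural spline basis matrices (N, Ω), here is the closely related discrete smoother that penalizes squared second differences and uses the same normal-equations formula; this substitution is mine, not the slides'.

```python
# Discrete analogue of the smoothing spline solve: theta_hat = (I + lam D^T D)^{-1} y.
import numpy as np

rng = np.random.default_rng(0)
n = 100
x = np.linspace(0, 1, n)
y = np.sin(4 * np.pi * x) + 0.3 * rng.standard_normal(n)

D = np.diff(np.eye(n), n=2, axis=0)        # (n-2) x n second-difference operator
lam = 5.0
# same structure as (N^T N + lam * Omega)^{-1} N^T y, here with N = I and Omega = D^T D
theta_hat = np.linalg.solve(np.eye(n) + lam * D.T @ D, y)
print(theta_hat[:5])
```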

42 Locally adaptive regression splines Given the same setup, locally adaptive regression splines solve

min_f Σ_{i=1}^n ( y_i - f(x_i) )^2 + λ TV(f^{(k)})

for fixed k, even or odd. The domain is all f with TV(f^{(k)}) < ∞, and again this is infinite-dimensional but convex. Again, can show that a solution ˆf to the above problem is given by a spline of order k, but with two key differences: it can have any number of knots <= n - k - 1 (tuned by λ), and the knots do not necessarily coincide with the input points x_1, ..., x_n. See Mammen and van de Geer (1997), Locally adaptive regression splines; in short, these are statistically more adaptive but computationally more challenging than smoothing splines 42

43 Mammen and van de Geer (1997) consider restricting attention to splines with knots contained in {x_1, ..., x_n}; this turns the problem into the finite-dimensional form

min_{θ in R^n} ||y - Gθ||_2^2 + λ Σ_{j=k+2}^n |θ_j|

where G_ij = g_j(x_i), and g_1, ..., g_n is a basis for splines with knots at x_1, ..., x_n. The fitted function is ˆf = Σ_{j=1}^n ˆθ_j g_j. These authors prove that the solution ˆf of this (tractable) problem and the solution f* of the original problem differ by

max_{x in [x_1, x_n]} |ˆf(x) - f*(x)| <= d_k · TV((f*)^{(k)}) · Δ^k

with Δ the maximum gap between inputs. Therefore, statistically it is reasonable to solve the finite-dimensional problem 43
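
A hedged concrete instance with k = 0 (my own choice): the basis g_j(x) = 1{x >= x_j} gives piecewise constant fits, and the finite-dimensional problem becomes 1d total variation denoising, solved here with cvxpy on synthetic data.

```python
# Finite-dimensional locally adaptive regression spline, k = 0 (piecewise constant).
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(0)
n = 80
x = np.linspace(0, 1, n)
y = (x > 0.5).astype(float) + 0.2 * rng.standard_normal(n)

G = (x[:, None] >= x[None, :]).astype(float)   # G_ij = g_j(x_i) = 1{x_i >= x_j}
theta = cp.Variable(n)
lam = 1.0
# penalize |theta_j| for j >= k + 2 = 2, i.e. theta[1:] in 0-based indexing
obj = cp.Minimize(cp.sum_squares(y - G @ theta) + lam * cp.norm1(theta[1:]))
cp.Problem(obj).solve()

fhat = G @ theta.value                          # fitted piecewise constant function
print(np.round(fhat[::10], 2))
```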

44 E.g., a comparison, tuned to the same overall model complexity: [figure, left panel: smoothing spline; right panel: finite-dimensional locally adaptive regression spline]. The left fit is easier to compute, but the right is more adaptive. (Note: trend filtering estimates are asymptotically equivalent to locally adaptive regression splines, but much cheaper to compute) 44

45 Statistical problems 45

46 Sparse underdetermined linear systems Suppose that X in R^{n x p} has unit normed columns, ||X_i||_2 = 1, for i = 1, ..., p. Given y, consider the problem of finding the sparsest solution of the linear system:

min_{β in R^p} ||β||_0  subject to Xβ = y

This is nonconvex and known to be NP hard, for a generic X. A natural convex relaxation is the l1 basis pursuit problem:

min_{β in R^p} ||β||_1  subject to Xβ = y

It turns out that there is a deep connection between the two; we cite results from Donoho (2006), For most large underdetermined systems of linear equations, the minimal l1 norm solution is also the sparsest solution 46
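
A cvxpy sketch of the l1 basis pursuit relaxation on a synthetic sparse recovery instance (random Gaussian X with unit-norm columns); on instances like this one, with sparsity well below the threshold discussed next, the l1 solution typically coincides with the sparsest one.

```python
# l1 basis pursuit on a synthetic sparse recovery problem.
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(0)
n, p, s = 50, 120, 5
X = rng.standard_normal((n, p))
X /= np.linalg.norm(X, axis=0)                  # unit-norm columns
beta_true = np.zeros(p)
beta_true[rng.choice(p, s, replace=False)] = rng.standard_normal(s)
y = X @ beta_true

beta = cp.Variable(p)
cp.Problem(cp.Minimize(cp.norm1(beta)), [X @ beta == y]).solve()
print(np.max(np.abs(beta.value - beta_true)))   # typically near zero: l1 recovers the sparse solution
```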

47 As n, p grow large, with p > n, there exists a threshold ρ (depending on the ratio p/n), such that for most matrices X, if we solve the l1 problem and find a solution with fewer than ρn nonzero components, then this is the unique solution of the l0 problem; if the solution has more than ρn nonzero components, then there is no solution of the linear system with fewer than ρn nonzero components. (Here "most" is quantified precisely in terms of a probability over matrices X, constructed by drawing the columns of X uniformly at random over the unit sphere in R^n.) There is a large and fast-moving body of related literature. See Donoho et al. (2009), Message-passing algorithms for compressed sensing for a nice review 47

48 Nearly optimal K-means Given data points x_1, ..., x_n in R^p, the K-means problem solves

min_{c_1, ..., c_K in R^p} f(c_1, ..., c_K),   where f(c_1, ..., c_K) = (1/n) Σ_{i=1}^n min_{k=1,...,K} ||x_i - c_k||_2^2

This is NP hard, and is usually approximately solved using Lloyd's algorithm, run many times, with random starts. Careful choice of the starting positions makes a big impact: running Lloyd's algorithm once, from c_1 = s_1, ..., c_K = s_K, for cleverly chosen random s_1, ..., s_K, yields estimates ĉ_1, ..., ĉ_K satisfying

E[ f(ĉ_1, ..., ĉ_K) ] <= 8 (log K + 2) · min_{c_1, ..., c_K in R^p} f(c_1, ..., c_K)
48

49 See Arthur and Vassilvitskii (2007), k-means++: The advantages of careful seeding. In fact, their construction of s_1, ..., s_K is very simple: Begin by choosing s_1 uniformly at random among x_1, ..., x_n. Compute squared distances d_i^2 = ||x_i - s_1||_2^2 for all points i not chosen, and choose s_2 by drawing from the remaining points, with probability weights d_i^2 / Σ_j d_j^2. Recompute the squared distances as

d_i^2 = min{ ||x_i - s_1||_2^2, ||x_i - s_2||_2^2 }

and choose s_3 according to the same recipe. And so on, until s_1, ..., s_K are chosen 49
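
A direct implementation of this seeding rule (the synthetic blob data are my own); Lloyd's algorithm would then be run once from the returned seeds.

```python
# k-means++ seeding: squared-distance-weighted sampling of the initial centers.
import numpy as np

def kmeans_pp_seeds(x, K, rng):
    n = x.shape[0]
    seeds = [x[rng.integers(n)]]                       # first seed uniformly at random
    for _ in range(K - 1):
        # squared distance from each point to its nearest chosen seed
        d2 = np.min([np.sum((x - s) ** 2, axis=1) for s in seeds], axis=0)
        probs = d2 / d2.sum()                          # probability weights d_i^2 / sum_j d_j^2
        seeds.append(x[rng.choice(n, p=probs)])
    return np.array(seeds)

rng = np.random.default_rng(0)
x = np.vstack([rng.standard_normal((100, 2)) + c for c in ([0, 0], [6, 0], [0, 6])])
seeds = kmeans_pp_seeds(x, K=3, rng=rng)
print(seeds)   # one seed lands near each blob with high probability
```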
