Robust Semidefinite Programming Approaches for Sensor Network Localization with Anchors


Nathan Krislock, Veronica Piccialli, Henry Wolkowicz

May 1, 2006

University of Waterloo, Department of Combinatorics and Optimization, Waterloo, Ontario N2L 3G1, Canada
Research Report CORR

Research supported by Natural Sciences Engineering Research Council Canada. ngbkrislock@uwaterloo.ca
Department of Computer and Systems Science Antonio Ruberti, University of Rome La Sapienza, Via Buonarroti 12, Rome, Italy. Veronica.Piccialli@dis.uniroma1.it
Research supported by Natural Sciences Engineering Research Council Canada. hwolkowicz@uwaterloo.ca

Key Words: Sensor Network Localization, Anchors, Graph Realization, Semidefinite Programming, Euclidean Distance Matrix Completions, Gauss-Newton Method

AMS Subject Classification:

Contents

1 Introduction
 1.1 Outline
 1.2 Problem Formulation
 1.2.1 Related Approaches
2 Background and Preliminary Theory
 2.1 Linear Transformations and Adjoints Related to EDM
 2.2 Properties of Transformations
 2.3 Invariant Cones and Perron Roots
3 SDP Relaxation
 3.1 EDMC Problem Reformulation using Matrices
 3.2 Relaxation of the Hard Quadratic Constraint
 3.3 Facial Reduction
 3.4 Further Model Development with only X, Y
4 Duality for EDMC-R

5 A Robust Primal-Dual Interior-Point Method
 5.1 Linearization
 5.2 Heuristic for an Initial Strictly Feasible Point
 5.3 Diagonal Preconditioning
6 Instability in the Model
 6.1 Case of an Accurate Relaxation
 6.2 Linear Independence of Constraints and Redundancy in Objective Function
 6.3 Instability from Linearization
 6.4 Alternative Models for Stability
 6.4.1 Minimum Norm Solutions for Underdetermined Systems
 6.4.2 Strengthened Relaxation using Lagrangian Dual
7 Numerical Tests
8 Conclusion
A Linear Transformations and Adjoints
B Composition of Transformations and Properties
 B.1 K Y : R^{rn+t(n)} → R^{t(m+n)}
 B.2 Y^* K^*
 B.3 K_W^* K_W
C Elements of λ
D Diagonal Preconditioning Details

List of Tables

 1 Increasing Noise: densityW = .75, densityL = .8, n = 15, m = 5, r = 2
 2 Convergence Rate: densityW = .80, densityL = .8, n = 20, m = 6, r = 2
 3 Decreasing density W and L: n = 15, m = 5, r = 2

List of Figures

 1 Sample random problem with 20 sensors, 12 anchors
 2 After one iteration for randomly generated problem
 3 After two iterations for randomly generated problem
 4 After eight iterations for randomly generated problem
 5 After twelve iterations for randomly generated problem
 6 After 22 iterations for randomly generated problem
 7 After 29 iterations for randomly generated problem

Abstract

We derive a robust primal-dual interior-point algorithm for a semidefinite programming, SDP, relaxation for sensor localization with anchors and with noisy distance information. The relaxation is based on finding a Euclidean Distance Matrix, EDM, that is nearest in the Frobenius norm for the known noisy distances and that satisfies given upper and lower bounds on the unknown distances. We show that the SDP relaxation for this nearest EDM problem is usually underdetermined and is an ill-posed problem. Our interior-point algorithm exploits the structure and maintains exact feasibility at each iteration. High accuracy solutions can be obtained despite the ill-conditioning of the optimality conditions. Included are discussions on the strength and stability of the SDP relaxations, as well as results on invariant cones related to the operators that map between the cones of semidefinite and Euclidean distance matrices.

1 Introduction

Many applications use ad hoc wireless sensor networks for monitoring information, e.g. for earthquake detection, ocean current flows, weather, etc. Typical networks include a large number of sensor nodes which gather data and communicate among themselves. The location of a subset of the sensors is known; these sensor nodes are called anchors. From intercommunication among sensor nodes within a given radio range, we are able to establish approximate distances between a subset of the sensors and anchors. The sensor localization problem is to determine/estimate the location of all the sensors from this partial information on the distances. For more details on the various applications, see e.g. [7, 4, 14].

We model the problem by treating it as a nearest Euclidean Distance Matrix, EDM, problem, with lower bounds formed using the radio range between pairs of nodes for which no distance exists. We also allow for existing upper bounds. When solving for the sensor locations, the problem is nonconvex and hard to solve exactly. We study the semidefinite programming, SDP, relaxation of the sensor localization problem. We project the feasible set onto the minimal face of the cone of semidefinite matrices, SDP, and show that strict feasibility holds for the primal problem. However, we illustrate that this SDP relaxation has inherent instability: it is usually underdetermined and ill-posed. This is due to both the nearest matrix model as well as the SDP relaxation. In addition, the relaxation can be strengthened using Lagrangian relaxation.

Nevertheless, we derive a robust interior-point algorithm for the projected SDP relaxation. The algorithm exploits the special structure of the linear transformations for both efficiency and robustness. We eliminate, in advance, the primal and dual linear feasibility equations. We work with an overdetermined bilinear system, rather than using a square symmetrized approach that is common for SDP. This uses the Gauss-Newton approach in [15] and involves a crossover step, after which we use the affine scaling direction with steplength one and without any backtracking to preserve positive definiteness. The search direction is found using a preconditioned conjugate gradient method, LSQR [18]. The algorithm maintains exact primal and dual feasibility throughout the iterations and attains highly accurate solutions. Our numerical tests show that, despite the ill-conditioning, we are able to obtain high accuracy in the solution of the SDP problem. In addition, we obtain surprisingly highly accurate approximations of the original sensor localization problem.

1.1 Outline

The formulation of the sensor localization problem is outlined in Section 1.2. We continue in Section 2 with background and notation, including information on the linear transformations and adjoints used in the model. In Section 2.3 we present explicit representations of the Perron root and corresponding eigenvector for the important linear transformation that maps between the cones of EDM and SDP. The SDP relaxation is presented in Section 3. We use the Frobenius norm rather than the l1 norm that is used in the literature. We then project the problem onto the minimal face in order to obtain Slater's constraint qualification (strict feasibility) and guarantee strong duality. The dual is presented in Section 4. The primal-dual interior-point (p-d i-p) algorithm is derived in Section 5. We include a heuristic for obtaining a strictly feasible starting point and the details for the diagonal preconditioning using the structure of the linear transformations. The algorithm uses a crossover technique, i.e. we use the affine scaling step without backtracking once we get sufficient decrease in the duality gap. The instability in the SDP relaxation, and different approaches for dealing with it, are discussed in Section 6. We then continue with numerical tests and concluding remarks in Sections 7 and 8. We include an appendix with the details of the composition and adjoints of the linear transformations, as well as the details of the diagonal preconditioning.

1.2 Problem Formulation

Let the n unknown sensor points be p_1, p_2, ..., p_n ∈ R^r, with r the embedding dimension; and let the m known anchor points be a_1, a_2, ..., a_m ∈ R^r. Let A^T = [a_1, a_2, ..., a_m], X^T = [p_1, p_2, ..., p_n], and define

    P := [p_1, p_2, ..., p_n, a_1, a_2, ..., a_m]^T = [X; A].    (1.1)

Note that we can always shift all the sensors and anchors so that the anchors are centered at the origin, A^T e = 0. We can then shift them all back at the end. To avoid some special trivial cases, we assume the following.

Assumption 1.1 The number of sensors and anchors, and the embedding dimension, satisfy n > m > r; A^T e = 0; and A is full column rank.

Now define N_e, N_u, N_l, respectively, to be the index sets of specified distance values, upper bounds, and lower bounds of the distances d_ij between pairs of nodes from the sensors {p_i}_{i=1}^n; and let M_e, M_u, M_l denote the same for distances between a node from the sensors {p_i}_{i=1}^n and a node from the anchors {a_k}_{k=1}^m. Define the partial Euclidean Distance Matrix E with elements

    E_ij = d_ij^2  if ij ∈ N_e ∪ M_e,
           0       otherwise.

The underlying graph is

    G = (V, E),    (1.2)

with node set {1, ..., m + n} and edge set N_e ∪ M_e. Similarly, we define the matrix of squared upper distance bounds U and the matrix of squared lower distance bounds L for ij ∈ N_u ∪ M_u and ij ∈ N_l ∪ M_l, respectively.
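To make the data objects above concrete, the following is a small simulation sketch of this setup (the sizes, random model and radio-range rule are illustrative assumptions, not the paper's test generator):

    import numpy as np

    # Simulated instance of the Section 1.2 setup (illustrative names and conventions).
    rng = np.random.default_rng(0)
    r, n, m = 2, 15, 5
    A = rng.uniform(-1, 1, (m, r)); A -= A.mean(axis=0)     # anchors centered: A^T e = 0
    X = rng.uniform(-1, 1, (n, r))                          # unknown sensor positions
    P = np.vstack([X, A])                                   # P = [X; A] as in (1.1)

    radio_range = 0.8
    D = np.square(P[:, None, :] - P[None, :, :]).sum(-1)    # full squared EDM (simulation only)
    E = np.where(D <= radio_range**2, D, 0.0)               # partial EDM: entries within range
    E[n:, n:] = 0.0                                         # anchor-anchor pairs are not in N_e or M_e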

Our first formulation for finding the sensor locations p_j is the feasibility question for the constraints:

    ||p_i − p_j||^2 = E_ij,  (i, j) ∈ N_e,  n_e = |N_e|
    ||p_i − a_k||^2 = E_ik,  (i, k) ∈ M_e,  m_e = |M_e|
    ||p_i − p_j||^2 ≤ U_ij,  (i, j) ∈ N_u,  n_u = |N_u|
    ||p_i − a_k||^2 ≤ U_ik,  (i, k) ∈ M_u,  m_u = |M_u|    (1.3)
    ||p_i − p_j||^2 ≥ L_ij,  (i, j) ∈ N_l,  n_l = |N_l|
    ||p_i − a_k||^2 ≥ L_ik,  (i, k) ∈ M_l,  m_l = |M_l|

Note that the first two and the last two sets of constraints are quadratic, nonconvex constraints.

Let W_p, W_a be weight matrices. For example, they could simply be 0,1 matrices that indicate when an exact distance is unknown or known. Or a weight could be used to indicate the confidence in the value of the distance. If there is noise in the data, the exact model (1.3) can be infeasible. Therefore, we can minimize the weighted least squares error:

(EDMC)
    min  f_1(P) := (1/2) Σ_{(i,j) ∈ N_e} (W_p)_ij ( ||p_i − p_j||^2 − E_ij )^2
                 + (1/2) Σ_{(i,k) ∈ M_e} (W_a)_ik ( ||p_i − a_k||^2 − E_ik )^2
    subject to  ||p_i − p_j||^2 ≤ U_ij,  (i, j) ∈ N_u,  n_u = |N_u|          (1.4)
                ||p_i − a_k||^2 ≤ U_ik,  (i, k) ∈ M_u,  m_u = |M_u|
                ||p_i − p_j||^2 ≥ L_ij,  (i, j) ∈ N_l,  n_l = |N_l|
                ||p_i − a_k||^2 ≥ L_ik,  (i, k) ∈ M_l,  m_l = |M_l|.

This is a hard problem to solve due to the nonconvex objective and constraints. One could apply global optimization techniques. In this paper, we use a convex relaxation approach, i.e. we use a semidefinite relaxation. This results in a surprisingly robust, strong relaxation.

1.2.1 Related Approaches

Several recent papers have developed algorithms for the semidefinite relaxation of the sensor localization problem. Recent work on this area includes e.g. [14, 6, 4, 19, 5]. These relaxations use the l1 norm rather than the l2 norm that we use. We assume the noise in the radio signal is from a multivariate normal distribution with mean 0 and variance-covariance matrix σ^2 I, i.e. from a spherical normal distribution, so that the least squares estimates are the maximum likelihood estimates. Our approach follows that in [] for EDM completion without anchors.
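Before moving on, here is a direct, hedged transcription of the weighted least-squares objective f_1 in (1.4), with X, A, E as in the previous sketch and 0-1 weight masks Wp, Wa marking N_e and M_e (illustrative names only):

    import numpy as np

    # f_1(P) from (1.4): half the weighted sum of squared residuals over the known distances.
    def f1(X, A, E, Wp, Wa):
        n = X.shape[0]
        Dss = np.square(X[:, None, :] - X[None, :, :]).sum(-1)   # sensor-sensor squared distances
        Dsa = np.square(X[:, None, :] - A[None, :, :]).sum(-1)   # sensor-anchor squared distances
        val = 0.5 * np.triu(Wp * (Dss - E[:n, :n]) ** 2, 1).sum()
        return val + 0.5 * (Wa * (Dsa - E[:n, n:]) ** 2).sum()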

2 Background and Preliminary Theory

2.1 Linear Transformations and Adjoints Related to EDM

We work in spaces of real matrices, M^{s×t}, equipped with the trace inner product ⟨A, B⟩ = trace A^T B and induced Frobenius norm ||A|| = (trace A^T A)^{1/2}. The space of n × n real symmetric matrices is denoted S^n; its dimension is t(n) = n(n+1)/2. For a given B ∈ S^n, we denote by b = svec(B) ∈ R^{t(n)} the vector obtained columnwise from the upper-triangular part of B, with the strict upper-triangular part multiplied by √2. Thus svec is an isometry mapping S^n → R^{t(n)}. The inverse and adjoint mapping is B = smat(b).

We now define several linear operators on S^n. A collection of linear transformations, adjoints and properties is given in the appendices. Let

    D_e(B) := diag(B) e^T + e diag(B)^T,    K(B) := D_e(B) − 2B,    (2.5)

where e is the vector of ones. The adjoint linear operators are

    D_e^*(D) = 2 Diag(De),    K^*(D) = 2 ( Diag(De) − D ).    (2.6)

By abuse of notation we allow D_e to act on R^n: D_e(v) = v e^T + e v^T, v ∈ R^n. The linear operator K maps the cone of positive semidefinite matrices (denoted SDP) onto the cone of Euclidean distance matrices (denoted EDM), i.e. K(SDP) = EDM. This allows us to change problem EDMC into an SDP problem.

We define the linear transformation sBlk_i(S) = S_i ∈ S^t, on S ∈ S^n, which pulls out the i-th diagonal block, of dimension t, of the matrix S. The values of t and n can change and will be clear from the context. The adjoint sBlk_i^*(T), where T ∈ S^t, constructs a symmetric matrix of suitable dimensions with all elements zero except for the i-th diagonal block, which is given by T. Similarly, we define the linear transformation sBlk_ij(G) = √2 G_ij, on G ∈ S^n, which pulls out the ij block, of dimension k × l, of the matrix G and multiplies it by √2. The values of k, l, and n can change and will be clear from the context. The adjoint sBlk_ij^*(J), where J ∈ M^{k×l} ≅ R^{kl}, constructs a symmetric matrix which has all elements zero except for the block ij, which is given by J multiplied by 1/√2, and for the block ji, which is given by J^T multiplied by 1/√2. We consider J ∈ M^{k×l} to be a k × l matrix and, equivalently, J ∈ R^{kl} to be a vector of length kl with the positions known. The multiplication by √2 or 1/√2 guarantees that the mapping is an isometry.

2.2 Properties of Transformations

Lemma 2.1 [1] The nullspace N(K) equals the range R(D_e). The range R(K) equals the hollow subspace of S^n, denoted S_H := {D ∈ S^n : diag(D) = 0}. The range R(K^*) equals the centered subspace of S^n, denoted S_c := {B ∈ S^n : Be = 0}.
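The operators in (2.5)-(2.6) are easy to transcribe; the following illustrative sketch (not the paper's code) also checks the adjoint identity ⟨K(B), D⟩ = ⟨B, K^*(D)⟩ and the isometry property of svec numerically. The svec ordering below (diagonal first) differs from the paper's columnwise convention but is still an isometry.

    import numpy as np

    def K(B):
        d = np.diag(B)
        return np.add.outer(d, d) - 2.0 * B            # diag(B) e^T + e diag(B)^T - 2 B

    def Kadj(D):
        return 2.0 * (np.diag(D.sum(axis=1)) - D)      # K*(D) = 2 (Diag(D e) - D)

    def svec(B):
        iu = np.triu_indices(B.shape[0], 1)
        return np.concatenate([np.diag(B), np.sqrt(2.0) * B[iu]])

    rng = np.random.default_rng(1)
    B = rng.standard_normal((6, 6)); B = (B + B.T) / 2
    D = rng.standard_normal((6, 6)); D = (D + D.T) / 2
    assert np.isclose(np.sum(K(B) * D), np.sum(B * Kadj(D)))       # adjoint pair
    assert np.isclose(np.linalg.norm(svec(B)), np.linalg.norm(B))  # svec is an isometry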

Corollary 2.1

1. Let S_D denote the cone of diagonal matrices in S^n. Then

    S_c = N(D_e^*) = R(K^*),    N(K) = R(D_e),
    S_H = R(K) = N(D_e),    S_D = N(K^*) = R(D_e^*).

2. Let [V, (1/√n) e] be an n × n orthogonal matrix. Then Y ⪰ 0 if and only if Y = V Ŷ V^T + D_e(v) ⪰ 0, for some Ŷ ∈ S^{n−1}, v ∈ R^n.

Proof. Item 1 follows immediately from the definitions of the range R(K^*) = S_c and the nullspace N(K) = R(D_e). Item 2 now follows from the definition of V, i.e. the subspace of centered matrices is S_c = V S^{n−1} V^T.

Let B = P P^T. Then

    D_ij = ||p_i − p_j||^2 = ( diag(B) e^T + e diag(B)^T − 2B )_ij = K(B)_ij,    (2.7)

i.e. the EDM D = (D_ij) and the points p_i in P are related by D = K(B); see (2.5).

Lemma 2.2 Suppose that 0 ⪯ B ∈ S^n. Then D = K(B) is an EDM.

Proof. Suppose that rank B = r. Since B ⪰ 0, there exists P ∈ M^{r×n} such that B = P^T P. Thus

    d_ij = K(B)_ij = K(P^T P)_ij = ( diag(P^T P) e^T + e diag(P^T P)^T − 2 P^T P )_ij
         = p_i^T p_i + p_j^T p_j − 2 p_i^T p_j = ||p_i − p_j||^2,

where p_i is the i-th column of P. Thus D = K(B) is an EDM.

The following shows that we can assume B is centered.

Lemma 2.3 Let 0 ⪯ B ∈ S^n and define

    v := −(1/n) ( Be − (e^T B e)/(2n) e ),    C := B + v e^T + e v^T.

Then C ⪰ 0, D = K(B) = K(C) and, moreover, Ce = 0.

Proof. First, note that K(B) = K(C) if and only if K(v e^T + e v^T) = 0. But v e^T + e v^T ∈ N(K), by Lemma 2.1. Hence K(B) = K(C). Moreover,

    Ce = Be + n v + (v^T e) e = Be + ( −Be + (e^T B e)/(2n) e ) − (e^T B e)/(2n) e = 0.

Finally, let P = [ (1/√n) e, V ] be orthogonal. Then

    P^T C P = [ 0  0 ; 0  V^T C V ] = [ 0  0 ; 0  V^T B V ] ⪰ 0.

Therefore, by Sylvester's Theorem on congruence, C ⪰ 0.
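As a quick numerical sanity check of Lemma 2.3, reusing the K function from the previous sketch (illustrative code only): shifting B by v e^T + e v^T centers it without changing the distance matrix, and keeps it positive semidefinite.

    import numpy as np

    rng = np.random.default_rng(2)
    P = rng.standard_normal((8, 2))
    B = P @ P.T
    n = B.shape[0]; e = np.ones(n)
    v = -(B @ e) / n + (e @ B @ e) / (2 * n**2) * e
    C = B + np.outer(v, e) + np.outer(e, v)
    assert np.allclose(K(B), K(C))                 # same EDM
    assert np.allclose(C @ e, 0)                   # C is centered
    assert np.linalg.eigvalsh(C).min() > -1e-10    # C remains positive semidefinite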

8 .3 Invariant Cones and Perron Roots We digress in order to illustrate some of the relationships between the SDP and EDM cones. hese relationships give rise to elegant results including generalizations of the Perron-Frobenius heorem. Lemma.4 Suppose that 0 H S n, i.e. H is symmetric and nonnegative elementwise. Define the linear operator on S n U := H K H K. hen where the face of S n + US n + F E,.8 F E := S n + S C. Proof. Let X S n + be given. We first show that UX S n +; equivalently, we show that trace QUX 0, Q 0. Let Q 0. We have trace QUX = Q, H K H KX = H KQ, H KX = H D1, H D 0, since D 1, D EDM and H 0. herefore UX = H K H KX S+ n. he result.8 now follows since Lemma.1 implies UX RK = S c. Corollary. Suppose that 0 H S n is defined as in Lemma.4. And suppose that diag H = 0 and H has no zero row. hen the face F E and its relative interior relint F E are invariant under the operator U = H K H K, i.e. UF E F E, Urelint F E relint F E. Proof. hat the face F E is invariant follows immediately from Lemma.4. Note that ni E F E and has rank n 1; and B F E trace BE = 0 implies that rank B is at most n 1. herefore, the relative interior of the face F E consists of the centered, positive semidefinite matrices of rank exactly n 1. Suppose that B relint F E, i.e. B 0, trace BE = 0 and rank B = n 1. hen we know that UB F E, by Lemma.4. It remains to show that UB is in the relative interior of the face, i.e. the rank of UB is n 1. We show this by contradiction. Suppose that UBv = 0, 0 v αe, α R. hen 0 = v UBv = trace vv UB = trace U vv B. Denote C = U vv = K H Kvv. Since B relint F E, Uvv 0, and U vv B = Uvv B = 0, we conclude C is in the conjugate face of F E, i.e. C = αe, for some α 0. By Lemma.1, we get C is centered. his implies that C = 0. Lemma.1 also implies that N K is the set of diagonal matrices. Since H has no zero row, Lemma.1 implies that vv N K = RD e. We now have vv = we + ew, for some w, i.e. v =.5e, the desired contradiction. 8

9 he above Lemma means that we can apply the generalized Perron-Frobenius heorem, e.g. [0] to the operator U, i.e. the spectral radius corresponds to a positive real eigenvalue with a corresponding eigenvector in the relative interior of the face F E. In particular, we can get the following expression for the Perron eigenpair. heorem.1 Suppose that 0 H S n is defined as in Lemma.4 and that H ij {0, 1}. Suppose further that diag H = 0 and H has no zero row. Denote the maximum eigenvalue α := max i {λ i H + Diag He}, and let x be the corresponding eigenvector. hen the Perron root and eigenvector for the linear operator U H = K H K = K H K are, respectively, λ = α, X = Diag x + offdiag 1 α H ijx i + x j, where offdiag S := S Diag diag S. Proof. By the the expression for K,.6, we see that the graph of the eigenvector X is the same as for matrices in the range of H K, i.e. the graph of X is a subgraph of the graph of H. Moreover, since X RK, we have Xe = 0. Now, the eigenvalue-eigenvector equation gives λx = K H KX = Diag {H KX}e H KX = Diag {H diag Xe }e + Diag {H ediag X }e 4Diag {H X}e H diag Xe H ediag X + 4H X. Since H ij {0, 1}, we have Hence, we get H X = X Diag diag X λx = Diag {H diag Xe }e + Diag {H ediag X }e 4Diag Xe 4Diag Diag diag Xe H diag Xe H ediag X + 4X + 4Diag diag X. Combining with Xe = 0, this can be rewritten as λ X = Diag {H diag Xe }e + Diag {H ediag X }e H diag Xe H ediag X. By the above equation we get for the elements on the diagonal n αx ii = H ik x ii + k=1 k H ik x kk, i = 1,..., n, i=i i.e. αx = H + Diag Hex. 9

10 As for the off-diagonal elements, we have αx ij = H ij x ii + x jj, i j. Corollary.3 he Perron root and eigenvector for the linear operator are where γ = β n, β > 0, and X relint F E. U E := K K λ = 4n > 0, X = βi + γe 0, Proof. By Corollary. and the invariant cone results in [0], we conclude that the spectral radius of U E is a positive eigenvalue and the corresponding eigenvector X relint F E. We set H = E I and we verify that λ and X satisfy heorem.1. By setting H = E I, we get α = max{ λ i H + Diag He } = max{ λ i E I + Diag n 1e } = max{ λ i E + n I } = n that implies λ = 4n. On the other hand we know that X has the same graph of H, and hence X = βi + γe. Now we have that diag X = x = γ + βe = β n 1β + βe = n e n and we can see by substitution that x is the eigenvector of the matrix H+Diag He corresponding to the eigenvalue α = n. Moreover we have that X ij = β n = 1 α x i + x j. Finally, X = αe + βi = β I 1n E relint F E.. he eigenvector matrix can be written as X = βi β n E, where the spectral radius ρ β ne β, and E is nonnegative, i.e. X is an M-matrix, e.g. [3]. From the above proof we note that the corresponding EDM for the eigenvector X is D = KX = βe I. Corollary.4 Suppose that 0 H S n is the banded matrix H = E I but with H1, n = Hn, 1 = 0. hen the Perron root and eigenvector for the linear operator U := K H K 10

11 are given by X 11 β β β 0 λ = 3n + β X α α β n + 4n 1 > 0, X = β α X 33 α... α β......,.9 β α α α... X n 1,n 1 β 0 β β β X 11 β where the diagonal of X is defined by Xe = 0, and the Perron root λ and vector v = are α found from the spectral radius and corresponding eigenvector of the matrix n + n 3 A =,.10 4 n 4 i.e. Av = λv, with β = n + 1 n 8 + 4n 1 α, α > 0. Proof. With X defined in.9, we find that H KX has the same structure as X but with zero diagonal and β replaced by n + β + n 3α and α replaced by 4β + n 4α. hen K H KX doubles and negates these values. herefore, the eigenpair λ, X can be found by β solving the eigenpair equation given by Av = λv, where v = and A is given by.10. α 3 SDP Relaxation 3.1 EDMC Problem Reformulation using Matrices XX Now let Ȳ := P P XA = I I I P AX AA and Z = = P P P Ȳ the dimensions:. herefore we have X n r; A m r; P m + n r; Ȳ = Ȳ m + n m + n; Z = Z m + n + r m + n + r. We can reformulate EDM C using matrix notation to get the equivalent problem EDMC min f Ȳ := 1 W KȲ E F subject to g u Ȳ := H u KȲ U 0 g l Ȳ := H l KȲ L 0 Ȳ P P = 0, 3.11 where W is the n+m n+m weight matrix having a positive ij-element if i, j N e M e, 0 otherwise. H u, H l are 0, 1-matrices where the ij-th element equals 1 if i, j N u M u i, j N l M l, 0 otherwise. By abuse of notation, we consider the functions g u, g l as implicitly acting on only the nonzero components in the upper triangular parts of the matrices that result from the Hadamard products with H u, H l, respectively. 11
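A hedged sketch of the objective in the matrix reformulation (3.11), reusing the K function from the Section 2.1 sketch; Ybar, E and the 0-1 weight matrix W are illustrative names for the quantities above, not the paper's code:

    import numpy as np

    # f_2(Ybar) = 0.5 * || W o (K(Ybar) - E) ||_F^2, where Ybar = P P^T carries the blocks
    # [[X X^T, X A^T], [A X^T, A A^T]] when P = [X; A] as in (1.1).
    def f2(Ybar, E, W):
        R = W * (K(Ybar) - E)          # Hadamard weighting of the misfit in the distance matrix
        return 0.5 * np.sum(R * R)

    # example use with the objects from the Section 1.2 sketch:
    # P = np.vstack([X, A]); W = (E > 0).astype(float); print(f2(P @ P.T, E, W))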

12 Remark 3.1 he function f Ȳ = f P P, and it is clear that f P P = f 1 P in 1.4. Note that the functions f, g u, g l act only on Ȳ and the locations of the anchors and sensors is completely XX hiding in the hard, nonconvex quadratic constraint Ȳ = P P XA = AX AA. he problem EDMC is a linear least squares problem with nonlinear constraints. As we show in Section 6, the objective function can be either overdetermined or underdetermined, depending on the weights W. his can result in ill-conditioning problems, e.g. [1]. 3. Relaxation of the Hard Quadratic Constraint XX We now consider the hard quadratic constraint Ȳ = P P XA = AX AA, P is defined in 1.1. We study the standard semidefinite relaxation. We include details on problems and weaknesses with the relaxation. he constraint in 3.11 has the blocked property Ȳ = P P XX XA = AX AA Ȳ11 = XX and Ȳ1 = AX, Ȳ = AA. 3.1 With P defined in 1.1 and Ȳ = Ȳ11 semidefinite constraint Ȳ 1 Ȳ1 AA, it is common practice to relax 3.1 to the GP, Ȳ := P P Ȳ 0, F G := {P, Ȳ : GP, Ȳ 0} he function G is convex in the Löwner semidefinite partial order. herefore, the feasible set F G is a convex set. By a Schur complement argument, the constraint is equivalent to the semidefinite I I constraint Z = 0, where Z has the following block structure. P P I P Z := P Ȳ X Y Y, P =, Ȳ := 1 A Y 1 Y, 3.14 and we equate Y = Ȳ11, Y 1 = Ȳ1, Y = Ȳ. When we fix the, block of Ȳ to be AA which is equivalent to fixing the anchors, we get the following semidefinite constraint: I P X Y Y 0 Z =, P =, Ȳ = 1 P Ȳ A Y 1 AA, Y 1 = AX 3.15 Equivalently, I X A Z = X Y Y1 0, Y 1 = AX A Y 1 AA We denote the corresponding convex feasible set for the relaxation F Z := {X, Y, Y 1 : 3.16 holds}, P, Ȳ formed using
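The Schur-complement step behind (3.13)-(3.16) can be illustrated numerically with a small sketch (illustrative sizes and names, not the paper's code): with the identity block fixed and positive definite, Z = [[I, P^T], [P, Ybar]] is positive semidefinite exactly when Ybar − P P^T is.

    import numpy as np

    rng = np.random.default_rng(3)
    r, N = 2, 7                                   # N = n + m points in R^r
    P = rng.standard_normal((N, r))
    S = rng.standard_normal((N, N)); S = S @ S.T  # any positive semidefinite "slack"
    Ybar = P @ P.T + S                            # so Ybar - P P^T = S >= 0
    Z = np.block([[np.eye(r), P.T], [P, Ybar]])
    assert np.linalg.eigvalsh(Z).min() > -1e-8
    assert np.linalg.eigvalsh(Ybar - P @ P.T).min() > -1e-8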

13 Remark 3. We emphasize two points in the above relaxation. First, we relax from the hard nonlinear, nonconvex quadratic equality constraint in 3.1 to the convex quadratic constraint in 3.13, i.e. the latter is a convex constraint in the Löwner semidefinite partial order. And, using the identifications for P, Ȳ in 3.15 above, and by abuse of notation, the feasible sets are equal, i.e. {Ȳ, P : Ȳ = P P } F G = F Z. However, the rationale for relaxing from = to is based on obtaining a simple convexification, rather than obtaining a strong convexification. We see below, in Section 6.4., that the Lagrangian relaxation of Ȳ = P P provides a stronger semidefinite relaxation. Second, the Schur complement is used to change the quadratic function to a linear one. he fact that the two feasible sets are equal, F Z = F G, does not mean that the two relaxations are numerically equivalent. In fact, the Jacobian of the convex function G acting on H P, H Y is G P, Ȳ H P, H Y = P H P + H P P H Y. his immediately implies that G is onto. In fact, this is a Lyapunov type equation with known stability properties, e.g. [13, Chap. 15]. In addition, if G P, Ȳ 0, H Y = 0, then H Y = 0. And, if H Y = 0, then one can take the singular value decomposition of the full rank P = UΣV, where Σ contains the r r nonsingular diagonal block Σ r. Upon taking the orthogonal congruence U U, we see immediately that H P = 0 as well. herefore, the Jacobian G is full rank and the columns corresponding to the variables P, Y are separately linearly independent. However, the constants in the matrix Z = ZX, Y, Y 1 imply that the Jacobian is never onto. herefore, though this constraint has been linearized/simplified, the problem has become numerically unstable. his is evident in our numerical tests below. 3.3 Facial Reduction As seen in [], the cone of EDM has a one-one correspondence to a proper face of the cone of SDP. herefore, to apply interior-point methods on an equivalent SDP, one needs to reduce the problem to a smaller dimensional cone of SDP where Slater s constraint qualification holds. his guarantees both strictly feasible points for interior-point methods as well as the existence of dual variables. We now find the minimal face containing Z defined above in 3.15, and then we reduce the problem in order to obtain strictly feasible points. Strict feasibility Slater s constraint qualification also guarantees strong duality. Define the compact singular value decomposition A = UΣV, Σ = Diag σ 0, U M m r, U U = I, V M r n, V V = I, 3.18 and define the following matrix: I X Z s := X Y the block from 3.16 We now see that checking semidefiniteness of the big 3 3 block matrix Z 0 in 3.15 is equivalent to just checking semidefiniteness of the smaller block matrix Z s as long as we set Y 1 = AX, i.e. we can ignore the last row and column of blocks in Z. 13

14 heorem 3.1 Suppose that Z is defined by 3.15 and the corresponding feasible set F Z is defined in hen {Z 0} { Z s 0, and Y 1 = AX }, and the feasible set F Z = F S := { X, Y, Y 1 : Z s X, Y 0, and Y 1 = AX }. Proof. By substituting the singular value decomposition of A into 3.15 for Z, we get Z 0 if and only if I X V ΣU Z 1 := Z 1 X, Y, Y 1 = X Y Y UΣV Y 1 UΣ U We have I = I r S r, Y S n, and UΣ U S m with rank r. herefore, max rank Z 1 = r + n + r = r + n, attained if Y S+ n. 3.0 X,Y,Y 1,Z 1 0 he corresponding feasible set is and it is clear that F Z1 = F Z. Now, choose Ū so that U 0 Z := Z = his shows that Z 0 is equivalent to F Z1 := {X, Y, Y 1 : Z 1 X, Y, Y 1 0}, 3.1 Ū is orthogonal; and, consider the nonsingular congruence V 0 0 V I 0 Z 0 I U Ū 0 0 U Ū = 0 Z 3 := Z 3 X, Y, Y 1 = he corresponding feasible set is I V X Σ 0 XV Y Y1 U Y Σ U Y 1 Σ 0 0 Ū Y I V X Σ XV Y Y1 U Σ U Y 1 Σ 1Ū. 3., and Y1Ū = F Z3 := {X, Y, Y 1 : Z 3 X, Y, Y 1 0 and Y 1Ū = 0}. 3.4 And, it is clear that F Z3 = F Z1 = F Z. Note that with Z 1 in 3.19, we have Z 1 Q = 0, where Q = , rank Q = m r, 0 0 ŪŪ 14

15 i.e. Q is in the relative interior of the conjugate face to the minimal face containing F Z, and the minimal face the generated cone is I 0 0 I 0 0 F Z = SDP Q = 0 I 0 S+ r+n 0 I U 0 0 U. 3.5 We now apply a Schur complement argument to Z 3, i.e. 3.3 holds if and only if Y1Ū = 0 and I V 0 X Σ XV Y Y1 U Σ Σ U Y 1 0 V = X Σ 1 U 3.6 Y 1 XV Y1 UΣ 1 Y Y1 UΣ U. Y 1 And this is equivalent to Y Y1 UΣ U Y 1 0, ΣV X U Y 1 = 0, Y1Ū = We define the corresponding feasible set F S := {X, Y, Y 1 : 3.7 holds}. 3.8 We conclude that Z 0 if and only if 3.7 holds and the feasible sets F S = F Z3 = F Z1 = F Z. Now, 3.7 is equivalent to Y XV V X = Y XX 0, and V X = Σ 1 U Y 1, Y1Ū = 0, 3.9 which is equivalent to I X X Y 0, and V X = Σ 1 U Y 1 Y1Ū = herefore Define the full column rank matrix hen 3.30 is equivalent to I V X Σ 0 XV Y XV Σ I X = K Σ ΣV X Σ X Y F S = {X, Y, Y 1 : 3.30 holds} V 0 K := 0 I. 3.3 ΣV 0 K, and ΣV X = U Y 1, 3.33 Since this implies that Y1Ū = 0. Or equivalently I V X Σ XV Y XV Σ I X = K K I X, 0, ΣV X = U Y Σ ΣV X Σ X Y X Y

16 I X he constraint in 3.34 is equivalent to 0, UΣV X Y X = Y 1. he result now follows since A = UΣV. Corollary 3.1 he minimal face containing the feasible set F Z has dimension r + n and is given in the cone in 3.5. In particular, each given Z in 3.16, I X A Z = X Y Y1 0, A Y 1 AA can be expressed as I 0 0 I X V Σ I 0 0 Z = 0 I 0 X Y XV Σ 0 I U ΣV ΣV X Σ 0 0 U. 3.4 Further Model Development with only X, Y he above reduction from Z to Y allows us to replace the constraint Z 0 with the smaller dimensional constraint I X Z s = 0, Y X Y 1 = AX. o develop the model, we introduce the following notation. 0 X x := vec sblk 1 = vec X, y := svec Y, X 0 where we add to the definition of x since X appears together with X in Z s and implicitly in Ȳ, with Y 1 = AX. We define the following matrices and linear transformations: Zs xx := sblk 1Mat x, Z s x, y := Zs xx + Zy s y, Y x x := sblk 1 AMat x, Yx, y := Y x x + Y y y, Zs y y := sblk smat y, Z s := sblk 1 I + Z s x, y, Y y y := sblk 1 smat y Ȳ := sblk AA + Yx, y. Ē := W [ E KsBlk AA ], Ū := H u [ KsBlk AA U ], L := H l [ L KsBlk AA ]. he unknown matrix Ȳ in 3.11 is equal to Yx, y with the additional constant in the, block, i.e. our unknowns are the vectors x, y which are used to build Ȳ and Z s. Using this notation we 16

17 can introduce the following relaxation of EDM C in EDMC R min f 3 x, y := 1 W KYx, y Ē F subject to g u x, y := H u KYx, y Ū 0 g l x, y := L H l KYx, y 0 sblk 1 I + Z s x, y As above, we consider the functions g u, g l as implicitly acting only on the nonzero parts of the upper triangular part of the matrix that results from the Hadamard products with H u, H l, respectively. Remark 3.3 he constraint Z s 0 in EDMC R is equivalent to that used in e.g. [14, 4] and the references therein. However, the objective function is different, i.e. we use an l nearest matrix problem and a quadratic objective SDP ; whereas the papers in the literature generally use the l 1 nearest matrix problem in order to obtain a linear SDP. 4 Duality for EDMC-R We begin with the Lagrangian of EDMC R, problem 3.35, Lx, y, Λ u, Λ l, Λ = 1 W KYx, y Ē F + Λ u, H u KYx, y Ū + Λ l, L H l KYx, y Λ, sblk 1 I + Z s x, y, where 0 Λ u, 0 Λ l S m+n, and 0 Λ S m+n. We partition the multiplier Λ as Λ = Λ1 Λ 1 Λ 1 Λ 4.36, 4.37 corresponding to the blocks of Z s. Recall that x = vec X, y := svec Y. In addition, we denote λ u := svec Λ u, λ l := svec Λ l, h u := svec H u, h l := svec H l, λ := svec Λ, λ 1 := svec Λ 1, λ := svec Λ, λ 1 := vec sblk 1 Λ. And, for numerical implementation, we define the linear transformations h nz u = svec u H u R nzu, h nz l = svec l H l R nz l, 4.38 where h nz u is obtained from h u by removing the zeros; thus, nz u is the number of nonzeros in the upper-triangular part of H u. hus the indices are fixed from the given matrix H u. Similarly, for h nz l with indices fixed from H l. We then get the vectors λ nz u = svec uλ u R nzu, λ nz l = svec l Λ l R nz l. he adjoints are smat u, smat l ; and, for any matrix M we get H u M = smat u svec u H u M. his holds similarly for H l M. herefore, we could rewrite the Lagrangian as Lx, y, Λ u, Λ l, Λ = Lx, y, λ nz u, λ nz l, Λ = 1 W KYx, y Ē F + Λ u, H u KYx, y Ū + Λ l, L H l KYx, y Λ, sblk 1 I + Z s x, y = 1 W KYx, y Ē F + svec u Λ u, svec u Hu KYx, y Ū + svec l Λ l, svec l L Hl KYx, y Λ, sblk 1 I + Z s x, y
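Both the relaxation (3.35) and the Lagrangian above are written in the packed variables x = vec(X), y = svec(Y). The following sketch shows one way Z_s(x, y) can be assembled from them; it follows the svec ordering of the earlier sketches (diagonal first, then the scaled strict upper triangle), which is an illustrative convention rather than the paper's columnwise one.

    import numpy as np

    def smat(y, n):
        Y = np.zeros((n, n))
        Y[np.diag_indices(n)] = y[:n]
        iu = np.triu_indices(n, 1)
        Y[iu] = y[n:] / np.sqrt(2.0)
        return Y + np.triu(Y, 1).T

    def Zs(x, y, n, r):
        X = x.reshape(n, r)            # Mat(x); row-major reshape is an assumed convention
        Y = smat(y, n)
        return np.block([[np.eye(r), X.T], [X, Y]])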

18 o simplify the dual of EDMC R, i.e. the max-min of the Lagrangian, we now find the stationarity conditions of the inner minimization problem, i.e. we take the derivatives of L with respect to x and y. We get 0 = x Lx, y, Λ u, Λ l, Λ = [W KY x ] W KYx, y Ē + [H u KY x ] Λ u [H l KY x ] Λ l Z x s Λ Note that both the Hadamard product and the transpose,,, are self-adjoint linear transformations. Let H denote a weight matrix. hen [ ] [H KY x ] S = Y x K H S = vec {sblk 1 K H S} A Similarly, 0 = y Lx, y, Λ u, Λ l, Λ = [W KY y ] W KYx, y Ē + [H u KY y ] Λ u [H l KY y ] Λ l Z y s Λ. 4.4 And [H KY y ] S = Y y K H S = svec {sblk 1 K H S} hen Z x s Λ = vec sblk 1 Λ = λ 1. he first stationarity equation 4.40 yields Z x s Λ = λ 1 = [W KY x ] W KYx, y Ē + [H u KY x ] Λ u [H l KY x ] Λ l Similarly, the second stationarity equation 4.4 yields Z y s Λ = λ = [W KY y ] W KYx, y Ē + [H u KY y ] Λ u [H l KY y ] Λ l he Wolfe dual is obtained from applying the stationarity conditions to the inner minimization of the Lagrangian dual max-min of the Lagrangian, i.e. we get the dual EDMC problem: EDMC D We denote max Lx, y, λ u, λ l, λ 1, λ, λ 1 subject to 4.44, 4.45 smat λ u 0, smat λ l 0 sblk 1 smat λ 1 + sblk smat λ + sblk 1 Mat λ S u := Ū H u K Yx, y, s u = svec S u S l := H l K Yx, y L, s l = svec S l. We can now present the primal-dual characterization of optimality heorem 4.1 he primal-dual variables x, y, Λ, λ u, λ l are optimal for EDMC R if and only if: 18

19 1. Primal Feasibility: s u 0, s l 0, in 4.47, Z s = sblk 1 I + sblk smat y + sblk 1 Mat x Dual Feasibility: Stationarity equations 4.44,4.45 hold and Λ = sblk 1 smat λ 1 + sblk smat λ + sblk 1 Mat λ 1 0; λ u 0; λ l Complementary Slackness: λ u s u = 0 λ l s l = 0 ΛZ s = 0. We can use the structure of the optimality conditions to eliminate the linear dual equations and obtain a characterization of optimality based solely on a bilinear equation and nonnegativity/semidefiniteness. Corollary 4.1 he dual linear equality constraints 4.44,4.45 in heorem 4.1 can be eliminated after using them to substitute for λ, λ 1 in he complementarity conditions now yield a bilinear system of equations F x, y, λ u, λ l, λ 1 = 0, with nonnegativity and semidefinite conditions that characterize optimality of EDM C R. In addition, we can guarantee a strictly feasible primal starting point from the facial reduction in heorem 3.1. his, along with a strictly feasible dual starting point are found using the heuristic in Section A Robust Primal-Dual Interior-Point Method o solve EDM C R we use the Gauss-Newton method on the perturbed complementary slackness conditions written with the block vector notation: λ u s u µ u e F µ x, y, λ u, λ l, λ 1 := λ l s l µ l e = 0, 5.50 ΛZ s µ c I where s u = s u x, y, s l = s l x, y, Λ = Λλ 1, x, y, λ u, λ l, and Z s = Z s x, y. his is an overdetermined system with m u + n u + m l + n l + n + r equations; nr + tn + m u + n u + m l + n l + tr variables. 19

20 5.1 Linearization We denote the Gauss-Newton search direction for 5.50 by x y s := λ u λ l. λ 1 he linearized system for the search direction s is: F µ s = F µx, y, λ u, λ l, λ 1 s = F µ x, y, λ u, λ l, λ 1. o further simplify notation, we use the following composition of linear transformations. Let H be symmetric. hen KH x x := H KYx x, K y H y := H KYy y, K H x, y := H KYx, y. he linearization of the complementary slackness conditions results in three blocks of equations λ u svec K Hu x, y + s u λ u = µ u e λ u s u λ l svec K Hl x, y + s l λ l = µ l e λ l s l ΛZ s x, y + [sblk 1 smat λ 1 + sblk smat λ + sblk 1 smat λ 1 ] Z s = ΛZ s x, y + [sblk 1 smat λ 1 { } + sblk smat K y W K W x, y + K y H u smat λ u K y H l smat λ l +sblk 1 Mat { K x W K W x, y + K x H u smat λ u K x H l smat λ l }] Z s = µ c I ΛZ s and hence λ u svec K Hu x, y + s u λ u λ l svec K Hl x, y + s l λ l ΛZ s x, y + [sblk 1 smat λ 1 + F µ s = { } sblk smat K y W K W x, y + K y H u smat λ u K y H l smat λ l + { }] sblk 1 Mat KW x K W x, y + KH x u smat λ u KH x l smat λ l Z s where F µ : M r n R tn R tm+n R tm+n R tm+n R tm+n R tm+n R r+n, i.e. the linear system is overdetermined. 0
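A generic, hedged sketch of how such an overdetermined linearization can be solved matrix-free with LSQR, as in the algorithm described above; jac_mv and jac_rmv are illustrative stand-ins for the action of the structured operator F'_mu and its adjoint.

    import numpy as np
    from scipy.sparse.linalg import LinearOperator, lsqr

    # One Gauss-Newton step: least-squares solution of F'_mu(s) ds = -F_mu(s).
    def gauss_newton_step(Fval, jac_mv, jac_rmv, n_eq, n_var):
        Jop = LinearOperator((n_eq, n_var), matvec=jac_mv, rmatvec=jac_rmv)
        return lsqr(Jop, -Fval)[0]

After the crossover described earlier, the same step is taken with mu = 0 (the affine scaling direction), steplength one and no backtracking.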

21 We need to calculate the adjoint F µ. We first find Zs, Kx H, K y H, and K H. By the expression of Z s we get Zs Z x S = s S Mat sblk Zs y = 1 S vec sblk S smat = 1 S sblk S svec sblk S By the expression of Y x, y, we get Y Y S = x S Mat sblk Y y = 1 S A S smat = sblk 1 S By the expression of K H x, y, we get KH K S = x H S K y H S We have: = vec sblk 1 S A. 5.5 svec sblk 1 S Y x K H S Y y K H S ΛZ s x, y, M = trace M ΛZ s x, y = 1 M Λ + ΛM, Z s x, y 1 = Z s M Λ + ΛM x,. y 5.54 he left half of the inner-product in the final row of 5.54 defines the required adjoint. Moreover, sblk 1 smat λ 1 Z s, M = trace Z s M sblk 1 smat λ 1 1 = Zs M + MZ s, sblk 1 smat λ 1 1 = svec sblk 1 Zs M + MZ s, λ1 Similarly, { sblk smat K y H K H x, y } Z s, M = trace Z s M { sblk smat K y H K H x, y } 1 = H K H x, y and = Z sm + MZ s, sblk smat K y 1 K H K y H svec sblk Zs M + MZ s, x y sblk 1 Mat KH x K H x, y Z s, M = trace Z s M sblk 1 Mat KH x K H x, y 1 = Zs M + MZ s, sblk 1 Mat K x H K H x, y 1 = K H KH x vec sblk 1 Z s M + MZ s x, y again yielding the desired adjoints on the left of the last inner-products. Now we can evaluate F µ w 1, w, W 3. his consists of three columns of blocks with five rows per column. We list this 1

22 by columns C 1, C, C 3. C 1 = C = KH x u smat λ u w 1 K y H u smat λ u w 1 w 1 s u 0 0 KH x l smat λ l w K y H l smat λ l w 0 w s l 0 C 3 = C 13 C 3 C 33 C 43 C 53 with elements of C 3 C 13 = 1 Zx s W3 Λ + Λ W Kx W KW x vec sblk 1 Z s W3 + W 3 Z s + 1 Kx W K y W svec sblk Zs W3 + W 3 Z s C 3 = 1 Zy s W3 Λ + Λ W Ky W KW x vec sblk 1 Z s W3 + W 3Z s + 1 Ky W K y W svec sblk Zs W3 + W 3 Z s C 33 = 1 svec K y H u svec sblk Z s W 3 + W 3 Z s + 1 svec K x H u vec sblk 1 Z s W 3 + W 3 Z s C 43 = 1 svec K y H l svec sblk Z s W 3 + W 3Z s 1 svec K x H l vec sblk 1 Z s W 3 + W 3Z s C 53 = 1 svec sblk 1 Zs W 3 + W 3 Z s, where w 1 R tm+n, w R tm+n, W 3 M r+n. hus the desired adjoint is given by F µ = C 1 + C + C Heuristic for an Initial Strictly Feasible Point o compute an initial strictly feasible point, we begin by considering x = 0 and y = svec Y, where Y = αi. hen for any α > 0 we have I 0 αe I αe Z s = 0 and KYx, y =, 0 αi αe 0

where E is a matrix of ones of appropriate size. Since we require for strict primal feasibility that

    s_l^{nz} := svec_l(S_l) = svec_l( H_l ∘ K(Y(x, y)) − L̄ ) > 0,

it suffices to choose α > L_ij for all ij such that (H_l)_ij = 1. In practice, we choose

    α = (2.1 + ε) max{ 1, max_ij L_ij },

where 0 ≤ ε ≤ 1 is a random number.

For the initial strictly feasible dual variables, we choose Λ_l = S_l, so that λ_l^{nz} = svec_l(Λ_l) > 0. Defining λ_21 and λ_2 according to equations (4.44) and (4.45), respectively, we find that Λ_2 = smat(λ_2) ⪰ 0. We would now like to choose Λ_1 = βI, for some β > 0, so that

    Λ = [ βI   Λ_21^T ;  Λ_21   Λ_2 ] ⪰ 0.

The last inequality holds if and only if Λ_2 − (1/β) Λ_21 Λ_21^T ⪰ 0. We can therefore guarantee Λ ⪰ 0 by choosing

    β = trace(Λ_21^T Λ_21) / λ_min(Λ_2).

5.3 Diagonal Preconditioning

It has been shown in [11] that for a full rank matrix A ∈ M^{m×n}, m ≥ n, and using the condition number of a positive definite matrix K given by ω(K) := (trace K / n) / (det K)^{1/n}, the optimal diagonal scaling, i.e. the solution of min { ω( (AD)^T (AD) ) : D a positive diagonal matrix }, is given by D = diag(d_ii), d_ii = 1/||A_{:,i}||. Therefore, we need to evaluate the columns of the linear transformation F'_µ. The details are presented in Appendix D.
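A minimal dense sketch of this column scaling (illustrative only; in the algorithm the columns are those of the structured operator F'_µ, evaluated as in Appendix D):

    import numpy as np

    # d_ii = 1 / ||A(:, i)||: each column of A D then has unit norm.
    def diag_precondition(A):
        d = 1.0 / np.linalg.norm(A, axis=0)
        return A * d, d

    rng = np.random.default_rng(7)
    A = rng.standard_normal((30, 8)) * np.array([1e3, 1, 1e-3, 1, 1, 10, 0.1, 1])
    AD, d = diag_precondition(A)
    print(np.linalg.cond(A), np.linalg.cond(AD))   # the scaling typically improves conditioning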

6 Instability in the Model

The use of the matrix Z_s is equivalent to the matrix of unknown variables used in the literature, e.g. in [4]. We include the facial reduction in Theorem 3.1 to emphasize that we are finding the smallest face containing the feasible set F_Z and obtaining an equivalent problem. Such a reduction allows Slater's constraint qualification to hold, i.e. a strictly feasible solution exists. It also guarantees strong duality, i.e. the existence of a nonempty compact set of Lagrange multipliers. This should result in stabilizing the program. However, as mentioned above, we still have instability due in part to the constant I in Z_s. We now see that we have a conundrum: a stronger SDP relaxation, as seen by Y − XX^T = 0, results in increased ill-conditioning.

6.1 Case of an Accurate Relaxation

We first suppose that at optimality the relaxation Z_s ⪰ 0 is a good approximation of the original constraint in (3.12), i.e. ||Ȳ − P P^T||_F < ε, with ε small. Therefore, the norm of the leading block is small as well, ||Y − XX^T||_F < ε. Now

    Z_s := [ I  X^T ; X  Y ] = [ I  0 ; X  I ] [ I  0 ; 0  Y − XX^T ] [ I  X^T ; 0  I ].

Therefore, if 0 < ε << 1, then Z_s has r eigenvalues of order one, O(1), and the rest are approximately 0, i.e. Z_s has numerical rank r. Therefore, for strict complementarity to hold, the dual variable Λ for the constraint sBlk_1(I) + Z_s(x, y) ⪰ 0 in (3.35) must have numerical rank n. This also follows from Cauchy's interlacing property extended to Schur complements; see [17, Theorem 2.1, Page 49]. But our numerical tests show that this is not the case, i.e. strict complementarity fails. In fact, when the noise factor is low and the approximation is good, we lose strict complementarity and have an ill-conditioned problem. Paradoxically, a good SDP approximation theoretically results in a poor approximation numerically and/or slow convergence to the optimum.

Note that if the approximation is good, i.e. ||Y − XX^T||_F is small, then, by interlacing of eigenvalues, Z_s has many small eigenvalues, and so it has vectors that are numerically in its nullspace, i.e. numerically Z_s N = 0, where

    N = [ N_X ; N_Y ] ∈ R^{(n+r) × t}    (6.55)

is a full column rank matrix of t numerically orthonormal eigenvectors in the nullspace. This implies that the linear transformation

    N_Z(X, Y) := [ X  Y ] N = X N_X + Y N_Y,    N_Z : R^{nr + t(n)} → R^{nt},

is numerically zero at the optimum. The adjoint is N_Z^*(M) = M N^T = M [ N_X^T  N_Y^T ]. Since the nullity of N_Z^* is 0, we conclude that N_Z is onto, and so we can eliminate nt variables using (6.55), i.e. nt is an estimate of the numerical nullity of the optimality conditions for the SDP relaxation.

The above arguments emphasize the fact that as we get closer to the optimum and Z_s gets closer to being singular, the SDP optimality conditions become more ill-conditioned. To correct this we use the strengthened relaxation in Section 6.4.2.

Alternatively, we can deflate. Suppose that we detect that the smallest eigenvalue 0 < λ_min(Z_s) << 1, with Z_s v = λ_min(Z_s) v, ||v|| = 1, is close to zero. Let P = [ v  V ] be an orthogonal matrix. Then we can replace the complementary slackness ΛZ = 0 with Λ'Z' = 0, where Λ' = V^T Λ V, Z' = V^T Z V. This reduces the dimension of the symmetric matrix space by one and removes the singularity. We can recover X, Y at each iteration using P. This avoids the ill-conditioning, and we can repeat the deflation for each small eigenvalue of Z.

6.2 Linear Independence of Constraints and Redundancy in Objective Function

Suppose that the distances in the data matrix E are exact. Then our problem is a feasibility question, i.e. an EDM completion problem. We could add the constraints W ∘ ( K(Y(x, y)) − Ē ) = 0 to our model problem. Since there are rn + t(n) variables, we would get a unique solution if we had the same number of linearly independent constraints. This is pointed out in [4, Proposition 1]. But, as we now see, this can rarely happen. To determine which constraints are linearly independent we could use the following lemma. Recall that A^T e = 0 and rank A = r < m < n.
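As a concrete illustration of the Section 6.1 discussion before stating the lemma, the following sketch shows the numerical rank collapse of Z_s when the relaxation is tight (illustrative sizes only, not the paper's code):

    import numpy as np

    # When Y is close to X X^T, Z_s = [[I, X^T], [X, Y]] has numerical rank r: n of its n + r
    # eigenvalues are of the order of ||Y - X X^T||.
    rng = np.random.default_rng(5)
    n, r, eps = 12, 2, 1e-8
    X = rng.standard_normal((n, r))
    Y = X @ X.T + eps * np.eye(n)                  # an accurate relaxation: Y - X X^T is tiny
    Zs = np.block([[np.eye(r), X.T], [X, Y]])
    ev = np.linalg.eigvalsh(Zs)
    print(int((ev > 1e-6 * ev.max()).sum()))       # numerical rank: prints r (= 2 here)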

25 Lemma 6.1 Let V 1 n e be an n n orthogonal matrix. Under the assumption that A e = 0 and rank A = r < m < n, we have: 1. he nullspace N KY = 0; the range space R Y K = R rn+tn ; rank Y K = rn + tn; and dim R KY dim R Y K = nm r 1 0, i.e. the corresponding linear system is either square or overdetermined and always full column rank.. Suppose that W 0 is a given weight matrix with corresponding index sets N e, M e for the positive elements. We let y = y k = y ij = svecy R tn, where the indices k = ij refer to both the position in R tn as well as the position in the matrix Y. Similarly, x = x k = x ij = vec X, where the indices refer to both the position in R rn as well as the position in the matrix X. a he nullspace of vectors x,y is given by the span of the orthogonal unit vectors, herefore, and N W KY = span [{0, e k : e k = e ij, i < j, ij / N e } {0, e k : e k = e ii, it / M e, t} {e k, 0 : e k = e ij, 1 j r, it / M e, t}]. dim N W KY = tn 1 N e + r + 1 {i : it / M e, t} rank W KY = [tm + n 1 tm 1] tn 1 + N e r + 1 {i : it / M e, t} = mn + N e r + 1 {i : it / M e, t}. In the nontrivial case, {i : it / M e, t} = 0, we conclude that W KY is full column rank if and only if mn + N e rn + tn b he range space R W KY = span [{KY0, e k : e k = e ij, i < j, ij N e } {KY0, e k : e k = e ii, it M e, for some t} {KYe k, 0 : e k = e ij, 1 j r, it M e, for some t}]. W1 W 3. Consider the blocked matrix W = 1. he nullspace W 1 W { N Y K W1 W = W = 1 } : W W 1 W 1 = Diag w, W1 A e = 0. Proof. 1. he nullspace of KY is determined using the details of the composite linear transformation given in the Appendix B.74, i.e. with notation KYx, y = 0, Y = smat y, ȳ = diag Y, X = 1 vec x, we get D e ȳ Y = 0 Y = 0 AX = 0 X = 0. his implies that R Y K = R rn+tn. he difference in the dimensions is calculated in the Appendix B.1. 5

26 a. he unit vectors are found using the expression in B.74. he nullspace is given by the span of these unit vectors since XA + ȳe = 0 and A e = 0 implies that ȳ = 0. b. he range is found using the expression in B Suppose that Y K W = 0. From B.75, we get W 1 = 0 and svec Diag W 1 W 1 e W 1 = 0. his implies that W 1 = Diag w is a diagonal matrix. We get and so W 1 e = 0. svec Diag w + W 1e Diag w = 0, For given distances in E, we can now determine when a unique solution x, y exists. Corollary 6.1 Suppose that the assumption A e = 0, rank A = r holds. 1. If the number of specified distances n e + m e < nr + tn, then W KYx, y has a nontrivial nullspace, i.e. the least squares objective function f 3 x, y is underdetermined.. If the number of anchors is one more than the embedding dimension, m = r +1, then, there is a unique solution x, y that satisfies W KYx, y Ē = 0 if and only if all the exact distances in E are specified known. If some of the distances are not specified, then W KYx, y has a nontrivial nullspace. Remark 6.1 he above Corollary 6.1 states that if m = r + 1 and for an arbitrary number of sensors n > 0, we need to know all the distances in E exactly to guarantee that we have a unique solution x, y. Moreover, the least squares problem is underdetermined if some of the distances are not specified. his is also the case if the number of specified distances is less than nr + tn. his means that the least squares problem is ill-conditioned. We should replace the objective function with finding the least squares solution of minimum norm, i.e. minimize x, y and use the current objective as a linear constraint, after projecting Ē appropriately. See e.g. [16, 1, 10] for related work on underdetermined least squares problems. 6.3 Instability from Linearization he original relaxation replaced Ȳ = P P by Ȳ P P or equivalently XX Y = 0 is replaced by Gx, y = GX, Y = XX Y 0, where x = vec X, y = svec Y. In the formulation used in the literature, we replaced this by I X the equivalent linearized form Z s x, y = 0. hough this is equivalent in theory, it X Y 6

27 is not equivalent numerically, since the dimensions of the range of G and that of range of Z s are different, while the domains are the same. Consider the simple case where W is the matrix of all ones and there are no upper or lower bound constraints. he Lagrangian is now where 0 Λ = svec λ S n. Note that Lx, y, Λ = 1 KYx, y Ē F + Λ, Gx, y, 6.57 Λ, Gx, 0 = 1 Mat x, ΛMat x = 1 x vec ΛMat x ; Λ, G0, y = smat y, Λ = y svec Λ. We now find the stationarity conditions of the inner minimization problem of the Lagrangian dual, i.e. we take the derivatives of L with respect to x and y. We get Similarly, 0 = x Lx, y, Λ = [KY x ] KYx, y Ē + vec ΛMat x. 0 = y Lx, y, Λ = [KY y ] KYx, y Ē svec Λ he Wolfe dual is obtained from applying the stationarity conditions to the inner minimization of the Lagrangian dual max-min of the Lagrangian, i.e. we get the dual problem: Equivalently, EDMC nonlin EDMC nonlin 6.4 Alternative Models for Stability max Lx, y, λ subject to 6.58, 6.59 smat λ 0. max Lx, y, λ subject to [KY x ] KYx, y Ē + vec ΛMat x = 0 smat [KY y ] KYx, y Ē Following are two approaches that avoid the ill-conditioning problems discussed above. hese are studied in a forthcoming research report Minimum Norm Solutions for Underdetermined Systems Underdetermined least squares problems suffer from ill-conditioning as our numerics confirm for our problems. he usual approach, see e.g. [8], is to use the underdetermined system as a constraint and to find the solution of minimum norm. herefore, we can follow the approach in [1] and change the nearest EDM problem into a minimum norm interpolation problem. For given data Ē, and weight matrix W, we add the constraint W KYx, y = W Ē = 0. EDMC RC 1 min x, y subject to W KYx, y = W Ē sblk 1 I + Z s x, y 0. For simplicity we have ignored the upper and lower bound constraints
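The minimum-norm idea behind (EDMC_RC1) can be illustrated on a plain linear model (a hedged sketch with illustrative names; in the actual problem the semidefinite constraint is kept as well):

    import numpy as np

    # Underdetermined, consistent data equations A x = b: instead of minimizing the residual,
    # return the solution of minimum norm, which is what lstsq/pinv deliver.
    rng = np.random.default_rng(6)
    A = rng.standard_normal((5, 12))               # 5 equations, 12 unknowns
    b = A @ rng.standard_normal(12)
    x_mn, *_ = np.linalg.lstsq(A, b, rcond=None)   # minimum-norm solution
    x_other = x_mn + np.linalg.svd(A)[2][-1]       # add a null-space direction of A
    assert np.allclose(A @ x_other, b)
    assert np.linalg.norm(x_other) >= np.linalg.norm(x_mn)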

28 6.4. Strengthened Relaxation using Lagrangian Dual A stronger relaxation is obtained if we take the Lagrangian relaxation of 3.35 but with the constraint Ȳ = P P rather than the weaker Ȳ P P. We partition Y Y1 Ȳ = Y1 Y. EDMC R 1 min W KY Ē F subject to H u KY L Ū 0 H l KY 0 XX XA Y = AX AA. he Lagrangian is quadratic and convex in Y but not convex in X LX, Y, Λ u, Λ l, Λ = 1 W KY Ē F + Λ u, H u KY Ū + Λ l, L H l KY XX XA Λ, Y AX AA After homogenizing the Lagrangian, the optimality conditions for the Lagrangian dual include stationarity as well as a semidefiniteness condition, i.e. we get a semidefinite form for the Lagrangian dual. As an illustration of the relative strength of the different relaxations, suppose that we have the simplified problem µ := min fp subject to P P = Y P M m n he two different relaxations for 6.64 are:. µ I := min fp subject to P P Y P M m n ; 6.65 µ E := max S S n min P fp + trace SP P Y We now show that if the relaxation 6.65 does not solve the original problem 6.64 exactly, then the Lagrangian relaxation 6.66 is a stronger relaxation. heorem 6.1 Suppose that Y 0 is given and that f is a strictly convex coercive function of P M mn. hen µ I in 6.65 is finite and attained at a single matrix P. Moreover, if then µ µ E > µ I. Y P P 0,


Inner products on R n, and more

Inner products on R n, and more Inner products on R n, and more Peyam Ryan Tabrizian Friday, April 12th, 2013 1 Introduction You might be wondering: Are there inner products on R n that are not the usual dot product x y = x 1 y 1 + +

More information

MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS

MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS Systems of Equations and Matrices Representation of a linear system The general system of m equations in n unknowns can be written a x + a 2 x 2 + + a n x n b a

More information

4.6 Linear Programming duality

4.6 Linear Programming duality 4.6 Linear Programming duality To any minimization (maximization) LP we can associate a closely related maximization (minimization) LP. Different spaces and objective functions but in general same optimal

More information

Linear Algebra Review. Vectors

Linear Algebra Review. Vectors Linear Algebra Review By Tim K. Marks UCSD Borrows heavily from: Jana Kosecka kosecka@cs.gmu.edu http://cs.gmu.edu/~kosecka/cs682.html Virginia de Sa Cogsci 8F Linear Algebra review UCSD Vectors The length

More information

15.062 Data Mining: Algorithms and Applications Matrix Math Review

15.062 Data Mining: Algorithms and Applications Matrix Math Review .6 Data Mining: Algorithms and Applications Matrix Math Review The purpose of this document is to give a brief review of selected linear algebra concepts that will be useful for the course and to develop

More information

Least-Squares Intersection of Lines

Least-Squares Intersection of Lines Least-Squares Intersection of Lines Johannes Traa - UIUC 2013 This write-up derives the least-squares solution for the intersection of lines. In the general case, a set of lines will not intersect at a

More information

3. Linear Programming and Polyhedral Combinatorics

3. Linear Programming and Polyhedral Combinatorics Massachusetts Institute of Technology Handout 6 18.433: Combinatorial Optimization February 20th, 2009 Michel X. Goemans 3. Linear Programming and Polyhedral Combinatorics Summary of what was seen in the

More information

Linear Algebraic Equations, SVD, and the Pseudo-Inverse

Linear Algebraic Equations, SVD, and the Pseudo-Inverse Linear Algebraic Equations, SVD, and the Pseudo-Inverse Philip N. Sabes October, 21 1 A Little Background 1.1 Singular values and matrix inversion For non-smmetric matrices, the eigenvalues and singular

More information

17. Inner product spaces Definition 17.1. Let V be a real vector space. An inner product on V is a function

17. Inner product spaces Definition 17.1. Let V be a real vector space. An inner product on V is a function 17. Inner product spaces Definition 17.1. Let V be a real vector space. An inner product on V is a function, : V V R, which is symmetric, that is u, v = v, u. bilinear, that is linear (in both factors):

More information

1 Introduction. Linear Programming. Questions. A general optimization problem is of the form: choose x to. max f(x) subject to x S. where.

1 Introduction. Linear Programming. Questions. A general optimization problem is of the form: choose x to. max f(x) subject to x S. where. Introduction Linear Programming Neil Laws TT 00 A general optimization problem is of the form: choose x to maximise f(x) subject to x S where x = (x,..., x n ) T, f : R n R is the objective function, S

More information

Math 550 Notes. Chapter 7. Jesse Crawford. Department of Mathematics Tarleton State University. Fall 2010

Math 550 Notes. Chapter 7. Jesse Crawford. Department of Mathematics Tarleton State University. Fall 2010 Math 550 Notes Chapter 7 Jesse Crawford Department of Mathematics Tarleton State University Fall 2010 (Tarleton State University) Math 550 Chapter 7 Fall 2010 1 / 34 Outline 1 Self-Adjoint and Normal Operators

More information

Applied Linear Algebra I Review page 1

Applied Linear Algebra I Review page 1 Applied Linear Algebra Review 1 I. Determinants A. Definition of a determinant 1. Using sum a. Permutations i. Sign of a permutation ii. Cycle 2. Uniqueness of the determinant function in terms of properties

More information

Introduction to Support Vector Machines. Colin Campbell, Bristol University

Introduction to Support Vector Machines. Colin Campbell, Bristol University Introduction to Support Vector Machines Colin Campbell, Bristol University 1 Outline of talk. Part 1. An Introduction to SVMs 1.1. SVMs for binary classification. 1.2. Soft margins and multi-class classification.

More information

Au = = = 3u. Aw = = = 2w. so the action of A on u and w is very easy to picture: it simply amounts to a stretching by 3 and 2, respectively.

Au = = = 3u. Aw = = = 2w. so the action of A on u and w is very easy to picture: it simply amounts to a stretching by 3 and 2, respectively. Chapter 7 Eigenvalues and Eigenvectors In this last chapter of our exploration of Linear Algebra we will revisit eigenvalues and eigenvectors of matrices, concepts that were already introduced in Geometry

More information

Numerical Methods I Eigenvalue Problems

Numerical Methods I Eigenvalue Problems Numerical Methods I Eigenvalue Problems Aleksandar Donev Courant Institute, NYU 1 donev@courant.nyu.edu 1 Course G63.2010.001 / G22.2420-001, Fall 2010 September 30th, 2010 A. Donev (Courant Institute)

More information

Lecture 13 Linear quadratic Lyapunov theory

Lecture 13 Linear quadratic Lyapunov theory EE363 Winter 28-9 Lecture 13 Linear quadratic Lyapunov theory the Lyapunov equation Lyapunov stability conditions the Lyapunov operator and integral evaluating quadratic integrals analysis of ARE discrete-time

More information

1 Introduction to Matrices

1 Introduction to Matrices 1 Introduction to Matrices In this section, important definitions and results from matrix algebra that are useful in regression analysis are introduced. While all statements below regarding the columns

More information

CONTROLLABILITY. Chapter 2. 2.1 Reachable Set and Controllability. Suppose we have a linear system described by the state equation

CONTROLLABILITY. Chapter 2. 2.1 Reachable Set and Controllability. Suppose we have a linear system described by the state equation Chapter 2 CONTROLLABILITY 2 Reachable Set and Controllability Suppose we have a linear system described by the state equation ẋ Ax + Bu (2) x() x Consider the following problem For a given vector x in

More information

7 Gaussian Elimination and LU Factorization

7 Gaussian Elimination and LU Factorization 7 Gaussian Elimination and LU Factorization In this final section on matrix factorization methods for solving Ax = b we want to take a closer look at Gaussian elimination (probably the best known method

More information

Continuity of the Perron Root

Continuity of the Perron Root Linear and Multilinear Algebra http://dx.doi.org/10.1080/03081087.2014.934233 ArXiv: 1407.7564 (http://arxiv.org/abs/1407.7564) Continuity of the Perron Root Carl D. Meyer Department of Mathematics, North

More information

1 Portfolio mean and variance

1 Portfolio mean and variance Copyright c 2005 by Karl Sigman Portfolio mean and variance Here we study the performance of a one-period investment X 0 > 0 (dollars) shared among several different assets. Our criterion for measuring

More information

T ( a i x i ) = a i T (x i ).

T ( a i x i ) = a i T (x i ). Chapter 2 Defn 1. (p. 65) Let V and W be vector spaces (over F ). We call a function T : V W a linear transformation form V to W if, for all x, y V and c F, we have (a) T (x + y) = T (x) + T (y) and (b)

More information

160 CHAPTER 4. VECTOR SPACES

160 CHAPTER 4. VECTOR SPACES 160 CHAPTER 4. VECTOR SPACES 4. Rank and Nullity In this section, we look at relationships between the row space, column space, null space of a matrix and its transpose. We will derive fundamental results

More information

Systems of Linear Equations

Systems of Linear Equations Systems of Linear Equations Beifang Chen Systems of linear equations Linear systems A linear equation in variables x, x,, x n is an equation of the form a x + a x + + a n x n = b, where a, a,, a n and

More information

The Characteristic Polynomial

The Characteristic Polynomial Physics 116A Winter 2011 The Characteristic Polynomial 1 Coefficients of the characteristic polynomial Consider the eigenvalue problem for an n n matrix A, A v = λ v, v 0 (1) The solution to this problem

More information

Lecture 5: Singular Value Decomposition SVD (1)

Lecture 5: Singular Value Decomposition SVD (1) EEM3L1: Numerical and Analytical Techniques Lecture 5: Singular Value Decomposition SVD (1) EE3L1, slide 1, Version 4: 25-Sep-02 Motivation for SVD (1) SVD = Singular Value Decomposition Consider the system

More information

Direct Methods for Solving Linear Systems. Matrix Factorization

Direct Methods for Solving Linear Systems. Matrix Factorization Direct Methods for Solving Linear Systems Matrix Factorization Numerical Analysis (9th Edition) R L Burden & J D Faires Beamer Presentation Slides prepared by John Carroll Dublin City University c 2011

More information

1 Sets and Set Notation.

1 Sets and Set Notation. LINEAR ALGEBRA MATH 27.6 SPRING 23 (COHEN) LECTURE NOTES Sets and Set Notation. Definition (Naive Definition of a Set). A set is any collection of objects, called the elements of that set. We will most

More information

Logistic Regression. Jia Li. Department of Statistics The Pennsylvania State University. Logistic Regression

Logistic Regression. Jia Li. Department of Statistics The Pennsylvania State University. Logistic Regression Logistic Regression Department of Statistics The Pennsylvania State University Email: jiali@stat.psu.edu Logistic Regression Preserve linear classification boundaries. By the Bayes rule: Ĝ(x) = arg max

More information

Lecture 4: Partitioned Matrices and Determinants

Lecture 4: Partitioned Matrices and Determinants Lecture 4: Partitioned Matrices and Determinants 1 Elementary row operations Recall the elementary operations on the rows of a matrix, equivalent to premultiplying by an elementary matrix E: (1) multiplying

More information

Orthogonal Diagonalization of Symmetric Matrices

Orthogonal Diagonalization of Symmetric Matrices MATH10212 Linear Algebra Brief lecture notes 57 Gram Schmidt Process enables us to find an orthogonal basis of a subspace. Let u 1,..., u k be a basis of a subspace V of R n. We begin the process of finding

More information

More than you wanted to know about quadratic forms

More than you wanted to know about quadratic forms CALIFORNIA INSTITUTE OF TECHNOLOGY Division of the Humanities and Social Sciences More than you wanted to know about quadratic forms KC Border Contents 1 Quadratic forms 1 1.1 Quadratic forms on the unit

More information

4: EIGENVALUES, EIGENVECTORS, DIAGONALIZATION

4: EIGENVALUES, EIGENVECTORS, DIAGONALIZATION 4: EIGENVALUES, EIGENVECTORS, DIAGONALIZATION STEVEN HEILMAN Contents 1. Review 1 2. Diagonal Matrices 1 3. Eigenvectors and Eigenvalues 2 4. Characteristic Polynomial 4 5. Diagonalizability 6 6. Appendix:

More information

Numerical Analysis Lecture Notes

Numerical Analysis Lecture Notes Numerical Analysis Lecture Notes Peter J. Olver 5. Inner Products and Norms The norm of a vector is a measure of its size. Besides the familiar Euclidean norm based on the dot product, there are a number

More information

Adaptive Online Gradient Descent

Adaptive Online Gradient Descent Adaptive Online Gradient Descent Peter L Bartlett Division of Computer Science Department of Statistics UC Berkeley Berkeley, CA 94709 bartlett@csberkeleyedu Elad Hazan IBM Almaden Research Center 650

More information

SOLVING LINEAR SYSTEMS

SOLVING LINEAR SYSTEMS SOLVING LINEAR SYSTEMS Linear systems Ax = b occur widely in applied mathematics They occur as direct formulations of real world problems; but more often, they occur as a part of the numerical analysis

More information

Numerisches Rechnen. (für Informatiker) M. Grepl J. Berger & J.T. Frings. Institut für Geometrie und Praktische Mathematik RWTH Aachen

Numerisches Rechnen. (für Informatiker) M. Grepl J. Berger & J.T. Frings. Institut für Geometrie und Praktische Mathematik RWTH Aachen (für Informatiker) M. Grepl J. Berger & J.T. Frings Institut für Geometrie und Praktische Mathematik RWTH Aachen Wintersemester 2010/11 Problem Statement Unconstrained Optimality Conditions Constrained

More information

Math 215 HW #6 Solutions

Math 215 HW #6 Solutions Math 5 HW #6 Solutions Problem 34 Show that x y is orthogonal to x + y if and only if x = y Proof First, suppose x y is orthogonal to x + y Then since x, y = y, x In other words, = x y, x + y = (x y) T

More information

Introduction to General and Generalized Linear Models

Introduction to General and Generalized Linear Models Introduction to General and Generalized Linear Models General Linear Models - part I Henrik Madsen Poul Thyregod Informatics and Mathematical Modelling Technical University of Denmark DK-2800 Kgs. Lyngby

More information

An Overview Of Software For Convex Optimization. Brian Borchers Department of Mathematics New Mexico Tech Socorro, NM 87801 borchers@nmt.

An Overview Of Software For Convex Optimization. Brian Borchers Department of Mathematics New Mexico Tech Socorro, NM 87801 borchers@nmt. An Overview Of Software For Convex Optimization Brian Borchers Department of Mathematics New Mexico Tech Socorro, NM 87801 borchers@nmt.edu In fact, the great watershed in optimization isn t between linearity

More information

Mathematics Course 111: Algebra I Part IV: Vector Spaces

Mathematics Course 111: Algebra I Part IV: Vector Spaces Mathematics Course 111: Algebra I Part IV: Vector Spaces D. R. Wilkins Academic Year 1996-7 9 Vector Spaces A vector space over some field K is an algebraic structure consisting of a set V on which are

More information

LINEAR ALGEBRA. September 23, 2010

LINEAR ALGEBRA. September 23, 2010 LINEAR ALGEBRA September 3, 00 Contents 0. LU-decomposition.................................... 0. Inverses and Transposes................................. 0.3 Column Spaces and NullSpaces.............................

More information

Linear Algebra: Determinants, Inverses, Rank

Linear Algebra: Determinants, Inverses, Rank D Linear Algebra: Determinants, Inverses, Rank D 1 Appendix D: LINEAR ALGEBRA: DETERMINANTS, INVERSES, RANK TABLE OF CONTENTS Page D.1. Introduction D 3 D.2. Determinants D 3 D.2.1. Some Properties of

More information

A Lagrangian-DNN Relaxation: a Fast Method for Computing Tight Lower Bounds for a Class of Quadratic Optimization Problems

A Lagrangian-DNN Relaxation: a Fast Method for Computing Tight Lower Bounds for a Class of Quadratic Optimization Problems A Lagrangian-DNN Relaxation: a Fast Method for Computing Tight Lower Bounds for a Class of Quadratic Optimization Problems Sunyoung Kim, Masakazu Kojima and Kim-Chuan Toh October 2013 Abstract. We propose

More information

Nonlinear Iterative Partial Least Squares Method

Nonlinear Iterative Partial Least Squares Method Numerical Methods for Determining Principal Component Analysis Abstract Factors Béchu, S., Richard-Plouet, M., Fernandez, V., Walton, J., and Fairley, N. (2016) Developments in numerical treatments for

More information

Split Nonthreshold Laplacian Integral Graphs

Split Nonthreshold Laplacian Integral Graphs Split Nonthreshold Laplacian Integral Graphs Stephen Kirkland University of Regina, Canada kirkland@math.uregina.ca Maria Aguieiras Alvarez de Freitas Federal University of Rio de Janeiro, Brazil maguieiras@im.ufrj.br

More information

Linear Algebra I. Ronald van Luijk, 2012

Linear Algebra I. Ronald van Luijk, 2012 Linear Algebra I Ronald van Luijk, 2012 With many parts from Linear Algebra I by Michael Stoll, 2007 Contents 1. Vector spaces 3 1.1. Examples 3 1.2. Fields 4 1.3. The field of complex numbers. 6 1.4.

More information

By choosing to view this document, you agree to all provisions of the copyright laws protecting it.

By choosing to view this document, you agree to all provisions of the copyright laws protecting it. This material is posted here with permission of the IEEE Such permission of the IEEE does not in any way imply IEEE endorsement of any of Helsinki University of Technology's products or services Internal

More information

18.06 Problem Set 4 Solution Due Wednesday, 11 March 2009 at 4 pm in 2-106. Total: 175 points.

18.06 Problem Set 4 Solution Due Wednesday, 11 March 2009 at 4 pm in 2-106. Total: 175 points. 806 Problem Set 4 Solution Due Wednesday, March 2009 at 4 pm in 2-06 Total: 75 points Problem : A is an m n matrix of rank r Suppose there are right-hand-sides b for which A x = b has no solution (a) What

More information

Methods for Finding Bases

Methods for Finding Bases Methods for Finding Bases Bases for the subspaces of a matrix Row-reduction methods can be used to find bases. Let us now look at an example illustrating how to obtain bases for the row space, null space,

More information

t := maxγ ν subject to ν {0,1,2,...} and f(x c +γ ν d) f(x c )+cγ ν f (x c ;d).

t := maxγ ν subject to ν {0,1,2,...} and f(x c +γ ν d) f(x c )+cγ ν f (x c ;d). 1. Line Search Methods Let f : R n R be given and suppose that x c is our current best estimate of a solution to P min x R nf(x). A standard method for improving the estimate x c is to choose a direction

More information

Section 5.3. Section 5.3. u m ] l jj. = l jj u j + + l mj u m. v j = [ u 1 u j. l mj

Section 5.3. Section 5.3. u m ] l jj. = l jj u j + + l mj u m. v j = [ u 1 u j. l mj Section 5. l j v j = [ u u j u m ] l jj = l jj u j + + l mj u m. l mj Section 5. 5.. Not orthogonal, the column vectors fail to be perpendicular to each other. 5..2 his matrix is orthogonal. Check that

More information

An Introduction on SemiDefinite Program

An Introduction on SemiDefinite Program An Introduction on SemiDefinite Program from the viewpoint of computation Hayato Waki Institute of Mathematics for Industry, Kyushu University 2015-10-08 Combinatorial Optimization at Work, Berlin, 2015

More information

3 Orthogonal Vectors and Matrices

3 Orthogonal Vectors and Matrices 3 Orthogonal Vectors and Matrices The linear algebra portion of this course focuses on three matrix factorizations: QR factorization, singular valued decomposition (SVD), and LU factorization The first

More information

Similar matrices and Jordan form

Similar matrices and Jordan form Similar matrices and Jordan form We ve nearly covered the entire heart of linear algebra once we ve finished singular value decompositions we ll have seen all the most central topics. A T A is positive

More information

Separation Properties for Locally Convex Cones

Separation Properties for Locally Convex Cones Journal of Convex Analysis Volume 9 (2002), No. 1, 301 307 Separation Properties for Locally Convex Cones Walter Roth Department of Mathematics, Universiti Brunei Darussalam, Gadong BE1410, Brunei Darussalam

More information

Introduction to Matrix Algebra

Introduction to Matrix Algebra Psychology 7291: Multivariate Statistics (Carey) 8/27/98 Matrix Algebra - 1 Introduction to Matrix Algebra Definitions: A matrix is a collection of numbers ordered by rows and columns. It is customary

More information

Chapter 6. Orthogonality

Chapter 6. Orthogonality 6.3 Orthogonal Matrices 1 Chapter 6. Orthogonality 6.3 Orthogonal Matrices Definition 6.4. An n n matrix A is orthogonal if A T A = I. Note. We will see that the columns of an orthogonal matrix must be

More information

CS3220 Lecture Notes: QR factorization and orthogonal transformations

CS3220 Lecture Notes: QR factorization and orthogonal transformations CS3220 Lecture Notes: QR factorization and orthogonal transformations Steve Marschner Cornell University 11 March 2009 In this lecture I ll talk about orthogonal matrices and their properties, discuss

More information

Orthogonal Projections

Orthogonal Projections Orthogonal Projections and Reflections (with exercises) by D. Klain Version.. Corrections and comments are welcome! Orthogonal Projections Let X,..., X k be a family of linearly independent (column) vectors

More information

Example 4.1 (nonlinear pendulum dynamics with friction) Figure 4.1: Pendulum. asin. k, a, and b. We study stability of the origin x

Example 4.1 (nonlinear pendulum dynamics with friction) Figure 4.1: Pendulum. asin. k, a, and b. We study stability of the origin x Lecture 4. LaSalle s Invariance Principle We begin with a motivating eample. Eample 4.1 (nonlinear pendulum dynamics with friction) Figure 4.1: Pendulum Dynamics of a pendulum with friction can be written

More information

CHAPTER 9. Integer Programming

CHAPTER 9. Integer Programming CHAPTER 9 Integer Programming An integer linear program (ILP) is, by definition, a linear program with the additional constraint that all variables take integer values: (9.1) max c T x s t Ax b and x integral

More information

Linear Algebra Done Wrong. Sergei Treil. Department of Mathematics, Brown University

Linear Algebra Done Wrong. Sergei Treil. Department of Mathematics, Brown University Linear Algebra Done Wrong Sergei Treil Department of Mathematics, Brown University Copyright c Sergei Treil, 2004, 2009, 2011, 2014 Preface The title of the book sounds a bit mysterious. Why should anyone

More information

Introduction to Algebraic Geometry. Bézout s Theorem and Inflection Points

Introduction to Algebraic Geometry. Bézout s Theorem and Inflection Points Introduction to Algebraic Geometry Bézout s Theorem and Inflection Points 1. The resultant. Let K be a field. Then the polynomial ring K[x] is a unique factorisation domain (UFD). Another example of a

More information

ISOMETRIES OF R n KEITH CONRAD

ISOMETRIES OF R n KEITH CONRAD ISOMETRIES OF R n KEITH CONRAD 1. Introduction An isometry of R n is a function h: R n R n that preserves the distance between vectors: h(v) h(w) = v w for all v and w in R n, where (x 1,..., x n ) = x

More information

SF2940: Probability theory Lecture 8: Multivariate Normal Distribution

SF2940: Probability theory Lecture 8: Multivariate Normal Distribution SF2940: Probability theory Lecture 8: Multivariate Normal Distribution Timo Koski 24.09.2015 Timo Koski Matematisk statistik 24.09.2015 1 / 1 Learning outcomes Random vectors, mean vector, covariance matrix,

More information

Chapter 20. Vector Spaces and Bases

Chapter 20. Vector Spaces and Bases Chapter 20. Vector Spaces and Bases In this course, we have proceeded step-by-step through low-dimensional Linear Algebra. We have looked at lines, planes, hyperplanes, and have seen that there is no limit

More information