Computing a Nearest Correlation Matrix with Factor Structure
Computing a Nearest Correlation Matrix with Factor Structure
Nick Higham, School of Mathematics, The University of Manchester. higham@ma.man.ac.uk
Joint work with Rüdiger Borsdorf and Marcos Raydan.
NAG Quant Event, London, October 2009
Outline
- Properties of structured correlation matrices.
- Nearness problem for factor-structured correlation matrices.
- Selection of optimization method.
- Numerical analysis issues.
MIMS Nick Higham Nearest Correlation Matrix 2 / 33
Correlation Matrix
An n x n symmetric positive semidefinite matrix A with a_ii = 1. Equivalently: symmetric, 1s on the diagonal, and all eigenvalues nonnegative (or all principal minors nonnegative).
Properties: off-diagonal elements lie between -1 and 1; the set of correlation matrices is convex.
Quiz
Is this a correlation matrix? (The example matrix and its spectrum are not reproduced in the transcription.)
For what w is
    [ 1  w  w ]
    [ w  1  w ]
    [ w  w  1 ]
a correlation matrix? Answer: -1/(n-1) <= w <= 1.
Structured Correlation Matrices
- Nonnegative entries.
- Low rank.
- Factor structure, e.g.
    [ 1        x_1 x_2  x_1 x_3 ]
    [ x_1 x_2  1        x_2 x_3 ]
    [ x_1 x_3  x_2 x_3  1       ]
Approximate Correlation Matrices
Empirical correlation matrices are often not true correlation matrices, due to:
- asynchronous data
- missing data
- limited precision
- stress testing
Nearest Correlation Matrix
Find X achieving
    min { ||A - X||_F : X is a correlation matrix },
where ||A||_F^2 = sum_{i,j} a_ij^2.
The constraint set is closed and convex, so the minimizer is unique. This is a nonlinear optimization problem.
Optimization Problems: Issues
- Existence and uniqueness of solution.
- Convexity?
- Explicit, closed-form solution?
- Choice of algorithm.
- Availability of derivatives.
- Starting matrix and convergence criterion.
- Practical behaviour.
Quick and Dirty Differentiation
For an analytic function f: R -> R and i = sqrt(-1), the complex step approximation is
    f'(x) ≈ Im f(x + ih) / h,    e.g. h = .
More precisely,
    f'(x) = Im f(x + ih) / h + O(h^2),
    f(x)  = Re f(x + ih) + O(h^2).
See MIMS EPrint , April .
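The complex step trick is a one-liner in any language with complex arithmetic. A minimal sketch in Python/NumPy; the test function and the step size h are my own illustrative choices, not from the slides:

```python
import numpy as np

def complex_step_derivative(f, x, h=1e-20):
    """Approximate f'(x) as Im f(x + ih) / h for analytic f.

    There is no subtractive cancellation, so h can be taken far smaller
    than any finite-difference step and the result is accurate to
    roughly machine precision.
    """
    return np.imag(f(x + 1j * h)) / h

# Example: f(x) = exp(x) sin(x), with f'(x) = exp(x) (sin(x) + cos(x)).
f = lambda x: np.exp(x) * np.sin(x)
x0 = 0.7
exact = np.exp(x0) * (np.sin(x0) + np.cos(x0))
approx = complex_step_derivative(f, x0)
```

The caveat is that f must be implemented with complex-analytic operations only (no abs, conj, or real/imag extraction inside f).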
Alternating Projections Method
Higham (2002): repeatedly project onto the set S_1 of positive semidefinite matrices, then onto the set S_2 of matrices with unit diagonal.
- Easy to implement.
- Guaranteed convergence, at a linear rate.
- Can add further constraints/projections, e.g., fixed elements (Lucas, 2001).
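A minimal sketch of the method in Python/NumPy, with the Dykstra-type correction used in Higham (2002); the tolerance, iteration cap, and test matrix are my own illustrative choices:

```python
import numpy as np

def nearest_corr_alternating_projections(A, tol=1e-8, max_iter=500):
    """Alternating projections for the nearest correlation matrix.

    Alternates projection onto the PSD cone (eigenvalue clipping) and
    onto the unit-diagonal matrices, with a Dykstra-type correction
    applied before the non-affine (PSD) projection.
    """
    Y = A.copy()
    dS = np.zeros_like(A)
    for _ in range(max_iter):
        R = Y - dS                                  # Dykstra correction
        w, V = np.linalg.eigh((R + R.T) / 2)
        X = (V * np.maximum(w, 0)) @ V.T            # project onto PSD cone
        dS = X - R
        Y_old = Y
        Y = X.copy()
        np.fill_diagonal(Y, 1.0)                    # project onto unit diagonal
        if np.linalg.norm(Y - Y_old, 'fro') < tol * np.linalg.norm(Y, 'fro'):
            break
    return Y

A = np.array([[1.0, 1.0, 0.0],
              [1.0, 1.0, 1.0],
              [0.0, 1.0, 1.0]])   # indefinite: eigenvalues 1 - sqrt(2), 1, 1 + sqrt(2)
X = nearest_corr_alternating_projections(A)
```

The linear convergence rate mentioned on the slide is visible here: each sweep costs one eigendecomposition, and many sweeps may be needed for tight tolerances.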
Newton Method
Qi & Sun (2006): a Newton method based on the theory of strongly semismooth matrix functions. It applies Newton's method to the (unconstrained) dual of the min (1/2)||A - X||_F^2 problem. Globally and quadratically convergent.
Higham & Borsdorf (2007) improve efficiency and reliability by using an appropriate iterative method and eigensolver, and by preconditioning the Newton direction solve. This is the basis of G02AAF (nearest correlation matrix) in NAG Library Mark 22.
Factor Model
    ξ = X η + F ε,    X in R^{n x k}, η in R^k, ε in R^n, F = diag(f_i),
with η, ε ~ N(0, 1). Since E(ξ) = 0,
    cov(ξ) = E(ξ ξ^T) = X X^T + F^2.
Assume var(ξ_i) ≡ 1. Then sum_{j=1}^k x_ij^2 + f_ii^2 = 1, so sum_{j=1}^k x_ij^2 <= 1, i = 1:n.
Applications: collateralized debt obligations (CDOs), multivariate time series.
Structured Correlation Matrix
This yields a correlation matrix of the form
    C(X) = D + X X^T = D + sum_{j=1}^k x_j x_j^T,    D = diag(I - X X^T),    X = [x_1, ..., x_k].
C(X) is said to have k-factor correlation matrix structure. Writing y_i^T in R^k for the ith row of X,
    C(X) = [ 1          y_1^T y_2  ...  y_1^T y_n ]
           [ y_1^T y_2  1          ...  y_2^T y_n ]
           [ ...                   ...  ...       ]
           [ y_1^T y_n  y_2^T y_n  ...  1         ].
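The construction is easy to verify numerically: with row norms at most 1, C(X) automatically has unit diagonal, entries in [-1, 1], and is PSD as the sum of two PSD matrices. A small sketch (the random loadings and sizes are my own illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 6, 2
# Random factor loadings, scaled so each row satisfies sum_j x_ij^2 <= 1.
X = rng.uniform(-1.0, 1.0, (n, k))
X /= np.maximum(1.0, np.linalg.norm(X, axis=1, keepdims=True))

# C(X) = diag(I - X X^T) + X X^T: unit diagonal by construction, and PSD
# since diag(1 - ||y_i||^2) and X X^T are both PSD.
XXt = X @ X.T
C = XXt + np.diag(1.0 - np.diag(XXt))
```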
Aims
For k-factor correlation matrices, investigate the mathematical properties and the nearness problem.
1-Parameter Correlation Matrix
    X(w) = [ 1  w  w ]
           [ w  1  w ]
           [ w  w  1 ],    w in R.
Theorem. min{ ||A - X(w)||_F : X(w) a corr. matrix } has the unique solution obtained by projecting
    w = (e^T A e - trace(A)) / (n^2 - n)
onto [-1/(n-1), 1].
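The theorem gives a closed form: w is the mean off-diagonal entry of A, clipped to the feasible interval. A direct sketch (the 3 x 3 test matrix is my own illustrative choice):

```python
import numpy as np

def nearest_one_parameter(A):
    """Nearest matrix of the form X(w) = (1 - w) I + w e e^T:
    w* is the mean off-diagonal entry of A, projected onto [-1/(n-1), 1],
    the interval on which X(w) is a correlation matrix."""
    n = A.shape[0]
    e = np.ones(n)
    w = (e @ A @ e - np.trace(A)) / (n * n - n)   # mean off-diagonal entry
    w = min(1.0, max(-1.0 / (n - 1), w))
    return w, (1.0 - w) * np.eye(n) + w * np.ones((n, n))

A = np.array([[1.0, 0.5, 0.9],
              [0.5, 1.0, 0.1],
              [0.9, 0.1, 1.0]])
w, Xw = nearest_one_parameter(A)   # mean off-diagonal entry is 0.5
```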
Block Structured Correlation Matrix
    [ 1     γ_11  γ_12  γ_12 ]
    [ γ_11  1     γ_12  γ_12 ]
    [ γ_12  γ_12  1     γ_22 ]
    [ γ_12  γ_12  γ_22  1    ],
with blocks
    C_ij = C(γ_ii) in R^{n_i x n_i} for i = j,    C_ij = γ_ij e e^T in R^{n_i x n_j} for i != j.
Objective function:
    f(Γ) = ||A - C(Γ)||_F^2 = sum_{i=1}^m ||A_ii - C(γ_ii)||_F^2 + sum_{i != j} ||A_ij - γ_ij e e^T||_F^2.
The constraint set is convex, so the minimizer is unique, and alternating projections converges.
1-Factor Correlation Matrix
    C(x) = diag(1 - x_i^2) + x x^T,    x in R^n,
i.e., c_ij = x_i x_j for i != j.
Lemma.
    det(C(x)) = prod_{i=1}^n (1 - x_i^2) + sum_{i=1}^n x_i^2 prod_{j != i} (1 - x_j^2).
Corollary. If -e <= x <= e with |x_i| = 1 for at most one i, then C(x) is nonsingular. C(x) is singular if |x_i| = |x_j| = 1 for some i != j.
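The determinant formula in the lemma can be checked numerically against a dense determinant; a quick sketch (random x with |x_i| < 1 is my own illustrative choice):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 6
x = rng.uniform(-0.95, 0.95, n)
C = np.diag(1.0 - x ** 2) + np.outer(x, x)   # C(x) = diag(1 - x_i^2) + x x^T

# det(C(x)) = prod_i (1 - x_i^2) + sum_i x_i^2 * prod_{j != i} (1 - x_j^2)
t = 1.0 - x ** 2
det_formula = np.prod(t) + sum(x[i] ** 2 * np.prod(np.delete(t, i))
                               for i in range(n))
det_numpy = np.linalg.det(C)
```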
Rank Result
Theorem. Let C(x) = diag(1 - x_i^2) + x x^T with x in R^n, -e <= x <= e. Then rank(C(x)) = min(p + 1, n), where p is the number of x_i for which |x_i| < 1.
Example: x = [1 1 1 x_4 x_5]^T gives
    C(x) = [ 1        1        1        x_4      x_5     ]
           [ 1        1        1        x_4      x_5     ]
           [ 1        1        1        x_4      x_5     ]
           [ x_4      x_4      x_4      1        x_4 x_5 ]
           [ x_5      x_5      x_5      x_4 x_5  1       ].
One-Factor Problem
    min_{x in R^n} f(x) := ||A - C(x)||_F^2    subject to    -e <= x <= e.
The objective function is nonconvex. The constraint implies that C(x) is a correlation matrix.
One-Factor Problem: Derivatives
Objective:
    f(x) = <A - I, A - I>_F - 2 x^T (A - I) x + (x^T x)^2 - sum_{i=1}^n x_i^4.
Gradient:
    ∇f(x) = 4( (x^T x) x - (A - I) x - diag(x_i^2) x ).
Hessian:
    ∇^2 f(x) = 4( 2 x x^T + (x^T x + 1) I - A - 3 diag(x_i^2) ).
∇f(x) and ∇^2 f(x) are cheap to evaluate. f has a saddle point at x = 0.
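The expanded objective and its gradient are a few lines of NumPy, and the complex step approximation from the earlier slide gives a convenient correctness check, since the expanded f is polynomial and hence complex-analytic. The random symmetric unit-diagonal test matrix is my own illustrative choice:

```python
import numpy as np

def f_one_factor(A, x):
    """f(x) = ||A - C(x)||_F^2 in expanded form (assumes a_ii = 1)."""
    B = A - np.eye(A.shape[0])
    return np.sum(B * B) - 2 * (x @ B @ x) + (x @ x) ** 2 - np.sum(x ** 4)

def grad_one_factor(A, x):
    """grad f(x) = 4((x^T x) x - (A - I) x - diag(x_i^2) x)."""
    B = A - np.eye(A.shape[0])
    return 4.0 * ((x @ x) * x - B @ x - x ** 2 * x)

rng = np.random.default_rng(1)
n = 5
M = rng.uniform(-1.0, 1.0, (n, n))
A = (M + M.T) / 2
np.fill_diagonal(A, 1.0)
x = rng.uniform(-0.9, 0.9, n)

# Check the gradient component-wise with the complex step approximation.
h = 1e-20
g = grad_one_factor(A, x)
g_cs = np.array([np.imag(f_one_factor(A, x + 1j * h * np.eye(n)[j])) / h
                 for j in range(n)])
```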
Case 1: f(x*) = 0
If f(x*) = 0 then ∇^2 f(x*) = 4 H_n(x*), where
    H_n(x) = (x^T x) I + x x^T - 2 D^2,    D = diag(x_i),    x in R^n.
For example, for n = 4:
    H_4(x) = [ x_2^2+x_3^2+x_4^2  x_1 x_2            x_1 x_3            x_1 x_4           ]
             [ x_2 x_1            x_1^2+x_3^2+x_4^2  x_2 x_3            x_2 x_4           ]
             [ x_3 x_1            x_3 x_2            x_1^2+x_2^2+x_4^2  x_3 x_4           ]
             [ x_4 x_1            x_4 x_2            x_4 x_3            x_1^2+x_2^2+x_3^2 ].
Theorem. H_n is positive semidefinite. Moreover, H_n is nonsingular iff at least three of x_1, x_2, ..., x_n are nonzero.
Case 2: f(x*) > 0
Can write ∇^2 f(x) = 4( H_n(x) + E_n(x) ), where E_n has zero diagonal and off-diagonal entries x_i x_j - a_ij:
    E_4 = [ 0                x_1 x_2 - a_12   x_1 x_3 - a_13   x_1 x_4 - a_14 ]
          [ x_2 x_1 - a_21   0                x_2 x_3 - a_23   x_2 x_4 - a_24 ]
          [ x_3 x_1 - a_31   x_3 x_2 - a_32   0                x_3 x_4 - a_34 ]
          [ x_4 x_1 - a_41   x_4 x_2 - a_42   x_4 x_3 - a_43   0              ].
Hence, if the entries x_i x_j - a_ij are sufficiently small and H_n is positive definite, a stationary point is a local minimizer.
k-Factor Problem
    C(X) := I - diag(X X^T) + X X^T,    X in R^{n x k}.
The representation is not unique!
    sum_{j=1}^k x_ij^2 <= 1, i = 1:n    implies    C(X) is a correlation matrix.
The k-factor problem is
    min_{X in R^{n x k}} f(X) := ||A - C(X)||_F^2    subject to    sum_{j=1}^k x_ij^2 <= 1, i = 1:n.
k-Factor Problem: Derivatives
Gradient:
    ∇f(X) = 4( X (X^T X) - A X + X - diag(X X^T) X ).
The Hessian is given implicitly; it can be viewed as a matrix representation of the Fréchet derivative of ∇f(X).
Choice of Optimization Method
- Derivatives are available.
- Ignore the constraints?
- Starting matrix, convergence test?
Rich set of solvers in NAG Library, Mark 22: E04 (Minimizing or Maximizing a Function), E05 (Global Optimization of a Function); also the MATLAB Optimization Toolbox.
Alternating Directions
Minimize over one entry x_ij at a time. As a function of x_ij,
    f(x_ij) = const + 2 sum_{q != i} ( a_iq - sum_{s=1}^k x_is x_qs )^2.
Hence f'(x_ij) = 0 if
    x_ij = [ sum_{q != i} x_qj ( a_iq - sum_{s != j} x_is x_qs ) ] / sum_{q != i} x_qj^2.
Then project x_ij onto [-1, 1].
Convergence guaranteed. The limit may not be feasible for k > 1.
Principal Factors Method
Anderson, Sidenius & Basu (2003): with F(X) = I - diag(X X^T),
    X_i = argmin_{X in R^{n x k}} || (A - F(X_{i-1})) - X X^T ||_F.
The minimum is obtained from an eigendecomposition of A - F(X_{i-1}). Equivalent to the alternating projections method for
    U := { W in R^{n x n} : w_ij = a_ij for i != j }  (convex),
    S := { W in R^{n x n} : W = X X^T for some X in R^{n x k} }  (nonconvex!).
Alternating projections theory therefore gives no guarantee of convergence. The constraints are ignored, so the final iterate is projected onto them.
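A sketch of the principal factors iteration in Python/NumPy. The iteration count, the synthetic exactly-2-factor test problem, and the final row projection are my own illustrative choices; as the slide notes, convergence is not guaranteed in general:

```python
import numpy as np

def principal_factors(A, k, n_iter=500):
    """X_i = argmin ||(A - F(X_{i-1})) - X X^T||_F via the truncated
    eigendecomposition of A - F(X_{i-1}), with F(X) = I - diag(X X^T)."""
    n = A.shape[0]
    X = np.zeros((n, k))
    for _ in range(n_iter):
        F = np.diag(1.0 - np.sum(X ** 2, axis=1))
        w, V = np.linalg.eigh(A - F)
        idx = np.argsort(w)[::-1][:k]                     # k largest eigenvalues
        X = V[:, idx] * np.sqrt(np.maximum(w[idx], 0.0))  # best PSD rank-k fit
    # The constraints were ignored, so project the rows onto the unit ball.
    X /= np.maximum(1.0, np.linalg.norm(X, axis=1, keepdims=True))
    return X

# Build an exact 2-factor correlation matrix, then try to recover it.
rng = np.random.default_rng(2)
n, k = 8, 2
Xt = rng.uniform(-1.0, 1.0, (n, k))
Xt *= 0.8 / np.maximum(0.8, np.linalg.norm(Xt, axis=1, keepdims=True))
C = Xt @ Xt.T + np.diag(1.0 - np.sum(Xt ** 2, axis=1))

Xh = principal_factors(C, k)
Ch = Xh @ Xh.T + np.diag(1.0 - np.sum(Xh ** 2, axis=1))
err = np.linalg.norm(C - Ch) / np.linalg.norm(C)
```

On exact factor-structured data the iteration typically converges quickly; on general data it can stall or cycle, which is why the comparison slides report it as fast but possibly infeasible.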
Spectral Projected Gradient Method
Birgin, Martínez & Raydan (2000). To minimize f: R^n -> R over a convex set Ω:
    x_{k+1} = x_k + α_k d_k,
where d_k = P_Ω( x_k - t_k ∇f(x_k) ) - x_k is a descent direction and α_k in (0, 1] is chosen by a nonmonotone line search strategy. Promising here, since P_Ω(.) is cheap to compute.
Code available in ACM TOMS (Algorithm 813).
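A compact sketch of the SPG scheme on a toy box-constrained problem; this is not the Algorithm 813 code, and the quadratic test function, history length, and line-search parameters are my own illustrative choices:

```python
import numpy as np

def spg(f, grad, project, x0, n_iter=200, m=10):
    """Spectral projected gradient sketch: Barzilai-Borwein step t,
    projected-gradient direction d, and a nonmonotone Armijo line search
    against the maximum of the last m function values."""
    x = project(x0)
    g = grad(x)
    hist = [f(x)]
    t = 1.0
    for _ in range(n_iter):
        d = project(x - t * g) - x                 # feasible descent direction
        if np.linalg.norm(d) < 1e-10:
            break
        alpha, fref = 1.0, max(hist[-m:])
        while f(x + alpha * d) > fref + 1e-4 * alpha * (g @ d) and alpha > 1e-10:
            alpha *= 0.5                           # nonmonotone backtracking
        x_new = x + alpha * d
        g_new = grad(x_new)
        s, y = x_new - x, g_new - g
        t = (s @ s) / (s @ y) if s @ y > 1e-12 else 1.0   # BB step length
        x, g = x_new, g_new
        hist.append(f(x))
    return x

# Toy problem: min ||x - c||^2 over the box [-1, 1]^3;
# the solution is simply the projection of c onto the box.
c = np.array([2.0, -3.0, 0.5])
x_star = spg(lambda x: (x - c) @ (x - c),
             lambda x: 2.0 * (x - c),
             lambda x: np.clip(x, -1.0, 1.0),
             np.zeros(3))
```

For the k-factor problem, P_Ω is the row-wise projection onto the unit ball, which is exactly the cheap projection the slide refers to.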
Test Examples
- corr: gallery('randcorr', n).
- nrand: (1/2)(B + B^T) + diag(I - B) with B in [-1, 1]^{n x n} such that λ_min(B) < 0.
Results averaged over 10 instances. Methods compared:
- AD: alternating directions.
- PFM: principal factors method.
- Nwt: e04lb of the NAG Toolbox for MATLAB (modified Newton), bound constraints.
- SPGM: spectral projected gradient method.
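The nrand construction is easy to reproduce; a sketch in Python/NumPy (slightly rearranged from the slide: I test indefiniteness of the symmetrized unit-diagonal result rather than of B itself, and the size and seed are illustrative):

```python
import numpy as np

def nrand(n, rng):
    """Random 'approximate correlation matrix' test example: symmetrize a
    random B in [-1, 1]^{n x n}, reset the diagonal to ones, and retry
    until the result is indefinite (so it is not a correlation matrix)."""
    while True:
        B = rng.uniform(-1.0, 1.0, (n, n))
        A = (B + B.T) / 2
        np.fill_diagonal(A, 1.0)
        if np.linalg.eigvalsh(A).min() < 0:
            return A

A = nrand(30, np.random.default_rng(4))
```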
Comparison: k = 1, n = 2000
Table of time (sec.), iteration counts and f(x*) for AD, PFM, Nwt and SPGM, at tolerances 10^-3 and 10^-6, on corr (f(x_0) = 26.0) and nrand problems. (The numerical entries of the table did not survive transcription.)
Comparison: k = 6, n = 1000
Same layout, on corr (f(x_0) = 18.5) and nrand (f(x_0) = 415) problems. (Most numerical entries of the table did not survive transcription.)
Conclusions
- Performance of the methods depends on the problem type, the required tolerance and the problem size.
- Alternating directions is good for k = 1 at low accuracy.
- The principal factors method is generally fast, but may not converge to a feasible point.
- The spectral projected gradient method wins overall.
- Incorporating the constraints need not hurt performance.
- Exploit convexity when present.
References
- L. Anderson, J. Sidenius, and S. Basu. All your hedges in one basket. Risk, pages 67-72, Nov. 2003.
- J. Barzilai and J. M. Borwein. Two-point step size gradient methods. IMA J. Numer. Anal., 8.
- E. G. Birgin, J. M. Martínez, and M. Raydan. Nonmonotone spectral projected gradient methods on convex sets. SIAM J. Optim., 10(4), 2000.
- E. G. Birgin, J. M. Martínez, and M. Raydan. Algorithm 813: SPG - software for convex-constrained optimization. ACM Trans. Math. Software, 27(3).
- E. G. Birgin, J. M. Martínez, and M. Raydan. Spectral projected gradient methods. In C. A. Floudas and P. M. Pardalos, editors, Encyclopedia of Optimization. Springer-Verlag, Berlin, second edition.
- P. Glasserman and S. Suchintabandid. Correlation expansions for CDO pricing. Journal of Banking & Finance, 31.
- N. J. Higham. Computing the nearest correlation matrix: a problem from finance. IMA J. Numer. Anal., 22(3), 2002.
- C. Lucas. Computing nearest covariance and correlation matrices. M.Sc. Thesis, University of Manchester, Manchester, England, Oct. 2001.
- H.-D. Qi and D. Sun. A quadratically convergent Newton method for computing the nearest correlation matrix. SIAM J. Matrix Anal. Appl., 28(2), 2006.
- P. Sonneveld, J. J. I. M. van Kan, X. Huang, and C. W. Oosterlee. Nonnegative matrix factorization of a correlation matrix. Linear Algebra Appl., 431.
- A. Vandendorpe, N.-D. Ho, S. Vanduffel, and P. Van Dooren. On the parameterization of the CreditRisk+ model for estimating credit portfolio risk. Insurance: Mathematics and Economics, 42(2).
Mathematical finance and linear programming (optimization) Geir Dahl September 15, 2009 1 Introduction The purpose of this short note is to explain how linear programming (LP) (=linear optimization) may
More informationApplied Linear Algebra I Review page 1
Applied Linear Algebra Review 1 I. Determinants A. Definition of a determinant 1. Using sum a. Permutations i. Sign of a permutation ii. Cycle 2. Uniqueness of the determinant function in terms of properties
More informationMassive Data Classification via Unconstrained Support Vector Machines
Massive Data Classification via Unconstrained Support Vector Machines Olvi L. Mangasarian and Michael E. Thompson Computer Sciences Department University of Wisconsin 1210 West Dayton Street Madison, WI
More informationCSCI567 Machine Learning (Fall 2014)
CSCI567 Machine Learning (Fall 2014) Drs. Sha & Liu {feisha,yanliu.cs}@usc.edu September 22, 2014 Drs. Sha & Liu ({feisha,yanliu.cs}@usc.edu) CSCI567 Machine Learning (Fall 2014) September 22, 2014 1 /
More informationLecture 2: The SVM classifier
Lecture 2: The SVM classifier C19 Machine Learning Hilary 2015 A. Zisserman Review of linear classifiers Linear separability Perceptron Support Vector Machine (SVM) classifier Wide margin Cost function
More informationMATRIX ALGEBRA AND SYSTEMS OF EQUATIONS. + + x 2. x n. a 11 a 12 a 1n b 1 a 21 a 22 a 2n b 2 a 31 a 32 a 3n b 3. a m1 a m2 a mn b m
MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS 1. SYSTEMS OF EQUATIONS AND MATRICES 1.1. Representation of a linear system. The general system of m equations in n unknowns can be written a 11 x 1 + a 12 x 2 +
More informationLecture Notes to Accompany. Scientific Computing An Introductory Survey. by Michael T. Heath. Chapter 10
Lecture Notes to Accompany Scientific Computing An Introductory Survey Second Edition by Michael T. Heath Chapter 10 Boundary Value Problems for Ordinary Differential Equations Copyright c 2001. Reproduction
More informationChapter 6. Orthogonality
6.3 Orthogonal Matrices 1 Chapter 6. Orthogonality 6.3 Orthogonal Matrices Definition 6.4. An n n matrix A is orthogonal if A T A = I. Note. We will see that the columns of an orthogonal matrix must be
More informationNumerical methods for American options
Lecture 9 Numerical methods for American options Lecture Notes by Andrzej Palczewski Computational Finance p. 1 American options The holder of an American option has the right to exercise it at any moment
More informationInner Product Spaces
Math 571 Inner Product Spaces 1. Preliminaries An inner product space is a vector space V along with a function, called an inner product which associates each pair of vectors u, v with a scalar u, v, and
More informationCost Minimization and the Cost Function
Cost Minimization and the Cost Function Juan Manuel Puerta October 5, 2009 So far we focused on profit maximization, we could look at a different problem, that is the cost minimization problem. This is
More informationLecture 7: Finding Lyapunov Functions 1
Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science 6.243j (Fall 2003): DYNAMICS OF NONLINEAR SYSTEMS by A. Megretski Lecture 7: Finding Lyapunov Functions 1
More information1. Introduction and Summary
PSYCI-IOMETRIKA--VOL. 31, No. 2 Ju~ ~, 1966 TESTING A SIMPLE STRUCTURE HYPOTHESIS IN FACTOR ANALYSIS* K. G. JiJRESKOGt UNIVERSITY OF UPPSALA, SWEDEN It is assumed that the investigator has set up a simple
More informationVector and Matrix Norms
Chapter 1 Vector and Matrix Norms 11 Vector Spaces Let F be a field (such as the real numbers, R, or complex numbers, C) with elements called scalars A Vector Space, V, over the field F is a non-empty
More informationNotes V General Equilibrium: Positive Theory. 1 Walrasian Equilibrium and Excess Demand
Notes V General Equilibrium: Positive Theory In this lecture we go on considering a general equilibrium model of a private ownership economy. In contrast to the Notes IV, we focus on positive issues such
More informationSolving Linear Systems, Continued and The Inverse of a Matrix
, Continued and The of a Matrix Calculus III Summer 2013, Session II Monday, July 15, 2013 Agenda 1. The rank of a matrix 2. The inverse of a square matrix Gaussian Gaussian solves a linear system by reducing
More informationCS 295 (UVM) The LMS Algorithm Fall 2013 1 / 23
The LMS Algorithm The LMS Objective Function. Global solution. Pseudoinverse of a matrix. Optimization (learning) by gradient descent. LMS or Widrow-Hoff Algorithms. Convergence of the Batch LMS Rule.
More informationSparse recovery and compressed sensing in inverse problems
Gerd Teschke (7. Juni 2010) 1/68 Sparse recovery and compressed sensing in inverse problems Gerd Teschke (joint work with Evelyn Herrholz) Institute for Computational Mathematics in Science and Technology
More information4.6 Linear Programming duality
4.6 Linear Programming duality To any minimization (maximization) LP we can associate a closely related maximization (minimization) LP. Different spaces and objective functions but in general same optimal
More informationLINEAR ALGEBRA. September 23, 2010
LINEAR ALGEBRA September 3, 00 Contents 0. LU-decomposition.................................... 0. Inverses and Transposes................................. 0.3 Column Spaces and NullSpaces.............................
More informationNonlinear Algebraic Equations Example
Nonlinear Algebraic Equations Example Continuous Stirred Tank Reactor (CSTR). Look for steady state concentrations & temperature. s r (in) p,i (in) i In: N spieces with concentrations c, heat capacities
More informationLinear Programming. Widget Factory Example. Linear Programming: Standard Form. Widget Factory Example: Continued.
Linear Programming Widget Factory Example Learning Goals. Introduce Linear Programming Problems. Widget Example, Graphical Solution. Basic Theory:, Vertices, Existence of Solutions. Equivalent formulations.
More informationTWO L 1 BASED NONCONVEX METHODS FOR CONSTRUCTING SPARSE MEAN REVERTING PORTFOLIOS
TWO L 1 BASED NONCONVEX METHODS FOR CONSTRUCTING SPARSE MEAN REVERTING PORTFOLIOS XIAOLONG LONG, KNUT SOLNA, AND JACK XIN Abstract. We study the problem of constructing sparse and fast mean reverting portfolios.
More information1 Solving LPs: The Simplex Algorithm of George Dantzig
Solving LPs: The Simplex Algorithm of George Dantzig. Simplex Pivoting: Dictionary Format We illustrate a general solution procedure, called the simplex algorithm, by implementing it on a very simple example.
More informationarxiv:cs/0101001v1 [cs.ms] 3 Jan 2001
ARGONNE NATIONAL LABORATORY 9700 South Cass Avenue Argonne, Illinois 60439 arxiv:cs/0101001v1 [cs.ms] 3 Jan 2001 AUTOMATIC DIFFERENTIATION TOOLS IN OPTIMIZATION SOFTWARE Jorge J. Moré Mathematics and Computer
More informationLABEL PROPAGATION ON GRAPHS. SEMI-SUPERVISED LEARNING. ----Changsheng Liu 10-30-2014
LABEL PROPAGATION ON GRAPHS. SEMI-SUPERVISED LEARNING ----Changsheng Liu 10-30-2014 Agenda Semi Supervised Learning Topics in Semi Supervised Learning Label Propagation Local and global consistency Graph
More information