MATRICES WITH DISPLACEMENT STRUCTURE: A SURVEY

PLAMEN KOEV

Date: July 1999.

Abstract. In this survey we look at structured matrices that have what is referred to as low displacement rank. Matrices such as Cauchy, Vandermonde, polynomial Vandermonde, Chebyshev-Vandermonde, Toeplitz, Hankel and others depend on only $O(n)$ parameters instead of $n^2$. This suggests that linear systems with such matrices should be solvable in fewer than $O(n^3)$ operations, and that the same should extend to LU factorization and inversion. Also, the inverses of (say) Vandermonde matrices do not have Vandermonde structure, yet they should have similar properties when it comes to solving linear equations. The property that describes the above structured matrices and their inverses (and Schur complements) is that they have low displacement rank. Exploiting the displacement structure of a matrix allows us to obtain $O(n^2)$ algorithms for solving $Ax = b$, for obtaining the LU factorization, and for inverting matrices with low displacement rank.

The present survey does not contain any new results and is entirely based on the excellent papers by Vadim Olshevsky and Thomas Kailath noted in the references. Our task was to provide an outline of the main results for matrices with low displacement rank and to give the reader an insight into the underlying logic of this theory.

Let the matrices $F, A \in \mathbb{C}^{n \times n}$ be given. Let $R \in \mathbb{C}^{n \times n}$ be a matrix satisfying a Sylvester type equation
\[
\nabla_{\{F,A\}}(R) = F R - R A = G B
\]
for some rectangular matrices $G \in \mathbb{C}^{n \times \alpha}$, $B \in \mathbb{C}^{\alpha \times n}$, where the number $\alpha$ is small in comparison to $n$. The pair of matrices $\{G, B\}$ above is referred to as the $\{F, A\}$-generator of $R$, and the smallest possible inner size $\alpha$ among all $\{F, A\}$-generators is called the $\{F, A\}$-displacement rank of $R$. This is the so-called Toeplitz-like displacement operator. The Hankel-like displacement operator is defined as
\[
\nabla_{\{F,A\}}(R) = R - F R A = G B.
\]

Basic classes of Structured Matrices

Toeplitz-like: $F = Z_1$, $A = Z_{-1}$
Toeplitz-plus-Hankel-like: $F = Y_{00}$, $A = Y_{11}$
Cauchy-like: $F = \operatorname{diag}(c_1, \ldots, c_n)$, $A = \operatorname{diag}(d_1, \ldots, d_n)$
Vandermonde-like: $F = \operatorname{diag}(1/x_1, \ldots, 1/x_n)$, $A = Z_1$
Chebyshev-Vandermonde-like: $F = \operatorname{diag}(x_1, \ldots, x_n)$, $A = Y_{\gamma\delta}$
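The low displacement rank of two of these classes can be checked directly on small instances. The following is only a sketch in exact rational arithmetic: the helper names (`matmul`, `displacement`, `shift`, `diag`) and the sample entries are assumptions made for illustration, and $Z_\varphi$ is taken to be the lower shift $\varphi$-circulant matrix.

```python
from fractions import Fraction

def matmul(X, Y):
    """Plain matrix product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def displacement(F, A, R):
    """Sylvester-type displacement F R - R A."""
    FR, RA = matmul(F, R), matmul(R, A)
    return [[p - q for p, q in zip(r1, r2)] for r1, r2 in zip(FR, RA)]

def shift(n, phi):
    """Lower shift phi-circulant Z_phi: ones below the diagonal, phi in the top-right corner."""
    Z = [[0] * n for _ in range(n)]
    for i in range(1, n):
        Z[i][i - 1] = 1
    Z[0][n - 1] = phi
    return Z

def diag(v):
    """Diagonal matrix with the entries of v on the diagonal."""
    return [[v[i] if i == j else 0 for j in range(len(v))] for i in range(len(v))]

# Toeplitz matrix T = [t_{i-j}] with arbitrary (hypothetical) integer entries.
n = 5
T = [[(i - j) ** 2 + 2 * (i - j) + 3 for j in range(n)] for i in range(n)]
DT = displacement(shift(n, 1), shift(n, -1), T)
# All nonzeros of DT sit in the first row and the last column, so rank(DT) <= 2.
toeplitz_ok = all(DT[i][j] == 0 for i in range(1, n) for j in range(n - 1))

# Cauchy matrix C = [1 / (c_i - d_j)] with sample nodes c and d.
c, d = [4, 5, 6], [1, 2, 3]
C = [[Fraction(1, ci - dj) for dj in d] for ci in c]
DC = displacement(diag(c), diag(d), C)  # the all-ones matrix, which has rank one
```

The check does not compute a rank explicitly: for the Toeplitz case it verifies that the displacement vanishes outside one row and one column, which bounds the rank by 2.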
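In the list of basic classes, $Z_\varphi$ and $Y_{\gamma\delta}$ denote
\[
Z_\varphi = \begin{bmatrix}
0 & 0 & \cdots & 0 & \varphi \\
1 & 0 & \cdots & 0 & 0 \\
0 & 1 & \ddots & \vdots & \vdots \\
\vdots & & \ddots & 0 & 0 \\
0 & \cdots & 0 & 1 & 0
\end{bmatrix},
\qquad
Y_{\gamma\delta} = \begin{bmatrix}
\gamma & 1 & 0 & \cdots & 0 \\
1 & 0 & 1 & \ddots & \vdots \\
0 & 1 & \ddots & \ddots & 0 \\
\vdots & \ddots & \ddots & 0 & 1 \\
0 & \cdots & 0 & 1 & \delta
\end{bmatrix},
\]
i.e. $Z_\varphi$ is the lower shift $\varphi$-circulant matrix and $Y_{\gamma\delta} = Z_0 + Z_0^T + \gamma e_1 e_1^T + \delta e_n e_n^T$.

Example (Cauchy matrix). Let $C = \left[ \frac{1}{c_i - d_j} \right]_{1 \le i,j \le n}$. Then
\[
\operatorname{diag}(c_1, \ldots, c_n)\, C - C \operatorname{diag}(d_1, \ldots, d_n)
= \left[ \frac{c_i}{c_i - d_j} \right] - \left[ \frac{d_j}{c_i - d_j} \right]
= \begin{bmatrix} 1 \\ \vdots \\ 1 \end{bmatrix} \begin{bmatrix} 1 & \cdots & 1 \end{bmatrix}.
\]
Therefore the displacement rank of a Cauchy matrix is one.

The Displacement Structure is Inherited During Inversion

If $F R - R A = G B$ then
\[
A R^{-1} - R^{-1} F = -(R^{-1} G)(B R^{-1}),
\]
so $R^{-1}$ has a similar displacement structure, and its $\{A, F\}$-displacement rank is the same as the $\{F, A\}$-displacement rank of $R$.

Similarity Transformations Preserve the Displacement Rank

If $F R - R A = G B$ and $\tilde{R} = T_1^{-1} R T_2$ then $\tilde{F} \tilde{R} - \tilde{R} \tilde{A} = \tilde{G} \tilde{B}$, where $\tilde{F} = T_1^{-1} F T_1$, $\tilde{A} = T_2^{-1} A T_2$, $\tilde{G} = T_1^{-1} G$ and $\tilde{B} = B T_2$. This allows us to transform a structured matrix from one class to another.

The Displacement Structure is Inherited During Schur Complementation

Lemma. Let the matrix
\[
R_1 = \begin{bmatrix} d_1 & u_1 \\ l_1 & R_{22}^{(1)} \end{bmatrix}
\]
satisfy the Sylvester type displacement equation
\[
\nabla_{\{F_1,A_1\}}(R_1) = \begin{bmatrix} f_1 & 0 \\ * & F_2 \end{bmatrix} R_1 - R_1 \begin{bmatrix} a_1 & * \\ 0 & A_2 \end{bmatrix} = G_1 B_1,
\]
where $G_1 \in \mathbb{C}^{n \times \alpha}$ and $B_1 \in \mathbb{C}^{\alpha \times n}$. If $d_1 \neq 0$ then the Schur complement $R_2 = R_{22}^{(1)} - \frac{1}{d_1} l_1 u_1$ satisfies the displacement equation
\[
F_2 R_2 - R_2 A_2 = G_2 B_2,
\]
where
\[
\begin{bmatrix} 0 \\ G_2 \end{bmatrix} = G_1 - \begin{bmatrix} 1 \\ \frac{1}{d_1} l_1 \end{bmatrix} g_1,
\qquad
\begin{bmatrix} 0 & B_2 \end{bmatrix} = B_1 - b_1 \begin{bmatrix} 1 & \frac{1}{d_1} u_1 \end{bmatrix},
\]
and $g_1$ and $b_1$ are the first row of $G_1$ and the first column of $B_1$ respectively.

Proof. From the standard Schur complementation formula
\[
R_1 = \begin{bmatrix} 1 & 0 \\ \frac{1}{d_1} l_1 & I \end{bmatrix}
\begin{bmatrix} d_1 & 0 \\ 0 & R_2 \end{bmatrix}
\begin{bmatrix} 1 & \frac{1}{d_1} u_1 \\ 0 & I \end{bmatrix}
\]
we get
\[
\begin{bmatrix} f_1 & 0 \\ * & F_2 \end{bmatrix} \begin{bmatrix} d_1 & 0 \\ 0 & R_2 \end{bmatrix}
- \begin{bmatrix} d_1 & 0 \\ 0 & R_2 \end{bmatrix} \begin{bmatrix} a_1 & * \\ 0 & A_2 \end{bmatrix}
= \begin{bmatrix} 1 & 0 \\ -\frac{1}{d_1} l_1 & I \end{bmatrix} G_1 B_1 \begin{bmatrix} 1 & -\frac{1}{d_1} u_1 \\ 0 & I \end{bmatrix}.
\]
Equating the $(2,2)$ block entries one obtains the desired result.

Note: The requirement that $F_1$ be lower triangular and $A_1$ be upper triangular is essential. Otherwise the above (and the fast algorithm) does not work. In other words, one step of Gaussian elimination must leave the triangular structure of $F$ and $A^T$ unchanged.

Fast Gaussian Elimination for a Structured Matrix

1. Recover from the generator the first column and the first row of
\[
R_1 = \begin{bmatrix} d_1 & u_1 \\ l_1 & R_{22}^{(1)} \end{bmatrix}.
\]
Note: This must take $O(1)$ flops per entry, or the cost of the algorithm goes beyond $O(n^2)$.
2. Now one has the first column $\begin{bmatrix} 1 \\ \frac{1}{d_1} l_1 \end{bmatrix}$ of $L$ and the first row $\begin{bmatrix} d_1 & u_1 \end{bmatrix}$ of $U$ in the LU factorization of $R_1$.
3. Compute a generator of the Schur complement $R_2$ of $R_1$ using
\[
\begin{bmatrix} 0 \\ G_2 \end{bmatrix} = G_1 - \begin{bmatrix} 1 \\ \frac{1}{d_1} l_1 \end{bmatrix} g_1,
\qquad
\begin{bmatrix} 0 & B_2 \end{bmatrix} = B_1 - b_1 \begin{bmatrix} 1 & \frac{1}{d_1} u_1 \end{bmatrix},
\]
where $g_1$ and $b_1$ are the first row of $G_1$ and the first column of $B_1$ respectively.

Repeating these steps on $R_2$ yields the remaining columns of $L$ and rows of $U$.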
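For the Cauchy-like case, where an entry is recovered from the generator as $r_{ij} = g_i b_j / (c_i - d_j)$, the three steps above can be sketched as follows. This is a minimal illustration under stated assumptions rather than a reference implementation: the function name `fast_ge_cauchy_like` is hypothetical, exact rational arithmetic is used so that the factors can be inspected, and no pivoting is performed.

```python
from fractions import Fraction

def fast_ge_cauchy_like(c, d, G, B):
    """LU factors of the Cauchy-like R with diag(c) R - R diag(d) = G B.

    G is n x alpha, B is alpha x n (lists of rows). No pivoting:
    a zero pivot raises ZeroDivisionError.
    """
    n, alpha = len(c), len(B)
    G = [[Fraction(v) for v in row] for row in G]
    B = [[Fraction(v) for v in row] for row in B]
    L = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    U = [[Fraction(0)] * n for _ in range(n)]

    def entry(i, j):
        # r_ij = g_i b_j / (c_i - d_j): O(alpha) work per recovered entry
        return sum(G[i][t] * B[t][j] for t in range(alpha)) / (c[i] - d[j])

    for k in range(n):
        # Step 1: first column and first row of the current Schur complement.
        col = [entry(i, k) for i in range(k, n)]
        row = [entry(k, j) for j in range(k, n)]
        pivot = col[0]
        if pivot == 0:
            raise ZeroDivisionError("zero pivot; pivoting would be needed")
        # Step 2: k-th column of L and k-th row of U.
        for j in range(k, n):
            U[k][j] = row[j - k]
        for i in range(k + 1, n):
            L[i][k] = col[i - k] / pivot
        # Step 3: generator of the next Schur complement.
        gk = G[k][:]                           # first row of the current G
        bk = [B[t][k] for t in range(alpha)]   # first column of the current B
        for i in range(k + 1, n):
            G[i] = [G[i][t] - L[i][k] * gk[t] for t in range(alpha)]
        for j in range(k + 1, n):
            f = row[j - k] / pivot
            for t in range(alpha):
                B[t][j] -= bk[t] * f
    return L, U

# Sample data: c = (4, 5, 6), d = (1, 2, 3), displacement rank 2.
L, U = fast_ge_cauchy_like([4, 5, 6], [1, 2, 3],
                           [[1, 0], [1, 1], [1, 2]],
                           [[1, 2, 3], [0, 1, 1]])
```

Each elimination step touches $O(\alpha n)$ numbers, so the whole factorization costs $O(\alpha n^2)$ operations, against $O(n^3)$ for elimination on the full matrix.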
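Example. Consider the Sylvester type equation for a Cauchy-like matrix:
\[
\begin{bmatrix} 4 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & 6 \end{bmatrix} R_1
- R_1 \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{bmatrix}
= \begin{bmatrix} 1 & 0 \\ 1 & 1 \\ 1 & 2 \end{bmatrix}
\begin{bmatrix} 1 & 2 & 3 \\ 0 & 1 & 1 \end{bmatrix},
\qquad
R_1 = \begin{bmatrix} 1/3 & 2/2 & 3 \\ 1/4 & 3/3 & 4/2 \\ 1/5 & 4/4 & 5/3 \end{bmatrix}.
\]
The first column of $L$ and the first row of $U$ in the LU decomposition are:
\[
R_1 = LU = \begin{bmatrix} 1 & 0 & 0 \\ 3/4 & * & 0 \\ 3/5 & * & * \end{bmatrix}
\begin{bmatrix} 1/3 & 1 & 3 \\ 0 & * & * \\ 0 & 0 & * \end{bmatrix}.
\]
The generators of the Schur complement are:
\[
\begin{bmatrix} 0 \\ G_2 \end{bmatrix}
= \begin{bmatrix} 1 & 0 \\ 1 & 1 \\ 1 & 2 \end{bmatrix}
- \begin{bmatrix} 1 \\ 3/4 \\ 3/5 \end{bmatrix} \begin{bmatrix} 1 & 0 \end{bmatrix}
= \begin{bmatrix} 0 & 0 \\ 1/4 & 1 \\ 2/5 & 2 \end{bmatrix},
\quad \text{therefore} \quad
G_2 = \begin{bmatrix} 1/4 & 1 \\ 2/5 & 2 \end{bmatrix}.
\]
Also
\[
\begin{bmatrix} 0 & B_2 \end{bmatrix}
= \begin{bmatrix} 1 & 2 & 3 \\ 0 & 1 & 1 \end{bmatrix}
- \begin{bmatrix} 1 \\ 0 \end{bmatrix} \begin{bmatrix} 1 & 3 & 9 \end{bmatrix}
= \begin{bmatrix} 0 & -1 & -6 \\ 0 & 1 & 1 \end{bmatrix},
\quad \text{therefore} \quad
B_2 = \begin{bmatrix} -1 & -6 \\ 1 & 1 \end{bmatrix}.
\]
So the Schur complement $R_2$ satisfies the following displacement equation:
\[
\begin{bmatrix} 5 & 0 \\ 0 & 6 \end{bmatrix} R_2 - R_2 \begin{bmatrix} 2 & 0 \\ 0 & 3 \end{bmatrix}
= \begin{bmatrix} 1/4 & 1 \\ 2/5 & 2 \end{bmatrix} \begin{bmatrix} -1 & -6 \\ 1 & 1 \end{bmatrix}.
\]
Therefore
\[
R_2 = \begin{bmatrix} 1/4 & -1/4 \\ 2/5 & -2/15 \end{bmatrix}.
\]
(If $\operatorname{diag}(c_i) R - R \operatorname{diag}(d_i) = G B$ then $r_{ij} = \frac{g_i b_j}{c_i - d_j}$, where $g_i$ and $b_j$ are the $i$th row of $G$ and the $j$th column of $B$ respectively.)

We continue the same way. The second column of $L$ and the second row of $U$ in the LU decomposition are:
\[
R_1 = LU = \begin{bmatrix} 1 & 0 & 0 \\ 3/4 & 1 & 0 \\ 3/5 & 8/5 & * \end{bmatrix}
\begin{bmatrix} 1/3 & 1 & 3 \\ 0 & 1/4 & -1/4 \\ 0 & 0 & * \end{bmatrix}.
\]
The generators of the Schur complement are:
\[
\begin{bmatrix} 0 \\ G_3 \end{bmatrix}
= \begin{bmatrix} 1/4 & 1 \\ 2/5 & 2 \end{bmatrix}
- \begin{bmatrix} 1 \\ 8/5 \end{bmatrix} \begin{bmatrix} 1/4 & 1 \end{bmatrix}
= \begin{bmatrix} 0 & 0 \\ 0 & 2/5 \end{bmatrix},
\quad \text{so} \quad
G_3 = \begin{bmatrix} 0 & 2/5 \end{bmatrix}.
\]
Also
\[
\begin{bmatrix} 0 & B_3 \end{bmatrix}
= \begin{bmatrix} -1 & -6 \\ 1 & 1 \end{bmatrix}
- \begin{bmatrix} -1 \\ 1 \end{bmatrix} \begin{bmatrix} 1 & -1 \end{bmatrix}
= \begin{bmatrix} 0 & -7 \\ 0 & 2 \end{bmatrix},
\quad \text{so} \quad
B_3 = \begin{bmatrix} -7 \\ 2 \end{bmatrix}.
\]
The Schur complement $R_3$ satisfies the displacement equation
\[
[6]\, R_3 - R_3\, [3] = \begin{bmatrix} 0 & 2/5 \end{bmatrix} \begin{bmatrix} -7 \\ 2 \end{bmatrix} = [4/5].
\]
Therefore $R_3 = [4/15]$ and the LU decomposition is
\[
R_1 = LU = \begin{bmatrix} 1 & 0 & 0 \\ 3/4 & 1 & 0 \\ 3/5 & 8/5 & 1 \end{bmatrix}
\begin{bmatrix} 1/3 & 1 & 3 \\ 0 & 1/4 & -1/4 \\ 0 & 0 & 4/15 \end{bmatrix}.
\]

Pivoting for Matrices with Displacement Structure

Partial pivoting may be applied to matrices with displacement structure that satisfy the displacement equation $F_1 R_1 - R_1 A_1 = G_1 B_1$ with $F_1$ a diagonal matrix. After the row interchange, the matrix $\hat{R}_1 = P_1 R_1$ satisfies the same displacement equation with the diagonal matrix $F_1$ replaced by another diagonal matrix $\hat{F}_1 = P_1 F_1 P_1^T$ and with $G_1$ replaced by $\hat{G}_1 = P_1 G_1$. Indeed,
\[
F_1 R_1 - R_1 A_1 = G_1 B_1
\]
implies
\[
(P_1 F_1 P_1^T)(P_1 R_1) - (P_1 R_1) A_1 = (P_1 G_1) B_1.
\]

Fast GEPP Algorithm for Structured Matrices

Recover from the generator the first column of
\[
R_1 = \begin{bmatrix} d_1 & u_1 \\ l_1 & R_{22}^{(1)} \end{bmatrix}.
\]
Note: This will depend on the form of the matrices $F_1$ and $A_1$. The procedure was specified above for a Cauchy-like matrix. Procedures exist for the recovery of the matrix $R$ from its displacement equation for all basic classes of matrices with displacement structure.

Next determine the position, say $(k, 1)$, of the entry with maximal magnitude in the first column. Let $P_1$ be the permutation of the first and the $k$-th entries. Interchange the first and the $k$-th diagonal entries of $F_1$ and interchange the first and the $k$-th rows in the matrix $G_1$.

Then recover the first row of $P_1 R_1$ from the generator. Now one has the first column $\begin{bmatrix} 1 \\ \frac{1}{d_1} l_1 \end{bmatrix}$ of $L$ and the first row $\begin{bmatrix} d_1 & u_1 \end{bmatrix}$ of $U$ in the LU factorization of $P_1 R_1$.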
Compute a generator of the Schur complement $R_2$ of $P_1 R_1$ using
\[
\begin{bmatrix} 0 \\ G_2 \end{bmatrix} = G_1 - \begin{bmatrix} 1 \\ \frac{1}{d_1} l_1 \end{bmatrix} g_1,
\qquad
\begin{bmatrix} 0 & B_2 \end{bmatrix} = B_1 - b_1 \begin{bmatrix} 1 & \frac{1}{d_1} u_1 \end{bmatrix},
\]
where $g_1$ and $b_1$ are the first row of the (row-permuted) $G_1$ and the first column of $B_1$ respectively.

Proceeding recursively, one finally obtains the factorization $R_1 = P L U$, where $P = P_1 P_2 \cdots P_{n-1}$ and $P_k$ is the permutation used at the $k$-th step of the recursion.

Fast Inversion for Matrices with Displacement Structure

For the next paragraph we will assume that we know how to solve $Rx = b$ in $O(n^2)$ operations for the nonsingular matrix $R$ that satisfies $F R - R A = G B$. To do this we can either use Fast GE, or first transform $R$ into a Cauchy-like matrix (we will see later how) and use Fast GEPP.

From $F R - R A = G B$ we obtain
\[
A R^{-1} - R^{-1} F = -(R^{-1} G)(B R^{-1}),
\]
thus $R^{-1}$ satisfies a very similar displacement equation. If we know the $\{A, F\}$-generators $\{-R^{-1} G,\; B R^{-1}\}$ of $R^{-1}$, then we can recover $R^{-1}$ from this displacement equation in $O(n^2)$ time. Note that for most classes of matrices with low displacement rank (actually all famous classes: Toeplitz, Chebyshev-Vandermonde, etc.) algorithms exist for the recovery of the matrix $R$ from the displacement equation $F R - R A = G B$ in $O(1)$ operations per entry, i.e. in $O(n^2)$ operations for the entire matrix.

We can compute $R^{-1} G$ and $B R^{-1}$ in $O(n^2)$ time as follows. First compute $R^{-1} G$ by solving $\alpha$ times the system $R x = g_i$, $i = 1, \ldots, \alpha$, where $g_i$ are the columns of $G$ and $\alpha$ is the displacement rank of $R$. Since solving $Rx = b$ takes $O(n^2)$ time and $\alpha$ is small in comparison with $n$, we can obtain $R^{-1} G$ in $O(n^2)$ time.

Then compute $B R^{-1}$ in the following way: $B R^{-1} = (R^{-T} B^T)^T$, and $R^{-T} B^T$ is the solution of the $\alpha$ equations $R^T x = b_i^T$, $i = 1, \ldots, \alpha$, where $b_i$ are the rows of $B$. Each of those equations can be solved in $O(n^2)$ time, because if $R = LU$ then $R^T = U^T L^T$. If the matrix $R$ was first transformed into another type of structured matrix (say, Cauchy-like from Toeplitz-like in order to apply GEPP), so that $R = T_1 L U T_2$, then we have $R^T = T_2^T U^T L^T T_1^T$. We can still solve $R^T x = b$ in $O(n^2)$ time because, as we will see later, the matrices $T_1$ and $T_2$ will be diagonal matrices, Fast Trigonometric Transforms, or products thereof.

Having obtained the generators of $R^{-1}$ in $O(n^2)$ time, we can recover the matrix $R^{-1}$ from the generators and the displacement equation in $O(n^2)$ time. The total time required for inversion of $R$ is $O(n^2)$.

Transformation of Toeplitz-like matrices into Cauchy-like matrices

As described earlier, we need to be able to convert the other classes of structured matrices into Cauchy-like matrices before we can apply partial pivoting.
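Since all other classes are reduced to the Cauchy-like case, it is worth seeing what Fast GEPP looks like there. The sketch below assumes a Cauchy-like matrix given by nodes `c`, `d` and a generator `G`, `B`. The name `fast_gepp_cauchy_like` and the convention of returning the permutation as an index list are assumptions made for this illustration, and exact rational arithmetic is used so that the result can be verified exactly.

```python
from fractions import Fraction

def fast_gepp_cauchy_like(c, d, G, B):
    """Return (perm, L, U) with R[perm[i]][j] == (L U)[i][j], where
    diag(c) R - R diag(d) = G B. G is n x alpha, B is alpha x n."""
    n, alpha = len(c), len(B)
    c = list(c)
    G = [[Fraction(v) for v in row] for row in G]
    B = [[Fraction(v) for v in row] for row in B]
    perm = list(range(n))
    L = [[Fraction(0)] * n for _ in range(n)]
    U = [[Fraction(0)] * n for _ in range(n)]

    def entry(i, j):
        # r_ij = g_i b_j / (c_i - d_j)
        return sum(G[i][t] * B[t][j] for t in range(alpha)) / (c[i] - d[j])

    for k in range(n):
        # Recover the first column and locate the entry of maximal magnitude.
        col = [entry(i, k) for i in range(k, n)]
        p = k + max(range(n - k), key=lambda i: abs(col[i]))
        if col[p - k] == 0:
            raise ZeroDivisionError("matrix is singular")
        # Row interchange: swap diagonal entries of F, rows of G,
        # the already computed part of L, and the bookkeeping index list.
        for seq in (c, G, L, perm):
            seq[k], seq[p] = seq[p], seq[k]
        col[0], col[p - k] = col[p - k], col[0]
        # Recover the first row of the permuted matrix.
        row = [entry(k, j) for j in range(k, n)]
        pivot = row[0]
        L[k][k] = Fraction(1)
        for j in range(k, n):
            U[k][j] = row[j - k]
        for i in range(k + 1, n):
            L[i][k] = col[i - k] / pivot
        # Generator of the next Schur complement.
        gk = G[k][:]
        bk = [B[t][k] for t in range(alpha)]
        for i in range(k + 1, n):
            G[i] = [G[i][t] - L[i][k] * gk[t] for t in range(alpha)]
        for j in range(k + 1, n):
            f = row[j - k] / pivot
            for t in range(alpha):
                B[t][j] -= bk[t] * f
    return perm, L, U

# Sample data; R is formed explicitly only to check the factorization.
c, d = [4, 5, 6], [1, 2, 3]
G = [[1, 0], [1, 1], [1, 2]]
B = [[1, 2, 3], [0, 1, 1]]
perm, L, U = fast_gepp_cauchy_like(c, d, G, B)
R = [[Fraction(sum(G[i][t] * B[t][j] for t in range(2)), c[i] - d[j])
      for j in range(3)] for i in range(3)]
```

The row interchange never touches $B$ or the nodes `d`. Only the diagonal of $F$ and the rows of $G$ move, which is why pivoting is cheap when $F$ is diagonal.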
If $R$ is a Toeplitz matrix then $Z_1 R - R Z_{-1} = G B$, where the rank of $G$ is not greater than 2. Matrices that satisfy $Z_1 R - R Z_{-1} = G B$, where $G$ is of low rank, are referred to as Toeplitz-like matrices.

Here is how the Toeplitz-like matrices are transformed into Cauchy-like matrices. Consider the (normalized) Discrete Fourier Transform matrix
\[
F = \frac{1}{\sqrt{n}} \left[ e^{\frac{2\pi i}{n}(k-1)(j-1)} \right]_{1 \le k,j \le n}
\]
and the matrices
\[
D_1 = \operatorname{diag}\left(1, e^{\frac{2\pi i}{n}}, \ldots, e^{\frac{2\pi i}{n}(n-1)}\right),
\quad
D_{-1} = \operatorname{diag}\left(e^{\frac{\pi i}{n}}, e^{\frac{3\pi i}{n}}, \ldots, e^{\frac{(2n-1)\pi i}{n}}\right),
\quad
D_0 = \operatorname{diag}\left(1, e^{\frac{\pi i}{n}}, \ldots, e^{\frac{(n-1)\pi i}{n}}\right).
\]
The following factorizations are well known:
\[
Z_1 = F^* D_1 F, \qquad Z_{-1} = D_0^* F^* D_{-1} F D_0.
\]
Substituting the above into $Z_1 R - R Z_{-1} = G B$ one obtains
\[
D_1 (F R D_0^* F^*) - (F R D_0^* F^*) D_{-1} = (F G)(B D_0^* F^*),
\]
i.e. $F R D_0^* F^*$ is a Cauchy-like matrix. After applying the Fast GEPP we obtain $F R D_0^* F^* = P L U$, so we get the factorization
\[
R = F^* P L U F D_0.
\]
Solving $Rx = b$ will then consist of the application of two (normalized) DFTs, one diagonal scaling, a permutation, and a forward and a backward substitution, for a total of $O(n^2)$ operations.

Transformation of Vandermonde-like into Cauchy-like matrices

The Vandermonde matrix $V = \left[ x_i^{j-1} \right]_{1 \le i,j \le n}$ satisfies the displacement equation
\[
D_{1/x} V - V Z_1^T =
\begin{bmatrix} \frac{1}{x_1} - x_1^{n-1} & \cdots & \frac{1}{x_n} - x_n^{n-1} \end{bmatrix}^T
\begin{bmatrix} 1 & 0 & \cdots & 0 \end{bmatrix},
\]
where $D_{1/x} = \operatorname{diag}(1/x_1, \ldots, 1/x_n)$. By analogy we shall refer to any matrix $R$ with low $\{D_{1/x}, Z_1^T\}$-displacement rank as a Vandermonde-like matrix.

If $D_{1/x} R - R Z_1^T = G B$ then $R F$ is a Cauchy-like matrix:
\[
D_{1/x} (R F) - (R F) D_1 = G (B F).
\]

Toeplitz-plus-Hankel-like matrices

Let $F = Y_{00}$, $A = Y_{11}$. Let $T$ be a Toeplitz matrix and $H$ be a Hankel matrix:
\[
T = \left[ t_{i-j} \right]_{1 \le i,j \le n}, \qquad H = \left[ h_{i+j-2} \right]_{1 \le i,j \le n}.
\]
We have
\[
\operatorname{rank}\left( Y_{00} (T + H) - (T + H) Y_{11} \right) \le 4.
\]
Matrices with low $\{Y_{00}, Y_{11}\}$-displacement rank are referred to as Toeplitz-plus-Hankel-like matrices.
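The rank bound for a Toeplitz-plus-Hankel matrix can be confirmed on a small instance. The sketch below is an illustration only: the helpers `matmul`, `rank` and `Y` and the entry functions `t` and `h` are hypothetical choices, picked so that $T + H$ itself is not of low rank in any obvious way. The rank is computed by exact Gaussian elimination over the rationals.

```python
from fractions import Fraction

def matmul(X, Y):
    """Plain matrix product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def rank(M):
    """Rank by exact Gaussian elimination over the rationals."""
    M = [[Fraction(v) for v in row] for row in M]
    r = 0
    for col in range(len(M[0])):
        p = next((i for i in range(r, len(M)) if M[i][col] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]
        for i in range(r + 1, len(M)):
            f = M[i][col] / M[r][col]
            M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

def Y(n, gamma, delta):
    """Y_{gamma,delta} = Z_0 + Z_0^T + gamma e_1 e_1^T + delta e_n e_n^T."""
    M = [[1 if abs(i - j) == 1 else 0 for j in range(n)] for i in range(n)]
    M[0][0], M[n - 1][n - 1] = gamma, delta
    return M

def t(k):
    """Hypothetical Toeplitz entries t_k, defined for negative k as well."""
    return Fraction(1, 2 * k * k + k + 3)

def h(k):
    """Hypothetical Hankel entries h_k, k >= 0."""
    return Fraction(1, k + 2)

n = 7
R = [[t(i - j) + h(i + j) for j in range(n)] for i in range(n)]  # R = T + H
left, right = matmul(Y(n, 0, 0), R), matmul(R, Y(n, 1, 1))
D = [[p - q for p, q in zip(r1, r2)] for r1, r2 in zip(left, right)]
displacement_rank = rank(D)
```

The displacement vanishes in the interior of the matrix because both $t_{i-j}$ and $h_{i+j}$ satisfy $r_{i-1,j} + r_{i+1,j} = r_{i,j-1} + r_{i,j+1}$. Only the first and last rows and columns survive, which is where the bound 4 comes from.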
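The matrix $Y_{\gamma\delta}$ with $\gamma, \delta \in \{1, -1\}$ or $\gamma = \delta = 0$ can be diagonalized by Fast Trigonometric Transform matrices:
\[
Y_{00} = S D_S S, \qquad Y_{11} = C D_C C^T,
\]
where
\[
C = \left[ \sqrt{\frac{2}{n}}\; q_j \cos \frac{(2k-1)(j-1)\pi}{2n} \right]_{1 \le k,j \le n}
\quad \text{and} \quad
S = \left[ \sqrt{\frac{2}{n+1}}\; \sin \frac{k j \pi}{n+1} \right]_{1 \le k,j \le n}
\]
are the (normalized) Discrete Cosine Transform-II and Discrete Sine Transform-I matrices respectively ($q_1 = \frac{1}{\sqrt{2}}$, $q_2 = \cdots = q_n = 1$), and
\[
D_C = 2 \operatorname{diag}\left(1, \cos \frac{\pi}{n}, \ldots, \cos \frac{(n-1)\pi}{n}\right),
\qquad
D_S = 2 \operatorname{diag}\left(\cos \frac{\pi}{n+1}, \ldots, \cos \frac{n\pi}{n+1}\right).
\]
If $R$ is a Toeplitz-plus-Hankel-like matrix then $S R C$ is a Cauchy-like matrix: the equation $Y_{00} R - R Y_{11} = G B$ yields
\[
D_S (S R C) - (S R C) D_C = (S G)(B C).
\]

Chebyshev-Vandermonde Matrices

Let $T_0, T_1, \ldots$ and $U_0, U_1, \ldots$ be the Chebyshev polynomials of the first and the second kind respectively. For nonzero $x_1, \ldots, x_n$ the matrices
\[
V_T = \begin{bmatrix}
T_0(x_1) & T_1(x_1) & \cdots & T_{n-1}(x_1) \\
T_0(x_2) & T_1(x_2) & \cdots & T_{n-1}(x_2) \\
\vdots & \vdots & & \vdots \\
T_0(x_n) & T_1(x_n) & \cdots & T_{n-1}(x_n)
\end{bmatrix}
\quad \text{and} \quad
V_U = \begin{bmatrix}
U_0(x_1) & U_1(x_1) & \cdots & U_{n-1}(x_1) \\
U_0(x_2) & U_1(x_2) & \cdots & U_{n-1}(x_2) \\
\vdots & \vdots & & \vdots \\
U_0(x_n) & U_1(x_n) & \cdots & U_{n-1}(x_n)
\end{bmatrix}
\]
are referred to as Chebyshev-Vandermonde matrices. The Chebyshev polynomials satisfy the relations
\[
T_0(x) = 1, \quad T_1(x) = x, \quad T_n(x) = 2x\,T_{n-1}(x) - T_{n-2}(x),
\]
\[
U_0(x) = 1, \quad U_1(x) = 2x, \quad U_n(x) = 2x\,U_{n-1}(x) - U_{n-2}(x).
\]
Consider
\[
F = D_{1/x} = \operatorname{diag}\left(\frac{1}{x_1}, \ldots, \frac{1}{x_n}\right)
\]
and
\[
A = W = \begin{bmatrix}
0 & 2 & 0 & -2 & 0 & \cdots \\
0 & 0 & 2 & 0 & -2 & \ddots \\
\vdots & & \ddots & \ddots & \ddots & \ddots \\
0 & & \cdots & 0 & 2 & 0 \\
0 & & \cdots & & 0 & 2 \\
0 & & \cdots & & & 0
\end{bmatrix}
= 2 \sum_{i=1}^{\lfloor n/2 \rfloor} (-1)^{i-1} \left( Z_0^T \right)^{2i-1},
\]
where $Z_0$, as above, is the lower shift matrix. Let $D_0 = \operatorname{diag}\left(\frac{1}{2}, 1, \ldots, 1\right)$. The Chebyshev-Vandermonde matrices satisfy
\[
D_{1/x} \left( V_T D_0 \right) - \left( V_T D_0 \right) W =
\begin{bmatrix} \frac{1}{x_1} & \cdots & \frac{1}{x_n} \end{bmatrix}^T
\begin{bmatrix} \frac{1}{2} & 0 & -1 & 0 & 1 & \cdots \end{bmatrix},
\]
\[
D_{1/x} V_U - V_U W =
\begin{bmatrix} \frac{1}{x_1} & \cdots & \frac{1}{x_n} \end{bmatrix}^T
\begin{bmatrix} 1 & 0 & -1 & 0 & 1 & \cdots \end{bmatrix}.
\]
By analogy we will refer to matrices with small $\{D_{1/x}, W\}$-displacement rank as Chebyshev-Vandermonde-like.

Alternatively, one can prove that the Chebyshev-Vandermonde-like matrices have low $\{D_x, Y_{11}\}$, $\{D_x, Y_{00}\}$ or $\{D_x, Z_1 + Z_1^T\}$-displacement rank, where $D_x = \operatorname{diag}(x_1, \ldots, x_n)$. All displacement operators above describe in fact the same class of matrices. A matrix that has a low displacement rank with respect to one displacement operator will have a low displacement rank with respect to the other operators (but not necessarily the same rank).

If a matrix $R$ has a low $\{D_x, Y_{11}\}$, $\{D_x, Y_{00}\}$ or $\{D_x, Z_1 + Z_1^T\}$-displacement rank then $RC$, $RS$ or $RF$ respectively are Cauchy-like matrices, where $S$, $C$ and $F$ are the appropriate Discrete Trigonometric Transforms described earlier in the text.

References

[1] I. Gohberg, T. Kailath, and V. Olshevsky, Fast Gaussian elimination with partial pivoting for matrices with displacement structure, Math. Comp. 64 (1995), pp. 1557-1576.
[2] T. Kailath and V. Olshevsky, Displacement structure approach to Chebyshev-Vandermonde and related matrices, Integral Equations Operator Theory 22 (1995), pp. 65-92.
[3] T. Kailath and V. Olshevsky, Displacement-structure approach to polynomial Vandermonde and related matrices, Linear Algebra Appl. 261 (1997), pp. 49-90.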