MATHEMATICS OF COMPUTATION

AN EFFECTIVE MATRIX GEOMETRIC MEAN SATISFYING THE ANDO-LI-MATHIAS PROPERTIES

DARIO A. BINI, BEATRICE MEINI AND FEDERICO POLONI

Abstract. We propose a new matrix geometric mean satisfying the ten properties given by Ando, Li and Mathias [Linear Alg. Appl. 2004]. This mean is the limit of a sequence which converges superlinearly with order of convergence 3, whereas the mean introduced by Ando, Li and Mathias is the limit of a sequence having order of convergence 1. This makes the new mean very easy to compute. We provide a geometric interpretation and a generalization which includes as special cases our mean and the Ando-Li-Mathias mean.

Received by the editor ??? and, in revised form, ???.
Mathematics Subject Classification. Primary 65F30; Secondary 15A48, 47A64.
Key words and phrases. Matrix geometric mean, geometric mean, positive definite matrix.
© XXXX American Mathematical Society

1. Introduction

In several contexts, it is natural to generalize the geometric mean of two positive real numbers $a \# b := \sqrt{ab}$ to real symmetric positive definite $n \times n$ matrices as

(1.1)  $A \# B := A(A^{-1}B)^{1/2} = A^{1/2}(A^{-1/2} B A^{-1/2})^{1/2} A^{1/2}$.

Several papers, e.g. [3, 4, 9], and a chapter of the book [2], are devoted to studying the geometry of the cone of positive definite matrices $P_n$ endowed with the Riemannian metric defined by

$ds = \|A^{-1/2}\, dA\, A^{-1/2}\|$,

where $\|B\| = \big(\sum_{i,j} |b_{i,j}|^2\big)^{1/2}$ denotes the Frobenius norm. The distance induced by this metric is

(1.2)  $d(A,B) = \|\log(A^{-1/2} B A^{-1/2})\|$.

It turns out that on this manifold the geodesic joining $X$ and $Y$ has equation

$\gamma(t) = X^{1/2}(X^{-1/2} Y X^{-1/2})^{t} X^{1/2} = X(X^{-1}Y)^{t} =: X \#_t Y$, $t \in [0,1]$,

and thus $A \# B$ is the midpoint of the geodesic joining $A$ and $B$. An analysis of numerical methods for computing the geometric mean of two matrices is carried out in [7].

It is less clear how to define the geometric mean of more than two matrices. In the seminal work [1], Ando, Li and Mathias list ten properties that a good matrix geometric mean should satisfy, and show that several natural approaches based on generalizations of formulas that work for the scalar case, or for the case of two matrices, do not work well. They propose a new definition of the mean of $k$ matrices satisfying all the requested properties. We refer to this mean as the Ando-Li-Mathias mean, or ALM-mean for short.

The ALM-mean is the limit of a recursive iteration process where at each step of the iteration $k$ geometric means of $k-1$ matrices must be computed. One of the main drawbacks of this iteration is its linear convergence. In fact, the large number of iterations needed to approximate each geometric mean at all the recursive steps makes it quite expensive to actually compute the ALM-mean with this algorithm. Moreover, no other algorithms with higher efficiency are known.

A class of geometric means satisfying the Ando, Li, Mathias requirements has been introduced in [8]. These means are defined in terms of the solution of certain matrix equations. This approach provides interesting theoretical properties of the means but no effective tools for their computation.

In this paper, we propose a new matrix geometric mean satisfying the ten properties of Ando, Li and Mathias. Like the ALM-mean, our mean is defined as the limit of an iteration process, with the relevant difference that convergence is superlinear with order of convergence at least three. This property makes it much less expensive to compute this geometric mean, since the number of iterations required to reach high accuracy drops to just a few.

The iteration on which our mean is based has a simple geometrical interpretation. In the case $k = 3$, given the positive definite matrices $A_1, A_2, A_3$, we generate three matrix sequences $A_1^{(m)}, A_2^{(m)}, A_3^{(m)}$ starting from $A_i^{(0)} = A_i$, $i = 1,2,3$. At step $m+1$, the matrix $A_1^{(m+1)}$ is chosen along the geodesic which connects $A_1^{(m)}$ with the midpoint of the geodesic connecting $A_2^{(m)}$ to $A_3^{(m)}$, at distance $2/3$ from $A_1^{(m)}$. The matrices $A_2^{(m+1)}$ and $A_3^{(m+1)}$ are similarly defined. In the case of the Euclidean geometry, just one step of the iteration provides the value of the limit, i.e., the centroid of the triangle with vertices $A_1^{(m)}, A_2^{(m)}, A_3^{(m)}$. In fact, in a triangle the medians intersect each other at $2/3$ of their length.
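The Euclidean picture just described can be made concrete with a small numerical sketch. The code below is only an illustration under our own assumptions: the function names `euclidean_new_step` and `euclidean_alm_step` are hypothetical, and points in the plane stand in for matrices. It compares one step of the Euclidean analogue of the new iteration (move each vertex a fraction $(k-1)/k$ of the way toward the centroid of the other vertices) with one step of the Euclidean analogue of the ALM iteration (replace each vertex by the centroid of the other vertices).

```python
import numpy as np

def euclidean_new_step(points):
    """Euclidean analogue of the new iteration: each vertex moves a fraction
    (k-1)/k of the way toward the centroid of the remaining k-1 vertices."""
    k = len(points)
    total = points.sum(axis=0)
    out = np.empty_like(points)
    for i in range(k):
        others_centroid = (total - points[i]) / (k - 1)
        out[i] = points[i] + (k - 1) / k * (others_centroid - points[i])
    return out

def euclidean_alm_step(points):
    """Euclidean analogue of the ALM iteration: each vertex is replaced by
    the centroid of the remaining k-1 vertices."""
    k = len(points)
    total = points.sum(axis=0)
    return (total - points) / (k - 1)

# An arbitrary triangle (illustrative data, not taken from the paper).
triangle = np.array([[0.0, 0.0], [4.0, 0.0], [1.0, 3.0]])
centroid = triangle.mean(axis=0)
after_new = euclidean_new_step(triangle)   # all three rows equal the centroid
after_alm = euclidean_alm_step(triangle)   # still a triangle, half the size
```

In this flat setting the first map lands on the centroid in a single step, while the second only halves the distance of each vertex from the centroid, which mirrors the contrast between superlinear and linear convergence discussed in the text.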
In the different geometry of the cone of positive definite matrices, the geodesics which play the role of the medians might not even intersect each other.

In the case of $k$ matrices $A_1, A_2, \dots, A_k$, the matrix $A_i^{(m+1)}$ is chosen along the geodesic which connects $A_i^{(m)}$ with the geometric mean of the remaining matrices, at distance $(k-1)/k$ from $A_i^{(m)}$. In the customary geometry, this point is the common intersection point of all the medians of the $(k-1)$-dimensional simplex formed by all the matrices $A_i^{(m)}$, $i = 1,\dots,k$. We prove that the sequences $\{A_i^{(m)}\}_{m=0}^{\infty}$, $i = 1,\dots,k$, converge to a common limit $\bar A$ with order of convergence at least 3. The limit $\bar A$ is our definition of the geometric mean of $A_1,\dots,A_k$.

It is interesting to point out that our mean and the ALM-mean of $k$ matrices can be viewed as two specific instances of a class of more general means depending on $k-1$ parameters $s_i \in [0,1]$, $i = 1,\dots,k-1$. All the means of this class satisfy the requirements of Ando, Li and Mathias. Moreover, writing $s = (s_i)$, the ALM-mean is obtained with $s = (1, 1, \dots, 1, 1/2)$, while our mean is obtained with $s = ((k-1)/k, (k-2)/(k-1), \dots, 1/2)$. The new mean is the only one in this class for which the matrix sequences generated at each recursive step converge superlinearly.

The article is structured as follows. After this introduction, in Section 2 we present the ten Ando-Li-Mathias properties and briefly describe the ALM-mean; then, in Section 3, we propose our new definition of a matrix geometric mean and
prove some of its properties, also giving a geometrical interpretation; in Section 4 we provide a generalization which includes the ALM-mean and our mean as two special cases. Finally, in Section 5 we present some numerical experiments with explicit computations involving these means, on some problems from physics. It turns out that, in the case of six matrices, the speed-up reached by our approach with respect to the ALM-mean is by a factor greater than […]. We also demonstrate experimentally that the ALM-mean is different from our mean, even though very close to it. Finally, for $k = 3$ we provide a pictorial description of the parametric family of geometric means.

2. Known results

Throughout this section we use the positive semidefinite ordering defined by $A \ge B$ if $A - B$ is positive semidefinite. We denote by $A^*$ the conjugate transpose of $A$.

2.1. Ando-Li-Mathias properties for a matrix geometric mean. Ando, Li and Mathias [1] proposed the following list of properties that a good geometric mean $G(\cdot)$ of three matrices should satisfy.

P1: Consistency with scalars. If $A$, $B$, $C$ commute then $G(A,B,C) = (ABC)^{1/3}$.
P2: Joint homogeneity. $G(\alpha A, \beta B, \gamma C) = (\alpha\beta\gamma)^{1/3} G(A,B,C)$.
P3: Permutation invariance. For any permutation $\pi(A,B,C)$ of $A$, $B$, $C$ it holds $G(A,B,C) = G(\pi(A,B,C))$.
P4: Monotonicity. If $A \ge A'$, $B \ge B'$, $C \ge C'$, then $G(A,B,C) \ge G(A',B',C')$.
P5: Continuity from above. If $A_n$, $B_n$, $C_n$ are monotonic decreasing sequences converging to $A$, $B$, $C$, respectively, then $G(A_n,B_n,C_n)$ converges to $G(A,B,C)$.
P6: Congruence invariance. For any nonsingular $S$, $G(S^*AS, S^*BS, S^*CS) = S^* G(A,B,C) S$.
P7: Joint concavity. If $A = \lambda A_1 + (1-\lambda)A_2$, $B = \lambda B_1 + (1-\lambda)B_2$, $C = \lambda C_1 + (1-\lambda)C_2$, then $G(A,B,C) \ge \lambda G(A_1,B_1,C_1) + (1-\lambda) G(A_2,B_2,C_2)$.
P8: Self-duality. $G(A,B,C)^{-1} = G(A^{-1},B^{-1},C^{-1})$.
P9: Determinant identity. $\det G(A,B,C) = (\det A \det B \det C)^{1/3}$.
P10: Arithmetic-geometric-harmonic mean inequality:
$\dfrac{A + B + C}{3} \ge G(A,B,C) \ge \left(\dfrac{A^{-1} + B^{-1} + C^{-1}}{3}\right)^{-1}$.

Moreover, it is proved in [1] that P5 and P10 are consequences of the others. Notice that all these properties can be easily generalized to the mean of any number of matrices.
We will call a geometric mean of three or more matrices any map $G(\cdot)$ satisfying P1-P10 or their analogues for a number $k \ne 3$ of entries.

2.2. The Ando-Li-Mathias mean. Here and hereafter, we use the following notation. We denote by $G_2(A,B)$ the usual geometric mean $A \# B$ and, given the $k$-tuple $(A_1,\dots,A_k)$, we define

$Z_i(A_1,\dots,A_k) = (A_1,\dots,A_{i-1},A_{i+1},\dots,A_k)$, $i = 1,\dots,k$,

that is, the $k$-tuple where the $i$-th term has been dropped out.
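Every mean discussed in this paper is built from the two-matrix mean $G_2(A,B) = A \# B$ and, more generally, from points $A \#_t B$ on the geodesic of (1.1)-(1.2). The following is a minimal sketch of these building blocks, assuming real symmetric positive definite inputs and using an eigendecomposition for matrix powers; the helper names `spd_power`, `geodesic` and `riemann_dist` are our own and do not come from the paper, whose experiments use Matlab.

```python
import numpy as np

def spd_power(A, p):
    """A**p for a real symmetric positive definite A, via eigendecomposition."""
    w, V = np.linalg.eigh(A)
    return (V * w**p) @ V.T

def geodesic(A, B, t):
    """Point A #_t B = A^{1/2} (A^{-1/2} B A^{-1/2})^t A^{1/2} on the geodesic
    from A (t = 0) to B (t = 1); t = 1/2 gives the geometric mean A # B."""
    Ah, Aih = spd_power(A, 0.5), spd_power(A, -0.5)
    M = Aih @ B @ Aih
    M = (M + M.T) / 2          # remove rounding asymmetry before eigh
    return Ah @ spd_power(M, t) @ Ah

def riemann_dist(A, B):
    """Distance (1.2): Frobenius norm of log(A^{-1/2} B A^{-1/2})."""
    Aih = spd_power(A, -0.5)
    M = Aih @ B @ Aih
    w = np.linalg.eigvalsh((M + M.T) / 2)
    return np.sqrt(np.sum(np.log(w) ** 2))

# Illustrative data (not from the paper).
A = np.array([[2.0, 1.0], [1.0, 2.0]])
B = np.array([[3.0, 0.0], [0.0, 1.0]])
G = geodesic(A, B, 0.5)        # G_2(A, B) = A # B
```

With these helpers one can check numerically that $G$ is symmetric in its arguments, that it solves $G A^{-1} G = B$, and that it lies halfway between $A$ and $B$ in the distance (1.2).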
In [1], Ando, Li and Mathias note that the previously proposed definitions of means of more than two matrices do not satisfy all the properties P1-P10, and propose a new definition that fulfills all of them. Their mean is defined inductively on the number of arguments $k$.

Given $A_1,\dots,A_k$ positive definite, and given the definition of a geometric mean $G_{k-1}(\cdot)$ of $k-1$ matrices, they set $A_i^{(0)} = A_i$, $i = 1,\dots,k$, and define for $r \ge 0$

(2.1)  $A_i^{(r+1)} := G_{k-1}(Z_i(A_1^{(r)},\dots,A_k^{(r)}))$, $i = 1,\dots,k$.

For $k = 3$, the iteration reads

$\begin{pmatrix} A^{(r+1)} \\ B^{(r+1)} \\ C^{(r+1)} \end{pmatrix} = \begin{pmatrix} G_2(B^{(r)}, C^{(r)}) \\ G_2(A^{(r)}, C^{(r)}) \\ G_2(A^{(r)}, B^{(r)}) \end{pmatrix}$.

Ando, Li and Mathias show that the $k$ sequences $(A_i^{(r)})_{r=0}^{\infty}$ converge to the same matrix $\tilde A$, and finally define $G_k(A_1,\dots,A_k) = \tilde A$. In the following, we shall denote by $G(\cdot)$ the Ando-Li-Mathias mean, dropping the subscript $k$ when not essential.

An additional property of the Ando-Li-Mathias mean which will turn out to be important in the convergence proof is the following. Recall that $\rho(X)$ denotes the spectral radius of $X$, and let

$R(A,B) := \max(\rho(A^{-1}B), \rho(B^{-1}A))$.

This function is a multiplicative metric, that is, we have $R(A,B) \ge 1$ with equality iff $A = B$, and $R(A,C) \le R(A,B)\,R(B,C)$. The additional property is

P11: For each $k \ge 2$, and for each pair of $k$-tuples $(A_1,\dots,A_k)$, $(B_1,\dots,B_k)$ it holds
$R\big(G(A_1,\dots,A_k), G(B_1,\dots,B_k)\big) \le \Big(\prod_{i=1}^{k} R(A_i,B_i)\Big)^{1/k}$.

3. A new matrix geometric mean

3.1. Definition. We are going to define for each $k \ge 2$ a new mean $\bar G_k(\cdot)$ of $k$ matrices satisfying P1-P11. Let $\bar G_2(A,B) = A \# B$, and suppose that the mean has already been defined for up to $k-1$ matrices. Let us denote shortly

$T_i^{(r)} = \bar G_{k-1}(Z_i(\bar A_1^{(r)},\dots,\bar A_k^{(r)}))$

and define $\bar A_i^{(r+1)}$ for $i = 1,\dots,k$ as

(3.1)  $\bar A_i^{(r+1)} := \bar G_k(\bar A_i^{(r)}, \underbrace{T_i^{(r)}, T_i^{(r)}, \dots, T_i^{(r)}}_{k-1 \text{ times}})$,

with $\bar A_i^{(0)} = A_i$ for all $i$. Notice that apparently this needs the mean $\bar G_k(\cdot)$ to be already defined; in fact, in the special case in which $k-1$ of the $k$ arguments are coincident, the properties P1 and P6 alone allow one to determine the common value of any geometric mean:

$G(X,Y,Y,\dots,Y) = X^{1/2}\, G(I, X^{-1/2} Y X^{-1/2}, \dots, X^{-1/2} Y X^{-1/2})\, X^{1/2} = X^{1/2} (X^{-1/2} Y X^{-1/2})^{\frac{k-1}{k}} X^{1/2} = X \#_{\frac{k-1}{k}} Y$.
Thus we can use this simpler expression directly in (3.1) and set

(3.2)  $\bar A_i^{(r+1)} = \bar A_i^{(r)} \#_{\frac{k-1}{k}} T_i^{(r)}$.

In Sections 3.3 and 3.4, we are going to prove that the $k$ sequences $(\bar A_i^{(r)})_{r=0}^{\infty}$ converge to a common limit $\bar A$ with order of convergence at least three, and this will enable us to define $\bar G_k(A_1,\dots,A_k) := \bar A$. In the following, we will drop the index $k$ from $\bar G_k(\cdot)$ when it can be easily inferred from the context.

3.2. Geometrical interpretation. In [3], an interesting geometrical interpretation of the Ando-Li-Mathias mean is proposed for $k = 3$. We propose an interpretation of the new mean $\bar G(\cdot)$ in the same spirit. For $k = 3$, the iteration defining $\bar G(\cdot)$ reads

$\begin{pmatrix} \bar A^{(r+1)} \\ \bar B^{(r+1)} \\ \bar C^{(r+1)} \end{pmatrix} = \begin{pmatrix} \bar A^{(r)} \#_{2/3} (\bar B^{(r)} \# \bar C^{(r)}) \\ \bar B^{(r)} \#_{2/3} (\bar A^{(r)} \# \bar C^{(r)}) \\ \bar C^{(r)} \#_{2/3} (\bar A^{(r)} \# \bar B^{(r)}) \end{pmatrix}$.

We can interpret this iteration as a geometrical construction in the following way. To find e.g. $\bar A^{(r+1)}$, the algorithm is:

(1) Draw the geodesic joining $\bar B^{(r)}$ and $\bar C^{(r)}$, and take its midpoint $M^{(r)}$;
(2) Draw the geodesic joining $\bar A^{(r)}$ and $M^{(r)}$, and take the point lying at $2/3$ of its length: this is $\bar A^{(r+1)}$.

If we execute the same algorithm on the Euclidean plane, replacing the word "geodesic" with "straight line segment", it turns out that $\bar A^{(1)}$, $\bar B^{(1)}$, and $\bar C^{(1)}$ coincide in the centroid of the triangle with vertices $A$, $B$, $C$. Thus, unlike the Euclidean counterpart of the Ando-Li-Mathias mean, this process converges in one step on the plane. Roughly speaking, when $A$, $B$ and $C$ are very close to each other, we can approximate (in some intuitive sense) the geometry on the Riemannian manifold $P_n$ with the geometry on the Euclidean plane: since this construction to find the centroid of a plane triangle converges faster than the Ando-Li-Mathias one, we can expect that the convergence speed of the resulting algorithm is also faster. This is indeed what results from a more accurate convergence analysis.

3.3. Global convergence and properties P1-P11. In order to prove that the iteration (3.2) is convergent (and thus that $\bar G(\cdot)$ is well defined), we are going to adapt a part of the proof of Theorem 3.2 of [1] (namely, Argument 1).

Theorem 3.1.
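Let $A_1,\dots,A_k$ be positive definite.
(1) All the sequences $(\bar A_i^{(r)})_{r=0}^{\infty}$ converge for $r \to \infty$ to a common limit $\bar A$;
(2) the function $\bar G_k(A_1,\dots,A_k)$ satisfies P1-P11.

Proof. We work by induction on $k$. For $k = 2$, our mean coincides with the ALM-mean, so all the required work has been done in [1]. Let us now suppose that the thesis holds for all $k' \le k-1$. We have

$\bar A_i^{(r+1)} \le \dfrac{1}{k}\big(\bar A_i^{(r)} + (k-1)\,T_i^{(r)}\big) \le \dfrac{1}{k}\sum_{j=1}^{k} \bar A_j^{(r)}$,

where the first inequality follows from P10 for the ALM-mean $G_k(\cdot)$ (remember that in the special case in which $k-1$ of the arguments coincide, $G_k(\cdot) = \bar G_k(\cdot)$),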
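and the second from P10 for $\bar G_{k-1}(\cdot)$. Thus,

(3.3)  $\sum_{i=1}^{k} \bar A_i^{(r+1)} \le \sum_{i=1}^{k} \bar A_i^{(r)} \le \sum_{i=1}^{k} A_i$.

Therefore, the sequence $(\bar A_1^{(r)},\dots,\bar A_k^{(r)})_{r=0}^{\infty}$ is bounded, and there must be a converging subsequence, say, converging to $(\bar A_1,\dots,\bar A_k)$.

Moreover, for each $p,q \in \{1,\dots,k\}$ we have

$R(\bar A_p^{(r+1)}, \bar A_q^{(r+1)}) \le R(\bar A_p^{(r)}, \bar A_q^{(r)})^{1/k}\, R(T_p^{(r)}, T_q^{(r)})^{\frac{k-1}{k}}$
$\le R(\bar A_p^{(r)}, \bar A_q^{(r)})^{1/k} \Big( R(\bar A_q^{(r)}, \bar A_p^{(r)})^{\frac{1}{k-1}} \Big)^{\frac{k-1}{k}} = R(\bar A_p^{(r)}, \bar A_q^{(r)})^{2/k}$,

where the first inequality follows from P11 in the special case, and the second from P11 in the inductive hypothesis. Passing to the limit of the converging subsequence, one can verify that

$R(\bar A_p, \bar A_q) \le R(\bar A_p, \bar A_q)^{2/k}$,

from which we get $R(\bar A_p, \bar A_q) \le 1$, that is, $\bar A_p = \bar A_q$, because of the properties of $R$; i.e., the limit of the subsequence is of the form $(\bar A, \bar A, \dots, \bar A)$. Suppose there is another subsequence converging to $(\bar B, \bar B, \dots, \bar B)$; then, by (3.3), we have both $k\bar A \le k\bar B$ and $k\bar B \le k\bar A$, that is, $\bar A = \bar B$. Therefore, the sequence has only one limit point, thus it is convergent. This proves the first point of the theorem.

We now turn to show that P11 holds for our mean $\bar G_k(\cdot)$. Consider $k$-tuples $A_1,\dots,A_k$ and $B_1,\dots,B_k$, and let $\bar B_i^{(r)}$ be defined as $\bar A_i^{(r)}$ but starting the iteration from the $k$-tuple $(B_i)$ instead of $(A_i)$. We have for each $i$

$R(\bar A_i^{(r+1)}, \bar B_i^{(r+1)}) \le R(\bar A_i^{(r)}, \bar B_i^{(r)})^{1/k}\, R\big(\bar G(Z_i(\bar A_1^{(r)},\dots,\bar A_k^{(r)})), \bar G(Z_i(\bar B_1^{(r)},\dots,\bar B_k^{(r)}))\big)^{\frac{k-1}{k}}$
$\le R(\bar A_i^{(r)}, \bar B_i^{(r)})^{1/k} \Big( \prod_{j \ne i} R(\bar A_j^{(r)}, \bar B_j^{(r)})^{\frac{1}{k-1}} \Big)^{\frac{k-1}{k}} = \prod_{j=1}^{k} R(\bar A_j^{(r)}, \bar B_j^{(r)})^{1/k}$.

This yields

$\prod_{i=1}^{k} R(\bar A_i^{(r+1)}, \bar B_i^{(r+1)}) \le \prod_{i=1}^{k} R(\bar A_i^{(r)}, \bar B_i^{(r)})$;

chaining together these inequalities for successive values of $r$ and passing to the limit, we get

$R\big(\bar G(A_1,\dots,A_k), \bar G(B_1,\dots,B_k)\big)^{k} \le \prod_{i=1}^{k} R(A_i,B_i)$,

which is P11.

The other properties P1-P4 and P6-P9 (remember that P5 and P10 are consequences of these) are not difficult to prove. All the proofs are quite similar, and can be established by induction, using also the fact that since they hold for the ALM-mean, they can be applied to the mean $\bar G(\cdot)$ appearing in (3.1) (since we just proved that all possible geometric means take the same value if applied with $k-1$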
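equal arguments). For the sake of brevity, we provide only the proof for three of these properties.

P1: We need to prove that if the $A_i$ commute then $\bar G(A_1,\dots,A_k) = (A_1 \cdots A_k)^{1/k}$. Using the inductive hypothesis, we have $T_i^{(0)} = \prod_{j \ne i} A_j^{\frac{1}{k-1}}$. Using the fact that P1 holds for the ALM-mean, we have

$\bar A_i^{(1)} = A_i^{1/k} \prod_{j \ne i} \Big(A_j^{\frac{1}{k-1}}\Big)^{\frac{k-1}{k}} = \prod_{j=1}^{k} A_j^{1/k}$,

as needed. So, from the second iteration on, we have $\bar A_1^{(r)} = \bar A_2^{(r)} = \dots = \bar A_k^{(r)} = \prod_{j=1}^{k} A_j^{1/k}$.

P4: Let $T_i'^{(r)}$ and $\bar A_i'^{(r)}$ be defined as $T_i^{(r)}$ and $\bar A_i^{(r)}$ but starting from $A_i' \le A_i$. Using monotonicity in the inductive case and in the ALM-mean, we have for each $r$ and for each $i$

$T_i'^{(r)} \le T_i^{(r)}$

and thus

$\bar A_i'^{(r+1)} \le \bar A_i^{(r+1)}$.

Passing to the limit for $r \to \infty$, we obtain P4.

P7: Suppose $A_i = \lambda A_i' + (1-\lambda) A_i''$, and let $T_i'^{(r)}$ (resp. $T_i''^{(r)}$) and $\bar A_i'^{(r)}$ (resp. $\bar A_i''^{(r)}$) be defined as $T_i^{(r)}$ and $\bar A_i^{(r)}$ but starting from $A_i'$ (resp. $A_i''$). Suppose that for some $r$ it holds $\bar A_i^{(r)} \ge \lambda \bar A_i'^{(r)} + (1-\lambda) \bar A_i''^{(r)}$ for all $i$. Then by joint concavity and monotonicity in the inductive case we have

$T_i^{(r)} = \bar G(Z_i(\bar A_1^{(r)},\dots,\bar A_k^{(r)}))$
$\ge \bar G\big(Z_i(\lambda \bar A_1'^{(r)} + (1-\lambda)\bar A_1''^{(r)},\dots,\lambda \bar A_k'^{(r)} + (1-\lambda)\bar A_k''^{(r)})\big)$
$\ge \lambda T_i'^{(r)} + (1-\lambda) T_i''^{(r)}$,

and by joint concavity and monotonicity of the Ando-Li-Mathias mean we have

$\bar A_i^{(r+1)} = \bar A_i^{(r)} \#_{\frac{k-1}{k}} T_i^{(r)}$
$\ge \big(\lambda \bar A_i'^{(r)} + (1-\lambda)\bar A_i''^{(r)}\big) \#_{\frac{k-1}{k}} \big(\lambda T_i'^{(r)} + (1-\lambda) T_i''^{(r)}\big)$
$\ge \lambda \bar A_i'^{(r+1)} + (1-\lambda) \bar A_i''^{(r+1)}$.

Passing to the limit for $r \to \infty$, we obtain P7.

3.4. Cubic convergence. In this section, we will use the big-O notation in the norm sense, that is, we will write $X = Y + O(\varepsilon^h)$ to denote that there are universal positive constants $\varepsilon_0 < 1$ and $\theta$ such that for each $0 < \varepsilon < \varepsilon_0$ it holds $\|X - Y\| \le \theta \varepsilon^h$. The usual arithmetic rules involving this notation hold. In the following, these constants may depend on $k$, but not on the specific choice of the matrices involved in the formulas.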
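The recursive definition (3.2) and some of the properties established in Section 3.3 can be explored numerically. The following is a minimal sketch, not the authors' Matlab implementation: the names `spd_power`, `geodesic` and `new_mean`, the stopping rule and the tolerance are assumptions made for illustration. It implements the iteration (3.2) for any $k \ge 2$ and evaluates it on a commuting (diagonal) triple, where P1 predicts the entrywise scalar geometric mean, and on a random triple, where P9 and P3 can be checked.

```python
import numpy as np

def spd_power(A, p):
    """A**p for a real symmetric positive definite A, via eigendecomposition."""
    w, V = np.linalg.eigh(A)
    return (V * w**p) @ V.T

def geodesic(A, B, t):
    """A #_t B = A^{1/2} (A^{-1/2} B A^{-1/2})^t A^{1/2}."""
    Ah, Aih = spd_power(A, 0.5), spd_power(A, -0.5)
    M = Aih @ B @ Aih
    return Ah @ spd_power((M + M.T) / 2, t) @ Ah

def new_mean(mats, tol=1e-12, max_iter=50):
    """Sketch of iteration (3.2): A_i <- A_i #_{(k-1)/k} T_i, where T_i is the
    (recursively computed) mean of the other k-1 matrices."""
    k = len(mats)
    if k == 2:
        return geodesic(mats[0], mats[1], 0.5)
    cur = list(mats)
    for _ in range(max_iter):
        nxt = [geodesic(cur[i], new_mean(cur[:i] + cur[i + 1:], tol, max_iter),
                        (k - 1) / k) for i in range(k)]
        step = max(np.abs(n - c).max() for n, c in zip(nxt, cur))
        cur = nxt
        if step < tol:      # assumed stopping rule: largest entrywise change
            break
    return cur[0]

# Commuting case (P1): the mean should be diag((1*2*4)^(1/3), (4*9*6)^(1/3)).
D = [np.diag([1.0, 4.0]), np.diag([2.0, 9.0]), np.diag([4.0, 6.0])]
G_comm = new_mean(D)

# Random SPD triple (illustrative data) for P9 and P3.
rng = np.random.default_rng(0)
mats = []
for _ in range(3):
    X = rng.standard_normal((3, 3))
    mats.append(X @ X.T + 3 * np.eye(3))
G_rand = new_mean(mats)
dets = [np.linalg.det(M) for M in mats]
```

The cost of this naive recursion grows quickly with $k$, since each outer step requires $k$ means of $k-1$ matrices; the sketch is meant for checking properties on small examples rather than for performance.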
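Theorem 3.2. Let $0 < \varepsilon < 1$, $M$ and $\bar A_i^{(0)} = A_i$, $i = 1,\dots,k$, be positive definite $n \times n$, and $E_i := M^{-1}A_i - I$. Suppose that $\|E_i\| \le \varepsilon$ for all $i = 1,\dots,k$. Then, for the matrices $\bar A_i^{(1)}$ defined in (3.2) the following hold.

C1: We have

(3.4)  $M^{-1}\bar A_i^{(1)} - I = T_k + O(\varepsilon^3)$, where $T_k := \dfrac{1}{k}\sum_{j=1}^{k} E_j - \dfrac{1}{4k^2}\sum_{i,j=1}^{k} (E_i - E_j)^2$.

C2: There are positive constants $\theta$, $\sigma$ and $\bar\varepsilon < 1$ (all of which may depend on $k$) such that for all $\varepsilon \le \bar\varepsilon$ it holds

$\|M_1^{-1}\bar A_i^{(1)} - I\| \le \theta\varepsilon^3$

for a suitable matrix $M_1$ satisfying $\|M^{-1}M_1 - I\| \le \sigma\varepsilon$.

C3: The iteration (3.2) converges at least cubically.

C4: We have

(3.5)  $M_1^{-1}\bar G(A_1,\dots,A_k) - I = O(\varepsilon^3)$.

Proof. Let us first find a local expansion of a generic point on the geodesic $A \#_t B$: let $M^{-1}A = I + F_1$ and $M^{-1}B = I + F_2$ with $\|F_1\| \le \delta$, $\|F_2\| \le \delta$, $0 < \delta < 1$. Then we have

(3.6)
$M^{-1}(A \#_t B) = M^{-1}A(A^{-1}B)^t = (I + F_1)\big((I + F_1)^{-1}(I + F_2)\big)^t$
$= (I + F_1)\big((I - F_1 + F_1^2 + O(\delta^3))(I + F_2)\big)^t$
$= (I + F_1)\big(I + F_2 - F_1 - F_1F_2 + F_1^2 + O(\delta^3)\big)^t$
$= (I + F_1)\Big(I + t(F_2 - F_1 - F_1F_2 + F_1^2) + \dfrac{t(t-1)}{2}(F_2 - F_1)^2 + O(\delta^3)\Big)$
$= I + (1-t)F_1 + tF_2 + \dfrac{t(t-1)}{2}(F_2 - F_1)^2 + O(\delta^3)$,

where we made use of the matrix series expansion $(I + X)^t = I + tX + \frac{t(t-1)}{2}X^2 + O(X^3)$.

Now, we are going to prove the theorem by induction on $k$ in the following way. Let $\mathrm{C}i_k$ denote the assertion C$i$ of the theorem (for $i = 1,\dots,4$) for a given value of $k$. We show that
(1) $\mathrm{C1}_2$ holds;
(2) $\mathrm{C1}_k \implies \mathrm{C2}_k$;
(3) $\mathrm{C2}_k \implies \mathrm{C3}_k, \mathrm{C4}_k$;
(4) $\mathrm{C4}_k \implies \mathrm{C1}_{k+1}$.
It is clear that these claims imply that the results C1-C4 hold for all $k \ge 2$ by induction; we will now turn to prove them one by one.

(1) This is simply equation (3.6) for $t = 1/2$.

(2) It is obvious that $T_k = O(\varepsilon)$; thus, choosing $M_1 := M(I + T_k)$ one has

(3.7)  $\bar A_i^{(1)} = M(I + T_k + O(\varepsilon^3)) = M_1\big(I + (I + T_k)^{-1}O(\varepsilon^3)\big) = M_1(I + O(\varepsilon^3))$.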
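Using explicit constants in the big-O estimates, we get

$\|M_1^{-1}\bar A_i^{(1)} - I\| \le \theta\varepsilon^3$, $\|M^{-1}M_1 - I\| \le \sigma\varepsilon$

for suitable constants $\theta$ and $\sigma$.

(3) Suppose $\varepsilon$ is small enough to have $\theta\varepsilon^3 \le \varepsilon$. We shall apply C2 with initial matrices $\bar A_i^{(1)}$, with $\varepsilon_1 = \theta\varepsilon^3$ in lieu of $\varepsilon$ and $M_1$ in lieu of $M$, getting

$\|M_2^{-1}\bar A_i^{(2)} - I\| \le \theta\varepsilon_1^3$, $\|M_1^{-1}M_2 - I\| \le \sigma\varepsilon_1$.

Repeating this for all the steps of our iterative process, we get for all $s = 1,2,\dots$

(3.8)  $\|M_s^{-1}\bar A_i^{(s)} - I\| \le \theta\varepsilon_{s-1}^3 = \varepsilon_s$, $\|M_s^{-1}M_{s+1} - I\| \le \sigma\varepsilon_s$,

with $\varepsilon_0 := \varepsilon$, $\varepsilon_{s+1} := \theta\varepsilon_s^3$ and $M_0 := M$ (the second bound also holds for $s = 0$).

For simplicity's sake, we introduce the notation

$d(X,Y) := \|X^{-1}Y - I\|$

for any two $n \times n$ symmetric positive definite matrices $X$ and $Y$. It will be useful to notice that $\|X - Y\| \le \|X\|\,\|X^{-1}Y - I\| = \|X\|\,d(X,Y)$ and

(3.9)  $d(X,Z) = \|(X^{-1}Y - I)(Y^{-1}Z - I) + X^{-1}Y - I + Y^{-1}Z - I\| \le d(X,Y)\,d(Y,Z) + d(X,Y) + d(Y,Z)$.

With this notation, we can restate (3.8) as

$d(M_s, \bar A_i^{(s)}) \le \varepsilon_s$, $d(M_s, M_{s+1}) \le \sigma\varepsilon_s$.

We will now prove by induction on $t$ that, for $\varepsilon$ smaller than a fixed constant, it holds

(3.10)  $d(M_s, M_{s+t}) \le \big(2 - 2^{-t}\big)\sigma\varepsilon_s$.

First of all, it holds for all $t \ge 1$ that $\varepsilon_{s+t} = \theta^{\frac{3^t-1}{2}}\varepsilon_s^{3^t}$, which, for $\varepsilon$ smaller than $\min(1/8, \theta^{-1})$, implies

$\dfrac{\varepsilon_{s+t}}{\varepsilon_s} = (\theta\varepsilon_s^2)^{\frac{3^t-1}{2}} \le \varepsilon_s^{\frac{3^t-1}{2}} \le \varepsilon_s^{t} \le 2^{-3t}$.

Now, using (3.9), and supposing additionally $\varepsilon \le (2\sigma)^{-1}$, we have

$d(M_s, M_{s+t+1}) \le d(M_s, M_{s+t})\,d(M_{s+t}, M_{s+t+1}) + d(M_s, M_{s+t}) + d(M_{s+t}, M_{s+t+1})$
$\le \big(2 - 2^{-t}\big)\sigma\varepsilon_s + \sigma\varepsilon_s\Big(2\sigma\varepsilon_{s+t} + \dfrac{\varepsilon_{s+t}}{\varepsilon_s}\Big)$
$\le \big(2 - 2^{-t}\big)\sigma\varepsilon_s + 2\sigma\varepsilon_s\,\dfrac{\varepsilon_{s+t}}{\varepsilon_s}$
$\le \big(2 - 2^{-t}\big)\sigma\varepsilon_s + 2^{-(t+1)}\sigma\varepsilon_s = \big(2 - 2^{-(t+1)}\big)\sigma\varepsilon_s$.

Thus, we have for each $t$

$\|M_t - M\| \le \|M\|\,\|M^{-1}M_t - I\| \le 2\sigma\|M\|\varepsilon$,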
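which implies $\|M_t\| \le 2\|M\|$ for all $t$. By a similar argument,

(3.11)  $\|M_{s+t} - M_s\| \le \|M_s\|\,d(M_s, M_{s+t}) \le 4\sigma\|M\|\varepsilon_s$.

Due to the bounds already imposed on $\varepsilon$, the sequence $\varepsilon_s$ tends monotonically to zero with cubic convergence rate; thus $(M_t)$ is a Cauchy sequence and therefore converges. In the following, let $M_*$ be its limit. The convergence rate is cubic, since passing to the limit in (3.11) we get

$\|M_* - M_s\| \le 4\sigma\|M\|\varepsilon_s$.

Now, using the other relation in (3.8), we get

$\|\bar A_i^{(s)} - M_*\| \le \|\bar A_i^{(s)} - M_s\| + \|M_* - M_s\| \le 2\|M\|\,d(M_s, \bar A_i^{(s)}) + 4\sigma\|M\|\varepsilon_s \le (4\sigma + 2)\|M\|\varepsilon_s$,

that is, $\bar A_i^{(s)}$ converges with cubic convergence rate to $M_*$. Thus C3 is proved.

By (3.9), (3.10), and (3.8), we have

$d(M_1, \bar A_i^{(t)}) \le d(M_1, M_t)\,d(M_t, \bar A_i^{(t)}) + d(M_1, M_t) + d(M_t, \bar A_i^{(t)}) \le 2\sigma\varepsilon_1\varepsilon_t + 2\sigma\varepsilon_1 + \varepsilon_t \le (4\sigma + 1)\varepsilon_1 = O(\varepsilon^3)$,

which, letting $t \to \infty$, is C4.

(4) Using $\mathrm{C4}_k$ and (3.6) with $F_1 = E_{k+1}$, $F_2 = M^{-1}\bar G(A_1,\dots,A_k) - I = T_k + O(\varepsilon^3)$, $\delta = 2\varepsilon$, we have

(3.12)
$M^{-1}\bar A_{k+1}^{(1)} = M^{-1}\big(A_{k+1} \#_{\frac{k}{k+1}} \bar G(A_1,\dots,A_k)\big)$
$= I + \dfrac{1}{k+1}E_{k+1} + \dfrac{k}{k+1}T_k - \dfrac{k}{2(k+1)^2}\Big(E_{k+1} - \dfrac{1}{k}\sum_{i=1}^{k}E_i\Big)^2 + O(\varepsilon^3)$.

Observe that

$T_k = \dfrac{1}{k}S_k + \dfrac{1}{2k^2}P_k - \dfrac{k-1}{2k^2}Q_k$,

where $S_k = \sum_{i=1}^{k}E_i$, $Q_k = \sum_{i=1}^{k}E_i^2$, $P_k = \sum_{i,j=1,\ i \ne j}^{k}E_iE_j$. Since $S_k^2 = P_k + Q_k$ and $S_{k+1} = S_k + E_{k+1}$, $Q_{k+1} = Q_k + E_{k+1}^2$, $P_{k+1} = P_k + E_{k+1}S_k + S_kE_{k+1}$, from (3.12) one finds that

$M^{-1}\bar A_{k+1}^{(1)} = I + \dfrac{1}{k+1}S_{k+1} - \dfrac{k}{2(k+1)^2}Q_{k+1} + \dfrac{1}{2(k+1)^2}P_{k+1} + O(\varepsilon^3) = I + T_{k+1} + O(\varepsilon^3)$.

Since the expression we found is symmetric with respect to the $E_i$, it follows that $\bar A_j^{(1)}$ has the same expansion for any $j$.

Observe that Theorems 3.1 and 3.2 imply that the iteration (3.2) is globally convergent with order of convergence at least 3.

It is worth pointing out that, in the case where the matrices $A_i$, $i = 1,\dots,k$, commute with each other, the iteration (3.2) converges in just one step, i.e., $\bar A_i^{(1)} = \bar A$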
for any $i$. In the general noncommutative case, one has $\det(\bar A_i^{(s)}) = \det(\bar A)$ for any $i$ and for any $s \ge 1$, i.e., the determinant converges in one single step to the determinant of the matrix mean.

Our mean is different from the ALM-mean, as we will show with some numerical experiments in Section 5. In Section 4 we prove that our mean and the ALM-mean belong to a general class of matrix geometric means, which depends on a set of $k-1$ parameters.

4. A new class of matrix geometric means

In this section we introduce a new class of matrix means depending on a set of parameters $s_1,\dots,s_{k-1}$ and show that the ALM-mean and our mean are two specific instances of this class. For the sake of simplicity, we describe this generalization in the case of $k = 3$ matrices $A$, $B$, $C$. The case $k > 3$ is outlined. Here, the distance between two matrices is defined in (1.2).

For $k = 3$, the algorithm presented in Section 3 replaces the triple $A$, $B$, $C$ with $A'$, $B'$, $C'$, where $A'$ is chosen in the geodesic connecting $A$ with the midpoint of the geodesic connecting $B$ and $C$, at distance $2/3$ from $A$, and similarly for $B'$ and $C'$. In our generalization we use two parameters $s,t \in [0,1]$. We consider the point $P_t = B \#_t C$ in the geodesic connecting $B$ to $C$ at distance $t$ from $B$. Then we consider the geodesic connecting $A$ to $P_t$ and define $A'$ as the matrix on this geodesic at distance $s$ from $A$. That is, we set $A' = A \#_s (B \#_t C)$. We proceed similarly with $B$ and $C$. This transformation is recursively repeated so that the matrix sequences $A^{(r)}$, $B^{(r)}$, $C^{(r)}$ are generated by means of

(4.1)
$A^{(r+1)} = A^{(r)} \#_s (B^{(r)} \#_t C^{(r)})$,
$B^{(r+1)} = B^{(r)} \#_s (C^{(r)} \#_t A^{(r)})$,
$C^{(r+1)} = C^{(r)} \#_s (A^{(r)} \#_t B^{(r)})$,  $r = 0,1,\dots$

starting with $A^{(0)} = A$, $B^{(0)} = B$, $C^{(0)} = C$.

By following the same arguments of Section 3 it can be shown that the three sequences have a common limit $G_{s,t}$ for any $s,t \in [0,1]$. Moreover, for $s = 1$, $t = 1/2$ one obtains the ALM-mean, i.e., $G = G_{1,\frac12}$, while for $s = 2/3$, $t = 1/2$ the limit coincides with our mean, i.e., $\bar G = G_{\frac23,\frac12}$.
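The iteration (4.1) is short enough to sketch directly. The code below is an illustration under our own assumptions (the names `spd_power`, `geodesic`, `mean_st`, the entrywise stopping rule and the tolerance are hypothetical, and the test matrices are random rather than taken from the paper). It returns both the limit and the number of steps performed, so that the choices $(s,t) = (1, 1/2)$ and $(s,t) = (2/3, 1/2)$ can be compared on the same data.

```python
import numpy as np

def spd_power(A, p):
    """A**p for a real symmetric positive definite A, via eigendecomposition."""
    w, V = np.linalg.eigh(A)
    return (V * w**p) @ V.T

def geodesic(A, B, t):
    """A #_t B = A^{1/2} (A^{-1/2} B A^{-1/2})^t A^{1/2}."""
    Ah, Aih = spd_power(A, 0.5), spd_power(A, -0.5)
    M = Aih @ B @ Aih
    return Ah @ spd_power((M + M.T) / 2, t) @ Ah

def mean_st(A, B, C, s, t, tol=1e-12, max_iter=500):
    """Sketch of iteration (4.1); returns (approximate limit, steps used)."""
    for r in range(1, max_iter + 1):
        A1 = geodesic(A, geodesic(B, C, t), s)
        B1 = geodesic(B, geodesic(C, A, t), s)
        C1 = geodesic(C, geodesic(A, B, t), s)
        step = max(np.abs(A1 - A).max(), np.abs(B1 - B).max(),
                   np.abs(C1 - C).max())
        A, B, C = A1, B1, C1
        if step < tol:      # assumed stopping rule: largest entrywise change
            return A, r
    return A, max_iter

# Random SPD triple (illustrative data).
rng = np.random.default_rng(1)
A0, B0, C0 = [X @ X.T + 3 * np.eye(3)
              for X in rng.standard_normal((3, 3, 3))]

G_alm, it_alm = mean_st(A0, B0, C0, 1.0, 0.5)        # ALM-mean, G_{1,1/2}
G_new, it_new = mean_st(A0, B0, C0, 2.0 / 3.0, 0.5)  # new mean, G_{2/3,1/2}
G_gen, it_gen = mean_st(A0, B0, C0, 0.5, 0.3)        # another member
det_prod = np.linalg.det(A0) * np.linalg.det(B0) * np.linalg.det(C0)
```

Each step of (4.1) preserves the product of the three determinants, so every member of the family computed this way should satisfy the determinant identity P9; the step counts give a rough picture of linear versus superlinear convergence.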
Moreover, it is possible to prove that for any $s,t \in [0,1]$ the limit satisfies the conditions P1-P11, so that it can be considered a good geometric mean.

Concerning the convergence speed of the sequence generated by (4.1), we may perform a more accurate analysis. Assume that $A = M(I + E_1)$, $B = M(I + E_2)$, $C = M(I + E_3)$, where $\|E_i\| \le \varepsilon < 1$, $i = 1,2,3$. Then, applying (3.6) in (4.1) yields

$A' \doteq M\Big(I + (1-s)E_1 + s(1-t)E_2 + stE_3 + \dfrac{st(t-1)}{2}H_2^2 + \dfrac{s(s-1)}{2}(H_1 + tH_2)^2\Big)$,
$B' \doteq M\Big(I + (1-s)E_2 + s(1-t)E_3 + stE_1 + \dfrac{st(t-1)}{2}H_3^2 + \dfrac{s(s-1)}{2}(H_2 + tH_3)^2\Big)$,
$C' \doteq M\Big(I + (1-s)E_3 + s(1-t)E_1 + stE_2 + \dfrac{st(t-1)}{2}H_1^2 + \dfrac{s(s-1)}{2}(H_3 + tH_1)^2\Big)$,

where $\doteq$ denotes equality up to $O(\varepsilon^3)$ terms, with $H_1 = E_1 - E_2$, $H_2 = E_2 - E_3$, $H_3 = E_3 - E_1$. Whence we have $A' = M(I + E_1')$, $B' = M(I + E_2')$, $C' =$
$M(I + E_3')$, with

$\begin{pmatrix} E_1' \\ E_2' \\ E_3' \end{pmatrix} \doteq C(s,t)\begin{pmatrix} E_1 \\ E_2 \\ E_3 \end{pmatrix} + \dfrac{st(t-1)}{2}\begin{pmatrix} H_2^2 \\ H_3^2 \\ H_1^2 \end{pmatrix} + \dfrac{s(s-1)}{2}\begin{pmatrix} (H_1 + tH_2)^2 \\ (H_2 + tH_3)^2 \\ (H_3 + tH_1)^2 \end{pmatrix}$,

where

$C(s,t) = \begin{pmatrix} (1-s)I & s(1-t)I & stI \\ stI & (1-s)I & s(1-t)I \\ s(1-t)I & stI & (1-s)I \end{pmatrix}$.

Observe that the block circulant matrix $C(s,t)$ has eigenvalues $\lambda_1 = 1$, $\lambda_2 = \big(1 - \tfrac{3}{2}s\big) + \mathrm{i}\tfrac{\sqrt{3}}{2}s(2t-1)$, and $\lambda_3 = \bar\lambda_2$, each with multiplicity $n$, where $\mathrm{i}^2 = -1$. Moreover, the pair $(s,t) = (2/3, 1/2)$ is the only one which yields $\lambda_2 = \lambda_3 = 0$. In fact, $(2/3, 1/2)$ is the only pair which provides superlinear convergence. For the ALM-mean, where $t = 1/2$ and $s = 1$, it holds $|\lambda_2| = |\lambda_3| = 1/2$, which is the rate of convergence of the ALM iteration [1].

In the case of $k > 3$ matrices, given the $(k-1)$-tuple $(s_1,s_2,\dots,s_{k-1})$ we may recursively define $G_{s_1,\dots,s_{k-1}}(A_1,\dots,A_k)$ as the common limit of the sequences generated by

$A_i^{(r+1)} = A_i^{(r)} \#_{s_1} G_{s_2,\dots,s_{k-1}}(Z_i(A_1^{(r)},\dots,A_k^{(r)}))$, $i = 1,\dots,k$.

Observe that with $(s_1,\dots,s_{k-1}) = (1, 1, \dots, 1, 1/2)$ one obtains the ALM-mean, while with $(s_1,\dots,s_{k-1}) = ((k-1)/k, (k-2)/(k-1), \dots, 1/2)$ one obtains the new mean introduced in Section 3.

5. Numerical experiments

We have implemented the two iterations converging to the ALM-mean and to the newly defined geometric mean in Matlab, and run some numerical experiments on a quad-Xeon […].8 GHz computer. To compute matrix square roots we used Matlab's built-in sqrtm function, while for $p$-th roots with $p > 2$ we used the rootm function in Nicholas Higham's Matrix Computation Toolbox [6]. To counter the loss of symmetry due to the accumulation of computational errors, we chose to discard the imaginary part of the computed roots.

The experiments have been performed on the same data set as the paper [9]. It consists of five sets, each composed of four to six $6 \times 6$ positive definite matrices, corresponding to physical data from elasticity experiments conducted by Hearmon [5]. The matrices are composed of smaller diagonal blocks of sizes up to $4 \times 4$, depending on the symmetries of the materials involved. Two to three significant digits are reported for each experiment.
We have computed both the ALM-mean and the newly defined mean of these sets; as a stopping criterion for each computed mean, we chose

$\max_i |A_i^{(r+1)} - A_i^{(r)}| < \varepsilon$,

where $|X| := \max_{i,j}|X_{ij}|$, with $\varepsilon =$ […]. The CPU times, in seconds, are reported in Table 1. For four matrices, the speed gain is a factor of […], and it increases even more for more than four matrices.

We then focused on Hearmon's second data set (ammonium dihydrogen phosphate), composed of four matrices. In Table 2, we report the number of outer ($k = 4$) iterations needed and the average number of iterations needed to reach
convergence in the inner ($k = 3$) iterations (remember that the computation of a mean of four matrices requires the computation of three means of three matrices at each of its steps). Moreover, we measured the number of square and $p$-th roots needed by the two algorithms, since they are the most expensive operation in the algorithm. From the results, it is evident that the speed gain in the new mean is due not only to the reduction of the number of outer iterations, but also of the number of inner iterations needed to get convergence at each step of the inner mean calculations. When the number of matrices involved becomes larger, these speed-ups add up at each level.

Data set (number of matrices): ALM mean, New mean
NaClO3 (5): 3..3
Ammonium dihydrogen phosphate (4): 9.9.39
Potassium dihydrogen phosphate (4): 9.7.38
Quartz (6): 67. 3.
Rochelle salt (4): ..53
Table 1. CPU times in seconds for the Hearmon elasticity data

Quantity: ALM mean, New mean
Outer iterations: 3 3
Avg. inner iterations: 8.3
Matrix square roots (sqrtm): 55 7
Matrix p-th roots (rootm): 84
Table 2. Number of inner and outer iterations needed, and number of matrix roots needed

Hearmon's elasticity data are not suitable for measuring the accuracy of the algorithm, since the exact results are not known. To measure the accuracy of the computed results, we computed instead $|G(A^4, I, I, I) - A|$, which should yield zero in exact arithmetic (due to P1), and its analogue with the new mean. We chose $A$ to be the first matrix in Hearmon's second data set. Moreover, in order to obtain results closer to machine precision, in this experiment we changed the stopping criterion, choosing $\varepsilon = 10^{-13}$.

Operation: $|G(A^4, I, I, I) - A|$, $|\bar G(A^4, I, I, I) - A|$
Result: 3.6E-3.8E-4

The results are well within the errors permitted by the stopping criterion, and show that both algorithms can reach a satisfactory precision.

The following examples provide experimental evidence that our mean is different from the ALM-mean.
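Consider the following matrices

$A = \begin{pmatrix} a & b \\ b & a \end{pmatrix}$, $B = \begin{pmatrix} a & -b \\ -b & a \end{pmatrix}$, $C = \begin{pmatrix} 1 & 0 \\ 0 & c \end{pmatrix}$.

Observe that the triple $(A,B,C)$ is transformed into $(B,A,C)$ under the map $X \to S^{-1}XS$, for $S = \mathrm{diag}(1,-1)$. In this way, any matrix mean $G(A,B,C)$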
satisfying conditions P3 and P6 is such that $G = S^{-1}GS$, that is, the off-diagonal entries of $G$ are zero. Whence, $G$ must be diagonal. With $a = 2$, $b = 1$, $c = 24$, for the ALM-mean $G$ and our mean $\bar G$ one finds that

$\bar G = \mathrm{diag}(.48744366,\ 4.3376638)$, $G = \mathrm{diag}(.485347837,\ 4.3945786)$,

where we reported the leading digits. Observe that the determinant of both matrices is 6, that is, the geometric mean of $\det A$, $\det B$, $\det C$; moreover, $\rho(\bar G) < \rho(G)$.

For the matrices

A = C = 3 5, B =, D = 3, 5,

one has

Ḡ = .348 .36 3.845 .36 6.68, G = .347 .36 3.8796 .36 6.6.

Their eigenvalues are (6.58, 3.845, .39) and (6.85, 3.8796, .368), respectively. Observe that, unlike in the previous example, it holds $\rho(\bar G) > \rho(G)$.

In order to illustrate the properties of the set $\{G_{s,t} : (s,t) \in (0,1] \times (0,1)\}$, where $G_{s,t}$ is the mean of three matrices defined in Section 4, we considered the intervals [/5, ], [/5, 4/5] and discretized them into two sets $S$, $T$ of 5 equidistant points {/5 = $s_1 < s_2 < \dots < s_5$ = }, {/5 = $t_1 < t_2 < \dots < t_5$ = 4/5}, respectively. For each pair $(s_i,t_j) \in S \times T$, $i,j = 1,\dots,5$, we computed $G_{s_i,t_j}$ and the orthogonal projection $(x(i,j), y(i,j), z(i,j))$ of the matrix $G_{s_i,t_j} - G_{\frac23,\frac12}$ over a fixed, randomly generated three-dimensional subspace. The set $V = \{(x(i,j), y(i,j), z(i,j)) \in \mathbb{R}^3,\ i,j = 1,\dots,5\}$ has been plotted with the Matlab command mesh(x,y,z), which connects each point of coordinates $(x(i,j), y(i,j), z(i,j))$ to its four neighbors of coordinates $(x(i+\delta, j+\gamma), y(i+\delta, j+\gamma), z(i+\delta, j+\gamma))$ for $\delta,\gamma \in \{1,-1\}$.

Figure 1 displays the set $V$ from six different points of view, where the matrices $A$, $B$ and $C$ of size 3 have been randomly generated. The set appears to be a flat surface with part of the edge tightly folded on itself. The geometric mean $G_{\frac23,\frac12}$ corresponds to the point of coordinates $(0,0,0)$, which is denoted by a small circle and seems to be located in the central part of the figure. These properties, reported for only one triple $(A,B,C)$, are maintained with very slight differences in all the plots that we have performed.
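The symmetry argument of the first example lends itself to a quick numerical check. The sketch below is not the authors' code: the helper names and the stopping rule are our own assumptions, and the values $a = 2$, $b = 1$, $c = 24$ are the ones consistent with the stated determinant 6. It computes $G_{1,\frac12}$ (ALM-mean) and $G_{\frac23,\frac12}$ (new mean) for the $2 \times 2$ matrices $A$, $B$, $C$ through the iteration (4.1).

```python
import numpy as np

def spd_power(A, p):
    """A**p for a real symmetric positive definite A, via eigendecomposition."""
    w, V = np.linalg.eigh(A)
    return (V * w**p) @ V.T

def geodesic(A, B, t):
    """A #_t B = A^{1/2} (A^{-1/2} B A^{-1/2})^t A^{1/2}."""
    Ah, Aih = spd_power(A, 0.5), spd_power(A, -0.5)
    M = Aih @ B @ Aih
    return Ah @ spd_power((M + M.T) / 2, t) @ Ah

def mean_st(A, B, C, s, t, tol=1e-13, max_iter=500):
    """Sketch of iteration (4.1) with an assumed entrywise stopping rule."""
    for _ in range(max_iter):
        A1 = geodesic(A, geodesic(B, C, t), s)
        B1 = geodesic(B, geodesic(C, A, t), s)
        C1 = geodesic(C, geodesic(A, B, t), s)
        step = max(np.abs(A1 - A).max(), np.abs(B1 - B).max(),
                   np.abs(C1 - C).max())
        A, B, C = A1, B1, C1
        if step < tol:
            break
    return A

a, b, c = 2.0, 1.0, 24.0                     # values assumed for this sketch
A = np.array([[a, b], [b, a]])
B = np.array([[a, -b], [-b, a]])
C = np.array([[1.0, 0.0], [0.0, c]])

G_alm = mean_st(A, B, C, 1.0, 0.5)           # ALM-mean
G_new = mean_st(A, B, C, 2.0 / 3.0, 0.5)     # new mean
print(np.diag(G_new), np.diag(G_alm))
```

Both computed means should come out diagonal up to rounding, with determinant 6, while their diagonal entries differ from each other; no specific digits are asserted here beyond what the text reports.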
The software used in our experiments is available upon request.
Figure 1. Plot of the set $V$ (six views); the small circle corresponds to $G_{2/3,1/2}$.

Acknowledgments

The authors wish to thank Bruno Iannazzo for the many interesting discussions on issues related to matrix means, and an anonymous referee for the useful comments and suggestions to improve the presentation.
References

1. T. Ando, Chi-Kwong Li, and Roy Mathias, Geometric means, Linear Algebra Appl. 385 (2004), 305-334.
2. Rajendra Bhatia, Positive definite matrices, Princeton Series in Applied Mathematics, Princeton University Press, Princeton, NJ, 2007.
3. Rajendra Bhatia and John Holbrook, Noncommutative geometric means, Math. Intelligencer 28 (2006), no. 1, 32-39.
4. Rajendra Bhatia and John Holbrook, Riemannian geometry and matrix geometric means, Linear Algebra Appl. 413 (2006), no. 2-3, 594-618.
5. R. F. S. Hearmon, The elastic constants of piezoelectric crystals, J. Appl. Phys. 3 (1952), 120-123.
6. Nicholas J. Higham, The Matrix Computation Toolbox, http://www.ma.man.ac.uk/~higham/mctoolbox.
7. Bruno Iannazzo and Beatrice Meini, The matrix geometric mean and other matrix functions: a unifying framework, Tech. report, Dipartimento di Matematica, Università di Pisa, 2009.
8. Yongdo Lim, On Ando-Li-Mathias geometric mean equations, Linear Algebra Appl. 428 (2008), no. 8-9, 1767-1777.
9. Maher Moakher, On the averaging of symmetric positive-definite tensors, J. Elasticity 82 (2006), no. 3, 273-296.

Dipartimento di Matematica, Università di Pisa, Largo B. Pontecorvo 5, 56127 Pisa, Italy
E-mail address: bini, meini@dm.unipi.it

Scuola Normale Superiore, Piazza dei Cavalieri 6, 56126 Pisa, Italy
E-mail address: poloni@sns.it