MTH6140 Linear Algebra II        Notes 4        1st November 2010

4  Determinants

The determinant is a function defined on square matrices; its value is a scalar. It has some very important properties: perhaps the most important is the fact that a matrix is invertible if and only if its determinant is not equal to zero. We denote the determinant function by det, so that det(A) is the determinant of A. For a matrix written out as an array, the determinant is denoted by replacing the square brackets by vertical bars:

    det [ 1 2 ]  =  | 1 2 |
        [ 3 4 ]     | 3 4 | .

4.1  Definition of determinant

You have met determinants in earlier courses, and you know the formulae for the determinant of a 2 × 2 or 3 × 3 matrix:

    | a b |
    | c d |  =  ad − bc,

    | a b c |
    | d e f |  =  aei + bfg + cdh − afh − bdi − ceg.
    | g h i |

Our first job is to define the determinant for square matrices of any size. We do this in an axiomatic manner:

Definition 4.1  A function D defined on n × n matrices is a determinant if it satisfies the following three conditions:

(D1) For each 1 ≤ i ≤ n, the function D is a linear function of the ith column: this means that, if A and A′ are two matrices which agree everywhere except the ith column, and if A″ is the matrix whose ith column is c times the ith column of A plus c′ times the ith column of A′, but agreeing with A and A′ everywhere else, then D(A″) = cD(A) + c′D(A′).
(D2) If A has two equal columns, then D(A) = 0.

(D3) D(I_n) = 1, where I_n is the n × n identity matrix.

We show the following result:

Theorem 4.1  There is a unique determinant function on n × n matrices, for any n.

Proof  First, we show that applying elementary column operations to A has a well-defined effect on D(A).

(a) If B is obtained from A by adding c times the jth column to the ith, then D(B) = D(A).

(b) If B is obtained from A by multiplying the ith column by a non-zero scalar c, then D(B) = cD(A).

(c) If B is obtained from A by interchanging two columns, then D(B) = −D(A).

For (a), let A′ be the matrix which agrees with A in all columns except the ith, which is equal to the jth column of A. By rule (D2), D(A′) = 0. By rule (D1), D(B) = D(A) + cD(A′) = D(A).

Part (b) follows immediately from rule (D1).

To prove part (c), we observe that we can interchange the ith and jth columns by the following sequence of operations: add the ith column to the jth; multiply the ith column by −1; add the jth column to the ith; subtract the ith column from the jth. In symbols,

    (c_i, c_j) → (c_i, c_j + c_i) → (−c_i, c_j + c_i) → (c_j, c_j + c_i) → (c_j, c_i).

The first, third and fourth steps don't change the value of D, while the second multiplies it by −1.

Now we take the matrix A and apply elementary column operations to it, keeping track of the factors by which D gets multiplied according to rules (a)–(c). The overall effect is to multiply D(A) by a certain non-zero scalar c, depending on the operations.
If A is invertible, then we can reduce A to the identity (see Corollary 2.8), so that cD(A) = D(I) = 1 (by (D3)), whence D(A) = c^{-1}.

If A is not invertible, then its column rank is less than n. So the columns of A are linearly dependent, and one column can be written as a linear combination of the others. Applying axiom (D1), we see that D(A) is a linear combination of values D(A′), where A′ are matrices with two equal columns; so D(A′) = 0 for all such A′, whence D(A) = 0.

This proves that the determinant function, if it exists, is unique. We show its existence in the next section, by giving a couple of formulae for it.

Given the uniqueness of the determinant function, we now denote it by det(A) instead of D(A). The proof of the theorem shows an important corollary:

Corollary 4.2  A square matrix A is invertible if and only if det(A) ≠ 0.

Proof  See the case division at the end of the proof of the theorem.

One of the most important properties of the determinant is the following.

Theorem 4.3  If A and B are n × n matrices over K, then det(AB) = det(A) det(B).

Proof  Suppose first that B is not invertible. Then det(B) = 0. Also, AB is not invertible. (For, suppose that (AB)^{-1} = X, so that XAB = I. Then XA is the inverse of B.) So det(AB) = 0, and the theorem is true in this case.

In the other case, B is invertible, so we can apply a sequence of elementary column operations to B to get to the identity (by Corollary 2.8). The effect of these operations is to multiply the determinant by a non-zero factor c (depending on the operations), so that c det(B) = det(I) = 1, or c = (det(B))^{-1}. Now these operations are represented by elementary matrices; so we see that BQ = I, where Q is a product of elementary matrices (see Lemma 2.2). If we apply the same sequence of elementary operations to AB, we end up with the matrix (AB)Q = A(BQ) = AI = A. The determinant is multiplied by the same factor, so we find that c det(AB) = det(A). Since c = (det(B))^{-1}, this implies that det(AB) = det(A) det(B), as required.
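The multiplicativity in Theorem 4.3 is easy to spot-check numerically. Below is a minimal sketch (the helper names and the example matrices are mine, not from the notes), using the familiar 2 × 2 formula ad − bc:

```python
# Spot-check of Theorem 4.3: det(AB) = det(A) det(B).
# det2 implements the 2x2 formula ad - bc from Section 4.1.

def det2(m):
    """Determinant of a 2x2 matrix given as [[a, b], [c, d]]."""
    (a, b), (c, d) = m
    return a * d - b * c

def matmul2(p, q):
    """Product of two 2x2 matrices."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[1, 2], [3, 4]]
B = [[0, 1], [-2, 3]]
# det(A) = -2 and det(B) = 2, so det(AB) should be their product, -4.
assert det2(matmul2(A, B)) == det2(A) * det2(B) == -4
```

The check with a non-invertible B is automatic: if the columns of B are dependent, then so are those of AB, and both sides of the identity are zero.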
Finally, we have defined determinants using columns, but we could have used rows instead:

Proposition 4.4  The determinant is the unique function D of n × n matrices which satisfies the conditions
(D1′) for each 1 ≤ i ≤ n, the function D is a linear function of the ith row;

(D2′) if two rows of A are equal, then D(A) = 0;

(D3′) D(I_n) = 1.

The proof of uniqueness is almost identical to that for columns. To see that D(A) = det(A): if A is not invertible, then D(A) = det(A) = 0; but if A is invertible, then it is a product of elementary matrices (which can represent either row or column operations), and the determinant is the product of the factors associated with these operations.

Corollary 4.5  If A⊤ denotes the transpose of A, then det(A⊤) = det(A).

For, if D denotes the determinant computed by row operations, then det(A) = D(A) = det(A⊤), since row operations on A correspond to column operations on A⊤.

4.2  Calculating determinants

We now give a couple of formulae for the determinant. This finishes the job we left open in the proof of the last theorem, namely, showing that a determinant function actually exists!

The first formula involves some background notation (see also the additional sheet Permutations, available from the module website).

Definition 4.2  A permutation of {1, …, n} is a bijection from the set {1, …, n} to itself. The symmetric group S_n consists of all permutations of the set {1, …, n}. (There are n! such permutations.) For any permutation π ∈ S_n, there is a number sign(π) = ±1, computed as follows: write π as a product of disjoint cycles; if there are k cycles (including cycles of length 1), then sign(π) = (−1)^{n−k}. A transposition is a permutation which interchanges two symbols and leaves all the others fixed. Thus, if τ is a transposition, then sign(τ) = −1. The last fact holds because a transposition has one cycle of size 2 and n − 2 cycles of size 1, so n − 1 cycles altogether; so sign(τ) = (−1)^{n−(n−1)} = −1.

We need one more fact about signs: if π is any permutation and τ is a transposition, then sign(πτ) = −sign(π), where πτ denotes the composition of π and τ (apply first τ, then π).

Definition 4.3  Let A be an n × n matrix over K. The determinant of A is defined by the formula

    det(A) = Σ_{π ∈ S_n} sign(π) A_{1π(1)} A_{2π(2)} ⋯ A_{nπ(n)}.
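Definition 4.3 can be transcribed into code almost verbatim. The sketch below (function names are mine, not from the notes) computes sign(π) by counting inversions, which gives the same value as the cycle-counting rule sign(π) = (−1)^{n−k} of Definition 4.2; indices here are 0-based rather than 1-based:

```python
# Brute-force determinant via the permutation-sum formula of Definition 4.3:
# det(A) = sum over pi in S_n of sign(pi) * A[0][pi(0)] * ... * A[n-1][pi(n-1)].
from itertools import permutations
from math import prod

def sign(pi):
    """Sign of a permutation (given as a tuple), by counting inversions."""
    inv = sum(1 for i in range(len(pi))
                for j in range(i + 1, len(pi)) if pi[i] > pi[j])
    return -1 if inv % 2 else 1

def det(A):
    n = len(A)
    return sum(sign(pi) * prod(A[i][pi[i]] for i in range(n))
               for pi in permutations(range(n)))
```

As the notes warn, this sums n! terms, so it is useful only for checking small examples; for instance det([[1, 2, 3], [4, 5, 6], [7, 8, 10]]) evaluates to −3.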
Proof  In order to show that this is a good definition, we need to verify that it satisfies our three rules (D1)–(D3).

(D1) According to the definition, det(A) is a sum of n! terms. Each term, apart from a sign, is the product of n elements, one from each row and column. If we look at a particular column, say the ith, it is clear that each product is a linear function of that column; so the same is true for the determinant.

(D2) Suppose that the ith and jth columns of A are equal. Let τ be the transposition which interchanges i and j and leaves the other symbols fixed. Then π(τ(i)) = π(j) and π(τ(j)) = π(i), whereas π(τ(k)) = π(k) for k ≠ i, j. Because the elements in the ith and jth columns of A are the same, we see that the products A_{1π(1)} A_{2π(2)} ⋯ A_{nπ(n)} and A_{1πτ(1)} A_{2πτ(2)} ⋯ A_{nπτ(n)} are equal. But sign(πτ) = −sign(π). So the corresponding terms in the formula for the determinant cancel one another. The elements of S_n can be divided up into n!/2 pairs of the form {π, πτ}. As we have seen, each pair of terms in the formula cancel out. We conclude that det(A) = 0. Thus (D2) holds.

(D3) If A = I_n, then the only permutation π which contributes to the sum is the identity permutation ι: any other permutation π satisfies π(i) ≠ i for some i, so that A_{iπ(i)} = 0. The sign of ι is +1, and all the terms A_{iι(i)} = A_{ii} are equal to 1; so det(A) = 1, as required.

This gives us a nice mathematical formula for the determinant of a matrix. Unfortunately, it is a terrible formula in practice, since it involves working out n! terms, each a product of n matrix entries, and adding them up with + and − signs. For n of moderate size, this will take a very long time! (For example, 10! = 3628800.)

Here is a second formula, which is also theoretically important but very inefficient in practice.

Definition 4.4  Let A be an n × n matrix. For 1 ≤ i, j ≤ n, we define the (i, j) minor of A to be the (n−1) × (n−1) matrix obtained by deleting the ith row and jth column of A. Now we define the (i, j) cofactor of A to be (−1)^{i+j} times the determinant of the (i, j) minor.
(These signs have a chessboard pattern, starting with sign + in the top left corner.) We denote the (i, j) cofactor of A by K_{ij}(A). Finally, the adjugate of A is the n × n matrix Adj(A) whose (i, j) entry is the (j, i) cofactor K_{ji}(A) of A. (Note the transposition!)

Theorem 4.6  (a) For 1 ≤ j ≤ n, we have

    det(A) = Σ_{i=1}^{n} A_{ij} K_{ij}(A).
(b) For 1 ≤ i ≤ n, we have

    det(A) = Σ_{j=1}^{n} A_{ij} K_{ij}(A).

This theorem says that, if we take any column or row of A, multiply each element by the corresponding cofactor, and add the results, we get the determinant of A.

Example 4.1  Using a cofactor expansion along the first column, we see that

    | 1 2 3  |
    | 4 5 6  |  =  | 5 6  | − 4 | 2 3  | + 7 | 2 3 |
    | 7 8 10 |     | 8 10 |     | 8 10 |     | 5 6 |

                =  (5 · 10 − 6 · 8) − 4(2 · 10 − 3 · 8) + 7(2 · 6 − 3 · 5)
                =  2 + 16 − 21
                =  −3

using the standard formula for a 2 × 2 determinant.

Proof  We prove (a); the proof for (b) is a simple modification, using rows instead of columns.

Let D(A) be the function defined by the right-hand side of (a) in the theorem, using the jth column of A. We verify rules (D1)–(D3).

(D1) It is clear that D(A) is a linear function of the jth column. For k ≠ j, the cofactors are linear functions of the kth column (since they are determinants), and so D(A) is linear.

(D2) If the kth and lth columns of A are equal (where k and l are different from j), then each cofactor is the determinant of a matrix with two equal columns, and so is zero. The harder case is when the jth column is equal to another, say the kth. Using induction, each cofactor can be expressed as a sum of elements of the kth column times (n−2) × (n−2) determinants. In the resulting sum, it is easy to see that each such determinant occurs twice with opposite signs and multiplied by the same factor. So the terms all cancel.

(D3) Suppose that A = I. The only non-zero cofactor in the jth column is K_{jj}(I), which is equal to (−1)^{j+j} det(I_{n−1}) = 1. So D(I) = 1.

By the main theorem, the expression D(A) is equal to det(A).

At first sight, this looks like a simpler formula for the determinant, since it is just the sum of n terms, rather than n! as in the first case. But each term is an (n−1) × (n−1) determinant. Working down the chain, we find that this method is just as labour-intensive as the other one.

But the cofactor expansion has further nice properties:
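As an aside, Theorem 4.6(a) translates directly into a recursive procedure. The sketch below uses my own helper names, expands along the first column, and uses 0-based indices; as remarked above, it is no faster than the permutation formula:

```python
# Recursive cofactor expansion along the first column (Theorem 4.6(a), j = 1).

def minor(A, i, j):
    """The (i, j) minor: delete row i and column j (0-based)."""
    return [[A[r][c] for c in range(len(A)) if c != j]
            for r in range(len(A)) if r != i]

def det(A):
    n = len(A)
    if n == 1:
        return A[0][0]
    # det(A) = sum over i of (-1)^i * A[i][0] * det(minor(A, i, 0))
    return sum((-1) ** i * A[i][0] * det(minor(A, i, 0)) for i in range(n))

# The matrix from Example 4.1:
assert det([[1, 2, 3], [4, 5, 6], [7, 8, 10]]) == -3
```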
Theorem 4.7  For any n × n matrix A, we have

    A · Adj(A) = Adj(A) · A = det(A) · I.

Proof  We calculate the matrix product. Recall that the (i, j) entry of Adj(A) is K_{ji}(A).

The (i, i) entry of the product A · Adj(A) is

    Σ_{k=1}^{n} A_{ik} (Adj(A))_{ki} = Σ_{k=1}^{n} A_{ik} K_{ik}(A) = det(A),

by the cofactor expansion. On the other hand, if i ≠ j, then the (i, j) entry of the product is

    Σ_{k=1}^{n} A_{ik} (Adj(A))_{kj} = Σ_{k=1}^{n} A_{ik} K_{jk}(A).

This last expression is the cofactor expansion of the matrix A′ which is the same as A except for the jth row, which has been replaced by the ith row of A. (Note that changing the jth row of a matrix has no effect on the cofactors of elements in this row.) So the sum is det(A′). But A′ has two equal rows, so its determinant is zero.

Thus A · Adj(A) has entries det(A) on the diagonal and 0 everywhere else; so it is equal to det(A) · I.

The proof for the product the other way around is the same, using columns instead of rows.

Corollary 4.8  If the matrix A is invertible, then its inverse is equal to (det(A))^{-1} Adj(A).

So how can you work out a determinant efficiently? The best method in practice is to use elementary operations. Apply elementary operations to the matrix, keeping track of the factor by which the determinant is multiplied by each operation. If you want, you can reduce all the way to the identity, and then use the fact that det(I) = 1. Often it is simpler to stop at an earlier stage when you can recognise what the determinant is. For example, if the matrix A has diagonal entries a_1, …, a_n, and all off-diagonal entries are zero, then det(A) is just the product a_1 ⋯ a_n.

Example 4.2  Let

    A = [ 1 2 3  ]
        [ 4 5 6  ]
        [ 7 8 10 ].
Subtracting twice the first column from the second, and three times the first column from the third (these operations don't change the determinant), gives

    [ 1  0   0  ]
    [ 4 −3  −6  ]
    [ 7 −6 −11 ].

Now the cofactor expansion along the first row gives

    det(A) = | −3  −6  |  =  33 − 36  =  −3.
             | −6 −11 |

(At the last step, it is easiest to use the formula for the determinant of a 2 × 2 matrix rather than do any further reduction.)

4.3  The Cayley–Hamilton Theorem

Since we can add and multiply matrices, we can substitute them into a polynomial. For example, if

    A = [ 0  1 ]
        [ −2 3 ],

then the result of substituting A into the polynomial x^2 − 3x + 2 is

    A^2 − 3A + 2I  =  [ −2 3 ] + [ 0 −3 ] + [ 2 0 ]  =  [ 0 0 ]
                      [ −6 7 ]   [ 6 −9 ]   [ 0 2 ]     [ 0 0 ].

We say that the matrix A satisfies the equation x^2 − 3x + 2 = 0. (Notice that for the constant term 2 we substituted 2I.)

It turns out that, for every n × n matrix A, we can calculate a polynomial equation of degree n satisfied by A.

Definition 4.5  Let A be an n × n matrix. The characteristic polynomial of A is the polynomial

    c_A(x) = det(xI − A).

This is a polynomial in x of degree n.

For example, if

    A = [ 0  1 ]
        [ −2 3 ],

then

    c_A(x) = | x    −1   |  =  x(x − 3) + 2  =  x^2 − 3x + 2.
             | 2   x − 3 |

Indeed, it turns out that this is the polynomial we want in general:
Theorem 4.9 (Cayley–Hamilton Theorem)  Let A be an n × n matrix with characteristic polynomial c_A(x). Then c_A(A) = O.

Example 4.3  Let us just check the theorem for 2 × 2 matrices. If

    A = [ a b ]
        [ c d ],

then

    c_A(x) = | x − a   −b   |  =  x^2 − (a + d)x + (ad − bc),
             |  −c   x − d  |

and so

    c_A(A) = [ a^2 + bc   ab + bd ] − (a + d) [ a b ] + (ad − bc) [ 1 0 ]  =  O,
             [ ac + cd   bc + d^2 ]           [ c d ]             [ 0 1 ]

after a small amount of calculation.

Proof  We use the theorem

    A · Adj(A) = det(A) · I.

In place of A, we put the matrix xI − A into this formula:

    (xI − A) Adj(xI − A) = det(xI − A) I = c_A(x) I.

Now it is very tempting just to substitute x = A into this formula: on the right we have c_A(A) I = c_A(A), while on the left there is a factor AI − A = O. Unfortunately this is not valid; it is important to see why. The matrix Adj(xI − A) is a matrix whose entries are determinants of (n−1) × (n−1) matrices with entries involving x. So the entries of Adj(xI − A) are polynomials in x, and if we try to substitute A for x the size of the matrix will be changed!

Instead, we argue as follows. As we have said, Adj(xI − A) is a matrix whose entries are polynomials, so we can write it as a sum of powers of x times matrices, that is, as a polynomial whose coefficients are matrices. For example,

    [ x^2 + 1   2x    ]  =  x^2 [ 1 0 ]  +  x [ 0 2 ]  +  [ 1  0 ]
    [ 3x − 4   x + 2  ]         [ 0 0 ]       [ 3 1 ]     [ −4 2 ].

The entries in Adj(xI − A) are (n−1) × (n−1) determinants, so the highest power of x that can arise is x^{n−1}. So we can write

    Adj(xI − A) = x^{n−1} B_{n−1} + x^{n−2} B_{n−2} + ⋯ + x B_1 + B_0,
for suitable n × n matrices B_0, …, B_{n−1}. Hence

    c_A(x) I = (xI − A) Adj(xI − A)
             = (xI − A)(x^{n−1} B_{n−1} + x^{n−2} B_{n−2} + ⋯ + x B_1 + B_0)
             = x^n B_{n−1} + x^{n−1} (−A B_{n−1} + B_{n−2}) + ⋯ + x (−A B_1 + B_0) − A B_0.

So, if we let

    c_A(x) = x^n + c_{n−1} x^{n−1} + ⋯ + c_1 x + c_0,

then we read off that

    B_{n−1} = I,
    −A B_{n−1} + B_{n−2} = c_{n−1} I,
    ⋮
    −A B_1 + B_0 = c_1 I,
    −A B_0 = c_0 I.

We take this system of equations, and multiply the first by A^n, the second by A^{n−1}, …, and the last by A^0 = I. What happens? On the left, all the terms cancel in pairs: we have

    A^n B_{n−1} + A^{n−1} (−A B_{n−1} + B_{n−2}) + ⋯ + A (−A B_1 + B_0) + I (−A B_0) = O.

On the right, we have

    A^n + c_{n−1} A^{n−1} + ⋯ + c_1 A + c_0 I = c_A(A).

So c_A(A) = O, as claimed.
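As a final check, the Cayley–Hamilton theorem can be verified numerically for the 2 × 2 example used above, A = [[0, 1], [−2, 3]] with c_A(x) = x^2 − 3x + 2. The helper functions below are my own sketch:

```python
# Verify c_A(A) = A^2 - 3A + 2I = O for the example A = [[0, 1], [-2, 3]].

def matmul(P, Q):
    """Product of two square matrices of the same size."""
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matadd(*Ms):
    """Entrywise sum of any number of same-sized matrices."""
    n = len(Ms[0])
    return [[sum(M[i][j] for M in Ms) for j in range(n)] for i in range(n)]

def scale(c, M):
    """Scalar multiple of a matrix."""
    return [[c * x for x in row] for row in M]

A = [[0, 1], [-2, 3]]
I2 = [[1, 0], [0, 1]]
cA_of_A = matadd(matmul(A, A), scale(-3, A), scale(2, I2))
assert cA_of_A == [[0, 0], [0, 0]]  # c_A(A) = O, as Theorem 4.9 predicts
```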