2 Polynomials over a field

A polynomial over a field $F$ is a sequence
\[ f = (a_0, a_1, a_2, \ldots, a_n, \ldots), \qquad a_i \in F \ \ \forall i, \]
with $a_i = 0$ from some point on. $a_i$ is called the $i$th coefficient of $f$.

We define three special polynomials:
\[ 0 = (0, 0, 0, \ldots), \qquad 1 = (1, 0, 0, \ldots), \qquad x = (0, 1, 0, \ldots). \]
The polynomial $(a_0, 0, 0, \ldots)$ is called a constant and is written simply as $a_0$.

Let $F[x]$ denote the set of all polynomials in $x$. If $f \neq 0$, then the degree of $f$, written $\deg f$, is the greatest $n$ such that $a_n \neq 0$. Note that the polynomial $0$ has no degree. $a_n$ is called the leading coefficient of $f$.

$F[x]$ forms a vector space over $F$ if we add polynomials componentwise and define
\[ \lambda(a_0, a_1, \ldots) = (\lambda a_0, \lambda a_1, \ldots), \qquad \lambda \in F. \]

DEFINITION 2.1 (Multiplication of polynomials)
Let $f = (a_0, a_1, \ldots)$ and $g = (b_0, b_1, \ldots)$. Then $fg = (c_0, c_1, \ldots)$, where
\[ c_n = a_0 b_n + a_1 b_{n-1} + \cdots + a_n b_0 = \sum_{i=0}^{n} a_i b_{n-i} = \sum_{\substack{0 \le i,\ 0 \le j \\ i+j=n}} a_i b_j. \]

EXAMPLE 2.1
\[ x^2 = (0, 0, 1, 0, \ldots), \qquad x^3 = (0, 0, 0, 1, 0, \ldots). \]
More generally, an induction shows that $x^n = (a_0, a_1, \ldots)$, where $a_n = 1$ and all other $a_i$ are zero.

If $\deg f = n$, we have $f = a_0 1 + a_1 x + \cdots + a_n x^n$.
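The product rule of Definition 2.1 can be sketched in a few lines of Python. This is only an illustration: the function name `poly_mul` and the choice to store a polynomial as a finite coefficient list `[a_0, a_1, ...]` (with the empty list standing for 0) are assumptions made here, not notation from the notes.

```python
def poly_mul(f, g):
    """Multiply two polynomials given as coefficient lists [a_0, a_1, ...]."""
    if not f or not g:
        return []  # the empty list stands for the zero polynomial
    c = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            c[i + j] += a * b  # a_i * b_j contributes to c_{i+j}
    return c


x = [0, 1]                        # the polynomial x = (0, 1, 0, ...)
x_squared = poly_mul(x, x)        # coefficients of x^2
x_cubed = poly_mul(x_squared, x)  # coefficients of x^3
```

Here `x_squared` and `x_cubed` reproduce the sequences of Example 2.1, truncated after the leading coefficient.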
THEOREM 2.1 (Associative Law)
\[ f(gh) = (fg)h. \]

PROOF Take $f, g$ as above and $h = (c_0, c_1, \ldots)$, and write $f_u, g_v, h_j$ for the coefficients of $f, g, h$. Then $(fg)h = (d_0, d_1, \ldots)$, where
\[ d_n = \sum_{i+j=n} (fg)_i h_j = \sum_{i+j=n} \Big( \sum_{u+v=i} f_u g_v \Big) h_j = \sum_{u+v+j=n} f_u g_v h_j. \]
Likewise $f(gh) = (e_0, e_1, \ldots)$, where
\[ e_n = \sum_{u+v+j=n} f_u g_v h_j. \]

Some properties of polynomial arithmetic:
\[ fg = gf, \qquad 0f = 0, \qquad 1f = f, \qquad f(g + h) = fg + fh, \]
\[ f \neq 0 \text{ and } g \neq 0 \ \Rightarrow\ fg \neq 0 \text{ and } \deg(fg) = \deg f + \deg g. \]
The last statement is equivalent to
\[ fg = 0 \ \Rightarrow\ f = 0 \text{ or } g = 0. \]
Then we deduce that
\[ fh = fg \text{ and } f \neq 0 \ \Rightarrow\ h = g. \]

2.1 Lagrange Interpolation Polynomials

Let $P_n[F]$ denote the set of polynomials $a_0 + a_1 x + \cdots + a_n x^n$, where $a_0, \ldots, a_n \in F$. Then $a_0 + a_1 x + \cdots + a_n x^n = 0$ implies that $a_0 = 0, \ldots, a_n = 0$.

$P_n[F]$ is a subspace of $F[x]$ and $1, x, x^2, \ldots, x^n$ form the standard basis for $P_n[F]$.
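If $f \in P_n[F]$ and $c \in F$, we write
\[ f(c) = a_0 + a_1 c + \cdots + a_n c^n. \]
This is the value of $f$ at $c$. This symbol has the following properties:
\[ (f + g)(c) = f(c) + g(c), \qquad (\lambda f)(c) = \lambda(f(c)), \qquad (fg)(c) = f(c)g(c). \]

DEFINITION 2.2
Let $c_1, \ldots, c_{n+1}$ be distinct members of $F$. Then the Lagrange interpolation polynomials $p_1, \ldots, p_{n+1}$ are polynomials of degree $n$ defined by
\[ p_i = \prod_{\substack{j=1 \\ j \neq i}}^{n+1} \left( \frac{x - c_j}{c_i - c_j} \right), \qquad 1 \le i \le n + 1. \]

EXAMPLE 2.2
\[ p_1 = \left( \frac{x - c_2}{c_1 - c_2} \right) \left( \frac{x - c_3}{c_1 - c_3} \right) \cdots \left( \frac{x - c_{n+1}}{c_1 - c_{n+1}} \right), \]
\[ p_2 = \left( \frac{x - c_1}{c_2 - c_1} \right) \left( \frac{x - c_3}{c_2 - c_3} \right) \cdots \left( \frac{x - c_{n+1}}{c_2 - c_{n+1}} \right), \]
etc.

We now show that the Lagrange polynomials also form a basis for $P_n[F]$.

PROOF Noting that there are $n + 1$ elements in the standard basis, above, we see that $\dim P_n[F] = n + 1$ and so it suffices to show that $p_1, \ldots, p_{n+1}$ are LI.

We use the following property of the polynomials $p_i$:
\[ p_i(c_j) = \delta_{ij} = \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{if } i \neq j. \end{cases} \]
Assume that
\[ a_1 p_1 + \cdots + a_{n+1} p_{n+1} = 0, \]
where $a_i \in F$, $1 \le i \le n + 1$. Evaluating both sides at $c_1, \ldots, c_{n+1}$ gives
\[ \begin{aligned} a_1 p_1(c_1) + \cdots + a_{n+1} p_{n+1}(c_1) &= 0 \\ &\ \ \vdots \\ a_1 p_1(c_{n+1}) + \cdots + a_{n+1} p_{n+1}(c_{n+1}) &= 0. \end{aligned} \]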
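That is,
\[ \begin{aligned} a_1 \cdot 1 + a_2 \cdot 0 + \cdots + a_{n+1} \cdot 0 &= 0 \\ a_1 \cdot 0 + a_2 \cdot 1 + \cdots + a_{n+1} \cdot 0 &= 0 \\ &\ \ \vdots \\ a_1 \cdot 0 + a_2 \cdot 0 + \cdots + a_{n+1} \cdot 1 &= 0. \end{aligned} \]
Hence $a_i = 0$ for all $i$, as required.

COROLLARY 2.1
If $f \in P_n[F]$ then
\[ f = f(c_1) p_1 + \cdots + f(c_{n+1}) p_{n+1}. \]
Proof: We know that
\[ f = \lambda_1 p_1 + \cdots + \lambda_{n+1} p_{n+1} \quad \text{for some } \lambda_i \in F. \]
Evaluating both sides at $c_1, \ldots, c_{n+1}$ then gives
\[ f(c_1) = \lambda_1, \quad \ldots, \quad f(c_{n+1}) = \lambda_{n+1}, \]
as required.

COROLLARY 2.2
If $f \in P_n[F]$ and $f(c_1) = 0, \ldots, f(c_{n+1}) = 0$ where $c_1, \ldots, c_{n+1}$ are distinct, then $f = 0$. (I.e. a non-zero polynomial of degree $n$ can have at most $n$ roots.)

COROLLARY 2.3
If $b_1, \ldots, b_{n+1}$ are any scalars in $F$, and $c_1, \ldots, c_{n+1}$ are again distinct, then there exists a unique polynomial $f \in P_n[F]$ such that
\[ f(c_1) = b_1, \quad \ldots, \quad f(c_{n+1}) = b_{n+1}; \]
namely
\[ f = b_1 p_1 + \cdots + b_{n+1} p_{n+1}. \]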
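Corollary 2.3 translates directly into a small computation. The following is a minimal sketch over $\mathbb{Q}$ using Python's exact `Fraction` type; the helper names (`lagrange_basis`, `interpolate`, `evaluate`) and the coefficient-list representation are assumptions of this illustration. The sample data are the values 8, 5, 4 at the points 1, 2, 3.

```python
from fractions import Fraction


def poly_mul(f, g):
    """Product of two non-empty coefficient lists [a_0, a_1, ...]."""
    c = [Fraction(0)] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            c[i + j] += a * b
    return c


def lagrange_basis(nodes, i):
    """Coefficients of p_i = prod_{j != i} (x - c_j) / (c_i - c_j)."""
    p = [Fraction(1)]
    for j, c_j in enumerate(nodes):
        if j != i:
            denom = Fraction(nodes[i] - c_j)
            p = poly_mul(p, [-c_j / denom, 1 / denom])
    return p


def interpolate(nodes, values):
    """Coefficients of f = b_1 p_1 + ... + b_{n+1} p_{n+1}."""
    f = [Fraction(0)] * len(nodes)
    for i, b_i in enumerate(values):
        for k, coeff in enumerate(lagrange_basis(nodes, i)):
            f[k] += b_i * coeff
    return f


def evaluate(f, c):
    """Value f(c) = a_0 + a_1 c + ... + a_n c^n."""
    return sum(a * c**k for k, a in enumerate(f))


f = interpolate([1, 2, 3], [8, 5, 4])  # coefficients a_0, a_1, a_2
```

With these inputs the coefficient list is `[13, -6, 1]`, that is $f = x^2 - 6x + 13$, and evaluating at 1, 2, 3 returns 8, 5, 4.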
EXAMPLE 2.3
Find the quadratic polynomial
\[ f = a_0 + a_1 x + a_2 x^2 \in P_2[\mathbb{R}] \]
such that $f(1) = 8$, $f(2) = 5$, $f(3) = 4$.

Solution: $f = 8p_1 + 5p_2 + 4p_3$, where
\[ p_1 = \frac{(x - 2)(x - 3)}{(1 - 2)(1 - 3)}, \qquad p_2 = \frac{(x - 1)(x - 3)}{(2 - 1)(2 - 3)}, \qquad p_3 = \frac{(x - 1)(x - 2)}{(3 - 1)(3 - 2)}. \]

2.2 Division of polynomials

DEFINITION 2.3
If $f, g \in F[x]$, we say $f$ divides $g$ if $\exists h \in F[x]$ such that $g = fh$. For this we write $f \mid g$, and $f \nmid g$ denotes the negation "$f$ does not divide $g$".

Some properties:
\[ f \mid g \text{ and } g \neq 0 \ \Rightarrow\ \deg f \le \deg g, \]
and thus of course
\[ f \mid 1 \ \Rightarrow\ \deg f = 0. \]

2.2.1 Euclid's Division Theorem

Let $f, g \in F[x]$ and $g \neq 0$. Then $\exists q, r \in F[x]$ such that
\[ f = qg + r, \tag{3} \]
where $r = 0$ or $\deg r < \deg g$. Moreover $q$ and $r$ are unique.

Outline of Proof:
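If $f = 0$ or $\deg f < \deg g$, (3) is trivially true (taking $q = 0$ and $r = f$). So assume $\deg f \ge \deg g$, where
\[ f = a_m x^m + a_{m-1} x^{m-1} + \cdots + a_0, \qquad g = b_n x^n + \cdots + b_0, \]
and we have a long division process, viz:
\[
\begin{array}{l}
\phantom{b_n x^n + \cdots + b_0 \ \big)\ } a_m b_n^{-1} x^{m-n} + \cdots \\
b_n x^n + \cdots + b_0 \ \big)\ a_m x^m + a_{m-1} x^{m-1} + \cdots + a_0 \\
\phantom{b_n x^n + \cdots + b_0 \ \big)\ } a_m x^m + \cdots \\
\phantom{b_n x^n + \cdots + b_0 \ \big)\ } \text{etc.}
\end{array}
\]
(See S. Perlis, Theory of Matrices, p. 111.)

2.2.2 Euclid's Division Algorithm

\[
\begin{aligned}
f &= q_1 g + r_1 && \text{with } \deg r_1 < \deg g \\
g &= q_2 r_1 + r_2 && \text{with } \deg r_2 < \deg r_1 \\
r_1 &= q_3 r_2 + r_3 && \text{with } \deg r_3 < \deg r_2 \\
&\ \ \vdots \\
r_{n-2} &= q_n r_{n-1} + r_n && \text{with } \deg r_n < \deg r_{n-1} \\
r_{n-1} &= q_{n+1} r_n.
\end{aligned}
\]
Then $r_n = \gcd(f, g)$, the greatest common divisor of $f$ and $g$; i.e. $r_n$ is a polynomial $d$ with the property that

1. $d \mid f$ and $d \mid g$, and
2. $\forall e \in F[x]$, $e \mid f$ and $e \mid g \Rightarrow e \mid d$.

(This defines $\gcd(f, g)$ uniquely up to a constant multiple.) We select the monic (i.e. leading coefficient $= 1$) gcd as "the" gcd.

Also, $\exists u, v \in F[x]$ such that
\[ r_n = \gcd(f, g) = uf + vg. \]
We find $u$ and $v$ by forward substitution in Euclid's algorithm; viz.
\[ r_1 = f + (-q_1) g, \qquad r_2 = g + (-q_2) r_1. \]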
Substituting the expression for $r_1$ into that for $r_2$ gives
\[
\begin{aligned}
r_2 &= g + (-q_2)(f + (-q_1) g) \\
&= g + (-q_2) f + (q_1 q_2) g \\
&= (-q_2) f + (1 + q_1 q_2) g \\
&\ \ \vdots \\
r_n &= \underbrace{(\cdots)}_{u} f + \underbrace{(\cdots)}_{v} g.
\end{aligned}
\]
In general, $r_k = s_k f + t_k g$ for $-1 \le k \le n$, where
\[ r_{-1} = f, \quad r_0 = g, \quad s_{-1} = 1, \quad s_0 = 0, \quad t_{-1} = 0, \quad t_0 = 1 \]
and
\[ s_k = -q_k s_{k-1} + s_{k-2}, \qquad t_k = -q_k t_{k-1} + t_{k-2} \]
for $1 \le k \le n$. (Proof by induction.)

The special case $\gcd(f, g) = 1$ (i.e. $f$ and $g$ are relatively prime) is of great importance: here $\exists u, v \in F[x]$ such that $uf + vg = 1$.

EXERCISE 2.1
Find $\gcd(3x^2 + 2x + 4,\ 2x^4 + 5x + 1)$ in $\mathbb{Q}[x]$ and express it as $uf + vg$ for two polynomials $u$ and $v$.

2.3 Irreducible Polynomials

DEFINITION 2.4
Let $f$ be a non-constant polynomial. Then, if
\[ g \mid f \ \Rightarrow\ g \text{ is a constant or } g = \text{constant} \times f, \]
we call $f$ an irreducible polynomial.

Note: (Remainder theorem)
\[ f = (x - a) q + f(a), \quad \text{where } a \in F. \]
So $f(a) = 0$ iff $(x - a) \mid f$.

EXAMPLE 2.4
$f(x) = x^2 + x + 1 \in \mathbb{Z}_2[x]$ is irreducible, for $f(0) = f(1) = 1 \neq 0$, and hence there are no polynomials of degree 1 which divide $f$.
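The recurrences $s_k = -q_k s_{k-1} + s_{k-2}$ and $t_k = -q_k t_{k-1} + t_{k-2}$ from Section 2.2.2 can be run mechanically. The sketch below works over $\mathbb{Q}$ with exact fractions; the function names, the coefficient-list representation (lowest degree first, empty list for 0) and the sample pair $f = x^2 - 1$, $g = x^2 - 3x + 2$ are all choices made for this illustration and do not come from the notes.

```python
from fractions import Fraction


def trim(f):
    """Drop trailing zero coefficients; [] is the zero polynomial."""
    f = list(f)
    while f and f[-1] == 0:
        f.pop()
    return f


def add(f, g):
    n = max(len(f), len(g))
    f = f + [0] * (n - len(f))
    g = g + [0] * (n - len(g))
    return trim([a + b for a, b in zip(f, g)])


def mul(f, g):
    if not f or not g:
        return []
    c = [Fraction(0)] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            c[i + j] += a * b
    return trim(c)


def divmod_poly(f, g):
    """Return (q, r) with f = q*g + r and r = 0 or deg r < deg g (g != 0)."""
    q = [Fraction(0)] * max(len(f) - len(g) + 1, 0)
    r = trim(f)
    while len(r) >= len(g):
        shift = len(r) - len(g)
        coef = Fraction(r[-1]) / g[-1]  # cancels the leading term of r
        q[shift] = coef
        r = trim([a - coef * g[i - shift] if i >= shift else a
                  for i, a in enumerate(r)])
    return trim(q), r


def extended_gcd(f, g):
    """Return (d, u, v) with d the monic gcd of f, g and d = u*f + v*g."""
    r_prev, r_cur = trim([Fraction(c) for c in f]), trim([Fraction(c) for c in g])
    s_prev, s_cur = [Fraction(1)], []   # s_{-1} = 1, s_0 = 0
    t_prev, t_cur = [], [Fraction(1)]   # t_{-1} = 0, t_0 = 1
    while r_cur:
        q, r_next = divmod_poly(r_prev, r_cur)
        neg_q = [-c for c in q]
        s_prev, s_cur = s_cur, add(s_prev, mul(neg_q, s_cur))
        t_prev, t_cur = t_cur, add(t_prev, mul(neg_q, t_cur))
        r_prev, r_cur = r_cur, r_next
    lead = r_prev[-1]                   # normalise to the monic gcd
    return ([c / lead for c in r_prev],
            [c / lead for c in s_prev],
            [c / lead for c in t_prev])


f = [-1, 0, 1]   # x^2 - 1
g = [2, -3, 1]   # x^2 - 3x + 2
d, u, v = extended_gcd(f, g)
```

For this pair the sketch returns $d = x - 1$ with $u = \tfrac13$ and $v = -\tfrac13$, and one can confirm $uf + vg = d$ by multiplying out.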
THEOREM 2.2
Let $f$ be irreducible. Then if $f \nmid g$, $\gcd(f, g) = 1$ and $\exists u, v \in F[x]$ such that
\[ uf + vg = 1. \]

PROOF Suppose $f$ is irreducible and $f \nmid g$. Let $d = \gcd(f, g)$, so $d \mid f$ and $d \mid g$. Then either $d = cf$ for some constant $c$, or $d = 1$. But if $d = cf$ then
\[ f \mid d \text{ and } d \mid g \ \Rightarrow\ f \mid g, \]
a contradiction. So $d = 1$ as required.

COROLLARY 2.4
If $f$ is irreducible and $f \mid gh$, then $f \mid g$ or $f \mid h$.

Proof: Suppose $f$ is irreducible and $f \mid gh$, $f \nmid g$. We show that $f \mid h$. By the above theorem, $\exists u, v$ such that
\[ uf + vg = 1 \ \Rightarrow\ ufh + vgh = h \ \Rightarrow\ f \mid h. \]

THEOREM 2.3
Any non-constant polynomial is expressible as a product of irreducible polynomials, where the representation is unique up to the order of the irreducible factors.

Some examples:
\[
\begin{aligned}
(x + 1)^2 &= x^2 + 2x + 1 = x^2 + 1 && \text{in } \mathbb{Z}_2[x] \\
(x^2 + x + 1)^2 &= x^4 + x^2 + 1 && \text{in } \mathbb{Z}_2[x] \\
(2x^2 + x + 1)(2x + 1) &= x^3 + x^2 + 1 && \text{in } \mathbb{Z}_3[x] \\
&= (x^2 + 2x + 2)(x + 2) && \text{in } \mathbb{Z}_3[x].
\end{aligned}
\]

PROOF
Existence of factorization: If $f \in F[x]$ is not a constant polynomial, then $f$ being irreducible implies the result. Otherwise, $f = f_1 F_1$, with $0 < \deg f_1, \deg F_1 < \deg f$. If $f_1$ and $F_1$ are irreducible, stop. Otherwise, keep going. Since degrees strictly decrease, eventually we end with a decomposition of $f$ into irreducible polynomials.

Uniqueness: Let
\[ c f_1 f_2 \cdots f_m = d g_1 g_2 \cdots g_n \]
be two decompositions into products of constants ($c$ and $d$) and monic irreducibles ($f_i$, $g_j$). Now
\[ f_1 \mid f_1 f_2 \cdots f_m \ \Rightarrow\ f_1 \mid g_1 g_2 \cdots g_n, \]
and since the $f_i$, $g_j$ are irreducible we can cancel $f_1$ and some $g_j$. Repeating this for $f_2, \ldots, f_m$, we eventually obtain $m = n$ and $c = d$; in other words, each expression is simply a rearrangement of the factors of the other, as required.

THEOREM 2.4
Let $F_q$ be a field with $q$ elements. Then if $n \in \mathbb{N}$, there exists an irreducible polynomial of degree $n$ in $F_q[x]$.

PROOF First we introduce the idea of the Riemann zeta function:
\[ \zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s} = \prod_{p \text{ prime}} \frac{1}{1 - \frac{1}{p^s}}. \]
To see the equality of the latter expressions note that
\[ \frac{1}{1 - x} = \sum_{i=0}^{\infty} x^i = 1 + x + x^2 + \cdots \]
and so
\[
\begin{aligned}
\text{RHS} &= \prod_{p \text{ prime}} \left( \sum_{i=0}^{\infty} \frac{1}{p^{is}} \right) \\
&= \left( 1 + \frac{1}{2^s} + \frac{1}{2^{2s}} + \cdots \right) \left( 1 + \frac{1}{3^s} + \frac{1}{3^{2s}} + \cdots \right) \cdots \\
&= 1 + \frac{1}{2^s} + \frac{1}{3^s} + \frac{1}{4^s} + \cdots
\end{aligned}
\]
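Note for the last step that terms will be of the form
\[ \left( \frac{1}{p_1^{a_1} \cdots p_R^{a_R}} \right)^{s} \]
up to some prime $p_R$, with $a_i \ge 0$ for all $i = 1, \ldots, R$, and as $R \to \infty$, the prime factorizations $p_1^{a_1} \cdots p_R^{a_R}$ map onto the natural numbers $\mathbb{N}$.

We let $N_m$ denote the number of monic irreducibles of degree $m$ in $F_q[x]$. For example, $N_1 = q$, since $x + a$, $a \in F_q$, are the irreducible polynomials of degree 1.

Now let $|f| = q^{\deg f}$, and $|0| = 0$. Then we have
\[ |fg| = |f|\,|g|, \quad \text{since } \deg fg = \deg f + \deg g, \]
and, because of the uniqueness of factorization theorem,
\[ \sum_{f \text{ monic}} \frac{1}{|f|^s} = \prod_{\substack{f \text{ monic and} \\ \text{irreducible}}} \frac{1}{1 - \frac{1}{|f|^s}}. \]
Now the left hand side is
\[
\begin{aligned}
\sum_{n=0}^{\infty} \sum_{\substack{f \text{ monic and} \\ \deg f = n}} \frac{1}{|f|^s} &= \sum_{n=0}^{\infty} \frac{q^n}{q^{ns}} \qquad \text{(there are } q^n \text{ monic polynomials of degree } n\text{)} \\
&= \sum_{n=0}^{\infty} \frac{1}{q^{n(s-1)}} \\
&= \frac{1}{1 - \frac{1}{q^{s-1}}}
\end{aligned}
\]
and
\[ \text{RHS} = \prod_{n=1}^{\infty} \frac{1}{\left( 1 - \frac{1}{q^{ns}} \right)^{N_n}}. \]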
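Equating the two, we have
\[ \frac{1}{1 - \frac{1}{q^{s-1}}} = \prod_{n=1}^{\infty} \frac{1}{\left( 1 - \frac{1}{q^{ns}} \right)^{N_n}}. \tag{4} \]
We now take logs of both sides, and then use the fact that
\[ \log\left( \frac{1}{1 - x} \right) = \sum_{n=1}^{\infty} \frac{x^n}{n} \quad \text{if } |x| < 1; \]
so (4) becomes
\[ \log \frac{1}{1 - q^{-(s-1)}} = \sum_{n=1}^{\infty} N_n \log\left( \frac{1}{1 - \frac{1}{q^{ns}}} \right), \]
so
\[ \sum_{k=1}^{\infty} \frac{1}{k q^{(s-1)k}} = \sum_{n=1}^{\infty} N_n \sum_{m=1}^{\infty} \frac{1}{m q^{mns}}, \]
that is,
\[ \sum_{k=1}^{\infty} \frac{q^k}{k q^{sk}} = \sum_{n=1}^{\infty} \sum_{m=1}^{\infty} \frac{n N_n}{mn\, q^{mns}} = \sum_{k=1}^{\infty} \frac{\sum_{mn=k} n N_n}{k q^{ks}}. \]
Putting $x = q^{-s}$, we have
\[ \sum_{k=1}^{\infty} \frac{q^k x^k}{k} = \sum_{k=1}^{\infty} \frac{x^k}{k} \sum_{mn=k} n N_n, \]
and since both sides are power series, we may equate coefficients of $x^k$ to obtain
\[ q^k = \sum_{mn=k} n N_n = \sum_{n \mid k} n N_n. \tag{5} \]
We can deduce from this that $N_n > 0$ as $n \to \infty$ (see Berlekamp's "Algebraic Coding Theory").

Now note that $N_1 = q$, so if $k$ is a prime, say $k = p$, (5) gives
\[ q^p = N_1 + p N_p = q + p N_p \ \Rightarrow\ N_p = \frac{q^p - q}{p} > 0 \quad \text{as } q > 1 \text{ and } p \ge 2. \]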
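This proves the theorem for $n = p$, a prime.

But what if $k$ is not prime? Equation (5) also tells us that
\[ q^k \ge k N_k. \]
Now let $k \ge 2$. Then
\[
\begin{aligned}
q^k &= k N_k + \sum_{\substack{n \mid k \\ n \neq k}} n N_n \\
&\le k N_k + \sum_{n=1}^{\lfloor k/2 \rfloor} n N_n \\
&\le k N_k + \sum_{n=1}^{\lfloor k/2 \rfloor} q^n && \text{(as } n N_n \le q^n\text{)} \\
&< k N_k + \sum_{n=0}^{\lfloor k/2 \rfloor} q^n && \text{(adding 1)} \\
&= k N_k + \frac{q^{\lfloor k/2 \rfloor + 1} - 1}{q - 1} && \text{(sum of geometric series).}
\end{aligned}
\]
But
\[ \frac{q^{t+1} - 1}{q - 1} < q^{t+1} \quad \text{if } q \ge 2, \]
so
\[ q^k < k N_k + q^{\lfloor k/2 \rfloor + 1} \ \Rightarrow\ N_k > \frac{q^k - q^{\lfloor k/2 \rfloor + 1}}{k} \ge 0 \quad \text{if } q^k \ge q^{\lfloor k/2 \rfloor + 1}. \]
Since $q > 1$ (we cannot have a field with a single element, since the additive and multiplicative identities cannot be equal by one of the axioms), the latter condition is equivalent to
\[ k \ge \lfloor k/2 \rfloor + 1, \]
which is true, and the theorem is proven.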
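Equation (5), $q^k = \sum_{n \mid k} n N_n$, determines the numbers $N_k$ recursively, since $k N_k$ is $q^k$ minus the contributions of the proper divisors of $k$. A minimal sketch follows; the function name `count_monic_irreducibles` is an assumption of this illustration.

```python
def count_monic_irreducibles(q, max_degree):
    """Return {k: N_k} for 1 <= k <= max_degree, using q^k = sum_{n | k} n * N_n."""
    N = {}
    for k in range(1, max_degree + 1):
        # contribution of the proper divisors n of k, already computed
        smaller = sum(n * N[n] for n in range(1, k) if k % n == 0)
        N[k] = (q**k - smaller) // k
    return N


N_over_F2 = count_monic_irreducibles(2, 6)
N_over_F3 = count_monic_irreducibles(3, 3)
```

For $q = 2$ this gives $N_1 = 2$, $N_2 = 1$, $N_3 = 2$, in agreement with the irreducibles $x$, $x+1$; $x^2+x+1$; $x^3+x+1$, $x^3+x^2+1$ over $\mathbb{Z}_2$.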
2.4 Minimum Polynomial of a (Square) Matrix

Let $A \in M_{n \times n}(F)$, and $g = \operatorname{ch}_A$. Then $g(A) = 0$ by the Cayley–Hamilton theorem.

DEFINITION 2.5
Any non-zero polynomial $g$ of minimum degree and satisfying $g(A) = 0$ is called a minimum polynomial of $A$.

Note: If $f$ is a minimum polynomial of $A$, then $f$ cannot be a constant polynomial. For if $f = c$, a constant, then $0 = f(A) = c I_n$ implies $c = 0$.

THEOREM 2.5
If $f$ is a minimum polynomial of $A$ and $g(A) = 0$, then $f \mid g$. (In particular, $f \mid \operatorname{ch}_A$.)

PROOF Let $g(A) = 0$ and $f$ be a minimum polynomial. Then
\[ g = qf + r, \]
where $r = 0$ or $\deg r < \deg f$. Hence
\[ g(A) = q(A) \cdot 0 + r(A), \qquad \text{so } 0 = r(A). \]
So if $r \neq 0$, the inequality $\deg r < \deg f$ would contradict the definition of $f$. Consequently $r = 0$ and $f \mid g$.

Note: It follows that if $f$ and $g$ are minimum polynomials of $A$, then $f \mid g$ and $g \mid f$ and consequently $f = cg$, where $c$ is a scalar. Hence there is a unique monic minimum polynomial and we denote it by $m_A$.

EXAMPLES (of minimum polynomials):

1. $A = 0 \Leftrightarrow m_A = x$
2. $A = I_n \Leftrightarrow m_A = x - 1$
3. $A = c I_n \Leftrightarrow m_A = x - c$
4. $A^2 = A$ and $A \neq 0$ and $A \neq I_n \Leftrightarrow m_A = x^2 - x$.

EXAMPLE 2.5
$F = \mathbb{Q}$ and
\[ A = \begin{bmatrix} 5 & -6 & -6 \\ -1 & 4 & 2 \\ 3 & -6 & -4 \end{bmatrix}. \]
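Now $A \neq c_0 I_3$ for $c_0 \in \mathbb{Q}$, so $m_A \neq x - c_0$, and
\[ A^2 = 3A - 2I_3 \ \Rightarrow\ m_A = x^2 - 3x + 2. \]
This is a special case of a general algorithm:

(Minimum polynomial algorithm) Let $A \in M_{n \times n}(F)$. Then we find the least positive integer $r$ such that $A^r$ is expressible as a linear combination of the matrices
\[ I_n, A, \ldots, A^{r-1}, \]
say
\[ A^r = c_0 I_n + c_1 A + \cdots + c_{r-1} A^{r-1}. \]
(Such an integer must exist as $I_n, A, \ldots, A^{n^2}$ form a linearly dependent family in the vector space $M_{n \times n}(F)$ and this latter space has dimension equal to $n^2$.) Then
\[ m_A = x^r - c_{r-1} x^{r-1} - \cdots - c_1 x - c_0. \]

THEOREM 2.6
If $f = x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0 \in F[x]$, then $m_{C(f)} = f$, where
\[ C(f) = \begin{bmatrix} 0 & 0 & \cdots & 0 & -a_0 \\ 1 & 0 & \cdots & 0 & -a_1 \\ 0 & 1 & \cdots & 0 & -a_2 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & 1 & -a_{n-1} \end{bmatrix}. \]

PROOF For brevity denote $C(f)$ by $A$. Then post-multiplying $A$ by the respective unit column vectors $E_1, \ldots, E_n$ gives
\[
\begin{aligned}
A E_1 &= E_2 \\
A E_2 &= E_3 \ \Rightarrow\ A^2 E_1 = E_3 \\
&\ \ \vdots \\
A E_{n-1} &= E_n \ \Rightarrow\ A^{n-1} E_1 = E_n \\
A E_n &= -a_0 E_1 - a_1 E_2 - \cdots - a_{n-1} E_n \\
&= -a_0 E_1 - a_1 A E_1 - \cdots - a_{n-1} A^{n-1} E_1,
\end{aligned}
\]
and since also $A E_n = A^n E_1$, this gives $A^n E_1 = -a_0 E_1 - a_1 A E_1 - \cdots - a_{n-1} A^{n-1} E_1$,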
so
\[ f(A) E_1 = 0 \ \Rightarrow\ \text{first column of } f(A) \text{ is zero.} \]
Now although matrix multiplication is not commutative, multiplication of two matrices, each of which is a polynomial in a given square matrix $A$, is commutative. Hence $f(A) g(A) = g(A) f(A)$ if $f, g \in F[x]$. Taking $g = x$ gives
\[ f(A) A = A f(A). \]
Thus
\[ f(A) E_2 = f(A) A E_1 = A f(A) E_1 = 0 \]
and so the second column of $f(A)$ is zero. Repeating this for $E_3, \ldots, E_n$, we see that
\[ f(A) = 0 \]
and thus $m_A \mid f$.

To show $m_A = f$, we assume $\deg m_A = t < n$; say
\[ m_A = x^t + b_{t-1} x^{t-1} + \cdots + b_0. \]
Now
\[
\begin{aligned}
m_A(A) = 0 \ &\Rightarrow\ A^t + b_{t-1} A^{t-1} + \cdots + b_0 I_n = 0 \\
&\Rightarrow\ (A^t + b_{t-1} A^{t-1} + \cdots + b_0 I_n) E_1 = 0,
\end{aligned}
\]
and recalling that $A E_1 = E_2$ etc., and $t < n$, we have
\[ E_{t+1} + b_{t-1} E_t + \cdots + b_1 E_2 + b_0 E_1 = 0, \]
which is a contradiction: since the $E_i$ are independent, the coefficient of $E_{t+1}$ cannot be 1.

Hence $m_A = f$.

Note: It follows that $\operatorname{ch}_A = f$, because both $\operatorname{ch}_A$ and $m_A$ have degree $n$ and moreover $m_A$ divides $\operatorname{ch}_A$.

EXERCISE 2.2
If $A = J_n(a)$ for $a \in F$, an elementary Jordan matrix of size $n$, show
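that $m_A = (x - a)^n$, where
\[ A = J_n(a) = \begin{bmatrix} a & 0 & \cdots & 0 & 0 \\ 1 & a & \cdots & 0 & 0 \\ 0 & 1 & \ddots & \vdots & \vdots \\ \vdots & \vdots & \ddots & a & 0 \\ 0 & 0 & \cdots & 1 & a \end{bmatrix} \]
(i.e. $A$ is an $n \times n$ matrix with $a$'s on the diagonal and 1's on the subdiagonal).

Note: Again, the minimum polynomial happens to equal the characteristic polynomial here.

DEFINITION 2.6 (Direct Sum of Matrices)
Let $A_1, \ldots, A_t$ be matrices over $F$. Then the direct sum of these matrices is defined as follows:
\[ A_1 \oplus A_2 \oplus \cdots \oplus A_t = \begin{bmatrix} A_1 & 0 & \cdots & 0 \\ 0 & A_2 & & \vdots \\ \vdots & & \ddots & \\ 0 & \cdots & & A_t \end{bmatrix}. \]

Properties:

1. $(A_1 \oplus \cdots \oplus A_t) + (B_1 \oplus \cdots \oplus B_t) = (A_1 + B_1) \oplus \cdots \oplus (A_t + B_t)$
2. If $\lambda \in F$, $\lambda(A_1 \oplus \cdots \oplus A_t) = (\lambda A_1) \oplus \cdots \oplus (\lambda A_t)$
3. $(A_1 \oplus \cdots \oplus A_t)(B_1 \oplus \cdots \oplus B_t) = (A_1 B_1) \oplus \cdots \oplus (A_t B_t)$
4. If $f \in F[x]$ and $A_1, \ldots, A_t$ are square, $f(A_1 \oplus \cdots \oplus A_t) = f(A_1) \oplus \cdots \oplus f(A_t)$

DEFINITION 2.7
If $f_1, \ldots, f_t \in F[x]$, we call $f \in F[x]$ a least common multiple (lcm) of $f_1, \ldots, f_t$ if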
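1. $f_1 \mid f, \ldots, f_t \mid f$, and
2. $f_1 \mid e, \ldots, f_t \mid e \Rightarrow f \mid e$.

This uniquely defines the lcm up to a constant multiple and so we set "the" lcm to be the monic lcm.

EXAMPLES 2.1
If $fg \neq 0$, $\operatorname{lcm}(f, g) \mid fg$.
(Recursive property)
\[ \operatorname{lcm}(f_1, \ldots, f_{t+1}) = \operatorname{lcm}(\operatorname{lcm}(f_1, \ldots, f_t), f_{t+1}). \]

THEOREM 2.7
\[ m_{A_1 \oplus \cdots \oplus A_t} = \operatorname{lcm}(m_{A_1}, \ldots, m_{A_t}). \]
Also
\[ \operatorname{ch}_{A_1 \oplus \cdots \oplus A_t} = \prod_{i=1}^{t} \operatorname{ch}_{A_i}. \]

PROOF Let $f = \text{LHS}$ and $g = \text{RHS}$. Then
\[
\begin{aligned}
f(A_1 \oplus \cdots \oplus A_t) = 0 \ &\Rightarrow\ f(A_1) \oplus \cdots \oplus f(A_t) = 0 \oplus \cdots \oplus 0 \\
&\Rightarrow\ f(A_1) = 0, \ldots, f(A_t) = 0 \\
&\Rightarrow\ m_{A_1} \mid f, \ldots, m_{A_t} \mid f \\
&\Rightarrow\ g \mid f.
\end{aligned}
\]
Conversely,
\[
\begin{aligned}
m_{A_1} \mid g, \ldots, m_{A_t} \mid g \ &\Rightarrow\ g(A_1) = 0, \ldots, g(A_t) = 0 \\
&\Rightarrow\ g(A_1) \oplus \cdots \oplus g(A_t) = 0 \oplus \cdots \oplus 0 \\
&\Rightarrow\ g(A_1 \oplus \cdots \oplus A_t) = 0 \\
&\Rightarrow\ f = m_{A_1 \oplus \cdots \oplus A_t} \mid g.
\end{aligned}
\]
Thus $f = g$.

EXAMPLE 2.6
Let $A = C(f)$ and $B = C(g)$. Then $m_{A \oplus B} = \operatorname{lcm}(f, g)$.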
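The minimum polynomial algorithm of this section (find the least $r$ with $A^r$ a linear combination of $I_n, A, \ldots, A^{r-1}$) can be sketched over $\mathbb{Q}$ as follows. The helper names `solve` and `minimum_polynomial`, and the use of Gauss–Jordan elimination on flattened matrices to test for a linear combination, are choices made for this illustration rather than anything prescribed by the notes. The two test matrices are the matrix of Example 2.5 and the direct sum $C(x - 1) \oplus C(x^2 - 1)$, an instance of Example 2.6 picked here for convenience.

```python
from fractions import Fraction


def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def solve(columns, target):
    """Solve sum_i c_i * columns[i] = target over Q; return None if impossible."""
    rows, cols = len(target), len(columns)
    M = [[Fraction(columns[j][i]) for j in range(cols)] + [Fraction(target[i])]
         for i in range(rows)]
    pivots, row = [], 0
    for col in range(cols):
        pivot = next((r for r in range(row, rows) if M[r][col] != 0), None)
        if pivot is None:
            continue
        M[row], M[pivot] = M[pivot], M[row]
        M[row] = [v / M[row][col] for v in M[row]]
        for r in range(rows):
            if r != row and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [a - factor * b for a, b in zip(M[r], M[row])]
        pivots.append(col)
        row += 1
    if any(M[r][cols] != 0 for r in range(row, rows)):
        return None  # target is not in the span of the columns
    c = [Fraction(0)] * cols
    for r, col in enumerate(pivots):
        c[col] = M[r][cols]
    return c


def minimum_polynomial(A):
    """Coefficients [m_0, ..., m_{r-1}, 1] of the monic minimum polynomial of A."""
    n = len(A)
    flat = lambda M: [v for row in M for v in row]
    powers = [[[Fraction(int(i == j)) for j in range(n)] for i in range(n)]]
    while True:
        nxt = mat_mul(powers[-1], A)                      # A^r
        c = solve([flat(P) for P in powers], flat(nxt))   # A^r = sum c_i A^i ?
        if c is not None:
            return [-ci for ci in c] + [Fraction(1)]      # x^r - c_{r-1}x^{r-1} - ... - c_0
        powers.append(nxt)


A = [[5, -6, -6], [-1, 4, 2], [3, -6, -4]]   # matrix of Example 2.5
m_A = minimum_polynomial(A)

S = [[1, 0, 0], [0, 0, 1], [0, 1, 0]]        # C(x - 1) direct sum C(x^2 - 1)
m_S = minimum_polynomial(S)
```

The first call returns the coefficients of $x^2 - 3x + 2$; the second returns those of $x^2 - 1 = \operatorname{lcm}(x - 1, x^2 - 1)$, consistent with Theorem 2.7.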
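Note: If
\[ f = c\, p_1^{a_1} \cdots p_t^{a_t}, \qquad g = d\, p_1^{b_1} \cdots p_t^{b_t}, \]
where $c, d \neq 0$ are in $F$ and $p_1, \ldots, p_t$ are distinct monic irreducibles, then
\[ \gcd(f, g) = p_1^{\min(a_1, b_1)} \cdots p_t^{\min(a_t, b_t)}, \qquad \operatorname{lcm}(f, g) = p_1^{\max(a_1, b_1)} \cdots p_t^{\max(a_t, b_t)}. \]
Note
\[ \min(a_i, b_i) + \max(a_i, b_i) = a_i + b_i, \]
so
\[ \gcd(f, g) \operatorname{lcm}(f, g) = fg \]
up to the constant factor $cd$.

EXAMPLE 2.7
If $A = \operatorname{diag}(\lambda_1, \ldots, \lambda_n)$, then $m_A = (x - c_1) \cdots (x - c_t)$, where $c_1, \ldots, c_t$ are the distinct members of the sequence $\lambda_1, \ldots, \lambda_n$.

PROOF For $A$ is the direct sum of the $1 \times 1$ matrices $\lambda_1, \ldots, \lambda_n$ having minimum polynomials $x - \lambda_1, \ldots, x - \lambda_n$. Hence
\[ m_A = \operatorname{lcm}(x - \lambda_1, \ldots, x - \lambda_n) = (x - c_1) \cdots (x - c_t). \]

We know that $m_A \mid \operatorname{ch}_A$. Hence if
\[ \operatorname{ch}_A = p_1^{a_1} \cdots p_t^{a_t}, \]
where $a_1 > 0, \ldots, a_t > 0$, and $p_1, \ldots, p_t$ are distinct monic irreducibles, then
\[ m_A = p_1^{b_1} \cdots p_t^{b_t}, \]
where $0 \le b_i \le a_i$, $i = 1, \ldots, t$.

We soon show that each $b_i > 0$, i.e. if $p \mid \operatorname{ch}_A$ and $p$ is irreducible then $p \mid m_A$.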
2.5 Construction of a field of $p^n$ elements
(where $p$ is prime and $n \in \mathbb{N}$)

Let $f$ be a monic irreducible polynomial of degree $n$ in $\mathbb{Z}_p[x]$; that is, $F_q = \mathbb{Z}_p$ here. For instance,
\[
\begin{aligned}
n = 2,\ p = 2 \ &\Rightarrow\ x^2 + x + 1 = f \\
n = 3,\ p = 2 \ &\Rightarrow\ x^3 + x + 1 = f \ \text{ or } \ x^3 + x^2 + 1 = f.
\end{aligned}
\]
Let $A = C(f)$, the companion matrix of $f$. Then we know $f(A) = 0$.

We assert that the set of all matrices of the form $g(A)$, where $g \in \mathbb{Z}_p[x]$, forms a field consisting of precisely $p^n$ elements. The typical element is
\[ b_0 I_n + b_1 A + \cdots + b_t A^t, \]
where $b_0, \ldots, b_t \in \mathbb{Z}_p$.

We need only show existence of a multiplicative inverse for each element except 0 (the additive identity), as the remaining axioms clearly hold.

So let $g \in \mathbb{Z}_p[x]$ be such that $g(A) \neq 0$. We have to find $h \in \mathbb{Z}_p[x]$ satisfying
\[ g(A) h(A) = I_n. \]
Note that $g(A) \neq 0 \Rightarrow f \nmid g$, since
\[ f \mid g \ \Rightarrow\ g = f f_1 \]
and hence
\[ g(A) = f(A) f_1(A) = 0 f_1(A) = 0. \]
Then since $f$ is irreducible and $f \nmid g$, there exist $u, v \in \mathbb{Z}_p[x]$ such that
\[ uf + vg = 1. \]
Hence $u(A) f(A) + v(A) g(A) = I_n$ and $v(A) g(A) = I_n$, as required.

We now show that our new field is a $\mathbb{Z}_p$-vector space with basis consisting of the matrices
\[ I_n, A, \ldots, A^{n-1}. \]
Firstly the spanning property: By Euclid's division theorem,
\[ g = fq + r \]
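where $q, r \in \mathbb{Z}_p[x]$ and $r = 0$ or $\deg r < \deg f = n$. So let
\[ r = r_0 + r_1 x + \cdots + r_{n-1} x^{n-1}, \]
where $r_0, \ldots, r_{n-1} \in \mathbb{Z}_p$. Then
\[
\begin{aligned}
g(A) &= f(A) q(A) + r(A) \\
&= 0 q(A) + r(A) = r(A) \\
&= r_0 I_n + r_1 A + \cdots + r_{n-1} A^{n-1}.
\end{aligned}
\]
Secondly, linear independence over $\mathbb{Z}_p$: Suppose that
\[ r_0 I_n + r_1 A + \cdots + r_{n-1} A^{n-1} = 0, \]
where $r_0, r_1, \ldots, r_{n-1} \in \mathbb{Z}_p$. Then $r(A) = 0$, where
\[ r = r_0 + r_1 x + \cdots + r_{n-1} x^{n-1}. \]
Hence $m_A = f$ divides $r$. Consequently $r = 0$, as $\deg f = n$ whereas $\deg r < n$ if $r \neq 0$.

Consequently, there are $p^n$ such matrices $g(A)$ in the field we have constructed.

Numerical Examples

EXAMPLE 2.8
Let $p = 2$, $n = 2$, $f = x^2 + x + 1 \in \mathbb{Z}_2[x]$, and $A = C(f)$. Then
\[ A = \begin{bmatrix} 0 & -1 \\ 1 & -1 \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 1 & 1 \end{bmatrix}, \]
and
\[ F_4 = \{ a_0 I_2 + a_1 A \mid a_0, a_1 \in \mathbb{Z}_2 \} = \{ 0, I_2, A, I_2 + A \}. \]
We construct addition and multiplication tables for this field, with $B = I_2 + A$ (as an exercise, check these):
\[
\begin{array}{c|cccc}
\oplus & 0 & I_2 & A & B \\ \hline
0 & 0 & I_2 & A & B \\
I_2 & I_2 & 0 & B & A \\
A & A & B & 0 & I_2 \\
B & B & A & I_2 & 0
\end{array}
\qquad\qquad
\begin{array}{c|cccc}
\otimes & 0 & I_2 & A & B \\ \hline
0 & 0 & 0 & 0 & 0 \\
I_2 & 0 & I_2 & A & B \\
A & 0 & A & B & I_2 \\
B & 0 & B & I_2 & A
\end{array}
\]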
EXAMPLE 2.9
Let $p = 2$, $n = 3$, $f = x^3 + x + 1 \in \mathbb{Z}_2[x]$. Then
\[ A = C(f) = \begin{bmatrix} 0 & 0 & -1 \\ 1 & 0 & -1 \\ 0 & 1 & 0 \end{bmatrix} = \begin{bmatrix} 0 & 0 & 1 \\ 1 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}, \]
and our eight-member field $F_8$ (usually denoted by $GF(8)$; "GF" corresponds to "Galois Field", in honour of Galois) is
\[
\begin{aligned}
F_8 &= \{ a_0 I_3 + a_1 A + a_2 A^2 \mid a_0, a_1, a_2 \in \mathbb{Z}_2 \} \\
&= \{ 0,\ I_3,\ A,\ A^2,\ I_3 + A,\ I_3 + A^2,\ A + A^2,\ I_3 + A + A^2 \}.
\end{aligned}
\]
Now find $(A^2 + A)^{-1}$.

Solution: use Euclid's algorithm.
\[ x^3 + x + 1 = (x + 1)(x^2 + x) + 1. \]
Hence
\[
\begin{aligned}
x^3 + x + 1 + (x + 1)(x^2 + x) &= 1 \\
A^3 + A + I_3 + (A + I_3)(A^2 + A) &= I_3 \\
(A + I_3)(A^2 + A) &= I_3.
\end{aligned}
\]
Hence $(A^2 + A)^{-1} = A + I_3$.

THEOREM 2.8
Every finite field has precisely $p^n$ elements for some prime $p$, the least positive integer with the property that
\[ \underbrace{1 + 1 + 1 + \cdots + 1}_{p} = 0. \]
$p$ is then called the characteristic of the field.

Also, if $x \in F$, a field of $q$ elements, then it can be shown that if $x \neq 0$, then
\[ x^{q-1} = 1. \]
In the special case $F = \mathbb{Z}_p$, this reduces to Fermat's Little Theorem:
\[ x^{p-1} \equiv 1 \pmod{p}, \]
if $p$ is prime not dividing $x$.
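The computations of Example 2.9 are easy to confirm numerically. The sketch below does $3 \times 3$ matrix arithmetic modulo 2 with plain Python lists; the helper names (`mat_mul`, `mat_add`, `mat_pow`) are assumptions of this illustration. It checks that $f(A) = 0$, that $(A + I_3)(A^2 + A) = I_3$, and that $A^{q-1} = A^7 = I_3$ as the remark after Theorem 2.8 predicts for a non-zero element of a field with $q = 8$ elements.

```python
def mat_mul(A, B, p):
    """Matrix product with entries reduced modulo p."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) % p for j in range(n)]
            for i in range(n)]


def mat_add(A, B, p):
    """Matrix sum with entries reduced modulo p."""
    return [[(a + b) % p for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def mat_pow(A, e, p):
    """A^e modulo p by repeated multiplication (e >= 0)."""
    n = len(A)
    R = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(e):
        R = mat_mul(R, A, p)
    return R


p = 2
I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
ZERO = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
A = [[0, 0, 1], [1, 0, 1], [0, 1, 0]]      # C(x^3 + x + 1) over Z_2

f_of_A = mat_add(mat_add(mat_pow(A, 3, p), A, p), I3, p)   # A^3 + A + I_3
g_of_A = mat_add(mat_mul(A, A, p), A, p)                   # A^2 + A
h_of_A = mat_add(A, I3, p)                                 # A + I_3
product = mat_mul(h_of_A, g_of_A, p)                       # should be I_3
```

Running it gives `f_of_A` equal to the zero matrix and `product` equal to the identity, matching the solution above.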
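2.6 Characteristic and Minimum Polynomial of a Transformation

DEFINITION 2.8 (Characteristic polynomial of $T : V \to V$)
Let $\beta$ be a basis for $V$ and $A = [T]_\beta^\beta$. Then we define $\operatorname{ch}_T = \operatorname{ch}_A$. This polynomial is independent of the basis $\beta$:

PROOF ($\operatorname{ch}_T$ is independent of the basis.)
If $\gamma$ is another basis for $V$ and $B = [T]_\gamma^\gamma$, then we know $A = P^{-1} B P$, where $P$ is the change of basis matrix $[I_V]_\beta^\gamma$. Then
\[
\begin{aligned}
\operatorname{ch}_A &= \operatorname{ch}_{P^{-1} B P} \\
&= \det(x I_n - P^{-1} B P) \qquad \text{where } n = \dim V \\
&= \det(P^{-1} (x I_n) P - P^{-1} B P) \\
&= \det(P^{-1} (x I_n - B) P) \\
&= \det P^{-1} \operatorname{ch}_B \det P \\
&= \operatorname{ch}_B.
\end{aligned}
\]

DEFINITION 2.9
If $f = a_0 + \cdots + a_t x^t$, where $a_0, \ldots, a_t \in F$, we define
\[ f(T) = a_0 I_V + \cdots + a_t T^t. \]
Then the usual properties hold:
\[ f, g \in F[x] \ \Rightarrow\ (f + g)(T) = f(T) + g(T) \ \text{ and } \ (fg)(T) = f(T) g(T) = g(T) f(T). \]

LEMMA 2.1
\[ f \in F[x] \ \Rightarrow\ [f(T)]_\beta^\beta = f\big([T]_\beta^\beta\big). \]

Note: The Cayley–Hamilton theorem for matrices says that $\operatorname{ch}_A(A) = 0$. Then if $A = [T]_\beta^\beta$, we have by the lemma
\[ [\operatorname{ch}_T(T)]_\beta^\beta = \operatorname{ch}_T(A) = \operatorname{ch}_A(A) = 0, \]
so $\operatorname{ch}_T(T) = 0_V$.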
DEFINITION 2.10
Let $T : V \to V$ be a linear transformation over $F$. Then any polynomial $f$ of least positive degree such that
\[ f(T) = 0_V \]
is called a minimum polynomial of $T$.

We have corresponding results for polynomials in a transformation $T$ to those for polynomials in a square matrix $A$:
\[ g = qf + r \ \Rightarrow\ g(T) = q(T) f(T) + r(T). \]
Again, there is a unique monic minimum polynomial of $T$; it is denoted by $m_T$ and called "the" minimum polynomial of $T$.

Also note that because of the lemma,
\[ m_T = m_{[T]_\beta^\beta}. \]
For (with $A = [T]_\beta^\beta$):

(a) $m_A(A) = 0$, so $m_A(T) = 0_V$. Hence $m_T \mid m_A$.
(b) $m_T(T) = 0_V$, so $[m_T(T)]_\beta^\beta = 0$. Hence $m_T(A) = 0$ and so $m_A \mid m_T$.

EXAMPLES 2.2
$T = 0_V \Leftrightarrow m_T = x$.
$T = I_V \Leftrightarrow m_T = x - 1$.
$T = c I_V \Leftrightarrow m_T = x - c$.
$T^2 = T$ and $T \neq 0_V$ and $T \neq I_V \Leftrightarrow m_T = x^2 - x$.

2.6.1 $M_{n \times n}(F[x])$: Ring of Polynomial Matrices

Example:
\[
\begin{bmatrix} x^2 + 2 & x^5 + 5x + 1 \\ x + 3 & 1 \end{bmatrix} \in M_{2 \times 2}(\mathbb{Q}[x])
\]
\[
= x^5 \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} + x^2 \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} + x \begin{bmatrix} 0 & 5 \\ 1 & 0 \end{bmatrix} + \begin{bmatrix} 2 & 1 \\ 3 & 1 \end{bmatrix};
\]
we see that any element of $M_{n \times n}(F[x])$ is expressible as
\[ x^m A_m + x^{m-1} A_{m-1} + \cdots + A_0, \]
where $A_i \in M_{n \times n}(F)$. We write the coefficient of $x^i$ after $x^i$, to distinguish these entities from corresponding objects of the following ring.
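2.6.2 $M_{n \times n}(F)[y]$: Ring of Matrix Polynomials

This consists of all polynomials in $y$ with coefficients in $M_{n \times n}(F)$.

Example:
\[
\begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} y^5 + \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} y^2 + \begin{bmatrix} 0 & 5 \\ 1 & 0 \end{bmatrix} y + \begin{bmatrix} 2 & 1 \\ 3 & 1 \end{bmatrix} \in M_{2 \times 2}(F)[y].
\]

THEOREM 2.9
The mapping
\[ \Phi : M_{n \times n}(F)[y] \to M_{n \times n}(F[x]) \]
given by
\[ \Phi(A_0 + A_1 y + \cdots + A_m y^m) = A_0 + x A_1 + \cdots + x^m A_m, \]
where $A_i \in M_{n \times n}(F)$, is a 1–1 correspondence and has the following properties:
\[
\begin{aligned}
\Phi(X + Y) &= \Phi(X) + \Phi(Y) \\
\Phi(XY) &= \Phi(X) \Phi(Y) \\
\Phi(tX) &= t \Phi(X) \quad \forall t \in F.
\end{aligned}
\]
Also
\[ \Phi(I_n y - A) = x I_n - A \quad \forall A \in M_{n \times n}(F). \]

THEOREM 2.10 ((Left) Remainder theorem for matrix polynomials)
Let $B_m y^m + \cdots + B_0 \in M_{n \times n}(F)[y]$ and $A \in M_{n \times n}(F)$. Then
\[ B_m y^m + \cdots + B_0 = (I_n y - A) Q + R, \]
where
\[ R = A^m B_m + \cdots + A B_1 + B_0 \]
and $Q = C_{m-1} y^{m-1} + \cdots + C_0$, where $C_{m-1}, \ldots, C_0$ are computed recursively:
\[
\begin{aligned}
B_m &= C_{m-1} \\
B_{m-1} &= -A C_{m-1} + C_{m-2} \\
&\ \ \vdots \\
B_1 &= -A C_1 + C_0.
\end{aligned}
\]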
PROOF First we verify that $B_0 = -A C_0 + R$:
\[
\begin{aligned}
R &= A^m B_m + A^{m-1} B_{m-1} + \cdots + A B_1 + B_0 \\
&= A^m C_{m-1} + \big( {-A^m C_{m-1}} + A^{m-1} C_{m-2} \big) + \cdots + \big( {-A^2 C_1} + A C_0 \big) + B_0 \\
&= B_0 + A C_0.
\end{aligned}
\]
Then
\[
\begin{aligned}
(I_n y - A) Q + R &= I_n y (C_{m-1} y^{m-1} + \cdots + C_0) - A (C_{m-1} y^{m-1} + \cdots + C_0) + A^m B_m + \cdots + B_0 \\
&= C_{m-1} y^m + (C_{m-2} - A C_{m-1}) y^{m-1} + \cdots + (C_0 - A C_1) y - A C_0 + R \\
&= B_m y^m + B_{m-1} y^{m-1} + \cdots + B_1 y + B_0.
\end{aligned}
\]

Remark. There is a similar "right" remainder theorem.

THEOREM 2.11
If $p$ is an irreducible polynomial dividing $\operatorname{ch}_A$, then $p \mid m_A$.

PROOF (From Burton Jones, "Linear Algebra".)
Let $m_A = x^t + a_{t-1} x^{t-1} + \cdots + a_0$ and consider the matrix polynomial in $y$
\[
\begin{aligned}
\Phi^{-1}(m_A I_n) &= I_n y^t + (a_{t-1} I_n) y^{t-1} + \cdots + (a_0 I_n) \\
&= (I_n y - A) Q + A^t I_n + A^{t-1} (a_{t-1} I_n) + \cdots + a_0 I_n \\
&= (I_n y - A) Q + m_A(A) \\
&= (I_n y - A) Q.
\end{aligned}
\]
Now take $\Phi$ of both sides to give
\[ m_A I_n = (x I_n - A) \Phi(Q), \]
and taking determinants of both sides yields
\[ \{m_A\}^n = \operatorname{ch}_A \times \det \Phi(Q). \]
So letting $p$ be an irreducible polynomial dividing $\operatorname{ch}_A$, we have $p \mid \{m_A\}^n$ and hence $p \mid m_A$.

Alternative simpler proof (MacDuffee):
\[ m_A(x) - m_A(y) = (x - y) k(x, y), \]
where $k(x, y) \in F[x, y]$. Hence
\[ m_A(x) I_n = m_A(x I_n) - m_A(A) = (x I_n - A) k(x I_n, A). \]
Now take determinants to get
\[ m_A(x)^n = \operatorname{ch}_A(x) \det k(x I_n, A). \]

Exercise: If $\Delta(x)$ is the gcd of the elements of $\operatorname{adj}(x I_n - A)$, use the equation $(x I_n - A) \operatorname{adj}(x I_n - A) = \operatorname{ch}_A(x) I_n$ and an above equation to deduce that $m_A(x) = \operatorname{ch}_A(x) / \Delta(x)$.

EXAMPLES 2.3
With $A = 0 \in M_{n \times n}(F)$, we have $\operatorname{ch}_A = x^n$ and $m_A = x$.
$A = \operatorname{diag}(1, 1, 2, 2, 2) \in M_{5 \times 5}(\mathbb{Q})$. Here
\[ \operatorname{ch}_A = (x - 1)^2 (x - 2)^3 \quad \text{and} \quad m_A = (x - 1)(x - 2). \]

DEFINITION 2.11
A matrix $A \in M_{n \times n}(F)$ is called diagonable over $F$ if there exists a non-singular matrix $P \in M_{n \times n}(F)$ such that
\[ P^{-1} A P = \operatorname{diag}(\lambda_1, \ldots, \lambda_n), \]
where $\lambda_1, \ldots, \lambda_n$ belong to $F$.

THEOREM 2.12
If $A$ is diagonable, then $m_A$ is a product of distinct linear factors.

PROOF If $P^{-1} A P = \operatorname{diag}(\lambda_1, \ldots, \lambda_n)$ (with $\lambda_1, \ldots, \lambda_n \in F$) then
\[ m_A = m_{P^{-1} A P} = m_{\operatorname{diag}(\lambda_1, \ldots, \lambda_n)} = (x - c_1)(x - c_2) \cdots (x - c_t), \]
where $c_1, \ldots, c_t$ are the distinct members of the sequence $\lambda_1, \ldots, \lambda_n$.

The converse is also true, and will (fairly) soon be proved.
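EXAMPLE 2.10
$A = J_n(a)$. We saw earlier that $m_A = (x - a)^n$, so if $n \ge 2$ we see that $A$ is not diagonable.

DEFINITION 2.12 (Diagonable LTs)
$T : V \to V$ is called diagonable over $F$ if there exists a basis $\beta$ for $V$ such that $[T]_\beta^\beta$ is diagonal.

THEOREM 2.13
$A$ is diagonable $\Leftrightarrow$ $T_A$ is diagonable.

PROOF (Sketch)
$\Rightarrow$ Suppose $P^{-1} A P = \operatorname{diag}(\lambda_1, \ldots, \lambda_n)$. Now pre-multiplying by $P$ and letting $P = [P_1 | \cdots | P_n]$, we see that
\[
\begin{aligned}
T_A(P_1) = A P_1 &= \lambda_1 P_1 \\
&\ \ \vdots \\
T_A(P_n) = A P_n &= \lambda_n P_n,
\end{aligned}
\]
and we let $\beta$ be the basis $P_1, \ldots, P_n$ of $V_n(F)$. Then
\[ [T_A]_\beta^\beta = \begin{bmatrix} \lambda_1 & & & \\ & \lambda_2 & & \\ & & \ddots & \\ & & & \lambda_n \end{bmatrix}. \]
$\Leftarrow$ Reverse the argument and use Theorem 1.17.

THEOREM 2.14
Let $A \in M_{n \times n}(F)$. Then if $\lambda$ is an eigenvalue of $A$ with multiplicity $m$ (that is, $(x - \lambda)^m$ is the exact power of $x - \lambda$ which divides $\operatorname{ch}_A$), we have
\[ \operatorname{nullity}(A - \lambda I_n) \le m. \]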
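REMARKS
(1) If $m = 1$, we deduce that $\operatorname{nullity}(A - \lambda I_n) = 1$. For the inequality
\[ 1 \le \operatorname{nullity}(A - \lambda I_n) \]
always holds.
(2) The integer $\operatorname{nullity}(A - \lambda I_n)$ is called the geometric multiplicity of the eigenvalue $\lambda$, while $m$ is referred to as the algebraic multiplicity of $\lambda$.

PROOF Let $v_1, \ldots, v_r$ be a basis for $N(A - \lambda I_n)$, where $\lambda$ is an eigenvalue of $A$ having multiplicity $m$. Extend this linearly independent family to a basis $v_1, \ldots, v_r, v_{r+1}, \ldots, v_n$ of $V_n(F)$. Then the following equations hold:
\[
\begin{aligned}
A v_1 &= \lambda v_1 \\
&\ \ \vdots \\
A v_r &= \lambda v_r \\
A v_{r+1} &= b_{11} v_1 + \cdots + b_{n1} v_n \\
&\ \ \vdots \\
A v_n &= b_{1\,n-r} v_1 + \cdots + b_{n\,n-r} v_n.
\end{aligned}
\]
These equations can be combined into a single matrix equation:
\[
\begin{aligned}
A [v_1 | \cdots | v_r | v_{r+1} | \cdots | v_n] &= [A v_1 | \cdots | A v_r | A v_{r+1} | \cdots | A v_n] \\
&= [\lambda v_1 | \cdots | \lambda v_r | b_{11} v_1 + \cdots + b_{n1} v_n | \cdots | b_{1\,n-r} v_1 + \cdots + b_{n\,n-r} v_n] \\
&= [v_1 | \cdots | v_n] \begin{bmatrix} \lambda I_r & B_1 \\ 0 & B_2 \end{bmatrix}.
\end{aligned}
\]
Hence if $P = [v_1 | \cdots | v_n]$, we have
\[ P^{-1} A P = \begin{bmatrix} \lambda I_r & B_1 \\ 0 & B_2 \end{bmatrix}. \]
Then
\[ \operatorname{ch}_A = \operatorname{ch}_{P^{-1} A P} = \operatorname{ch}_{\lambda I_r} \cdot \operatorname{ch}_{B_2} = (x - \lambda)^r \operatorname{ch}_{B_2}, \]
and because $(x - \lambda)^m$ is the exact power of $x - \lambda$ dividing $\operatorname{ch}_A$, it follows that
\[ \operatorname{nullity}(A - \lambda I_n) = r \le m. \]

THEOREM 2.15
Suppose that $\operatorname{ch}_T = (x - c_1)^{a_1} \cdots (x - c_t)^{a_t}$. Then $T$ is diagonable if
\[ \operatorname{nullity}(T - c_i I_V) = a_i \quad \text{for } 1 \le i \le t. \]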
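PROOF We first prove that the subspaces $\operatorname{Ker}(T - c_i I_V)$ are independent. (Subspaces $V_1, \ldots, V_t$ are called independent if
\[ v_1 + \cdots + v_t = 0, \ v_i \in V_i, \ i = 1, \ldots, t \ \Rightarrow\ v_1 = 0, \ldots, v_t = 0. \]
Then
\[ \dim(V_1 + \cdots + V_t) = \dim(V_1) + \cdots + \dim(V_t).) \]
Assume that
\[ v_1 + \cdots + v_t = 0, \]
where $v_i \in \operatorname{Ker}(T - c_i I_V)$ for $1 \le i \le t$. Then
\[ T(v_1 + \cdots + v_t) = T(0) \ \Rightarrow\ c_1 v_1 + \cdots + c_t v_t = 0. \]
Similarly we deduce that
\[
\begin{aligned}
c_1^2 v_1 + \cdots + c_t^2 v_t &= 0 \\
&\ \ \vdots \\
c_1^{t-1} v_1 + \cdots + c_t^{t-1} v_t &= 0.
\end{aligned}
\]
We can combine these $t$ equations into a single matrix equation
\[
\begin{bmatrix} 1 & \cdots & 1 \\ c_1 & \cdots & c_t \\ \vdots & & \vdots \\ c_1^{t-1} & \cdots & c_t^{t-1} \end{bmatrix}
\begin{bmatrix} v_1 \\ \vdots \\ v_t \end{bmatrix}
=
\begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix}.
\]
However the coefficient matrix is the Vandermonde matrix, which is non-singular as $c_i \neq c_j$ if $i \neq j$, so we deduce that $v_1 = 0, \ldots, v_t = 0$. Hence with $V_i = \operatorname{Ker}(T - c_i I_V)$, we have
\[ \dim(V_1 + \cdots + V_t) = \sum_{i=1}^{t} \dim V_i = \sum_{i=1}^{t} a_i = \dim V. \]
Hence
\[ V = V_1 + \cdots + V_t. \]
Then if $\beta_i$ is a basis for $V_i$ for $1 \le i \le t$ and $\beta = \beta_1 \cup \cdots \cup \beta_t$, it follows that $\beta$ is a basis for $V$. Moreover
\[ [T]_\beta^\beta = \bigoplus_{i=1}^{t} (c_i I_{a_i}) \]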
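and $T$ is diagonable.

EXAMPLE
Let
\[ A = \begin{bmatrix} 5 & 2 & -2 \\ 2 & 5 & -2 \\ -2 & -2 & 5 \end{bmatrix}. \]
(a) We find that $\operatorname{ch}_A = (x - 3)^2 (x - 9)$. Next we find bases for each of the eigenspaces $N(A - 9I_3)$ and $N(A - 3I_3)$.

First we solve $(A - 3I_3) X = 0$. We have
\[ A - 3I_3 = \begin{bmatrix} 2 & 2 & -2 \\ 2 & 2 & -2 \\ -2 & -2 & 2 \end{bmatrix} \to \begin{bmatrix} 1 & 1 & -1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}. \]
Hence the eigenspace consists of vectors $X = [x, y, z]^t$ satisfying $x = -y + z$, with $y$ and $z$ arbitrary. Hence
\[ X = \begin{bmatrix} -y + z \\ y \\ z \end{bmatrix} = y \begin{bmatrix} -1 \\ 1 \\ 0 \end{bmatrix} + z \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}, \]
so $X_{11} = [-1, 1, 0]^t$ and $X_{12} = [1, 0, 1]^t$ form a basis for the eigenspace corresponding to the eigenvalue 3.

Next we solve $(A - 9I_3) X = 0$. We have
\[ A - 9I_3 = \begin{bmatrix} -4 & 2 & -2 \\ 2 & -4 & -2 \\ -2 & -2 & -4 \end{bmatrix} \to \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \end{bmatrix}. \]
Hence the eigenspace consists of vectors $X = [x, y, z]^t$ satisfying $x = -z$ and $y = -z$, with $z$ arbitrary. Hence
\[ X = \begin{bmatrix} -z \\ -z \\ z \end{bmatrix} = z \begin{bmatrix} -1 \\ -1 \\ 1 \end{bmatrix}, \]
and we can take $X_{21} = [-1, -1, 1]^t$ as a basis for the eigenspace corresponding to the eigenvalue 9.

Then $P = [X_{11} | X_{12} | X_{21}]$ is non-singular and
\[ P^{-1} A P = \begin{bmatrix} 3 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 9 \end{bmatrix}. \]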
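The conclusion of this example can be checked without inverting $P$: it is enough that $AP = PD$ with $D = \operatorname{diag}(3, 3, 9)$ and that $\det P \neq 0$. A minimal sketch with integer arithmetic follows; the helper names `mat_mul` and `det3` are assumptions of this illustration, and the minus signs in $A$ and in the eigenvectors are the ones consistent with the eigenvector equations of the example.

```python
def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def det3(M):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


A = [[5, 2, -2],
     [2, 5, -2],
     [-2, -2, 5]]

# Columns of P are X11 = [-1, 1, 0]^t, X12 = [1, 0, 1]^t, X21 = [-1, -1, 1]^t.
P = [[-1, 1, -1],
     [1, 0, -1],
     [0, 1, 1]]

D = [[3, 0, 0],
     [0, 3, 0],
     [0, 0, 9]]

AP = mat_mul(A, P)
PD = mat_mul(P, D)
det_P = det3(P)
```

Since `AP == PD` and `det_P` is non-zero, $P^{-1} A P = D$ follows.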
THEOREM 2.16
If $m_T = (x - c_1) \cdots (x - c_t)$ for $c_1, \ldots, c_t$ distinct in $F$, then $T$ is diagonable, and conversely. Moreover there exist unique linear transformations $T_1, \ldots, T_t$ satisfying
\[
\begin{aligned}
I_V &= T_1 + \cdots + T_t, \\
T &= c_1 T_1 + \cdots + c_t T_t, \\
T_i T_j &= 0_V \quad \text{if } i \neq j, \\
T_i^2 &= T_i, \quad 1 \le i \le t.
\end{aligned}
\]
Also $\operatorname{rank} T_i = a_i$, where $\operatorname{ch}_T = (x - c_1)^{a_1} \cdots (x - c_t)^{a_t}$.

Remarks.
1. $T_1, \ldots, T_t$ are called the principal idempotents of $T$.
2. If $g \in F[x]$, then $g(T) = g(c_1) T_1 + \cdots + g(c_t) T_t$. For example
\[ T^m = c_1^m T_1 + \cdots + c_t^m T_t. \]
3. If $c_1, \ldots, c_t$ are non-zero (that is, the eigenvalues of $T$ are non-zero), then $T^{-1}$ is given by
\[ T^{-1} = c_1^{-1} T_1 + \cdots + c_t^{-1} T_t. \]
Formulae 2 and 3 are useful in the corresponding matrix formulation.

PROOF Suppose $m_T = (x - c_1) \cdots (x - c_t)$, where $c_1, \ldots, c_t$ are distinct. Then $\operatorname{ch}_T = (x - c_1)^{a_1} \cdots (x - c_t)^{a_t}$. To prove $T$ is diagonable, we have to prove that $\operatorname{nullity}(T - c_i I_V) = a_i$, $1 \le i \le t$.

Let $p_1, \ldots, p_t$ be the Lagrange interpolation polynomials based on $c_1, \ldots, c_t$, i.e.
\[ p_i = \prod_{\substack{j=1 \\ j \neq i}}^{t} \left( \frac{x - c_j}{c_i - c_j} \right), \qquad 1 \le i \le t. \]
Then, for $g \in F[x]$ of degree at most $t - 1$,
\[ g = g(c_1) p_1 + \cdots + g(c_t) p_t. \]
In particular,
\[ g = 1 \ \Rightarrow\ 1 = p_1 + \cdots + p_t \]
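and
\[ g = x \ \Rightarrow\ x = c_1 p_1 + \cdots + c_t p_t. \]
Hence with $T_i = p_i(T)$,
\[
\begin{aligned}
I_V &= T_1 + \cdots + T_t \\
T &= c_1 T_1 + \cdots + c_t T_t.
\end{aligned}
\]
Next
\[
\begin{aligned}
m_T = (x - c_1) \cdots (x - c_t) \mid p_i p_j \ \text{ if } i \neq j \ &\Rightarrow\ (p_i p_j)(T) = 0_V \ \text{ if } i \neq j \\
&\Rightarrow\ p_i(T) p_j(T) = 0_V \ \text{ or } \ T_i T_j = 0_V \ \text{ if } i \neq j.
\end{aligned}
\]
Then $T_i^2 = T_i (T_1 + \cdots + T_t) = T_i I_V = T_i$.

Next
\[ 0_V = m_T(T) = (T - c_1 I_V) \cdots (T - c_t I_V). \]
Hence
\[ \dim V = \operatorname{nullity} 0_V \le \sum_{i=1}^{t} \operatorname{nullity}(T - c_i I_V) \le \sum_{i=1}^{t} a_i = \dim V. \]
Consequently $\operatorname{nullity}(T - c_i I_V) = a_i$, $1 \le i \le t$, and $T$ is therefore diagonable.

Next we prove that $\operatorname{rank} T_i = a_i$. From the definition of $p_i$, we have
\[ \operatorname{nullity} p_i(T) \le \sum_{\substack{j=1 \\ j \neq i}}^{t} \operatorname{nullity}(T - c_j I_V) = \sum_{\substack{j=1 \\ j \neq i}}^{t} a_j = \dim V - a_i. \]
Also $p_i(T)(T - c_i I_V) = 0$, so $\operatorname{Im}(T - c_i I_V) \subseteq \operatorname{Ker} p_i(T)$. Hence
\[ \dim V - a_i \le \operatorname{nullity} p_i(T) \]
and consequently $\operatorname{nullity} p_i(T) = \dim(V) - a_i$, so $\operatorname{rank} p_i(T) = a_i$.

We next prove the uniqueness of $T_1, \ldots, T_t$. Suppose that $S_1, \ldots, S_t$ also satisfy the same conditions as $T_1, \ldots, T_t$. Then
\[
\begin{aligned}
T_i T = T T_i &= c_i T_i \\
S_j T = T S_j &= c_j S_j \\
T_i (T S_j) = T_i (c_j S_j) = c_j T_i S_j &= (T_i T) S_j = c_i T_i S_j
\end{aligned}
\]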
so $(c_j - c_i) T_i S_j = 0_V$ and $T_i S_j = 0_V$ if $i \neq j$. Hence
\[
\begin{aligned}
T_i &= T_i I_V = T_i \Big( \sum_{j=1}^{t} S_j \Big) = T_i S_i \\
S_i &= I_V S_i = \Big( \sum_{j=1}^{t} T_j \Big) S_i = T_i S_i.
\end{aligned}
\]
Hence $T_i = S_i$.

Conversely, suppose that $T$ is diagonable and let $\beta$ be a basis of $V$ such that
\[ A = [T]_\beta^\beta = \operatorname{diag}(\lambda_1, \ldots, \lambda_n). \]
Then $m_T = m_A = (x - c_1) \cdots (x - c_t)$, where $c_1, \ldots, c_t$ are the distinct members of the sequence $\lambda_1, \ldots, \lambda_n$.

COROLLARY 2.5
If
\[ \operatorname{ch}_T = (x - c_1) \cdots (x - c_t) \]
with $c_i$ distinct members of $F$, then $T$ is diagonable.

Proof: Here $m_T = \operatorname{ch}_T$ and we use Theorem 2.16.

EXAMPLE 2.11
Let
\[ A = \begin{bmatrix} 0 & a \\ b & 0 \end{bmatrix}, \qquad a, b \in F, \quad ab \neq 0, \quad 1 + 1 \neq 0. \]
Then $A$ is diagonable if and only if $ab = y^2$ for some $y \in F$.

For $\operatorname{ch}_A = x^2 - ab$, so if $ab = y^2$,
\[ \operatorname{ch}_A = x^2 - y^2 = (x + y)(x - y), \]
which is a product of distinct linear factors, as $y \neq -y$ here.

Conversely suppose that $A$ is diagonable. Then as $A$ is not a scalar matrix, it follows that $m_A$ is not linear and hence
\[ m_A = (x - c_1)(x - c_2), \]
where $c_1 \neq c_2$. Also $\operatorname{ch}_A = m_A$, so $\operatorname{ch}_A(c_1) = 0$. Hence
\[ c_1^2 - ab = 0, \quad \text{or} \quad ab = c_1^2. \]
For example, take $F = \mathbb{Z}_7$ and let $a = 1$ and $b = 3$. Then $ab \neq y^2$ for every $y \in \mathbb{Z}_7$, and consequently $A$ is not diagonable.
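The closing claim of Example 2.11, that $ab = 3$ is not a square in $\mathbb{Z}_7$, can be confirmed by listing all squares modulo 7. The function name `is_square_mod` below is an assumption of this small sketch.

```python
def is_square_mod(n, p):
    """True if n is congruent to y^2 modulo p for some y in Z_p."""
    return any((y * y - n) % p == 0 for y in range(p))


p = 7
a, b = 1, 3
squares = sorted({(y * y) % p for y in range(p)})   # all squares in Z_7
ab_is_square = is_square_mod(a * b, p)
```

The squares in $\mathbb{Z}_7$ are $0, 1, 2, 4$, so `ab_is_square` is `False`; by the criterion of the example, $A$ is not diagonable over $\mathbb{Z}_7$. By contrast, $ab = 2 = 3^2$ in $\mathbb{Z}_7$ would give a diagonable matrix.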