Systems of Linear Equations in Fields

1. Fields

A field is a structure F = (F; +, ·, −, ι; 0, 1) such that
(1) F is a set with at least two members.
(2) +, ·, −, ι, 0, 1 are operations on F:
    (a) + (addition) and · (multiplication) are binary operations;
    (b) − (additive inversion) and ι (multiplicative inversion) are unary operations;
    (c) 0 (zero) and 1 (one) are nullary operations, sometimes called constants. By a harmless abuse of notation, they are two of the elements of F.

The operations satisfy the following properties:
(1) Addition is commutative: x + y = y + x.
(2) Addition is associative: (x + y) + z = x + (y + z).
(3) Zero is an additive left identity: 0 + x = x.
(4) Additive left inverses are selected via the additive inversion operation: (−x) + x = 0.
(5) Multiplication is commutative: x · y = y · x.
(6) Multiplication is associative: (x · y) · z = x · (y · z).
(7) One is a left multiplicative identity: 1 · x = x.
(8) Nonzero members of F have multiplicative left inverses that are selected via the multiplicative inversion operation: ι(x) · x = 1, for x ≠ 0.
(9) Multiplication is left distributive over addition: x · (y + z) = x · y + x · z.

Examples 1. Several familiar examples are immediately available:
(1) The integers, with the usual integer arithmetic, do not form a field.
(2) Arithmetic modulo p, where p is a prime number, makes the set {0, 1, ..., p − 1} into a field. An example is the two-element field, whose only elements are zero and one.
(3) Boolean arithmetic on the two-element set {0, 1} is a field arithmetic.
(4) The usual arithmetic of the rational numbers makes the set of rationals into a field.
(5) The real numbers form a field.
(6) The complex numbers form a field.
(7) Even though arithmetic modulo four does not make the set {0, 1, 2, 3} into a field, there is a field with exactly four elements.
(8) The set of algebraic numbers forms a field.
(9) The set of constructible numbers forms a field.
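The axioms above can be checked mechanically for a finite field. The following sketch (the names p, add, mul, neg, iota are ours, not from the notes) verifies them by brute force for arithmetic modulo 5; the multiplicative inversion uses Fermat's little theorem, ι(x) = x^(p−2) mod p.

```python
# Brute-force check of the field axioms for arithmetic modulo p = 5
# (an illustrative sketch; helper names are ours).
p = 5
F = range(p)

def add(x, y): return (x + y) % p
def mul(x, y): return (x * y) % p
def neg(x): return (-x) % p              # additive inversion
def iota(x): return pow(x, p - 2, p)     # multiplicative inversion (Fermat)

assert all(add(x, y) == add(y, x) for x in F for y in F)             # (1)
assert all(add(add(x, y), z) == add(x, add(y, z))
           for x in F for y in F for z in F)                         # (2)
assert all(add(0, x) == x for x in F)                                # (3)
assert all(add(neg(x), x) == 0 for x in F)                           # (4)
assert all(mul(x, y) == mul(y, x) for x in F for y in F)             # (5)
assert all(mul(1, x) == x for x in F)                                # (7)
assert all(mul(iota(x), x) == 1 for x in F if x != 0)                # (8)
assert all(mul(x, add(y, z)) == add(mul(x, y), mul(x, z))
           for x in F for y in F for z in F)                         # (9)
print("arithmetic modulo 5 satisfies the field axioms")
```

The same loop run with p = 4 fails at axiom (8), since 2 has no multiplicative inverse modulo 4, matching example (7) below about arithmetic modulo four.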
To each field F is associated a nonnegative integer, called its characteristic. Specifically, if there is a positive integer n such that nx = 0 for each element x of the field F, then F is said to have finite characteristic, and the characteristic of F is the least such positive integer. If there is no such positive integer, then F is said to have characteristic zero. The rational, real, and complex fields are characteristic-zero fields. For any prime p, the fields with p elements are characteristic-p fields. For example, boolean arithmetic is a characteristic-2 arithmetic, because boolean arithmetic satisfies the law x + x = 0. An important fact about the characteristic of fields is the following:

Theorem 1. Every field either has characteristic zero or has characteristic p, where p is a prime.

Fields of prime characteristic play a significant role in cryptography. Another important result along these lines is the following:

Theorem 2. Let F be a field. If F has characteristic zero, then F has a subfield that is isomorphic to the rational number field. If F has characteristic p, where p is a prime integer, then F has a subfield with p elements.

2. Systems of Equations

Example 2.1. Solve the following system of modulo-five equations:

    x1 + 3x2 = 0
    x1 +  x2 = 0

Solution: We row-reduce the corresponding augmented matrix, using arithmetic modulo five:

    [ 1 3 | 0 ]   R1 + 2R2   [ 3 0 | 0 ]   2R1, 2R2 + R1   [ 1 0 | 0 ]
    [ 1 1 | 0 ]   =======>   [ 1 1 | 0 ]   ===========>    [ 0 2 | 0 ]

It follows that the only solution of the system is (0, 0).

3. Elementary and Admissible Row Operations

Some row operations are clearly useful in solving systems of equations, and some are clearly useless (such as multiplying a row by zero). But in general, it is not always easy to distinguish useful row operations
from useless ones. The useful ones we shall call admissible, and we have a formal definition to help us determine which are which. But those that are obviously useful we call elementary.

3.1. Elementary Row Operations.
(1) A row replacement operation is a row operation that replaces a given row by the sum of itself and a multiple of another row.
(2) A row interchange operation is a row operation that interchanges two rows.
(3) A scaling operation replaces a row by a nonzero multiple of itself.

An elementary row operation is a row operation that is either a row replacement, a row interchange, or a scaling. These are the basic row operations that can be used to solve systems of linear equations, but there are other row operations that will suffice. We call a row operation admissible if it is representable as a sequence of finitely many elementary row operations. Two matrices [A | b] and [B | c] are row equivalent if there is an admissible row operation that can be used to transform the matrix [A | b] into the matrix [B | c].

Examples 3.1.
(1) The following is an elementary row operation (a scaling):

    [ 1 2 3 ]   R1      [ 1 2 3 ]
    [ 3 1 4 ]   ====>   [ 6 2 8 ]
                2R2

Another elementary row operation (a row replacement) is

    [ 1 2 3 ]   R1          [ 1  2  3 ]
    [ 6 2 8 ]   ========>   [ 3 -4 -1 ]
                R2 - 3R1

(2) The following is an admissible row operation, but is not an elementary row operation (it scales and replaces in a single step):

    [ 1 2 3 ]   R1           [ 1  2  3 ]
    [ 3 1 4 ]   =========>   [ 3 -4 -1 ]
                2R2 - 3R1

(3) The following matrices are row-equivalent matrices:

    [ 1 2 3 ]    [ 1 2 3 ]        [ 1  2  3 ]
    [ 3 1 4 ],   [ 6 2 8 ],  and  [ 3 -4 -1 ]

Theorem 3. If two linear systems have row-equivalent augmented matrices, then the two systems are equivalent systems. Conversely, if two (m × n) systems of linear equations are equivalent systems, then their augmented matrices are row-equivalent matrices.

This theorem expresses the fact that if we begin with an augmented matrix for a given system of linear equations, and perform elementary or other admissible row operations to this matrix, and if we can then easily tell what are the solutions of the system whose augmented matrix results, then we may solve the
original system. This is in fact what we do to solve systems of linear equations. A special type of matrix is studied next, because such matrices represent easily solvable systems.

3.2. Echelon Forms. A matrix is in (row) echelon form if:
(1) Any nonzero row is above any zero row.
(2) Any leading nonzero entry in a row is to the right of any leading nonzero entry in any row above it.
(3) All entries in a column below a leading nonzero entry are zeros.

A matrix is in near-reduced (row) echelon form if, in addition:
(4) Each leading nonzero entry is the only nonzero entry in its column.

A matrix is in reduced (row) echelon form if, further:
(5) Each leading nonzero entry is 1.

To solve a system, we reduce its augmented matrix either to near-reduced echelon form or to reduced echelon form, using admissible row operations. Some authors prefer reduced echelon form, and they have good reasons. However, we will prefer near-reduced echelon form, as it is slightly easier to obtain, and is just as informative as reduced echelon form.

Example 3.2. Solve the system

    2x1 + x2 = 3
    3x1 - x2 = 4

Solution: We use admissible row operations to reduce the matrix

    [A | b] = [ 2  1 | 3 ]
              [ 3 -1 | 4 ]

to near-reduced echelon form, as follows:

    [ 2  1 | 3 ]   R1          [ 2  1 |  3 ]   5R1 + R2   [ 10 0 | 14 ]
    [ 3 -1 | 4 ]   =========>  [ 0 -5 | -1 ]   ========>  [  0 5 |  1 ]
                   2R2 - 3R1                   -R2
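The reduction above can be reproduced with exact rational arithmetic. The sketch below (the helper `rowop` and its interface are ours, not from the notes) applies each simultaneous family of row combinations as one step: `ops[i]` lists the pairs (coefficient, source row) whose combination becomes the new row i.

```python
from fractions import Fraction as Fr

# Redo the reduction of Example 3.2 exactly (illustrative helper names).
def rowop(M, ops):
    """ops[i]: list of (coeff, source_row); new row i = sum of coeff*row."""
    return [[sum(c * M[j][k] for c, j in ops[i]) for k in range(len(M[0]))]
            for i in range(len(M))]

M = [[Fr(2), Fr(1), Fr(3)],
     [Fr(3), Fr(-1), Fr(4)]]
M = rowop(M, [[(1, 0)], [(2, 1), (-3, 0)]])    # R2 <- 2R2 - 3R1
M = rowop(M, [[(5, 0), (1, 1)], [(-1, 1)]])    # R1 <- 5R1 + R2, R2 <- -R2
# M is now the near-reduced echelon form [[10, 0, 14], [0, 5, 1]].
x1, x2 = M[0][2] / M[0][0], M[1][2] / M[1][1]
print(x1, x2)  # 7/5 1/5
```

Because `Fraction` arithmetic is exact, the computed solution matches the hand reduction with no rounding.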
Thus our original system is equivalent to the system

    10x1 = 14
     5x2 = 1,

so the only solution of the system is (x1, x2) = (14/10, 1/5) = (7/5, 1/5).

In performing row operations to solve a system of linear equations, we choose a nonzero row to use as a pivot row, and in that row the leading nonzero entry is called a pivot entry. The column containing this pivot entry is called a pivot column. When we have computed a near-reduced (or reduced) echelon form of an augmented matrix, the leading nonzero entries of the nonzero rows of the near-reduced (or reduced) echelon form are the pivot entries of the matrix. In our algorithm for finding a near-reduced (or reduced) echelon form, we use the pivot entries to convert the other entries in the pivot columns into zeros.

Example 3.3. In the matrices

    [A | b] = [ 1 2 3 | 2 ]        [B | c] = [ 1  2   3 |   2 ]
              [ 2 2 5 | 0 ]                  [ 0 -2  -1 |  -4 ]
              [ 5 9 4 | 0 ]                  [ 0 -1 -11 | -10 ]

the (1,1) entry (a11) is a pivot entry, row 1 is a pivot row, and column 1 is a pivot column. We obtain [B | c] from [A | b] by performing one admissible row operation, using the pivot entry in the (1,1) position:

    [A | b]   R1, R2 - 2R1, R3 - 5R1   [B | c]
              ====================>

4. Matrix Multiplication

4.1. Definition of matrix products. We define matrix products as follows:

(1) If x = [x1 x2 ... xn] is a (1 × n) row and y is an (n × 1) column with entries y1, y2, ..., yn, then

    xy = [ x1y1 + x2y2 + ... + xnyn ],

the (1 × 1) matrix whose single entry is the sum of the products xjyj, for j = 1, ..., n.
(2) If A is an (m × n) matrix and B is an (n × p) matrix, then (AB)ij = A(i) B(j), where A(i) is row i of A and B(j) is column j of B. Thus the (i, j)-entry of AB is the product of row i of A with column j of B.
(3) If the number of columns of A is not the same as the number of rows of B, then A and B cannot be multiplied.

4.2. Some Facts about Matrix Products.
(1) Elementary matrices represent elementary row operations.
(2) Matrix multiplication is associative.
(3) Matrix multiplication is not, in general, commutative.
(4) A matrix A is invertible if there is a matrix B such that AB = BA = I.
(5) Any elementary matrix is invertible.
(6) Any admissible row operation is represented by some invertible matrix. Conversely, any invertible matrix represents some admissible row operation.
(7) To perform an elementary row operation on a matrix, one need only multiply on the left by the elementary matrix that represents the given row operation.
(8) To perform an admissible row operation on a matrix, one need only multiply on the left by the invertible matrix that represents the given row operation.
(9) Even row operations that are not admissible are represented by matrices: to perform a row operation on a matrix, one may first find the matrix that represents the given row operation, then multiply on the left by that matrix.
(10) The process of LU-factorization provides a way to solve systems of equations quickly when one can find a lower triangular matrix L and an upper triangular matrix U such that LU is the coefficient matrix of the given systems.
(11) The inverse of the inverse is the original matrix.
(12) A product of invertible matrices is invertible, and its inverse is the product of the inverses of the factors, in the reverse order.
(13) Several nice properties are equivalent to invertibility.
(14) To compute the inverse of an invertible matrix, one need only augment the given matrix with the corresponding identity matrix and compute the reduced row-echelon form of the result. The inverse is then in the
augmented portion of the result.

4.3. A Useful Family of Admissible Row Operations. Assume that j ∈ {1, ..., m}, and that scalars a1, ..., am and b1, ..., bm are such that aj + bj ≠ 0 and, for k ≠ j, ak ≠ 0. Then the following describes an admissible row operation, which simultaneously replaces each row Rk by akRk + bkRj (so that row j itself becomes (aj + bj)Rj):

    a1R1 + b1Rj
        ...
    ajRj + bjRj
        ...
    amRm + bmRj
    ============>
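One way to see why the conditions aj + bj ≠ 0 and ak ≠ 0 matter: by fact (6) of §4.2, the operation is admissible exactly when its representing matrix is invertible, and here row i of that matrix is ai·ei + bi·ej, so the conditions are precisely what keep its determinant nonzero. A sketch (the function name and indices are ours; rows are 0-indexed in the code):

```python
# Representing matrix of the family operation R_k <- a_k*R_k + b_k*R_j
# (row j itself becomes (a_j + b_j)*R_j).  Illustrative helper, 0-indexed.
def family_matrix(a, b, j):
    m = len(a)
    M = [[0] * m for _ in range(m)]
    for i in range(m):
        M[i][i] += a[i]      # a_i on the diagonal
        M[i][j] += b[i]      # b_i in column j
    return M

# The second step of Example 3.2 -- R1 <- 5R1 + R2 together with R2 <- -R2 --
# is the member with j = 2 (index 1), a = (5, 0), b = (1, -1); note that
# a1 != 0 and a2 + b2 = -1 != 0.
M = family_matrix([5, 0], [1, -1], 1)
print(M)                                      # [[5, 1], [0, -1]]
print(M[0][0] * M[1][1] - M[0][1] * M[1][0])  # determinant -5, nonzero
```

In general the determinant of this matrix works out to (aj + bj) times the product of the remaining ak, which is nonzero exactly under the stated conditions.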
The above family of admissible row operations is extremely useful, but note that many of them are not elementary. Now we will see an example of an admissible row operation that is not in the family described above, and we shall contrast it with an inadmissible row operation.

Example 4.1. On (3 × 3) matrices, the following row operation is admissible:

    R1 + R2
    R2 + R3
    R3 + R1
    ========>

Example 4.2. On (4 × 4) matrices, the following row operation is not admissible:

    R1 + R2
    R2 + R3
    R3 + R4
    R4 + R1
    ========>
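Both claims can be confirmed with the invertibility test from §4.2, fact (6): the operation Ri <- Ri + R(i+1 mod n) is represented by I + P, where P is the cyclic permutation matrix, and it is admissible exactly when that matrix is invertible. A sketch over the rational numbers (helper names are ours; note that in a field of characteristic 2 even the 3 × 3 case would fail, since its determinant is 2):

```python
from fractions import Fraction as Fr

# Determinant by fraction-free-ish Gaussian elimination (illustrative helper).
def det(M):
    M = [[Fr(x) for x in row] for row in M]
    n, sign, d = len(M), 1, Fr(1)
    for c in range(n):
        piv = next((r for r in range(c, n) if M[r][c] != 0), None)
        if piv is None:
            return Fr(0)                      # singular: a zero column appears
        if piv != c:
            M[c], M[piv] = M[piv], M[c]       # row interchange flips the sign
            sign = -sign
        d *= M[c][c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return sign * d

def cyc(n):  # I + P, representing R_i <- R_i + R_{i+1 (mod n)}
    return [[1 if j == i or j == (i + 1) % n else 0 for j in range(n)]
            for i in range(n)]

print(det(cyc(3)), det(cyc(4)))  # 2 0: admissible for n = 3, not for n = 4
```

For n = 4 the vector (1, -1, 1, -1) is in the kernel of I + P, which is another way to see that the 4 × 4 operation destroys information and so cannot be undone by any sequence of elementary row operations.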