
MATH10212 Linear Algebra. Brief lecture notes.

Textbook: D. Poole, Linear Algebra: A Modern Introduction. Thomson, 2006. ISBN 0-534-40596-7.

Systems of Linear Equations

Definition. An n-dimensional vector is a row or a column of n numbers (or letters):
$$[a_1, \dots, a_n] \quad\text{or}\quad \begin{bmatrix} a_1 \\ \vdots \\ a_n \end{bmatrix}.$$
The set of all such vectors (either only rows, or only columns) with real entries is denoted by $\mathbb{R}^n$. Short notation for vectors varies: $\mathbf{a}$, or $\vec{a}$, or $\underline{a}$.

Definition. A linear equation in n variables $x_1, x_2, \dots, x_n$ is an equation
$$a_1 x_1 + a_2 x_2 + \dots + a_n x_n = b,$$
where the coefficients $a_1, a_2, \dots, a_n$ and the constant term $b$ are constants.

A solution of a linear equation $a_1 x_1 + a_2 x_2 + \dots + a_n x_n = b$ is a vector $[s_1, s_2, \dots, s_n]$ whose components satisfy the equation when we substitute $x_1 = s_1, x_2 = s_2, \dots, x_n = s_n$, that is,
$$a_1 s_1 + a_2 s_2 + \dots + a_n s_n = b.$$

A system of linear equations is a finite set of linear equations, each with the same variables. A solution of a system of linear equations is a vector that is simultaneously a solution of each equation in the system. The solution set of a system of linear equations is the set of all solutions of the system.
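The definition of a solution can be checked mechanically by substitution. The following is a minimal sketch, not part of the notes: the function name `is_solution` and the sample system $x + y = 3$, $x - y = -1$ are chosen here purely for illustration, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def is_solution(coeffs, consts, s):
    """Check by substitution whether s solves every equation of the system.

    coeffs[i] holds the coefficients a_1, ..., a_n of equation i,
    consts[i] holds its constant term b (hypothetical input layout).
    """
    return all(
        sum(Fraction(a) * Fraction(x) for a, x in zip(row, s)) == Fraction(b)
        for row, b in zip(coeffs, consts)
    )

# Illustrative system: x + y = 3, x - y = -1
coeffs = [[1, 1], [1, -1]]
consts = [3, -1]
print(is_solution(coeffs, consts, [1, 2]))  # True: 1 + 2 = 3 and 1 - 2 = -1
print(is_solution(coeffs, consts, [2, 1]))  # False: 2 - 1 = 1, not -1
```

A vector must satisfy all equations at once; `[2, 1]` satisfies the first equation but not the second, so it is not a solution of the system.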

Definition. A general solution of a linear system (or equation) is an expression of the unknowns in terms of certain parameters that can independently take any values, producing all the solutions of the system (and only solutions).

Two linear systems are equivalent if they have the same solution sets. For example,
$$\begin{aligned} x + y &= 3 \\ x - y &= -1 \end{aligned} \qquad\text{and}\qquad \begin{aligned} x - y &= -1 \\ y &= 2 \end{aligned}$$
are equivalent, since both have the unique solution $[1, 2]$.

We solve a system of linear equations by transforming it into an equivalent one of a triangular or staircase pattern:
$$\begin{aligned} x - y - z &= -4 \\ y + 3z &= 11 \\ 5z &= 15 \end{aligned}$$
Using back substitution, we find successively that $z = 3$, $y = 11 - 3 \cdot 3 = 2$, and $x = -4 + 2 + 3 = 1$. So the unique solution is $[1, 2, 3]$.

However, in many cases the solution is not unique, or may not exist. If it does exist, we need to find all solutions.

Another example:
$$\begin{aligned} x - y + z &= -1 \\ y + z &= 1 \end{aligned}$$
Using back substitution: $y = 1 - z$; $x = y - z - 1 = (1 - z) - z - 1 = -2z$; thus $x = -2t$, $y = 1 - t$, $z = t$, where $t$ is a parameter. So the solution set is $\{[-2t,\, 1 - t,\, t] \mid t \in \mathbb{R}\}$: infinitely many solutions.

Matrices and Echelon Form

The coefficient matrix of a linear system contains the coefficients of the variables, and the augmented matrix is the coefficient matrix augmented by an extra column containing the constant terms. (At the moment, a matrix for us is simply a table of coefficients; no prior knowledge of matrices is assumed; properties of matrices will be studied later.)

For the system
$$\begin{aligned} 2x + y - z &= 3 \\ x + 5z &= 1 \\ -x + 3y - 2z &= 0 \end{aligned}$$
the coefficient matrix is

$$\begin{bmatrix} 2 & 1 & -1 \\ 1 & 0 & 5 \\ -1 & 3 & -2 \end{bmatrix}$$
and the augmented matrix is
$$\left[\begin{array}{rrr|r} 2 & 1 & -1 & 3 \\ 1 & 0 & 5 & 1 \\ -1 & 3 & -2 & 0 \end{array}\right].$$
If a variable is missing, its coefficient 0 is entered in the appropriate position in the matrix. If we denote the coefficient matrix of a linear system by $A$ and the column vector of constant terms by $\vec{b}$, then the form of the augmented matrix is $[A \mid \vec{b}]$.

Definition. A matrix is in row echelon form (r.e.f.) if:
1. Any rows consisting entirely of zeros are at the bottom.
2. In each nonzero row, the first nonzero entry (called the leading entry) is in a column to the left of any leading entries below it.

Definition. If the augmented matrix of a linear system is in r.e.f., then the leading variables are those corresponding to the leading entries; the free variables are all the remaining variables (possibly none).

Remark. If the augmented matrix of a linear system is in r.e.f., then it is easy to solve the system (or to see that there are no solutions). Namely, there are no solutions if and only if there is a "bad" row $[0, 0, \dots, 0, b]$ with $b \neq 0$ at the bottom. If there is no bad row, then one can solve the system using back substitution: express the leading variable from the equation corresponding to the lowest nonzero row, substitute into all the upper equations, then express the leading variable from the equation of the next row up, substitute everywhere above, and so on.

Elementary Row Operations

These are what is used to arrive at r.e.f. for solving linear systems (and there are many other applications).

Definition. The following elementary row operations (e.r.o.s) can be performed on a matrix:
1. Interchange two rows.

2. Multiply a row by a nonzero constant.
3. Add a multiple of a row to another row.

Remark. Observe that dividing a row by a nonzero constant is implied in the above definition, since, for example, dividing a row by 2 is the same as multiplying it by $\tfrac{1}{2}$. Similarly, subtracting a multiple of a row from another row is the same as adding a negative multiple of a row to another row.

Notation for the three elementary row operations:
1. $R_i \leftrightarrow R_j$ means interchange rows $i$ and $j$.
2. $kR_i$ means multiply row $i$ by $k$ (remember that $k \neq 0$!).
3. $R_i + kR_j$ means add $k$ times row $j$ to row $i$ (and replace row $i$ with the result, so only the $i$th row is changed).

The process of applying elementary row operations to bring a matrix into row echelon form is called row reduction.

Remarks. E.r.o.s must be applied only one at a time, consecutively. The row echelon form of a matrix is not unique.

Lemma on inverse e.r.o.s. Elementary row operations are reversible by other e.r.o.s: operations 1 to 3 are undone by $R_i \leftrightarrow R_j$, $\tfrac{1}{k}R_i$ (using $k \neq 0$), and $R_i - kR_j$, respectively.

Fundamental Theorem on E.R.O.s for Linear Systems. Elementary row operations applied to the augmented matrix do not alter the solution set of a linear system. (Thus, two linear systems with row equivalent matrices have the same solution set.)

Proof. Suppose that one system (old) is transformed into a new one by an elementary row operation (of one of the types 1, 2, 3). (Clearly, we only need to consider one e.r.o.) Let $S_1$ be the solution set of the old system, and $S_2$ the solution set of the new one. We need to show that $S_1 = S_2$.

First, it is almost obvious that $S_1 \subseteq S_2$, that is, every solution of the old system is a solution of the new one. Indeed, if it was type 1, then clearly

nothing changes, since the solution set does not depend on the order of equations. If it was an e.r.o. of type 2, then only the $i$th equation changes: if
$$a_{i1}u_1 + a_{i2}u_2 + \dots + a_{in}u_n = b_i \quad\text{(old)},$$
then
$$ka_{i1}u_1 + ka_{i2}u_2 + \dots + ka_{in}u_n = k(a_{i1}u_1 + a_{i2}u_2 + \dots + a_{in}u_n) = kb_i \quad\text{(new)},$$
so a solution $[u_1, \dots, u_n]$ of the old system remains a solution of the new one. Similarly, if it was type 3, only the $i$th equation changes: if $[u_1, \dots, u_n]$ was a solution of the old system, then both
$$a_{i1}u_1 + a_{i2}u_2 + \dots + a_{in}u_n = b_i \quad\text{and}\quad a_{j1}u_1 + a_{j2}u_2 + \dots + a_{jn}u_n = b_j,$$
whence by adding $k$ times the second equation to the first and collecting terms we get
$$(a_{i1} + ka_{j1})u_1 + (a_{i2} + ka_{j2})u_2 + \dots + (a_{in} + ka_{jn})u_n = b_i + kb_j,$$
so $[u_1, \dots, u_n]$ remains a solution of the new system. Thus, in each case, $S_1 \subseteq S_2$.

But by the Lemma on inverse e.r.o.s each e.r.o. has an inverse, so the old system can also be obtained from the new one by an elementary row operation. Therefore, by the same argument, we also have $S_2 \subseteq S_1$. Since now both $S_2 \subseteq S_1$ and $S_1 \subseteq S_2$, we have $S_2 = S_1$, as required.

This theorem is the theoretical basis of methods of solution by e.r.o.s.

Gaussian Elimination method for solving linear systems

1. Write the augmented matrix of the system of linear equations.
2. Use elementary row operations to reduce the augmented matrix to row echelon form.
3. If there is a bad row, then there are no solutions. If there is no bad row, then solve the equivalent system that corresponds to the row-reduced matrix, expressing the leading variables via the constant terms and free variables using back substitution.

Remark. When performed by hand, step 2 of Gaussian elimination allows quite a bit of choice. Here are some useful guidelines:
(a) Locate the leftmost column that is not all zeros.
(b) Create a leading entry at the top of this column using a type 1 e.r.o. $R_1 \leftrightarrow R_i$. (It helps if you make this leading entry equal to 1, if necessary using a type 2 e.r.o. $\tfrac{1}{k}R_1$.)
(c) Use the leading entry to create zeros below it: kill off all the entries of this column below the leading entry, using type 3 e.r.o.s $R_i - aR_1$.
(d) Cover (ignore) the first row containing the leading entry, and repeat steps (a), (b), (c) on the remaining submatrix.
...and so on, every time in (d) ignoring the upper rows with the already created leading entries. Stop when the entire matrix is in row echelon form.
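Guidelines (a) to (d) above can be sketched in code as follows. This is an illustrative sketch under stated assumptions, not a definitive implementation: the names `row_echelon_form` and `has_bad_row` are hypothetical, matrices are plain lists of rows, and exact `Fraction` arithmetic is used so that no rounding occurs. The choice of pivot (the first nonzero entry found) is just one of the many choices the remark allows.

```python
from fractions import Fraction

def row_echelon_form(matrix):
    """Reduce a matrix (list of rows) to a row echelon form by e.r.o.s."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n_rows = len(m)
    n_cols = len(m[0]) if m else 0
    top = 0  # rows above `top` are already "covered", as in step (d)
    for col in range(n_cols):
        if top == n_rows:
            break
        # (a): is there a nonzero entry in this column among uncovered rows?
        pivot = next((r for r in range(top, n_rows) if m[r][col] != 0), None)
        if pivot is None:
            continue
        # (b): type 1 e.r.o. brings it to the top, type 2 makes it a 1
        m[top], m[pivot] = m[pivot], m[top]
        lead = m[top][col]
        m[top] = [x / lead for x in m[top]]
        # (c): type 3 e.r.o.s R_r - a*R_top create zeros below the leading entry
        for r in range(top + 1, n_rows):
            a = m[r][col]
            if a != 0:
                m[r] = [x - a * y for x, y in zip(m[r], m[top])]
        top += 1  # (d): cover this row and continue on the submatrix
    return m

def has_bad_row(ref):
    """True if an augmented matrix in r.e.f. has a row [0, ..., 0, b], b != 0."""
    return any(all(x == 0 for x in row[:-1]) and row[-1] != 0 for row in ref)

# Illustrative augmented matrices (not taken from the notes):
consistent = row_echelon_form([[1, 1, 3], [1, -1, -1]])    # x + y = 3, x - y = -1
inconsistent = row_echelon_form([[1, 1, 1], [1, 1, 2]])    # x + y = 1, x + y = 2
print(has_bad_row(consistent))    # False
print(has_bad_row(inconsistent))  # True
```

For the first matrix the sketch returns the r.e.f. with rows `[1, 1, 3]` and `[0, 1, 2]`, from which back substitution gives $y = 2$, $x = 1$.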

It is fairly obvious that this procedure always works.

There are no solutions if and only if a bad row $[0, 0, \dots, 0, b]$ with $b \neq 0$ appears: indeed, then nothing can satisfy the equation $0x_1 + \dots + 0x_n = b \neq 0$.

Variables corresponding to leading entries are leading variables; all other variables are free variables (possibly none, and then the solution is unique). Clearly, when we back-substitute, free variables can take any values ("free"), while leading variables are uniquely expressed in terms of free variables and lower leading variables, which in turn are expressed in the same way, and so on. So in fact, in the final form of the solution, leading variables are uniquely expressed in terms of free variables only, while free variables can independently take any values. In other words, free variables are equal to independent parameters, and leading variables are expressed in these parameters.

Gauss-Jordan Elimination method for solving linear systems

We can reduce the augmented matrix even further than in Gaussian elimination.

Definition. A matrix is in reduced row echelon form (r.r.e.f.) if:
1. It is in row echelon form.
2. The leading entry in each nonzero row is a 1 (called a leading 1).
3. Each column containing a leading 1 has zeros everywhere else.

Gauss-Jordan Elimination:
1. Write the augmented matrix of the system of linear equations.
2. Use elementary row operations to reduce the augmented matrix to reduced row echelon form. (In addition to (c) above, also kill off all entries (i.e. create zeros) above the leading 1 in the same column.)
3. If there is a bad row, then there are no solutions. If there is no bad row (i.e. the resulting system is consistent), then express the leading variables in terms of the constant terms and any remaining free variables.

Reaching r.r.e.f. takes a bit more work, but then expressing the leading variables in terms of the free variables is much easier.

The Gaussian (or Gauss-Jordan) elimination methods yield the following

Corollary. Every consistent linear system over $\mathbb{R}$ has either a unique solution (if there are no free variables, so all variables are leading), or infinitely many solutions (when there are free variables, which can take arbitrary values).

(We included "over $\mathbb{R}$" because sometimes linear systems are considered over other number systems, e.g. so-called finite fields, although in this module we work only over $\mathbb{R}$.)

Remark. If one needs a particular solution (that is, just any one solution), simply set the parameters (free variables) to any values (usually the simplest choice is all zeros). E.g. for the general solution $\{[1 - t + 2u,\, t,\, 3 + u,\, u] \mid t, u \in \mathbb{R}\}$, setting $t = u = 0$ we get a particular solution $[1, 0, 3, 0]$; or we can set, say, $t = 1$ and $u = 2$, and then we get a particular solution $[4, 1, 5, 2]$, etc.

Definition. The rank of a matrix is the number of nonzero rows in its row echelon form. We denote the rank of a matrix $A$ by $\operatorname{rank}(A)$.

Theorem 2.2 (The Rank Theorem). Let $A$ be the coefficient matrix of a system of linear equations with $n$ variables. If the system is consistent, then
$$\text{number of free variables} = n - \operatorname{rank}(A).$$

Homogeneous Systems

Definition. A system of linear equations is called homogeneous if the constant term in each equation is zero.

In other words, a homogeneous system has an augmented matrix of the form $[A \mid \vec{0}]$. E.g., the following system is homogeneous:
$$\begin{aligned} x + 2y - 3z &= 0 \\ x + y + 2z &= 0 \end{aligned}$$

Remarks. 1) Every homogeneous system is consistent, as it has (at least) the trivial solution $[0, 0, \dots, 0]$.
2) Hence, by the Corollary above, every homogeneous system has either a unique solution (the trivial solution) or infinitely many solutions. The next theorem says that the latter case must occur if the number of variables is greater than the number of equations.

Theorem 2.3. If $[A \mid \vec{0}]$ is a homogeneous system of $m$ linear equations with $n$ variables, where $m < n$, then the system has infinitely many solutions.
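The Rank Theorem and Theorem 2.3 can be illustrated numerically. The sketch below is an assumption-laden illustration rather than part of the notes: the helper names `rank` and `free_variable_count` are hypothetical, and the sample coefficient matrix encodes the homogeneous system $x + 2y - 3z = 0$, $x + y + 2z = 0$ (the minus sign in the first equation is a reconstruction).

```python
from fractions import Fraction

def rank(matrix):
    """Number of leading entries found while row-reducing (exact arithmetic)."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n_cols = len(m[0]) if m else 0
    found = 0  # leading entries found so far = number of finished rows
    for col in range(n_cols):
        pivot = next((r for r in range(found, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[found], m[pivot] = m[pivot], m[found]
        for r in range(found + 1, len(m)):
            factor = m[r][col] / m[found][col]
            m[r] = [x - factor * y for x, y in zip(m[r], m[found])]
        found += 1
    return found

def free_variable_count(coefficient_matrix):
    """n - rank(A); meaningful only when the system is consistent."""
    n = len(coefficient_matrix[0])
    return n - rank(coefficient_matrix)

A = [[1, 2, -3], [1, 1, 2]]    # 2 equations, 3 variables, homogeneous
print(rank(A))                 # 2
print(free_variable_count(A))  # 1
```

Since a homogeneous system is always consistent and here $m = 2 < n = 3$, the rank is at most 2, so at least one variable is free and there are infinitely many solutions, in line with Theorem 2.3.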

By-product result for matrices

Definition. Matrices $A$ and $B$ are row equivalent if there is a sequence of elementary row operations that converts $A$ into $B$.

For example, the matrices
$$\begin{bmatrix} 0 & 0 & 0 & 1 \\ 1 & 2 & 3 & 4 \\ 2 & 3 & 4 & 5 \end{bmatrix} \quad\text{and}\quad \begin{bmatrix} 1 & 2 & 3 & 4 \\ 0 & 1 & 2 & 3 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$
are row equivalent.

Theorem 2.1. Matrices $A$ and $B$ are row equivalent if and only if they can be reduced to the same row echelon form.
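One practical way to test row equivalence is sketched below. Note the swapped-in technique: instead of searching for a common row echelon form, the sketch compares reduced row echelon forms, relying on the standard fact (not stated in these notes) that the r.r.e.f. of a matrix is unique. The function names `rref` and `row_equivalent` are hypothetical, and both matrices are assumed to have the same size.

```python
from fractions import Fraction

def rref(matrix):
    """Reduced row echelon form of a matrix given as a list of rows."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n_rows = len(m)
    n_cols = len(m[0]) if m else 0
    top = 0
    for col in range(n_cols):
        pivot = next((r for r in range(top, n_rows) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[top], m[pivot] = m[pivot], m[top]   # type 1 e.r.o.
        lead = m[top][col]
        m[top] = [x / lead for x in m[top]]   # type 2 e.r.o.: leading 1
        for r in range(n_rows):               # type 3: zeros above and below
            if r != top and m[r][col] != 0:
                a = m[r][col]
                m[r] = [x - a * y for x, y in zip(m[r], m[top])]
        top += 1
    return m

def row_equivalent(a, b):
    """Same-size matrices are row equivalent iff their r.r.e.f.s coincide."""
    return rref(a) == rref(b)

A = [[0, 0, 0, 1], [1, 2, 3, 4], [2, 3, 4, 5]]
B = [[1, 2, 3, 4], [0, 1, 2, 3], [0, 0, 0, 1]]
print(row_equivalent(A, B))  # True
```

Both matrices of the example above reduce to the rows `[1, 0, -1, 0]`, `[0, 1, 2, 0]`, `[0, 0, 0, 1]`.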

Spanning Sets, Linear (In)Dependence, Connections with Linear Systems

Linear Combinations, Spans

Recall that the sum of two vectors of the same length is
$$\begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{bmatrix} + \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix} = \begin{bmatrix} a_1 + b_1 \\ a_2 + b_2 \\ \vdots \\ a_n + b_n \end{bmatrix}.$$
Multiplication by a scalar $k \in \mathbb{R}$ is:
$$k \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{bmatrix} = \begin{bmatrix} ka_1 \\ ka_2 \\ \vdots \\ ka_n \end{bmatrix}.$$

Definition. A linear combination of vectors $\vec{v}_1, \vec{v}_2, \dots, \vec{v}_k \in \mathbb{R}^n$ with coefficients $c_1, \dots, c_k \in \mathbb{R}$ is
$$c_1\vec{v}_1 + c_2\vec{v}_2 + \dots + c_k\vec{v}_k.$$

Theorem 2.4. A system of linear equations with augmented matrix $[A \mid \vec{b}]$ is consistent if and only if $\vec{b}$ is a linear combination of the columns of $A$.

Method for deciding if a vector $\vec{b}$ is a linear combination of vectors $\vec{a}_1, \dots, \vec{a}_k$ (of course all vectors must be of the same length): form the linear system with augmented matrix whose columns are $\vec{a}_1, \dots, \vec{a}_k, \vec{b}$ (the unknowns of this system are the coefficients of the linear combination). If it is consistent, then $\vec{b}$ is a linear combination of the vectors $\vec{a}_1, \dots, \vec{a}_k$; if inconsistent, it is not. If one needs to express $\vec{b}$ as a linear combination of the vectors $\vec{a}_1, \dots, \vec{a}_k$, just produce some particular solution, which gives the required coefficients.

We will often be interested in the collection of all linear combinations of a given set of vectors.

Definition. If $S = \{\vec{v}_1, \vec{v}_2, \dots, \vec{v}_k\}$ is a set of vectors in $\mathbb{R}^n$, then the set of all linear combinations of $\vec{v}_1, \vec{v}_2, \dots, \vec{v}_k$ is called the span of $\vec{v}_1, \vec{v}_2, \dots, \vec{v}_k$ and is denoted by $\operatorname{span}(\vec{v}_1, \vec{v}_2, \dots, \vec{v}_k)$ or $\operatorname{span}(S)$. Thus,
$$\operatorname{span}(\vec{v}_1, \vec{v}_2, \dots, \vec{v}_k) = \{c_1\vec{v}_1 + c_2\vec{v}_2 + \dots + c_k\vec{v}_k \mid c_i \in \mathbb{R}\}.$$
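The method for deciding whether $\vec{b}$ is a linear combination of $\vec{a}_1, \dots, \vec{a}_k$ can be sketched as follows. This is only an illustration under assumptions: the names `rref_with_pivots` and `combination_coefficients` are hypothetical, the sample vectors are not taken from these notes, and the particular solution is obtained by setting every free variable to 0.

```python
from fractions import Fraction

def rref_with_pivots(matrix):
    """Return (r.r.e.f., list of columns that contain a leading 1)."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n_rows = len(m)
    n_cols = len(m[0]) if m else 0
    pivots = []
    for col in range(n_cols):
        top = len(pivots)
        pivot = next((r for r in range(top, n_rows) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[top], m[pivot] = m[pivot], m[top]
        lead = m[top][col]
        m[top] = [x / lead for x in m[top]]
        for r in range(n_rows):
            if r != top and m[r][col] != 0:
                a = m[r][col]
                m[r] = [x - a * y for x, y in zip(m[r], m[top])]
        pivots.append(col)
    return m, pivots

def combination_coefficients(vectors, b):
    """Coefficients c with c_1 a_1 + ... + c_k a_k = b, or None if impossible."""
    k = len(vectors)
    # Augmented matrix whose columns are a_1, ..., a_k, b
    augmented = [[v[i] for v in vectors] + [b[i]] for i in range(len(b))]
    reduced, pivots = rref_with_pivots(augmented)
    if k in pivots:  # leading 1 in the constants column: a bad row
        return None
    coeffs = [Fraction(0)] * k          # free variables set to 0
    for row, col in enumerate(pivots):  # leading variables read off the r.r.e.f.
        coeffs[col] = reduced[row][k]
    return coeffs

vectors = [[1, 0, 3], [-1, 1, -3]]
print(combination_coefficients(vectors, [1, 2, 3]) == [3, 2])  # True
print(combination_coefficients(vectors, [2, 3, 4]))            # None
```

In the first call $3\,[1, 0, 3] + 2\,[-1, 1, -3] = [1, 2, 3]$; in the second the system is inconsistent, so $[2, 3, 4]$ is not in the span of the two vectors.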

Definition. If $\operatorname{span}(S) = \mathbb{R}^n$, then $S$ is called a spanning set for $\mathbb{R}^n$.

Obviously, to ask whether a vector $\vec{b}$ belongs to the span of vectors $\vec{v}_1, \dots, \vec{v}_k$ is exactly the same as to ask whether $\vec{b}$ is a linear combination of the vectors $\vec{v}_1, \dots, \vec{v}_k$; see Theorem 2.4 and the method described above.

Linear (in)dependence

Definition. A set of vectors $S = \{\vec{v}_1, \vec{v}_2, \dots, \vec{v}_k\}$ is linearly dependent if there are scalars $c_1, c_2, \dots, c_k$, at least one of which is not zero, such that
$$c_1\vec{v}_1 + c_2\vec{v}_2 + \dots + c_k\vec{v}_k = \vec{0}.$$
A set of vectors that is not linearly dependent is called linearly independent.

In other words, vectors $\{\vec{v}_1, \vec{v}_2, \dots, \vec{v}_k\}$ are linearly independent if the equality $c_1\vec{v}_1 + c_2\vec{v}_2 + \dots + c_k\vec{v}_k = \vec{0}$ implies that all the $c_i$ are zero (or: only the trivial linear combination of the $\vec{v}_i$ is equal to $\vec{0}$).

Remarks. In the definition of linear dependence, the requirement that at least one of the scalars $c_1, c_2, \dots, c_k$ must be nonzero allows for the possibility that some may be zero. In the example above, $\vec{u}$, $\vec{v}$ and $\vec{w}$ are linearly dependent, since $3\vec{u} + 2\vec{v} - \vec{w} = \vec{0}$ and, in fact, all of the scalars are nonzero. On the other hand,
$$\begin{bmatrix} 2 \\ 6 \end{bmatrix} - 2\begin{bmatrix} 1 \\ 3 \end{bmatrix} + 0\begin{bmatrix} 4 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix},$$
so $\begin{bmatrix} 2 \\ 6 \end{bmatrix}$, $\begin{bmatrix} 1 \\ 3 \end{bmatrix}$ and $\begin{bmatrix} 4 \\ 1 \end{bmatrix}$ are linearly dependent, since at least one (in fact, two) of the three scalars $1$, $-2$ and $0$ is nonzero. (Note that the actual dependence arises simply from the fact that the first two vectors are multiples of each other.)

Since $0\vec{v}_1 + 0\vec{v}_2 + \dots + 0\vec{v}_k = \vec{0}$ for any vectors $\vec{v}_1, \vec{v}_2, \dots, \vec{v}_k$, linear dependence essentially says that the zero vector can be expressed as a nontrivial linear combination of $\vec{v}_1, \vec{v}_2, \dots, \vec{v}_k$. Thus, linear independence means that the zero vector can be expressed as a linear combination of $\vec{v}_1, \vec{v}_2, \dots, \vec{v}_k$ only in the trivial way: $c_1\vec{v}_1 + c_2\vec{v}_2 + \dots + c_k\vec{v}_k = \vec{0}$ only if $c_1 = 0, c_2 = 0, \dots, c_k = 0$.
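The definition can be checked directly for a proposed list of scalars. The following small sketch uses hypothetical helper names (`linear_combination`, `is_dependence_relation`) and the vectors $[2, 6]$, $[1, 3]$, $[4, 1]$ with scalars $1, -2, 0$; it only verifies a given relation and does not search for one.

```python
from fractions import Fraction

def linear_combination(coeffs, vectors):
    """Compute c_1 v_1 + ... + c_k v_k componentwise."""
    n = len(vectors[0])
    return [
        sum(Fraction(c) * Fraction(v[i]) for c, v in zip(coeffs, vectors))
        for i in range(n)
    ]

def is_dependence_relation(coeffs, vectors):
    """True if the scalars are not all zero and the combination is the zero vector."""
    nontrivial = any(c != 0 for c in coeffs)
    is_zero = all(x == 0 for x in linear_combination(coeffs, vectors))
    return nontrivial and is_zero

vectors = [[2, 6], [1, 3], [4, 1]]
print(is_dependence_relation([1, -2, 0], vectors))  # True: a nontrivial relation
print(is_dependence_relation([0, 0, 0], vectors))   # False: the trivial combination
print(is_dependence_relation([1, 1, 1], vectors))   # False: the sum is [7, 10]
```

The trivial combination always gives the zero vector, which is why the definition insists that at least one scalar be nonzero.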

Theorem 2.6. Let $\vec{v}_1, \vec{v}_2, \dots, \vec{v}_m$ be (column) vectors in $\mathbb{R}^n$ and let $A$ be the $n \times m$ matrix $A = [\vec{v}_1\ \vec{v}_2\ \cdots\ \vec{v}_m]$ with these vectors as its columns. Then $\vec{v}_1, \vec{v}_2, \dots, \vec{v}_m$ are linearly dependent if and only if the homogeneous linear system with augmented matrix $[A \mid \vec{0}]$ has a nontrivial solution.

Proof. $\vec{v}_1, \vec{v}_2, \dots, \vec{v}_m$ are linearly dependent if and only if there are scalars $c_1, c_2, \dots, c_m$, not all zero, such that $c_1\vec{v}_1 + c_2\vec{v}_2 + \dots + c_m\vec{v}_m = \vec{0}$. By Theorem 2.4, this is equivalent to saying that the system with the augmented matrix $[\vec{v}_1\ \vec{v}_2\ \dots\ \vec{v}_m \mid \vec{0}]$ has a nontrivial solution.

Method for determining if given vectors $\vec{v}_1, \vec{v}_2, \dots, \vec{v}_m$ are linearly dependent: form the homogeneous system as in Theorem 2.6 (the unknowns are the coefficients of the linear combination). Reduce its augmented matrix to r.e.f. If there are no nontrivial solutions (that is, no free variables), then the vectors are linearly independent. If there are free variables, then there are nontrivial solutions and the vectors are dependent. To find a concrete dependence, find a particular nontrivial solution, which gives the required coefficients; for that, set the free variables to 1, say (not all to 0).

Example 2.22. Any set of vectors $\vec{0}, \vec{v}_2, \dots, \vec{v}_m$ containing the zero vector is linearly dependent. For we can find a nontrivial combination of the form $c_1\vec{0} + c_2\vec{v}_2 + \dots + c_m\vec{v}_m = \vec{0}$ by setting $c_1 = 1$ and $c_2 = c_3 = \dots = c_m = 0$.

The relationship between the intuitive notion of dependence and the formal definition is given in the next theorem.

Theorem 2.5. Vectors $\vec{v}_1, \vec{v}_2, \dots, \vec{v}_m$ in $\mathbb{R}^n$ are linearly dependent if and only if at least one of the vectors can be expressed as a linear combination of the others.

Proof. If one of the vectors, say $\vec{v}_1$, is a linear combination of the others, then there are scalars $c_2, \dots, c_m$ such that
$$\vec{v}_1 = c_2\vec{v}_2 + \dots + c_m\vec{v}_m.$$
Rearranging, we obtain
$$\vec{v}_1 - c_2\vec{v}_2 - \dots - c_m\vec{v}_m = \vec{0},$$

which implies that $\vec{v}_1, \vec{v}_2, \dots, \vec{v}_m$ are linearly dependent, since at least one of the scalars (namely, the coefficient 1 of $\vec{v}_1$) is nonzero.

Conversely, suppose that $\vec{v}_1, \vec{v}_2, \dots, \vec{v}_m$ are linearly dependent. Then there are scalars $c_1, c_2, \dots, c_m$, not all zero, such that
$$c_1\vec{v}_1 + c_2\vec{v}_2 + \dots + c_m\vec{v}_m = \vec{0}.$$
Suppose $c_1 \neq 0$ (the argument is the same for any other nonzero $c_i$). Then
$$c_1\vec{v}_1 = -c_2\vec{v}_2 - \dots - c_m\vec{v}_m$$
and we may multiply both sides by $\tfrac{1}{c_1}$ to obtain $\vec{v}_1$ as a linear combination of the other vectors:
$$\vec{v}_1 = -\left(\frac{c_2}{c_1}\right)\vec{v}_2 - \dots - \left(\frac{c_m}{c_1}\right)\vec{v}_m.$$

Corollary. Two vectors $\vec{u}, \vec{v} \in \mathbb{R}^n$ are linearly dependent if and only if they are proportional.

E.g., vectors $[1, 2, 1]$ and $[1, 1, 3]$ are linearly independent, as they are not proportional. Vectors $[1, 2, 1]$ and $[2, 4, 2]$ are linearly dependent, since they are proportional (with coefficient 2).

Theorem 2.8. Any set of $m$ vectors in $\mathbb{R}^n$ is linearly dependent if $m > n$.

Proof. Let $\vec{v}_1, \vec{v}_2, \dots, \vec{v}_m$ be (column) vectors in $\mathbb{R}^n$ and let $A$ be the $n \times m$ matrix $A = [\vec{v}_1\ \vec{v}_2\ \cdots\ \vec{v}_m]$ with these vectors as its columns. By Theorem 2.6, $\vec{v}_1, \vec{v}_2, \dots, \vec{v}_m$ are linearly dependent if and only if the homogeneous linear system with augmented matrix $[A \mid \vec{0}]$ has a nontrivial solution. But, according to Theorem 2.3 (not 2.6, which is a misprint in the textbook here), this will always be the case if $A$ has more columns than rows; it is the case here, since the number of columns $m$ is greater than the number of rows $n$. (Note that here $m$ and $n$ have opposite meanings compared to Theorem 2.3.)

Theorem 2.7. Let $\vec{v}_1, \vec{v}_2, \dots, \vec{v}_m$ be (row) vectors in $\mathbb{R}^n$ and let $A$ be the $m \times n$ matrix
$$A = \begin{bmatrix} \vec{v}_1 \\ \vec{v}_2 \\ \vdots \\ \vec{v}_m \end{bmatrix}$$
with these vectors as its rows. Then $\vec{v}_1, \vec{v}_2, \dots, \vec{v}_m$ are linearly dependent if and only if $\operatorname{rank}(A) < m$.

Note that there is no linear system in Theorem 2.7 (although e.r.o.s must be used to reduce $A$ to r.e.f.; then $\operatorname{rank}(A)$ is the number of nonzero rows of this r.e.f.).

Proof. If $\vec{v}_1, \vec{v}_2, \dots, \vec{v}_m$ are linearly dependent, then by Theorem 2.5 one of these vectors is equal to a linear combination of the others. Swapping rows by a type 1 e.r.o. if necessary, we can assume that $\vec{v}_m = c_1\vec{v}_1 + \dots + c_{m-1}\vec{v}_{m-1}$. We can now kill off the $m$-th row by e.r.o.s
$$A \xrightarrow{R_m - c_1R_1} \xrightarrow{R_m - c_2R_2} \cdots \xrightarrow{R_m - c_{m-1}R_{m-1}};$$
the resulting matrix will have its $m$-th row consisting of zeros. Next, we apply e.r.o.s to reduce the submatrix consisting of the upper $m - 1$ rows to r.e.f. Clearly, together with the zero $m$-th row it will be an r.e.f. of $A$, with at most $m - 1$ nonzero rows. Thus, $\operatorname{rank}(A) \leq m - 1$.

Conversely (assumed without proof): the idea is that if $\operatorname{rank}(A) \leq m - 1$, then the r.e.f. of $A$ has a zero row at the bottom. Analysing the e.r.o.s that lead from $A$ to this r.e.f., one can show that one of the rows is a linear combination of the others; see the textbook, Example 2.25.

Row Method for deciding if vectors $\vec{v}_1, \dots, \vec{v}_m$ are linearly dependent. Form the matrix $A$ with rows $\vec{v}_i$ (even if originally you were given columns, just "lay them down", rotating by $90^\circ$ clockwise). Reduce $A$ by e.r.o.s to r.e.f.; the number of nonzero rows in this r.e.f. is $\operatorname{rank}(A)$. The vectors are linearly dependent if and only if $\operatorname{rank}(A) < m$. (Again: note that there is no linear system to solve here; there are no unknowns, and it does not matter if there is a bad row.)

Theorem on e.r.o.s and spans. E.r.o.s do not alter the span of the rows of a matrix. (Again: there is no linear system here, no unknowns.)

Proof. Let $\vec{v}_1, \vec{v}_2, \dots, \vec{v}_m$ be the rows of a matrix $A$, to which we apply e.r.o.s. Clearly, it is sufficient to prove that the span of the rows is not changed by a single e.r.o. Let $\vec{u}_1, \vec{u}_2, \dots, \vec{u}_m$ be the rows of the new matrix. By the definition of an e.r.o., every $\vec{u}_i$ is a linear combination of the $\vec{v}_j$ (most rows are even the same). Now, in every linear combination $c_1\vec{u}_1 + c_2\vec{u}_2 + \dots + c_m\vec{u}_m$ we can substitute those expressions of the $\vec{u}_i$ via the $\vec{v}_j$. Expanding brackets and collecting terms, this becomes a linear combination of the $\vec{v}_j$. In other words, $\operatorname{span}(\vec{v}_1, \dots, \vec{v}_m) \supseteq \operatorname{span}(\vec{u}_1, \dots, \vec{u}_m)$. By the Lemma on inverse e.r.o.s, the old matrix is also obtained from the new one by the inverse e.r.o. By the same argument, $\operatorname{span}(\vec{v}_1, \dots, \vec{v}_m) \subseteq \operatorname{span}(\vec{u}_1, \dots, \vec{u}_m)$. As a result, $\operatorname{span}(\vec{v}_1, \dots, \vec{v}_m) = \operatorname{span}(\vec{u}_1, \dots, \vec{u}_m)$.
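The Row Method described above can be sketched in a few lines. As before, this is an illustrative sketch rather than a definitive implementation: the names `reduce_rows` and `linearly_dependent` are hypothetical, vectors are given as lists and used directly as the rows of $A$, and exact fractions avoid rounding errors when counting nonzero rows.

```python
from fractions import Fraction

def reduce_rows(rows):
    """Bring the matrix with the given rows to a row echelon form."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    top = 0
    for col in range(n_cols):
        pivot = next((r for r in range(top, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[top], m[pivot] = m[pivot], m[top]
        for r in range(top + 1, len(m)):
            factor = m[r][col] / m[top][col]
            m[r] = [x - factor * y for x, y in zip(m[r], m[top])]
        top += 1
    return m

def linearly_dependent(vectors):
    """Row Method: dependent if and only if rank(A) < m (number of vectors)."""
    ref = reduce_rows(vectors)
    rank = sum(1 for row in ref if any(x != 0 for x in row))
    return rank < len(vectors)

print(linearly_dependent([[1, 2, 1], [1, 1, 3]]))     # False: not proportional
print(linearly_dependent([[1, 2, 1], [2, 4, 2]]))     # True: proportional
print(linearly_dependent([[2, 6], [1, 3], [4, 1]]))   # True: 3 vectors in R^2
```

The last call also illustrates Theorem 2.8: three vectors in $\mathbb{R}^2$ give a matrix of rank at most 2, which is less than $m = 3$.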