Solving linear equations on parallel distributed memory architectures by extrapolation

Christer Andersson

Abstract

Extrapolation methods can be used to accelerate the convergence of vector sequences. It is shown how three different extrapolation algorithms, the minimal polynomial extrapolation (MPE), the reduced rank extrapolation (RRE) and the modified minimal polynomial extrapolation (MMPE), can be used to solve systems of linear equations. The algorithms are derived and their equivalence to different Krylov subspace methods is established. The extrapolation algorithms are preferable on parallel distributed memory architectures since less inter-processor communication is needed. Numerically the extrapolation methods are not as stable as the Krylov subspace methods, since they require the solution of ill-conditioned overdetermined systems. Several techniques for improving convergence and stability are presented; some of these are new to the best of the author's knowledge. The use of regularization methods and a slightly modified stationary method have proved to be especially useful. Error bounds and methods of estimating accuracy are given. Some aspects of implementation are discussed, with emphasis on parallel distributed memory architectures. Implementations of RRE and GMRES are compared on an IBM RS/6000 Power Parallel SP. RRE has some difficulties converging to solutions with errors close to the floating point relative accuracy. For slightly larger tolerances, however, RRE is better than GMRES. RRE seems to be the most useful of the extrapolation algorithms, especially if it is used with the modified stationary method given here.

Contents

1 Introduction
2 Theoretical background
  2.1 Notation and common properties
  2.2 Minimal polynomial extrapolation
  2.3 Reduced rank extrapolation
  2.4 Modified minimal polynomial extrapolation
  2.5 Summary
  2.6 Extrapolation vs. Krylov subspace methods
3 Practical use and implementation
  3.1 Cycling
  3.2 Improving convergence
    3.2.1 Initial iterations
    3.2.2 Discarding generated vectors
    3.2.3 Choosing stationary method
    3.2.4 Using higher precision to solve the overdetermined linear system
    3.2.5 Normalization
  3.3 Regularization
    3.3.1 Solving overdetermined systems
    3.3.2 Truncated SVD regularization
  3.4 Computational complexity
4 Numerical experiments
  4.1 Test problems
  4.2 Choosing extrapolation method
  4.3 Empirical examination of techniques of improving convergence
    4.3.1 Initial iterations
    4.3.2 Discarding generated vectors
    4.3.3 Choosing stationary method
    4.3.4 Using higher precision to solve the overdetermined linear system
    4.3.5 Normalization
    4.3.6 Truncated SVD regularization
  4.4 Extrapolation methods compared to Krylov subspace methods
  4.5 Estimating the accuracy of the extrapolated solution
5 Mathematical results
  5.1 Error bounds
  5.2 The eigenvectors of G
  5.3 A heuristic argument for normalizing the solution of the overdetermined system
6 Parallel implementation
  6.1 Some important concepts
  6.2 Hardware and software
  6.3 Implementation
    6.3.1 Matrix-vector multiplication
    6.3.2 Forming and solving the normal equations
    6.3.3 Computing the residual
  6.4 Timing experiments
    6.4.1 MPI Allreduce
    6.4.2 Extrapolation methods
    6.4.3 Extrapolation methods compared to Krylov subspace methods
7 Conclusion

1 Introduction

Many problems in science and engineering at some point require the solution of a system of linear equations. There are several methods for computing the solution, the most well-known being standard Gaussian elimination. When choosing the algorithm to use one wishes to take advantage of any special properties the system might have. For larger systems this becomes increasingly important. Sparse linear systems constitute an important class of problems. For sparse systems a large number of the elements in the coefficient matrix are zero. Banded systems, where all the non-zero elements are located in a band along the diagonal, are of special interest since they arise from discretizing partial differential equations. Due to fill-in, Gaussian elimination does not preserve sparsity, i.e. the elimination process introduces non-zero elements. Thus sparse systems are often solved by iterative methods requiring only matrix-vector multiplications with the original matrix. Starting from some initial guess, iterative methods generate a sequence of approximations converging to the solution.

Extrapolation methods constitute a class of iterative methods used to accelerate the convergence of vector sequences. Starting from a vector sequence generated by a stationary iterative method, the reduced rank extrapolation (RRE), the minimal polynomial extrapolation (MPE) and the modified minimal polynomial extrapolation (MMPE) construct a better approximation from the information in the vector sequence. It is known that for linearly generated vector sequences the above methods are mathematically equivalent to well-known Krylov subspace methods such as the Arnoldi method, the Lanczos method and GMRES. Numerically the extrapolation methods are not as stable as the latter methods, since they require the solution of ill-conditioned overdetermined linear systems. The stability of this solution, and thus of the extrapolation methods, can be increased by using regularization methods or a slightly modified stationary method. The main reason for preferring extrapolation methods to Krylov subspace methods is their computational advantages on parallel distributed memory computers. They do not require the orthogonalization of vectors needed for the Krylov subspace methods. Thus the need for inter-processor communication is reduced.

The purpose of this Master's project is to investigate different extrapolation methods in terms of numerical stability and accuracy, as well as to compare this class of methods with some well known Krylov subspace methods. In section 2 the theoretical background for the extrapolation methods is briefly reviewed. The main focus is on the derivation of the methods. This section is based on the papers [17], [19], [21] and the book [14]. Practical usage and implementation of extrapolation methods are discussed in section 3. In particular, different approaches to improving stability and convergence are suggested. Some numerical experiments are described in section 4. From these experiments we conclude that the extrapolation methods are as efficient as Krylov subspace methods for small systems. Implementation on parallel distributed memory architectures is discussed in section 6. Parallel implementations of RRE and GMRES are compared for a few test problems. RRE has some difficulties converging to tolerances close to the floating point relative accuracy. For slightly larger tolerances RRE works well, indicating that it can be useful for solving large systems as well.
In terms of speed and parallelization properties RRE shows major advantages compared to GMRES.

2 Theoretical background

2.1 Notation and common properties

In this paper we are interested in solving sparse linear systems

    Ax = b    (2.1)

where A ∈ R^{N×N} is invertible. More specifically, an approximation to the solution will be constructed by accelerating a linear stationary iterative method of the form

    x_{j+1} = G x_j + f    (2.2)

where A = M − N is a splitting such that G = M^{−1}N and M^{−1}b exist (so that f = M^{−1}b). The class of iterative methods described by (2.2) includes the well known Jacobi and Gauss-Seidel methods. To accelerate the convergence of (2.2), G and f do not have to be known explicitly; the knowledge of the

sequence alone is sufficient. We will however assume that all eigenvalues of G are different from 1, so that a unique fixed point of (2.2) exists and that it is the solution to

    (I − G)s = f.    (2.3)

Comparing this to the original system of equations (2.1) we find that s is the solution to the preconditioned system M^{−1}As = (I − G)s = f = M^{−1}b.

If a sequence of k vectors has been generated starting from some initial guess x_0, we want to determine coefficients γ_j so that s can be expressed as a linear combination of the x_j,

    s = Σ_{j=0}^{k} γ_j x_j    (2.4)

where

    Σ_{j=0}^{k} γ_j = 1.    (2.5)

It turns out that if k is chosen to be at least the degree of the minimal polynomial of x_1 − x_0 with respect to G, i.e. the monic polynomial P_{k_0} of least degree k_0 such that P_{k_0}(G)(x_1 − x_0) = 0, then all extrapolation methods herein converge to the fixed point s. We will assume that k = k_0 in sections 2.2, 2.3 and 2.4 and return to the more general case in sections 2.6 and 3.1.

In the following sections we will derive the reduced rank extrapolation (RRE), the minimal polynomial extrapolation (MPE) and the modified minimal polynomial extrapolation (MMPE). For that purpose the following notation will be useful. Define

    u_j = Δx_j = x_{j+1} − x_j    (2.6)
    v_j = Δu_j = u_{j+1} − u_j    (2.7)
    U_n = [u_0 u_1 ... u_n]    (2.8)
    V_n = [v_0 v_1 ... v_n].    (2.9)

The difference between x_j and the solution is denoted by

    ε_j = x_j − s.    (2.10)

From (2.2), (2.6) and (2.7) it follows that

    u_{j+1} = G u_j = G^{j+1} u_0,  j ≥ 0    (2.11)

and

    v_j = (G − I) u_j,  j ≥ 0.    (2.12)

2.2 Minimal polynomial extrapolation

Using (2.10) and (2.5) the solution s to (I − G)x = f can be written

    s = Σ_j γ_j x_j = Σ_j γ_j (s + ε_j) = s + Σ_j γ_j ε_j.

Clearly the γ_j should be chosen such that Σ_{j=0}^{k} γ_j ε_j = 0 holds. The errors ε_j cannot be computed unless s is known. However, we have the following lemma.

Lemma 1. Let c_j be the coefficients of the polynomial P_k(t) = Σ_{j=0}^{k} c_j t^j such that P_k(G)u_0 = 0. Then Σ_{j=0}^{k} c_j ε_j = 0.

Proof. First,

    (G − I)ε_j = (G − I)(x_j − s) = (G x_j − x_j) − (G s − s) = ((x_{j+1} − f) − x_j) − ((s − f) − s) = x_{j+1} − x_j = u_j.

From (2.11) and the definition of the minimal polynomial it follows that

    0 = Σ_{j=0}^{k} c_j G^j u_0 = Σ_{j=0}^{k} c_j u_j = (G − I) Σ_{j=0}^{k} c_j ε_j,

and since (G − I) is of full rank this concludes the proof.

In other words, if the c_j are the coefficients of the minimal polynomial of u_0 with respect to G, then Σ_{j=0}^{k} c_j ε_j = 0. From this lemma it follows that γ_j = c_j satisfies Σ_j γ_j ε_j = 0. This relation is satisfied even if all the c_j are multiplied by a constant. To satisfy the constraint (2.5) we choose

    γ_j = c_j / Σ_{i=0}^{k} c_i,  j = 0, 1, ..., k,

provided that Σ_j c_j ≠ 0. It can be shown that this is always true when (I − G) is non-singular. Using (2.11) the minimal polynomial condition can be written

    P_k(G)u_0 = Σ_{j=0}^{k} c_j G^j u_0 = Σ_{j=0}^{k} c_j u_j = 0.

Since P_k is monic, c_k = 1 and thus

    Σ_{j=0}^{k−1} c_j u_j = −u_k.    (2.13)

Introducing the vector c = [c_0 c_1 ... c_{k−1}]^T this can be expressed as the overdetermined linear system

    U_{k−1} c = −u_k    (2.14)

where U_{k−1} ∈ R^{N×k}. Since k is the degree of the minimal polynomial it follows from the Cayley-Hamilton theorem [3] that k ≤ N. Thus the resulting system of equations is smaller than the original. In practice, for many problems the degree of the minimal polynomial is much less than N. An important question that needs to be addressed if we expect to find the solution to Ax = b is whether or not the system (2.14) is consistent. By definition of the minimal polynomial we know that (2.13) has a unique solution and hence (2.14) is consistent. Summarizing, we have the following algorithm.

Algorithm 1 (MPE).
1. Generate k + 1 vectors x_0, x_1, ..., x_k.
2. Compute U_{k−1} and u_k.
3. Solve U_{k−1} c = −u_k.
4. Set c_k = 1 and γ_j = c_j / Σ_i c_i.
5. s = Σ_j γ_j x_j.

This version of MPE differs somewhat from the algorithm originally presented by Cabay and Jackson [4]. We have chosen the simpler form given by Ford et al. in [21].
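As an illustration, Algorithm 1 can be sketched in a few lines of Python/NumPy. The sketch below is only a serial model of the method: one extra iterate is generated so that u_k can be formed, the overdetermined system is solved in the least-squares sense, and the diagonally dominant random test matrix with a Jacobi splitting is chosen purely as an example.

    import numpy as np

    def mpe(G, f, x0, k):
        # Generate x_0, ..., x_{k+1} with the stationary method x_{j+1} = G x_j + f.
        X = [x0]
        for _ in range(k + 1):
            X.append(G @ X[-1] + f)
        X = np.column_stack(X)                    # N x (k+2)
        U = X[:, 1:] - X[:, :-1]                  # first differences u_0, ..., u_k
        # Solve U_{k-1} c = -u_k in the least-squares sense and set c_k = 1.
        c, *_ = np.linalg.lstsq(U[:, :k], -U[:, k], rcond=None)
        c = np.append(c, 1.0)
        gamma = c / c.sum()                       # gamma_j = c_j / sum_i c_i
        return X[:, :k + 1] @ gamma               # s = sum_j gamma_j x_j

    # Example usage on a Jacobi splitting of a diagonally dominant test matrix.
    rng = np.random.default_rng(0)
    N = 50
    A = rng.standard_normal((N, N)) + N * np.eye(N)
    b = rng.standard_normal(N)
    D_inv = 1.0 / np.diag(A)
    G = np.eye(N) - D_inv[:, None] * A            # G = I - D^{-1} A
    f = D_inv * b                                 # f = D^{-1} b
    s = mpe(G, f, np.zeros(N), k=10)
    print(np.linalg.norm(b - A @ s))              # residual after one extrapolation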

2.3 Reduced rank extrapolation

When determining the γ_j for MPE we solved

    U_k [c_0 ... c_{k−1} c_k]^T = 0    (2.15)

subject to

    c_k = 1.    (2.16)

The c_j were then scaled to obtain γ_j satisfying Σ_j γ_j = 1. Since γ = [γ_0 γ_1 ... γ_k]^T also satisfies U_k γ = 0, we might ask ourselves what would happen if the constraint were changed from c_k = 1 to Σ_j c_j = 1. The resulting system would be

    U_k γ = 0  subject to  Σ_j γ_j = 1.    (2.17)

This approach also yields the solution. Consider the residual

    r = f − (I − G)x = (Gx + f) − x.

Substituting the solution Σ_j γ_j x_j for x yields r = f − Σ_j γ_j (I − G)x_j, and with (2.17) we find

    r = Σ_j γ_j ((G x_j + f) − x_j) = Σ_j γ_j u_j.    (2.18)

Since the residual is zero for the solution, the γ_j can be found by solving (2.17). This extrapolation approach is called reduced rank extrapolation (RRE).

Algorithm 2 (RRE).
1. Generate k + 1 vectors x_0, x_1, ..., x_k.
2. Compute U_k.
3. Solve U_k γ = 0 subject to Σ_j γ_j = 1.
4. s = Σ_j γ_j x_j.

Even though first and second differences are used in the alternative form of RRE, it does not require any additional storage. There is no need to store the generated vectors once U_k has been computed. The memory allocated for x_1, ..., x_k can be used to store v_0, ..., v_{k−1}. The form above is sometimes called Mešina's algorithm [13]; it was chosen to emphasize the similarities with MPE, the only difference being the constraint chosen when the overdetermined system is solved. The form of RRE usually found in the literature is somewhat different, see [21]. It does not involve a constraint and the overdetermined system is formed using second instead of first differences, see Algorithm 3.

Algorithm 3 (Alternative RRE).
1. Generate k + 1 vectors x_0, x_1, ..., x_k.
2. Compute U_k and V_{k−1}.
3. Solve V_{k−1} η = −u_0.
4. s = x_0 + Σ_{j=0}^{k−1} η_j u_j.
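A corresponding Python/NumPy sketch of Algorithm 3; as in the MPE sketch above, one extra iterate is generated so that all needed differences can be formed, and the overdetermined system is solved in the least-squares sense.

    import numpy as np

    def rre_alternative(G, f, x0, k):
        # Generate x_0, ..., x_{k+1} so that u_0, ..., u_k and v_0, ..., v_{k-1} exist.
        X = [x0]
        for _ in range(k + 1):
            X.append(G @ X[-1] + f)
        X = np.column_stack(X)
        U = X[:, 1:] - X[:, :-1]                  # u_j = x_{j+1} - x_j
        V = U[:, 1:] - U[:, :-1]                  # v_j = u_{j+1} - u_j
        eta, *_ = np.linalg.lstsq(V, -U[:, 0], rcond=None)   # V_{k-1} eta = -u_0
        return x0 + U[:, :k] @ eta                # s = x_0 + sum_j eta_j u_j

Only the differences are needed once they have been formed, in line with the storage remark above.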

The derivation of the alternative RRE is straightforward, see for example [21], and will not be given here. However, we show that it is equivalent to Algorithm 2. The expression for the solution in the alternative RRE has the form

    s = x_0 + Σ_{j=0}^{k−1} η_j u_j = (1 − η_0)x_0 + Σ_{j=1}^{k−1} (η_{j−1} − η_j)x_j + η_{k−1}x_k = Σ_{j=0}^{k} γ_j x_j.

Thus it is evident that γ can be expressed using η,

    γ = e_1 + S η,   S ∈ R^{(k+1)×k},    (2.19)

where S is the bidiagonal matrix with −1 on its main diagonal and +1 on its first subdiagonal. Inserting this in (2.17), we obtain

    0 = U_k γ = u_0 + U_k S η = u_0 + V_{k−1} η.    (2.20)

The constraint is satisfied since

    Σ_{j=0}^{k} γ_j = (1 − η_0) + Σ_{j=1}^{k−1} (η_{j−1} − η_j) + η_{k−1} = 1,

so Algorithms 2 and 3 are mathematically equivalent. However, for practical purposes the algorithms show different numerical behaviour. In most cases the alternative RRE seems to be the most efficient. This does not apply to slowly converging sequences. Due to cancellation, significant digits will be lost when computing U_k and even more when computing V_{k−1}. For a slowly converging sequence the coefficient matrix is very inaccurate, which results in inaccurate η_j. For RRE (Algorithm 2) we encounter other difficulties due to the constraint Σ_j γ_j = 1. The constraint does not specify an upper bound for |γ_j|. Often the γ_j are large with alternating signs, which may produce large round-off errors when the solution is computed. |γ_j| seems to be roughly proportional to j. This is consistent with the notion that the last vectors generated should be the most accurate.

2.4 Modified minimal polynomial extrapolation

Modified minimal polynomial extrapolation first appeared in [19], where a general procedure for deriving extrapolation methods using the Shanks-Schmidt transform [15] was given. The starting point is equation (2.13),

    Σ_{j=0}^{k−1} c_j u_j = −u_k.    (2.21)

Instead of solving this system in the least-squares sense, we introduce k linearly independent bounded linear functionals Q_i, i = 1, ..., k. Applying them to (2.21) yields

    Σ_{j=0}^{k−1} c_j Q_i(u_j) = −Q_i(u_k),  i = 1, ..., k,    (2.22)

which is a k × k linear system. Choosing Q_i(y) ≡ (e_i, y), where e_i is the i:th standard basis vector, (2.22) is just the first k equations of (2.21). Since (2.21) is consistent this is enough to find the unique solution to (2.21); in fact, any k equations will do. For k < k_0 equation (2.21) is no longer consistent and the Q_i must be

chosen more carefully. This will be discussed in section 4.2. In practice we represent the functionals with a matrix Q ∈ R^{N×k} and solve the k × k system

    Q^T U_{k−1} c = −Q^T u_k.

For completeness we also formulate the modified minimal polynomial extrapolation algorithm (MMPE).

Algorithm 4 (MMPE).
1. Generate k + 1 vectors x_0, x_1, ..., x_k.
2. Compute U_{k−1} and u_k.
3. Choose k linearly independent bounded linear functionals Q_1, ..., Q_k.
4. Solve Σ_{j=0}^{k−1} c_j Q_i(u_j) = −Q_i(u_k), i = 1, ..., k.
5. Set c_k = 1 and γ_j = c_j / Σ_i c_i.
6. s = Σ_j γ_j x_j.

2.5 Summary

All three methods, MPE, RRE and MMPE, solve the system U_k γ = 0 subject to the constraint Σ_j γ_j = 1. For RRE the system is solved directly, using for example Lagrange relaxation [18]. In section 2.3 we saw that the residual can be written U_k γ, and thus the resulting algorithm minimizes the L_2-norm of the residual. If the last column of U_k is moved to the right-hand side and U_{k−1}c = −u_k is solved instead, we obtain MPE by introducing c_k = 1 and normalizing the c_j. The solution to U_{k−1}c = −u_k can be found by multiplying with U_{k−1}^T from the left and solving the normal equations. If the system is instead multiplied with a matrix Q^T of rank k, Q^T U_{k−1} c = −Q^T u_k, we have MMPE.

All algorithms discussed so far belong to the family of polynomial extrapolation methods. There is another family of methods known as the epsilon algorithms, including the scalar epsilon algorithm (SEA), the vector epsilon algorithm (VEA) and the topological epsilon algorithm (TEA) [21]. They are based on recursive formulas and hence they are more difficult to implement efficiently on parallel architectures. Even on serial computers they have a major drawback, since a sequence of 2k vectors is required to find the solution instead of the k + 1 vectors needed for the polynomial methods. An overview of the epsilon algorithms can be found in [21].

2.6 Extrapolation vs. Krylov subspace methods

Krylov subspace methods (or Krylov methods for short) are examples of iterative projection methods. In general, projection methods seek an approximate solution to Ax = b satisfying

    x_m ∈ x_0 + K_m,    b − A x_m ⊥ L_m,

where K_m and L_m are subspaces of dimension m; see for example the exposition in [14]. Krylov methods are based on projection onto Krylov subspaces,

    K_m(A, r_0) = span{r_0, A r_0, A² r_0, ..., A^{m−1} r_0},

where r_0 is the residual of the initial guess, r_0 = b − A x_0. By choosing L_m differently one obtains different methods. In general there are several mathematically equivalent ways to formulate these methods. The Arnoldi method (or the Full Orthogonalization Method, FOM) is an orthogonal projection method with L_m(A, r_0) = K_m(A, r_0). The approximation x_m computed in iteration m is the

projection of the solution onto K_m. GMRES takes L_m(A, r_0) = A K_m(A, r_0). This results in a method that minimizes the L_2-norm of the residual in every iteration. Further information on Krylov methods can be found in [14].

One important property of the Krylov methods is that in exact arithmetic a solution belonging to R^N is found in no more than N iterations. More specifically, the necessary number of iterations is equal to the degree of the minimal polynomial of r_0 with respect to A. To establish equivalence with the extrapolation methods, the Krylov methods will be applied to the system (I − G)x = f and thus r_0 = u_0. Using the notation introduced for the extrapolation methods we know that the solution will be found in no more than k_0 iterations. This suggests that performing k iterations with some Krylov method is equivalent to extrapolating with k vectors.

Theorem 1. RRE and GMRES are equivalent when applied to the system (I − G)x = f.

Proof. To prove equivalence we will show that the extrapolated solution lies in x_0 + K_k and that its residual is orthogonal to L_k. We have

    K_k(I − G, u_0) = span{u_0, (I − G)u_0, ..., (I − G)^{k−1}u_0} = span{u_0, Gu_0, ..., G^{k−1}u_0} = span{u_0, u_1, ..., u_{k−1}}.    (2.23)

For the alternative RRE, s = x_0 + Σ_{j=0}^{k−1} η_j u_j obviously lies in x_0 + K_k. To show that the residual is orthogonal to L_k we use (2.12),

    L_k = (I − G) span{u_0, u_1, ..., u_{k−1}} = span{v_0, v_1, ..., v_{k−1}}.

For r_k = f − (I − G)s we have

    (V_{k−1})^T (f − (I − G)s) = (V_{k−1})^T (V_{k−1}η + u_0) = V_{k−1}^T V_{k−1}η + V_{k−1}^T u_0 = 0

by equation (2.20). Thus we have shown r_k ⊥ L_k.

Theorem 2. MPE and the Arnoldi method are equivalent when applied to the system (I − G)x = f.

Proof. By using the formulation of MPE by Cabay and Jackson [4] it is easy to show that the extrapolated solution belongs to x_0 + K_k, in the same way as in the proof of Theorem 1. For r_k = f − (I − G)s we have

    (U_{k−1})^T (f − (I − G)s) = (1/Σ_j c_j)(U_{k−1}^T U_{k−1}c + U_{k−1}^T u_k) = 0

by equation (2.14). Thus we have shown r_k ⊥ L_k.

It is also possible to show equivalence between the topological epsilon algorithm (TEA) and the Lanczos method. The theorems above make no assumptions on k and are valid for k ≠ k_0. For RRE (and GMRES) we know that choosing k less than k_0 results in an algorithm minimizing the L_2-norm of the residual. We conclude this section by showing that the error in the solution found by using MPE (or the Arnoldi method) is orthogonal to the k dominant components of the error. From the proof of Lemma 1 we have u_i = (G − I)ε_i. With L_k defined as for the Arnoldi method we find

    L_k = span{u_0, u_1, ..., u_k} = (I − G) span{ε_0, ε_1, ..., ε_k}.

The residual of the extrapolated solution, f − (I − G)s̃, is orthogonal to this subspace. Since (I − G) is non-singular we multiply by (I − G)^{−1} from the left to obtain

    (I − G)^{−1}f − s̃ = s − s̃ ⊥ span{ε_0, ε_1, ..., ε_k}.
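Theorem 1 is easy to check numerically. The following small Python/NumPy experiment (with an arbitrary contractive G chosen only for illustration) verifies that the residual of the alternative-RRE extrapolant is orthogonal to L_k = span{v_0, ..., v_{k−1}} and that the extrapolant coincides with the minimum-residual solution over x_0 + K_k, i.e. the GMRES-type solution.

    import numpy as np

    rng = np.random.default_rng(1)
    N, k = 40, 6
    G = 0.4 * rng.standard_normal((N, N)) / np.sqrt(N)   # spectral radius well below 1
    f = rng.standard_normal(N)
    x0 = np.zeros(N)

    # Vector sequence, first and second differences.
    X = [x0]
    for _ in range(k + 1):
        X.append(G @ X[-1] + f)
    X = np.column_stack(X)
    U = X[:, 1:] - X[:, :-1]
    V = U[:, 1:] - U[:, :-1]

    # Alternative-RRE extrapolant and its residual.
    eta, *_ = np.linalg.lstsq(V, -U[:, 0], rcond=None)
    s_rre = x0 + U[:, :k] @ eta
    r = f - (np.eye(N) - G) @ s_rre

    # Minimum-residual solution over x_0 + span{u_0, ..., u_{k-1}} = x_0 + K_k(I - G, u_0).
    B = (np.eye(N) - G) @ U[:, :k]
    y, *_ = np.linalg.lstsq(B, f - (np.eye(N) - G) @ x0, rcond=None)
    s_gmres_like = x0 + U[:, :k] @ y

    print(np.linalg.norm(V.T @ r))                 # ~ 0: r_k is orthogonal to L_k
    print(np.linalg.norm(s_rre - s_gmres_like))    # ~ 0: RRE matches the GMRES-type solution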

3 Practical use and implementation

The implementation of extrapolation methods consists of two separate parts, the implementation of the sequence generator and the implementation of the extrapolation process. These two parts can be implemented independently of each other and require different computational kernels. Since we are concerned with linear sequence generators given by (2.2), the sequence can be generated using only matrix-vector multiplications and saxpys, z ← αx + y. Both the matrix-vector multiplication and the saxpy operation can be found optimized in numerical libraries like BLAS [7]. For non-linear sequence generators other kernels may be of interest.

In the extrapolation process an overdetermined N × k linear system is formed and solved. We would like to find the solution by solving the normal equations, since they are easy to implement efficiently on both serial and parallel architectures. This requires routines for matrix-matrix multiplication and solution of linear systems, available in the BLAS [7] and LAPACK [1] libraries. There are some numerical difficulties associated with the normal equations that will be discussed in section 3.3.1.

Implementation on serial computers is straightforward and follows the algorithms outlined in the previous section, with some modifications discussed in section 3.1. Some techniques for improving the convergence of the extrapolation methods are given in section 3.2. Regularization methods for improving the stability of the solution of the overdetermined systems are considered in section 3.3. Finally, the arithmetic complexities of the extrapolation and Krylov methods are compared in section 3.4. On parallel computers some of the operations require special treatment, see the discussion in section 6.

3.1 Cycling

Extrapolation methods are used to accelerate the convergence of a stationary iterative method. The objective is to find the solution to (I − G)x = f faster than we would using only the stationary method. Since we are using finite arithmetic we do not expect to compute the exact solution; the objective is to find a solution that is accurate enough. For this purpose it is usually not optimal to choose k equal to k_0 (if we should happen to know it). A process called cycling is used instead.

Algorithm 5 (Cycling).
1. Choose an initial vector x_0.
2. Generate k + 1 vectors starting from x_0.
3. Extrapolate to find an approximate solution s̃.
4. If s̃ is accurate enough, stop. Otherwise set x_0 = s̃ and go to 2.

Cycling does not provide a way of choosing k, but makes it possible to find an accurate solution without knowing k_0 and hopefully without having to generate a large number of vectors. It also has other advantages. The size of the overdetermined linear system increases with k and so does the time required to solve it. If k_0 is large, solving this full system may be more time consuming than solving the original sparse system Ax = b. For a larger k one can in general expect faster convergence. By using cycling we can choose k such that the work and storage needed to solve the overdetermined system do not become the dominating part, and yet at the same time achieve reasonable convergence. For GMRES and the Arnoldi method it is not necessary to know k_0. If suitable algorithmic structures are used we can check whether k_0 = i at each iteration i. We could use a similar approach for extrapolation methods and compute a solution for each new vector generated. This is very time consuming and we usually prefer cycling.
Cycling can also be used with Krylov methods, where it is often called restarting. Numerical experiments indicate that k can be chosen somewhat arbitrarily with good results. For systems with 100 or more unknowns, a value between 10 and 40 will usually do. The condition number of the overdetermined system increases with k, so a higher k does not always imply faster convergence.
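A cycling driver along the lines of Algorithm 5 can be sketched as follows; the stopping test on the residual of the original system and the default limits are only illustrative choices, and extrapolate stands for any of the extrapolation sketches given earlier.

    import numpy as np

    def cycle(extrapolate, G, f, A, b, x0, k, tol=1e-10, max_cycles=100):
        # Restart the extrapolation from the previous extrapolant until the
        # residual of the original system Ax = b is small enough (Algorithm 5).
        x = x0
        for n in range(1, max_cycles + 1):
            x = extrapolate(G, f, x, k)
            res = np.linalg.norm(b - A @ x)
            if res < tol:
                break
        return x, n, res

    # e.g.  x, n_cycles, res = cycle(rre_alternative, G, f, A, b, np.zeros(N), k=20)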

3.2 Improving convergence

There are several ways of improving the convergence of extrapolation methods. Five techniques will be presented. The first three techniques concern the convergence of the stationary method. The fourth is used to obtain a more accurate solution to the system of equations. The last technique attempts to decrease the error after the extrapolated solution has been computed. Numerical experiments with these techniques are presented in section 4.3. In section 4.4 an alternative method of generating the vectors that span a Krylov subspace, which seems to improve the stability of RRE, is discussed.

3.2.1 Initial iterations

The eigenvectors of G are of great importance to the stationary method. To see why, we assume that G is diagonalizable. Then the eigenvectors p_i of G ∈ R^{N×N} form a basis of R^N. The initial error, ε_0, can be written as a linear combination of the eigenvectors,

    ε_0 = Σ_{i=0}^{N−1} a_i p_i,

and thus

    G^n ε_0 = Σ_{i=0}^{N−1} a_i λ_i^n p_i.

If the largest of the |λ_i|, the spectral radius of G, is less than one, the stationary method is convergent; otherwise

    lim_{n→∞} G^n ε_0 ≠ 0.

From

    x_{j+1} = G x_j + f = G(s + ε_j) + f = s + G ε_j

we have ε_j = G ε_{j−1}, and thus the sequence is divergent if any |λ_i| > 1. Extrapolation methods can be used to find the fixed point (or anti-limit) of diverging sequences, but converging sequences are of greater interest.

If some of the eigenvalues are very small, the components of the error along the corresponding eigenvectors vanish after a few iterations. This can be used to improve convergence. If k + 1 + n vectors are generated instead of k + 1, we can extrapolate using x_n, x_{1+n}, ..., x_{k+1+n}, thus eliminating contributions to the error from some eigenvectors. The first n iterations are called initial iterations. In [16] the relation

    ||s_n − s|| = O(|λ_{k+1}|^n)  as n → ∞

is established, where s_n denotes the extrapolated solution when n initial iterations are performed. It is assumed that the eigenvalues of G are ordered according to magnitude so that |λ_1| ≥ |λ_2| ≥ ... ≥ |λ_k| ≥ ..., and that |λ_k| > |λ_{k+1}|. For a converging sequence, initial iterations improve the extrapolated solution asymptotically. Further analysis of the effect of initial iterations can be found in [20], where a thorough theoretical explanation is given.

3.2.2 Discarding generated vectors

Initial iterations can be used to improve convergence when there are eigenvalues close to zero, but they have little effect on eigenvalues with magnitude close to one. Assuming a convergent sequence (all |λ_i| < 1) where some of the eigenvalues are close to 1, we find for the corresponding eigenvectors p_i that

    G^n p_i = λ_i^n p_i ≈ p_i

even for large n. If the other eigenvalues are small we will encounter difficulties due to cancellation (the loss of significant digits when subtracting two almost equal numbers) when computing the first differences. To deal with this problem we must somehow "lower" the magnitude of the larger eigenvalues. This can be done by discarding some of the generated vectors. Instead of generating k + 1 vectors we generate q(k + 1) vectors and extrapolate using every q-th vector, i.e. x_0, x_q, ..., x_{q(k+1)}. It is equivalent to using the stationary method

    x_{j+1} = G^q x_j + Σ_{i=0}^{q−1} G^i f.

Since the eigenvalues of G^q are λ_i^q we have "lowered" the magnitude of the eigenvalues. If we have complex eigenvalues with magnitudes close to one this becomes even more important, because of the oscillations in the vectors generated by the stationary method.

3.2.3 Choosing stationary method

The better convergence the underlying stationary method has, the better the convergence of the extrapolation method will be. In this work we have for simplicity used the Jacobi method, which requires only a matrix-vector multiplication and a vector addition to compute a vector in the sequence. Other methods could of course be used as well. If Jacobi gives unsatisfactory convergence it is natural to consider the Gauss-Seidel method. For the Gauss-Seidel method a triangular system of the same size as A must be solved for each vector in the sequence, which makes it more expensive than the Jacobi method. The question is how many fewer cycles are needed if we use a more efficient stationary method, and whether the time gained there is enough to justify a more expensive sequence generator. The Gauss-Seidel method involves solving a system of linear equations for every generated vector, which makes it difficult to implement on parallel computers. An alternative method that is discussed in [18] is the relaxation process

    x_{j+1} = (1 − ω)x_j + ω(G x_j + f),  0 < ω ≤ 1.

3.2.4 Using higher precision to solve the overdetermined linear system

In all extrapolation methods discussed here the extrapolated solution is constructed as a linear combination of successive stationary iterations. The coefficients in this linear combination are determined by solving an overdetermined linear system. This system is built from first or second differences of the successive approximations. It can thus be expected that, at least when one is close to the solution, this linear system will be ill-conditioned. The accuracy of the solution to a system of linear equations is largely dependent on the condition number of the system (see section 3.3.1). For ill-conditioned systems we might have to use higher precision to obtain the desired accuracy in the solution. In a programming language like C or FORTRAN this is easy; long double is used instead of double (C). If this is not possible (in Matlab for example) an alternative approach must be used. Dekker proposed the following algorithm for splitting a floating point number a into two halves, which can be used to extend the available precision [5].

Algorithm 6 (Dekker splitting).
1. γ = base^⌈digits/2⌉ + 1
2. τ = γ a, δ = τ − a
3. a_u = τ − δ  (calculate upper half)
4. a_l = a − a_u  (calculate lower half)
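In Python the splitting, and the classical way of using the two halves to obtain an exact product a·b = p + e, can be sketched as below; the constant 2^27 + 1 assumes base 2 and 53 significand bits, i.e. IEEE double precision.

    def split(a, splitter=2.0**27 + 1.0):
        # Dekker/Veltkamp splitting: a_u + a_l == a exactly, and each half
        # fits in roughly half of the significand bits.
        tau = splitter * a
        a_u = tau - (tau - a)      # upper half
        a_l = a - a_u              # lower half
        return a_u, a_l

    def two_product(a, b):
        # Exact product a*b = p + e computed from the split halves.
        p = a * b
        a_u, a_l = split(a)
        b_u, b_l = split(b)
        e = ((a_u * b_u - p) + a_u * b_l + a_l * b_u) + a_l * b_l
        return p, e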

The new variables are used to store terms of different size. This reduces the contributions from round-off errors. Dekker's algorithm has been used for numerical testing in Matlab. Before discussing the results we look at an example of how to use the new variables. The example below describes a matrix-vector multiplication, y ← Ax.

    Compute A_u, A_l, x_u and x_l
    y_u = A_u x_u
    y_l = A_u x_l + A_l x_u
    y_err = A_l x_l
    y = y_l + y_err
    y = y + y_u

If the higher precision is to have any impact it must be used in all relevant operations. To solve the overdetermined system using the normal equations and compute the extrapolated solution, this means extended precision has to be used in:

    forming the normal equations
    solving the normal equations
    computing the extrapolated solution
    computing the residual

Extended double precision has been used to solve the overdetermined systems for both MPE and RRE. It does not affect the solution in most cases. It can have some effect when the relative error in the extrapolated solution is close to the floating point relative accuracy of the computer.

3.2.5 Normalization

A necessary condition for finding the solution to (I − G)x = f using extrapolation methods is that Σ_j γ_j = 1, see sections 2.2, 2.3 and 2.4. In finite arithmetic we do not expect to compute the exact γ_j but an approximation γ̃_j = γ_j + e_j. With this in mind it is likely that

    Σ_j γ̃_j = 1 + Σ_j e_j ≡ 1 + E,

where E is small but different from zero. One way of making sure the constraint is satisfied is to divide the γ̃_j by 1 + E. From numerical experiments we know that the magnitude of E can be used to predict the accuracy of the extrapolated solution when we are close to the solution. In section 5.3 a heuristic argument for doing this normalization will be presented. Normalizing does not always have a positive effect. Normalization eliminates one component of the error (equations (5.2) and (5.3)) but introduces another. In section 5.3 it is shown that the error in the extrapolated solution can be eliminated when the residual is small. The norm of the residual can be used to determine when to normalize. Another criterion is to examine E, or perhaps to study E and the norm of the residual at the same time. In practice these three criteria seem to be equally effective.

3.3 Regularization

Some of the overdetermined systems that appear when we use extrapolation methods are examples of ill-posed problems. For ill-posed problems small perturbations in the data may cause large variations in the solution. To avoid this we introduce a well-posed approximation to the original problem, thus increasing the stability of the solution. This is the basic principle of regularization [10].

3.3.1 Solving overdetermined systems

Both RRE and MPE produce overdetermined systems that must be solved in order to find the extrapolated solution. If we have k < k_0 these systems are inconsistent; all equations cannot be satisfied at the same time. Instead the solution is computed in the least-squares sense. To compute this solution we can use a number of algorithms. The simplest of these is to solve the normal equations. From a parallel point of view the normal equations are superior to other algorithms for solving least-squares problems. Unfortunately it is not a very stable algorithm. Golub and Van Loan present an upper bound for the relative error in the solution of Ax = b [8] that grows with the condition number of the system, κ(A):

    ||x − x̃||_∞ / ||x||_∞ ≤ 4 u κ_∞(A)    (3.1)

where u is the floating point relative accuracy. For a square matrix the condition number is the ratio of the largest to the smallest singular value (for a symmetric matrix, the ratio of the largest to the smallest eigenvalue in magnitude). When the normal equations are used to solve the overdetermined system the condition number is squared. Since the systems that appear when extrapolation methods are used often have large condition numbers (sometimes as large as 10^6 – 10^7), we cannot expect the computed γ_j to be very accurate.

QR-factorization [8] is a better algorithm from a numerical point of view. If the overdetermined system is denoted Cx ≈ d, the QR-factorization of C is C = QR, where Q is orthogonal and R is upper triangular. The solution is found by solving the triangular system

    Rx = Q^T d.

More operations and storage are needed if QR-factorization is used, but since only orthogonal transformations are used the condition number of R is the same as the condition number of C, giving a more stable and accurate process. For comparison, the overdetermined systems were solved in Matlab using both QR-factorization and the normal equations. As expected the QR-factorization results in more accurate solutions, but it seems that the normal equations are to be preferred. To see why we recall the cycling algorithm. In every cycle we solve a system of equations. Since QR-factorization is much more time-consuming, it is necessary to reduce the number of needed cycles significantly for the implementation using QR-factorization to be more efficient. Such a reduction in the number of cycles has not yet been seen in our numerical experiments. More importantly, when implementing extrapolation methods in a parallel computer environment we want, for parallelization efficiency reasons, to avoid using a global QR-factorization. Instead we want to form the normal equations C^T C in parallel and then solve this small k × k linear system sequentially on every processor. An alternative to using QR-factorization is to use a regularization method to stabilize the normal equations. To illustrate this concept we use a regularization technique based upon the singular value decomposition.

3.3.2 Truncated SVD regularization

The eigenvalues are an important tool for analysing problems involving square matrices. For non-square matrices we introduce singular values, which can be seen as a generalization of eigenvalues. For symmetric positive semidefinite matrices the eigenvalues and the singular values coincide. The singular values are associated with the matrix decomposition given in Theorem 3. Since this is not a text on numerical linear algebra, only the concepts necessary for understanding regularization will be presented. Further information can be found in [8], from which the following theorem was taken.
Theorem 3 (Singular value decomposition (SVD)). If A is a real m-by-n matrix then there exist orthogonal matrices

    U = [u_1 ... u_m] ∈ R^{m×m}  and  V = [v_1 ... v_n] ∈ R^{n×n}

such that

    U^T A V = diag(σ_1, ..., σ_r) ∈ R^{m×n},  r = min{m, n},

where σ_1 ≥ σ_2 ≥ ... ≥ σ_r ≥ 0.

Proof. See the proof in [8].

With the SVD it is possible to define a pseudo-inverse that can be used to find the solution to linear least-squares problems. Introducing

    Σ = diag(σ_1, ..., σ_r) ∈ R^{m×n},  r = min{m, n},

and

    Σ^+ = diag(σ_1^{−1}, ..., σ_r^{−1}) ∈ R^{n×m},  r = min{m, n},

C can be written

    C = U Σ V^T = Σ_{i=1}^{r} σ_i u_i v_i^T.

Using the orthogonality of U and V and the fact that Σ^+ Σ = I_{r×r}, we define C^+ = V Σ^+ U^T and the solution to Cx ≈ d as

    x = V Σ^+ U^T d = Σ_{i=1}^{r} (u_i^T d / σ_i) v_i.    (3.2)

It can be shown that this is the least-squares solution. The matrix V Σ^+ U^T is often referred to as the Moore-Penrose generalized inverse. An important property of the singular value decomposition is that it can be used to find the closest rank-deficient approximation of a matrix. If we set

    C_p = Σ_{i=1}^{p} σ_i u_i v_i^T,  p < r,

then C_p is the best approximation of rank p to C in the sense that ||C − C_p||_2 is minimized. Furthermore we have ||C − C_p||_2 = σ_{p+1}.

The characterization of ill-posed linear least-squares problems can now be stated (as given in [10]). The problem Cx ≈ d is said to be ill-posed if

  - the singular values of C decay gradually to zero,
  - the ratio between the largest and the smallest nonzero singular value is large.

The singular values of these systems often decrease gradually to zero without any distinct drop in magnitude. Another characteristic is that u_i and v_i have more sign changes in their elements as i increases. Considering the solution to the overdetermined system in (3.2), we notice that the solution is dominated by the terms corresponding to the small singular values. The many sign changes in the v_i will also be seen in the solution. Regularization is used to damp the contribution from the small singular values by truncating formula (3.2):

    x_reg = Σ_{i=1}^{p} (u_i^T d / σ_i) v_i.
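Formula (3.2) and its truncated counterpart can be sketched in a few lines of Python/NumPy; the parameter max_cond, which discards singular values with σ_1/σ_i above a largest permitted condition number, is an illustrative choice anticipating the discussion below.

    import numpy as np

    def tsvd_lstsq(C, d, max_cond=1e8):
        # Least-squares solution of C x ~ d via the SVD, truncating (3.2) after
        # the p singular values that satisfy sigma_1 / sigma_i <= max_cond.
        U, sigma, Vt = np.linalg.svd(C, full_matrices=False)
        keep = sigma > sigma[0] / max_cond
        return Vt[keep].T @ ((U[:, keep].T @ d) / sigma[keep])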

With regularization the contribution to the error due to perturbations in d is reduced, but a new error due to the regularization appears. The original problem has been replaced:

    min_x ||Cx − d||_2  is replaced by  min_x ||C_p x − d||_2.

C has been replaced with the closest approximation of rank p. This is called truncated SVD regularization (TSVD). TSVD can also be motivated by looking at the condition number of the coefficient matrix. From the characterization of an ill-posed least-squares problem we know that the ratio between the largest and the smallest singular value is large. Since the condition number of C can be written σ_1/σ_r, the condition number is large as well. By using regularization the condition number of the coefficient matrix decreases. A different problem is solved, but the contributions from round-off errors are reduced. One difficulty with regularization is to balance the perturbation and regularization errors; whether or not regularization has any positive effect depends solely on how well this is done. To balance the errors an appropriate number of singular values must be discarded. The easiest way of doing this is to introduce a largest permitted condition number, τ. If the quotient σ_1/σ_i is larger than τ, σ_i is discarded. This approach has been used successfully to stabilize the solution of the normal equations and improve the convergence of the extrapolation methods.

3.4 Computational complexity

One way of measuring the efficiency of an algorithm is to count the number of operations required. This gives an estimate of the computing time. Here we will give the complexity for the extrapolation methods as well as some Krylov methods. The complexity of the Krylov methods was taken from [2] and is computed for methods based on modified Gram-Schmidt orthogonalization. For other orthogonalization methods, see [2] or [14]. The complexity in terms of vector operations is given in Table 1. Saxpy denotes the number of vector updates, z ← αx + y, and Matvec the number of matrix-vector multiplications with the original system matrix. All methods involve solving a system of linear equations. For the Krylov methods the coefficient matrix is a Hessenberg matrix, i.e. an upper triangular matrix with one additional subdiagonal. The extrapolation methods require the solution of an overdetermined system. It is assumed that this system is solved using the normal equations. These linear systems are small compared to the original problem; they are of size k × k or (k + 1) × (k + 1).

    Method              Inner products   Saxpy         Matvec   Linsys
    MPE                 (k² + 2k)/2      2k + 3        k        k × k (normal equations of an N × k system)
    RRE                 (k + 1)²/2       2k + 3        k        (k+1) × (k+1) (normal equations of an N × (k+1) system)
    Alternative RRE     (k + 1)²/2       3k + 3        k        k × k (normal equations of an N × k system)
    MMPE                0                2k + 2        k        k × k
    GMRES               (k² + 3k)/2      (k² + 3k)/2   k        Hessenberg
    The Arnoldi method  (k² + 3k)/2      (k² + 3k)/2   k        Hessenberg

    Table 1: Complexity

From Table 1 we see that the complexity is roughly the same for all methods. The only exception is MMPE, which does not require any inner products. The inner products of the other extrapolation methods come from forming the normal equations, and thus they are not needed for MMPE. The difference lies in the number of synchronization points needed for a parallel implementation. A synchronization point is a point in the program where all processors must have completed their tasks before the program can continue. This means that at each synchronization point all

processors must wait for the processor that requires the longest time to complete its task. One of the advantages of extrapolation methods over Krylov methods is that they require fewer synchronization points, as seen in Table 2. Instead of having GMRES's 2k synchronization points per cycle, we only have one (or two when using MMPE). For MPE and RRE one synchronization point is needed to form the normal equations. MMPE requires one synchronization point to compute the solution to the linear system and one to compute the residual. The number of synchronization points necessary for the Krylov methods is computed from an algorithm in [6] originally derived by de Sturler.

    Method                           Synchronization points
    RRE and MPE                      1
    MMPE                             2
    GMRES and the Arnoldi method     2k

    Table 2: Synchronization points

4 Numerical experiments

4.1 Test problems

Numerical experiments are important tools that can be used to validate or disprove assumptions and theories. It is important to choose test problems that reflect the properties of the real problems we wish to solve. Since iterative methods are widely used for the solution of sparse systems, it is appropriate to choose sparse test problems. One way to obtain sparse systems is to use finite differences to approximate differential equations. For most tests we have used the two-dimensional convection-diffusion equation

    ∂²u/∂x² + ∂²u/∂y² + β ∂u/∂x + γ ∂u/∂y = g(x, y)    (4.1)

with Dirichlet boundary conditions. g(x, y) is chosen to obtain a solution that is easy to verify. By varying β and γ the spectral radius of G can be modified, and thus the convergence rate of the stationary iterative method. Unless otherwise stated, whenever test problems are referred to in this section, we mean the two-dimensional convection-diffusion equation.

The second test problem was found in [20] and has been used to produce slowly converging sequences. Here we choose the iteration matrix directly as the tridiagonal Toeplitz matrix

    G = tridiag(c, a, b) ∈ R^{N×N},    (4.2)

i.e. a on the main diagonal, b on the superdiagonal, c on the subdiagonal and zeros elsewhere, and choose an arbitrary f. For this choice of G we can give an analytical expression for the eigenvalues provided that a, b and c are real,

    λ_j = a + 2 √(bc) cos(jπ/(N + 1)),  j = 1, 2, ..., N.

If a large N is chosen, most eigenvalues will lie in the proximity of a + 2√(bc) or a − 2√(bc). For most tests the Jacobi method has been used as the stationary method. For reasons of comparison the Gauss-Seidel method has been used as well (section 4.3.3). The numerical experiments were conducted on a serial computer using Matlab.
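The iteration matrix (4.2) and the eigenvalue formula are easy to reproduce. The sketch below uses the placeholder names a, b and c introduced above for the diagonal, superdiagonal and subdiagonal entries, and checks the formula for one arbitrary real parameter choice.

    import numpy as np

    def test_matrix(a, b, c, N):
        # Tridiagonal Toeplitz iteration matrix of test problem (4.2):
        # a on the diagonal, b on the superdiagonal, c on the subdiagonal.
        return a * np.eye(N) + b * np.eye(N, k=1) + c * np.eye(N, k=-1)

    # Check lambda_j = a + 2*sqrt(b*c)*cos(j*pi/(N+1)) for real parameters.
    N, a, b, c = 50, 0.2, 0.3, 0.1
    G = test_matrix(a, b, c, N)
    j = np.arange(1, N + 1)
    analytic = a + 2.0 * np.sqrt(b * c) * np.cos(j * np.pi / (N + 1))
    computed = np.sort(np.linalg.eigvals(G).real)
    print(np.max(np.abs(np.sort(analytic) - computed)))   # agrees to rounding error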

4.2 Choosing extrapolation method

For k = k_0 all polynomial extrapolation methods can be used to find the solution. If cycling is used for k < k_0, the choice of method becomes more interesting. All three methods have similar complexities and parallel properties, so the choice must be based on numerical properties. For RRE and MPE we must also choose a suitable algorithmic structure. We will start by comparing the different extrapolation methods and then discuss the choice of algorithmic structure. In general the most effective algorithmic structure has been used for comparison.

MMPE seems to be the least effective of the three methods. Perhaps this is a consequence of not using all the information in the generated vectors. To achieve the same convergence as MPE and RRE, MMPE requires a slightly larger k. Using a few initial iterations greatly improves the stability of the method. For diverging sequences MMPE has a tendency to stagnate, i.e., γ_1 is close to one and all the other γ_j are close to zero. For most sequences MPE and RRE seem to be equally efficient during the first few cycles. After that RRE is usually better, but there are exceptions. MPE is recommended for slowly converging sequences. RRE is always more efficient for diverging sequences, for which MPE may fail to converge if k is not large enough. Even for small values of k, RRE accelerates the convergence of divergent stationary methods.

For RRE the different algorithmic structures are mathematically equivalent for k < k_0 and behave alike. This is not true for MMPE. Different choices of Q_j lead to different linear systems and mathematically different algorithmic structures. A few different ways of choosing Q_j have been tried. The simplest way is to choose Q_j so that only k of the equations in U_{k−1}c = −u_k are considered. For parallel implementations it is advantageous to choose k equations next to each other, for example the first k equations. For sequences where information is propagated slowly in the vectors, this choice of Q_j is not good. If the k equations are chosen in a part of the system where the convergence is slow, the coefficient matrix will be nearly singular or singular. It is better to choose the equations distributed equally across the system. Even better results are obtained if the k equations are formed using all the N original equations. One way of doing this is to compute the sum of N/k equations to obtain one new equation. On a serial computer this is a fast operation. The convergence and stability properties are better than if just k equations are selected. It is conceivable to let Q_j vary between cycles to obtain a more adaptive method. An example of such an approach would be to select the equations corresponding to the k largest elements of the residual in the previous cycle. This has not been tried.

The conclusions here are based on experience and numerical testing and cannot easily be proved. In [21] Ford et al. compare RRE to MPE and come to the conclusion that MPE is at least as efficient as RRE. Here we have reached the opposite conclusion: RRE is at least as effective as MPE.

4.3 Empirical examination of techniques of improving convergence

In this section the techniques for improving the convergence discussed in section 3.2 are applied to the test problems in section 4.1. All techniques except the vector-discard technique are applied to the two-dimensional convection-diffusion equation.
The technique of discarding generated vectors is applied to the slowly converging vector sequence (4.2), also given in section 4.1.

4.3.1 Initial iterations

The two-dimensional convection-diffusion equation with β = −2.8211 and γ = 4.0053 is solved on a grid. Figure 1 shows the convergence for RRE with and without initial iterations, with k = 10. Ten of the eigenvalues of G are very small (< 10^{−15}). To find a solution for which the L_2-norm of the residual meets the tolerance, 143 vectors are generated with one initial iteration per cycle and 117 with two initial iterations per cycle, fewer than are needed without initial iterations.

Figure 1: Initial iterations. (L2 norm of the residual against cycle number for RRE with no initial iterations, 1 initial iteration and 2 initial iterations.)

Not only do we benefit from having to generate fewer vectors, there are also fewer extrapolation steps and thus fewer linear systems to solve.

4.3.2 Discarding generated vectors

Figure 2 shows an example of a case where the eigenvalues of the iteration matrix are complex with magnitudes close to one. Test problem (4.2) with 50 unknowns and parameters a = 0.03, b = 0.015 + 0.5i, c = −0.09 − 0.45i is solved using RRE with k = 5, keeping only every q-th vector. The Jacobi method converges slowly due to eigenvalues close to one (the spectral radius is approximately 0.986), and the oscillations in the generated sequence cause some instability in the extrapolation process, which can be seen in the oscillating residual. When every other vector is discarded (q = 2) the convergence and stability are greatly improved. The oscillations can also be damped by choosing a larger k. With k = 10 the two methods converge to the given tolerance in 6 and 14 cycles, respectively.

4.3.3 Choosing stationary method

In Figure 3 both the Jacobi and the Gauss-Seidel method have been applied to a discretization of the two-dimensional convection-diffusion equation with 225 unknowns and β = 1.1899 and γ = 16.5527. RRE has been used with both stationary methods. In this example we do not benefit from using the Gauss-Seidel method. We only gain one cycle, and that is not enough to compensate for the extra time needed to generate the sequence. It is not difficult to find cases where far fewer cycles are needed for the Gauss-Seidel than for the Jacobi method (sometimes only half as many). It seems, however, that it is difficult to find cases where we find the solution faster by using the Gauss-Seidel method, even in the serial case.

Figure 2: Discarding generated vectors. (L2 norm of the residual against cycle number for RRE with q = 2, RRE with q = 1, and the Jacobi method.)

Figure 3: Different stationary methods. (L2 norm of the residual against cycle number for RRE using Jacobi, RRE using Gauss-Seidel, the Jacobi method and the Gauss-Seidel method.)
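For completeness, the sequence generator that combines the two techniques examined above, n initial iterations (section 3.2.1) and keeping only every q-th iterate (section 3.2.2), can be sketched as follows; the interface is only illustrative and produces the columns needed for the extrapolation sketches given earlier.

    import numpy as np

    def generate_sequence(G, f, x0, k, n_init=0, q=1):
        # Perform n_init initial iterations, then keep every q-th iterate,
        # which is equivalent to iterating with G^q; returns the k + 2 kept
        # iterates as columns, ready for forming U_k and V_{k-1}.
        x = x0
        for _ in range(n_init):
            x = G @ x + f
        seq = [x]
        for _ in range(k + 1):
            for _ in range(q):
                x = G @ x + f
            seq.append(x)
        return np.column_stack(seq)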


by the matrix A results in a vector which is a reflection of the given Eigenvalues & Eigenvectors Example Suppose Then So, geometrically, multiplying a vector in by the matrix A results in a vector which is a reflection of the given vector about the y-axis We observe that

More information

1 Example of Time Series Analysis by SSA 1

1 Example of Time Series Analysis by SSA 1 1 Example of Time Series Analysis by SSA 1 Let us illustrate the 'Caterpillar'-SSA technique [1] by the example of time series analysis. Consider the time series FORT (monthly volumes of fortied wine sales

More information

AN INTRODUCTION TO NUMERICAL METHODS AND ANALYSIS

AN INTRODUCTION TO NUMERICAL METHODS AND ANALYSIS AN INTRODUCTION TO NUMERICAL METHODS AND ANALYSIS Revised Edition James Epperson Mathematical Reviews BICENTENNIAL 0, 1 8 0 7 z ewiley wu 2007 r71 BICENTENNIAL WILEY-INTERSCIENCE A John Wiley & Sons, Inc.,

More information

Data analysis in supersaturated designs

Data analysis in supersaturated designs Statistics & Probability Letters 59 (2002) 35 44 Data analysis in supersaturated designs Runze Li a;b;, Dennis K.J. Lin a;b a Department of Statistics, The Pennsylvania State University, University Park,

More information

ALGEBRAIC EIGENVALUE PROBLEM

ALGEBRAIC EIGENVALUE PROBLEM ALGEBRAIC EIGENVALUE PROBLEM BY J. H. WILKINSON, M.A. (Cantab.), Sc.D. Technische Universes! Dsrmstedt FACHBEREICH (NFORMATiK BIBL1OTHEK Sachgebieto:. Standort: CLARENDON PRESS OXFORD 1965 Contents 1.

More information

3 Orthogonal Vectors and Matrices

3 Orthogonal Vectors and Matrices 3 Orthogonal Vectors and Matrices The linear algebra portion of this course focuses on three matrix factorizations: QR factorization, singular valued decomposition (SVD), and LU factorization The first

More information

MATH10212 Linear Algebra. Systems of Linear Equations. Definition. An n-dimensional vector is a row or a column of n numbers (or letters): a 1.

MATH10212 Linear Algebra. Systems of Linear Equations. Definition. An n-dimensional vector is a row or a column of n numbers (or letters): a 1. MATH10212 Linear Algebra Textbook: D. Poole, Linear Algebra: A Modern Introduction. Thompson, 2006. ISBN 0-534-40596-7. Systems of Linear Equations Definition. An n-dimensional vector is a row or a column

More information

Iterative Methods for Solving Linear Systems

Iterative Methods for Solving Linear Systems Chapter 5 Iterative Methods for Solving Linear Systems 5.1 Convergence of Sequences of Vectors and Matrices In Chapter 2 we have discussed some of the main methods for solving systems of linear equations.

More information

6. Cholesky factorization

6. Cholesky factorization 6. Cholesky factorization EE103 (Fall 2011-12) triangular matrices forward and backward substitution the Cholesky factorization solving Ax = b with A positive definite inverse of a positive definite matrix

More information

Orthogonal Bases and the QR Algorithm

Orthogonal Bases and the QR Algorithm Orthogonal Bases and the QR Algorithm Orthogonal Bases by Peter J Olver University of Minnesota Throughout, we work in the Euclidean vector space V = R n, the space of column vectors with n real entries

More information

Lecture 1: Schur s Unitary Triangularization Theorem

Lecture 1: Schur s Unitary Triangularization Theorem Lecture 1: Schur s Unitary Triangularization Theorem This lecture introduces the notion of unitary equivalence and presents Schur s theorem and some of its consequences It roughly corresponds to Sections

More information

α = u v. In other words, Orthogonal Projection

α = u v. In other words, Orthogonal Projection Orthogonal Projection Given any nonzero vector v, it is possible to decompose an arbitrary vector u into a component that points in the direction of v and one that points in a direction orthogonal to v

More information

Lecture 3: Finding integer solutions to systems of linear equations

Lecture 3: Finding integer solutions to systems of linear equations Lecture 3: Finding integer solutions to systems of linear equations Algorithmic Number Theory (Fall 2014) Rutgers University Swastik Kopparty Scribe: Abhishek Bhrushundi 1 Overview The goal of this lecture

More information

The Characteristic Polynomial

The Characteristic Polynomial Physics 116A Winter 2011 The Characteristic Polynomial 1 Coefficients of the characteristic polynomial Consider the eigenvalue problem for an n n matrix A, A v = λ v, v 0 (1) The solution to this problem

More information

160 CHAPTER 4. VECTOR SPACES

160 CHAPTER 4. VECTOR SPACES 160 CHAPTER 4. VECTOR SPACES 4. Rank and Nullity In this section, we look at relationships between the row space, column space, null space of a matrix and its transpose. We will derive fundamental results

More information

Linear Codes. Chapter 3. 3.1 Basics

Linear Codes. Chapter 3. 3.1 Basics Chapter 3 Linear Codes In order to define codes that we can encode and decode efficiently, we add more structure to the codespace. We shall be mainly interested in linear codes. A linear code of length

More information

Eigenvalues and Eigenvectors

Eigenvalues and Eigenvectors Chapter 6 Eigenvalues and Eigenvectors 6. Introduction to Eigenvalues Linear equations Ax D b come from steady state problems. Eigenvalues have their greatest importance in dynamic problems. The solution

More information

Chapter 6. Orthogonality

Chapter 6. Orthogonality 6.3 Orthogonal Matrices 1 Chapter 6. Orthogonality 6.3 Orthogonal Matrices Definition 6.4. An n n matrix A is orthogonal if A T A = I. Note. We will see that the columns of an orthogonal matrix must be

More information

15.062 Data Mining: Algorithms and Applications Matrix Math Review

15.062 Data Mining: Algorithms and Applications Matrix Math Review .6 Data Mining: Algorithms and Applications Matrix Math Review The purpose of this document is to give a brief review of selected linear algebra concepts that will be useful for the course and to develop

More information

Recall that two vectors in are perpendicular or orthogonal provided that their dot

Recall that two vectors in are perpendicular or orthogonal provided that their dot Orthogonal Complements and Projections Recall that two vectors in are perpendicular or orthogonal provided that their dot product vanishes That is, if and only if Example 1 The vectors in are orthogonal

More information

Examination paper for TMA4205 Numerical Linear Algebra

Examination paper for TMA4205 Numerical Linear Algebra Department of Mathematical Sciences Examination paper for TMA4205 Numerical Linear Algebra Academic contact during examination: Markus Grasmair Phone: 97580435 Examination date: December 16, 2015 Examination

More information

CONTROLLABILITY. Chapter 2. 2.1 Reachable Set and Controllability. Suppose we have a linear system described by the state equation

CONTROLLABILITY. Chapter 2. 2.1 Reachable Set and Controllability. Suppose we have a linear system described by the state equation Chapter 2 CONTROLLABILITY 2 Reachable Set and Controllability Suppose we have a linear system described by the state equation ẋ Ax + Bu (2) x() x Consider the following problem For a given vector x in

More information

Solving Systems of Linear Equations

Solving Systems of Linear Equations LECTURE 5 Solving Systems of Linear Equations Recall that we introduced the notion of matrices as a way of standardizing the expression of systems of linear equations In today s lecture I shall show how

More information

LINEAR ALGEBRA. September 23, 2010

LINEAR ALGEBRA. September 23, 2010 LINEAR ALGEBRA September 3, 00 Contents 0. LU-decomposition.................................... 0. Inverses and Transposes................................. 0.3 Column Spaces and NullSpaces.............................

More information

Numerical Analysis Lecture Notes

Numerical Analysis Lecture Notes Numerical Analysis Lecture Notes Peter J. Olver 6. Eigenvalues and Singular Values In this section, we collect together the basic facts about eigenvalues and eigenvectors. From a geometrical viewpoint,

More information

26. Determinants I. 1. Prehistory

26. Determinants I. 1. Prehistory 26. Determinants I 26.1 Prehistory 26.2 Definitions 26.3 Uniqueness and other properties 26.4 Existence Both as a careful review of a more pedestrian viewpoint, and as a transition to a coordinate-independent

More information

Section 6.1 - Inner Products and Norms

Section 6.1 - Inner Products and Norms Section 6.1 - Inner Products and Norms Definition. Let V be a vector space over F {R, C}. An inner product on V is a function that assigns, to every ordered pair of vectors x and y in V, a scalar in F,

More information

Inner product. Definition of inner product

Inner product. Definition of inner product Math 20F Linear Algebra Lecture 25 1 Inner product Review: Definition of inner product. Slide 1 Norm and distance. Orthogonal vectors. Orthogonal complement. Orthogonal basis. Definition of inner product

More information

BANACH AND HILBERT SPACE REVIEW

BANACH AND HILBERT SPACE REVIEW BANACH AND HILBET SPACE EVIEW CHISTOPHE HEIL These notes will briefly review some basic concepts related to the theory of Banach and Hilbert spaces. We are not trying to give a complete development, but

More information

APPM4720/5720: Fast algorithms for big data. Gunnar Martinsson The University of Colorado at Boulder

APPM4720/5720: Fast algorithms for big data. Gunnar Martinsson The University of Colorado at Boulder APPM4720/5720: Fast algorithms for big data Gunnar Martinsson The University of Colorado at Boulder Course objectives: The purpose of this course is to teach efficient algorithms for processing very large

More information

Linear Algebraic Equations, SVD, and the Pseudo-Inverse

Linear Algebraic Equations, SVD, and the Pseudo-Inverse Linear Algebraic Equations, SVD, and the Pseudo-Inverse Philip N. Sabes October, 21 1 A Little Background 1.1 Singular values and matrix inversion For non-smmetric matrices, the eigenvalues and singular

More information

On closed-form solutions of a resource allocation problem in parallel funding of R&D projects

On closed-form solutions of a resource allocation problem in parallel funding of R&D projects Operations Research Letters 27 (2000) 229 234 www.elsevier.com/locate/dsw On closed-form solutions of a resource allocation problem in parallel funding of R&D proects Ulku Gurler, Mustafa. C. Pnar, Mohamed

More information

PUTNAM TRAINING POLYNOMIALS. Exercises 1. Find a polynomial with integral coefficients whose zeros include 2 + 5.

PUTNAM TRAINING POLYNOMIALS. Exercises 1. Find a polynomial with integral coefficients whose zeros include 2 + 5. PUTNAM TRAINING POLYNOMIALS (Last updated: November 17, 2015) Remark. This is a list of exercises on polynomials. Miguel A. Lerma Exercises 1. Find a polynomial with integral coefficients whose zeros include

More information

University of Lille I PC first year list of exercises n 7. Review

University of Lille I PC first year list of exercises n 7. Review University of Lille I PC first year list of exercises n 7 Review Exercise Solve the following systems in 4 different ways (by substitution, by the Gauss method, by inverting the matrix of coefficients

More information

DATA ANALYSIS II. Matrix Algorithms

DATA ANALYSIS II. Matrix Algorithms DATA ANALYSIS II Matrix Algorithms Similarity Matrix Given a dataset D = {x i }, i=1,..,n consisting of n points in R d, let A denote the n n symmetric similarity matrix between the points, given as where

More information

1 Sets and Set Notation.

1 Sets and Set Notation. LINEAR ALGEBRA MATH 27.6 SPRING 23 (COHEN) LECTURE NOTES Sets and Set Notation. Definition (Naive Definition of a Set). A set is any collection of objects, called the elements of that set. We will most

More information

7. LU factorization. factor-solve method. LU factorization. solving Ax = b with A nonsingular. the inverse of a nonsingular matrix

7. LU factorization. factor-solve method. LU factorization. solving Ax = b with A nonsingular. the inverse of a nonsingular matrix 7. LU factorization EE103 (Fall 2011-12) factor-solve method LU factorization solving Ax = b with A nonsingular the inverse of a nonsingular matrix LU factorization algorithm effect of rounding error sparse

More information

13 MATH FACTS 101. 2 a = 1. 7. The elements of a vector have a graphical interpretation, which is particularly easy to see in two or three dimensions.

13 MATH FACTS 101. 2 a = 1. 7. The elements of a vector have a graphical interpretation, which is particularly easy to see in two or three dimensions. 3 MATH FACTS 0 3 MATH FACTS 3. Vectors 3.. Definition We use the overhead arrow to denote a column vector, i.e., a linear segment with a direction. For example, in three-space, we write a vector in terms

More information

1. Introduction. Consider the computation of an approximate solution of the minimization problem

1. Introduction. Consider the computation of an approximate solution of the minimization problem A NEW TIKHONOV REGULARIZATION METHOD MARTIN FUHRY AND LOTHAR REICHEL Abstract. The numerical solution of linear discrete ill-posed problems typically requires regularization, i.e., replacement of the available

More information

Introduction to Logistic Regression

Introduction to Logistic Regression OpenStax-CNX module: m42090 1 Introduction to Logistic Regression Dan Calderon This work is produced by OpenStax-CNX and licensed under the Creative Commons Attribution License 3.0 Abstract Gives introduction

More information

Integer Factorization using the Quadratic Sieve

Integer Factorization using the Quadratic Sieve Integer Factorization using the Quadratic Sieve Chad Seibert* Division of Science and Mathematics University of Minnesota, Morris Morris, MN 56567 seib0060@morris.umn.edu March 16, 2011 Abstract We give

More information

NORTHWESTERN UNIVERSITY Department of Electrical Engineering and Computer Science LARGE SCALE UNCONSTRAINED OPTIMIZATION by Jorge Nocedal 1 June 23, 1996 ABSTRACT This paper reviews advances in Newton,

More information

MATH 551 - APPLIED MATRIX THEORY

MATH 551 - APPLIED MATRIX THEORY MATH 55 - APPLIED MATRIX THEORY FINAL TEST: SAMPLE with SOLUTIONS (25 points NAME: PROBLEM (3 points A web of 5 pages is described by a directed graph whose matrix is given by A Do the following ( points

More information

Oscillations of the Sending Window in Compound TCP

Oscillations of the Sending Window in Compound TCP Oscillations of the Sending Window in Compound TCP Alberto Blanc 1, Denis Collange 1, and Konstantin Avrachenkov 2 1 Orange Labs, 905 rue Albert Einstein, 06921 Sophia Antipolis, France 2 I.N.R.I.A. 2004

More information

Numerical Analysis Lecture Notes

Numerical Analysis Lecture Notes Numerical Analysis Lecture Notes Peter J. Olver 5. Inner Products and Norms The norm of a vector is a measure of its size. Besides the familiar Euclidean norm based on the dot product, there are a number

More information

Au = = = 3u. Aw = = = 2w. so the action of A on u and w is very easy to picture: it simply amounts to a stretching by 3 and 2, respectively.

Au = = = 3u. Aw = = = 2w. so the action of A on u and w is very easy to picture: it simply amounts to a stretching by 3 and 2, respectively. Chapter 7 Eigenvalues and Eigenvectors In this last chapter of our exploration of Linear Algebra we will revisit eigenvalues and eigenvectors of matrices, concepts that were already introduced in Geometry

More information

Orthogonal Diagonalization of Symmetric Matrices

Orthogonal Diagonalization of Symmetric Matrices MATH10212 Linear Algebra Brief lecture notes 57 Gram Schmidt Process enables us to find an orthogonal basis of a subspace. Let u 1,..., u k be a basis of a subspace V of R n. We begin the process of finding

More information

Solving Linear Systems, Continued and The Inverse of a Matrix

Solving Linear Systems, Continued and The Inverse of a Matrix , Continued and The of a Matrix Calculus III Summer 2013, Session II Monday, July 15, 2013 Agenda 1. The rank of a matrix 2. The inverse of a square matrix Gaussian Gaussian solves a linear system by reducing

More information

4.3 Lagrange Approximation

4.3 Lagrange Approximation 206 CHAP. 4 INTERPOLATION AND POLYNOMIAL APPROXIMATION Lagrange Polynomial Approximation 4.3 Lagrange Approximation Interpolation means to estimate a missing function value by taking a weighted average

More information

P164 Tomographic Velocity Model Building Using Iterative Eigendecomposition

P164 Tomographic Velocity Model Building Using Iterative Eigendecomposition P164 Tomographic Velocity Model Building Using Iterative Eigendecomposition K. Osypov* (WesternGeco), D. Nichols (WesternGeco), M. Woodward (WesternGeco) & C.E. Yarman (WesternGeco) SUMMARY Tomographic

More information

8 Square matrices continued: Determinants

8 Square matrices continued: Determinants 8 Square matrices continued: Determinants 8. Introduction Determinants give us important information about square matrices, and, as we ll soon see, are essential for the computation of eigenvalues. You

More information

3. INNER PRODUCT SPACES

3. INNER PRODUCT SPACES . INNER PRODUCT SPACES.. Definition So far we have studied abstract vector spaces. These are a generalisation of the geometric spaces R and R. But these have more structure than just that of a vector space.

More information

Notes on Factoring. MA 206 Kurt Bryan

Notes on Factoring. MA 206 Kurt Bryan The General Approach Notes on Factoring MA 26 Kurt Bryan Suppose I hand you n, a 2 digit integer and tell you that n is composite, with smallest prime factor around 5 digits. Finding a nontrivial factor

More information

THE FUNDAMENTAL THEOREM OF ALGEBRA VIA PROPER MAPS

THE FUNDAMENTAL THEOREM OF ALGEBRA VIA PROPER MAPS THE FUNDAMENTAL THEOREM OF ALGEBRA VIA PROPER MAPS KEITH CONRAD 1. Introduction The Fundamental Theorem of Algebra says every nonconstant polynomial with complex coefficients can be factored into linear

More information

The Image Deblurring Problem

The Image Deblurring Problem page 1 Chapter 1 The Image Deblurring Problem You cannot depend on your eyes when your imagination is out of focus. Mark Twain When we use a camera, we want the recorded image to be a faithful representation

More information

MATH 304 Linear Algebra Lecture 9: Subspaces of vector spaces (continued). Span. Spanning set.

MATH 304 Linear Algebra Lecture 9: Subspaces of vector spaces (continued). Span. Spanning set. MATH 304 Linear Algebra Lecture 9: Subspaces of vector spaces (continued). Span. Spanning set. Vector space A vector space is a set V equipped with two operations, addition V V (x,y) x + y V and scalar

More information

( ) which must be a vector

( ) which must be a vector MATH 37 Linear Transformations from Rn to Rm Dr. Neal, WKU Let T : R n R m be a function which maps vectors from R n to R m. Then T is called a linear transformation if the following two properties are

More information

Numerical Analysis An Introduction

Numerical Analysis An Introduction Walter Gautschi Numerical Analysis An Introduction 1997 Birkhauser Boston Basel Berlin CONTENTS PREFACE xi CHAPTER 0. PROLOGUE 1 0.1. Overview 1 0.2. Numerical analysis software 3 0.3. Textbooks and monographs

More information

NOTES ON LINEAR TRANSFORMATIONS

NOTES ON LINEAR TRANSFORMATIONS NOTES ON LINEAR TRANSFORMATIONS Definition 1. Let V and W be vector spaces. A function T : V W is a linear transformation from V to W if the following two properties hold. i T v + v = T v + T v for all

More information

Continued Fractions and the Euclidean Algorithm

Continued Fractions and the Euclidean Algorithm Continued Fractions and the Euclidean Algorithm Lecture notes prepared for MATH 326, Spring 997 Department of Mathematics and Statistics University at Albany William F Hammond Table of Contents Introduction

More information

Metric Spaces. Chapter 7. 7.1. Metrics

Metric Spaces. Chapter 7. 7.1. Metrics Chapter 7 Metric Spaces A metric space is a set X that has a notion of the distance d(x, y) between every pair of points x, y X. The purpose of this chapter is to introduce metric spaces and give some

More information

MATH 423 Linear Algebra II Lecture 38: Generalized eigenvectors. Jordan canonical form (continued).

MATH 423 Linear Algebra II Lecture 38: Generalized eigenvectors. Jordan canonical form (continued). MATH 423 Linear Algebra II Lecture 38: Generalized eigenvectors Jordan canonical form (continued) Jordan canonical form A Jordan block is a square matrix of the form λ 1 0 0 0 0 λ 1 0 0 0 0 λ 0 0 J = 0

More information

Yousef Saad University of Minnesota Computer Science and Engineering. CRM Montreal - April 30, 2008

Yousef Saad University of Minnesota Computer Science and Engineering. CRM Montreal - April 30, 2008 A tutorial on: Iterative methods for Sparse Matrix Problems Yousef Saad University of Minnesota Computer Science and Engineering CRM Montreal - April 30, 2008 Outline Part 1 Sparse matrices and sparsity

More information

Math 115A HW4 Solutions University of California, Los Angeles. 5 2i 6 + 4i. (5 2i)7i (6 + 4i)( 3 + i) = 35i + 14 ( 22 6i) = 36 + 41i.

Math 115A HW4 Solutions University of California, Los Angeles. 5 2i 6 + 4i. (5 2i)7i (6 + 4i)( 3 + i) = 35i + 14 ( 22 6i) = 36 + 41i. Math 5A HW4 Solutions September 5, 202 University of California, Los Angeles Problem 4..3b Calculate the determinant, 5 2i 6 + 4i 3 + i 7i Solution: The textbook s instructions give us, (5 2i)7i (6 + 4i)(

More information

RESULTANT AND DISCRIMINANT OF POLYNOMIALS

RESULTANT AND DISCRIMINANT OF POLYNOMIALS RESULTANT AND DISCRIMINANT OF POLYNOMIALS SVANTE JANSON Abstract. This is a collection of classical results about resultants and discriminants for polynomials, compiled mainly for my own use. All results

More information

1 Review of Least Squares Solutions to Overdetermined Systems

1 Review of Least Squares Solutions to Overdetermined Systems cs4: introduction to numerical analysis /9/0 Lecture 7: Rectangular Systems and Numerical Integration Instructor: Professor Amos Ron Scribes: Mark Cowlishaw, Nathanael Fillmore Review of Least Squares

More information

Mean value theorem, Taylors Theorem, Maxima and Minima.

Mean value theorem, Taylors Theorem, Maxima and Minima. MA 001 Preparatory Mathematics I. Complex numbers as ordered pairs. Argand s diagram. Triangle inequality. De Moivre s Theorem. Algebra: Quadratic equations and express-ions. Permutations and Combinations.

More information

4: EIGENVALUES, EIGENVECTORS, DIAGONALIZATION

4: EIGENVALUES, EIGENVECTORS, DIAGONALIZATION 4: EIGENVALUES, EIGENVECTORS, DIAGONALIZATION STEVEN HEILMAN Contents 1. Review 1 2. Diagonal Matrices 1 3. Eigenvectors and Eigenvalues 2 4. Characteristic Polynomial 4 5. Diagonalizability 6 6. Appendix:

More information

Linear Algebra I. Ronald van Luijk, 2012

Linear Algebra I. Ronald van Luijk, 2012 Linear Algebra I Ronald van Luijk, 2012 With many parts from Linear Algebra I by Michael Stoll, 2007 Contents 1. Vector spaces 3 1.1. Examples 3 1.2. Fields 4 1.3. The field of complex numbers. 6 1.4.

More information

Lecture 5 Least-squares

Lecture 5 Least-squares EE263 Autumn 2007-08 Stephen Boyd Lecture 5 Least-squares least-squares (approximate) solution of overdetermined equations projection and orthogonality principle least-squares estimation BLUE property

More information