Nested iteration methods for nonlinear matrix problems


Nested iteration methods for nonlinear matrix problems

Geneste iteratie methoden voor niet-lineaire matrix problemen
(with a summary in Dutch)

Dissertation submitted for the degree of Doctor at Utrecht University, by authority of the Rector Magnificus, Prof. dr. W. H. Gispen, pursuant to the decision of the Doctorate Board, to be defended in public on Monday 22 September 2003 in the morning

by

Jasper van den Eshof

born on 21 January 1975 in Utrecht

Promotor: Prof. Dr. H. A. van der Vorst, Faculteit der Wiskunde en Informatica, Universiteit Utrecht
Co-promotor: Dr. G. L. G. Sleijpen, Faculteit der Wiskunde en Informatica, Universiteit Utrecht

The research described in this thesis was made financially possible by the Netherlands Organisation for Scientific Research (NWO).

Mathematics Subject Classification: 65F15, 65F10, 65F50.

Van den Eshof, Jasper
Nested iteration methods for nonlinear matrix problems
Dissertation, Utrecht University. With a summary in Dutch.
ISBN

Contents

1 Introduction
   The eigenvalue problem
   Chapter 2: The subspace extraction
   Chapter 3: Simple vector iterations
   The overlap operator in quantum chromodynamics
   Chapter 4: Numerical methods for the overlap operator
   Chapter 5: Inexact Krylov subspace methods

2 Eigenvector approximations from a subspace
   Introduction
   Rayleigh-Ritz approximations
   A priori error bounds for the Ritz pair
   A well-known upper bound
   A sharp upper bound
   Some results based on Theorem
   Discussion
   Harmonic Rayleigh-Ritz approximations
   Useful properties of harmonic Rayleigh-Ritz
   A minmax characterization for harmonic Ritz values
   Optimal inclusion intervals for eigenvalues
   The concept of ρ-values
   Harmonic Rayleigh-Ritz and Krylov subspaces
   A connection with Gauss-Radau quadrature
   Comparing harmonic and refined Rayleigh-Ritz
   Refined Rayleigh-Ritz approximations
   The optimal value of ξ in refined Rayleigh-Ritz
   The optimal value of σ in harmonic Rayleigh-Ritz
   Illustration
   Discussion
   Numerical experiments
   A priori error bounds
   A posteriori error estimation
   A condition for the minimizing shift
   A condition for the shift σ_m
   Discussion
   The selection of a harmonic Ritz pair
   The selection strategies
   Numerical experiments
   Summary and outlook

3 Subspace expansion using simple vector iterations
   Introduction
   Rayleigh quotient iteration
   The Jacobi-Davidson correction equation
   Illustration
   Discussion
   The iterative solution of the correction equation
   Discussion
   Numerical experiments
   Summary and outlook

4 Numerical methods for the QCD overlap operator
   Introduction
   A Krylov subspace framework
   The Chebyshev approach
   Methods based on the Lanczos reduction
   Lanczos approximations
   Smooth convergence with Lanczos on Q
   The quality of the polynomials
   Error estimation
   Practical implementations
   The PFE/CG method
   Error estimation
   The choice of the rational approximation
   Removing converged systems
   Discussion
   Numerical experiments
   Summary and outlook

5 Inexact Krylov subspace methods for linear systems
   Introduction
   Krylov subspace methods
   Derivation from Krylov decompositions
   Inexact Krylov subspace methods
   Relaxation strategies
   The analysis of inexact Krylov subspace methods
   A general expression for the residual gap
   Inexact Richardson iteration
   Discussion
   Inexact Chebyshev iteration
   Discussion
   The inexact Conjugate Gradient method
   The case of T_k positive definite
   The case of T_k indefinite
   The behavior of the computed residuals
   Variants of the Conjugate Gradient method
   Numerical experiments
   Discussion
   Inexact FOM and GMRES
   The behavior of the computed residuals
   Practical aspects of relaxation
   Nested inexact Krylov subspace methods
   The outer iteration: Richardson iteration
   The outer iteration: flexible GMRES
   Choosing the precisions ξ_j
   Discussion
   Numerical experiments
   Summary and outlook

A Subspaces and their bases
   A.1 The Krylov subspace

B Definitions from QCD

C The overlap operator in computer arithmetic
   C.1 Boriçi's method for the overlap operator
   C.2 The effect of rounding errors in the Lanczos method
   C.3 An alternative implementation
   C.3.1 Dealing with the partial fraction expansion

Nederlandse samenvatting (summary in Dutch)

Dankwoord (acknowledgements)

Curriculum Vitae


Chapter 1

Introduction

Today's applications in scientific computing require an increasingly complex coupling of various building blocks dedicated to specific visualization and numerical tasks. The Numlab scientific computing workbench [85] aims at providing its users with the possibility to rapidly and conveniently construct applications for scientific computing and visualization problems. A key issue here is the availability of flexible (and relevant) modules. Focusing in particular on the numerical part, we see that in large-scale problems iterative solution methods often play a pivotal role. Iteration methods are solution methods that proceed by starting with an initial guess and repeatedly improving the last obtained approximation until it satisfies the required accuracy. Traditionally, iteration methods are seen as the opposite of so-called direct methods (although there is a growing consensus among researchers that clever combinations of the two classes can lead to very good solvers). Whereas for direct methods there is now a vast amount of literature studying the efficiency, stability and accuracy of various methods, the situation for iterative methods is not as advanced. General-purpose implementations of iterative methods are often not readily available. In fact, in practice experts are often needed to tune the specific methods on a problem-by-problem basis.

The general goal of this thesis is to prepare iterative methods for use in a scientific computing laboratory. As a concrete starting point we will study the numerical solution of two matrix problems:

- the computation of eigenvectors, and their corresponding eigenvalues, of large sparse matrices;
- the multiplication of the Green's function with a source vector in simulations in quantum chromodynamics with overlap fermions.

Both these problems are nonlinear and they are frequently solved by nesting iteration methods.

This means that in each iteration step of a certain method a second iteration method is invoked to solve some subproblem. The nesting aspect is the focus of this thesis and is what makes these problems particularly interesting for a scientific computing laboratory, because of the necessity of coupling different iteration methods (or computational kernels).

There are various reasons for nesting iteration methods. For example, in some iterative methods the cost of an iteration step grows linearly with the iteration number. This can happen if, for acceleration purposes, a subspace is formed that is spanned by the approximations computed so far. The dimension of this subspace increases with every step of the iteration method, thereby increasing with every step the amount of work involved in the construction of an appropriate basis for this subspace. Introducing additional information into the process may reduce the number of required iterations and therefore result in a significant reduction of the overall cost. The necessary information might be obtained by partly solving the original problem, or some relevant subproblem, with a second iteration method. This raises the question of how to couple the two levels of iteration or, posed differently, how accurately the embedded solver has to solve the subproblem. As a concrete example, we study, in the first part of this thesis, modern iterative solvers for computing a few eigenvectors and eigenvalues of large sparse matrices.

A different situation where nested iteration schemes appear naturally is when the problem to be solved consists of relatively simple problems that are coupled. An example where this occurs is the so-called Stokes problem from computational fluid dynamics. The Stokes problem consists of two coupled linear systems of equations: the velocity can be computed by solving a linear system which involves the unknown pressure, and the pressure depends again, in some discretizations, on the velocity through a second linear system. Here, two-level iteration methods are sometimes used in practical solution strategies. As a different and concrete example, we will study a linear problem that occurs in large scale simulations in quantum chromodynamics, the physical theory that describes the strong interaction between elementary particles. The challenge here is that the system of equations is only given implicitly by a matrix function that must be computed with an iterative method. The solution method that is used in practical simulations for this problem is a standard iteration method for linear systems that invokes a second iteration method for dealing with the matrix function. Again, we have to ask ourselves how accurately we have to compute this matrix function, given a required and predefined precision for the whole problem.

In this thesis we study the two separate building blocks that make up a two-level iteration method for both problems and, in particular, the tuning of the coupling between the two levels of iteration, which is important for a scientific working environment. Our study should lead to strategies for automatically optimizing this coupling. This also puts emphasis on the individual components; for example, we need good termination criteria for the iterative methods.

The outline of this thesis is as follows. Chapters 2 and 3 are dedicated to the eigenvalue problem.

Chapters 4 and 5 are concerned with simulations in quantum chromodynamics with overlap fermions. In the remainder of this chapter we give a short summary of the work in this thesis and summarize some of our contributions.

1.1 The eigenvalue problem

The origins of eigenvalue problems are diverse and range from search engines for searching the Internet to the stability analysis of large structures. The problem can often be formulated as an algebraic eigenvalue problem, where one tries to find a nonzero vector x and a scalar λ such that Ax = λx. Usually only a very small number of eigenvectors and their corresponding eigenvalues are needed. In this case iterative projection methods (sometimes called subspace methods) come into the picture. These methods iteratively compute an approximation to the desired eigenpair by building up a subspace, and in every iteration step they extract an approximation to the sought-after eigenpair from this subspace. Two components can be identified. First, there is the computation of appropriate vectors to expand the subspace, which may require the (approximate) solution of a linear system. If this is done iteratively we refer to this part as the inner iteration. The other key ingredient is the collection of the expansion vectors in a subspace and the subsequent extraction of good approximations to the wanted eigenpair by an extraction technique. It generally involves an orthogonalization method for constructing an orthogonal basis for the subspace. We term this the outer iteration. The identified structure of iterative projection methods is also reflected by the outline of the first part of this thesis. In Chapter 2 we focus on the extraction of useful eigenvector/eigenvalue approximations from a given subspace; issues concerning the computation of the expansion vectors are treated in Chapter 3.

Chapter 2: The subspace extraction

We ask ourselves the following main question in Chapter 2: suppose a given subspace contains a good approximation to the eigenvector, how can we extract eigenvector approximations from that subspace? We will review the Rayleigh-Ritz method and prove that for eigenvalues that are in some sense in the exterior of the spectrum, the approximations generated by the Rayleigh-Ritz method are guaranteed to be useful. We study this in more detail, for the situation that A is real symmetric, by deriving a priori error bounds for the eigenvector approximations expressed in terms of the eigenvalues of the matrix A and the angle of the subspace with the eigenvector of interest.

For eigenvalues that are in the interior of the spectrum, i.e., interior eigenvalues, the Rayleigh-Ritz method always constructs good approximations to the eigenvalues, but the approximations to the eigenvectors might be useless. This means that alternative extraction methods must be considered for this type of eigenvector. One such alternative is the recently proposed harmonic Rayleigh-Ritz method, which can be seen as a variant of the Rayleigh-Ritz method. This method involves a parameter that should be chosen appropriately, depending on the location of the part of the spectrum that is of interest. In the literature, many numerical experiments have been reported showing that harmonic Rayleigh-Ritz indeed resolves the problems of standard Rayleigh-Ritz for interior eigenvalues. Despite this practical success, the effect of the parameter on the method is complex and not well understood. This raises the question of how to choose this parameter when we are interested, for example, in the eigenpair with its eigenvalue close to some target value. In Chapter 2 we address this question by showing that the harmonic Rayleigh-Ritz approximations of interest are equal to the approximations of the classical Rayleigh-Ritz method when applied to the transformed eigenvalue problem

(A − τI)^2 x = (λ − τ)^2 x,

for a specific value of τ. We also use this relation to give a comparison of harmonic Rayleigh-Ritz and an alternative extraction method, specially designed for interior eigenvalues, known as the refined Rayleigh-Ritz method. For more theoretical purposes, we use the demonstrated equivalence to derive a posteriori and a priori error bounds for the eigenpair approximations of this method.

The harmonic Rayleigh-Ritz method generates a whole set of approximations to eigenpairs of the matrix A, just as the Rayleigh-Ritz method does. From this set an appropriate approximation to the eigenpair of interest must be selected. We conclude Chapter 2 by discussing a new criterion for doing this.

Chapter 3: Simple vector iterations

A reliable and robust extraction method results in an approximation from the subspace to the eigenpair of interest. Based on this approximation, we want to construct a vector with which to expand our subspace. This is typically accomplished by iteratively solving some linear system. This is the inner iteration of the iterative projection method and is the subject of Chapter 3.

As a basis for an expansion strategy we will discuss simple vector iterations like inverse iteration and Rayleigh quotient iteration. These methods repeatedly compute a new approximation to an eigenvector based only on the approximation from the previous step. The important observation is that we can apply one step of some simple iteration scheme to the extracted approximation from the subspace and expand the subspace with the resulting vector.

Many powerful subspace methods are based on this idea, and sometimes the resulting methods are seen as accelerated versions of the simpler iterations. Therefore, we study in Chapter 3 simple vector iterations. Of particular importance is Rayleigh quotient iteration, which computes a new approximation u' based on a known approximation u by solving the linear system

(A − ϑI) u' = u   with   ϑ = (u^T A u) / (u^T u).   (1.1.1)

The matrix on the left is ill conditioned if ϑ is close to an eigenvalue, and therefore accurate approximations to u' are often too expensive to determine in practice. Rayleigh quotient iteration has appealing local convergence properties that are, unfortunately, lost when the matrix A in (1.1.1) is replaced by a nearby matrix that allows a cheaper computation of an approximate u'. In Chapter 3 we propose a simple vector iteration that is based on the correction equation of the Jacobi-Davidson method. This iteration is mathematically equivalent to Rayleigh quotient iteration if the correction equation is solved exactly. However, it is observed in the literature that this correction equation is more robust with respect to replacing the exact matrix A with a nearby matrix. We will explain this by relating this iteration scheme to Rayleigh quotient iteration on a nearby matrix that possesses an eigenvector that comes, with every step of the iteration, increasingly close to the wanted eigenvector. This connection leads to convergence bounds for the simple iteration when the matrix A is replaced with some nearby matrix.

The correction equation may be solved with a (preconditioned) iterative solver for linear systems. Iterative solvers are usually terminated when a given relative residual precision for the linear system has been obtained. We discuss the effect of this criterion when used in the simple iteration. This confirms the results of Dembo et al. [26] for the more general class of inexact Newton methods. Their results show that higher order convergence can be achieved by working with an increasingly smaller tolerance. As a consequence, this gives a suitable sequence of tolerances for use in the Jacobi-Davidson method.

In our final section we discuss some numerical experiments where this strategy is applied to the full Jacobi-Davidson method, that is, an additional outer iteration is added to the simple iteration which contains the subspace acceleration. In these cases a trade-off has to be made between the amount of work that is spent in the inner iterations and in the outer iteration. Obviously, solving the correction equation very accurately is not efficient. Conversely, if the correction equation is solved less accurately then the number of outer iterations grows and therefore, for example, the cost for the orthogonalization of the basis for the subspace increases. We show by several numerical experiments that an improved condition, as discussed for the simple iteration, might also be useful for the complete method.
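As a small, self-contained illustration of the Rayleigh quotient iteration (1.1.1) discussed above, the following NumPy sketch runs the iteration on a dense symmetric test matrix. It is not code from the thesis: the function name and test setup are chosen here for illustration only, and the ill-conditioned linear system is solved exactly with a dense solver, which is precisely what is too expensive in the large sparse setting studied in Chapter 3.

```python
import numpy as np

def rayleigh_quotient_iteration(A, u, tol=1e-12, maxit=20):
    """Basic Rayleigh quotient iteration (1.1.1) for a symmetric matrix A."""
    u = u / np.linalg.norm(u)
    theta = u @ A @ u
    for _ in range(maxit):
        theta = u @ A @ u                              # Rayleigh quotient (u has unit norm)
        if np.linalg.norm(A @ u - theta * u) < tol:    # stop on a small eigenpair residual
            break
        # solve (A - theta I) u' = u; near convergence this system is very
        # ill conditioned, which is exactly the difficulty discussed in the text
        u_new = np.linalg.solve(A - theta * np.eye(A.shape[0]), u)
        u = u_new / np.linalg.norm(u_new)
    return theta, u

# toy usage on a random symmetric matrix
rng = np.random.default_rng(1)
B = rng.standard_normal((100, 100))
A = (B + B.T) / 2
theta, u = rayleigh_quotient_iteration(A, rng.standard_normal(100))
print(theta, np.linalg.norm(A @ u - theta * u))
```

For symmetric matrices this iteration typically converges cubically once it is close to an eigenpair, which is why only a handful of steps are needed in the toy run above.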

1.2 The overlap operator in quantum chromodynamics

In the second part of this thesis we start our discussion on numerical techniques for the overlap formulation in quantum chromodynamics (QCD), the physical theory that describes the strong interaction between elementary particles. This overlap formulation initiated a lot of research in solving linear systems of the form

(r G_5 − sign(Q)) x = b,   r ≥ 1,   (1.2.1)

where Q and G_5 are sparse Hermitian indefinite matrices. In today's simulations the dimension of Q and G_5 is on the order of one to ten million. The matrix sign(Q) is the so-called matrix sign function or, more precisely, if we have the eigenvalue/eigenvector decomposition Q = X D X^*, with D = diag(λ_1, ..., λ_n), then the matrix sign function is defined as

sign(Q) := X sign(D) X^* = X diag(sign(λ_1), ..., sign(λ_n)) X^*,

where sign(t) is the standard sign function. Solving the full problem, that is, solving (1.2.1) for x given G_5 and Q, requires the solution of a simple linear system coupled to the nonlinear problem of computing the matrix sign function. Although the matrix Q is very sparse, the matrix of the linear system in (1.2.1) is dense.

The solution method that we consider, which is the method of choice in practical simulations, consists of applying a standard iterative solver for linear systems to (1.2.1). This is the outer iteration. One of the main advantages of using an iterative solver for this problem is that the matrix r G_5 − sign(Q) does not have to be known and stored explicitly, which would, due to the density and large dimension of this matrix, not be feasible. Instead, we need to compute the product of this matrix with some vector in every outer iteration step (nevertheless still a computationally demanding task). Vector iteration methods for computing the product of sign(Q) with a generic vector are discussed in Chapter 4. In Chapter 5 we study the impact of an approximate matrix-vector product on various iterative solvers for linear systems, which should lead to strategies for tuning the precision of the matrix sign function times a vector.

Chapter 4: Numerical methods for the overlap operator

In Chapter 4 we focus on the computation of the product of the matrix sign function with a generic vector, say y. The methods that we will consider are vector iteration methods that compute, in step k, an approximation of the form

sign(Q) y ≈ p(Q) y,

where p is a polynomial of degree less than k.
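Before turning to these iterative approximations, the defining formula sign(Q) = X sign(D) X^* given above can be made concrete with a small NumPy sketch. This is an illustration written for this summary (not code from the thesis), and it uses the full eigendecomposition of a tiny Hermitian matrix; for the matrices arising in QCD this is exactly what is infeasible and has to be replaced by the methods of Chapters 4 and 5.

```python
import numpy as np

def sign_times_vector(Q, y):
    """Evaluate sign(Q) y via the full eigendecomposition Q = X diag(d) X^*.

    Only feasible for small dense Hermitian Q; illustration only."""
    d, X = np.linalg.eigh(Q)                          # eigenvalues d (real), unitary X
    return X @ (np.sign(d) * (X.conj().T @ y))        # X sign(D) X^* y

# small Hermitian indefinite test matrix
rng = np.random.default_rng(2)
B = rng.standard_normal((6, 6)) + 1j * rng.standard_normal((6, 6))
Q = (B + B.conj().T) / 2
y = rng.standard_normal(6)
print(sign_times_vector(Q, y))
# sanity check: sign(Q)^2 = I when Q has no zero eigenvalues, so applying it twice returns y
print(np.allclose(sign_times_vector(Q, sign_times_vector(Q, y)), y))
```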

We give a unified treatment of, and propose various improvements to, a number of methods that have been considered previously in the literature. We consider, among others, methods based on Chebyshev polynomials, Lanczos approximations, and methods exploiting the multi-shift Conjugate Gradient method. Special emphasis is put on explicit accuracy bounds on the inner iterations. This is important in order to be able to tune the precision of the computed matrix-vector product in every outer iteration step with the strategies that we will propose in Chapter 5. We develop procedures for various approximation methods that guarantee a given accuracy for the matrix-vector product.

In one particular method, frequently used by physicists, the matrix sign function is approximated by a rational matrix function written as a sum of poles; this gives

sign(Q) y ≈ Σ_{i=1}^{m} ω_i Q (Q^2 + τ_i I)^{−1} y.

The choice of the shifts τ_i, the weights ω_i and the number of poles, m, depends on the type of rational approximation used, the location of the eigenvalues of Q and the required precision. This scheme reduces the problem to solving m so-called shifted linear systems, which may be efficiently accomplished with a method from the class of multi-shift Krylov subspace methods, which are variants of the standard iterative methods designed for solving families of shifted systems. The cost of this method depends, besides the standard cost of the Conjugate Gradient method, on the number of shifted systems to be solved. We will improve this method considerably by reducing the number of poles. First, we propose a new rational approximation based on the work of Zolotarev. This leads to a significant reduction of the number of necessary poles compared to the rational approximations previously used in the computation of the overlap operator. Furthermore, we propose a modification of the multi-shift iterative solver that saves computational work by using an individual tolerance for each shifted system. Again, we develop a procedure to guarantee a given accuracy. Chapter 4 is concluded with a comparative study, with realistic configurations, of the various improved methods on a parallel cluster computer. This shows that our new multi-shift approach based on Zolotarev's work, in combination with early termination of converged shifted systems, is the most efficient.
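The structure of this partial fraction evaluation is easy to see in a toy setting. The sketch below is an illustration only: the weights and shifts are placeholder values rather than the Zolotarev coefficients used in the thesis, and the shifted systems are solved with dense solvers. In practice Q is huge and sparse and the m shifted systems are solved simultaneously with a multi-shift CG iteration.

```python
import numpy as np

def pfe_sign_times_vector(Q, y, omega, tau):
    """Evaluate the partial fraction form  sum_i omega_i Q (Q^2 + tau_i I)^(-1) y.

    omega and tau must come from a rational approximation to sign(t); here they
    are left as inputs, and dense solves are used purely for illustration."""
    n = Q.shape[0]
    z = np.zeros(n, dtype=complex)
    for w, t in zip(omega, tau):
        z += w * np.linalg.solve(Q @ Q + t * np.eye(n), y)   # one shifted system per pole
    return Q @ z

# illustration with hypothetical (untuned) weights and shifts
rng = np.random.default_rng(3)
B = rng.standard_normal((6, 6)) + 1j * rng.standard_normal((6, 6))
Q = (B + B.conj().T) / 2
y = rng.standard_normal(6)
omega, tau = [0.5, 0.3], [0.1, 1.0]        # placeholder values, not Zolotarev coefficients
print(pfe_sign_times_vector(Q, y, omega, tau))
```

With properly chosen coefficients the same loop structure is what makes the multi-shift approach attractive: all m systems share the matrix Q^2, so a single Krylov subspace can serve all poles at once.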

Chapter 5: Inexact Krylov subspace methods

Matrix-vector products are an essential ingredient of iterative solvers for linear systems, in particular of the so-called Krylov subspace methods. In Chapter 5 we discuss the impact of an approximately computed matrix-vector product on a variety of iterative solvers for linear systems. Although this problem was motivated by the overlap formulation in quantum chromodynamics, we will give a very general treatment of this problem for linear systems of the form Ax = b. Following nomenclature often used in the literature, we will refer to Krylov subspace methods with approximate matrix-vector products as inexact Krylov subspace methods.

The errors in the matrix-vector products essentially have two consequences: the accuracy of the iterative method is limited and, secondly, the convergence speed is altered. We investigate both aspects by studying the convergence behavior and the smallest attainable value of the true residual, defined as b − A x_k, where x_k is the computed approximation in step k of the iterative method. A consequence of working with an inexact matrix-vector product is that the computed residual in step k, r_k, usually is no longer a residual corresponding to the computed approximation x_k, hence r_k ≠ b − A x_k. We have that

‖b − A x_k‖_2 ≤ ‖r_k − (b − A x_k)‖_2 + ‖r_k‖_2,

where the term on the left is the norm of the true residual, the first quantity on the right is commonly referred to as the norm of the residual gap, and the last term is the norm of the computed residual. This simple inequality forms the basis of our analysis. We argue that the attainable accuracy is determined by the norm of the residual gap, whereas the convergence speed is determined by the computed residuals. In Chapter 5 we study the residual gap and the convergence behavior of the computed residuals for various Krylov subspace methods, including stationary methods, like Chebyshev iteration, as well as several non-stationary methods such as the Conjugate Gradient method and the GMRES method.

Bouras and Frayssé present in a recent technical report [14] a large number of numerical experiments in which, in step k of the GMRES method, the matrix-vector product is computed with a relative precision given by

ε / ‖b − A x_{k−1}‖_2.

The value of ε is chosen in the order of the required residual precision. They empirically observe that, for this choice of the precision, the attainable precision of the inexact method is about ε. Furthermore, they notice from their numerical experiments that the convergence speed of the perturbed method is approximately as fast as for the exact GMRES method. They refer to this choice for the relative precision as a relaxation strategy, since it results in very accurate matrix-vector products in the early iterations but this precision is relaxed during the iteration process as soon as x_{k−1} becomes a better approximation to the exact solution. Our analysis in Chapter 5 explains the success of this strategy and shows that it is essentially correct and optimal. Furthermore, we point out when a similar strategy is appropriate for other Krylov subspace methods as well, and for some Krylov methods an even more aggressive relaxation strategy is proposed.
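The following NumPy sketch, written for this summary rather than taken from the thesis, mimics this situation for the simplest possible method, a Richardson iteration: the matrix-vector product at step k is perturbed with relative precision ε/‖r_{k−1}‖_2, and at the end the computed residual, the true residual and the residual gap are compared. The computed residual keeps decreasing while the true residual stagnates at a small multiple of ε, which is the behavior described above.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 100
A = np.eye(n) + 0.01 * np.diag(np.arange(n))       # SPD test matrix, eigenvalues in [1, 1.99]
b = rng.standard_normal(n)
omega, eps = 0.66, 1e-8                            # damping parameter and target precision

x = np.zeros(n)
r = b.copy()                                       # recursively updated (computed) residual
for k in range(60):
    eta = eps / np.linalg.norm(r)                  # relaxed relative precision for this step
    Ar = A @ r
    e = rng.standard_normal(n)
    q = Ar + (eta * np.linalg.norm(Ar) / np.linalg.norm(e)) * e   # inexact product, ||error|| = eta ||A r||
    x = x + omega * r
    r = r - omega * q                              # the residual recursion uses the inexact product

true_res = np.linalg.norm(b - A @ x)
gap = np.linalg.norm(r - (b - A @ x))              # residual gap
print("computed:", np.linalg.norm(r), " true:", true_res, " gap:", gap)
```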

In the second part of Chapter 5 we discuss the computational advantages and drawbacks of the use of a relaxation strategy. We argue that the drawbacks can be overcome by preconditioning an inexact Krylov subspace method by another inexact Krylov subspace method set to a larger precision. This means, for example for the QCD problem, that we get, in total, a three-level iteration scheme. The nesting of inexact Krylov subspace methods can be a very effective tool in reducing the total cost of the matrix-vector multiplications; we demonstrate this for a Schur complement system that stems from a model that describes the steady barotropic flow in a homogeneous ocean with constant depth.


Chapter 2

Eigenvector approximations from a subspace

The research in this chapter is published as part of:

G. L. G. Sleijpen, J. van den Eshof, and P. Smit. Optimal a priori error bounds for the Rayleigh-Ritz method. Math. Comp., 72.

G. L. G. Sleijpen and J. van den Eshof. On the use of harmonic Ritz pairs in approximating internal eigenpairs. Linear Algebra Appl., 358(1-3), 2003.

2.1 Introduction

In many scientific computations it is at some point necessary to compute an eigenvector corresponding to some eigenvalue of a matrix A. Or, in other words, one wants to find an approximation to a pair (λ, x) (with x ≠ 0) that satisfies Ax = λx. Often the matrix A is of very large dimension but contains only a few nonzero elements, and only a small subset of the eigenvalues and eigenvectors is required. Iterative projection methods are designed for solving these large sparse eigenvalue problems, and well-known examples of methods in this class include the Lanczos method [94, Chapter 13], the Davidson method [24] and Jacobi-Davidson [110], to mention only a few.

There are two distinct aspects of this type of projection method. The first is the step-by-step construction of a subspace that contains approximations to the sought-after eigenvectors. The second aspect is the extraction of good eigenvector approximations from that subspace by using a projection technique. The subspace projection is sometimes viewed as a way to accelerate the convergence of a simple iteration method, in a similar fashion as, for example, GMRES for systems of linear equations can be seen as an accelerated version of Richardson iteration. However, the situation for eigenvalue methods is often more delicate, because frequently an approximate eigenpair from the subspace is used in the computation of a vector to expand the subspace, or for restart purposes. For this reason the success of the solution method crucially depends on the success of extracting a good eigenvector approximation to a relevant eigenpair.

In this chapter we focus on the extraction phase. The expansion of the subspace is the subject of Chapter 3. This means that in this chapter we assume that we are given some subspace that contains a reasonable approximation to the eigenvector of interest to us, which depends on the particular application. In the remainder of this section we outline the organization of this chapter.

The best-known method for forming approximations from a given subspace is the Rayleigh-Ritz method, which we discuss in Section 2.2. We then review a result that says that, if the subspace contains a good approximation to the wanted eigenvector, the Rayleigh-Ritz method constructs at least one approximate eigenpair for which the approximate eigenvalue is close to the eigenvalue of interest. We consider how good the associated approximate eigenvector (called a Ritz vector) is as an approximation to the eigenvector. This is the central question that we ask ourselves for Rayleigh-Ritz, since it teaches us for which type of eigenvalues the Rayleigh-Ritz method is guaranteed to be an appropriate method. We will show that, in order for the eigenvector approximation to be relevant, it is sufficient that the target eigenvalue is in some sense an outlier in the spectrum. We will say that this eigenvalue is in the exterior of the spectrum.

To get some insight into the behavior of the Ritz vectors as a function of the quality of the given subspace, we work out the details for the symmetric case by deriving error bounds for the Rayleigh-Ritz approximation to the eigenpair with the smallest eigenvalue.

The bounds are expressed in terms of the eigenvalues of A and the angle between the subspace and the eigenvector of interest. We may therefore call these bounds truly a priori. (Obviously, all results can be transformed to statements about the largest eigenvalue and corresponding eigenvector by replacing A with −A.) This is the subject of Section 2.3.

In practical applications one is often searching for an eigenpair with the eigenvalue in some relevant region of the complex plane. For example, one is interested in the smallest eigenvalue or the one closest to some target value in the interior of the spectrum. Unfortunately, Rayleigh-Ritz is less suitable in this latter case. We discuss this in more detail further on in this chapter. In particular for symmetric matrices there have been various efforts to overcome the difficulties with finding interior eigenpairs. For example, Scott [102] argues that working with a shifted and inverted operator in Rayleigh-Ritz is preferable. Morgan points out in [86] that the necessary expensive inversion of the operator can be handled implicitly with a particular choice for the subspace. The resulting method has been given the name harmonic Rayleigh-Ritz in [93]. Independently of this work, the eigenvalue approximations of this method (the harmonic Ritz values) had already received considerable attention in the special case that the subspace is a so-called Krylov subspace. Then the harmonic Ritz values are equal to the roots of kernel polynomials, which play an important role in the theory of iterative minimal residual methods for linear systems; see [35, 84] and [32, Section 2.5] for some recent work and references. For general subspaces, harmonic Ritz values have also been studied in the context of Lehmann's optimal inclusion intervals for eigenvalues [81, 82, 94, 7]. The connection between these different areas of research was made in [93]. In Section 2.4 we give a definition of harmonic Rayleigh-Ritz with respect to some shift parameter, and we summarize some useful properties in Section 2.5.

Subsequently, in Section 2.6, we compare harmonic Rayleigh-Ritz to refined Rayleigh-Ritz. Refined Rayleigh-Ritz, popularized by Jia [72], is another method to compute approximations from a subspace, specially designed for eigenvectors with eigenvalues in the interior of the spectrum. There we give a relation that shows that both methods are equivalent in some sense. Although the relation between these two approaches is of interest in its own right, it turns out to be also useful in the rest of this chapter. If we vary the shift in the harmonic Rayleigh-Ritz method, then the angle between the eigenvector approximation (the harmonic Ritz vector) and the target eigenvector changes. As an application of the relation between harmonic and refined Rayleigh-Ritz, we also discuss in Section 2.6 the question of which shift for harmonic Rayleigh-Ritz minimizes this angle. This should provide insight into the issue of choosing this shift parameter.

The subject of Section 2.7 is a priori error bounds for the harmonic Rayleigh-Ritz method. We generalize well-known error bounds for Rayleigh-Ritz to the harmonic Rayleigh-Ritz context and discuss some of their limitations. A posteriori error bounds for the harmonic Ritz values are discussed in Section 2.8.

By changing the shift in harmonic Rayleigh-Ritz, different intervals can be obtained. Each interval contains at least one eigenvalue. We give a condition for a posteriori choosing a new shift that results in a smaller inclusion interval. Repeatedly relocating the shift using this condition will ultimately result in an, evidently appealing, optimal interval with respect to the given information. This interval can be used as an a posteriori error estimator.

So far we have assumed that we were able to identify the harmonic Ritz pair that has its approximate eigenvector close to the wanted eigenvector. When searching for the smallest and largest eigenvalues of a symmetric matrix with the Rayleigh-Ritz method, this is indeed not a difficult problem. However, when searching with harmonic Rayleigh-Ritz for an eigenpair with its eigenvalue closest to some target, this is less obvious. For a particular shift, the harmonic Rayleigh-Ritz method produces a set of harmonic Ritz vectors. In practice, the eigenvector is unknown, and it is not obvious how to tell which vector from this set forms the best approximation to the target eigenvector. The problem of selecting a well-suited harmonic Ritz vector for a given shift is treated in Section 2.9.

Although some of the results in this chapter have practical applications, the purpose of this chapter is to provide insight rather than algorithms.

2.2 Rayleigh-Ritz approximations

Let A ∈ C^{n×n} be a general matrix with eigenpairs (λ, x), and let V ∈ C^{n×k} be a matrix whose columns form an orthonormal basis for the k-dimensional subspace V. We are interested in techniques that compute approximations to eigenpairs from a subspace. The most important method in this class is the Rayleigh-Ritz method. The Rayleigh-Ritz method obtains k approximate eigenpairs (ϑ, u), the so-called Ritz pairs, by imposing the Ritz-Galerkin condition

Au − ϑu ⊥ V with u ∈ V\{0}, or equivalently, V^* A V z − ϑz = 0 with u := Vz ≠ 0.   (2.2.1)

The value ϑ can be seen as an approximation to an eigenvalue of A and is called a Ritz value. The associated vector u (the Ritz vector) forms an approximation to an eigenvector of A. (According to B. N. Parlett, the terms Ritz value and Ritz vector are, for historical reasons, not correct in the non-Hermitian case. He proposed to overcome this problem of nomenclature by adding quotation marks in the non-Hermitian case, i.e., using the terms "Ritz value" and "Ritz vector". We will, however, not follow this suggestion.) From (2.2.1) it follows that ϑ equals the so-called Rayleigh quotient, ρ(u), of the vector u:

ϑ = ρ(u), where ρ(v) := (v^* A v) / (v^* v).
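As a concrete, purely illustrative rendering of this extraction step (not code from the thesis), the following NumPy sketch computes all Ritz pairs of a random symmetric matrix with respect to a random 10-dimensional subspace; it restricts to the symmetric case so that numpy.linalg.eigh can be used for the small projected problem.

```python
import numpy as np

rng = np.random.default_rng(5)
n, k = 200, 10
B = rng.standard_normal((n, n))
A = (B + B.T) / 2                                   # symmetric test matrix

V, _ = np.linalg.qr(rng.standard_normal((n, k)))    # orthonormal basis of a k-dimensional subspace
H = V.T @ A @ V                                     # projected matrix V^* A V
theta, Z = np.linalg.eigh(H)                        # Ritz values theta, coefficient vectors z
U = V @ Z                                           # Ritz vectors u = V z (unit norm)

# residual norms ||A u - theta u||_2 of the k Ritz pairs
res = np.linalg.norm(A @ U - U * theta, axis=0)
print(theta)
print(res)
```

The residual norms ‖Au − ϑu‖_2 indicate which Ritz pairs, if any, are already good approximations to eigenpairs of A.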

We will assume throughout this chapter that ‖u‖_2 = 1. In this chapter we assume that we are looking for some approximation to a particular eigenpair that we denote by (λ, x). In order to be able to construct robust algorithms for eigenvector computation, we need reliable methods for extracting from a subspace approximations to this eigenvector x, and similarly for the eigenvalue. Therefore, we consider the following question: suppose that we are searching for an eigenpair (λ, x) and that ∠(V, x) is small; is there then a Ritz pair (ϑ, u) such that |ϑ − λ| and ∠(u, x) are small? For the Ritz values this question is answered positively by the following result.

Theorem (Stewart and Jia [74]). There exists a Ritz value ϑ such that

|ϑ − λ| ≤ 4 ‖A‖_2 (tan ∠(V, x))^{1/k} (2 + tan ∠(V, x))^{1 − 1/k}.

This shows that if the angle between the subspace V and the unknown eigenvector x decreases, there is always a Ritz value getting closer and closer to the eigenvalue λ. For the Ritz vectors the following result is well known. It was originally proved by Saad [98] for real symmetric matrices and later extended by Stewart to the general case [119].

Theorem (Stewart [119]). Let u be a Ritz vector with respect to the space V. Let W be the orthogonal complement of u in V. Then

sin^2 ∠(u, x) ≤ (1 + η^2 / α^2) sin^2 ∠(V, x),   (2.2.2)

where η := ‖V V^* A (I − V V^*)‖_2 and α := inf_{‖z‖_2 = 1} ‖(W^* A W) z − λ z‖_2, with W an arbitrary orthonormal basis for W.

The problem with this bound is that the value α in general cannot be bounded from below a priori. It can be shown [74] that a similar result holds with λ in the expression for α replaced by ϑ. This gives the possibility of a posteriori checking the quality of the Ritz vector. Nevertheless, it can happen in practice that a good subspace V (i.e., ∠(V, x) small) results in a Ritz value close to the eigenvalue of interest, λ, but the theorem does not guarantee that the corresponding Ritz vector is a good approximation to x, because α is small. Unfortunately, in practical situations it is observed that in these cases the Ritz vector can be totally irrelevant. We return to this later in this chapter (see also [102, 74, 86]).

This theorem does not exclude that there are eigenvalues λ for which we can say beforehand that we can safely use the Ritz vector corresponding to ϑ as an approximation to x.

This means that we have to show that α is bounded from below if ϑ is close to the target eigenvalue. This quantity α is unfortunately difficult to assess, since it requires knowledge of the unknown Ritz vector. Therefore, we give the following variant of the above theorem that involves a quantity γ(·).

Theorem. Let (ϑ, u) be a Ritz pair with respect to the space V. If γ(ϑ) > 0, where

γ(µ) := min_{z ⊥ x, z ≠ 0} | (z^* A z) / (z^* z) − µ |,

then

tan ∠(u, x) ≤ (η / γ(ϑ)) tan ∠(V, x),   (2.2.3)

where η := ‖(I − x x^*)(A − ϑ I)(I − x x^*)‖_2.

Proof. Without loss of generality we can assume that the matrix A is upper triangular and of the form

A = [ λ, r ; 0, R ],   (2.2.4)

with r a 1×(n−1) row vector and R an (n−1)×(n−1) upper triangular matrix. Hence, the eigenvector of interest is simply the first standard basis vector: x = e_1. Let x_V be the projection of x onto the space V. For the moment we assume that we can write

û := (e_1^* u)^{−1} u = [ 1 ; e ]   and   x̂_V := (e_1^* x_V)^{−1} x_V = [ 1 ; f ].   (2.2.5)

The residual of the Ritz vector u,

A û − ϑ û = [ λ − ϑ + r e ; (R − ϑ I) e ],

is by definition orthogonal to u and to x_V. This results in the two equations

0 = λ − ϑ + r e + e^* (R − ϑ I) e,
0 = λ − ϑ + r e + f^* (R − ϑ I) e.

Equating both expressions and taking the absolute value on both sides gives

γ(ϑ) ‖e‖_2^2 ≤ | e^* (R − ϑ I) e | = | f^* (R − ϑ I) e | ≤ ‖f‖_2 ‖e‖_2 ‖R − ϑ I‖_2,

from which (2.2.3) follows. It remains to be checked that u is not perpendicular to x, or equivalently, that u is not of the form

u = [ 0 ; ẽ ]   for some ẽ with ‖ẽ‖_2 = 1.

This also implies that x_V is nonzero. The proof is by contradiction: writing out u^*(Au − ϑu) = 0 for such a u gives ẽ^* (R − ϑ I) ẽ = 0, which implies that ẽ = 0 since γ(ϑ) > 0, contradicting ‖ẽ‖_2 = 1. This concludes the proof.

It follows from this theorem that if γ(λ) > 0, then the Ritz vector associated with ϑ is a good approximation to x if ϑ is close enough to λ. The theorem of Stewart and Jia shows that there is a Ritz value arbitrarily close to λ if the quality of the subspace is high (that is, ∠(V, x) is small). Hence, for this type of eigenvalue we can safely use the Rayleigh-Ritz approximation without extra precautions. The condition γ(λ) > 0 means that λ is, in some sense, an extreme eigenvalue. For example, if A is normal it says that λ is outside the convex hull of the other eigenvalues of A. In particular, for the real symmetric case it means that we can expect sensible approximations for the smallest and largest eigenpair.

The bound in the theorem above does not provide a true a priori error bound, since it requires knowledge of ϑ. Therefore, it is difficult to interpret how the quality of the Ritz vector precisely depends on the quality of the subspace (∠(V, x)). In the next section we derive a priori error bounds for the Ritz vector in the real symmetric case when approximating the smallest eigenvalue.

2.3 A priori error bounds for the Ritz pair

From now on we assume that A ∈ R^{n×n} is symmetric and V is real. The eigenpairs (λ_i, x_i) of A are numbered such that λ_1 ≤ λ_2 ≤ ... ≤ λ_n, and we index the Ritz values in a similar fashion: ϑ_1 ≤ ϑ_2 ≤ ... ≤ ϑ_{k−1} ≤ ϑ_k. From the results of the previous section we know that the Rayleigh-Ritz method can be safely used for finding an approximation to the first eigenpair. In this section we want to make this statement more precise, and we are interested in the Ritz pair (ϑ_V, u_V) for which sin^2 ∠(u_V, x_1) is minimal over all Ritz vectors u_i. This is the pair with the Ritz vector that makes the smallest angle with x_1 over all Ritz vectors. In the ideal case we would have that u_V is a multiple of x_V, where x_V is the normalized projection of x_1 on V. This would give sin^2 ∠(u_V, x_1) = sin^2 ∠(V, x_1), which is optimal. Unfortunately, the approximation u_V is not a multiple of x_V in general.

In this section we derive optimal upper bounds for the first Ritz pair. We will moreover show that ϑ_V equals ϑ_1, given that the subspace contains a sufficiently accurate approximation; this will be shown in the subsection on the sharp upper bound. For the convenience of the reader and for comparison purposes, we start in the next subsection by discussing some classical bounds for the first Ritz pair that can be found in the literature.

Besides our theoretical interest in a priori error bounds, the new, sharper bounds can be used to improve a priori convergence bounds for iterative eigenvalue methods. Often, the analysis of these methods can be split into the construction of an upper bound on sin^2 ∠(V, x_1) and the analysis of the error contributed by the Rayleigh-Ritz method.

For example, Theorem 1 in [98] gives a bound for the angle between x_1 and Krylov subspaces. Combining this with the classical and well-known error bounds discussed in the next section gives precisely the bound of Kaniel [75] for the first eigenvector in the Lanczos method. In the literature, these bounds are often improved by (implicitly) constructing better bounds for sin^2 ∠(V, x_1). In this section we focus on error bounds for the Rayleigh-Ritz method, and our results are not restricted to a specific method.

A well-known upper bound

A first approach for obtaining a true a priori bound is suggested at the end of Section 11.9 in [94], where the elegant bounds of Kaniel [75] (see also [94]) are the starting point. Using the notation ε := sin^2 ∠(V, x_1), these bounds are summarized by the following theorem.

Theorem (Kaniel [75]).

ϑ_1 − λ_1 ≤ (λ_n − λ_1) ε,   (2.3.1)

sin^2 ∠(u_1, x_1) ≤ (ϑ_1 − λ_1) / (λ_2 − λ_1).   (2.3.2)

Furthermore, both inequalities are sharp.

We recall that for more general matrices we gave an error bound, (2.2.3), for the Ritz vector in case the corresponding Ritz value is close to an extreme eigenvalue λ, that is, γ(λ) > 0. An interesting question is whether we, for this more general situation, can also derive a bound in terms of the eigenvalues of the matrix and |λ − ϑ| as in (2.3.2). It turns out that this is not possible, not even for γ(λ) > 0. To see this, let u be some vector with ρ(u) = λ and let u and A be decomposed as in (2.2.5) and (2.2.4), respectively. Then ρ(u) = λ gives

r e + e^* (R − λ I) e = 0.

This equality does not imply that e = 0, even if γ(λ) > 0. Therefore, it follows from this example of the Rayleigh-Ritz method with a one-dimensional subspace that it is in general not possible to derive error bounds for the Ritz vector in terms of the quantity |ϑ − λ| and the eigenvalues of the matrix only (unless r in (2.2.4) is zero).

We return to the issue of deriving a priori error bounds for the Ritz vectors in the symmetric case. From the theorem of Kaniel we can easily obtain an error bound for the first Ritz vector that is truly a priori; in other words, it is expressed in terms of ε and the eigenvalues of A. The proof of this statement is a straightforward combination of (2.3.1) and (2.3.2).

Theorem.

sin^2 ∠(u_1, x_1) ≤ ((λ_n − λ_1) / (λ_2 − λ_1)) ε = (1 + (λ_n − λ_2) / (λ_2 − λ_1)) ε.   (2.3.3)

Although (2.3.3) is a combination of the sharp bounds (2.3.1) and (2.3.2), there is no guarantee that this bound is sharp itself. Since (2.3.2) attains equality if u_1 has a component in the direction of x_2, while for (2.3.1) equality is attained when there is a component in the direction of x_n, it is suggested that (2.3.3) may not be sharp. Indeed, in the next subsection we improve this bound and construct a sharp bound for ε < (λ_2 − λ_1)/(λ_n − λ_1). Notice that (2.3.3) is not useful when this condition on ε is not fulfilled. Another question that we address is whether ϑ_V equals ϑ_1. This is important for the selection problem, i.e., at some point it is necessary to select the Ritz vector that makes the smallest angle with x_1.

A sharp upper bound

In his PhD thesis [117] and in the technical report [116], Smit addressed the problem of obtaining optimal bounds for the Rayleigh-Ritz process. He derived such bounds for the case dim(V) = 2 and generated approximations for the k-dimensional case (k > 2) by numerical experiments. On the basis of his numerical results, he conjectured that when ε < (λ_2 − λ_1)/(λ_n − λ_1), the optimal bound for the k-dimensional case equals the optimal bound for the two-dimensional case. In this section we prove that this is indeed correct.

For convenience we use the following notation. Let δ_V := min_j sin^2 ∠(u_j, x_1), where the minimum is taken over all Ritz vectors u_j with respect to V. Put ε_V := sin^2 ∠(V, x_1). For ε > 0 we define

δ_k(ε) := max { δ_V : dim(V) = k, ε_V ≤ ε }.

The following lemma is an adaptation of Theorem 4.1 in [116]. We give a shorter proof and have added the statement that ϑ_V = ϑ_1 in case ε < (λ_2 − λ_1)/(λ_n − λ_1), which we need in the remainder of this section.

Lemma. If dim(V) = 2 and 0 ≤ ε < (λ_2 − λ_1)/(λ_n − λ_1), then ϑ_V = ϑ_1 < λ_2. Furthermore, with κ := (λ_n − λ_2)^2 / ((λ_n − λ_1)(λ_2 − λ_1)),

δ_2(ε) = (1/2)(1 + ε) − (1/2) √((1 − ε)^2 − κ ε)   if ε < (λ_2 − λ_1)/(λ_n − λ_1),
δ_2(ε) = (1/2)(1 + ε)                               if ε ≥ (λ_2 − λ_1)/(λ_n − λ_1).
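To close this summary of the classical bounds, here is a small NumPy check, written for this text rather than taken from the thesis, of Kaniel's bounds (2.3.1) and (2.3.2) and of the combined bound (2.3.3). The test matrix, spectrum and subspace are assumptions chosen for illustration: a diagonal matrix (so that x_1 = e_1) and a two-dimensional subspace containing a perturbed copy of x_1.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
lam = np.concatenate(([1.0, 2.0], np.linspace(2.5, 10.0, n - 2)))   # lambda_1 <= ... <= lambda_n
A = np.diag(lam)                        # without loss of generality diagonal, so x_1 = e_1
x1 = np.zeros(n); x1[0] = 1.0

# two-dimensional subspace containing a reasonable approximation to x_1
v1 = x1 + 0.02 * rng.standard_normal(n)
v2 = rng.standard_normal(n)
V, _ = np.linalg.qr(np.column_stack([v1, v2]))

eps = 1.0 - np.linalg.norm(V.T @ x1) ** 2          # sin^2 angle(V, x_1)

# Rayleigh-Ritz extraction on V
theta, Z = np.linalg.eigh(V.T @ A @ V)
u1 = V @ Z[:, 0]                                   # Ritz vector for the smallest Ritz value
sin2_u1 = 1.0 - (x1 @ u1) ** 2                     # sin^2 angle(u_1, x_1)

print(theta[0] - lam[0], "<=", (lam[-1] - lam[0]) * eps)               # (2.3.1)
print(sin2_u1, "<=", (theta[0] - lam[0]) / (lam[1] - lam[0]))          # (2.3.2)
print(sin2_u1, "<=", (lam[-1] - lam[0]) / (lam[1] - lam[0]) * eps)     # (2.3.3)
```

All three printed inequalities hold for any subspace, but they become informative only when ε is small relative to the gap ratio (λ_2 − λ_1)/(λ_n − λ_1), as discussed above.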


More information

Numerical Analysis Lecture Notes

Numerical Analysis Lecture Notes Numerical Analysis Lecture Notes Peter J. Olver 6. Eigenvalues and Singular Values In this section, we collect together the basic facts about eigenvalues and eigenvectors. From a geometrical viewpoint,

More information

Linear Algebra I. Ronald van Luijk, 2012

Linear Algebra I. Ronald van Luijk, 2012 Linear Algebra I Ronald van Luijk, 2012 With many parts from Linear Algebra I by Michael Stoll, 2007 Contents 1. Vector spaces 3 1.1. Examples 3 1.2. Fields 4 1.3. The field of complex numbers. 6 1.4.

More information

October 3rd, 2012. Linear Algebra & Properties of the Covariance Matrix

October 3rd, 2012. Linear Algebra & Properties of the Covariance Matrix Linear Algebra & Properties of the Covariance Matrix October 3rd, 2012 Estimation of r and C Let rn 1, rn, t..., rn T be the historical return rates on the n th asset. rn 1 rṇ 2 r n =. r T n n = 1, 2,...,

More information

NOTES ON LINEAR TRANSFORMATIONS

NOTES ON LINEAR TRANSFORMATIONS NOTES ON LINEAR TRANSFORMATIONS Definition 1. Let V and W be vector spaces. A function T : V W is a linear transformation from V to W if the following two properties hold. i T v + v = T v + T v for all

More information

The Heat Equation. Lectures INF2320 p. 1/88

The Heat Equation. Lectures INF2320 p. 1/88 The Heat Equation Lectures INF232 p. 1/88 Lectures INF232 p. 2/88 The Heat Equation We study the heat equation: u t = u xx for x (,1), t >, (1) u(,t) = u(1,t) = for t >, (2) u(x,) = f(x) for x (,1), (3)

More information

LINEAR ALGEBRA W W L CHEN

LINEAR ALGEBRA W W L CHEN LINEAR ALGEBRA W W L CHEN c W W L Chen, 1997, 2008 This chapter is available free to all individuals, on understanding that it is not to be used for financial gain, and may be downloaded and/or photocopied,

More information

Chapter 17. Orthogonal Matrices and Symmetries of Space

Chapter 17. Orthogonal Matrices and Symmetries of Space Chapter 17. Orthogonal Matrices and Symmetries of Space Take a random matrix, say 1 3 A = 4 5 6, 7 8 9 and compare the lengths of e 1 and Ae 1. The vector e 1 has length 1, while Ae 1 = (1, 4, 7) has length

More information

Chapter 20. Vector Spaces and Bases

Chapter 20. Vector Spaces and Bases Chapter 20. Vector Spaces and Bases In this course, we have proceeded step-by-step through low-dimensional Linear Algebra. We have looked at lines, planes, hyperplanes, and have seen that there is no limit

More information

Lecture 3: Finding integer solutions to systems of linear equations

Lecture 3: Finding integer solutions to systems of linear equations Lecture 3: Finding integer solutions to systems of linear equations Algorithmic Number Theory (Fall 2014) Rutgers University Swastik Kopparty Scribe: Abhishek Bhrushundi 1 Overview The goal of this lecture

More information

x1 x 2 x 3 y 1 y 2 y 3 x 1 y 2 x 2 y 1 0.

x1 x 2 x 3 y 1 y 2 y 3 x 1 y 2 x 2 y 1 0. Cross product 1 Chapter 7 Cross product We are getting ready to study integration in several variables. Until now we have been doing only differential calculus. One outcome of this study will be our ability

More information

Numerical Analysis Lecture Notes

Numerical Analysis Lecture Notes Numerical Analysis Lecture Notes Peter J. Olver 5. Inner Products and Norms The norm of a vector is a measure of its size. Besides the familiar Euclidean norm based on the dot product, there are a number

More information

1.3. DOT PRODUCT 19. 6. If θ is the angle (between 0 and π) between two non-zero vectors u and v,

1.3. DOT PRODUCT 19. 6. If θ is the angle (between 0 and π) between two non-zero vectors u and v, 1.3. DOT PRODUCT 19 1.3 Dot Product 1.3.1 Definitions and Properties The dot product is the first way to multiply two vectors. The definition we will give below may appear arbitrary. But it is not. It

More information

On the Subspace Projected Approximate Matrix method. J. H. Brandts and R. Reis da Silva

On the Subspace Projected Approximate Matrix method. J. H. Brandts and R. Reis da Silva NUMERICAL LINEAR ALGEBRA WITH APPLICATIONS Numer. Linear Algebra Appl. 2011; 00:1 23 Published online in Wiley InterScience (www.interscience.wiley.com). On the Subspace Projected Approximate Matrix method

More information

Orthogonal Diagonalization of Symmetric Matrices

Orthogonal Diagonalization of Symmetric Matrices MATH10212 Linear Algebra Brief lecture notes 57 Gram Schmidt Process enables us to find an orthogonal basis of a subspace. Let u 1,..., u k be a basis of a subspace V of R n. We begin the process of finding

More information

Notes on Symmetric Matrices

Notes on Symmetric Matrices CPSC 536N: Randomized Algorithms 2011-12 Term 2 Notes on Symmetric Matrices Prof. Nick Harvey University of British Columbia 1 Symmetric Matrices We review some basic results concerning symmetric matrices.

More information

ESSAYS ON MONTE CARLO METHODS FOR STATE SPACE MODELS

ESSAYS ON MONTE CARLO METHODS FOR STATE SPACE MODELS VRIJE UNIVERSITEIT ESSAYS ON MONTE CARLO METHODS FOR STATE SPACE MODELS ACADEMISCH PROEFSCHRIFT ter verkrijging van de graad Doctor aan de Vrije Universiteit Amsterdam, op gezag van de rector magnificus

More information

Lecture Topic: Low-Rank Approximations

Lecture Topic: Low-Rank Approximations Lecture Topic: Low-Rank Approximations Low-Rank Approximations We have seen principal component analysis. The extraction of the first principle eigenvalue could be seen as an approximation of the original

More information

University of Lille I PC first year list of exercises n 7. Review

University of Lille I PC first year list of exercises n 7. Review University of Lille I PC first year list of exercises n 7 Review Exercise Solve the following systems in 4 different ways (by substitution, by the Gauss method, by inverting the matrix of coefficients

More information

Some Polynomial Theorems. John Kennedy Mathematics Department Santa Monica College 1900 Pico Blvd. Santa Monica, CA 90405 rkennedy@ix.netcom.

Some Polynomial Theorems. John Kennedy Mathematics Department Santa Monica College 1900 Pico Blvd. Santa Monica, CA 90405 rkennedy@ix.netcom. Some Polynomial Theorems by John Kennedy Mathematics Department Santa Monica College 1900 Pico Blvd. Santa Monica, CA 90405 rkennedy@ix.netcom.com This paper contains a collection of 31 theorems, lemmas,

More information

MATH 423 Linear Algebra II Lecture 38: Generalized eigenvectors. Jordan canonical form (continued).

MATH 423 Linear Algebra II Lecture 38: Generalized eigenvectors. Jordan canonical form (continued). MATH 423 Linear Algebra II Lecture 38: Generalized eigenvectors Jordan canonical form (continued) Jordan canonical form A Jordan block is a square matrix of the form λ 1 0 0 0 0 λ 1 0 0 0 0 λ 0 0 J = 0

More information

Elasticity Theory Basics

Elasticity Theory Basics G22.3033-002: Topics in Computer Graphics: Lecture #7 Geometric Modeling New York University Elasticity Theory Basics Lecture #7: 20 October 2003 Lecturer: Denis Zorin Scribe: Adrian Secord, Yotam Gingold

More information

Introduction to Matrix Algebra

Introduction to Matrix Algebra Psychology 7291: Multivariate Statistics (Carey) 8/27/98 Matrix Algebra - 1 Introduction to Matrix Algebra Definitions: A matrix is a collection of numbers ordered by rows and columns. It is customary

More information

THREE DIMENSIONAL GEOMETRY

THREE DIMENSIONAL GEOMETRY Chapter 8 THREE DIMENSIONAL GEOMETRY 8.1 Introduction In this chapter we present a vector algebra approach to three dimensional geometry. The aim is to present standard properties of lines and planes,

More information

Orthogonal Projections

Orthogonal Projections Orthogonal Projections and Reflections (with exercises) by D. Klain Version.. Corrections and comments are welcome! Orthogonal Projections Let X,..., X k be a family of linearly independent (column) vectors

More information

Metric Spaces. Chapter 7. 7.1. Metrics

Metric Spaces. Chapter 7. 7.1. Metrics Chapter 7 Metric Spaces A metric space is a set X that has a notion of the distance d(x, y) between every pair of points x, y X. The purpose of this chapter is to introduce metric spaces and give some

More information

Applied Algorithm Design Lecture 5

Applied Algorithm Design Lecture 5 Applied Algorithm Design Lecture 5 Pietro Michiardi Eurecom Pietro Michiardi (Eurecom) Applied Algorithm Design Lecture 5 1 / 86 Approximation Algorithms Pietro Michiardi (Eurecom) Applied Algorithm Design

More information

Lecture Notes to Accompany. Scientific Computing An Introductory Survey. by Michael T. Heath. Chapter 10

Lecture Notes to Accompany. Scientific Computing An Introductory Survey. by Michael T. Heath. Chapter 10 Lecture Notes to Accompany Scientific Computing An Introductory Survey Second Edition by Michael T. Heath Chapter 10 Boundary Value Problems for Ordinary Differential Equations Copyright c 2001. Reproduction

More information

Modern Optimization Methods for Big Data Problems MATH11146 The University of Edinburgh

Modern Optimization Methods for Big Data Problems MATH11146 The University of Edinburgh Modern Optimization Methods for Big Data Problems MATH11146 The University of Edinburgh Peter Richtárik Week 3 Randomized Coordinate Descent With Arbitrary Sampling January 27, 2016 1 / 30 The Problem

More information

Systems of Linear Equations

Systems of Linear Equations Systems of Linear Equations Beifang Chen Systems of linear equations Linear systems A linear equation in variables x, x,, x n is an equation of the form a x + a x + + a n x n = b, where a, a,, a n and

More information

Bindel, Spring 2012 Intro to Scientific Computing (CS 3220) Week 3: Wednesday, Feb 8

Bindel, Spring 2012 Intro to Scientific Computing (CS 3220) Week 3: Wednesday, Feb 8 Spaces and bases Week 3: Wednesday, Feb 8 I have two favorite vector spaces 1 : R n and the space P d of polynomials of degree at most d. For R n, we have a canonical basis: R n = span{e 1, e 2,..., e

More information

Inner products on R n, and more

Inner products on R n, and more Inner products on R n, and more Peyam Ryan Tabrizian Friday, April 12th, 2013 1 Introduction You might be wondering: Are there inner products on R n that are not the usual dot product x y = x 1 y 1 + +

More information

Recall the basic property of the transpose (for any A): v A t Aw = v w, v, w R n.

Recall the basic property of the transpose (for any A): v A t Aw = v w, v, w R n. ORTHOGONAL MATRICES Informally, an orthogonal n n matrix is the n-dimensional analogue of the rotation matrices R θ in R 2. When does a linear transformation of R 3 (or R n ) deserve to be called a rotation?

More information

Linear Algebra Notes for Marsden and Tromba Vector Calculus

Linear Algebra Notes for Marsden and Tromba Vector Calculus Linear Algebra Notes for Marsden and Tromba Vector Calculus n-dimensional Euclidean Space and Matrices Definition of n space As was learned in Math b, a point in Euclidean three space can be thought of

More information

Matrix Representations of Linear Transformations and Changes of Coordinates

Matrix Representations of Linear Transformations and Changes of Coordinates Matrix Representations of Linear Transformations and Changes of Coordinates 01 Subspaces and Bases 011 Definitions A subspace V of R n is a subset of R n that contains the zero element and is closed under

More information

Numerical Methods I Solving Linear Systems: Sparse Matrices, Iterative Methods and Non-Square Systems

Numerical Methods I Solving Linear Systems: Sparse Matrices, Iterative Methods and Non-Square Systems Numerical Methods I Solving Linear Systems: Sparse Matrices, Iterative Methods and Non-Square Systems Aleksandar Donev Courant Institute, NYU 1 donev@courant.nyu.edu 1 Course G63.2010.001 / G22.2420-001,

More information

Fairness in Routing and Load Balancing

Fairness in Routing and Load Balancing Fairness in Routing and Load Balancing Jon Kleinberg Yuval Rabani Éva Tardos Abstract We consider the issue of network routing subject to explicit fairness conditions. The optimization of fairness criteria

More information

160 CHAPTER 4. VECTOR SPACES

160 CHAPTER 4. VECTOR SPACES 160 CHAPTER 4. VECTOR SPACES 4. Rank and Nullity In this section, we look at relationships between the row space, column space, null space of a matrix and its transpose. We will derive fundamental results

More information

Model order reduction via dominant poles

Model order reduction via dominant poles Model order reduction via dominant poles NXP PowerPoint template (Title) Template for presentations (Subtitle) Joost Rommes [joost.rommes@nxp.com] NXP Semiconductors/Corp. I&T/DTF/Mathematics Joint work

More information

Solving Systems of Linear Equations

Solving Systems of Linear Equations LECTURE 5 Solving Systems of Linear Equations Recall that we introduced the notion of matrices as a way of standardizing the expression of systems of linear equations In today s lecture I shall show how

More information

What is Linear Programming?

What is Linear Programming? Chapter 1 What is Linear Programming? An optimization problem usually has three essential ingredients: a variable vector x consisting of a set of unknowns to be determined, an objective function of x to

More information

Separation Properties for Locally Convex Cones

Separation Properties for Locally Convex Cones Journal of Convex Analysis Volume 9 (2002), No. 1, 301 307 Separation Properties for Locally Convex Cones Walter Roth Department of Mathematics, Universiti Brunei Darussalam, Gadong BE1410, Brunei Darussalam

More information

MATH 304 Linear Algebra Lecture 20: Inner product spaces. Orthogonal sets.

MATH 304 Linear Algebra Lecture 20: Inner product spaces. Orthogonal sets. MATH 304 Linear Algebra Lecture 20: Inner product spaces. Orthogonal sets. Norm The notion of norm generalizes the notion of length of a vector in R n. Definition. Let V be a vector space. A function α

More information

Continuity of the Perron Root

Continuity of the Perron Root Linear and Multilinear Algebra http://dx.doi.org/10.1080/03081087.2014.934233 ArXiv: 1407.7564 (http://arxiv.org/abs/1407.7564) Continuity of the Perron Root Carl D. Meyer Department of Mathematics, North

More information

3.1 State Space Models

3.1 State Space Models 31 State Space Models In this section we study state space models of continuous-time linear systems The corresponding results for discrete-time systems, obtained via duality with the continuous-time models,

More information

24. The Branch and Bound Method

24. The Branch and Bound Method 24. The Branch and Bound Method It has serious practical consequences if it is known that a combinatorial problem is NP-complete. Then one can conclude according to the present state of science that no

More information

DATA ANALYSIS II. Matrix Algorithms

DATA ANALYSIS II. Matrix Algorithms DATA ANALYSIS II Matrix Algorithms Similarity Matrix Given a dataset D = {x i }, i=1,..,n consisting of n points in R d, let A denote the n n symmetric similarity matrix between the points, given as where

More information

[1] Diagonal factorization

[1] Diagonal factorization 8.03 LA.6: Diagonalization and Orthogonal Matrices [ Diagonal factorization [2 Solving systems of first order differential equations [3 Symmetric and Orthonormal Matrices [ Diagonal factorization Recall:

More information

MAT 200, Midterm Exam Solution. a. (5 points) Compute the determinant of the matrix A =

MAT 200, Midterm Exam Solution. a. (5 points) Compute the determinant of the matrix A = MAT 200, Midterm Exam Solution. (0 points total) a. (5 points) Compute the determinant of the matrix 2 2 0 A = 0 3 0 3 0 Answer: det A = 3. The most efficient way is to develop the determinant along the

More information

Mechanics 1: Vectors

Mechanics 1: Vectors Mechanics 1: Vectors roadly speaking, mechanical systems will be described by a combination of scalar and vector quantities. scalar is just a (real) number. For example, mass or weight is characterized

More information

1 Teaching notes on GMM 1.

1 Teaching notes on GMM 1. Bent E. Sørensen January 23, 2007 1 Teaching notes on GMM 1. Generalized Method of Moment (GMM) estimation is one of two developments in econometrics in the 80ies that revolutionized empirical work in

More information

Derivative Free Optimization

Derivative Free Optimization Department of Mathematics Derivative Free Optimization M.J.D. Powell LiTH-MAT-R--2014/02--SE Department of Mathematics Linköping University S-581 83 Linköping, Sweden. Three lectures 1 on Derivative Free

More information

Operation Count; Numerical Linear Algebra

Operation Count; Numerical Linear Algebra 10 Operation Count; Numerical Linear Algebra 10.1 Introduction Many computations are limited simply by the sheer number of required additions, multiplications, or function evaluations. If floating-point

More information

t := maxγ ν subject to ν {0,1,2,...} and f(x c +γ ν d) f(x c )+cγ ν f (x c ;d).

t := maxγ ν subject to ν {0,1,2,...} and f(x c +γ ν d) f(x c )+cγ ν f (x c ;d). 1. Line Search Methods Let f : R n R be given and suppose that x c is our current best estimate of a solution to P min x R nf(x). A standard method for improving the estimate x c is to choose a direction

More information

Inner product. Definition of inner product

Inner product. Definition of inner product Math 20F Linear Algebra Lecture 25 1 Inner product Review: Definition of inner product. Slide 1 Norm and distance. Orthogonal vectors. Orthogonal complement. Orthogonal basis. Definition of inner product

More information

State of Stress at Point

State of Stress at Point State of Stress at Point Einstein Notation The basic idea of Einstein notation is that a covector and a vector can form a scalar: This is typically written as an explicit sum: According to this convention,

More information

5. Orthogonal matrices

5. Orthogonal matrices L Vandenberghe EE133A (Spring 2016) 5 Orthogonal matrices matrices with orthonormal columns orthogonal matrices tall matrices with orthonormal columns complex matrices with orthonormal columns 5-1 Orthonormal

More information

Practical Guide to the Simplex Method of Linear Programming

Practical Guide to the Simplex Method of Linear Programming Practical Guide to the Simplex Method of Linear Programming Marcel Oliver Revised: April, 0 The basic steps of the simplex algorithm Step : Write the linear programming problem in standard form Linear

More information

Notes on Determinant

Notes on Determinant ENGG2012B Advanced Engineering Mathematics Notes on Determinant Lecturer: Kenneth Shum Lecture 9-18/02/2013 The determinant of a system of linear equations determines whether the solution is unique, without

More information

So let us begin our quest to find the holy grail of real analysis.

So let us begin our quest to find the holy grail of real analysis. 1 Section 5.2 The Complete Ordered Field: Purpose of Section We present an axiomatic description of the real numbers as a complete ordered field. The axioms which describe the arithmetic of the real numbers

More information

Factorization Theorems

Factorization Theorems Chapter 7 Factorization Theorems This chapter highlights a few of the many factorization theorems for matrices While some factorization results are relatively direct, others are iterative While some factorization

More information

Lecture 5: Singular Value Decomposition SVD (1)

Lecture 5: Singular Value Decomposition SVD (1) EEM3L1: Numerical and Analytical Techniques Lecture 5: Singular Value Decomposition SVD (1) EE3L1, slide 1, Version 4: 25-Sep-02 Motivation for SVD (1) SVD = Singular Value Decomposition Consider the system

More information