The Mathematics any Physicist Should Know

Thomas Hjortgaard Danielsen
Contents

Preface  5

I  Representation Theory of Groups and Lie Algebras  7

1  Peter-Weyl Theory  9
1.1 Foundations of Representation Theory  9
1.2 The Haar Integral  15
1.3 Matrix Coefficients  18
1.4 Characters  20
1.5 The Peter-Weyl Theorem  24

2  Structure Theory for Lie Algebras  29
2.1 Basic Notions  29
2.2 Semisimple Lie Algebras  35
2.3 The Universal Enveloping Algebra  42

3  Basic Representation Theory of Lie Algebras  49
3.1 Lie Groups and Lie Algebras  49
3.2 Weyl's Theorem  54

4  Root Systems  59
4.1 Weights and Roots  59
4.2 Root Systems for Semisimple Lie Algebras  62
4.3 Abstract Root Systems  68
4.4 The Weyl Group  72

5  The Highest Weight Theorem  75
5.1 Highest Weights  75
5.2 Verma Modules  79
5.3 The Case sl(3, C)  86

6  Infinite-dimensional Representations  91
6.1 Gårding Subspace  91
6.2 Induced Lie Algebra Representations  95
6.3 Self-Adjointness  97
6.4 Applications to Quantum Mechanics  102

II  Geometric Analysis and Spin Geometry  109

7  Clifford Algebras  111
7.1 Elementary Properties  111
7.2 Classification of Clifford Algebras  117
7.3 Representation Theory  121

8  Spin Groups  125
8.1 The Clifford Group  125
8.2 Pin and Spin Groups  128
8.3 Double Coverings  131
8.4 Spin Group Representations  135

9  Topological K-Theory  139
9.1 The K-Functors  139
9.2 The Long Exact Sequence  144
9.3 Exterior Products and Bott Periodicity  149
9.4 Equivariant K-theory  151
9.5 The Thom Isomorphism  155

10  Characteristic Classes  163
10.1 Connections on Vector Bundles  163
10.2 Connections on Associated Vector Bundles*  166
10.3 Pullback Bundles and Pullback Connections  172
10.4 Curvature  175
10.5 Metric Connections  178
10.6 Characteristic Classes  180
10.7 Orientation and the Euler Class  186
10.8 Splitting Principle, Multiplicative Sequences  190
10.9 The Chern Character  197

11  Differential Operators  201
11.1 Differential Operators on Manifolds  201
11.2 The Principal Symbol  205
11.3 Dirac Bundles and the Dirac Operator  210
11.4 Sobolev Spaces  220
11.5 Elliptic Complexes  227

12  The Atiyah-Singer Index Theorem  233
12.1 K-Theoretic Version  233
12.2 Cohomological Version  236

A  Table of Clifford Algebras  245
B  Calculation of Fundamental Groups  247

Bibliography  251
Index  252
Preface

When following courses given by Ryszard Nest at the University of Copenhagen, you can be almost certain that a reference to the Atiyah-Singer Index Theorem will appear at least once during the course. Thus it was an obvious project for me to find out what this apparently great theorem was all about. However, from the beginning I was well aware that this was not an easy task, and that it would be necessary for me to delve into a lot of other subjects involved in its formulation before the goal could be reached. It has never been my intention to actually prove the theorem (well, except for a few moments of utter overambitiousness) but merely to pave a road for my own understanding. This road leads through subjects as varied as K-theory, characteristic classes and elliptic theory. I have tried to treat each subject as thoroughly and self-containedly as I could, even though this meant including material which wasn't really necessary for the Index Theorem.

The starting point is of course my own prerequisites when I began my work half a year ago, that is, a solid foundation in Riemannian geometry, algebraic topology (notably homology and cohomology) and pseudodifferential calculus on Euclidean space. From here we develop at first, in a systematic way, topological K-theory. The approach is via vector bundles, as it can be found in for instance [Atiyah] or [Hatcher]; no $C^*$-algebras are involved. In the first two sections the basic theory will be outlined and most proofs will be given. In the third section we present the famous Bott Periodicity Theorem, without giving a proof. The last two sections are dedicated to the Thom Isomorphism. To this end we introduce equivariant K-theory (that is, K-theory involving group actions), a slight generalization of the K-theory treated in the first sections; here I follow the outline given in the classical article [Segal]. One could argue that equivariant K-theory could have been introduced from the very beginning; however, I have chosen not to do so, in order not to blur the introductory presentation with too many technicalities.

The second chapter deals with the Chern-Weil approach to characteristic classes of vector bundles. The first four sections are devoted to the study of the basic theory of connections on vector bundles. From the curvature forms and invariant polynomials we construct characteristic classes, in particular Chern and Pontrjagin classes, and their relationships will be discussed. In the following section the Euler class of oriented bundles is defined. I have relied heavily on [Morita] and [Milnor, Stasheff] when working out these sections, but [Madsen, Tornehave] has also provided valuable inspiration. The chapter ends with a discussion of certain characteristic classes constructed not from invariant polynomials but from invariant formal power series; examples of such classes are the Todd class, the total $\hat{A}$-class and the Chern character. No effort has been made to include great theorems; in fact there are really no major results in this chapter. It serves as a toolbox to be applied to the construction of the topological index.

The third chapter revolves around differential operators on manifolds. In the standard literature on this subject not much care is taken when transferring differential operators and principal symbols from Euclidean space to manifolds. I've tried to remedy this by giving a precise and detailed treatment. To this I have added a lot of examples of classical differential operators, such as the Laplacian, Hodge-de Rham operators, Dirac operators etc., calculating their formal adjoints and principal symbols. To shed some light on the analytic properties we introduce Sobolev spaces. Essentially there are two different definitions: in the first one, Sobolev spaces are defined in terms of connections, and in the second they are defined as the clutching of local Euclidean Sobolev spaces. We prove that the two definitions agree when the underlying manifold is compact, and we show how to extend differential operators to continuous operators between the Sobolev spaces. The major results, such as the Sobolev Embedding Theorem, the Rellich Lemma and Elliptic Regularity, are given without proofs. We then move on to elliptic complexes, which provide us with a link to the K-theory developed in the first chapter.

In the fourth and final chapter the Index Theorem is presented. We construct the so-called topological index map from the K-group $K(TM)$ to the integers and state the index theorem, which says that the topological index, when evaluated on the specific K-class determined by the symbol of an elliptic differential operator, is in fact equal to the Fredholm index. I give a short sketch of the proof based on the original 1968 article by Atiyah and Singer. Then, by introducing the cohomological Thom isomorphism, Thom defect classes etc., and drawing heavily on the theory developed in the previous chapters, we manage to deduce the famous cohomological index formula. To demonstrate the power of the Index Theorem, we prove two corollaries, namely the generalized Gauss-Bonnet Theorem and the fact that any elliptic differential operator on a compact manifold of odd dimension has index 0.

I would like to thank Professor Ryszard Nest for his guidance and inspiration, as well as for his answers to my ever increasing number of questions.

Copenhagen, March 2008.
Thomas Hjortgaard Danielsen.
Part I

Representation Theory of Groups and Lie Algebras
Chapter 1

Peter-Weyl Theory

1.1 Foundations of Representation Theory

We begin by introducing some basic but fundamental notions and results regarding the representation theory of topological groups. Soon, however, we shall restrict our focus to compact groups and later to Lie groups and their Lie algebras. We begin with the basic theory.

To define the notion of a representation, let V denote a separable Banach space and equip $B(V)$, the space of bounded linear maps $V \to V$, with the strong operator topology, i.e. the topology on $B(V)$ generated by the seminorms $\|A\|_x = \|Ax\|$, $x \in V$. Let $\mathrm{Aut}(V) \subseteq B(V)$ denote the group of invertible linear maps and equip it with the subspace topology, which turns it into a topological group.

Definition 1.1 (Representation). By a continuous representation of a topological group G on a separable Banach space V we understand a continuous group homomorphism $\pi : G \to \mathrm{Aut}(V)$. We also say that V is given the structure of a G-module. If $\pi$ is an injective homomorphism, the representation is called faithful. By the dimension of the representation we mean the dimension of the vector space on which it is represented. If V is infinite-dimensional, the representation is said to be infinite-dimensional as well.

In what follows a group without further specification will always denote a locally compact topological group, and by a representation we will always understand a continuous representation. The reason why we demand the groups to be locally compact should become apparent in the next section. We will distinguish between real and complex representations depending on whether V is a real or a complex Banach space. Without further qualification, the representations considered will all be complex.

The requirement that $\pi$ be strongly continuous can be a little hard to handle, so here is an equivalent condition which is more applicable:

Proposition 1.2. Let $\pi : G \to \mathrm{Aut}(V)$ be a group homomorphism. Then the following conditions are equivalent:
1) $\pi$ is continuous w.r.t. the strong operator topology on $\mathrm{Aut}(V)$, i.e. $\pi$ is a continuous representation.
2) The map $G \times V \to V$ given by $(g, v) \mapsto \pi(g)v$ is continuous.

For a proof see [1] Proposition 18.8.
Example 1.3. The simplest example one can think of is the trivial representation: let G be a group and V a Banach space, and consider the map $G \ni g \mapsto \mathrm{id}_V$. This is obviously a continuous group homomorphism and hence a representation.

Now, let G be a matrix Lie group (i.e. a closed subgroup of $GL(n, \mathbb{C})$). Choosing a basis for $\mathbb{C}^n$ we get an isomorphism $\mathrm{Aut}(\mathbb{C}^n) \cong GL(n, \mathbb{C})$, and we can thus define a representation of G on $\mathbb{C}^n$ simply by the inclusion map $G \hookrightarrow GL(n, \mathbb{C})$. This is obviously a continuous representation of G, called the defining representation.

We can form new representations out of old ones. If $(\pi_1, V_1)$ and $(\pi_2, V_2)$ are representations of G on Banach spaces, we can form their direct sum $\pi_1 \oplus \pi_2$, the representation of G on $V_1 \oplus V_2$ (which has been given the norm $\|(x, y)\| = \|x\| + \|y\|$, turning $V_1 \oplus V_2$ into a Banach space) given by
$(\pi_1 \oplus \pi_2)(g)(x, y) = (\pi_1(g)x, \pi_2(g)y).$
If we have a countable family $(H_i)_{i \in I}$ of Hilbert spaces, we can form the direct sum Hilbert space $\bigoplus_{i \in I} H_i$, the vector space of sequences $(x_i)$, $x_i \in H_i$, satisfying $\sum_{i \in I} \|x_i\|^2_{H_i} < \infty$. Equipped with the inner product
$\langle (x_i), (y_i) \rangle = \sum_{i \in I} \langle x_i, y_i \rangle,$
this is again a Hilbert space. If we have a countable family $(\pi_i, H_i)$ of representations such that $\sup_{i \in I} \|\pi_i(g)\| < \infty$ for each $g \in G$, then we can form the direct sum $\bigoplus_{i \in I} \pi_i$ of the representations on $\bigoplus_{i \in I} H_i$ by
$\Big(\bigoplus_{i \in I} \pi_i\Big)(g)(x_i) = (\pi_i(g)x_i).$
Finally, if $(\pi_1, H_1)$ and $(\pi_2, H_2)$ are representations on Hilbert spaces, we can form the tensor product: namely, equip the tensor product vector space $H_1 \otimes H_2$ with the inner product
$\langle x_1 \otimes x_2, y_1 \otimes y_2 \rangle = \langle x_1, y_1 \rangle \langle x_2, y_2 \rangle,$
which turns $H_1 \otimes H_2$ into a Hilbert space, and define the tensor product representation $\pi_1 \otimes \pi_2$ by
$(\pi_1 \otimes \pi_2)(g)(x \otimes y) = \pi_1(g)x \otimes \pi_2(g)y.$

Definition 1.4 (Unitary Representation). By a unitary representation of a group G we understand a representation $\pi$ on a Hilbert space H such that $\pi(g)$ is a unitary operator for each $g \in G$.

Obviously the trivial representation is a unitary representation, as is the defining representation of any subgroup of the unitary group $U(n)$. In the next section we show unitarity of some more interesting representations.

Definition 1.5 (Intertwiner). Let two representations $(\pi_1, V_1)$ and $(\pi_2, V_2)$ of the same group G be given. By an intertwiner or an intertwining map between $\pi_1$ and $\pi_2$ we understand a bounded linear map $T : V_1 \to V_2$ rendering the following diagram commutative,

    V_1 --pi_1(g)--> V_1
     |                |
     T                T
     v                v
    V_2 --pi_2(g)--> V_2

i.e. satisfying $T \circ \pi_1(g) = \pi_2(g) \circ T$ for all $g \in G$. The set of all intertwining maps is denoted $\mathrm{Hom}_G(V_1, V_2)$.
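Since $\mathrm{Hom}_G(V_1, V_2)$ is cut out by the linear equations $T \circ \pi_1(g) = \pi_2(g) \circ T$, it can actually be computed for concrete finite-dimensional representations. The sketch below (Python with NumPy; the function name and the setup are mine and purely illustrative) finds a basis of the intertwiner space as the nullspace of a vectorized linear system. It suffices to impose the equations for a set of generators of G, since by multiplicativity they then hold for every product of generators (for a compact group one would use a topologically generating set and continuity).

    import numpy as np

    def intertwiner_basis(gens1, gens2, tol=1e-10):
        """Basis of {T : T pi1(g) = pi2(g) T}, with gens1[k] = pi1(g_k) and
        gens2[k] = pi2(g_k) for generators g_k of G."""
        n1, n2 = gens1[0].shape[0], gens2[0].shape[0]
        blocks = []
        for A, B in zip(gens1, gens2):
            # Row-major vectorization: vec(T A) = (I kron A^T) vec(T) and
            # vec(B T) = (B kron I) vec(T), for T of shape (n2, n1).
            blocks.append(np.kron(np.eye(n2), A.T) - np.kron(B, np.eye(n1)))
        M = np.vstack(blocks)
        _, s, Vh = np.linalg.svd(M)
        null = [Vh[i] for i in range(Vh.shape[0]) if i >= len(s) or s[i] < tol]
        return [v.reshape(n2, n1) for v in null]

    # Example: the regular representation of Z/3 compared with itself. It is a
    # direct sum of three distinct characters, so the intertwining number is 3.
    P = np.roll(np.eye(3), 1, axis=0)            # cyclic permutation matrix
    print(len(intertwiner_basis([P], [P])))      # 3

Run on two representations sharing no irreducible constituents, the same routine returns an empty basis, in line with Schur's Lemma below.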
A bijective intertwiner with bounded inverse between two representations is called an equivalence of representations, and the two representations are said to be equivalent. This is denoted $\pi_1 \cong \pi_2$. It's easy to see that $\mathrm{Hom}_G(V_1, V_2)$ is a vector space and that $\mathrm{Hom}_G(V, V)$ is an algebra. The dimension of $\mathrm{Hom}_G(V_1, V_2)$ is called the intertwining number of the two representations. If $\pi_1 \cong \pi_2$ via an intertwiner T, then we have $\pi_2(g) = T \circ \pi_1(g) \circ T^{-1}$. Since we can thus express the one in terms of the other, for almost any purpose the two representations can be regarded as the same.

Proposition 1.6. $\mathrm{Hom}_G$ respects direct sums in the sense that
$\mathrm{Hom}_G(V_1 \oplus V_2, W) \cong \mathrm{Hom}_G(V_1, W) \oplus \mathrm{Hom}_G(V_2, W)$ and (1.1)
$\mathrm{Hom}_G(V, W_1 \oplus W_2) \cong \mathrm{Hom}_G(V, W_1) \oplus \mathrm{Hom}_G(V, W_2).$ (1.2)

Proof. For the first isomorphism we define $\Phi : \mathrm{Hom}_G(V_1 \oplus V_2, W) \to \mathrm{Hom}_G(V_1, W) \oplus \mathrm{Hom}_G(V_2, W)$ by $\Phi(T) := (T|_{V_1}, T|_{V_2})$. It is easy to check that this is indeed an element of the latter space. It has an inverse $\Phi^{-1}$ given by $\Phi^{-1}(T_1, T_2)(v_1, v_2) := T_1(v_1) + T_2(v_2)$, and this proves the first isomorphism. The latter can be proved in the same way.

Definition 1.7. Given a representation $(\pi, V)$ of a group G, we say that a linear subspace $U \subseteq V$ is $\pi$-invariant, or just invariant, if $\pi(g)U \subseteq U$ for all $g \in G$.

If U is a closed invariant subspace for a representation $\pi$ of G on V, we automatically get a representation of G on U simply by restricting all the $\pi(g)$'s to U (U should be a Banach space, and therefore we need U to be closed). This is clearly a representation, and we will denote it $\pi|_U$ (although we are restricting the $\pi(g)$'s to U and not $\pi$).

Here is a simple condition with which to check invariance of a given subspace, at least in the case of a unitary representation.

Lemma 1.8. Let $(\pi, H)$ be a unitary representation of G, let $H = U \oplus U^\perp$ be a decomposition of H and denote by $P : H \to U$ the orthogonal projection onto U. If U is $\pi$-invariant then so is $U^\perp$. Furthermore, U is $\pi$-invariant if and only if $P\pi(g) = \pi(g)P$ for all $g \in G$.

Proof. Assume that U is invariant. To show that $U^\perp$ is invariant, let $v \in U^\perp$. We need to show that $\pi(g)v \in U^\perp$, i.e. that $\langle \pi(g)v, u \rangle = 0$ for all $u \in U$. But that's easy, exploiting unitarity of $\pi(g)$:
$\langle \pi(g)v, u \rangle = \langle \pi(g^{-1})(\pi(g)v), \pi(g^{-1})u \rangle = \langle v, \pi(g^{-1})u \rangle,$
which is 0 since $\pi(g^{-1})u \in U$ and $v \in U^\perp$. Thus $U^\perp$ is invariant.

Assume U to be invariant. Then also $U^\perp$ is invariant by the above. We split $x \in H$ into $x = Px + (1-P)x$ and calculate
$P\pi(g)x = P\big(\pi(g)(Px + (1-P)x)\big) = P\pi(g)Px + P\pi(g)(1-P)x.$
The first term is $\pi(g)Px$, since $\pi(g)Px \in U$, and the second term is zero, since $\pi(g)(1-P)x \in U^\perp$. Thus we have the desired formula.
Conversely, assume that $P\pi(g) = \pi(g)P$. Every vector $u \in U$ is of the form $Px$ for some $x \in H$, and thus
$\pi(g)u = \pi(g)(Px) = P(\pi(g)x) \in U,$
so U is an invariant subspace.

For any representation $(\pi, V)$ it is easy to spot two obvious invariant subspaces, namely V itself and $\{0\}$. We shall focus a lot on representations having no invariant subspaces except these two:

Definition 1.9. A representation is called irreducible if it has no closed invariant subspaces except the trivial ones. The set of equivalence classes of finite-dimensional irreducible representations of a group G is denoted $\widehat{G}$.

A representation is called completely reducible if it is equivalent to a direct sum of finite-dimensional irreducible representations.

Any 1-dimensional representation is obviously irreducible, and if the group is abelian the converse is actually true; we prove this in Proposition 1.14.

If $(\pi_1, V_1)$ and $(\pi_2, V_2)$ are irreducible representations, then the direct sum $\pi_1 \oplus \pi_2$ is not irreducible, since $V_1$ is a $\pi_1 \oplus \pi_2$-invariant subspace of $V_1 \oplus V_2$:
$(\pi_1 \oplus \pi_2)(g)(v, 0) = (\pi_1(g)v, 0).$
The question is more subtle when considering tensor products of irreducible representations. Whether or not the tensor product of two irreducible representations is irreducible, and if not, how to write it as a direct sum of irreducible representations, is a branch of representation theory known as Clebsch-Gordan theory.

Lemma 1.10. Let $(\pi_1, V_1)$ and $(\pi_2, V_2)$ be equivalent representations. Then $\pi_1$ is irreducible if and only if $\pi_2$ is irreducible.

Proof. Given the symmetry of the problem, it is sufficient to verify that irreducibility of $\pi_1$ implies irreducibility of $\pi_2$. Let $T : V_1 \to V_2$ denote the intertwiner, which by the Open Mapping Theorem is a linear homeomorphism. Assume that $U \subseteq V_2$ is a closed invariant subspace. Then $T^{-1}U \subseteq V_1$ is closed and $\pi_1$-invariant:
$\pi_1(g)T^{-1}U = T^{-1}\pi_2(g)U \subseteq T^{-1}U.$
But this means that $T^{-1}U$ is either $\{0\}$ or $V_1$, i.e. U is either $\{0\}$ or $V_2$.

Example 1.11. Consider the group $SL(2, \mathbb{C})$ viewed as a real (hence 6-dimensional) Lie group. We consider the following 4 complex representations of the real Lie group $SL(2, \mathbb{C})$ on $\mathbb{C}^2$:
$\rho(A)\psi := A\psi,\quad \bar\rho(A)\psi := \bar{A}\psi,\quad \rho'(A)\psi := (A^T)^{-1}\psi,\quad \bar\rho'(A)\psi := (A^*)^{-1}\psi,$
where $\bar{A}$ simply means complex conjugation of all the entries. All four are clearly irreducible. They are important in physics, where they are called spinorial representations. The physicists have a habit of writing everything in coordinates; thus $\psi$ will usually be written $\psi_\alpha$, where $\alpha = 1, 2$, but the exact notation will vary according to which representation we have imposed on $\mathbb{C}^2$ (i.e. according to how $\psi$ transforms, as the physicists say). In other words, they view $\mathbb{C}^2$ not as a vector space but rather as an $SL(2, \mathbb{C})$-module. The notations are
$\psi_\alpha \in \mathbb{C}^2,\quad \psi_{\dot\alpha} \in \mathbb{C}^2,\quad \psi^\alpha \in \mathbb{C}^2,\quad \psi^{\dot\alpha} \in \mathbb{C}^2.$
The representations are not all mutually inequivalent; actually the map $\varphi : \mathbb{C}^2 \to \mathbb{C}^2$ given by the matrix $\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$ intertwines $\rho$ with $\rho'$ and intertwines $\bar\rho$ with $\bar\rho'$. On the other hand, $\rho$ and $\bar\rho$ are actually inequivalent, as we will see in Section 1.4. These two representations are called the fundamental representations of $SL(2, \mathbb{C})$.

In short, representation theory has two goals: 1) given a group, find all its irreducible representations, and 2) given a representation of this group, split it (if possible) into a direct sum of irreducibles. The rest of this chapter deals with the second problem (at least for compact groups), and in the end we will achieve some powerful results (Schur Orthogonality and the Peter-Weyl Theorem). Chapter 5 revolves around the first problem of finding irreducible representations. But already at this stage we are able to state and prove two quite interesting results.

The first result is known as Schur's Lemma. We prove a slightly more general version than is usually seen, allowing the representations to be infinite-dimensional.

Theorem 1.12 (Schur's Lemma). Let $(\pi_1, H_1)$ and $(\pi_2, H_2)$ be two irreducible unitary representations of a group G, and suppose that $F : H_1 \to H_2$ is an intertwiner. Then either F is an equivalence of representations or F is the zero map. If $(\pi, H)$ is an irreducible unitary representation of G and $F \in B(H)$ is a linear map which commutes with all $\pi(g)$, then $F = \lambda\,\mathrm{id}_H$.

Proof. The proof utilizes a neat result from Gelfand theory: suppose that $\mathcal{A}$ is a commutative unital C*-algebra which is also an integral domain (i.e. $ab = 0$ implies $a = 0$ or $b = 0$); then $\mathcal{A} = \mathbb{C}e$. The proof is rather simple. Gelfand's Theorem states that there exists a compact Hausdorff space X such that $\mathcal{A} \cong C(X)$. To reach a contradiction, assume that X is not a one-point set, and pick two distinct points x and y. Then, since X is a normal topological space, we can find disjoint open neighborhoods U and V around x and y, and the Urysohn Lemma gives us two nonzero continuous functions f and g on X, the first one supported in U and the second in V, their product thus being zero. This contradicts the assumption that $\mathcal{A} \cong C(X)$ is an integral domain. Therefore X can contain only one point, and thus $C(X) = \mathbb{C}$.

With this result in mind we return to Schur's Lemma. F being an intertwiner means that $F\pi_1(g) = \pi_2(g)F$, and using unitarity of $\pi_1(g)$ and $\pi_2(g)$ we get that $F^*\pi_2(g) = \pi_1(g)F^*$, where $F^*$ is the hermitian adjoint of F. This yields
$(FF^*)\pi_2(g) = F\pi_1(g)F^* = \pi_2(g)(FF^*).$
In the last equality we also used that F intertwines the two representations. Consider the C*-algebra $\mathcal{A} = C^*(\mathrm{id}_{H_2}, FF^*)$ generated by $\mathrm{id}_{H_2}$ and $FF^*$. It's a commutative unital C*-algebra, and all its elements are limits of elements of the form $\sum_{n=0}^N a_n(FF^*)^n$. These commute with $\pi_2(g)$:
$\Big(\sum_{n=0}^N a_n(FF^*)^n\Big)\pi_2(g) = \sum_{n=0}^N a_n\big((FF^*)^n\pi_2(g)\big) = \sum_{n=0}^N a_n\big(\pi_2(g)(FF^*)^n\big) = \pi_2(g)\sum_{n=0}^N a_n(FF^*)^n.$
We only need to show that $\mathcal{A}$ is an integral domain. Assume $ST = 0$. Since $\pi_2(g)S = S\pi_2(g)$, it's easy to see that $\ker S$ is $\pi_2$-invariant. $\pi_2$ is irreducible,
so $\ker S$ is either $H_2$ or $\{0\}$. In the first case $S = 0$ and we are done; in the second case S is injective, and so T must be the zero map. This means that $\mathcal{A} = \mathbb{C}\,\mathrm{id}_{H_2}$; in particular, there exists a $\lambda \in \mathbb{C}$ so that $FF^* = \lambda\,\mathrm{id}_{H_2}$. Likewise, one shows that $F^*F = \lambda'\,\mathrm{id}_{H_1}$. Thus, we see
$\lambda' F = F(F^*F) = (FF^*)F = \lambda F,$
which implies $F = 0$ or $\lambda = \lambda'$. In the second case, if $\lambda = \lambda' = 0$ then $F^*Fv = 0$ for all $v \in H_1$, and hence $0 = \langle v, F^*Fv \rangle = \langle Fv, Fv \rangle$, i.e. $F = 0$. If $\lambda = \lambda'$ and $\lambda \neq 0$, then it is not hard to see that $\lambda^{-1/2}F$ is unitary, and that F therefore is an isomorphism.

The second claim is an immediate consequence of the proof of the first.

The content of this can be summed up as follows: if $\pi_1$ and $\pi_2$ are irreducible unitary representations of G on $H_1$ and $H_2$, then $\mathrm{Hom}_G(H_1, H_2) \cong \mathbb{C}$ if $\pi_1$ and $\pi_2$ are equivalent, and $\mathrm{Hom}_G(H_1, H_2) = \{0\}$ if $\pi_1$ and $\pi_2$ are inequivalent.

Corollary 1.13. Let $(\pi, H_1)$ and $(\rho, H_2)$ be finite-dimensional unitary representations which decompose into irreducibles as $\pi = \bigoplus_{i \in I} m_i\delta_i$ and $\rho = \bigoplus_{i \in I} n_i\delta_i$. Then $\dim \mathrm{Hom}_G(H_1, H_2) = \sum_{i \in I} n_im_i$.

Proof. Denoting the representation spaces of the irreducible representations by $V_i$, we get from (1.1) and (1.2) that
$\mathrm{Hom}_G(H_1, H_2) \cong \bigoplus_{i,j \in I} m_in_j\,\mathrm{Hom}_G(V_i, V_j),$
and by Schur's Lemma the dimension formula now follows.

Now for the promised result on abelian groups.

Proposition 1.14. Let G be an abelian group and $(\pi, H)$ a unitary representation of G. If $\pi$ is irreducible then $\pi$ is 1-dimensional.

Proof. Since G is abelian we have $\pi(g)\pi(h) = \pi(h)\pi(g)$, i.e. each $\pi(h)$ is an intertwiner of $\pi$ with itself. Since $\pi$ is irreducible, Schur's Lemma says that $\pi(h) = \lambda(h)\,\mathrm{id}_H$. Thus each 1-dimensional subspace of H is invariant, and by irreducibility H is 1-dimensional.

Example 1.15. With the previous lemma we are in a position to determine the set of irreducible complex representations of the circle group $\mathbb{T} = \mathbb{R}/\mathbb{Z}$. Since this is an abelian group, we have found all the irreducible representations when we know all the 1-dimensional representations. A 1-dimensional representation is just a continuous homomorphism $\mathbb{R}/\mathbb{Z} \to \mathbb{C}^\times$, so let's find them: it is well-known that the only continuous homomorphisms $\mathbb{R} \to \mathbb{C}^\times$ are those of the form $x \mapsto e^{zx}$ for some $z \in \mathbb{C}$, and periodicity with period 1 forces $e^z = 1$, i.e. $z = 2\pi in$ for an integer n. Thus $\widehat{\mathbb{T}}$ consists of the homomorphisms $\rho_n(x) = e^{2\pi inx}$ for $n \in \mathbb{Z}$.

Proposition 1.16. Every finite-dimensional unitary representation is completely reducible.
Proof. If the representation is irreducible then we are done, so assume we have a unitary representation $\pi : G \to \mathrm{Aut}(H)$ and let $\{0\} \neq U \subsetneq H$ be an invariant subspace. The point is that $U^\perp$ is invariant as well, cf. Lemma 1.8. If both $\pi|_U$ and $\pi|_{U^\perp}$ are irreducible we are done. If one of them is not, we find an invariant subspace and perform the above argument once again. Since the representation is finite-dimensional and since 1-dimensional representations are irreducible, the argument must stop at some point.

1.2 The Haar Integral

In the representation theory of locally compact groups (also known as harmonic analysis) the notions of Haar integral and Haar measure play a key role. Some preliminary definitions: let X be a locally compact Hausdorff space and $C_c(X)$ the space of complex-valued functions on X with compact support. By a positive integral on X is understood a linear functional $I : C_c(X) \to \mathbb{C}$ such that $I(f) \geq 0$ if $f \geq 0$. The Riesz Representation Theorem tells us that to each such positive integral there exists a unique Radon measure $\mu$ on the Borel algebra $\mathcal{B}(X)$ such that
$I(f) = \int_X f\,d\mu.$
We say that this measure $\mu$ is associated with the positive integral.

Now, let G be a group. For each $g_0 \in G$ we have two maps $L_{g_0}$ and $R_{g_0}$, left and right translation, on the set of complex-valued functions on G, given by
$(L_{g_0}f)(g) = f(g_0^{-1}g), \qquad (R_{g_0}f)(g) = f(gg_0).$
These obviously satisfy $L_{g_1g_2} = L_{g_1}L_{g_2}$ and $R_{g_1g_2} = R_{g_1}R_{g_2}$.

Definition 1.17 (Haar Measure). Let G be a locally compact group. A nonzero positive integral I on G is called a left Haar integral if $I(L_gf) = I(f)$ for all $g \in G$ and $f \in C_c(G)$. Similarly, a nonzero positive integral is called a right Haar integral if $I(R_gf) = I(f)$ for all $g \in G$ and $f \in C_c(G)$. An integral which is both a left and a right Haar integral is called a Haar integral.

The measures associated with left and right Haar integrals are called left and right Haar measures, and the measure associated with a Haar integral is called a Haar measure.

Example 1.18. On $(\mathbb{R}^n, +)$ the Lebesgue integral is a Haar integral: it is obviously positive, and it is well-known that the Lebesgue integral is translation invariant:
$\int_{\mathbb{R}^n} f(x + a)\,dx = \int_{\mathbb{R}^n} f(a + x)\,dx = \int_{\mathbb{R}^n} f(x)\,dx.$
The associated Haar measure is of course the Lebesgue measure $m_n$.

On the circle group $(\mathbb{T}, \cdot)$ we define an integral I by
$C(\mathbb{T}) \ni f \mapsto \frac{1}{2\pi}\int_0^{2\pi} f(e^{it})\,dt.$
As before this is obviously a positive integral, and since
$I(L_{e^{ia}}f) = \frac{1}{2\pi}\int_0^{2\pi} f(e^{-ia}e^{it})\,dt = \frac{1}{2\pi}\int_0^{2\pi} f(e^{i(t-a)})\,dt = \frac{1}{2\pi}\int_0^{2\pi} f(e^{it})\,dt,$
again by exploiting translation invariance of the Lebesgue measure, I is a left Haar integral on $\mathbb{T}$. Likewise one can show that it is a right Haar integral as well, and hence a Haar integral. The associated Haar measure on $\mathbb{T}$ is also called the arc measure.

In both cases the groups were abelian, and in both cases the left Haar integrals were also right Haar integrals. This is no mere coincidence, for if G is an abelian group we have $L_{g_0} = R_{g_0^{-1}}$, and thus a positive integral is a left Haar integral if and only if it is a right Haar integral.

The following central theorem, attributed to Alfred Haar and acclaimed as one of the most important mathematical discoveries of the 20th century, states existence and uniqueness of left and right Haar integrals on locally compact groups.

Theorem 1.19. Every locally compact group G possesses a left Haar integral and a right Haar integral, and these are unique up to multiplication by a positive constant. If G is compact then the two integrals coincide, and the corresponding Haar measure is finite.

It would be far beyond the scope of this thesis to delve into the proof of this. The existence part of the proof is a hard job, so we just send some acknowledging thoughts to Alfred Haar and accept it as a fact of life.

Now we restrict focus to compact groups on which, as we have just seen, we have a finite Haar measure. The importance of this finiteness is manifested in the following result:

Theorem 1.20 (Unitarization). Let G be a compact group and $(\pi, H)$ a representation on a Hilbert space $(H, \langle\cdot,\cdot\rangle)$. Then there exists an inner product $\langle\cdot,\cdot\rangle_G$ on H, equivalent to $\langle\cdot,\cdot\rangle$, which makes $\pi$ a unitary representation.

Proof. Since the measure is finite, we can integrate all bounded measurable functions over G. Let us assume the measure to be normalized, i.e. that $\mu(G) = 1$. For $x_1, x_2 \in H$ the map $g \mapsto \langle \pi(g)x_1, \pi(g)x_2 \rangle$ is continuous (by Proposition 1.2), hence bounded and measurable, i.e. integrable. Now define a new inner product by
$\langle x_1, x_2 \rangle_G := \int_G \langle \pi(g)x_1, \pi(g)x_2 \rangle\,dg.$ (1.3)
That this is a genuine inner product is not hard to see: it is obviously sesquilinear by the properties of the integral, and it is conjugate-symmetric, as the original inner product is conjugate-symmetric. Finally, if $x \neq 0$ then $\pi(g)x \neq 0$ ($\pi(g)$ is invertible) and thus $\|\pi(g)x\| > 0$ for all $g \in G$. Since the map $g \mapsto \|\pi(g)x\|^2$ is continuous, we have
$\langle x, x \rangle_G = \int_G \|\pi(g)x\|^2\,dg > 0.$
By translation invariance of the Haar measure we get
$\langle \pi(h)x_1, \pi(h)x_2 \rangle_G = \int_G \langle \pi(gh)x_1, \pi(gh)x_2 \rangle\,dg = \int_G \langle \pi(g)x_1, \pi(g)x_2 \rangle\,dg = \langle x_1, x_2 \rangle_G.$
Thus $\pi$ is unitary w.r.t. this new inner product.

We just need to show that the two norms $\|\cdot\|$ and $\|\cdot\|_G$ corresponding to the two inner products are equivalent, i.e. that there exists a constant C so that $\|\cdot\| \leq C\|\cdot\|_G$ and $\|\cdot\|_G \leq C\|\cdot\|$. To this end, consider the map $g \mapsto \|\pi(g)x\|^2$ for some $x \in H$. It's a continuous map, hence $\sup_{g \in G}\|\pi(g)x\|^2 < \infty$ for all x, and
the Uniform Boundedness Principle now says that $C := \sup_{g \in G}\|\pi(g)\| < \infty$. Therefore
$\|x\|^2 = \int_G \|x\|^2\,dg = \int_G \|\pi(g^{-1})\pi(g)x\|^2\,dg \leq C^2\int_G \|\pi(g)x\|^2\,dg = C^2\|x\|_G^2.$
Conversely, we see
$\|x\|_G^2 = \int_G \|\pi(g)x\|^2\,dg \leq \int_G \|\pi(g)\|^2\|x\|^2\,dg \leq C^2\int_G \|x\|^2\,dg = C^2\|x\|^2.$
This proves the claim.

If we combine this result with Proposition 1.16 we get:

Corollary 1.21. Every finite-dimensional representation of a compact group is completely reducible.

The Peter-Weyl Theorem, which we prove later in this chapter, provides a strong generalization of this result in that it states that every Hilbert space representation of a compact group is completely reducible.

We end this section by introducing the so-called modular function, which provides a link between left and right Haar integrals. Let G be a topological group and $I : f \mapsto \int_G f(g)\,dg$ a left Haar integral. Let $h \in G$ and consider the integral $\tilde{I}_h : f \mapsto \int_G f(gh^{-1})\,dg$. This is positive and satisfies
$\tilde{I}_h(L_{g_0}f) = \int_G f(g_0^{-1}gh^{-1})\,dg = \int_G f(gh^{-1})\,dg = \tilde{I}_h(f),$
i.e. it is a left Haar integral. By the uniqueness part of Haar's Theorem there exists a positive constant c such that $\tilde{I}_h(f) = cI(f)$. We define the modular function $\Delta : G \to \mathbb{R}_+$ by assigning this constant to the group element h, i.e.
$\int_G f(gh^{-1})\,dg = \Delta(h)\int_G f(g)\,dg.$
It is not hard to see that $\Delta$ is indeed a homomorphism: on one hand we have
$\int_G f(g(hk)^{-1})\,dg = \Delta(hk)\int_G f(g)\,dg,$
and on the other hand we have that this equals
$\int_G f(gk^{-1}h^{-1})\,dg = \Delta(h)\int_G f(gk^{-1})\,dg = \Delta(h)\Delta(k)\int_G f(g)\,dg.$
Since this holds for all integrable functions f, we must have $\Delta(hk) = \Delta(h)\Delta(k)$. One can show that $\Delta$ is in fact a continuous group homomorphism, and thus, in the case of G being a Lie group, a Lie group homomorphism.

If $\Delta$ is identically 1, that is, if every right Haar integral satisfies
$\int_G f(hg)\,dg = \int_G f(g)\,dg$ (1.4)
for all h, then the group G is called unimodular. Eq. (1.4) says that an equivalent condition for a group to be unimodular is that all right Haar integrals are also left Haar integrals. As we have seen previously in this section, abelian groups and compact groups are unimodular.
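For a finite group, where the Haar integral is just the average over the group, the unitarization trick (1.3) can be watched in action numerically. The following sketch (Python with NumPy; the two-element example is my own choice, purely illustrative) averages the Gram matrices $\pi(g)^*\pi(g)$; the resulting matrix P represents the invariant inner product of (1.3):

    import numpy as np

    # A representation of Z/2 = {e, s} on C^2 which is not unitary:
    S = np.array([[1.0, 1.0],
                  [0.0, -1.0]])                # S @ S = id, but S is not orthogonal
    group = [np.eye(2), S]

    # <x, y>_G = y^H P x with P the averaged Gram matrix, cf. (1.3):
    P = sum(g.conj().T @ g for g in group) / len(group)

    # pi(h)^* P pi(h) = P for all h, i.e. pi is unitary for the new inner product:
    print(all(np.allclose(g.conj().T @ P @ g, P) for g in group))   # True

Writing $P = Q^*Q$ (e.g. via a Cholesky factorization), the conjugated representation $Q\pi(g)Q^{-1}$ is unitary in the ordinary sense; this is the matrix counterpart of the equivalence of the norms $\|\cdot\|$ and $\|\cdot\|_G$ established in the proof of Theorem 1.20.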
1.3 Matrix Coefficients

Definition 1.22 (Matrix Coefficient). Let $(\pi, V)$ be a finite-dimensional representation of a compact group G. By a matrix coefficient for the representation $\pi$ we understand a map $G \to \mathbb{C}$ of the form
$m_{v,\varphi}(g) = \varphi(\pi(g)v)$
for fixed $v \in V$ and $\varphi \in V^*$.

If we pick a basis $\{e_1, \dots, e_n\}$ for V and let $\{\varepsilon_1, \dots, \varepsilon_n\}$ denote the corresponding dual basis, then we see that the $m_{e_i,\varepsilon_j}(g) = \varepsilon_j(\pi(g)e_i)$ are precisely the entries of the matrix representation of $\pi(g)$, whence the name matrix coefficient. If V comes with an inner product $\langle\cdot,\cdot\rangle$, then by the Riesz Theorem all matrix coefficients are of the form $m_{v,w}(g) = \langle \pi(g)v, w \rangle$ for fixed $v, w \in V$. By Theorem 1.20 we can always assume that this is the case.

Denote by $C(G)_\pi$ the space of linear combinations of matrix coefficients of $\pi$. Since a matrix coefficient is obviously a continuous map, $C(G)_\pi \subseteq C(G) \subseteq L^2(G)$. Thus we can take the inner product of two functions in $C(G)_\pi$. Note, however, that the elements of $C(G)_\pi$ need not all be matrix coefficients for $\pi$.

The following technical lemma is an important ingredient in the proof of the Schur Orthogonality Relations, which are the main result of this section.

Lemma 1.23. Let $(\pi, H)$ be a finite-dimensional unitary representation of a compact group G. Define the map $T_\pi : \mathrm{End}(H) \to C(G)$ by
$T_\pi(A)(g) = \operatorname{Tr}(\pi(g) \circ A).$ (1.5)
Then $C(G)_\pi = \operatorname{im} T_\pi$.

Proof. Given a matrix coefficient $m_{v,w}$ we should produce a linear map $A : H \to H$ such that $m_{v,w} = T_\pi(A)$. Consider the map $L_{v,w} : H \to H$ defined by $L_{v,w}(u) = \langle u, w \rangle v$; the claim is that this is the desired map A. To see this we need to calculate $\operatorname{Tr} L_{v,w}$, and we claim that the result is $\langle v, w \rangle$. Since $L_{v,w}$ is sesquilinear in its indices ($L_{av+bv',w} = aL_{v,w} + bL_{v',w}$ etc.), it's enough to check it on elements of an orthonormal basis $\{e_1, \dots, e_n\}$ for H:
$\operatorname{Tr} L_{e_i,e_i} = \sum_{k=1}^n \langle L_{e_i,e_i}e_k, e_k \rangle = \sum_{k=1}^n \langle e_k, e_i \rangle \langle e_i, e_k \rangle = 1,$
while for $i \neq j$
$\operatorname{Tr} L_{e_i,e_j} = \sum_{k=1}^n \langle L_{e_i,e_j}e_k, e_k \rangle = \sum_{k=1}^n \langle e_k, e_j \rangle \langle e_i, e_k \rangle = 0.$
Thus $\operatorname{Tr} L_{v,w} = \langle v, w \rangle$. Finally, since
$L_{v,w} \circ \pi(g)u = \langle \pi(g)u, w \rangle v = \langle u, \pi(g^{-1})w \rangle v = L_{v,\pi(g^{-1})w}u,$
we see that
$T_\pi(L_{v,w})(g) = \operatorname{Tr}(\pi(g) \circ L_{v,w}) = \operatorname{Tr}(L_{v,w} \circ \pi(g)) = \langle v, \pi(g^{-1})w \rangle = \langle \pi(g)v, w \rangle = m_{v,w}(g).$
Conversely, we should show that any map $T_\pi(A)$ is a linear combination of matrix coefficients. Some linear algebraic manipulations should be enough to
convince the reader that for any $A \in \mathrm{End}(H)$ we have $A = \sum_{i,j=1}^n \langle Ae_j, e_i \rangle L_{e_i,e_j}$ w.r.t. some orthonormal basis $\{e_1, \dots, e_n\}$. But then we readily see
$T_\pi(A)(g) = T_\pi\Big(\sum_{i,j=1}^n \langle Ae_j, e_i \rangle L_{e_i,e_j}\Big)(g) = \sum_{i,j=1}^n \langle Ae_j, e_i \rangle T_\pi(L_{e_i,e_j})(g) = \sum_{i,j=1}^n \langle Ae_j, e_i \rangle m_{e_i,e_j}(g).$

Theorem 1.24 (Schur Orthogonality I). Let $(\pi_1, H_1)$ and $(\pi_2, H_2)$ be two unitary, irreducible finite-dimensional representations of a compact group G. If $\pi_1$ and $\pi_2$ are equivalent, then we have $C(G)_{\pi_1} = C(G)_{\pi_2}$. If they are not, then $C(G)_{\pi_1} \perp C(G)_{\pi_2}$ inside $L^2(G)$.

Before the proof, a few remarks on the integral of a vector-valued function are in order. Suppose that $f : G \to H$ is a continuous function into a finite-dimensional Hilbert space. Choosing a basis $\{e_1, \dots, e_n\}$ for H we can write f in its components, $f = \sum_{i=1}^n f_ie_i$, which are also continuous, and define
$\int_G f(g)\,dg := \sum_{i=1}^n \Big(\int_G f_i(g)\,dg\Big)e_i.$
It's a simple change-of-basis calculation to verify that this is independent of the basis in question. Furthermore, one readily verifies that it is left-invariant and satisfies
$\Big\langle \int_G f(g)\,dg, v \Big\rangle = \int_G \langle f(g), v \rangle\,dg \quad\text{and}\quad A\int_G f(g)\,dg = \int_G Af(g)\,dg$
when $A \in \mathrm{End}(H)$.

Proof of Theorem 1.24. If $\pi_1$ and $\pi_2$ are equivalent, there exists an isomorphism $T : H_1 \to H_2$ such that $T\pi_1(g) = \pi_2(g)T$. For $A \in \mathrm{End}(H_1)$ we see that
$T_{\pi_2}(TAT^{-1})(g) = \operatorname{Tr}(\pi_2(g)TAT^{-1}) = \operatorname{Tr}(T^{-1}\pi_2(g)TA) = \operatorname{Tr}(\pi_1(g)A) = T_{\pi_1}(A)(g).$
Hence the map sending $T_{\pi_1}(A)$ to $T_{\pi_2}(TAT^{-1})$ is the identity $\mathrm{id} : C(G)_{\pi_1} \to C(G)_{\pi_2}$, proving that the two spaces are equal.

Now we show the second claim. Define for fixed $w_1 \in H_1$ and $w_2 \in H_2$ the map $S_{w_1,w_2} : H_1 \to H_2$ by
$S_{w_1,w_2}(v) = \int_G \langle \pi_1(g)v, w_1 \rangle\,\pi_2(g^{-1})w_2\,dg.$
$S_{w_1,w_2}$ is in $\mathrm{Hom}_G(H_1, H_2)$ since by left-invariance
$S_{w_1,w_2}\pi_1(h)(v) = \int_G \langle \pi_1(gh)v, w_1 \rangle\,\pi_2(g^{-1})w_2\,dg = \int_G \langle \pi_1(g)v, w_1 \rangle\,\pi_2(hg^{-1})w_2\,dg = \pi_2(h)\int_G \langle \pi_1(g)v, w_1 \rangle\,\pi_2(g^{-1})w_2\,dg = \pi_2(h)S_{w_1,w_2}(v).$
Assume that we can find two matrix coefficients $m_{v_1,w_1}$ and $m_{v_2,w_2}$ for $\pi_1$ and $\pi_2$ that are not orthogonal, i.e. assume that
$0 \neq \int_G m_{v_1,w_1}(g)\overline{m_{v_2,w_2}(g)}\,dg = \int_G \langle \pi_1(g)v_1, w_1 \rangle\,\overline{\langle \pi_2(g)v_2, w_2 \rangle}\,dg = \int_G \langle \pi_1(g)v_1, w_1 \rangle\,\langle \pi_2(g^{-1})w_2, v_2 \rangle\,dg.$
From this we read off $\langle S_{w_1,w_2}v_1, v_2 \rangle \neq 0$, so that $S_{w_1,w_2} \neq 0$. Since it's an intertwiner, Schur's Lemma tells us that $S_{w_1,w_2}$ is an isomorphism. By contraposition, the second claim is proved.

In the case of two matrix coefficients for the same representation, we have the following result.

Theorem 1.25 (Schur Orthogonality II). Let $(\pi, H)$ be a unitary, finite-dimensional irreducible representation of a compact group G. For two matrix coefficients $m_{v_1,w_1}$ and $m_{v_2,w_2}$ we have
$\langle m_{v_1,w_1}, m_{v_2,w_2} \rangle = \frac{1}{\dim H}\langle v_1, v_2 \rangle \langle w_2, w_1 \rangle.$ (1.6)

Proof. As in the proof of Theorem 1.24, define $S_{w_1,w_2} : H \to H$ by
$S_{w_1,w_2}(v) = \int_G \langle \pi(g)v, w_1 \rangle\,\pi(g^{-1})w_2\,dg = \int_G \pi(g^{-1})L_{w_2,w_1}\pi(g)v\,dg.$
We see that
$\langle m_{v_1,w_1}, m_{v_2,w_2} \rangle = \int_G \langle \pi(g)v_1, w_1 \rangle\,\overline{\langle \pi(g)v_2, w_2 \rangle}\,dg = \int_G \langle \pi(g)v_1, w_1 \rangle\,\langle \pi(g^{-1})w_2, v_2 \rangle\,dg = \langle S_{w_1,w_2}v_1, v_2 \rangle.$
Furthermore, since $S_{w_1,w_2}$ commutes with all $\pi(g)$, Schur's Lemma yields a complex number $\lambda(w_1, w_2)$ such that $S_{w_1,w_2} = \lambda(w_1, w_2)\,\mathrm{id}_H$. The operator $S_{w_1,w_2}$ is linear in $w_2$ and anti-linear in $w_1$, hence $\lambda(w_1, w_2)$ is a sesquilinear form on H. We now take the trace on both sides of the equation $S_{w_1,w_2} = \lambda(w_1, w_2)\,\mathrm{id}_H$: the right hand side is easy, it's just $\lambda(w_1, w_2)\dim H$. For the left hand side we calculate
$\operatorname{Tr} S_{w_1,w_2} = \int_G \operatorname{Tr}\big(\pi(g^{-1})L_{w_2,w_1}\pi(g)\big)\,dg = \int_G \operatorname{Tr} L_{w_2,w_1}\,dg = \langle w_2, w_1 \rangle.$
That is, we get $\lambda(w_1, w_2) = (\dim H)^{-1}\langle w_2, w_1 \rangle$, and hence $S_{w_1,w_2} = (\dim H)^{-1}\langle w_2, w_1 \rangle\,\mathrm{id}_H$. By substituting this into the equation $\langle m_{v_1,w_1}, m_{v_2,w_2} \rangle = \langle S_{w_1,w_2}v_1, v_2 \rangle$, the desired result follows.

1.4 Characters

Definition 1.26 (Class Function). For a group G, a class function is a function on G which is constant on conjugacy classes. The sets of square-integrable resp. continuous class functions on G are denoted $L^2(G, \mathrm{class})$ and $C(G, \mathrm{class})$.

It is not hard to see that the closure of $C(G, \mathrm{class})$ inside $L^2(G)$ is $L^2(G, \mathrm{class})$. Thus $L^2(G, \mathrm{class})$ is a Hilbert space.

Given an irreducible finite-dimensional representation, the set of continuous class functions inside $C(G)_\pi$ is very small:

Lemma 1.27. Let $(\pi, H)$ be a finite-dimensional irreducible unitary representation of a compact group G. Then the only class functions inside $C(G)_\pi$ are the complex scalar multiples of $T_\pi(\mathrm{id}_H)$.
Proof. To formulate the requirement on a class function, consider the representation $\rho$ of G on C(G) given by $(\rho(g)f)(x) = f(g^{-1}xg)$; in terms of this, a function f is a class function if and only if $\rho(g)f = f$ for all g. For reasons which will become clear shortly, we introduce another representation $\Pi$ of G, on $\mathrm{End}(H)$, by $\Pi(g)A = \pi(g)A\pi(g^{-1})$. Equipping $\mathrm{End}(H)$ with the inner product $\langle A, B \rangle := \operatorname{Tr}(B^*A)$, it is easy to see that $\Pi$ becomes unitary.

The linear map $T_\pi : \mathrm{End}(H) \to C(G)_\pi$ which we introduced in Lemma 1.23 is an intertwiner of the representations $\Pi$ and $\rho$:
$T_\pi(\Pi(g)A)(x) = \operatorname{Tr}\big(\pi(x)\pi(g)A\pi(g^{-1})\big) = \operatorname{Tr}\big(\pi(g^{-1}xg)A\big) = T_\pi(A)(g^{-1}xg) = \big(\rho(g)T_\pi(A)\big)(x).$
$T_\pi$ was surjective by Lemma 1.23. To show injectivity we define $\tilde{T}_\pi := \sqrt{\dim H}\,T_\pi$ and show that this is unitary. Since the linear maps $L_{v,w}$ span $\mathrm{End}(H)$, it is enough to show unitarity on these. But first we need some facts concerning $L_{v,w}$:
$\langle L_{v,w}x, y \rangle = \langle \langle x, w \rangle v, y \rangle = \langle x, w \rangle \langle v, y \rangle = \langle x, \langle y, v \rangle w \rangle = \langle x, L_{w,v}y \rangle,$
showing that $L_{v,w}^* = L_{w,v}$. Furthermore,
$L_{w,v}L_{v,w}x = L_{w,v}(\langle x, w \rangle v) = \langle x, w \rangle \langle v, v \rangle w = \langle v, v \rangle L_{w,w}x.$
With the inner product on $\mathrm{End}(H)$ these results now yield
$\langle L_{v,w}, L_{v,w} \rangle = \operatorname{Tr}(L_{v,w}^*L_{v,w}) = \operatorname{Tr}(\langle v, v \rangle L_{w,w}) = \langle v, v \rangle \langle w, w \rangle.$
Since $T_\pi(L_{v,w})(x) = m_{v,w}(x)$, Schur Orthogonality II gives
$\langle \tilde{T}_\pi(L_{v,w}), \tilde{T}_\pi(L_{v,w}) \rangle = \dim H\,\langle m_{v,w}, m_{v,w} \rangle = \langle v, v \rangle \langle w, w \rangle = \langle L_{v,w}, L_{v,w} \rangle.$
Thus $\tilde{T}_\pi$ is unitary, and in particular injective.

Now we come to the actual proof: let $\varphi \in C(G)_\pi$ be a class function. $T_\pi$ is bijective, so there is a unique $A \in \mathrm{End}(H)$ for which $\varphi = T_\pi(A)$. That $T_\pi$ intertwines $\Pi$ and $\rho$ leads to
$T_\pi(A)(x) = \varphi(x) = (\rho(g)\varphi)(x) = \rho(g)T_\pi(A)(x) = T_\pi(\Pi(g)A)(x),$
and by injectivity of $T_\pi$ we get that $\pi(g)A\pi(g^{-1}) = A$, i.e. A intertwines $\pi$ with itself. But $\pi$ was irreducible, which by Schur's Lemma implies $A = \lambda\,\mathrm{id}_H$, and hence $\varphi = \lambda T_\pi(\mathrm{id}_H)$.

In particular there exists a unique class function $\varphi_0 \in C(G)_\pi$ which is positive at e and has $L^2$-norm 1: writing $\varphi_0 = T_\pi(\lambda\,\mathrm{id}_H)$, unitarity of $\tilde{T}_\pi$ gives
$\|\varphi_0\|_2^2 = (\dim H)^{-1}\operatorname{Tr}\big((\lambda\,\mathrm{id}_H)^*(\lambda\,\mathrm{id}_H)\big) = |\lambda|^2,$
while $\varphi_0(e) = \lambda\dim H$; so if $\varphi_0$ is to have norm 1 and be positive at e, then $\lambda$ is forced to be 1, so that $\varphi_0$ is given by $\varphi_0(g) = \operatorname{Tr}\pi(g)$. This is a function of particular interest:
Definition 1.28 (Character). Let $(\pi, V)$ be a finite-dimensional representation of a group G. By the character of $\pi$ we mean the function $\chi_\pi : G \to \mathbb{C}$ given by $\chi_\pi(g) = \operatorname{Tr}\pi(g)$. If $\chi$ is the character of an irreducible representation, $\chi$ is called an irreducible character.

The character is a class function, so in the case of two representations $\pi_1$ and $\pi_2$ being equivalent via an intertwiner T, i.e. $\pi_2(g) = T\pi_1(g)T^{-1}$, we have $\chi_{\pi_1} = \chi_{\pi_2}$. Thus equivalent representations have the same character. Actually, the converse is also true; we show that at the end of the section.

Suppose that G is a topological group and that H is a Hilbert space with orthonormal basis $\{e_1, \dots, e_n\}$. Then we can calculate the trace as
$\operatorname{Tr}\pi(g) = \sum_{i=1}^n \langle \pi(g)e_i, e_i \rangle,$
which shows that $\chi_\pi \in C(G)_\pi$. In due course we will prove some powerful orthogonality relations for irreducible characters. But first we will see that the character behaves nicely with respect to direct sums and tensor products of representations.

Proposition 1.29. Let $(\pi_1, V_1)$ and $(\pi_2, V_2)$ be two finite-dimensional representations of the group G. The characters of $\pi_1 \oplus \pi_2$ and $\pi_1 \otimes \pi_2$ are then given by
$\chi_{\pi_1 \oplus \pi_2}(g) = \chi_{\pi_1}(g) + \chi_{\pi_2}(g) \quad\text{and}\quad \chi_{\pi_1 \otimes \pi_2}(g) = \chi_{\pi_1}(g)\chi_{\pi_2}(g).$ (1.7)

Proof. Equip $V_1$ and $V_2$ with inner products and pick orthonormal bases $(e_i)$ and $(f_j)$ for $V_1$ and $V_2$ respectively. Then the vectors $(e_i, 0)$, $(0, f_j)$ form an orthonormal basis for $V_1 \oplus V_2$ w.r.t. the inner product
$\langle (v_1, v_2), (w_1, w_2) \rangle := \langle v_1, w_1 \rangle + \langle v_2, w_2 \rangle.$
Thus we see
$\chi_{\pi_1 \oplus \pi_2}(g) = \operatorname{Tr}(\pi_1 \oplus \pi_2)(g) = \sum_{i=1}^m \big\langle (\pi_1 \oplus \pi_2)(g)(e_i, 0), (e_i, 0) \big\rangle + \sum_{j=1}^n \big\langle (\pi_1 \oplus \pi_2)(g)(0, f_j), (0, f_j) \big\rangle = \sum_{i=1}^m \langle \pi_1(g)e_i, e_i \rangle + \sum_{j=1}^n \langle \pi_2(g)f_j, f_j \rangle = \chi_{\pi_1}(g) + \chi_{\pi_2}(g).$
Likewise, the vectors $e_i \otimes f_j$ constitute an orthonormal basis for $V_1 \otimes V_2$ w.r.t. the inner product
$\langle v_1 \otimes v_2, w_1 \otimes w_2 \rangle := \langle v_1, w_1 \rangle \langle v_2, w_2 \rangle,$
and hence
$\chi_{\pi_1 \otimes \pi_2}(g) = \operatorname{Tr}(\pi_1 \otimes \pi_2)(g) = \sum_{i,j=1}^{m,n} \big\langle (\pi_1 \otimes \pi_2)(g)(e_i \otimes f_j), e_i \otimes f_j \big\rangle = \sum_{i=1}^m \sum_{j=1}^n \langle \pi_1(g)e_i, e_i \rangle \langle \pi_2(g)f_j, f_j \rangle = \chi_{\pi_1}(g)\chi_{\pi_2}(g).$
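Before turning to the orthogonality relations for characters, here is a small numerical illustration of the formulas established so far, using the 2-dimensional irreducible representation of the symmetric group $S_3$ (a finite, hence compact, group whose Haar integral is the average over its six elements). The sketch is Python with NumPy and purely illustrative; the representation is real, so the complex conjugations in the $L^2$ inner products are invisible. The last line anticipates the orthonormality of irreducible characters proved below.

    import numpy as np

    # S3 realized as the dihedral symmetry group of a triangle:
    c, s = np.cos(2*np.pi/3), np.sin(2*np.pi/3)
    r = np.array([[c, -s], [s, c]])              # rotation by 2*pi/3
    f = np.array([[1.0, 0.0], [0.0, -1.0]])      # a reflection
    G = [np.eye(2), r, r @ r, f, r @ f, r @ r @ f]

    def m(v, w):                                  # values of m_{v,w} on G
        return np.array([(g @ v) @ w for g in G])

    rng = np.random.default_rng(1)
    v1, w1, v2, w2 = rng.standard_normal((4, 2))
    lhs = np.mean(m(v1, w1) * m(v2, w2))          # <m_{v1,w1}, m_{v2,w2}>
    rhs = (v1 @ v2) * (w2 @ w1) / 2               # (1/dim H) <v1,v2><w2,w1>
    print(np.isclose(lhs, rhs))                   # True: this is (1.6)

    chi = np.array([np.trace(g) for g in G])      # the character of pi
    print(np.isclose(np.mean(chi * chi), 1.0))    # <chi, chi> = 1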
The following lemma, stating the promised orthogonality relations of characters, shows that the irreducible characters form an orthonormal set in C(G). The Schur Orthogonality Relations are important ingredients in the proof; thus henceforth we need the groups to be compact.

Lemma 1.30. Let $(\pi_1, V_1)$ and $(\pi_2, V_2)$ be two finite-dimensional irreducible representations of a compact group G. Then the following hold:
1) $\pi_1 \cong \pi_2$ implies $\langle \chi_{\pi_1}, \chi_{\pi_2} \rangle = 1$.
2) $\pi_1 \not\cong \pi_2$ implies $\langle \chi_{\pi_1}, \chi_{\pi_2} \rangle = 0$.

Proof. In the first case, we have a bijective intertwiner $T : V_1 \to V_2$. Choose an inner product on $V_1$ and an orthonormal basis $(e_i)$ for $V_1$. Define an inner product on $V_2$ by declaring T to be unitary. Then $(Te_i)$ is an orthonormal basis for $V_2$. Let $n = \dim V_1 = \dim V_2$. The expressions $\chi_{\pi_1}(g) = \sum_{i=1}^n \langle \pi_1(g)e_i, e_i \rangle$ and $\chi_{\pi_2}(g) = \sum_{j=1}^n \langle \pi_2(g)Te_j, Te_j \rangle$ along with (1.6) yield
$\langle \chi_{\pi_1}, \chi_{\pi_2} \rangle = \sum_{i,j=1}^n \int_G \langle \pi_1(g)e_i, e_i \rangle\,\overline{\langle \pi_2(g)Te_j, Te_j \rangle}\,dg = \sum_{i,j=1}^n \int_G \langle \pi_1(g)e_i, e_i \rangle\,\overline{\langle T\pi_1(g)e_j, Te_j \rangle}\,dg = \sum_{i,j=1}^n \int_G \langle \pi_1(g)e_i, e_i \rangle\,\overline{\langle \pi_1(g)e_j, e_j \rangle}\,dg = \frac{1}{n}\sum_{i,j=1}^n \langle e_i, e_j \rangle \langle e_j, e_i \rangle = \frac{1}{n} \cdot n = 1.$
In the second case, if $\pi_1$ and $\pi_2$ are non-equivalent, then by Theorem 1.24 we have $C(G)_{\pi_1} \perp C(G)_{\pi_2}$. Since $\chi_{\pi_1} \in C(G)_{\pi_1}$ and $\chi_{\pi_2} \in C(G)_{\pi_2}$, the result follows.

This leads to the main result on characters:

Theorem 1.31. Let $\pi$ be a finite-dimensional representation of a compact group G. Then $\pi$ decomposes according to
$\pi = \bigoplus_{\pi_i \in \widehat{G}} \langle \chi_\pi, \chi_{\pi_i} \rangle\,\pi_i.$

Proof. Corollary 1.21 says that $\pi = \bigoplus m_i\pi_i$ where $\pi_i$ is irreducible and $m_i$ is the number of times $\pi_i$ occurs in $\pi$. From Proposition 1.29 it follows that $\chi_\pi = \sum_i m_i\chi_{\pi_i}$, and hence, by orthonormality of the irreducible characters (Lemma 1.30), that $m_i = \langle \chi_\pi, \chi_{\pi_i} \rangle$.

Example 1.32. A very simple example to illustrate this is the following. Consider the 2-dimensional representation $\pi$ of $\mathbb{T}$ given by
$x \mapsto \frac{1}{2}\begin{pmatrix} e^{2\pi inx} + e^{2\pi imx} & e^{2\pi inx} - e^{2\pi imx} \\ e^{2\pi inx} - e^{2\pi imx} & e^{2\pi inx} + e^{2\pi imx} \end{pmatrix}$
for $n, m \in \mathbb{Z}$. It is easily seen to be a continuous homomorphism $\mathbb{T} \to \mathrm{Aut}(\mathbb{C}^2)$ with character $\chi_\pi(x) = e^{2\pi imx} + e^{2\pi inx}$. But the two terms are irreducible characters for $\mathbb{T}$, cf. Example 1.15, and by Theorem 1.31 we have $\pi \cong \rho_n \oplus \rho_m$.
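The decomposition in Example 1.32 can also be recovered by machine: sampling at N equispaced points of $\mathbb{T}$ integrates trigonometric polynomials of degree less than N exactly, so the multiplicities $\langle \chi_\pi, \chi_{\rho_k} \rangle$ of Theorem 1.31 come out to machine precision. (A Python/NumPy sketch of mine, purely illustrative.)

    import numpy as np

    m, n = 5, -2                        # the two exponents in Example 1.32
    N = 64                              # equispaced quadrature nodes on R/Z
    x = np.arange(N) / N
    chi_pi = np.exp(2j*np.pi*m*x) + np.exp(2j*np.pi*n*x)

    for k in range(-6, 7):
        mult = np.mean(chi_pi * np.exp(-2j*np.pi*k*x))   # <chi_pi, chi_rho_k>
        if abs(mult) > 1e-9:
            print(k, round(mult.real))  # prints "-2 1" and "5 1": pi = rho_5 + rho_{-2}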
Corollary 1.33. For finite-dimensional representations $\pi_1$, $\pi_2$ and $\pi$ of a compact group we have:
1) $\pi_1 \cong \pi_2$ if and only if $\chi_{\pi_1} = \chi_{\pi_2}$.
2) $\pi$ is irreducible if and only if $\langle \chi_\pi, \chi_\pi \rangle = 1$.

Proof. For the first statement, the only-if part is true by the remarks following the definition of the character. To see the converse, assume that $\chi_{\pi_1} = \chi_{\pi_2}$. Then for each irreducible representation $\rho$ we must have $\langle \chi_{\pi_1}, \chi_\rho \rangle = \langle \chi_{\pi_2}, \chi_\rho \rangle$, and therefore $\pi_1$ and $\pi_2$ are equivalent to the same direct sum of irreducible representations; hence they are equivalent.

If $\pi$ is irreducible then Lemma 1.30 states that $\langle \chi_\pi, \chi_\pi \rangle = 1$. Conversely, assume $\langle \chi_\pi, \chi_\pi \rangle = 1$ and decompose $\pi$ into irreducibles: $\pi = \bigoplus m_i\pi_i$. Orthonormality of the irreducible characters again gives $\langle \chi_\pi, \chi_\pi \rangle = \sum_i m_i^2$. From this it is immediate that there is precisely one $m_i$ which is 1, while the rest are 0, i.e. $\pi = \pi_i$. Therefore $\pi$ is irreducible.

Considering the representations $\rho$ and $\bar\rho$ from Example 1.11, we see that the corresponding characters satisfy $\chi_{\bar\rho} = \overline{\chi_\rho}$, and since $\chi_\rho$ is not real-valued, they are certainly not equal. Hence the representations are inequivalent.

1.5 The Peter-Weyl Theorem

The single most important theorem in the representation theory of compact topological groups is the Peter-Weyl Theorem. It has numerous consequences, some of which we will mention at the end of this section.

Theorem 1.34 (Peter-Weyl I). Let G be a compact group. Then the subspace
$\mathcal{M}(G) := \sum_{\pi \in \widehat{G}} C(G)_\pi$
of C(G) is dense in $L^2(G)$. In other words, the linear span of all matrix coefficients of the finite-dimensional irreducible representations of G is dense in $L^2(G)$.

Proof. We want to show that $\overline{\mathcal{M}(G)} = L^2(G)$. We prove it by contradiction and assume that $\mathcal{M}(G)^\perp \neq 0$.

Suppose first that $\mathcal{M}(G)^\perp$ (which is a closed subspace of $L^2(G)$ and hence a Hilbert space itself) contains a finite-dimensional R-invariant subspace W (R being the right regular representation, $(R(g)f)(x) = f(xg)$) such that $R|_W$ is irreducible (we prove below that this is a consequence of the assumption $\mathcal{M}(G)^\perp \neq 0$). Then we can pick a finite orthonormal basis $(\varphi_i)$ for W, and for $0 \neq f \in W$
$f(x) = \sum_{i=1}^N \langle f, \varphi_i \rangle \varphi_i(x).$
This is a standard result in Hilbert space theory. Then we see that
$f(g) = (R|_W(g)f)(e) = \sum_{i=1}^N \langle R|_W(g)f, \varphi_i \rangle \varphi_i(e).$
Since $R|_W$ is a finite-dimensional irreducible representation, the map $g \mapsto \langle R|_W(g)f, \varphi_i \rangle$ is a matrix coefficient. But this means that $f \in \mathcal{M}(G)$, hence a contradiction.
Now let's prove the existence of the finite-dimensional R-invariant subspace. Let $f_0 \in \mathcal{M}(G)^\perp$ be nonzero. As C(G) is dense in $L^2(G)$, we can find $\varphi \in C(G)$ such that $\langle \varphi, f_0 \rangle \neq 0$. Put $\tilde\varphi(g) := \overline{\varphi(g^{-1})}$, define $K \in C(G \times G)$ by $K(x, y) = \tilde\varphi(xy^{-1})$, and let $T : L^2(G) \to L^2(G)$ be the integral operator with K as its kernel:
$Tf(x) = \int_G K(x, y)f(y)\,dy.$
According to functional analysis, this is a well-defined compact operator, and it commutes with R(g):
$TR(g)f(x) = \int_G K(x, y)R(g)f(y)\,dy = \int_G \tilde\varphi(xy^{-1})f(yg)\,dy = \int_G \tilde\varphi(xgy^{-1})f(y)\,dy = \int_G K(xg, y)f(y)\,dy = R(g)(Tf)(x).$
In the third equality we exploited the invariance of the measure under the right translation $y \mapsto yg^{-1}$. Since R(g) is unitary, the adjoint $T^*$ of T also commutes with R(g):
$T^*R(g) = T^*\big(R(g^{-1})\big)^* = \big(R(g^{-1})T\big)^* = \big(TR(g^{-1})\big)^* = R(g)T^*.$
Thus the self-adjoint compact operator $T^*T$ commutes with all R(g). The Spectral Theorem for compact operators yields a direct sum decomposition of $L^2(G)$:
$L^2(G) = \ker(T^*T) \oplus \Big(\bigoplus_{\lambda \neq 0} E_\lambda\Big),$
where all the eigenspaces $E_\lambda$ are finite-dimensional. They are also R-invariant, for if $f \in E_\lambda$ then
$T^*T(R(g)f) = R(g)(T^*T)f = R(g)(\lambda f) = \lambda(R(g)f),$ (1.8)
i.e. $R(g)f \in E_\lambda$. Moreover, $\mathcal{M}(G)$ is R-invariant: all its functions are of the form $\sum_{i=1}^n a_i\langle \pi_i(x)\varphi_i, \psi_i \rangle$, and since
$R(g)f(x) = f(xg) = \sum_{i=1}^n a_i\langle \pi_i(x)(\pi_i(g)\varphi_i), \psi_i \rangle,$
we see that $R(g)f \in \mathcal{M}(G)$. But then also $\mathcal{M}(G)^\perp$ is invariant. If $P : L^2(G) \to \mathcal{M}(G)^\perp$ denotes the orthogonal projection, then by Lemma 1.8, P commutes with R(g), and a calculation like (1.8) reveals that the $PE_\lambda$ are all R-invariant subspaces of $\mathcal{M}(G)^\perp$. These are very good candidates for the subspace we want: they are finite-dimensional and R-invariant, so we can restrict R to a representation on them. We just need to verify that at least one of them is nonzero.

So assume that the $PE_\lambda$ are all 0. This means, by definition of P, that $\bigoplus_\lambda E_\lambda \subseteq \overline{\mathcal{M}(G)}$ and hence that
$\mathcal{M}(G)^\perp \subseteq \Big(\bigoplus_\lambda E_\lambda\Big)^\perp = \ker T^*T \subseteq \ker T,$
where the last inclusion follows since $f \in \ker T^*T$ implies $0 = \langle T^*Tf, f \rangle = \langle Tf, Tf \rangle$, i.e. $Tf = 0$. But applied to the $f_0 \in \mathcal{M}(G)^\perp$ we picked at the beginning, we have
$Tf_0(e) = \int_G \tilde\varphi(y^{-1})f_0(y)\,dy = \int_G \overline{\varphi(y)}f_0(y)\,dy = \overline{\langle \varphi, f_0 \rangle} \neq 0,$
and as $Tf_0$ is continuous, $Tf_0 \neq 0$ as an $L^2$ function. Thus we must have at least one $\lambda$ for which $PE_\lambda \neq 0$. If R restricted to this space is not irreducible, it contains a nontrivial subspace on which it is. Thus, we have proved the result.
What we actually have shown in the course of the proof is that for each nonzero $f_0 \in L^2(G)$ we can find a finite-dimensional R-invariant subspace $U \subseteq L^2(G)$, not orthogonal to $f_0$, restricted to which R is irreducible. We can show exactly the same thing for the left regular representation L, $(L(g)f)(x) = f(g^{-1}x)$; all we need to alter is the definition of K, which should be $K(x, y) = \tilde\varphi(x^{-1}y)$. This observation will come in useful now, when we prove the promised generalization of Corollary 1.21.

Theorem 1.35 (Peter-Weyl II). Let $(\pi, H)$ be any (possibly infinite-dimensional) representation of a compact group G on a Hilbert space H. Then $\pi = \bigoplus \pi_i$ where each $\pi_i$ is a finite-dimensional irreducible representation of G, i.e. $\pi$ is completely reducible.

Proof. By virtue of Theorem 1.20 we can choose a new inner product on H turning $\pi$ into a unitary representation. Then we consider the set $\Sigma$ of collections of mutually orthogonal finite-dimensional invariant subspaces of H restricted to which $\pi$ is irreducible; i.e. an element $(U_i)_{i \in I}$ of $\Sigma$ is a collection of subspaces of H satisfying the mentioned properties. We equip $\Sigma$ with the ordering defined by $(U_i)_{i \in I} \leq (U_j)_{j \in J}$ if $\bigoplus_i U_i \subseteq \bigoplus_j U_j$. It is easily seen that $(\Sigma, \leq)$ is inductively ordered; hence Zorn's Lemma yields a maximal element $(V_i)_{i \in I}$. To show the desired conclusion, namely that $H = \bigoplus_{i \in I} V_i$, we assume that $W := \big(\bigoplus V_i\big)^\perp \neq 0$. We have a contradiction if we can find inside W a finite-dimensional $\pi$-invariant subspace on which $\pi$ is irreducible, so that's our goal.

First we remark that W is $\pi$-invariant, since it's the orthogonal complement of an invariant subspace; thus we can restrict $\pi$ to a representation on W. Now we will define an intertwiner $T : W \to L^2(G)$ between $\pi|_W$ and the left regular representation L. Fix a unit vector $x_0 \in W$ and define
$(Ty)(g) = \langle y, \pi(g)x_0 \rangle.$
$Ty : G \to \mathbb{C}$ is clearly continuous, and since $Tx_0(e) = \|x_0\|^2 \neq 0$, $Tx_0$ is nonzero in $L^2(G)$; hence T is nonzero as a linear map. T is continuous, as the Cauchy-Schwarz inequality and unitarity of $\pi(g)$ give
$|Ty(g)| = |\langle y, \pi(g)x_0 \rangle| \leq \|y\|\,\|x_0\|,$
that is, $\|T\| \leq \|x_0\|$. T is an intertwiner:
$(T\pi(h)y)(g) = \langle \pi(h)y, \pi(g)x_0 \rangle = \langle y, \pi(h^{-1}g)x_0 \rangle = \big(L(h)(Ty)\big)(g).$
The adjoint $T^* : L^2(G) \to W$ (which is nonzero, as T is) is an intertwiner as well, for taking adjoints in the identity $T\pi(h) = L(h)T$ yields $\pi(h)^*T^* = T^*L(h)^*$, i.e. $\pi(h^{-1})T^* = T^*L(h^{-1})$ for all h.

As $T^*$ is nonzero and $\mathcal{M}(G)$ is dense in $L^2(G)$ by the first Peter-Weyl Theorem, $T^*$ cannot vanish on all the finite-dimensional L-invariant subspaces on which L restricts irreducibly (by the remark above, their span is dense); thus we can find such a subspace $U \subseteq L^2(G)$ with $T^*U \neq 0$. Then $T^*U \subseteq W$ is finite-dimensional, nontrivial and $\pi$-invariant, for if $T^*f \in T^*U$, then $\pi(h)T^*f = T^*L(h)f \in T^*U$. Inside $T^*U$ we can now find a subspace on which $\pi$ is irreducible, hence the contradiction.

An immediate corollary of this is:

Corollary 1.36. An irreducible representation of a compact group is automatically finite-dimensional.
In particular the second Peter-Weyl Theorem says that the left regular representation is completely reducible. In many textbooks this is the statement of the Peter-Weyl Theorem. The proof of that statement is not much different from the proof we gave for the first version of the Peter-Weyl Theorem, and from it our second version could also be derived. I chose the version with matrix coefficients since it can be used immediately to provide elegant proofs of some results in Fourier theory, which we now discuss.

Theorem 1.37. Let G be a compact group. The set of irreducible characters constitutes an orthonormal basis for the Hilbert space $L^2(G, \mathrm{class})$. In particular, every square-integrable class function f on G can be written
$f = \sum_{\pi \in \widehat{G}} \langle f, \chi_\pi \rangle \chi_\pi,$
the convergence being $L^2$-convergence.

Proof. Let $P_\pi : L^2(G) \to C(G)_\pi$ denote the orthogonal projection onto $C(G)_\pi$. It is not hard to see that $P_\pi$ maps class functions to class functions; hence $P_\pi(L^2(G, \mathrm{class})) \subseteq C(G)_\pi \cap C(G, \mathrm{class})$, the last space being the 1-dimensional $\mathbb{C}\chi_\pi$ by Lemma 1.27. Hence the space
$\mathcal{M}(G, \mathrm{class}) := \mathcal{M}(G) \cap C(G, \mathrm{class}) = \sum_{\pi \in \widehat{G}} \big(C(G)_\pi \cap C(G, \mathrm{class})\big)$
has as orthonormal basis the set of irreducible characters of G. To see that the characters also form an orthonormal basis for the Hilbert space $L^2(G, \mathrm{class})$, assume that there exists an $f \in L^2(G, \mathrm{class})$ which is orthogonal to all the characters. Then, since $P_\pi f$ is just a scalar multiple of $\chi_\pi$, we see
$P_\pi f = \langle P_\pi f, \chi_\pi \rangle \chi_\pi = \langle f, \chi_\pi \rangle \chi_\pi = 0,$
where in the second equality we exploited self-adjointness of the projection $P_\pi$. Thus we must have $f \in \mathcal{M}(G)^\perp$, which by Peter-Weyl I implies $f = 0$.

Specializing to the circle group $\mathbb{T}$ yields the existence of Fourier series. First of all, since $\mathbb{T}$ is abelian, all functions defined on it are class functions, and functions on $\mathbb{T}$ are nothing but functions on $\mathbb{R}$ with period 1. Specializing the above theorem to this case then states that the irreducible characters $e^{2\pi inx}$ constitute an orthonormal basis for $L^2(\mathbb{T}, \mathrm{class}) = L^2(\mathbb{T})$ and that we have an expansion of any square-integrable function
$f = \sum_{n \in \mathbb{Z}} c_n(f)e^{2\pi inx}$ (1.9)
where $c_n(f)$ is the n'th Fourier coefficient
$c_n(f) = \langle f, \rho_n \rangle = \int_0^1 f(x)e^{-2\pi inx}\,dx.$
It's important to stress that the convergence in (1.9) is only $L^2$-convergence. If we put some restrictions on f, such as differentiability or continuous differentiability, we can achieve pointwise or uniform convergence of the series. We will not travel further into this realm of harmonic analysis.
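As a closing illustration, both the Fourier coefficients and the $L^2$-convergence in (1.9) are easy to exhibit numerically (Python with NumPy; the test function and sample sizes are arbitrary choices of mine). For the continuous periodic function $f(x) = x(1-x)$, the $L^2$ error of the truncated series decreases as the cutoff grows, as (1.9) predicts:

    import numpy as np

    x = np.arange(2048) / 2048.0
    f = x * (1 - x)                     # continuous as a function on R/Z

    def c(n):                           # c_n(f), computed by a Riemann sum
        return np.mean(f * np.exp(-2j*np.pi*n*x))

    for N in (4, 16, 64):
        partial = sum(c(n) * np.exp(2j*np.pi*n*x) for n in range(-N, N + 1))
        print(N, np.sqrt(np.mean(np.abs(f - partial)**2)))   # decreasing L^2 error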
Chapter 2

Structure Theory for Lie Algebras

2.1 Basic Notions

Although we succeeded in Chapter 1 in proving some fairly strong results, we must realize that there is a limit to how much we can say about topological groups, compact or not. For instance, the Peter-Weyl Theorem tells us that every representation of a compact group is completely reducible, but if we don't know the irreducible representations, then what's the use? Therefore we change our focus to Lie groups. The central difference when regarding Lie groups is of course that we have their Lie algebras at our disposal. Often these are much easier to handle than the groups themselves, while at the same time saying quite a lot about the group. Therefore we need to study Lie algebras and their representation theory. In this section we focus solely on Lie algebras, developing the tools necessary for the representation theory of the later chapters.

We will only consider Lie algebras over the fields $\mathbb{R}$ and $\mathbb{C}$ (commonly denoted $\mathbb{K}$), although many of the results in this chapter carry over to arbitrary (possibly algebraically closed) fields of characteristic 0.

Definition 2.1 (Lie Algebra). A Lie algebra $\mathfrak{g}$ over $\mathbb{K}$ is a $\mathbb{K}$-vector space $\mathfrak{g}$ equipped with a bilinear map $[\cdot,\cdot] : \mathfrak{g} \times \mathfrak{g} \to \mathfrak{g}$ satisfying
1) $[X, Y] = -[Y, X]$ (antisymmetry),
2) $[[X, Y], Z] + [[Y, Z], X] + [[Z, X], Y] = 0$ (Jacobi identity).
A Lie subalgebra $\mathfrak{h}$ of $\mathfrak{g}$ is a subspace of $\mathfrak{g}$ which is closed under the bracket, i.e. for which $[\mathfrak{h}, \mathfrak{h}] \subseteq \mathfrak{h}$. A Lie subalgebra $\mathfrak{h}$ for which $[\mathfrak{h}, \mathfrak{g}] \subseteq \mathfrak{h}$ is called an ideal.

In this thesis all Lie algebras will be finite-dimensional unless otherwise specified.

Example 2.2. The first examples of Lie algebras are algebras of matrices. By $\mathfrak{gl}(n, \mathbb{R})$ and $\mathfrak{gl}(n, \mathbb{C})$ we denote the sets of real resp. complex $n \times n$ matrices equipped with the commutator bracket $[A, B] = AB - BA$. It is trivial to verify that these are indeed Lie algebras. The list below contains the definitions of some of the classical Lie algebras; they are all subalgebras of the two Lie algebras just mentioned, and it is a matter of routine calculations to verify that these examples are indeed closed under the commutator bracket:
$\mathfrak{sl}(n, \mathbb{R}) = \{X \in \mathfrak{gl}(n, \mathbb{R}) \mid \operatorname{Tr} X = 0\}$
$\mathfrak{sl}(n, \mathbb{C}) = \{X \in \mathfrak{gl}(n, \mathbb{C}) \mid \operatorname{Tr} X = 0\}$
$\mathfrak{so}(n) = \{X \in \mathfrak{gl}(n, \mathbb{R}) \mid X + X^t = 0\}$
$\mathfrak{so}(m, n) = \{X \in \mathfrak{gl}(m+n, \mathbb{R}) \mid X^tI_{m,n} + I_{m,n}X = 0\}$
$\mathfrak{so}(n, \mathbb{C}) = \{X \in \mathfrak{gl}(n, \mathbb{C}) \mid X + X^t = 0\}$
$\mathfrak{u}(n) = \{X \in \mathfrak{gl}(n, \mathbb{C}) \mid X + X^* = 0\}$
$\mathfrak{u}(m, n) = \{X \in \mathfrak{gl}(m+n, \mathbb{C}) \mid X^*I_{m,n} + I_{m,n}X = 0\}$
$\mathfrak{su}(n) = \{X \in \mathfrak{gl}(n, \mathbb{C}) \mid X + X^* = 0,\ \operatorname{Tr} X = 0\}$
$\mathfrak{su}(m, n) = \{X \in \mathfrak{gl}(m+n, \mathbb{C}) \mid X^*I_{m,n} + I_{m,n}X = 0,\ \operatorname{Tr} X = 0\}$
where $I_{m,n}$ is the block-diagonal matrix whose first $m \times m$ block is the identity and whose last $n \times n$ block is minus the identity.
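These routine verifications can also be delegated to a computer as a sanity check. The sketch below (Python with NumPy; the random-element constructions are ad hoc choices of mine, purely illustrative) checks closure under the commutator bracket for $\mathfrak{su}(3)$ and $\mathfrak{so}(2,1)$:

    import numpy as np

    rng = np.random.default_rng(2)

    def random_su(n):
        A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
        X = A - A.conj().T                        # anti-hermitian
        return X - (np.trace(X) / n) * np.eye(n)  # and traceless

    X, Y = random_su(3), random_su(3)
    B = X @ Y - Y @ X                             # the commutator bracket
    print(np.allclose(B + B.conj().T, 0), np.isclose(np.trace(B), 0))  # True True

    m, n = 2, 1
    I = np.diag([1.0] * m + [-1.0] * n)           # the matrix I_{m,n}
    def random_so(m, n):
        A = rng.standard_normal((m + n, m + n))
        return A - I @ A.T @ I                    # satisfies X^t I + I X = 0

    X, Y = random_so(m, n), random_so(m, n)
    B = X @ Y - Y @ X
    print(np.allclose(B.T @ I + I @ B, 0))        # True: so(2,1) is closed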
Another interesting example is the endomorphism algebra $\operatorname{End}_{\mathbb K}(V)$ for some $\mathbb K$-vector space $V$, finite-dimensional or not. Equipped with the commutator bracket $[A, B] = AB - BA$ this becomes a Lie algebra over $\mathbb K$, as one can check. To emphasize the Lie algebra structure it is sometimes denoted $\mathfrak{gl}(V)$. We stick to $\operatorname{End}(V)$.

We always have the trivial ideals in $\mathfrak g$, namely $0$ and $\mathfrak g$ itself. If $\mathfrak g$ is a Lie algebra and $\mathfrak h$ is an ideal in $\mathfrak g$, then we can form the quotient algebra $\mathfrak g/\mathfrak h$ in the following way: the underlying vector space is the vector space $\mathfrak g/\mathfrak h$, and this we equip with the bracket $[X + \mathfrak h, Y + \mathfrak h] = [X, Y] + \mathfrak h$. Using the ideal property it is easily checked that this is indeed well-defined and satisfies the properties of a Lie algebra.

Definition 2.3 (Lie Algebra Homomorphism). Let $\mathfrak g$ and $\mathfrak g'$ be Lie algebras over $\mathbb K$. A $\mathbb K$-linear map $\varphi : \mathfrak g \to \mathfrak g'$ is called a Lie algebra homomorphism if it satisfies $[\varphi(X), \varphi(Y)] = \varphi([X, Y])$ for all $X, Y \in \mathfrak g$. If $\varphi$ is bijective it is called a Lie algebra isomorphism.

An example of a Lie algebra homomorphism is the canonical map $\kappa : \mathfrak g \to \mathfrak g/\mathfrak h$ mapping $X$ to $X + \mathfrak h$. It is easy to see that the image of a Lie algebra homomorphism is a Lie subalgebra of $\mathfrak g'$ and that the kernel of a homomorphism is an ideal in $\mathfrak g$. Another interesting example is the so-called adjoint representation $\operatorname{ad} : \mathfrak g \to \operatorname{End}(\mathfrak g)$ given by $\operatorname{ad}(X)Y = [X, Y]$. We see that $\operatorname{ad}(X)$ is linear, hence an endomorphism, and that the map $X \mapsto \operatorname{ad}(X)$ is linear. By virtue of the Jacobi identity it respects the bracket operation, and thus $\operatorname{ad}$ is a Lie algebra homomorphism.

In analogy with vector spaces and rings we have the following

Proposition 2.4. Let $\varphi : \mathfrak g \to \mathfrak g'$ be a Lie algebra homomorphism and $\mathfrak h \subseteq \mathfrak g$ an ideal which contains $\ker\varphi$. Then there exists a unique Lie algebra homomorphism $\overline\varphi : \mathfrak g/\mathfrak h \to \mathfrak g'$ such that $\varphi = \overline\varphi \circ \kappa$. In the case that $\mathfrak h = \ker\varphi$ and $\mathfrak g' = \operatorname{im}\varphi$ the induced map is an isomorphism. If $\mathfrak h$ and $\mathfrak k$ are ideals in $\mathfrak g$ then there exists a natural isomorphism $(\mathfrak h + \mathfrak k)/\mathfrak k \cong \mathfrak h/(\mathfrak h \cap \mathfrak k)$.

Definition 2.5 (Centralizer). Finally, for any element $X \in \mathfrak g$ we define the centralizer $C(X)$ of $X$ to be the set of elements in $\mathfrak g$ which commute with $X$. Let $\mathfrak h$ be any subalgebra of $\mathfrak g$. The centralizer $C(\mathfrak h)$ of $\mathfrak h$ is the set of all elements of $\mathfrak g$ that commute with all elements of $\mathfrak h$.
The centralizer of $\mathfrak g$ itself is called the center and is denoted $Z(\mathfrak g)$. For a subalgebra $\mathfrak h$ of $\mathfrak g$ we define the normalizer $N(\mathfrak h)$ of $\mathfrak h$ to be the set of all elements $X \in \mathfrak g$ for which $[X, \mathfrak h] \subseteq \mathfrak h$.

We immediately see that the centralizer of $X$ is just $\ker \operatorname{ad}(X)$, which by the Jacobi identity is a subalgebra. Furthermore we see that $C(\mathfrak h) = \bigcap_{X \in \mathfrak h} C(X)$ and that $Z(\mathfrak g) = \ker \operatorname{ad}$. Hence the center, being the kernel of a Lie algebra homomorphism, is an ideal. Finally, a subalgebra of $\mathfrak g$ is an ideal if and only if its normalizer is all of $\mathfrak g$.

Now consider the so-called derived algebra $D\mathfrak g := [\mathfrak g, \mathfrak g]$, which clearly is an ideal. $\mathfrak g$ is called abelian if $D\mathfrak g = 0$, i.e. if $[X, Y] = 0$ for all $X, Y \in \mathfrak g$. Every 1-dimensional Lie algebra is abelian by antisymmetry of the bracket.

Definition 2.6 (Simple Lie Algebra). A nontrivial Lie algebra is called indecomposable if the only ideals are the trivial ones: $\mathfrak g$ and $0$. A nontrivial Lie algebra is called simple if it is indecomposable and $D\mathfrak g \neq 0$.

Any 1-dimensional Lie algebra is indecomposable, and as the next proposition shows, the requirement $D\mathfrak g \neq 0$ is just to get rid of these trivial examples:

Proposition 2.7. A Lie algebra is simple if and only if it is indecomposable and $\dim \mathfrak g \geq 2$.

Proof. If $\mathfrak g$ is simple then it is not abelian, hence we must have $\dim \mathfrak g \geq 2$. Conversely, assume that $\mathfrak g$ is indecomposable and $\dim \mathfrak g \geq 2$. As $D\mathfrak g$ is an ideal we can only have $D\mathfrak g = 0$ or $D\mathfrak g = \mathfrak g$. In the first case $\mathfrak g$ is abelian, hence all subspaces are ideals, and since $\dim \mathfrak g \geq 2$ nontrivial ideals exist, contradicting indecomposability. Therefore $D\mathfrak g = \mathfrak g \neq 0$.

Now let's consider the following sequence of ideals, the so-called derived series:
$$D^1\mathfrak g := D\mathfrak g,\quad D^2\mathfrak g := [D^1\mathfrak g, D^1\mathfrak g],\quad \dots,\quad D^n\mathfrak g := [D^{n-1}\mathfrak g, D^{n-1}\mathfrak g].$$
Obviously we have $D^{m+n}\mathfrak g = D^m(D^n\mathfrak g)$. To see that these are really ideals we use induction: we have already seen that $D^1\mathfrak g$ is an ideal, so assume that $D^{n-1}\mathfrak g$ is an ideal. Let $X, X' \in D^{n-1}\mathfrak g$ and let $Y \in \mathfrak g$ be arbitrary. Then by the Jacobi identity
$$[[X, X'], Y] = [[X, Y], X'] + [[Y, X'], X].$$
Since $D^{n-1}\mathfrak g$ is an ideal, $[X, Y], [Y, X'] \in D^{n-1}\mathfrak g$, showing that $[[X, X'], Y] \in D^n\mathfrak g$.

Definition 2.8 (Solvable Lie Algebra). A Lie algebra is called solvable if there exists an $N$ such that $D^N\mathfrak g = 0$.

Abelian Lie algebras are solvable, since we can take $N = 1$. On the other hand, simple Lie algebras are definitely not solvable, for we showed in the proof of Proposition 2.7 that $D\mathfrak g = \mathfrak g$, which implies that $D^n\mathfrak g = \mathfrak g$ for all $n$.
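For matrix Lie algebras the derived series can be computed quite mechanically. The following Python sketch does this for the solvable algebra of upper-triangular $2 \times 2$ matrices; it relies on a tolerance-based numerical rank test, so it is a heuristic illustration rather than a proof:

    import numpy as np
    from itertools import product

    def independent(mats):
        """Extract a linearly independent spanning subset (numerical rank test)."""
        basis, vecs = [], []
        for M in mats:
            if np.linalg.norm(M) < 1e-12:
                continue          # discard (numerically) zero brackets
            if vecs and np.linalg.matrix_rank(
                    np.vstack(vecs + [M.reshape(-1)])) == len(vecs):
                continue          # M is already in the span
            vecs.append(M.reshape(-1))
            basis.append(M)
        return basis

    def derived(basis):
        """A basis of D g = [g, g] for a matrix Lie algebra with given basis."""
        return independent([X @ Y - Y @ X for X, Y in product(basis, repeat=2)])

    # The upper-triangular 2x2 matrices: solvable but not abelian.
    E11 = np.array([[1., 0.], [0., 0.]])
    E12 = np.array([[0., 1.], [0., 0.]])
    E22 = np.array([[0., 0.], [0., 1.]])

    g = [E11, E12, E22]
    while g:
        g = derived(g)
        print(len(g))   # prints 1 (D^1 g is spanned by E12), then 0 (D^2 g = 0)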
Proposition 2.9. Let $\mathfrak g$ be a Lie algebra.
1) If $\mathfrak g$ is solvable, then so are all subalgebras of $\mathfrak g$.
2) If $\mathfrak g$ is solvable and $\varphi : \mathfrak g \to \mathfrak g'$ is a Lie algebra homomorphism, then $\operatorname{im}\varphi$ is solvable.
3) If $\mathfrak h \subseteq \mathfrak g$ is a solvable ideal such that $\mathfrak g/\mathfrak h$ is solvable, then $\mathfrak g$ is solvable.
4) If $\mathfrak h$ and $\mathfrak k$ are solvable ideals of $\mathfrak g$, then so is $\mathfrak h + \mathfrak k$.

Proof. 1) It should be clear that $D\mathfrak h \subseteq D\mathfrak g$. Hence, by induction, $D^i\mathfrak h \subseteq D^i\mathfrak g$, and since $D^N\mathfrak g = 0$ for some $N$, then $D^N\mathfrak h = 0$ as well.
2) Since $\varphi$ is a Lie algebra homomorphism we have $D(\varphi(\mathfrak g)) = \varphi(D\mathfrak g)$, and again by induction $D^i(\varphi(\mathfrak g)) = \varphi(D^i\mathfrak g)$. Thus $D^N\mathfrak g = 0$ implies $D^N(\varphi(\mathfrak g)) = 0$.
3) Assume there is an $N$ for which $D^N(\mathfrak g/\mathfrak h) = 0$ and consider the canonical map $\kappa : \mathfrak g \to \mathfrak g/\mathfrak h$. As above we have $D^i(\mathfrak g/\mathfrak h) = D^i(\kappa(\mathfrak g)) = \kappa(D^i\mathfrak g)$. Thus, since $D^N(\mathfrak g/\mathfrak h) = 0$, we have $\kappa(D^N\mathfrak g) = 0$, i.e. $D^N\mathfrak g \subseteq \mathfrak h$. But $\mathfrak h$ was also solvable, so we can find an $M$ for which $D^M\mathfrak h = 0$. Then $D^{M+N}\mathfrak g = D^M(D^N\mathfrak g) \subseteq D^M\mathfrak h = 0$, i.e. $\mathfrak g$ is solvable.
4) By 3) of this proposition it is enough to prove that $(\mathfrak h + \mathfrak k)/\mathfrak k$ is solvable. By Proposition 2.4 there exists an isomorphism $(\mathfrak h + \mathfrak k)/\mathfrak k \cong \mathfrak h/(\mathfrak h \cap \mathfrak k)$, and the right-hand side is solvable since it is the image of the canonical map $\mathfrak h \to \mathfrak h/(\mathfrak h \cap \mathfrak k)$.

The last point of this proposition yields the existence of a maximal solvable ideal in $\mathfrak g$: if $\mathfrak h$ and $\mathfrak k$ are solvable ideals, then $\mathfrak h + \mathfrak k$ is a solvable ideal containing both, and thus the sum of all solvable ideals is a solvable ideal. This works since the Lie algebra is finite-dimensional. By construction it is unique.

Definition 2.10 (Radical). The maximal solvable ideal, the existence of which we have just verified, is called the radical of $\mathfrak g$ and is denoted $\operatorname{Rad}\mathfrak g$. A Lie algebra $\mathfrak g$ is called semisimple if $\operatorname{Rad}\mathfrak g = 0$.

Since all solvable ideals are contained in $\operatorname{Rad}\mathfrak g$, another way of formulating semisimplicity would be to say that $\mathfrak g$ has no nonzero solvable ideals. In this sense, semisimple Lie algebras are as far as possible from being solvable. In the next section we prove some equivalent conditions for semisimplicity.

Proposition 2.11. Semisimple Lie algebras have trivial centers.

Proof. The center is an abelian, hence solvable, ideal, and is therefore trivial by definition.

We now consider a concept closely related to solvability. Again we consider a sequence of ideals, the lower central series:
$$\mathfrak g^0 := \mathfrak g,\quad \mathfrak g^1 := D\mathfrak g,\quad \mathfrak g^2 := [\mathfrak g, \mathfrak g^1],\quad \dots,\quad \mathfrak g^n := [\mathfrak g, \mathfrak g^{n-1}].$$
It shouldn't be too hard to see that $D^i\mathfrak g \subseteq \mathfrak g^i$.

Definition 2.12 (Nilpotent Lie Algebra). A Lie algebra $\mathfrak g$ is called nilpotent if there exists an $N$ such that $\mathfrak g^N = 0$.

Since $D^i\mathfrak g \subseteq \mathfrak g^i$, nilpotency of $\mathfrak g$ implies solvability of $\mathfrak g$. The converse statement is not true in general. So schematically:
$$\text{abelian} \implies \text{nilpotent} \implies \text{solvable};$$
in other words, solvability and nilpotency are in some sense generalizations of being abelian. Here is a proposition analogous to Proposition 2.9.

Proposition 2.13. Let $\mathfrak g$ be a Lie algebra.
1) If $\mathfrak g$ is nilpotent, then so are all its subalgebras.
2) If $\mathfrak g$ is nilpotent and $\varphi : \mathfrak g \to \mathfrak g'$ is a Lie algebra homomorphism, then $\operatorname{im}\varphi$ is nilpotent.
3) If $\mathfrak g/Z(\mathfrak g)$ is nilpotent, then $\mathfrak g$ is nilpotent.
4) If $\mathfrak g$ is nilpotent, then $Z(\mathfrak g) \neq 0$.

Proof. 1) In analogy with the proof of Proposition 2.9, a small induction argument shows that if $\mathfrak h \subseteq \mathfrak g$ is a subalgebra, then $\mathfrak h^i \subseteq \mathfrak g^i$. Thus $\mathfrak g^N = 0$ implies $\mathfrak h^N = 0$.
2) We have already seen that $\varphi(\mathfrak g)^1 = \varphi(D\mathfrak g)$. Furthermore
$$\varphi(\mathfrak g)^2 = [\varphi(\mathfrak g), \varphi(\mathfrak g)^1] = [\varphi(\mathfrak g), \varphi(D\mathfrak g)] = \varphi([\mathfrak g, D\mathfrak g]) = \varphi(\mathfrak g^2),$$
and by induction we get $\varphi(\mathfrak g)^i = \varphi(\mathfrak g^i)$. Hence nilpotency of $\mathfrak g$ implies nilpotency of $\varphi(\mathfrak g)$.
3) Letting $\kappa : \mathfrak g \to \mathfrak g/Z(\mathfrak g)$ denote the canonical homomorphism, we see that $(\mathfrak g/Z(\mathfrak g))^i = \kappa(\mathfrak g)^i = \kappa(\mathfrak g^i)$. Thus, if $(\mathfrak g/Z(\mathfrak g))^N = 0$ then $\mathfrak g^N \subseteq Z(\mathfrak g)$. But then $\mathfrak g^{N+1} = [\mathfrak g, \mathfrak g^N] \subseteq [\mathfrak g, Z(\mathfrak g)] = 0$, hence $\mathfrak g$ is nilpotent.
4) As $\mathfrak g$ is nilpotent there is an $n$ such that $\mathfrak g^n \neq 0$ and $\mathfrak g^{n+1} = 0$. This means that $[\mathfrak g, \mathfrak g^n] = 0$, i.e. everything in $\mathfrak g^n$ commutes with all elements of $\mathfrak g$. Thus $0 \neq \mathfrak g^n \subseteq Z(\mathfrak g)$.

Definition 2.14. An element $X \in \mathfrak g$ is called ad-nilpotent if $\operatorname{ad}(X)$ is a nilpotent linear map, i.e. if there exists an $N$ such that $\operatorname{ad}(X)^N = 0$.

If the Lie algebra is a subalgebra of an algebra of endomorphisms (for instance $\operatorname{End}(V)$), it makes sense to ask if the elements themselves are nilpotent. In this case nilpotency and ad-nilpotency of an element $X$ need not be the same. For instance in $\operatorname{End}(V)$ we have the identity $I$, which is obviously not nilpotent. However, $\operatorname{ad}(I) = 0$, and thus $I$ is ad-nilpotent. The reverse implication, however, is true:

Lemma 2.15. Let $\mathfrak g$ be a Lie algebra of endomorphisms of some vector space. If $X \in \mathfrak g$ is nilpotent, then it is ad-nilpotent.

Proof. We associate to $A \in \mathfrak g$ two linear maps $\lambda_A, \rho_A : \operatorname{End}(V) \to \operatorname{End}(V)$ by $\lambda_A(B) = AB$ and $\rho_A(B) = BA$. It's easy to see that they commute and that $\operatorname{ad}(A) = \lambda_A - \rho_A$. As $A$ is nilpotent, $\lambda_A$ and $\rho_A$ are also nilpotent, so we can find an $N$ for which $\lambda_A^N = \rho_A^N = 0$. Since they commute, we can use the binomial formula and get
$$\operatorname{ad}(A)^{2N} = (\lambda_A - \rho_A)^{2N} = \sum_{j=0}^{2N} (-1)^j \binom{2N}{j} \lambda_A^{2N-j} \rho_A^j,$$
which is zero since each term contains either $\lambda_A$ or $\rho_A$ to a power at least $N$.

An equivalent formulation of nilpotency of a Lie algebra is that there exists an $N$ such that $\operatorname{ad}(X_1)\cdots\operatorname{ad}(X_N)Y = 0$ for all $X_1, \dots, X_N, Y \in \mathfrak g$. In particular, if $\mathfrak g$ is nilpotent, then there exists an $N$ such that $\operatorname{ad}(X)^N = 0$ for all $X \in \mathfrak g$, i.e. every element is ad-nilpotent. That the converse is actually true is the statement of Engel's Theorem, which will be a corollary to the following theorem.

Theorem 2.16. Let $V$ be a finite-dimensional vector space and $\mathfrak g \subseteq \operatorname{End}(V)$ a subalgebra consisting of nilpotent linear endomorphisms. Then there exists a nonzero $v \in V$ such that $Av = 0$ for all $A \in \mathfrak g$.
Proof. We will prove this by induction over the dimension of $\mathfrak g$. First assume $\dim\mathfrak g = 1$. Then $\mathfrak g = \mathbb K A$ for some nonzero $A \in \mathfrak g$. As $A$ is nilpotent there is a smallest $N$ such that $A^N \neq 0$ and $A^{N+1} = 0$, i.e. we can find a vector $w \in V$ with $A^N w \neq 0$ and $A(A^N w) = A^{N+1}w = 0$. Since all elements of $\mathfrak g$ are scalar multiples of $A$, the vector $A^N w$ will qualify.

Now, assuming that the theorem holds for all Lie algebras of dimension strictly less than $n$, we should prove that it holds for $n$-dimensional algebras as well. The algebra $\mathfrak g$ consists of nilpotent endomorphisms of $V$, hence by the previous lemma all its elements are ad-nilpotent. Consider a proper subalgebra $\mathfrak h \subsetneq \mathfrak g$, which thus also consists of ad-nilpotent elements. For $A \in \mathfrak h$ we have $\operatorname{ad}(A)\mathfrak h \subseteq \mathfrak h$, since $\mathfrak h$ as a subalgebra is closed under brackets. We can form the vector space $\mathfrak g/\mathfrak h$ and define a linear map $\overline{\operatorname{ad}}(A) : \mathfrak g/\mathfrak h \to \mathfrak g/\mathfrak h$ by $\overline{\operatorname{ad}}(A)(B + \mathfrak h) = (\operatorname{ad}(A)B) + \mathfrak h$. This is well-defined, for if $B + \mathfrak h = B' + \mathfrak h$ then $B - B' \in \mathfrak h$ and therefore
$$\overline{\operatorname{ad}}(A)(B + \mathfrak h) = \operatorname{ad}(A)B + \mathfrak h = \operatorname{ad}(A)B' + \operatorname{ad}(A)(B - B') + \mathfrak h = \operatorname{ad}(A)B' + \mathfrak h = \overline{\operatorname{ad}}(A)(B' + \mathfrak h).$$
This map is again nilpotent, since $\overline{\operatorname{ad}}(A)^N(B + \mathfrak h) = (\operatorname{ad}(A)^N B) + \mathfrak h = \mathfrak h = [0]$. So the situation now is that we have a subalgebra $\overline{\operatorname{ad}}(\mathfrak h)$ of $\operatorname{End}(\mathfrak g/\mathfrak h)$ with $\dim\overline{\operatorname{ad}}(\mathfrak h) \leq \dim\mathfrak h < \dim\mathfrak g = n$. Our induction hypothesis then yields an element $0 \neq [B_0] = B_0 + \mathfrak h \in \mathfrak g/\mathfrak h$ on which $\overline{\operatorname{ad}}(A)$ is zero for all $A \in \mathfrak h$. This means that $[A, B_0] \in \mathfrak h$ for all $A \in \mathfrak h$, while $B_0 \notin \mathfrak h$, i.e. the normalizer $N(\mathfrak h)$ of $\mathfrak h$ is strictly larger than $\mathfrak h$.

Now let $\mathfrak h$ be a maximal proper subalgebra of $\mathfrak g$. Since $N(\mathfrak h)$ is a strictly larger subalgebra we must have $N(\mathfrak h) = \mathfrak g$, and consequently $\mathfrak h$ is an ideal. Then $\mathfrak g/\mathfrak h$ is a Lie algebra with canonical Lie algebra homomorphism $\kappa : \mathfrak g \to \mathfrak g/\mathfrak h$, and $\mathfrak g/\mathfrak h$ must have dimension 1: assuming otherwise, we could find a 1-dimensional subalgebra $\mathfrak k \subsetneq \mathfrak g/\mathfrak h$ (any 1-dimensional subspace is a subalgebra), and then $\kappa^{-1}(\mathfrak k)$ would be a proper subalgebra of $\mathfrak g$ strictly larger than $\mathfrak h$. This is a contradiction, hence $\dim\mathfrak g/\mathfrak h = 1$ and $\mathfrak g = \mathfrak h \oplus \mathbb K A_0$ for some nonzero $A_0 \in \mathfrak g \setminus \mathfrak h$.

So far, so good. Now we come to the real proof of the existence of the nonzero vector $v \in V$. $\mathfrak h$ is an ideal of dimension $n - 1$, hence the induction hypothesis ensures that the subspace
$$W := \{v \in V \mid \forall B \in \mathfrak h : Bv = 0\}$$
is nonzero. We will show that each linear map $A \in \mathfrak g$ (which maps $V \to V$) restricts to a map $W \to W$, and that as such a map it is still nilpotent. This will in particular hold for $A_0$, which by nilpotency will have the eigenvalue 0 and hence a nonzero eigenvector $v \in W$ associated to the eigenvalue 0. This will be the desired vector, for all linear maps in $\mathfrak g$ can, according to the decomposition above, be written as $B + \lambda A_0$ for some $B \in \mathfrak h$, and $Bv = 0$ since $v$ was chosen to be in $W$. Thus, to finish the proof we only need to see that $W$ is invariant. So let $A \in \mathfrak g$ be any map. Since $\mathfrak h$ is an ideal, $[A, \mathfrak h] \subseteq \mathfrak h$, and hence for $w \in W$
$$B(Aw) = A(Bw) - [A, B]w = 0 \quad\text{for all } B \in \mathfrak h.$$
This shows that $Aw \in W$ and hence that $W$ is invariant. A restriction of a nilpotent map is clearly nilpotent. This completes the proof.
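Lemma 2.15 is easy to illustrate numerically. In the sketch below, $\operatorname{ad}(A)$ is realized as an $n^2 \times n^2$ matrix via the (row-major) vectorization identities $\operatorname{vec}(AB) = (A \otimes I)\operatorname{vec}(B)$ and $\operatorname{vec}(BA) = (I \otimes A^t)\operatorname{vec}(B)$; the exponent $2n$ comes from the bound $2N$ in the proof of the lemma:

    import numpy as np

    n = 4
    # A strictly upper-triangular matrix is nilpotent: A^n = 0.
    rng = np.random.default_rng(1)
    A = np.triu(rng.standard_normal((n, n)), k=1)
    print(np.allclose(np.linalg.matrix_power(A, n), 0))          # True

    # ad(A) acting on End(V), as an n^2 x n^2 matrix via row-major vec.
    I = np.eye(n)
    adA = np.kron(A, I) - np.kron(I, A.T)

    # Lemma 2.15 bounds the nilpotency index of ad(A) by 2N (here N = n).
    print(np.allclose(np.linalg.matrix_power(adA, 2 * n), 0))    # True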
From Theorem 2.16 we can prove

Corollary 2.17 (Engel's Theorem). A Lie algebra is nilpotent if and only if all its elements are ad-nilpotent.

Proof. We have already shown the 'only if' part. To show the 'if' part we again invoke induction over the dimension of $\mathfrak g$. If $\dim\mathfrak g = 1$ then $\mathfrak g$ is abelian, hence nilpotent. Now set $n = \dim\mathfrak g$ and assume that the result holds for all Lie algebras of dimension strictly less than $n$. All the elements of $\mathfrak g$ are ad-nilpotent, hence $\operatorname{ad}(\mathfrak g)$ is a subalgebra of $\operatorname{End}(\mathfrak g)$ consisting of nilpotent elements, and the previous theorem yields an element $0 \neq X \in \mathfrak g$ for which $\operatorname{ad}(Y)X = 0$ for all $Y \in \mathfrak g$, i.e. $X$ is contained in the center $Z(\mathfrak g)$, which is therefore a nonzero ideal, and $\mathfrak g/Z(\mathfrak g)$ is a Lie algebra whose dimension is strictly less than $n$. Furthermore all elements of $\mathfrak g/Z(\mathfrak g)$ are ad-nilpotent: by definition of the quotient bracket, $\operatorname{ad}([A])[B] = [A, B] + Z(\mathfrak g)$, so $\operatorname{ad}(A)^N = 0$ implies $\operatorname{ad}([A])^N = 0$. Thus $\mathfrak g/Z(\mathfrak g)$ consists solely of ad-nilpotent elements, and consequently the induction hypothesis ensures that $\mathfrak g/Z(\mathfrak g)$ is nilpotent. Then by Proposition 2.13, $\mathfrak g$ is nilpotent.

2.2 Semisimple Lie Algebras

The primary goal of this section is to reach some equivalent formulations of semisimplicity. Our approach to this will be via the so-called Cartan Criterion for solvability, which we will prove shortly. First we need a quite powerful result from linear algebra regarding "advanced diagonalization":

Theorem 2.18 (SN-Decomposition). Let $V$ be a finite-dimensional vector space over $\mathbb K$ and let $A \in \operatorname{End}(V)$. Then there exist unique commuting linear maps $S, N \in \operatorname{End}(V)$, $S$ being diagonalizable and $N$ being nilpotent, satisfying $A = S + N$. This is called the SN-decomposition. In fact $S$ and $N$ can be realized as polynomials in $A$ without constant terms. Furthermore, if $A = S + N$ is the SN-decomposition of $A$, then $\operatorname{ad}(S) + \operatorname{ad}(N)$ is the SN-decomposition of $\operatorname{ad}(A)$.

We will not prove this; for a proof the reader is referred to for instance [5] Section 4.3.

Cartan's Criterion gives a sufficient condition for solvability based on the trace of certain matrices. Therefore the following lemma is necessary.

Lemma 2.19. Let $V$ be a finite-dimensional vector space, let $W_2 \subseteq W_1$ be subspaces of $\operatorname{End}(V)$, and define $M := \{B \in \operatorname{End}(V) \mid \operatorname{ad}(B)W_1 \subseteq W_2\}$. If $A \in M$ satisfies $\operatorname{Tr}(AB) = 0$ for all $B \in M$, then $A$ is nilpotent.

Proof. Let $A \in M$ satisfy the required condition, and consider the SN-decomposition $A = S + N$. We are done if we can show that $S = 0$. Well, $S$ is diagonalizable, so we can find a basis $\{e_1, \dots, e_n\}$ for $V$ in which $S$ has the form $\operatorname{diag}(a_1, \dots, a_n)$. We will show that all these eigenvalues are 0, and we do so in a curious way: we define $E := \operatorname{span}_{\mathbb Q}\{a_1, \dots, a_n\} \subseteq \mathbb K$, the subspace of $\mathbb K$ over the rationals spanned by the eigenvalues. If we can show that this space, or equivalently its dual space $E^*$ consisting of $\mathbb Q$-linear maps $E \to \mathbb Q$, is 0, then we are done. So let $\varphi \in E^*$ be arbitrary. The basis we chose for $V$ readily gives us a basis for $\operatorname{End}(V)$ consisting of the maps $E_{ij}$ determined by $E_{ij}e_j = e_i$ and $E_{ij}e_k = 0$ for $k \neq j$. Then we see
$$(\operatorname{ad}(S)E_{ij})e_j = [S, E_{ij}]e_j = SE_{ij}e_j - E_{ij}Se_j = Se_i - a_j E_{ij}e_j = (a_i - a_j)e_i,$$
while $[S, E_{ij}]e_k = 0$ for $k \neq j$, i.e. $\operatorname{ad}(S)E_{ij} = (a_i - a_j)E_{ij}$. Now let $B \in \operatorname{End}(V)$ denote the linear map which in the basis $\{e_1, \dots, e_n\}$ is $\operatorname{diag}(\varphi(a_1), \dots, \varphi(a_n))$. As with $S$ we have $\operatorname{ad}(B)E_{ij} = (\varphi(a_i) - \varphi(a_j))E_{ij}$. There exists a polynomial $p = \sum_{n=1}^N c_n x^n$ without constant term which maps each $a_i - a_j$ to $\varphi(a_i - a_j) = \varphi(a_i) - \varphi(a_j)$ (it's a matter of solving some equations to find the coefficients $c_n$, i.e. Lagrange interpolation; this is consistent because $\varphi$ is $\mathbb Q$-linear). Then we have
$$p(\operatorname{ad} S)E_{ij} = c_N(\operatorname{ad} S)^N E_{ij} + \dots + c_1(\operatorname{ad} S)E_{ij} = c_N(a_i - a_j)^N E_{ij} + \dots + c_1(a_i - a_j)E_{ij} = p(a_i - a_j)E_{ij} = (\varphi(a_i) - \varphi(a_j))E_{ij},$$
which says that $p(\operatorname{ad} S) = \operatorname{ad} B$. A statement in the SN-decomposition was that $\operatorname{ad} S$, being the diagonalizable part of $\operatorname{ad} A$, is itself a polynomial expression in $\operatorname{ad} A$ without constant term, which implies that $\operatorname{ad} B$ is a polynomial in $\operatorname{ad} A$ without constant term. Since $A \in M$ we have $\operatorname{ad}(A)W_1 \subseteq W_2$, and since $\operatorname{ad}(B)$ is a polynomial in $\operatorname{ad}(A)$ without constant term (and $W_2 \subseteq W_1$), also $\operatorname{ad}(B)W_1 \subseteq W_2$, i.e. $B \in M$, and therefore by assumption $\operatorname{Tr}(AB) = 0$. The trace of $AB$ is the sum $\sum_{i=1}^n a_i\varphi(a_i)$, and applying $\varphi$ to the equation $\operatorname{Tr}(AB) = 0$ we get $\sum_{i=1}^n \varphi(a_i)^2 = 0$, i.e. $\varphi(a_i) = 0$ (the $\varphi(a_i)$ are rationals, hence $\varphi(a_i)^2 \geq 0$). Therefore we must have $\varphi = 0$, which was what we wanted.

Theorem 2.20 (Cartan's Criterion). Let $V$ be a finite-dimensional vector space and $\mathfrak g \subseteq \operatorname{End}(V)$ a subalgebra. If $\operatorname{Tr}(AB) = 0$ for all $A \in \mathfrak g$ and all $B \in D\mathfrak g$, then $\mathfrak g$ is solvable.

Proof. As $D^n\mathfrak g = D^{n-1}(D\mathfrak g) \subseteq (D\mathfrak g)^{n-1}$, we see that $\mathfrak g$ will be solvable if $D\mathfrak g$ is nilpotent. To show that $D\mathfrak g$ is nilpotent we invoke Engel's Theorem and Lemma 2.15, which combined say that $D\mathfrak g$ is nilpotent if all $X \in D\mathfrak g$ are nilpotent. To this end we use the preceding lemma with $W_1 = \mathfrak g$, $W_2 = D\mathfrak g$ and $M = \{B \in \operatorname{End}(V) \mid [B, \mathfrak g] \subseteq D\mathfrak g\}$. Notice that $\mathfrak g \subseteq M$; the reverse inclusion need not hold. Now let $A \in D\mathfrak g$ be arbitrary. We need to show that it is nilpotent, and by virtue of the previous lemma it suffices to verify that $\operatorname{Tr}(AB) = 0$ for all $B \in M$. $A$ is a sum of elements of the form $[X, Y]$ with $X, Y \in \mathfrak g$, so by linearity of the trace we may assume $A = [X, Y]$. In general we have
$$\operatorname{Tr}([X, Y]B) = \operatorname{Tr}(XYB) - \operatorname{Tr}(YXB) = \operatorname{Tr}(YBX) - \operatorname{Tr}(BYX) = \operatorname{Tr}([Y, B]X) = \operatorname{Tr}(X[Y, B]). \qquad(2.1)$$
Since $B \in M$ and $Y \in \mathfrak g$, we have by construction of $M$ that $[Y, B] \in D\mathfrak g$. But then by the assumption of the theorem, $\operatorname{Tr}(AB) = \operatorname{Tr}([X, Y]B) = \operatorname{Tr}(X[Y, B]) = 0$.

With this powerful tool we can prove the promised equivalent conditions for a Lie algebra to be semisimple. One of them involves the so-called Killing form:

Definition 2.21 (Killing Form). By the Killing form of a Lie algebra $\mathfrak g$ over $\mathbb K$ we understand the bilinear form $B : \mathfrak g \times \mathfrak g \to \mathbb K$ given by $B(X, Y) = \operatorname{Tr}(\operatorname{ad}(X)\operatorname{ad}(Y))$.

Proposition 2.22. The Killing form is a symmetric bilinear form satisfying
$$B([X, Y], Z) = B(X, [Y, Z]). \qquad(2.2)$$
Furthermore, if $\varphi$ is any Lie algebra automorphism of $\mathfrak g$ then $B(\varphi(X), \varphi(Y)) = B(X, Y)$.
Proof. $B$ is obviously bilinear, and symmetry is a consequence of the trace property $\operatorname{Tr}(AB) = \operatorname{Tr}(BA)$. Eq. (2.2) follows from (2.1). If $\varphi : \mathfrak g \to \mathfrak g$ is a Lie algebra automorphism, then another way of writing the equation $[\varphi(X), \varphi(Y)] = \varphi([X, Y])$ is $\operatorname{ad}(\varphi(X)) \circ \varphi = \varphi \circ \operatorname{ad}(X)$. Therefore
$$B(\varphi(X), \varphi(Y)) = \operatorname{Tr}(\varphi \circ \operatorname{ad}(X)\operatorname{ad}(Y) \circ \varphi^{-1}) = \operatorname{Tr}(\operatorname{ad}(X)\operatorname{ad}(Y)) = B(X, Y).$$

Calculating the Killing form directly from the definition is immensely complicated. Fortunately, for some of the classical Lie algebras we have a much simpler formula:
$$B(X, Y) = \begin{cases} 2(n+1)\operatorname{Tr}(XY), & X, Y \in \mathfrak{sl}(n+1, \mathbb K) \text{ or } \mathfrak{sp}(2n, \mathbb K),\\ (2n-1)\operatorname{Tr}(XY), & X, Y \in \mathfrak{so}(2n+1, \mathbb K),\\ 2(n-1)\operatorname{Tr}(XY), & X, Y \in \mathfrak{so}(2n, \mathbb K). \end{cases} \qquad(2.3)$$

Lemma 2.23. If $\mathfrak g$ is a Lie algebra with Killing form $B$ and $\mathfrak h \subseteq \mathfrak g$ is an ideal, then $B|_{\mathfrak h \times \mathfrak h}$ is the Killing form of $\mathfrak h$.

Proof. First a general remark: if $\varphi : V \to V$ is a linear map and $W \subseteq V$ is a subspace for which $\operatorname{im}\varphi \subseteq W$, then $\operatorname{Tr}\varphi = \operatorname{Tr}(\varphi|_W)$. Namely, pick a basis $\{e_1, \dots, e_k\}$ for $W$, extend it to a basis $\{e_1, \dots, e_k, \dots, e_n\}$ for $V$, and let $\{\varepsilon_1, \dots, \varepsilon_n\}$ denote the corresponding dual basis. As $\varphi(v) \in W$ we have $\varepsilon_{k+i}(\varphi(v)) = 0$ and hence
$$\operatorname{Tr}\varphi = \sum_{i=1}^n \varepsilon_i(\varphi(e_i)) = \sum_{i=1}^k \varepsilon_i(\varphi(e_i)) = \operatorname{Tr}(\varphi|_W).$$
Now let $X, Y \in \mathfrak h$. As $\mathfrak h$ is an ideal, $\operatorname{ad}(X)\mathfrak g \subseteq \mathfrak h$ and $\operatorname{ad}(Y)\mathfrak g \subseteq \mathfrak h$, which means that the image of $\operatorname{ad}(X)\operatorname{ad}(Y)$ lies inside $\mathfrak h$. It should be obvious that the adjoint representation of $\mathfrak h$ is just $\operatorname{ad}(X)|_{\mathfrak h}$ for $X \in \mathfrak h$. Therefore
$$B_{\mathfrak h}(X, Y) = \operatorname{Tr}(\operatorname{ad}(X)|_{\mathfrak h}\operatorname{ad}(Y)|_{\mathfrak h}) = \operatorname{Tr}((\operatorname{ad}(X)\operatorname{ad}(Y))|_{\mathfrak h}) = B|_{\mathfrak h \times \mathfrak h}(X, Y).$$

Theorem 2.24. If $\mathfrak g$ is a Lie algebra, then the following are equivalent:
1) $\mathfrak g$ is semisimple, i.e. $\operatorname{Rad}\mathfrak g = 0$.
2) $\mathfrak g$ has no nonzero abelian ideals.
3) The Killing form $B$ of $\mathfrak g$ is non-degenerate.
4) $\mathfrak g$ is a direct sum of simple Lie algebras: $\mathfrak g = \mathfrak g_1 \oplus \dots \oplus \mathfrak g_n$.

Proof. We first prove that 1 and 2 are equivalent. If $\mathfrak g$ is semisimple, then $\mathfrak g$ has no nonzero solvable ideals, and since abelian ideals are solvable, no nonzero abelian ideals either. Conversely, if $\operatorname{Rad}\mathfrak g \neq 0$ then, since $\operatorname{Rad}\mathfrak g$ is solvable, there is a smallest $N$ for which $D^N(\operatorname{Rad}\mathfrak g) \neq 0$ and $D^{N+1}(\operatorname{Rad}\mathfrak g) = 0$. Then $D^N(\operatorname{Rad}\mathfrak g)$ is a nonzero abelian ideal. So by contraposition, if no nonzero abelian ideals exist, then $\mathfrak g$ is semisimple.

Now we show that 1 implies 3. We consider the so-called radical of the Killing form $B$, namely the subspace $\mathfrak h := \{X \in \mathfrak g \mid \forall Y \in \mathfrak g : B(X, Y) = 0\}$. $\mathfrak h$ is an ideal, for if $X \in \mathfrak h$ and $Y \in \mathfrak g$, then for all $Z \in \mathfrak g$:
$$B([X, Y], Z) = B(X, [Y, Z]) = 0,$$
i.e. $[X, Y] \in \mathfrak h$. Obviously $B$ is non-degenerate if and only if $\mathfrak h = 0$. Now we assume $\operatorname{Rad}\mathfrak g = 0$ and want to show that $\mathfrak h = 0$. We can do this by showing that $\mathfrak h$ is solvable, for then $\mathfrak h \subseteq \operatorname{Rad}\mathfrak g$. First we use the Cartan Criterion on the Lie algebra $\operatorname{ad}(\mathfrak h)$, showing that this is solvable: by definition of $\mathfrak h$ we have $0 = B(X, Y) = \operatorname{Tr}(\operatorname{ad}(X)\operatorname{ad}(Y))$ for all $X \in \mathfrak h$ and $Y \in \mathfrak g$; in particular it holds for all $X \in D\mathfrak h$. In other words, $\operatorname{Tr}(AB) = 0$ for all $A \in \operatorname{ad}(D\mathfrak h) = D(\operatorname{ad}\mathfrak h)$ and all $B \in \operatorname{ad}\mathfrak h$. Hence the Cartan Criterion tells us that $\operatorname{ad}\mathfrak h$ is solvable, i.e. $0 = D^N(\operatorname{ad}\mathfrak h) = \operatorname{ad}(D^N\mathfrak h)$ for some $N$. This says that $D^N\mathfrak h \subseteq Z(\mathfrak g)$, implying that $D^{N+1}\mathfrak h = 0$. Thus $\mathfrak h$ is solvable and consequently equals 0.

Then we prove 3 implies 2. Assume that $\mathfrak h = 0$, let $\mathfrak k$ be an abelian ideal and let $X \in \mathfrak k$ and $Y \in \mathfrak g$. Exploiting the ideal property of $\mathfrak k$, the composition $\operatorname{ad}(X)\operatorname{ad}(Y)$ maps
$$\mathfrak g \xrightarrow{\operatorname{ad}(Y)} \mathfrak g \xrightarrow{\operatorname{ad}(X)} \mathfrak k \qquad\text{and}\qquad \mathfrak k \xrightarrow{\operatorname{ad}(Y)} \mathfrak k \xrightarrow{\operatorname{ad}(X)} D\mathfrak k = 0,$$
so $(\operatorname{ad}(X)\operatorname{ad}(Y))^2 = 0$, that is, $\operatorname{ad}(X)\operatorname{ad}(Y)$ is nilpotent. Since nilpotent maps have zero trace, we see that $0 = \operatorname{Tr}(\operatorname{ad}(X)\operatorname{ad}(Y)) = B(X, Y)$. This implies $X \in \mathfrak h = 0$, i.e. $\mathfrak k = 0$, and thus the desired conclusion.

We then proceed to show that 1 implies 4. Suppose $\mathfrak g$ is semisimple and let $\mathfrak h \subseteq \mathfrak g$ be any ideal. We consider its orthogonal complement w.r.t. $B$:
$$\mathfrak h^\perp := \{X \in \mathfrak g \mid \forall Y \in \mathfrak h : B(X, Y) = 0\}.$$
This is again an ideal in $\mathfrak g$, for if $X \in \mathfrak h^\perp$ and $Y \in \mathfrak g$, then for all $Z \in \mathfrak h$ we have $[Y, Z] \in \mathfrak h$ and hence $B([X, Y], Z) = B(X, [Y, Z]) = 0$, saying that $[X, Y] \in \mathfrak h^\perp$. To show that we have a decomposition $\mathfrak g = \mathfrak h \oplus \mathfrak h^\perp$ we need to show that the ideal $\mathfrak h \cap \mathfrak h^\perp$ is zero. We can do this by showing that it is solvable, for then semisimplicity forces it to be zero. By the remarks earlier in this proof, solvability of $\mathfrak h \cap \mathfrak h^\perp$ would be a consequence of $\operatorname{ad}(\mathfrak h \cap \mathfrak h^\perp)$ being solvable. To show that $\operatorname{ad}(\mathfrak h \cap \mathfrak h^\perp)$ is solvable we invoke the Cartan Criterion: for $X \in D(\mathfrak h \cap \mathfrak h^\perp) \subseteq \mathfrak h \cap \mathfrak h^\perp$ and $Y \in \mathfrak h \cap \mathfrak h^\perp$ we have $\operatorname{Tr}(\operatorname{ad}(X)\operatorname{ad}(Y)) = B(X, Y) = 0$ since, in particular, $X \in \mathfrak h$ and $Y \in \mathfrak h^\perp$. Thus the Cartan Criterion renders $\operatorname{ad}(\mathfrak h \cap \mathfrak h^\perp)$ solvable, implying $\mathfrak h \cap \mathfrak h^\perp = 0$. Hence $\mathfrak g = \mathfrak h \oplus \mathfrak h^\perp$.

After these preliminary remarks we proceed via induction over the dimension of $\mathfrak g$. If $\dim\mathfrak g = 2$, then $\mathfrak g$ is simple, for any nontrivial ideal in $\mathfrak g$ would have to be 1-dimensional, hence abelian, and such do not exist. Assume now that $\dim\mathfrak g = n$ and that the result is true for semisimple Lie algebras of dimension strictly less than $n$. Suppose that $\mathfrak g_1$ is a minimal nonzero ideal in $\mathfrak g$. Then $\mathfrak g_1$ is simple, since $\dim\mathfrak g_1 \geq 2$ and since any nontrivial ideal in $\mathfrak g_1$ would be an ideal in $\mathfrak g$ properly contained in $\mathfrak g_1$, contradicting minimality. Then we have $\mathfrak g = \mathfrak g_1 \oplus \mathfrak g_1^\perp$ with $\mathfrak g_1^\perp$ semisimple, for if $\mathfrak k$ is any abelian ideal in $\mathfrak g_1^\perp$ then it is an abelian ideal in $\mathfrak g$ (indeed $[\mathfrak g_1, \mathfrak k] \subseteq \mathfrak g_1 \cap \mathfrak g_1^\perp = 0$), and these do not exist. Then by the induction hypothesis $\mathfrak g_1^\perp = \mathfrak g_2 \oplus \dots \oplus \mathfrak g_n$, a sum of simple Lie algebras, hence $\mathfrak g = \mathfrak g_1 \oplus \mathfrak g_2 \oplus \dots \oplus \mathfrak g_n$, a sum of simple algebras.

Finally we show that 4 implies 2. So consider $\mathfrak g := \mathfrak g_1 \oplus \dots \oplus \mathfrak g_n$ and let $\mathfrak h \subseteq \mathfrak g$ be an abelian ideal. It is not hard to verify that $\mathfrak h_i := \mathfrak h \cap \mathfrak g_i$ is an abelian ideal in $\mathfrak g_i$, thus $\mathfrak h_i = \mathfrak g_i$ or $\mathfrak h_i = 0$. As $\mathfrak h_i$ is abelian and $\mathfrak g_i$ is not, we can rule out the first possibility, i.e. $\mathfrak h_i = 0$. But then $[\mathfrak g_i, \mathfrak h] \subseteq \mathfrak h \cap \mathfrak g_i = 0$ for every $i$, so $\mathfrak h$ is contained in the center of $\mathfrak g$, which is $Z(\mathfrak g_1) \oplus \dots \oplus Z(\mathfrak g_n) = 0$, as simple Lie algebras have trivial centers. Hence $\mathfrak h = 0$.

During the proof we saw that any ideal in a semisimple Lie algebra has a complementary ideal. This is important enough to be stated as a separate result:

Proposition 2.25. Let $\mathfrak g$ be a semisimple Lie algebra and $\mathfrak h \subseteq \mathfrak g$ an ideal. Then $\mathfrak h^\perp := \{X \in \mathfrak g \mid \forall Y \in \mathfrak h : B(X, Y) = 0\}$ is an ideal in $\mathfrak g$ and $\mathfrak g = \mathfrak h \oplus \mathfrak h^\perp$.
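Criterion 3) of Theorem 2.24 invites a computation. The following sketch builds the Killing form of $\mathfrak{so}(3)$ in the standard basis $L_1, L_2, L_3$ (with $[L_i, L_j] = \varepsilon_{ijk}L_k$), confirms that it is non-degenerate (so $\mathfrak{so}(3)$ is semisimple, anticipating the discussion below), and checks the case $2n+1 = 3$ of formula (2.3); the basis and the trace pairing used to extract coefficients are choices of this sketch, not of the text:

    import numpy as np

    # Basis of so(3) (skew-symmetric 3x3 matrices), [L_i, L_j] = eps_ijk L_k.
    L = [np.array([[0., 0., 0.], [0., 0., -1.], [0., 1., 0.]]),
         np.array([[0., 0., 1.], [0., 0., 0.], [-1., 0., 0.]]),
         np.array([[0., -1., 0.], [1., 0., 0.], [0., 0., 0.]])]

    def ad(X):
        """3x3 matrix of ad(X) on so(3) in the basis L; coefficients are
        extracted with the pairing <A, B> = Tr(A^t B)/2, for which L is
        an orthonormal basis."""
        return np.array([[np.trace(L[i].T @ (X @ L[j] - L[j] @ X)) / 2
                          for j in range(3)] for i in range(3)])

    # Killing form matrix B_ij = Tr(ad(L_i) ad(L_j)).
    B = np.array([[np.trace(ad(L[i]) @ ad(L[j])) for j in range(3)]
                  for i in range(3)])
    print(B)                          # -2 * identity
    print(np.linalg.det(B) != 0)      # True: non-degenerate, so semisimple
    # Formula (2.3) with 2n+1 = 3: B(X, Y) = (2n-1) Tr(XY) = Tr(XY).
    print(np.allclose(B, [[np.trace(L[i] @ L[j]) for j in range(3)]
                          for i in range(3)]))   # True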
Another very important concept in the discussion to follow is that of complexification.

Definition 2.26 (Complexification). Let $V$ be a real vector space. By the complexification $V_{\mathbb C}$ of the vector space $V$ we understand $V_{\mathbb C} := V \oplus iV$, which equipped with the scalar multiplication
$$(a + ib)(v_1 + iv_2) = (av_1 - bv_2) + i(av_2 + bv_1)$$
becomes a complex vector space. If $\mathfrak g$ is a real Lie algebra, the complexification $\mathfrak g_{\mathbb C}$ of $\mathfrak g$ is the vector space $\mathfrak g \oplus i\mathfrak g$ equipped with the bracket
$$[X_1 + iX_2, Y_1 + iY_2] = ([X_1, Y_1] - [X_2, Y_2]) + i([X_1, Y_2] + [X_2, Y_1])$$
(note that this is not the usual direct sum bracket!). It is easily checked that $\mathfrak g_{\mathbb C}$ is a complex Lie algebra.

Other presentations of this subject define the complexification of $\mathfrak g$ by $\mathfrak g_{\mathbb C} = \mathfrak g \otimes_{\mathbb R} \mathbb C$, where $\mathbb C$ is considered a 2-dimensional real vector space. By writing $\mathbb C = \mathbb R \oplus i\mathbb R$ and using distributivity of the tensor product, this definition is seen to be equivalent to ours.

Example 2.27. The classical real Lie algebras mentioned earlier have the following complexifications:
$$\mathfrak{gl}(n, \mathbb R)_{\mathbb C} \cong \mathfrak{gl}(n, \mathbb C) \qquad\qquad \mathfrak{sl}(n, \mathbb R)_{\mathbb C} \cong \mathfrak{sl}(n, \mathbb C)$$
$$\mathfrak{so}(n)_{\mathbb C} \cong \mathfrak{so}(n, \mathbb C) \qquad\qquad \mathfrak{so}(m, n)_{\mathbb C} \cong \mathfrak{so}(m+n, \mathbb C)$$
$$\mathfrak u(n)_{\mathbb C} \cong \mathfrak{gl}(n, \mathbb C) \qquad\qquad \mathfrak u(m, n)_{\mathbb C} \cong \mathfrak{gl}(m+n, \mathbb C)$$
$$\mathfrak{su}(n)_{\mathbb C} \cong \mathfrak{sl}(n, \mathbb C) \qquad\qquad \mathfrak{su}(m, n)_{\mathbb C} \cong \mathfrak{sl}(m+n, \mathbb C).$$
Let's prove a few of them. For the first one, pick an element $X$ of $\mathfrak{gl}(n, \mathbb C)$ and split it into real and imaginary parts, $X = X_1 + iX_2$. It is an easy exercise to verify that the map $X \mapsto X_1 + iX_2$ is a Lie algebra isomorphism $\mathfrak{gl}(n, \mathbb C) \to \mathfrak{gl}(n, \mathbb R)_{\mathbb C}$. To prove $\mathfrak u(n)_{\mathbb C} \cong \mathfrak{gl}(n, \mathbb C)$, let $X \in \mathfrak{gl}(n, \mathbb C)$ and write it as
$$X = \frac{X - X^*}{2} + i\,\frac{X + X^*}{2i}.$$
It is not hard to see that both $\frac{1}{2}(X - X^*)$ and $\frac{1}{2i}(X + X^*)$ are skew-adjoint, i.e. elements of $\mathfrak u(n)$. Again it is a trivial calculation to show that
$$X \mapsto \frac{X - X^*}{2} + i\,\frac{X + X^*}{2i}$$
is a Lie algebra isomorphism $\mathfrak{gl}(n, \mathbb C) \to \mathfrak u(n)_{\mathbb C}$. The other identities are verified in a similar fashion.
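The isomorphism $\mathfrak u(n)_{\mathbb C} \cong \mathfrak{gl}(n, \mathbb C)$ amounts to the decomposition displayed above, which is easily tested numerically. A minimal sketch (random test matrix, arbitrary seed):

    import numpy as np

    rng = np.random.default_rng(2)
    n = 3
    X = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

    # Decompose X in gl(n, C) as A + iB with A, B skew-hermitian (Example 2.27).
    A = (X - X.conj().T) / 2
    B = (X + X.conj().T) / (2j)

    print(np.allclose(A + A.conj().T, 0))   # True: A lies in u(n)
    print(np.allclose(B + B.conj().T, 0))   # True: B lies in u(n)
    print(np.allclose(A + 1j * B, X))       # True: X = A + iB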
Proposition 2.28. A real Lie algebra $\mathfrak g$ is semisimple if and only if $\mathfrak g_{\mathbb C}$ is semisimple.

Proof. Let $B$ denote the Killing form of $\mathfrak g$ and $B_{\mathbb C}$ the Killing form of $\mathfrak g_{\mathbb C}$. Our first task is to relate them. If $\{X_1, \dots, X_n\}$ is a basis for $\mathfrak g$ as an $\mathbb R$-vector space, then $\{X_1, \dots, X_n\}$ is also a basis for $\mathfrak g_{\mathbb C}$ as a $\mathbb C$-vector space. Therefore, for $X, Y \in \mathfrak g$ the linear map $\operatorname{ad}(X)\operatorname{ad}(Y)$ will have the same matrix whether it is considered a linear map on $\mathfrak g$ or on $\mathfrak g_{\mathbb C}$. In particular the traces will be equal, which amounts to saying that $B(X, Y) = B_{\mathbb C}(X, Y)$. In other words,
$$B_{\mathbb C}|_{\mathfrak g \times \mathfrak g} = B. \qquad(2.4)$$
Now assume $\mathfrak g$ to be semisimple or, equivalently, $B$ to be non-degenerate. Then $B(X, Y) = 0$ for all $Y \in \mathfrak g$ implies $X = 0$. To show that $B_{\mathbb C}$ is non-degenerate, let $X \in \mathfrak g_{\mathbb C}$ satisfy $B_{\mathbb C}(X, Y) = 0$ for all $Y \in \mathfrak g_{\mathbb C}$; then this holds in particular for all $Y \in \mathfrak g$. Write $X = A_1 + iA_2$ where $A_1, A_2 \in \mathfrak g$. Then by (2.4)
$$0 = B_{\mathbb C}(A_1, Y) + iB_{\mathbb C}(A_2, Y) = B(A_1, Y) + iB(A_2, Y)$$
for all $Y \in \mathfrak g$. Hence by non-degeneracy of $B$ we have $A_1 = A_2 = 0$, i.e. $X = 0$. Thus $B_{\mathbb C}$ is non-degenerate.

Now assume $B_{\mathbb C}$ to be non-degenerate and suppose $B(X, Y) = 0$ for all $Y \in \mathfrak g$. This holds in particular for the basis elements: $B(X, X_k) = 0$ for $k = 1, \dots, n$. By (2.4) we also have $B_{\mathbb C}(X, X_k) = 0$, and since $\{X_1, \dots, X_n\}$ is also a basis for $\mathfrak g_{\mathbb C}$, we get $B_{\mathbb C}(X, Y) = 0$ for all $Y \in \mathfrak g_{\mathbb C}$, and thus by non-degeneracy of $B_{\mathbb C}$ that $X = 0$, i.e. $B$ is non-degenerate.

Up till now we have talked a lot about semisimple Lie algebras and their amazing properties, but we have not yet encountered a single example of a semisimple Lie algebra. The rest of this section tends to remedy that. The first thing we do is to introduce a class of Lie algebras which contains the semisimple ones:

Definition 2.29 (Reductive Lie Algebra). A Lie algebra $\mathfrak g$ is called reductive if for each ideal $\mathfrak a \subseteq \mathfrak g$ there is an ideal $\mathfrak b \subseteq \mathfrak g$ such that $\mathfrak g = \mathfrak a \oplus \mathfrak b$.

From Proposition 2.25 it follows that semisimple Lie algebras are reductive. So schematically we have
$$\text{simple} \implies \text{semisimple} \implies \text{reductive}.$$
Note how these classes of Lie algebras are somehow opposite to the classes of abelian, solvable or nilpotent algebras. The next proposition characterizes the semisimple Lie algebras among the reductive ones.

Proposition 2.30. If $\mathfrak g$ is reductive, then $\mathfrak g = D\mathfrak g \oplus Z(\mathfrak g)$ and $D\mathfrak g$ is semisimple. Thus a reductive Lie algebra is semisimple if and only if its center is trivial.

Proof. Let $\Sigma$ be the set of ideals of the form $\mathfrak a_1 \oplus \dots \oplus \mathfrak a_k$ where $\mathfrak a_1, \dots, \mathfrak a_k$ are indecomposable ideals (i.e. they contain only trivial ideals). Let $\mathfrak a \in \Sigma$ be an element of maximal dimension. As $\mathfrak g$ is reductive, there exists an ideal $\mathfrak b$ such that $\mathfrak g = \mathfrak a \oplus \mathfrak b$. We want to show that $\mathfrak b = 0$ (and hence $\mathfrak g = \mathfrak a$), so assume for contradiction that $\mathfrak b \neq 0$ and let $\mathfrak b' \subseteq \mathfrak b$ be a nonzero indecomposable ideal of smallest dimension (which always exists, for if $\mathfrak b$ contains no proper nonzero ideals, then $\mathfrak b$ is itself indecomposable). But then $\mathfrak a \oplus \mathfrak b' \in \Sigma$, contradicting maximality of $\mathfrak a$. Therefore $\mathfrak g = \mathfrak a \in \Sigma$. Now let's write
$$\mathfrak g = \underbrace{\mathfrak a_1 \oplus \dots \oplus \mathfrak a_j}_{\mathfrak g_1} \oplus \underbrace{\mathfrak a_{j+1} \oplus \dots \oplus \mathfrak a_k}_{\mathfrak g_2}$$
where $\mathfrak a_1, \dots, \mathfrak a_j$ are 1-dimensional and $\mathfrak a_{j+1}, \dots, \mathfrak a_k$ are of higher dimension and thus simple. Then $\mathfrak g_1$ is abelian and $\mathfrak g_2$ is semisimple (by Theorem 2.24), and by definition of the direct sum bracket we have
$$D\mathfrak g = D(\mathfrak a_1 \oplus \dots \oplus \mathfrak a_k) = D\mathfrak a_1 \oplus \dots \oplus D\mathfrak a_k = D\mathfrak a_{j+1} \oplus \dots \oplus D\mathfrak a_k = \mathfrak g_2.$$
This shows that $D\mathfrak g$ is semisimple. We now only have to justify that $\mathfrak g_1$ equals the center. We have $\mathfrak g_1 \subseteq Z(\mathfrak g)$, for in the decomposition $\mathfrak g = \mathfrak g_1 \oplus \mathfrak g_2$, with $X \in \mathfrak g_1$,
$$[(X, 0), (Y, Z)] = ([X, Y], [0, Z]) = (0, 0),$$
since $\mathfrak g_1$ is abelian. Conversely, let $X \in Z(\mathfrak g)$ and decompose it as $X = X_1 + \dots + X_k$ according to the decomposition of $\mathfrak g$ into indecomposable ideals. Then $X_i \in Z(\mathfrak a_i)$, which means that $X_i = 0$ for $i > j$ (the $\mathfrak a_i$ with $i > j$ are simple and have trivial centers), and hence $X \in \mathfrak g_1$.

The next result will help us mass-produce examples of reductive Lie algebras.

Proposition 2.31. Let $\mathfrak g$ be a Lie subalgebra of $\mathfrak{gl}(n, \mathbb R)$ or $\mathfrak{gl}(n, \mathbb C)$. If $\mathfrak g$ has the property that $X \in \mathfrak g$ implies $X^* \in \mathfrak g$ (where $X^*$ is the conjugate transpose of $X$), then $\mathfrak g$ is reductive.

Proof. Define a real inner product on $\mathfrak g$ by $\langle X, Y\rangle = \operatorname{Re}\operatorname{Tr}(XY^*)$. This is a genuine inner product: it is symmetric, since
$$\langle Y, X\rangle = \operatorname{Re}\operatorname{Tr}(YX^*) = \operatorname{Re}\operatorname{Tr}((XY^*)^*) = \operatorname{Re}\overline{\operatorname{Tr}(XY^*)} = \operatorname{Re}\operatorname{Tr}(XY^*) = \langle X, Y\rangle,$$
and it is positive definite, for $\operatorname{Tr}(XX^*)$ is nothing but the sum of the squares of the absolute values of the entries of $X$, which is 0 if and only if $X = 0$. Assuming $\mathfrak a$ to be an ideal in $\mathfrak g$, let $\mathfrak a^\perp$ be the complementary subspace w.r.t. the inner product just defined. Then as vector spaces it holds that $\mathfrak g = \mathfrak a \oplus \mathfrak a^\perp$. For this to be a Lie algebra direct sum we need $\mathfrak a^\perp$ to be an ideal. Let $X \in \mathfrak a^\perp$ and $Y \in \mathfrak g$. Then for all $Z \in \mathfrak a$
$$\langle [X, Y], Z\rangle = \operatorname{Re}\operatorname{Tr}(XYZ^* - YXZ^*) = \operatorname{Re}\operatorname{Tr}(X(YZ^*) - X(Z^*Y)) = \operatorname{Re}\operatorname{Tr}(X((ZY^*)^* - (Y^*Z)^*)) = \langle X, [Z, Y^*]\rangle,$$
which is 0 as $X \in \mathfrak a^\perp$ and $[Z, Y^*] \in \mathfrak a$, since $Y^* \in \mathfrak g$ and $\mathfrak a$ is an ideal. Thus $\mathfrak a^\perp$ is an ideal.

Obviously $\mathfrak{gl}(n, \mathbb R)$ and $\mathfrak{gl}(n, \mathbb C)$ are closed under conjugate transposition and are therefore reductive. They are not semisimple, as their centers contain the scalar matrices $\operatorname{diag}(a, \dots, a)$ for $a \in \mathbb R$ or $a \in \mathbb C$ respectively, violating Proposition 2.11.

The Lie algebras $\mathfrak{so}(n)$ are semisimple for $n \geq 3$. Recall that $\mathfrak{so}(n)$ is the set of real $n \times n$ matrices $X$ for which $X + X^t = 0$. From the definition it is clear that if $X \in \mathfrak{so}(n)$ then also $X^* = X^t \in \mathfrak{so}(n)$, hence $\mathfrak{so}(n)$ is reductive for all $n$. $\mathfrak{so}(2)$ is a 1-dimensional (hence abelian) Lie algebra and thus is not semisimple. Let us show that $\mathfrak{so}(3)$ is semisimple. Thanks to Proposition 2.30 this boils down to verifying that its center is trivial. So assume
$$X = \begin{pmatrix} 0 & a & b\\ -a & 0 & c\\ -b & -c & 0 \end{pmatrix}$$
to be an element of the center of $\mathfrak{so}(3)$. In particular it has to commute with the two matrices
$$A_1 = \begin{pmatrix} 0 & 1 & 0\\ -1 & 0 & 0\\ 0 & 0 & 0 \end{pmatrix} \qquad\text{and}\qquad A_2 = \begin{pmatrix} 0 & 0 & 1\\ 0 & 0 & 0\\ -1 & 0 & 0 \end{pmatrix}.$$
We have
$$A_1X = \begin{pmatrix} -a & 0 & c\\ 0 & -a & -b\\ 0 & 0 & 0 \end{pmatrix} \qquad\text{and}\qquad XA_1 = \begin{pmatrix} -a & 0 & 0\\ 0 & -a & 0\\ c & -b & 0 \end{pmatrix}.$$
As these two matrices should be equal, we immediately get that $b = c = 0$. Furthermore
$$A_2X = \begin{pmatrix} 0 & 0 & 0\\ 0 & 0 & 0\\ 0 & -a & 0 \end{pmatrix} \qquad\text{and}\qquad XA_2 = \begin{pmatrix} 0 & 0 & 0\\ 0 & 0 & -a\\ 0 & 0 & 0 \end{pmatrix},$$
and we get $a = 0$. Thus $X = 0$, and the center is trivial. Generalizing this to higher dimensions, one can show that $\mathfrak{so}(n)$ is semisimple for $n \geq 3$. Now since $\mathfrak{so}(n, \mathbb C) \cong \mathfrak{so}(n)_{\mathbb C}$ (cf. Example 2.27), Proposition 2.28 says that also $\mathfrak{so}(n, \mathbb C)$ is semisimple for $n \geq 3$.

The Lie algebra $\mathfrak u(n)$ is reductive: it consists of the complex $n \times n$ matrices satisfying $X + X^* = 0$, and again it is clear that $\mathfrak u(n)$ is closed under conjugate transposition and hence reductive. It is not semisimple, since the matrices $\operatorname{diag}(ia, \dots, ia)$ for $a \in \mathbb R$ are all in the center. However, the subalgebra $\mathfrak{su}(n)$ is semisimple for $n \geq 2$ ($\mathfrak{su}(1)$ is zero-dimensional), as can be seen by an argument analogous to the one given above. Since its complexification is $\mathfrak{sl}(n, \mathbb C)$, this is also semisimple for $n \geq 2$. But $\mathfrak{sl}(n, \mathbb C)$ is also the complexification of $\mathfrak{sl}(n, \mathbb R)$, which is therefore also semisimple for $n \geq 2$. By the same argument also $\mathfrak{so}(m, n)$ for $m + n \geq 3$ and $\mathfrak{su}(m, n)$ for $m + n \geq 2$ are semisimple, since their complexifications are. Wrapping up, the following Lie algebras are semisimple:
$$\mathfrak{sl}(n, \mathbb R),\ n \geq 2 \qquad \mathfrak{sl}(n, \mathbb C),\ n \geq 2 \qquad \mathfrak{so}(n),\ n \geq 3 \qquad \mathfrak{so}(m, n),\ m + n \geq 3$$
$$\mathfrak{so}(n, \mathbb C),\ n \geq 3 \qquad \mathfrak{su}(n),\ n \geq 2 \qquad \mathfrak{su}(m, n),\ m + n \geq 2.$$

2.3 The Universal Enveloping Algebra

For a finite-dimensional vector space $V$ we have the tensor algebra $T(V)$ defined by
$$T(V) = \bigoplus_{n=0}^\infty V^{\otimes n}.$$
From this one can form various quotients. One of the more important ones is the symmetric algebra $S(V)$, where we mod out by the ideal $I$ generated by elements of the form $X \otimes Y - Y \otimes X$. The resulting algebra is commutative by construction. If $\{X_1, \dots, X_n\}$ is a basis for $V$, then one can show that the set $\{X_1^{i_1}\cdots X_n^{i_n} \mid i_1, \dots, i_n \in \mathbb N_0\}$ (we define $X^0 = 1$) is a basis for $S(V)$, which is thus (unlike the exterior algebra) infinite-dimensional. If $I = (i_1, \dots, i_k)$ we will use the shorthand notation $X_I$ for $X_{i_1}\cdots X_{i_k}$. We define the length of $I$ to be $|I| = k$ and write $j \leq I$ if $j \leq i_1, \dots, i_k$.

Definition 2.32 (Universal Enveloping Algebra). Let $\mathfrak g$ be a Lie algebra. By a universal enveloping algebra of $\mathfrak g$ we understand a pair $(U, i)$ of an associative unital algebra $U$ and a linear map $i : \mathfrak g \to U$ with
$$i([X, Y]) = i(X)i(Y) - i(Y)i(X),$$
satisfying that for any pair $(A, \varphi)$ of an associative unital algebra $A$ and a linear map $\varphi : \mathfrak g \to A$ with $\varphi([X, Y]) = \varphi(X)\varphi(Y) - \varphi(Y)\varphi(X)$ there is a unique algebra homomorphism $\widetilde\varphi : U \to A$ with $\varphi = \widetilde\varphi \circ i$.
In other words, any linear map $\varphi : \mathfrak g \to A$ satisfying the above condition factorizes through $U$, rendering the diagram
$$\mathfrak g \xrightarrow{\ i\ } U \xrightarrow{\ \widetilde\varphi\ } A, \qquad \varphi = \widetilde\varphi \circ i$$
commutative. As for the symmetric algebra, multiplication in a universal enveloping algebra is written by juxtaposition.

Proposition 2.33. Let $\mathfrak g$ be a Lie algebra and $J$ the two-sided ideal in $T(\mathfrak g)$ generated by elements of the form $X \otimes Y - Y \otimes X - [X, Y]$. If $i$ denotes the restriction to $\mathfrak g$ of the canonical map $\kappa : T(\mathfrak g) \to T(\mathfrak g)/J$, then $(T(\mathfrak g)/J, i)$ is a universal enveloping algebra for $\mathfrak g$. It is unique up to algebra isomorphism.

Proof. Uniqueness first. Assume that $(U, i)$ and $(\widetilde U, \tilde\imath)$ are universal enveloping algebras for $\mathfrak g$. Since $\tilde\imath : \mathfrak g \to \widetilde U$ is a linear map satisfying the bracket condition, the universal property of $(U, i)$ yields an algebra homomorphism $\varphi : U \to \widetilde U$ with $\tilde\imath = \varphi \circ i$. Likewise, for $i : \mathfrak g \to U$ the universal property of $(\widetilde U, \tilde\imath)$ yields an algebra homomorphism $\psi : \widetilde U \to U$ with $i = \psi \circ \tilde\imath$. Composing these gives $i = \psi \circ \varphi \circ i$, i.e. $\psi \circ \varphi$ factorizes $i$ through $U$ itself. But obviously $\operatorname{id}_U$ does the same, and by the uniqueness in the universal property, $\psi \circ \varphi = \operatorname{id}_U$. Likewise one shows that $\varphi \circ \psi = \operatorname{id}_{\widetilde U}$; thus $U$ and $\widetilde U$ are isomorphic.

To show existence we just need to verify that $(T(\mathfrak g)/J, i)$ really is a universal enveloping algebra. Well, first of all
$$i([X, Y]) = \kappa([X, Y]) = [X, Y] + J = [X, Y] + (X \otimes Y - Y \otimes X - [X, Y]) + J = X \otimes Y - Y \otimes X + J$$
$$= (X + J)(Y + J) - (Y + J)(X + J) = \kappa(X)\kappa(Y) - \kappa(Y)\kappa(X) = i(X)i(Y) - i(Y)i(X).$$
Now suppose that $\varphi : \mathfrak g \to A$ is a linear map satisfying $\varphi([X, Y]) = \varphi(X)\varphi(Y) - \varphi(Y)\varphi(X)$. Since $\varphi$ is linear it factorizes uniquely through $T(\mathfrak g)$, yielding an algebra homomorphism $\varphi' : T(\mathfrak g) \to A$. On the generators of $J$ we see that
$$\varphi'(X \otimes Y - Y \otimes X - [X, Y]) = \varphi'(X \otimes Y) - \varphi'(Y \otimes X) - \varphi'([X, Y]) = \varphi(X)\varphi(Y) - \varphi(Y)\varphi(X) - \varphi([X, Y]) = 0.$$
Thus, vanishing on $J$, $\varphi'$ factorizes uniquely through $T(\mathfrak g)/J$ by an algebra homomorphism $\widetilde\varphi : T(\mathfrak g)/J \to A$, i.e. $\varphi = \widetilde\varphi \circ i$. This proves existence.
If $\mathfrak g$ is abelian, then the ideal $J$ is generated by the elements $X \otimes Y - Y \otimes X$, i.e. $U(\mathfrak g)$ is just the symmetric algebra $S(\mathfrak g)$.

For the tensor algebra we have a filtration if we define $T^m(\mathfrak g) := \bigoplus_{k=0}^m \mathfrak g^{\otimes k}$. We can carry this over to $U(\mathfrak g)$ if we define $U^m(\mathfrak g) := \kappa(T^m(\mathfrak g))$. We see that
$$U^m(\mathfrak g)U^n(\mathfrak g) = \kappa(T^m(\mathfrak g))\kappa(T^n(\mathfrak g)) = \kappa(T^m(\mathfrak g)T^n(\mathfrak g)) \subseteq \kappa(T^{m+n}(\mathfrak g)) = U^{m+n}(\mathfrak g).$$
Since $U^{m-1} \subseteq U^m$, it makes sense to define the vector spaces $G^m(\mathfrak g) := U^m(\mathfrak g)/U^{m-1}(\mathfrak g)$ with canonical maps $q_m : U^m(\mathfrak g) \to G^m(\mathfrak g)$, and
$$G(\mathfrak g) := \bigoplus_{m=0}^\infty G^m(\mathfrak g).$$
To shorten the notation we will just write $T^m$, $U^m$, $G^m$ and $G$ when no confusion is possible. The product in $U(\mathfrak g)$ defines a graded algebra structure on $G$: define a product map $G^m \times G^n \to G^{m+n}$ by $q_m(v)q_n(w) = q_{m+n}(vw)$, where $v \in U^m$ and $w \in U^n$ are representatives of the elements in $G^m$ and $G^n$. For this to make sense we used that $U^mU^n \subseteq U^{m+n}$, as shown above. A simple argument shows that this product is well-defined (i.e. independent of the choice of representatives). Thus $G$ becomes a graded algebra. Now the composition
$$\mathfrak g^{\otimes m} \xrightarrow{\ \kappa\ } U^m \xrightarrow{\ q_m\ } G^m$$
gives a linear map $\varphi_m : \mathfrak g^{\otimes m} \to G^m$, which is surjective as both $\kappa$ and $q_m$ are surjective. Then also the linear map $\varphi = \bigoplus_m \varphi_m : T(\mathfrak g) \to G$ is surjective.

Lemma 2.34. The map $\varphi : T(\mathfrak g) \to G$ is a surjective algebra homomorphism that vanishes on $I$. Thus it induces a surjective map $\Phi : S(\mathfrak g) \to G$.

Proof. Let $x \in \mathfrak g^{\otimes m}$ and $y \in \mathfrak g^{\otimes n}$. Then
$$\varphi(x \otimes y) = q_{m+n}(\kappa(x \otimes y)) = q_{m+n}(\kappa(x)\kappa(y)) = q_m(\kappa(x))q_n(\kappa(y)) = \varphi(x)\varphi(y).$$
Hence $\varphi$ is an algebra homomorphism. Now consider a generator $X \otimes Y - Y \otimes X$ of the ideal $I$. We have $X \otimes Y - Y \otimes X \in \mathfrak g^{\otimes 2}$ and therefore $\kappa(X \otimes Y - Y \otimes X) \in U^2$, hence $\varphi(X \otimes Y - Y \otimes X) = q_2(\kappa(X \otimes Y - Y \otimes X))$. But by definition of $\kappa$ we have $\kappa(X \otimes Y - Y \otimes X) = \kappa([X, Y]) \in U^1$, and thus $q_2(\kappa([X, Y])) = 0$. Hence $\varphi$ vanishes on $I$.

This induced homomorphism is exceedingly interesting. The rest of this section is devoted to proving the following theorem and some of its consequences.

Theorem 2.35 (Poincaré-Birkhoff-Witt). The induced map $\Phi : S(\mathfrak g) \to G$ is an isomorphism of algebras.
For brevity we will refer to it as the PBW-Theorem. We split up the proof into a series of lemmas. For the first one, let $\{X_1, \dots, X_n\}$ denote a basis for $\mathfrak g$, let $\{X_I \mid I \text{ increasing}\}$ be the associated basis for the symmetric algebra, and recall the filtration $S_m(\mathfrak g)$, the subspace of elements of degree at most $m$.

Lemma 2.36. For each $m \in \mathbb N_0$ there is a unique linear map $f_m : \mathfrak g \otimes S_m(\mathfrak g) \to S(\mathfrak g)$ satisfying
1) $f_m(X_j \otimes X_I) = X_jX_I$ for $j \leq I$ and $X_I \in S_m$ (meaning $|I| \leq m$),
2) $f_m(X_j \otimes X_I) - X_jX_I \in S_k$ when $X_I \in S_k$ and $k \leq m$,
3) $f_m(X_i \otimes f_m(X_j \otimes X_I)) = f_m(X_j \otimes f_m(X_i \otimes X_I)) + f_m([X_i, X_j] \otimes X_I)$ for $X_I \in S_{m-1}$,
and $f_m$ restricted to $\mathfrak g \otimes S_{m-1}$ equals $f_{m-1}$.

Proof. For the time being we assume that we have shown uniqueness. Since $S_{m-2} \subseteq S_{m-1} \subseteq S_m$, and since $f_m$ satisfies 1) and 2) for $X_I \in S_m$, the restriction of $f_m$ to $\mathfrak g \otimes S_{m-1}$ clearly satisfies 1) and 2) for $X_I \in S_{m-1}$. Likewise $f_m$ satisfies 3) for $X_I \in S_{m-2}$. By uniqueness this restriction must equal $f_{m-1}$.

To show existence and uniqueness we use induction over $m$. For $m = 0$, $S_0$ is spanned by 1, so we only need to define $f_0$ on $X_i \otimes 1$. If 1) is to be fulfilled we can only have $f_0(X_i \otimes 1) = X_i$. Obviously 2) is also satisfied, and 3) is an empty statement as $S_{-1} = \{0\}$. Thus $f_0$ exists and is unique.

Now for the induction step we assume that we have a unique map $f_{m-1}$ satisfying 1)-3). It is enough to define $f_m$ on elements of the form $X_j \otimes X_I$ with $|I| = m$ (and $I$ increasing), since the remarks at the beginning and uniqueness of $f_{m-1}$ imply that $f_m$ restricted to $\mathfrak g \otimes S_{m-1}$ is $f_{m-1}$. If $j \leq I$ we must have, by 1), that $f_m(X_j \otimes X_I) = X_jX_I$. If we don't have $j \leq I$, then we write $I = (i, J)$ where $i \leq J$, $i < j$ and $|J| = m - 1$. We therefore have $X_I = X_iX_J$, and as $i \leq J$ we have $X_I = X_iX_J = f_m(X_i \otimes X_J)$. Now we can exploit 3) to define $f_m$ on $X_j \otimes X_I$ by
$$f_m(X_j \otimes X_I) = f_m(X_j \otimes f_m(X_i \otimes X_J)) = f_m(X_i \otimes f_m(X_j \otimes X_J)) + f_m([X_j, X_i] \otimes X_J). \qquad(2.5)$$
One question arises: are the terms on the right-hand side already defined? Well, in the first term we have from 2) that $f_m(X_j \otimes X_J) = f_{m-1}(X_j \otimes X_J) = X_jX_J + y$ where $y \in S_{m-1}$. Thus the first term in (2.5) becomes $f_m(X_i \otimes X_jX_J) + f_m(X_i \otimes y)$, which is already defined since $i \leq (j, J)$. Also the second term in (2.5) is well-defined, as $X_J \in S_{m-1}$. Thus we have in a unique way defined a linear map $f_m$ which clearly satisfies 1) and 2).

It is a bit more problematic to show that it satisfies 3). If $j < i$ and $j \leq I$, then $f_m$ was defined through 3), so it obviously holds in this case. If $i < j$ and $i \leq I$ we get it from the previous case by exchanging $X_i$ with $X_j$ and using that the Lie bracket is anti-commutative. If $i = j$, then 3) is true as the first two terms are equal and the last is 0. Finally we need to check 3) in the situation where neither $i \leq I$ nor $j \leq I$. Write $I = (i_0, J)$, whence $i_0 < i$ and $i_0 < j$. As 3) is valid for $m - 1$ we get (after some boring calculations)
$$f_m(X_j \otimes X_I) = f_m(X_j \otimes f_m(X_{i_0} \otimes X_J)) = f_m(X_{i_0} \otimes f_m(X_j \otimes X_J)) + f_m([X_j, X_{i_0}] \otimes X_J)$$
since $|J| = m - 2$. Furthermore we have, by 2), that
$$f_m(X_j \otimes X_J) = X_jX_J + w \qquad(2.6)$$
where $w \in S_{m-2}$. Since $i_0 \leq J$ and $i_0 \leq j$, we may use 3) on the expression $f_m(X_i \otimes f_m(X_{i_0} \otimes X_jX_J))$, and $w$ can be written as a linear combination of $X_K$'s with $|K| = m - 2$, so by linearity we may use 3) on $f_m(X_i \otimes f_m(X_{i_0} \otimes w))$ as well. By (2.6) we can thus use 3) a couple of times on $f_m(X_i \otimes f_m(X_j \otimes f_m(X_{i_0} \otimes X_J)))$ and get
$$f_m(X_i \otimes f_m(X_j \otimes X_I)) = f_m(X_i \otimes f_m(X_j \otimes f_m(X_{i_0} \otimes X_J)))$$
$$= f_m(X_{i_0} \otimes f_m(X_i \otimes f_m(X_j \otimes X_J))) + f_m([X_i, X_{i_0}] \otimes f_m(X_j \otimes X_J)) + f_m([X_j, X_{i_0}] \otimes f_m(X_i \otimes X_J)) + f_m([X_i, [X_j, X_{i_0}]] \otimes X_J).$$
Interchanging $i$ and $j$ in the expression above and subtracting, we obtain (using the Jacobi identity and 3) once more)
$$f_m(X_i \otimes f_m(X_j \otimes X_I)) - f_m(X_j \otimes f_m(X_i \otimes X_I)) = f_m([X_i, X_j] \otimes X_I),$$
which is exactly 3).

Lemma 2.37. There is a Lie algebra homomorphism $\rho : \mathfrak g \to \operatorname{End}(S(\mathfrak g))$ satisfying
1) $\rho(X_i)X_J = X_iX_J$ when $i \leq J$,
2) $\rho(X_i)X_J - X_iX_J \in S_m$ when $|J| = m$.

Proof. By Lemma 2.36 we can define a linear map $f : \mathfrak g \otimes S(\mathfrak g) \to S(\mathfrak g)$ as the unique map whose restriction to $\mathfrak g \otimes S_m$ equals $f_m$. If $J$ has length $m$, we have from Lemma 2.36 3) that
$$f([X_i, X_j] \otimes X_J) = f_m([X_i, X_j] \otimes X_J) = f_m(X_i \otimes f_m(X_j \otimes X_J)) - f_m(X_j \otimes f_m(X_i \otimes X_J)).$$
Therefore we get a Lie algebra homomorphism $\rho$ by setting $\rho(X_i)X_J = f(X_i \otimes X_J)$. 1) and 2) are fulfilled as an immediate consequence of 1) and 2) in Lemma 2.36.

Lemma 2.38. Let $Y \in T^m \cap J$ and let $Y_m$ be the component of $Y$ in $\mathfrak g^{\otimes m}$, i.e. its component of pure degree $m$. Then $Y_m \in I$.

Proof. We need to show that $Y_m$ is 0 in the symmetric algebra. $Y_m$ is a linear combination of elements of the form $X_1 \otimes \dots \otimes X_m$ with $X_1, \dots, X_m \in \mathfrak g$, so for simplicity we assume $Y_m$ to be of this form. From the previous lemma we have a Lie algebra homomorphism $\rho : \mathfrak g \to \operatorname{End}(S(\mathfrak g))$. By the universal property, $\rho$ induces an algebra homomorphism $\tilde\rho : U(\mathfrak g) \to \operatorname{End}(S(\mathfrak g))$, which by composition with $\kappa$ yields an algebra homomorphism $\rho' := \tilde\rho \circ \kappa : T(\mathfrak g) \to \operatorname{End}(S(\mathfrak g))$ with $J \subseteq \ker\rho'$. Since $Y \in J$ we thus have $\rho'(Y)1 = 0$. By successive use of property 2) satisfied by $\rho$, the components of $Y$ of degree less than $m$ contribute only elements of $S_{m-1}$ to $\rho'(Y)1$, while for the top component we get, again by 2),
$$\rho'(Y_m)1 = \rho(X_1)\cdots\rho(X_m)1 = X_1\cdots X_m + y$$
where $y \in S_{m-1}$. Thus $X_1\cdots X_m + \tilde y = 0$ for some $\tilde y \in S_{m-1}$, and since $X_1\cdots X_m$ is the only term of degree $m$, it has to be 0. Thus $Y_m$ is 0 in $S(\mathfrak g)$.
Now we are finally ready to do what we set out for: proving the Poincaré-Birkhoff-Witt Theorem.

Proof of Theorem 2.35. The only thing we need to show is that $\Phi$ is injective, which is equivalent to $I = \ker\varphi$. From Lemma 2.34 we have $I \subseteq \ker\varphi$, so we only need to show the reverse inclusion. Assume that $y \in \ker\varphi$ is homogeneous, i.e. $y \in \mathfrak g^{\otimes m}$ and $\varphi_m(y) = 0$. This implies $q_m(\kappa(y)) = 0$, which implies $\kappa(y) \in U^{m-1}$. Since $U^{m-1} = \kappa(T^{m-1})$, there is a $y' \in T^{m-1}$ such that $\kappa(y) = \kappa(y')$. But this means that $y - y' \in J = \ker\kappa$, hence $y - y' \in T^m \cap J$. As $y$ is the $m$-th component of $y - y'$, the previous lemma says that $y \in I$.

This apparently innocent-looking theorem has some useful consequences, of which we will mention a few. In the following, $S^m(\mathfrak g)$ denotes the $m$-th homogeneous component of $S(\mathfrak g)$ and $\pi : T(\mathfrak g) \to S(\mathfrak g)$ the canonical map.

Corollary 2.39. Let $W \subseteq \mathfrak g^{\otimes m}$ be a subspace such that $\pi|_W : W \to S^m(\mathfrak g)$ is an isomorphism. Then $\kappa(W)$ and $U^{m-1}$ are complementary subspaces of $U^m$, i.e. $U^m = \kappa(W) \oplus U^{m-1}$.

Proof. Let's consider the following diagram:
$$\begin{array}{ccc} \mathfrak g^{\otimes m} & \xrightarrow{\ \kappa\ } & U^m\\ {\scriptstyle\pi}\downarrow & & \downarrow{\scriptstyle q_m}\\ S^m(\mathfrak g) & \xrightarrow{\ \Phi\ } & G^m \end{array} \qquad(2.7)$$
It is commutative, since both compositions equal $\varphi_m$. By assumption, and since $\Phi$ is bijective, $\Phi \circ \pi|_W : W \to G^m$ is a bijection. Thus also $q_m \circ \kappa|_W : W \to G^m$ is a bijection. Since $U^{m-1}$ is the kernel of $q_m$, it is clear that $\kappa(W) \cap U^{m-1} = \{0\}$: if $x = \kappa(y) \in \kappa(W) \cap U^{m-1}$ with $y \in W$, then $0 = q_m(x) = q_m(\kappa(y))$, and since $q_m \circ \kappa|_W$ is a bijection, $y = 0$ and therefore $x = 0$. Finally we need to show that $U^m = \kappa(W) + U^{m-1}$. The inclusion $\supseteq$ is obvious, so let $x \in U^m$. If $x$ is in the kernel of $q_m$ it is in $U^{m-1}$ and we are done, so assume that it is not. Then $q_m(x) \neq 0$, and due to bijectivity of $q_m \circ \kappa|_W$ there exists $0 \neq y \in W$ such that $q_m(\kappa(y)) = q_m(x)$. But this exactly means that there is a $y' \in U^{m-1}$ such that $x = \kappa(y) + y'$, and the inclusion $\subseteq$ is established.

Corollary 2.40. The map $i : \mathfrak g \to U(\mathfrak g)$ is injective, i.e. $\mathfrak g$ can be considered a subspace of $U(\mathfrak g)$.

Proof. We clearly have $S^1(\mathfrak g) = \mathfrak g$, and $\pi : \mathfrak g \to S^1(\mathfrak g)$ is just the inclusion of $\mathfrak g$ into the symmetric algebra, i.e. it is bijective. Also $\Phi$ is bijective by the PBW-Theorem. Thus by diagram (2.7) also $q_1 \circ \kappa$ is bijective on $\mathfrak g$; in particular $\kappa : \mathfrak g \to U^1 \subseteq U$ is injective. But $\kappa$ restricted to $\mathfrak g$ is nothing but $i$, which is therefore injective.

Corollary 2.41. Let $\{X_1, \dots, X_n\}$ be a basis for $\mathfrak g$. Then the set consisting of the elements $X_1^{i_1}\cdots X_n^{i_n}$, $i_k \in \mathbb N_0$ (with $X_k^0 = 1$), is a basis for $U(\mathfrak g)$.

Proof. We will find a basis for each subspace $U^m$ in the filtration of $U(\mathfrak g)$; then, since $U(\mathfrak g) = \bigcup_{m=0}^\infty U^m$, we get a basis for $U(\mathfrak g)$. For $U^0$, which is just the scalar field, we have an obvious basis, namely $\{1\}$. Now we use induction over $m$: assume $U^{m-1}$ has a basis consisting of elements $X_1^{i_1}\cdots X_n^{i_n}$ with $i_1 + \dots + i_n \leq m - 1$. Let $W \subseteq \mathfrak g^{\otimes m}$ be the subspace spanned by the tensor monomials $X_1^{\otimes i_1} \otimes \dots \otimes X_n^{\otimes i_n}$ with $i_1 + \dots + i_n = m$. This is mapped isomorphically onto $S^m(\mathfrak g)$ by $\pi$ (they have the same kind of basis), and hence by Corollary 2.39 we have $U^m = U^{m-1} \oplus \kappa(W)$. Via $\kappa$, $W$ is mapped bijectively onto $\operatorname{span}\{X_1^{i_1}\cdots X_n^{i_n} \mid i_1 + \dots + i_n = m\}$.
Thus we get the desired basis for $U^m$. A basis for $U(\mathfrak g)$ as given above is often referred to as a PBW-basis. Sometimes this last corollary is called the PBW-Theorem.

Corollary 2.42. Let $\mathfrak g$ be a Lie algebra and $\mathfrak h$ a subalgebra of $\mathfrak g$. Then $U(\mathfrak h)$ is canonically isomorphic to the subalgebra of $U(\mathfrak g)$ generated by 1 and $\mathfrak h$.

Proof. Let $\rho : \mathfrak h \to \mathfrak g$ be the inclusion and compose it with $i : \mathfrak g \to U(\mathfrak g)$ to get a linear map $\mathfrak h \to U(\mathfrak g)$ taking brackets to commutators. By the universal property of $U(\mathfrak h)$ this map factorizes through a unital algebra homomorphism $\tilde\rho : U(\mathfrak h) \to U(\mathfrak g)$. If we pick a basis $\{X_1, \dots, X_k\}$ for $\mathfrak h$ and extend it to a basis $\{X_1, \dots, X_k, \dots, X_n\}$ for $\mathfrak g$, then, as $\tilde\rho(X_i) = X_i$ for $i \leq k$, $\tilde\rho$ maps each PBW-basis element $X_1^{i_1}\cdots X_k^{i_k}$ of $U(\mathfrak h)$ to the corresponding PBW-basis element of $U(\mathfrak g)$, i.e. a basis to a linearly independent set. It should then be clear that $\tilde\rho : U(\mathfrak h) \to \tilde\rho(U(\mathfrak h))$ is an isomorphism onto the subalgebra generated by 1 and $\mathfrak h$.

Corollary 2.43. Let $\mathfrak g$ be the Lie algebra direct sum $\mathfrak g_1 \oplus \mathfrak g_2$. Then there is a vector space isomorphism $U(\mathfrak g) \cong U(\mathfrak g_1) \otimes U(\mathfrak g_2)$.

Proof. From the preceding corollary we can view $U(\mathfrak g_1)$ and $U(\mathfrak g_2)$ as subspaces of $U(\mathfrak g)$. Therefore it makes sense to define the map $\varphi : U(\mathfrak g_1) \otimes U(\mathfrak g_2) \to U(\mathfrak g)$ by $\varphi(X \otimes Y) = XY$. Choosing bases $\{X_1, \dots, X_k\}$ for $\mathfrak g_1$ and $\{X_{k+1}, \dots, X_n\}$ for $\mathfrak g_2$, the set $\{X_1, \dots, X_n\}$ is a basis for $\mathfrak g$, and from the PBW-Theorem it is not hard to see that $\varphi$ is a vector space isomorphism.
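The PBW-basis is not just an existence statement: the relation $XY = YX + [X, Y]$ lets one rewrite any word in the generators into PBW normal form. The following toy Python implementation does this for $\mathfrak g = \mathfrak{sl}(2, \mathbb C)$ with the ordering $F < H < E$ and the commutation relations $[H, E] = 2E$, $[H, F] = -2F$, $[E, F] = H$ (introduced in Section 3.1 below); the data structures are ad hoc choices of this sketch:

    from collections import defaultdict

    # Normal ordering in U(sl(2,C)): rewrite any word in F, H, E as a linear
    # combination of PBW monomials F^a H^b E^c, using x y = y x + [x, y]
    # whenever a neighbouring pair is out of order (order F < H < E).
    ORDER = {'F': 0, 'H': 1, 'E': 2}
    # Brackets [x, y] for out-of-order pairs, as {word: coefficient}:
    # [H, F] = -2F,  [E, F] = H,  [E, H] = -2E.
    BRACKET = {('H', 'F'): {('F',): -2}, ('E', 'F'): {('H',): 1},
               ('E', 'H'): {('E',): -2}}

    def normal_form(word, coeff=1):
        """PBW normal form of a word (tuple of 'F','H','E') as {word: coeff}."""
        for k in range(len(word) - 1):
            x, y = word[k], word[k + 1]
            if ORDER[x] > ORDER[y]:
                result = defaultdict(int)
                swapped = word[:k] + (y, x) + word[k + 2:]
                for w, c in normal_form(swapped, coeff).items():
                    result[w] += c
                for b, cb in BRACKET[(x, y)].items():
                    rest = word[:k] + b + word[k + 2:]
                    for w, c in normal_form(rest, coeff * cb).items():
                        result[w] += c
                return dict(result)
        return {word: coeff}   # already ordered: a PBW monomial

    print(normal_form(('E', 'F')))        # {('F','E'): 1, ('H',): 1}: EF = FE + H
    print(normal_form(('E', 'H', 'F')))   # a combination of F^a H^b E^c

That the rewriting terminates with a well-defined answer is exactly the content of Corollary 2.41.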
Chapter 3

Basic Representation Theory of Lie Algebras

3.1 Lie Groups and Lie Algebras

The definition of a representation of a Lie group is not different from the definition of a representation of a topological group: it is still a continuous homomorphism $G \to \operatorname{Aut}(V)$, where $\operatorname{Aut}(V)$ is equipped with the strong operator topology. If $V$ is finite-dimensional, then $\operatorname{Aut}(V)$ is a Lie group itself, and so we can prove

Lemma 3.1. Let $(\pi, V)$ be a finite-dimensional representation of a Lie group $G$. Then the map $\pi : G \to \operatorname{Aut}(V)$ is a Lie group homomorphism.

Proof. For a finite-dimensional Banach space $V$ the strong operator topology on $B(V)$ equals the norm topology. Hence the subspace topology on $\operatorname{Aut}(V)$ induced by the strong operator topology on $B(V)$ equals the topology on $\operatorname{Aut}(V)$ given by the smooth structure. But then, since $\pi$ is a continuous group homomorphism, a classical result on Lie groups states that $\pi$ is automatically smooth (see for instance [14] Theorem 3.39).

Now, Lie group homomorphisms induce Lie algebra homomorphisms. In order to exploit this we need to study representation theory on the Lie algebra level. If $V$ is some fixed finite-dimensional vector space, it is well-known that $\operatorname{End}(V)$ with the commutator bracket is the Lie algebra of $\operatorname{Aut}(V)$.

Definition 3.2 (Lie Algebra Representation). Let $\mathfrak g$ be a Lie algebra and $V$ a vector space. A representation of $\mathfrak g$ on $V$ is a Lie algebra homomorphism $\rho : \mathfrak g \to \operatorname{End}(V)$. We also say that we have given $V$ the structure of a $\mathfrak g$-module. If $\rho$ is an injective map, the representation is called faithful. The dimension of the Lie algebra representation is the dimension of the vector space on which it is represented.

Let $G$ be a Lie group with Lie algebra $\mathfrak g$. If $\pi : G \to \operatorname{Aut}(V)$ is a finite-dimensional representation of $G$, then $\pi$ is a smooth map (by Lemma 3.1), and it thus induces a Lie algebra homomorphism $\pi_* : \mathfrak g \to \operatorname{End}(V)$. This renders the following diagram commutative:
$$\begin{array}{ccc} \mathfrak g & \xrightarrow{\ \pi_*\ } & \operatorname{End}(V)\\ {\scriptstyle\exp}\downarrow & & \downarrow{\scriptstyle\exp}\\ G & \xrightarrow{\ \pi\ } & \operatorname{Aut}(V) \end{array} \qquad(3.1)$$
i.e. $\pi(\exp X) = \exp(\pi_*X)$.
Thus a finite-dimensional Lie group representation automatically yields a representation of the Lie algebra of $G$. This we call the induced Lie algebra representation (in physics texts it is sometimes called the infinitesimal representation of $\mathfrak g$). The converse statement, that a finite-dimensional Lie algebra representation automatically yields a Lie group representation, is true in the case where $G$ is simply connected.

All the concepts introduced in Section 1.1 have Lie algebra analogs:

Definition 3.3. Let $\rho : \mathfrak g \to \operatorname{End}(V)$ be a Lie algebra representation. A subspace $U \subseteq V$ is called $\rho$-invariant if $\rho(X)U \subseteq U$ for all $X \in \mathfrak g$. The representation $\rho$ is called irreducible if the only invariant subspaces are 0 and $V$. Let $(\rho', V')$ be another Lie algebra representation. A linear map $T : V \to V'$ is called an intertwiner between $\rho$ and $\rho'$ if $T \circ \rho(X) = \rho'(X) \circ T$ for all $X \in \mathfrak g$. The set of all intertwiners is denoted $\operatorname{Hom}_{\mathfrak g}(V, V')$. If $T$ is an isomorphism, $T$ is called an equivalence of representations, and the two representations are said to be equivalent (denoted $\rho \cong \rho'$).

The analogy is actually quite close, as can be seen from the following proposition, which compares a Lie group representation with its Lie algebra representation.

Proposition 3.4. Let $(\pi, V)$ and $(\pi', V')$ be finite-dimensional representations of a connected Lie group $G$, and $(\pi_*, V)$ and $(\pi'_*, V')$ the induced Lie algebra representations. Then the following hold:
1) A subspace $U \subseteq V$ is $\pi$-invariant if and only if it is $\pi_*$-invariant.
2) $\pi$ is irreducible if and only if $\pi_*$ is irreducible.
3) $T : V \to V'$ is an intertwiner between $\pi$ and $\pi'$ if and only if it is an intertwiner between $\pi_*$ and $\pi'_*$.
4) $\pi \cong \pi'$ if and only if $\pi_* \cong \pi'_*$.

Proof. 1) If $G_e$ denotes the connected component of $G$ containing the identity element, then it is a fact from elementary Lie group theory that $G_e$ is the subgroup of $G$ generated by all elements of the form $\exp(X)$ for $X \in \mathfrak g$. Since $G$ is connected, $G_e = G$, i.e. all elements of $G$ are of the form $\exp(X_1)\cdots\exp(X_k)$. Now assume that $U \subseteq V$ is $\pi_*$-invariant. We want to show that $\pi(g)U \subseteq U$. $g$ is of the form $\exp(X_1)\cdots\exp(X_k)$, and hence (recall the diagram above)
$$\pi(g) = \pi(\exp(X_1))\cdots\pi(\exp(X_k)) = \exp(\pi_*(X_1))\cdots\exp(\pi_*(X_k)).$$
Since $\exp(\pi_*(X_i)) = \sum_{n=0}^\infty \frac{1}{n!}(\pi_*(X_i))^n$ and $U$ is $\pi_*(X_i)$-invariant (and closed, being finite-dimensional), we have $\exp(\pi_*(X_i))U \subseteq U$ and therefore $\pi(g)U \subseteq U$. Conversely, assume $U$ is $\pi$-invariant and $v \in U$. Then as
$$\pi_*(X)v = \frac{d}{dt}\Big|_{t=0}\pi(\exp(tX))v \qquad(3.2)$$
(a formula which can be found in introductory textbooks on differential geometry) and $\pi(\exp(tX))v \in U$, we have $\pi_*(X)v \in U$.
2) This now follows from 1).
3) Assume $T \in \operatorname{Hom}_{\mathfrak g}(V, V')$, i.e. $T \circ \pi_*(X) = \pi'_*(X) \circ T$. We consider $T \circ \pi(g)$ and once again use that $g = \exp(X_1)\cdots\exp(X_k)$, so that
$$T \circ \pi(g) = T \circ \exp(\pi_*(X_1))\cdots\exp(\pi_*(X_k)).$$
Again, since $\exp(\pi_*(X_i)) = \sum_{n=0}^\infty \frac{1}{n!}(\pi_*(X_i))^n$ and $T$ is continuous, $T$ intertwines each term, and so we have $T \circ \exp(\pi_*(X_i)) = \exp(\pi'_*(X_i)) \circ T$. Therefore $T \circ \pi(g) = \pi'(g) \circ T$. Conversely, assume $T \circ \pi(g) = \pi'(g) \circ T$. Then
$$(T \circ \pi_*(X))v = T\frac{d}{dt}\Big|_{t=0}\pi(\exp tX)v = \frac{d}{dt}\Big|_{t=0}(T\pi(\exp tX)v) = \frac{d}{dt}\Big|_{t=0}(\pi'(\exp tX)Tv) = (\pi'_*(X) \circ T)v.$$
4) This follows directly from 3).

We even have a version of Schur's Lemma.

Theorem 3.5 (Schur's Lemma). Let $(\rho, V)$ and $(\rho', V')$ be finite-dimensional irreducible representations of a Lie algebra $\mathfrak g$. If $\varphi : V \to V'$ is an intertwiner, then $\varphi$ is either the zero map or an equivalence of representations. If $(\rho, V)$ is a finite-dimensional irreducible complex representation and $\psi : V \to V$ is a linear map commuting with all $\rho(X)$, then $\psi = \lambda\operatorname{id}_V$ for some $\lambda \in \mathbb C$.

Proof. $\ker\varphi$ is an invariant subspace of $V$, for if $v \in \ker\varphi$ then $\varphi(\rho(X)v) = \rho'(X)\varphi(v) = 0$. Thus $\ker\varphi = V$ or $\ker\varphi = \{0\}$. In the first case $\varphi$ is the zero map, and in the second it is injective. Likewise $\operatorname{im}\varphi$ is an invariant subspace of $V'$, so that $\operatorname{im}\varphi = \{0\}$ or $\operatorname{im}\varphi = V'$. In the case of $\varphi$ being injective (and $V$ being nontrivial) we cannot have $\operatorname{im}\varphi = \{0\}$; hence $\operatorname{im}\varphi = V'$ and $\varphi$ is a bijection. For the second assertion, observe that $\psi$ has an eigenvalue $\lambda \in \mathbb C$, i.e. $\psi - \lambda\operatorname{id}_V$ is not injective. Since $\psi - \lambda\operatorname{id}_V$ intertwines $\rho$, we must have $\psi - \lambda\operatorname{id}_V = 0$.

In Chapter 1 we introduced direct sums and tensor products of representations. How do these operations behave when passing to the Lie algebra level?

Proposition 3.6. Assume that we have two finite-dimensional representations $(\pi, V)$ and $(\pi', W)$ of a Lie group $G$. The direct sum representation $\pi \oplus \pi'$ induces a representation of $\mathfrak g$ on $V \oplus W$ given by
$$(\pi \oplus \pi')_*(X)(v, w) = (\pi_*(X)v, \pi'_*(X)w),$$
i.e. $(\pi \oplus \pi')_*(X) = \pi_*(X) \oplus \pi'_*(X)$. The tensor product representation $\pi \otimes \pi'$ induces a representation of $\mathfrak g$ on $V \otimes W$ given by
$$(\pi \otimes \pi')_*(X)(v \otimes w) = \pi_*(X)v \otimes w + v \otimes \pi'_*(X)w,$$
i.e. $(\pi \otimes \pi')_*(X) = \pi_*(X) \otimes \operatorname{id}_W + \operatorname{id}_V \otimes \pi'_*(X)$.

Proof. From (3.2) we have that
$$(\pi \oplus \pi')_*(X)(v, w) = \frac{d}{dt}\Big|_{t=0}(\pi \oplus \pi')(\exp(tX))(v, w) = \frac{d}{dt}\Big|_{t=0}\big(\pi(\exp(tX))v,\, \pi'(\exp(tX))w\big) = (\pi_*(X)v, \pi'_*(X)w).$$
For the second part we need to calculate
$$(\pi \otimes \pi')_*(X)(v \otimes w) = \frac{d}{dt}\Big|_{t=0}(\pi \otimes \pi')(e^{tX})(v \otimes w) = \frac{d}{dt}\Big|_{t=0}\pi(e^{tX})v \otimes \pi'(e^{tX})w. \qquad(3.3)$$
In other words, we need to be able to differentiate a tensor product of two smooth curves. A calculation completely similar to the one carried out when proving the formula for differentiation of an ordinary product shows that
$$(v \otimes w)'(t) = v'(t) \otimes w(t) + v(t) \otimes w'(t)$$
whenever $v : I \to V$ and $w : I \to W$ are smooth vector-valued functions. This and (3.3) yield
$$(\pi \otimes \pi')_*(X)(v \otimes w) = \Big(\frac{d}{dt}\Big|_{t=0}\pi(e^{tX})v\Big) \otimes \pi'(e^{0X})w + \pi(e^{0X})v \otimes \Big(\frac{d}{dt}\Big|_{t=0}\pi'(e^{tX})w\Big) = \pi_*(X)v \otimes w + v \otimes \pi'_*(X)w,$$
which was the desired expression.

Based on this, we simply define the direct sum of two Lie algebra representations $(\rho, V)$ and $(\rho', W)$ to be $(\rho \oplus \rho')(X) := \rho(X) \oplus \rho'(X)$, and the tensor product of the two by $(\rho \otimes \rho')(X) := \rho(X) \otimes \operatorname{id}_W + \operatorname{id}_V \otimes \rho'(X)$.

Now consider a complex representation $(\rho, V)$ of $\mathfrak g$. We can extend it to a representation $\rho_{\mathbb C}$ of $\mathfrak g_{\mathbb C}$ on $V$ by $\rho_{\mathbb C}(X + iY)v := \rho(X)v + i\rho(Y)v$. This we call the complexification of the representation $\rho$. For the transition from a representation to its complexification we have an analog of Proposition 3.4:

Proposition 3.7. Let $\mathfrak g$ be a finite-dimensional real Lie algebra, and $(\rho, V)$ and $(\rho', V')$ complex Lie algebra representations. Let $\mathfrak g_{\mathbb C}$, $\rho_{\mathbb C}$ and $\rho'_{\mathbb C}$ denote the associated complexifications. Then the following hold:
1) A subspace $W \subseteq V$ is $\rho$-invariant if and only if it is $\rho_{\mathbb C}$-invariant.
2) $\rho$ is irreducible if and only if $\rho_{\mathbb C}$ is irreducible.
3) A linear map $T : V \to V'$ intertwines $\rho$ and $\rho'$ if and only if it intertwines $\rho_{\mathbb C}$ and $\rho'_{\mathbb C}$.
4) $\rho$ and $\rho'$ are equivalent if and only if $\rho_{\mathbb C}$ and $\rho'_{\mathbb C}$ are equivalent.

Proof. 1) If $W$ is $\rho$-invariant then $\rho_{\mathbb C}(X + iY)W \subseteq \rho(X)W + i\rho(Y)W \subseteq W$. Conversely, if $W$ is $\rho_{\mathbb C}$-invariant then for $X \in \mathfrak g$ we have $\rho(X)W = \rho_{\mathbb C}(X)W \subseteq W$. This proves 1).
2) Follows immediately from 1).
3) If $T$ intertwines $\rho$ and $\rho'$ then
$$T \circ \rho_{\mathbb C}(X + iY) = T \circ (\rho(X) + i\rho(Y)) = \rho'(X) \circ T + i\rho'(Y) \circ T = \rho'_{\mathbb C}(X + iY) \circ T.$$
Conversely, if $T$ intertwines $\rho_{\mathbb C}$ and $\rho'_{\mathbb C}$, then for $X \in \mathfrak g$: $T \circ \rho(X) = T \circ \rho_{\mathbb C}(X) = \rho'_{\mathbb C}(X) \circ T = \rho'(X) \circ T$, which proves the claim.
4) This is an immediate consequence of 3).
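The algebra behind Proposition 3.6 can be checked directly: the assignment $X \mapsto \rho(X) \otimes \operatorname{id} + \operatorname{id} \otimes \rho'(X)$ sends brackets to brackets, because the two summands commute. A small numerical sketch, with random matrices standing in for $\rho(X)$, $\rho(Y)$, $\rho'(X)$, $\rho'(Y)$:

    import numpy as np

    rng = np.random.default_rng(3)
    n, m = 3, 4

    # Stand-ins for rho(X), rho(Y) on V and rho'(X), rho'(Y) on W.
    X_V, Y_V = rng.standard_normal((n, n)), rng.standard_normal((n, n))
    X_W, Y_W = rng.standard_normal((m, m)), rng.standard_normal((m, m))

    def tensor_rep(A, B):
        """(rho tensor rho')(X) = rho(X) kron id + id kron rho'(X) on V x W."""
        return np.kron(A, np.eye(m)) + np.kron(np.eye(n), B)

    def bracket(A, B):
        return A @ B - B @ A

    # The tensor-product formula turns componentwise brackets into brackets,
    # so it defines a Lie algebra homomorphism whenever rho and rho' do.
    lhs = bracket(tensor_rep(X_V, X_W), tensor_rep(Y_V, Y_W))
    rhs = tensor_rep(bracket(X_V, Y_V), bracket(X_W, Y_W))
    print(np.allclose(lhs, rhs))   # True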
In the chapters to follow we will develop a powerful theory to determine the irreducible representations of complex semisimple Lie algebras. But at this point it is fruitful to examine the simplest semisimple Lie algebra we know: $\mathfrak{sl}(2, \mathbb{C})$. Apart from being interesting because of its connection with the group $SU(2)$ and quantum mechanics, it will also serve as a simple example for the more general theory.

$\mathfrak{sl}(2, \mathbb{C})$ is the complex Lie algebra consisting of complex $2 \times 2$ matrices with zero trace. An obvious basis is the following:
$$H := \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad E := \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad F := \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix},$$
which obeys the commutation relations
$$[H, E] = 2E, \qquad [H, F] = -2F, \qquad [E, F] = H.$$

Theorem 3.8. For each integer $m \geq 1$ there is, up to equivalence, only one irreducible complex representation $(\rho_m, V_m)$ of $\mathfrak{sl}(2, \mathbb{C})$ of dimension $m$, and $V_m$ has a basis $\{v_0, \ldots, v_{m-1}\}$ such that
1) $\rho_m(H)v_j = (m - 2j - 1)v_j$.
2) $\rho_m(E)v_j = j(m - j)v_{j-1}$ for $j > 0$, and $\rho_m(E)v_0 = 0$.
3) $\rho_m(F)v_j = v_{j+1}$ for $j < m - 1$, and $\rho_m(F)v_{m-1} = 0$.

Proof. We first prove that any irreducible complex representation is of the form mentioned in the theorem. So let $(\rho, V)$ be an arbitrary complex irreducible representation of $\mathfrak{sl}(2, \mathbb{C})$ and let $m$ be its dimension. $\rho(H)$ is a complex endomorphism of $V$ and therefore has an eigenvalue $\mu$ with eigenvector $v$. We see that
$$\rho(H)\rho(E)v = \rho(E)\rho(H)v + \rho([H, E])v = \mu\,\rho(E)v + 2\rho(E)v = (\mu + 2)\rho(E)v.$$
Thus either $\rho(E)v$ is zero or it is an eigenvector for $\rho(H)$ with an eigenvalue different from $\mu$. Since the representation is finite-dimensional there can only be finitely many distinct eigenvalues, i.e. there is a $j_0$ such that $\rho(E)^{j_0}v \neq 0$ and $\rho(E)^{j_0+1}v = 0$. Let us define $v_0 := \rho(E)^{j_0}v$ and denote its eigenvalue by $\lambda$. Thus $v_0$ satisfies $\rho(H)v_0 = \lambda v_0$ and $\rho(E)v_0 = 0$.

We now define $v_j := \rho(F)^j v_0$. These satisfy
$$\rho(H)v_j = (\lambda - 2j)v_j \quad\text{and}\quad \rho(F)v_j = v_{j+1}.$$
Again, as there are only finitely many distinct eigenvalues, there exists a maximal $n$ such that $v_n \neq 0$ and $v_{n+1} = 0$. Let $W = \operatorname{span}\{v_0, \ldots, v_n\}$. We want to show that $W = V$, and as $\rho$ is irreducible it suffices to verify that $W$ is an invariant subspace. All the vectors in $W$ are eigenvectors for $\rho(H)$, so $W$ is $\rho(H)$-invariant. By construction $W$ is also $\rho(F)$-invariant. For $\rho(E)$-invariance we use induction over $j$ to show that $\rho(E)v_j = j(\lambda - j + 1)v_{j-1}$ for $j > 0$ and $\rho(E)v_0 = 0$. This last piece of information we already have, so we proceed immediately to the induction step: assume $\rho(E)v_j = j(\lambda - j + 1)v_{j-1}$. Then
$$\rho(E)v_{j+1} = \rho(E)\rho(F)v_j = \rho(F)\rho(E)v_j + \rho([E, F])v_j = \rho(F)\big(j(\lambda - j + 1)v_{j-1}\big) + \rho(H)v_j = j(\lambda - j + 1)v_j + (\lambda - 2j)v_j = (j + 1)(\lambda - j)v_j,$$
and this was what we wanted. Hence $W$ is also $\rho(E)$-invariant, and therefore $W = V$ and $n + 1 = m = \dim V$.

The only thing left is to show that $\lambda = n$, i.e. $\rho(H)v_0 = nv_0$. To show this we calculate the trace of $\rho(H)$ in two different ways. First, $\rho(H) = \rho([E, F]) = [\rho(E), \rho(F)]$, and the trace of a commutator is always $0$; thus $\operatorname{Tr}\rho(H) = 0$. On the other hand, in the basis $\{v_0, \ldots, v_n\}$ the operator $\rho(H)$ is diagonal with eigenvalues $\lambda, \lambda - 2, \ldots, \lambda - 2n$, and therefore the trace is $\lambda + (\lambda - 2) + \cdots + (\lambda - 2n) = (n+1)\lambda - n(n+1)$. The only way this can equal $0$ is if $\lambda = n$. This shows that $(\rho, V)$ is equivalent to $(\rho_m, V_m)$, and the uniqueness part of the theorem is proved.

To prove existence, let $V$ be a complex vector space of dimension $m$ and let $\{v_0, \ldots, v_{m-1}\}$ denote a basis. We define a linear map $\rho : \mathfrak{sl}(2, \mathbb{C}) \to \operatorname{End}(V)$ by
$$\rho(H)v_j := (m - 2j - 1)v_j, \qquad \rho(E)v_j := j(m - j)v_{j-1},\ \rho(E)v_0 := 0, \qquad \rho(F)v_j := v_{j+1},\ \rho(F)v_{m-1} := 0,$$
i.e. we brutally force $\rho$ to have the properties we want. It is not hard to verify that this really is a representation of $\mathfrak{sl}(2, \mathbb{C})$. To see that $\rho$ is irreducible, let $W \subseteq V$ be a nontrivial invariant subspace. As $W$ is $\rho(H)$-invariant and $\rho(H)$ has the $v_j$'s as eigenvectors with distinct eigenvalues, $W$ must contain at least one of them, say $v_k$. Then, as $W$ is $\rho(E)$-invariant and $v_0$ is a multiple of $\rho(E)^k v_k \in W$, we must have $v_0 \in W$. And finally, as $W$ is $\rho(F)$-invariant, we have $v_j = \rho(F)^j v_0 \in W$ for all $j = 0, \ldots, m - 1$. Hence $W = V$ and $\rho$ is irreducible. This completes the proof.

3.2 Weyl's Theorem

In this section we will prove a nice theorem due to Weyl, stating that any finite-dimensional representation of a semisimple Lie algebra is completely reducible, i.e. can be decomposed into irreducible representations.

Lemma 3.9. Let $\mathfrak{g}$ be a semisimple Lie algebra, and $(\rho, V)$ a finite-dimensional faithful representation of $\mathfrak{g}$. Then the bilinear form on $\mathfrak{g}$ given by $\beta(X, Y) = \operatorname{Tr}(\rho(X)\rho(Y))$ is symmetric, non-degenerate and satisfies $\beta([X, Y], Z) = \beta(X, [Y, Z])$.

Note that if $\rho$ is just the adjoint representation of $\mathfrak{g}$, then $\beta$ is nothing but the Killing form of $\mathfrak{g}$.

Proof. That $\beta$ is symmetric and has the associativity property $\beta([X, Y], Z) = \beta(X, [Y, Z])$ can be proved in exactly the same way as we proved Proposition 2.22. Now define, as for the Killing form, the radical
$$\operatorname{Rad}\beta := \{X \in \mathfrak{g} \mid \forall Y \in \mathfrak{g} : \beta(X, Y) = 0\}.$$
Obviously $\beta$ is non-degenerate if and only if $\operatorname{Rad}\beta = \{0\}$. We will use Cartan's Criterion to show that $\rho(\operatorname{Rad}\beta)$ is solvable. For $X, X', Y \in \operatorname{Rad}\beta$ we have
$$\operatorname{Tr}\big([\rho(X), \rho(X')]\rho(Y)\big) = \beta([X, X'], Y) = 0$$
by definition of $\operatorname{Rad}\beta$; hence $\rho(\operatorname{Rad}\beta)$ is solvable. Since $\rho : \mathfrak{g} \to \rho(\mathfrak{g})$ is a Lie algebra isomorphism, $\rho(\mathfrak{g})$ is semisimple. As $\rho(\operatorname{Rad}\beta) \subseteq \rho(\mathfrak{g})$ is a solvable ideal (note that $\operatorname{Rad}\beta$ is an ideal by the associativity of $\beta$), we must have $\rho(\operatorname{Rad}\beta) = \{0\}$ and consequently $\operatorname{Rad}\beta = \{0\}$, i.e. $\beta$ is non-degenerate.
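Both the explicit modules of Theorem 3.8 and trace-form computations as in Lemma 3.9 lend themselves to direct numerical verification. Below is a minimal numpy sketch (the function name sl2_irrep is my own) that builds the matrices of $\rho_m(H)$, $\rho_m(E)$, $\rho_m(F)$ in the basis $v_0, \ldots, v_{m-1}$ and confirms the commutation relations for the first few values of $m$.

\begin{verbatim}
import numpy as np

def sl2_irrep(m):
    """Matrices of rho_m(H), rho_m(E), rho_m(F) from Theorem 3.8
    in the basis v_0, ..., v_{m-1}."""
    H = np.diag([float(m - 2*j - 1) for j in range(m)])
    E = np.zeros((m, m)); F = np.zeros((m, m))
    for j in range(1, m):
        E[j - 1, j] = j * (m - j)   # rho_m(E) v_j = j(m-j) v_{j-1}
        F[j, j - 1] = 1.0           # rho_m(F) v_{j-1} = v_j
    return H, E, F

def bracket(A, B):
    return A @ B - B @ A

for m in range(1, 7):
    H, E, F = sl2_irrep(m)
    assert np.allclose(bracket(H, E), 2 * E)
    assert np.allclose(bracket(H, F), -2 * F)
    assert np.allclose(bracket(E, F), H)
print("Theorem 3.8 defines sl(2,C)-representations for m = 1, ..., 6")
\end{verbatim}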
Now let $B = \{X_1, \ldots, X_n\}$ be a basis for $\mathfrak{g}$. Since $\beta$ is non-degenerate, we get a corresponding dual basis $B' = \{Y_1, \ldots, Y_n\}$ for $\mathfrak{g}$, determined by $\beta(X_i, Y_j) = \delta_{ij}$. For each $X \in \mathfrak{g}$ we can expand $[X, X_i]$ and $[X, Y_i]$ in the bases $B$ and $B'$ respectively:
$$[X, X_i] = \sum_{j=1}^n a_{ij}X_j \quad\text{and}\quad [X, Y_i] = \sum_{j=1}^n b_{ij}Y_j.$$
By associativity of $\beta$ we get the following connection between the coefficients:
$$a_{ik} = \sum_{j=1}^n a_{ij}\beta(X_j, Y_k) = \beta([X, X_i], Y_k) = -\beta([X_i, X], Y_k) = -\beta(X_i, [X, Y_k]) = -\sum_{j=1}^n b_{kj}\beta(X_i, Y_j) = -b_{ki}.$$

Definition 3.10 (Casimir Element). Still assuming $\rho$ to be faithful, we define the Casimir element of $\rho$ relative to the basis $B$ by
$$c_\rho = \sum_{i=1}^n \rho(X_i)\rho(Y_i) \in \operatorname{End}(V).$$

One can show that this is in fact independent of the choice of basis. Since $\operatorname{ad}(\rho(X))$ is a derivation of $\operatorname{End}(V)$, i.e.
$$[\rho(X), \rho(Y)\rho(Z)] = [\rho(X), \rho(Y)]\rho(Z) + \rho(Y)[\rho(X), \rho(Z)],$$
$c_\rho$ commutes with $\rho(X)$:
$$[\rho(X), c_\rho] = \sum_{i=1}^n [\rho(X), \rho(X_i)\rho(Y_i)] = \sum_{i=1}^n [\rho(X), \rho(X_i)]\rho(Y_i) + \sum_{i=1}^n \rho(X_i)[\rho(X), \rho(Y_i)] = \sum_{i,j=1}^n a_{ij}\rho(X_j)\rho(Y_i) + \sum_{i,j=1}^n b_{ij}\rho(X_i)\rho(Y_j) = 0,$$
the last equality because $a_{ij} = -b_{ji}$.

In the case of $\rho$ being irreducible, Schur's Lemma says that $c_\rho$ is nothing but multiplication by a constant $\lambda$. We can even calculate this particular constant: on one hand we must have $\operatorname{Tr}c_\rho = \lambda\dim V$, but on the other hand
$$\operatorname{Tr}c_\rho = \sum_{i=1}^n \operatorname{Tr}\rho(X_i)\rho(Y_i) = \sum_{i=1}^n \beta(X_i, Y_i) = n = \dim\mathfrak{g},$$
therefore $\lambda = \dim\mathfrak{g}/\dim V$; so in this case we see that the Casimir element is indeed independent of the choice of basis.

If $\rho$ is not faithful, the above construction is meaningless since $\beta$ is no longer non-degenerate and the dual basis is therefore ill-defined. Nonetheless we can still define a Casimir element of $\rho$. The kernel $\ker\rho$ is an ideal in $\mathfrak{g}$ and is therefore a direct sum of simple ideals ($\mathfrak{g}$ is still assumed to be semisimple). Let $\mathfrak{g}'$ denote the direct sum of the remaining simple ideals in $\mathfrak{g}$; then $\mathfrak{g}'$ is semisimple, $\mathfrak{g} = \ker\rho \oplus \mathfrak{g}'$, and $\rho|_{\mathfrak{g}'}$ is injective. Choosing a basis for $\mathfrak{g}'$, we define the Casimir element $c_\rho$ of $\rho$ to be the Casimir element of $\rho|_{\mathfrak{g}'}$ relative to the chosen basis for $\mathfrak{g}'$. Obviously $c_\rho$ commutes with $\rho(X)$ just as before. Therefore, if $\rho$ is irreducible then also $\rho|_{\mathfrak{g}'}$ is irreducible, and by the same reasoning as above we see that $c_\rho$ is just multiplication by the constant $\lambda = \dim\mathfrak{g}'/\dim V$, which is non-zero unless the representation is trivial.
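As an illustration of Definition 3.10, here is a sketch (Python/numpy; the helpers coords and ad are my own) which computes the Casimir element of the adjoint representation of $\mathfrak{sl}(2, \mathbb{C})$ relative to the basis $\{H, E, F\}$: the trace form is assembled numerically, the dual basis is read off from its inverse, and the resulting Casimir element is checked to be multiplication by $\dim\mathfrak{g}/\dim V = 3/3 = 1$, as predicted above.

\begin{verbatim}
import numpy as np

# Basis of sl(2,C) as 2x2 matrices.
H = np.diag([1.0, -1.0])
E = np.array([[0.0, 1.0], [0.0, 0.0]])
F = np.array([[0.0, 0.0], [1.0, 0.0]])
basis = [H, E, F]

def bracket(A, B):
    return A @ B - B @ A

def coords(M):
    """Coordinates of a traceless 2x2 matrix in the basis (H, E, F)."""
    return np.array([M[0, 0], M[0, 1], M[1, 0]])

def ad(X):
    """ad(X) as a 3x3 matrix; column j holds coords of [X, basis[j]]."""
    return np.column_stack([coords(bracket(X, B)) for B in basis])

# Trace form of the adjoint representation = Killing form.
beta = np.array([[np.trace(ad(X) @ ad(Y)) for Y in basis] for X in basis])

# Dual basis Y_i: column i of inv(beta) gives its coefficients.
dual = np.linalg.inv(beta)
ads = [ad(X) for X in basis]
casimir = sum(ads[i] @ sum(dual[j, i] * ads[j] for j in range(3))
              for i in range(3))

# For the irreducible adjoint representation the Casimir element is
# multiplication by dim g / dim V = 1.
assert np.allclose(casimir, np.eye(3))
print("Casimir of the adjoint representation =", casimir[0, 0], "* id")
\end{verbatim}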
Lemma 3.11. Let $(\rho, V)$ be a representation of a semisimple Lie algebra $\mathfrak{g}$. Then $\rho(\mathfrak{g}) \subseteq \mathfrak{sl}(V)$, i.e. $\operatorname{Tr}\rho(X) = 0$ for each $X \in \mathfrak{g}$. In particular any 1-dimensional representation is trivial: $\rho(X)v = 0$.

Proof. As $\mathfrak{g}$ is semisimple we have $\mathfrak{g} = [\mathfrak{g}, \mathfrak{g}]$, i.e. an arbitrary $X \in \mathfrak{g}$ can be written as a sum of commutators $X = \sum_i [Y_i, Y_i']$. Therefore $\rho(X) = \sum_i [\rho(Y_i), \rho(Y_i')]$, and commutators always have zero trace.

Theorem 3.12 (Weyl). Every finite-dimensional representation of a semisimple Lie algebra is completely reducible.

Proof. At first we work out the case in which $W \subseteq V$ is an invariant subspace of codimension 1 on which $\rho$ is irreducible. As mentioned, the Casimir element $c_\rho$ commutes with every $\rho(X)$, and from this it is not hard to extract that $\ker c_\rho$ is an invariant subspace of $V$ and that (by Schur's Lemma) $c_\rho(W) \subseteq W$. The induced representation of $\rho$ on $V/W$ is trivial by the preceding lemma (for $V/W$ is 1-dimensional), and this says that $\rho(X)$ maps $V$ into $W$. Thus, since $c_\rho$ is a sum of compositions of certain $\rho(X)$'s, also $c_\rho$ maps $V$ into $W$; therefore $c_\rho$ has a nontrivial kernel. But $c_\rho|_W$ is the Casimir element of $\rho|_W$, and by the remarks above this is just multiplication by a nonzero constant. Therefore $W \cap \ker c_\rho = \{0\}$, and hence $V = W \oplus \ker c_\rho$, i.e. $\ker c_\rho$ is the desired complementary invariant subspace.

Next we investigate the case in which $V$ contains an invariant subspace $W$ of codimension 1 (i.e. $\dim V - \dim W = 1$) on which $\rho$ is not irreducible. We want to find a (necessarily 1-dimensional) invariant subspace $Z$ of $V$ such that $V = W \oplus Z$. We accomplish this by induction over $\dim W$. If $\dim W = 0$ we can take $Z$ to be $V$. For the induction step, let $\dim W > 0$ and assume the conclusion to hold for all dimensions less than $\dim W$. Let $W' \subseteq W$ be a proper non-zero invariant subspace; then $W'$ is also an invariant subspace of $V$, and $W/W' \subseteq V/W'$ is a subspace of codimension 1. $\rho$ induces a representation on $V/W'$ by $\rho(X)[v] := [\rho(X)v]$ (it is easily checked that this is well-defined). Then $W/W'$ is an invariant subspace of $V/W'$ of codimension 1, and since $\dim W/W' < \dim W$, the induction hypothesis (if the induced representation restricted to $W/W'$ is not irreducible) or the case treated above (if it is irreducible) yields a 1-dimensional invariant subspace $\widetilde{W}/W' \subseteq V/W'$ such that $V/W' = W/W' \oplus \widetilde{W}/W'$. But this means that $W' \subseteq \widetilde{W}$ is a subspace of codimension 1, and again, since $\dim W' < \dim W$, we get a 1-dimensional invariant subspace $Z$ such that $\widetilde{W} = W' \oplus Z$. To see that we also have $V = W \oplus Z$, we observe that $W \cap Z = \{0\}$: if $W \cap Z \neq \{0\}$ we would have $Z \subseteq W$ and consequently $\widetilde{W}/W' \subseteq W/W'$, contradicting the decomposition above. Thus $W \cap Z = \{0\}$ and therefore (for dimension reasons) $V = W \oplus Z$.

Now for the general case, where $W \subseteq V$ is just some invariant subspace. $\rho$ gives rise to a representation $\tilde\rho$ of $\mathfrak{g}$ on $\operatorname{Hom}(V, W)$ by the definition $\tilde\rho(X)f := \rho(X)f - f\rho(X)$. Let $U \subseteq \operatorname{Hom}(V, W)$ be the set of linear maps $f$ which restricted to $W$ are just multiplication by a constant, i.e. $f \in U$ if $f(w) = \lambda w$ for all $w \in W$ and some $\lambda \in \mathbb{C}$. $U$ is a $\tilde\rho$-invariant subspace, since for $w \in W$
$$(\tilde\rho(X)f)(w) = \rho(X)f(w) - f(\rho(X)w) = \lambda\rho(X)w - \lambda\rho(X)w = 0$$
(using that $\rho(X)w \in W$), i.e. $(\tilde\rho(X)f)|_W = 0$ for $f \in U$. Now let $\widetilde{U} \subseteq U$ be the set of linear maps $f : V \to W$ such that $f|_W = 0$. Also $\widetilde{U}$ is an invariant subspace of $U$, and $\widetilde{U}$ has codimension 1, since any $f \in U$ can be written as a sum $\tilde f + f_0$, where $\tilde f \in \widetilde{U}$ and $f_0$ is $\lambda\,\mathrm{id}_W$ extended by $0$ to all of $V$. From the analysis in the
beginning of the proof we get a 1-dimensional invariant subspace $Z \subseteq U$ such that $U = \widetilde{U} \oplus Z$. Let $g_0 : V \to W$ be a function spanning $Z$. By a scaling we can arrange that $g_0|_W = \mathrm{id}_W$. Since $Z$ is 1-dimensional, Lemma 3.11 says that $\tilde\rho(X)g_0 = 0$, i.e. $g_0$ commutes with every $\rho(X)$, and therefore $\ker g_0$ is an invariant subspace of $V$. Again, since $g_0|_W = \mathrm{id}_W$, we must have $W \cap \ker g_0 = \{0\}$, and as $\operatorname{im}g_0 = W$, the rank–nullity theorem of linear algebra says that $\dim\ker g_0 + \dim\operatorname{im}g_0 = \dim V$; therefore $V = W \oplus \ker g_0$.

What we have actually proved now is that any invariant subspace has a complementary invariant subspace. Having observed this, the rest of the proof is identical to the proof of Proposition 1.16.

Armed with this theorem we only need to know the irreducible representations of a semisimple Lie algebra in order to classify its finite-dimensional representations. Classifying the irreducible representations of complex semisimple Lie algebras is the goal of a later chapter. For now we state and prove two corollaries regarding $\mathfrak{sl}(2, \mathbb{C})$-representations which will be useful at a later stage.

Corollary 3.13. If $\rho$ is a finite-dimensional complex representation of $\mathfrak{sl}(2, \mathbb{C})$ on $V$, then $\rho(H)$ has only integer eigenvalues.

Proof. $\mathfrak{sl}(2, \mathbb{C})$ is semisimple, and so by Weyl's Theorem $V$ is a direct sum $V = V_1 \oplus \cdots \oplus V_n$ such that each $\rho|_{V_i}$ is irreducible. Assume $v \in V$ to be an eigenvector for $\rho(H)$ with corresponding eigenvalue $\lambda$. Then $v = v_1 + \cdots + v_n$ with $v_i \in V_i$, and we see that $\rho(H)v_i = \lambda v_i$, i.e. $\lambda$ is an eigenvalue for, say, $\rho|_{V_1}$. But from Theorem 3.8 it should be apparent that such an eigenvalue has to be an integer.

Corollary 3.14. Let $\rho : \mathfrak{sl}(2, \mathbb{C}) \to \operatorname{End}(V)$ be a, possibly infinite-dimensional, representation such that each $v \in V$ is contained in some finite-dimensional invariant subspace. Then $\rho$ is a, possibly infinite, direct sum of irreducible representations.

Proof. Every element $v \in V$ is contained in a finite-dimensional invariant subspace, and $\rho$ restricted to this subspace can, by Weyl's Theorem, be decomposed into a direct sum of irreducible representations. Thus we have $V = \sum_{i \in I} U_i$, where each $U_i$ is a finite-dimensional irreducible invariant subspace of $V$. We need to extract from this a direct sum decomposition. To this end, call a subset $J \subseteq I$ independent if $\sum_{i \in J} U_i$ is a direct sum. One-point sets are examples of independent sets. It is not hard to see that the collection of independent subsets of $I$, ordered by inclusion, is inductively ordered (i.e. every totally ordered subset has an upper bound). Thus Zorn's Lemma gives a maximal independent subset $J_0 \subseteq I$. Put
$$V_0 := \sum_{i \in J_0} U_i = \bigoplus_{i \in J_0} U_i.$$
We obviously have $V_0 \subseteq V$. To show the reverse inclusion, consider $U_i$ for an arbitrary $i \in I$. We want to show that $U_i \subseteq V_0$ and hence that $V \subseteq V_0$. This is trivially true if $i \in J_0$. If not, then by maximality $V_0 + U_i$ cannot be a direct sum, i.e. $V_0 \cap U_i \neq \{0\}$. But $V_0 \cap U_i \subseteq U_i$ is an invariant subspace, and by irreducibility of $U_i$ we must have $V_0 \cap U_i = U_i$, i.e. $U_i \subseteq V_0$.
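Corollary 3.13 can be watched in action with a short computation. The sketch below (numpy; the helper sl2_irrep_H is my own name) forms $\rho(H)$ on the tensor product $V_2 \otimes V_3$ via the Kronecker-sum formula of Proposition 3.6 and lists its eigenvalues, which are integers as the corollary predicts.

\begin{verbatim}
import numpy as np

def sl2_irrep_H(m):
    """Diagonal matrix of rho_m(H) from Theorem 3.8."""
    return np.diag([float(m - 2*j - 1) for j in range(m)])

# rho(H) on V_2 (x) V_3 via the Kronecker-sum formula of Proposition 3.6.
H = np.kron(sl2_irrep_H(2), np.eye(3)) + np.kron(np.eye(2), sl2_irrep_H(3))

eigs = np.linalg.eigvalsh(H)
print(sorted(int(round(x)) for x in eigs))   # [-3, -1, -1, 1, 1, 3]
# All integers, as Corollary 3.13 predicts; indeed, by Weyl's Theorem
# V_2 (x) V_3 decomposes as V_4 (+) V_2.
\end{verbatim}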
Chapter 4

Root Systems

4.1 Weights and Roots

First some terminology:

Definition 4.1 (Maximal Torus). Let $\mathfrak{g}$ be a finite-dimensional Lie algebra. A torus $\mathfrak{t}$ in $\mathfrak{g}$ is a commutative subalgebra of $\mathfrak{g}$. A maximal torus in $\mathfrak{g}$ is a torus which is not contained in any strictly larger torus.

Lemma 4.2. Let $\mathfrak{t}$ be any maximal torus of $\mathfrak{g}$. Then $\mathfrak{t}$ equals its own centralizer.

Proof. As $\mathfrak{t}$ is commutative, every element of $\mathfrak{t}$ commutes with all of $\mathfrak{t}$; hence $\mathfrak{t}$ sits inside its centralizer. Conversely, assume that $X$ is in the centralizer; then obviously $\mathfrak{t} + \operatorname{span}\{X\}$ is a torus. Since $\mathfrak{t}$ was maximal, this implies that $X \in \mathfrak{t}$.

Every non-zero Lie algebra possesses a nonzero maximal torus: pick any 1-dimensional Lie subalgebra, which is automatically commutative, and find a maximal commutative subalgebra containing it.

Definition 4.3 (Cartan Subalgebra). Let $\mathfrak{g}$ be a complex Lie algebra. A Cartan subalgebra $\mathfrak{h}$ of $\mathfrak{g}$ is a maximal torus such that $\operatorname{ad}(H) \in \operatorname{End}(\mathfrak{g})$ is diagonalizable for all $H \in \mathfrak{h}$.

It is not at all obvious that non-trivial Cartan subalgebras exist. One can show that they do in fact exist if $\mathfrak{g}$ is semisimple. Furthermore, if both $\mathfrak{h}$ and $\mathfrak{h}'$ are Cartan subalgebras, one can show that there is an automorphism of $\mathfrak{g}$ mapping $\mathfrak{h}$ to $\mathfrak{h}'$. In particular all Cartan subalgebras have the same complex dimension. This common dimension is called the rank of the Lie algebra.

Proposition 4.4. If $\mathfrak{g}_i$ are Lie algebras and $\mathfrak{h}_i$ are corresponding Cartan subalgebras, then $\mathfrak{h}_1 \oplus \mathfrak{h}_2$ is a Cartan subalgebra of $\mathfrak{g}_1 \oplus \mathfrak{g}_2$.

Proof. Obviously $\mathfrak{h}_1 \oplus \mathfrak{h}_2$ is an abelian subalgebra, and it is maximal abelian: if $(X, Y) \in \mathfrak{g}_1 \oplus \mathfrak{g}_2$ commutes with every $(H_1, H_2) \in \mathfrak{h}_1 \oplus \mathfrak{h}_2$, then
$$(0, 0) = [(X, Y), (H_1, H_2)] = ([X, H_1], [Y, H_2]),$$
and since each $\mathfrak{h}_i$ is maximal abelian (Lemma 4.2), we get $X \in \mathfrak{h}_1$ and $Y \in \mathfrak{h}_2$. Furthermore, we see for $H = (H_1, H_2) \in \mathfrak{h}_1 \oplus \mathfrak{h}_2$ that $\operatorname{ad}(H) : \mathfrak{g}_1 \oplus \mathfrak{g}_2 \to \mathfrak{g}_1 \oplus \mathfrak{g}_2$ is the map given by $(X, Y) \mapsto (\operatorname{ad}(H_1)X, \operatorname{ad}(H_2)Y)$, and since each $\operatorname{ad}(H_i)$ is diagonalizable, $\operatorname{ad}(H)$ is diagonalizable as well.
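To make Definition 4.3 concrete: for $\mathfrak{sl}(3, \mathbb{C})$ the diagonal traceless matrices form a Cartan subalgebra, a fact used in Example 4.26 below. The following numpy sketch (the helper E and the chosen element H are mine) checks the key point, namely that $\operatorname{ad}(H)$ acts diagonally, by verifying $[H, E_{ij}] = (h_i - h_j)E_{ij}$ on the elementary matrices $E_{ij}$.

\begin{verbatim}
import numpy as np

def E(i, j, n=3):
    """Elementary matrix with a single 1 in position (i, j)."""
    M = np.zeros((n, n)); M[i, j] = 1.0
    return M

H = np.diag([1.0, 2.0, -3.0])   # an arbitrary diagonal traceless element

# ad(H) acts on E_ij by the scalar h_i - h_j:
for i in range(3):
    for j in range(3):
        if i != j:
            lhs = H @ E(i, j) - E(i, j) @ H
            assert np.allclose(lhs, (H[i, i] - H[j, j]) * E(i, j))
# Together with ad(H)|_h = 0 this exhibits an eigenbasis for ad(H),
# so ad(H) is diagonalizable and the diagonal traceless matrices form
# a Cartan subalgebra.
print("ad(H) is diagonal in the basis {E_ij} together with a basis of h")
\end{verbatim}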
Definition 4.5 (Weight). Let $\mathfrak{g}$ be a complex Lie algebra, consider a (possibly infinite-dimensional) representation $(\rho, V)$ of $\mathfrak{g}$, and let $\mathfrak{h}$ be any fixed torus in $\mathfrak{g}$. A functional $\lambda \in \mathfrak{h}^*$ for which
$$V_\lambda := \{v \in V \mid \rho(H)v = \lambda(H)v \text{ for all } H \in \mathfrak{h}\}$$
is non-zero is called a weight for $\rho$ w.r.t. $\mathfrak{h}$. The set of weights is denoted $\Lambda(\rho, \mathfrak{h})$ (or just $\Lambda(\rho)$). For $\lambda \in \Lambda(\rho, \mathfrak{h})$ the space $V_\lambda$ is called the weight space associated with $\lambda$, and its elements are called weight vectors.

It is not hard to see that the set of weights is preserved under equivalence of representations: consider two representations $(\rho, V)$ and $(\rho', V')$ of $\mathfrak{g}$ which are equivalent through the bijective intertwiner $T : V \to V'$, and let $\Lambda(\rho)$ and $\Lambda(\rho')$ be the sets of weights relative to a fixed torus $\mathfrak{h}$ in $\mathfrak{g}$. If $\lambda \in \Lambda(\rho)$, then $V_\lambda$ and $T(V_\lambda)$ are both nonzero. For $0 \neq v \in V_\lambda$ we see that $\rho'(H)(Tv) = T(\rho(H)v) = \lambda(H)Tv$, i.e. $T(V_\lambda) \subseteq V'_\lambda$, which is therefore nonzero. Thus $\Lambda(\rho) \subseteq \Lambda(\rho')$. A similar argument gives the reverse inclusion, and hence equivalent representations have the same weights.

We record some important properties of weights.

Theorem 4.6. Let $(\rho, V)$ be a finite-dimensional representation of $\mathfrak{g}$ and $\mathfrak{h}$ a torus in $\mathfrak{g}$. Then $\Lambda(\rho, \mathfrak{h})$ is nonempty and finite. If furthermore $\rho(H)$ is diagonalizable for each $H \in \mathfrak{h}$, then we have the weight space decomposition
$$V = \bigoplus_{\lambda \in \Lambda(\rho)} V_\lambda. \tag{4.1}$$

Proof. Let $\{X_1, \ldots, X_n\}$ be a basis for $\mathfrak{h}$. As $\rho$ is a complex representation, $\rho(X_1)$ must have an eigenvalue $\lambda_1$ with associated eigenspace $E_1$. Since $\mathfrak{h}$ is abelian, it is easy to see that $\rho(H)$ maps $E_1$ into $E_1$ for any $H \in \mathfrak{h}$, and by the same argument $\rho(X_2)|_{E_1}$ must have an eigenvalue $\lambda_2$ with eigenspace $E_2 \neq \{0\}$. By induction we get subspaces $\{0\} \neq E_n \subseteq \cdots \subseteq E_1$ such that each $\rho(X_j)$ acts on $E_n$ by multiplication by $\lambda_j$. Thus, if we define the functional $\lambda \in \mathfrak{h}^*$ by $\lambda(X_j) = \lambda_j$, we get $\rho(H)v = \lambda(H)v$ for all $H \in \mathfrak{h}$ and $v \in E_n$. Thus $E_n \subseteq V_\lambda$, so that $V_\lambda \neq \{0\}$ and $\lambda \in \Lambda(\rho, \mathfrak{h})$.

If $\lambda$ is a weight, then $\lambda(X_i)$ is an eigenvalue for $\rho(X_i)$. Since $\rho(X_i)$ acts on a finite-dimensional space, there can be only finitely many eigenvalues, and thus only finitely many weights.

Now assume that $\rho(H)$ is diagonalizable for each $H \in \mathfrak{h}$. In particular this holds for $X_1$, so we can decompose $V$ into eigenspaces for $\rho(X_1)$. Again, as $\mathfrak{h}$ is abelian, $\rho(X_2)$ maps these eigenspaces into themselves, and as $\rho(X_2)$ is diagonalizable we can decompose each eigenspace of $\rho(X_1)$ into eigenspaces for $\rho(X_2)$. Each of these can then be decomposed into eigenspaces for $\rho(X_3)$, and so on, until we have a decomposition $V = V_1 \oplus \cdots \oplus V_N$ (which is finite as $V$ is finite-dimensional) such that $\rho(X_i)$ acts on $V_j$ by the scalar $\lambda_{ij}$, $i = 1, \ldots, n$, $j = 1, \ldots, N$. Now define functionals $\lambda_j \in \mathfrak{h}^*$, $j = 1, \ldots, N$, by $\lambda_j(X_i) = \lambda_{ij}$. We claim that $\Lambda(\rho, \mathfrak{h}) = \{\lambda_1, \ldots, \lambda_N\}$.

To show that $\lambda_j$ is a weight, we only need to see that $V_{\lambda_j} = \{v \in V \mid \rho(H)v = \lambda_j(H)v\}$ is non-zero. But we see that $\{0\} \neq V_j \subseteq V_{\lambda_j}$, for if $v \in V_j$ then for any $i = 1, \ldots, n$ we have
$$\rho(X_i)v = \lambda_{ij}v = \lambda_j(X_i)v. \tag{4.2}$$
Thus $\lambda_j$ is a weight. If $\lambda$ is a weight, how is $V_\lambda$ related to the subspaces $V_j$? The claim is that
$$V_\lambda = \bigoplus_{j : \lambda_j = \lambda} V_j.$$
The inclusion $\supseteq$ is obvious from (4.2). To see $\subseteq$, let $v \in V_\lambda$ and split it as $v = v_1 + \cdots + v_N$ with $v_j \in V_j$. Since each $V_j$ is invariant under every $\rho(H)$, comparing components gives $\rho(H)v_j = \lambda(H)v_j$. Let $J$ be an index such that $\lambda_J \neq \lambda$. Then there exists an $I$ such that $\lambda(X_I) \neq \lambda_J(X_I)$. But on the other hand we have $\rho(X_I)v_J = \lambda_{IJ}v_J = \lambda_J(X_I)v_J$. The only way this can be compatible with $\lambda(X_I) \neq \lambda_J(X_I)$ is if $v_J = 0$. This shows the reverse inclusion.

Now suppose $\lambda$ is a weight. Then we just showed that $V_\lambda = \bigoplus_{j : \lambda_j = \lambda} V_j$, and as $V_\lambda$ is non-trivial there is at least one $j$ such that $\lambda = \lambda_j$. Therefore $\Lambda(\rho, \mathfrak{h}) = \{\lambda_1, \ldots, \lambda_N\}$, and in particular we have the weight space decomposition (4.1).

In particular, due to the weight space decomposition, the number of weights cannot exceed $\dim_{\mathbb{C}} V$.

From the proof of this theorem we saw that diagonalizability of the endomorphisms $\rho(H)$ was crucial for the weight space decomposition to hold. By definition of the Cartan subalgebra, the adjoint representation possesses exactly this property. To exploit this we make the following definition.

Definition 4.7 (Root). Let $\mathfrak{g}$ be a complex Lie algebra and $\mathfrak{h}$ any fixed Cartan subalgebra in $\mathfrak{g}$. A nonzero functional $\alpha \in \mathfrak{h}^*$ for which
$$\mathfrak{g}_\alpha := \{X \in \mathfrak{g} \mid [H, X] = \alpha(H)X \text{ for all } H \in \mathfrak{h}\}$$
is non-zero is called a root for $\mathfrak{g}$ w.r.t. $\mathfrak{h}$. The set of roots is denoted $R(\mathfrak{g}, \mathfrak{h})$ (or just $R$ for brevity). For $\alpha \in R(\mathfrak{g}, \mathfrak{h})$ the space $\mathfrak{g}_\alpha$ is called the root space associated with $\alpha$, and its elements are called root vectors for $\alpha$.

Thus roots are nothing more than weights of the adjoint representation. Notice that we do not count the zero functional as a root. This, however, does not mean that $\mathfrak{g}_0$ is zero. In fact $\mathfrak{g}_0 = \mathfrak{h}$: by definition, $\mathfrak{g}_0$ is just the centralizer of $\mathfrak{h}$; but $\mathfrak{h}$ is itself a maximal torus and thus equals its own centralizer $\mathfrak{g}_0$.

Now a direct translation of Theorem 4.6 gives:

Corollary 4.8. Let $\mathfrak{g}$ be a complex Lie algebra and $\mathfrak{h}$ a Cartan subalgebra. Then $R(\mathfrak{g}, \mathfrak{h})$ is nonempty and finite, and we have the so-called root space decomposition of $\mathfrak{g}$ relative to $\mathfrak{h}$:
$$\mathfrak{g} = \mathfrak{h} \oplus \Big(\bigoplus_{\alpha \in R} \mathfrak{g}_\alpha\Big). \tag{4.3}$$

In particular we see from the root space decomposition that the number of roots cannot exceed $\dim_{\mathbb{C}}\mathfrak{g} - \dim_{\mathbb{C}}\mathfrak{h}$.

In the case of a semisimple Lie algebra the roots carry a lot of information about the structure of the Lie algebra; in fact they can be used to classify semisimple Lie algebras, and thus it pays off to investigate the set of roots somewhat closer. That is the purpose of the next section. For now we just prove some elementary results without the assumption of $\mathfrak{g}$ being semisimple.

Proposition 4.9. Let $(\rho, V)$ be a (possibly infinite-dimensional) representation of a complex Lie algebra $\mathfrak{g}$ and let $\mathfrak{h} \subseteq \mathfrak{g}$ be a Cartan subalgebra. For $\lambda \in \Lambda(\rho, \mathfrak{h})$ and $\alpha \in R(\mathfrak{g}, \mathfrak{h})$ we have
$$\rho(\mathfrak{g}_\alpha)V_\lambda \subseteq V_{\lambda+\alpha}.$$
Thus, if $\lambda + \alpha$ is not a weight, then $\rho(\mathfrak{g}_\alpha)$ annihilates $V_\lambda$.
Proof. Let $X \in \mathfrak{g}_\alpha$ and $v \in V_\lambda$. Then for all $H \in \mathfrak{h}$ we have
$$\rho(H)(\rho(X)v) = \rho(X)\rho(H)v + [\rho(H), \rho(X)]v = \lambda(H)\rho(X)v + \rho([H, X])v = \lambda(H)\rho(X)v + \alpha(H)\rho(X)v = (\lambda + \alpha)(H)\,\rho(X)v,$$
i.e. $\rho(X)v \in V_{\lambda+\alpha}$. If $\lambda + \alpha$ is not a weight, then $V_{\lambda+\alpha} = \{0\}$ and the last assertion follows.

In the case of the adjoint representation of $\mathfrak{g}$ we immediately get:

Corollary 4.10. For $\alpha, \beta \in R(\mathfrak{g}, \mathfrak{h})$ we have $[\mathfrak{g}_\alpha, \mathfrak{g}_\beta] \subseteq \mathfrak{g}_{\alpha+\beta}$.

4.2 Root Systems for Semisimple Lie Algebras

In this section we record some further results about roots, relying heavily on semisimplicity of the Lie algebras in question. Throughout, $\mathfrak{g}$ will be a complex semisimple Lie algebra and $\mathfrak{h}$ a Cartan subalgebra. $B$ will denote the Killing form, i.e. the bilinear form $\mathfrak{g} \times \mathfrak{g} \to \mathbb{C}$ given by $B(X, Y) = \operatorname{Tr}(\operatorname{ad}(X)\operatorname{ad}(Y))$. On a semisimple Lie algebra this is non-degenerate (cf. Theorem 2.24).

Lemma 4.11. Let $R = R(\mathfrak{g}, \mathfrak{h})$ be the set of roots. Then:
1) Let $\alpha, \beta \in R \cup \{0\}$ with $\alpha + \beta \neq 0$, and let $X \in \mathfrak{g}_\alpha$, $Y \in \mathfrak{g}_\beta$. Then $B(X, Y) = 0$; in other words, $\mathfrak{g}_\alpha$ and $\mathfrak{g}_\beta$ are $B$-orthogonal.
2) For $\alpha \in R \cup \{0\}$ the restriction $B|_{\mathfrak{g}_\alpha \times \mathfrak{g}_{-\alpha}}$ is non-singular, i.e. $B(X, Y) = 0$ for every $Y \in \mathfrak{g}_{-\alpha}$ implies $X = 0$.
3) If $\alpha \in R$ then $-\alpha \in R$.
4) $B|_{\mathfrak{h} \times \mathfrak{h}}$ is non-degenerate; thus to any $\alpha \in R$ there is a unique $H_\alpha \in \mathfrak{h}$ satisfying $\alpha(H) = B(H, H_\alpha)$ for all $H \in \mathfrak{h}$.
5) For $\alpha \in R$, $X \in \mathfrak{g}_\alpha$ and $Y \in \mathfrak{g}_{-\alpha}$ we have $[X, Y] = B(X, Y)H_\alpha$.
6) For $\alpha, \beta \in R$, $\beta(H_\alpha)$ is a rational multiple of $\alpha(H_\alpha)$.
7) The roots span $\mathfrak{h}^*$.
8) If $\alpha \in R$ then $\alpha(H_\alpha) \neq 0$.

Proof. 1) From Corollary 4.10 we have $[\mathfrak{g}_\alpha, \mathfrak{g}_\beta] \subseteq \mathfrak{g}_{\alpha+\beta}$, or equivalently $\operatorname{ad}(\mathfrak{g}_\alpha)\mathfrak{g}_\beta \subseteq \mathfrak{g}_{\alpha+\beta}$, and hence $\operatorname{ad}(X)\operatorname{ad}(Y)\mathfrak{g}_\gamma \subseteq \mathfrak{g}_{\alpha+\beta+\gamma}$. As $\alpha + \beta \neq 0$, the space $\mathfrak{g}_{\alpha+\beta+\gamma}$ is different from $\mathfrak{g}_\gamma$, so $\operatorname{ad}(X)\operatorname{ad}(Y)$ maps $\mathfrak{g}_\gamma$ into a totally different subspace of $\mathfrak{g}$. Therefore, picking bases for the root spaces and for $\mathfrak{g}_0 = \mathfrak{h}$ and putting them together to a basis for $\mathfrak{g}$ (the root space decomposition), the matrix representation of $\operatorname{ad}(X)\operatorname{ad}(Y)$ has zeros on the diagonal, and therefore the trace is $0$.

2) Let $X \in \mathfrak{g}_\alpha$ and assume $B(X, Y) = 0$ for all $Y \in \mathfrak{g}_{-\alpha}$. If we can show that $B(X, Y) = 0$ for all $Y \in \mathfrak{g}$, then non-degeneracy of $B$ renders $X = 0$. But if $\beta \in R \cup \{0\}$ with $\beta \neq -\alpha$ and $Y \in \mathfrak{g}_\beta$, then 1) tells us that $B(X, Y) = 0$, and by the root space decomposition we get $B(X, Y) = 0$ for all $Y \in \mathfrak{g}$.
3) If $\alpha \in R$ then there is a non-zero $X \in \mathfrak{g}_\alpha$, and if $\mathfrak{g}_{-\alpha}$ were $\{0\}$, then trivially $B(X, Y) = 0$ for all $Y \in \mathfrak{g}_{-\alpha}$, so 2) would give $X = 0$. Thus, by contradiction, $\mathfrak{g}_{-\alpha}$ is non-zero, and $-\alpha$ is a root.

4) Non-degeneracy of $B$ on $\mathfrak{h} \times \mathfrak{h}$ is an immediate consequence of 2) with $\alpha = 0$. Hence we get an isomorphism $\widetilde{B} : \mathfrak{h} \to \mathfrak{h}^*$ by mapping $X$ to the functional $B(X, \cdot)$. Therefore, putting $H_\alpha := \widetilde{B}^{-1}(\alpha)$, we get an element of $\mathfrak{h}$ satisfying $\alpha(H) = \widetilde{B}(H_\alpha)(H) = B(H_\alpha, H)$.

5) By Corollary 4.10 we get $[\mathfrak{g}_\alpha, \mathfrak{g}_{-\alpha}] \subseteq \mathfrak{h}$, and so for $Y \in \mathfrak{g}_{-\alpha}$ we have $[X, Y] \in \mathfrak{h}$. For $H \in \mathfrak{h}$ we see that
$$B([X, Y], H) = -B(Y, [X, H]) = \alpha(H)B(Y, X) = B(H, B(X, Y)H_\alpha) = B(B(X, Y)H_\alpha, H),$$
and as $B|_{\mathfrak{h} \times \mathfrak{h}}$ is non-degenerate, $[X, Y] = B(X, Y)H_\alpha$.

6) Pick $X_\alpha \neq 0$ in $\mathfrak{g}_\alpha$. Then by 2) there exists $0 \neq Y_\alpha \in \mathfrak{g}_{-\alpha}$ with $B(X_\alpha, Y_\alpha) \neq 0$. By a proper scaling we can assume $B(X_\alpha, Y_\alpha) = 1$. Then 5) says that $[X_\alpha, Y_\alpha] = H_\alpha$. Now put
$$\tilde{\mathfrak{g}} := \bigoplus_{n \in \mathbb{Z}} \mathfrak{g}_{\beta+n\alpha}$$
(observe that only finitely many of these spaces are non-trivial). Since $\operatorname{ad}(H_\alpha)$ maps every root space into itself, $\operatorname{ad}(H_\alpha)$ also maps $\tilde{\mathfrak{g}}$ into itself. Thus by restriction we get a map $\tilde{\mathfrak{g}} \to \tilde{\mathfrak{g}}$, whose trace we calculate in two ways. On one hand the action of $\operatorname{ad}(H_\alpha)$ on $\mathfrak{g}_{\beta+n\alpha}$ is $[H_\alpha, X] = (\beta + n\alpha)(H_\alpha)X$, so the trace becomes
$$\operatorname{Tr}\big(\operatorname{ad}(H_\alpha)|_{\tilde{\mathfrak{g}}}\big) = \sum_{n \in \mathbb{Z}} \big(\beta(H_\alpha) + n\alpha(H_\alpha)\big)\dim\mathfrak{g}_{\beta+n\alpha}.$$
Again, this sum is finite, as only finitely many of the dimensions are non-zero. On the other hand, as $\operatorname{ad}(X_\alpha)$ maps $\mathfrak{g}_{\beta+n\alpha}$ into $\mathfrak{g}_{\beta+(n+1)\alpha}$, $\tilde{\mathfrak{g}}$ is $\operatorname{ad}(X_\alpha)$-invariant. Similarly it is $\operatorname{ad}(Y_\alpha)$-invariant, so as a map on $\tilde{\mathfrak{g}}$ we have $\operatorname{ad}(H_\alpha) = \operatorname{ad}X_\alpha\operatorname{ad}Y_\alpha - \operatorname{ad}Y_\alpha\operatorname{ad}X_\alpha$, and consequently the trace of $\operatorname{ad}(H_\alpha)|_{\tilde{\mathfrak{g}}}$ is $0$. Solving the equation $\sum_{n}(\beta(H_\alpha) + n\alpha(H_\alpha))\dim\mathfrak{g}_{\beta+n\alpha} = 0$ yields
$$\beta(H_\alpha) = -\frac{\sum_{n \in \mathbb{Z}} n\dim\mathfrak{g}_{\beta+n\alpha}}{\sum_{n \in \mathbb{Z}} \dim\mathfrak{g}_{\beta+n\alpha}}\,\alpha(H_\alpha),$$
and this is a rational multiple of $\alpha(H_\alpha)$.

7) Assume that the roots do not span $\mathfrak{h}^*$. Then there is a non-zero $H \in \mathfrak{h}$ with $\alpha(H) = 0$ for all roots $\alpha \in R$ (the span of the roots has a non-trivial annihilator in $\mathfrak{h}$). Thus for any root $\alpha$ and $X \in \mathfrak{g}_\alpha$ we have $[H, X] = \alpha(H)X = 0$, and since $H$ also commutes with every element of $\mathfrak{h}$, the root space decomposition shows that $H$ lies in the center of $\mathfrak{g}$. But for a semisimple Lie algebra the center is trivial, i.e. $H = 0$, a contradiction.

8) Assume $\alpha(H_\alpha) = 0$. Then by 6), $\beta(H_\alpha) = 0$ for all roots $\beta$. By 7) the roots span $\mathfrak{h}^*$, so $\varphi(H_\alpha) = 0$ for every $\varphi \in \mathfrak{h}^*$, i.e. $H_\alpha = 0$ and hence $\alpha = 0$, which is a contradiction.

Observe that for point 1) it was not necessary for $\mathfrak{g}$ to be semisimple.

The particular element $H_\alpha \in \mathfrak{h}$ whose existence was asserted in 4) is called a co-root. Unlike $X_\alpha$ or $Y_\alpha$, which were chosen arbitrarily, $H_\alpha$ is really unique,
due to the semisimplicity of $\mathfrak{g}$. In fact, for any functional $\varphi \in \mathfrak{h}^*$ we have a unique element $H_\varphi = \widetilde{B}^{-1}(\varphi) \in \mathfrak{h}$. This gives rise to a bilinear form $\langle\cdot,\cdot\rangle$ on $\mathfrak{h}^*$ by
$$\langle\varphi, \psi\rangle := B(H_\varphi, H_\psi). \tag{4.4}$$
Thus we see that $\varphi(H_\psi) = \psi(H_\varphi) = \langle\varphi, \psi\rangle$. Note, however, that this is not an inner product on $\mathfrak{h}^*$, since it is bilinear and symmetric rather than conjugate symmetric (recall that $\mathfrak{h}^*$ is a complex vector space). At the end of this section we restrict it to a real subspace on which it is an inner product.

Our next objective is to break $\mathfrak{g}$ up into copies of $\mathfrak{sl}(2, \mathbb{C})$, whose representation theory we have already studied. The first result states that the only choice in picking $X_\alpha$ is the choice of a constant.

Proposition 4.12. Let $\mathfrak{g}$ be semisimple. If $\alpha$ is a root, then $\dim\mathfrak{g}_\alpha = 1$, and the only integer multiples of $\alpha$ in $R$ are $\pm\alpha$.

Proof. As before we pick $X_\alpha \in \mathfrak{g}_\alpha$ and $Y_\alpha \in \mathfrak{g}_{-\alpha}$ with $B(X_\alpha, Y_\alpha) = 1$, so that $[X_\alpha, Y_\alpha] = H_\alpha$. Now define
$$\tilde{\mathfrak{g}} := \mathbb{C}X_\alpha \oplus \mathbb{C}H_\alpha \oplus \bigoplus_{n < 0} \mathfrak{g}_{n\alpha}.$$
We proceed as in the previous proof: we show that this space is invariant under $\operatorname{ad}H_\alpha$, $\operatorname{ad}X_\alpha$ and $\operatorname{ad}Y_\alpha$, and calculate the trace of $\operatorname{ad}H_\alpha = \operatorname{ad}X_\alpha\operatorname{ad}Y_\alpha - \operatorname{ad}Y_\alpha\operatorname{ad}X_\alpha$ on it, obtaining an equation which yields the desired result.

$\tilde{\mathfrak{g}}$ is obviously $\operatorname{ad}H_\alpha$-invariant. As $[X_\alpha, X_\alpha] = 0$ and $[X_\alpha, H_\alpha] = -\alpha(H_\alpha)X_\alpha$, and since $\operatorname{ad}X_\alpha$ maps $\mathfrak{g}_{n\alpha}$ into $\mathfrak{g}_{(n+1)\alpha}$ (which for $n = -1$ lands in $\mathbb{C}H_\alpha$ by 5) of Lemma 4.11), it is also $\operatorname{ad}X_\alpha$-invariant. Finally, as $[Y_\alpha, X_\alpha] = -H_\alpha$ and $[Y_\alpha, H_\alpha] = \alpha(H_\alpha)Y_\alpha \in \mathfrak{g}_{-\alpha}$, and since $\operatorname{ad}Y_\alpha$ maps $\mathfrak{g}_{n\alpha}$ into $\mathfrak{g}_{(n-1)\alpha}$, it is also $\operatorname{ad}Y_\alpha$-invariant.

Now we calculate the trace. As before, $\operatorname{ad}H_\alpha$ restricted to $\tilde{\mathfrak{g}}$ is a commutator and as such has zero trace. On the other hand, $\operatorname{ad}H_\alpha$ acts on the summands of $\tilde{\mathfrak{g}}$ by the eigenvalues $\alpha(H_\alpha)$, $0$ and $n\alpha(H_\alpha)$ respectively, and therefore the trace is
$$\operatorname{Tr}\big(\operatorname{ad}H_\alpha|_{\tilde{\mathfrak{g}}}\big) = \alpha(H_\alpha) - \sum_{n=1}^{\infty} n\,\alpha(H_\alpha)\dim\mathfrak{g}_{-n\alpha}.$$
Equating this with zero yields $\sum_{n=1}^\infty n\dim\mathfrak{g}_{-n\alpha} = 1$ (recall that $\alpha(H_\alpha) \neq 0$). Since $\mathfrak{g}_{-\alpha}$ is non-trivial, we must have $\dim\mathfrak{g}_{-\alpha} = 1$ and $\dim\mathfrak{g}_{-n\alpha} = 0$ for $n > 1$. Since $n\alpha \in R$ if and only if $-n\alpha \in R$ (cf. Lemma 4.11 3)), the claim is proved.

In addition to the formula (2.3) we have the following relatively simple formula for computing the Killing form when restricted to a Cartan subalgebra. This is of particular interest because of Lemma 4.11, which says that $B$, due to non-degeneracy, provides an isomorphism from $\mathfrak{h}$ to its dual.

Corollary 4.13. Restricted to $\mathfrak{h} \times \mathfrak{h}$ the Killing form is given by
$$B(H, H') = \sum_{\alpha \in R} \alpha(H)\alpha(H').$$
Proof. Pick a basis $\{H_1, \ldots, H_k\}$ for $\mathfrak{h}$, and pick for each root $\alpha$ a non-zero element $X_\alpha \in \mathfrak{g}_\alpha$. Then the set $\{H_1, \ldots, H_k, X_{\alpha_1}, \ldots, X_{\alpha_N}\}$ is a basis for $\mathfrak{g}$. The operator $\operatorname{ad}H\operatorname{ad}H'$ acts by the eigenvalue $0$ on $H_1, \ldots, H_k$ and by the eigenvalue $\alpha(H)\alpha(H')$ on $X_\alpha$. Hence
$$B(H, H') = \operatorname{Tr}(\operatorname{ad}H\operatorname{ad}H') = \sum_{\alpha \in R} \alpha(H)\alpha(H').$$

With this at hand we are actually capable of computing the Killing form; cf. Example 4.26 in the next section.

As observed in the proof of Proposition 4.12, we have for each root a triple $\{X_\alpha, Y_\alpha, H_\alpha\}$ satisfying
$$[H_\alpha, X_\alpha] = \alpha(H_\alpha)X_\alpha, \quad [H_\alpha, Y_\alpha] = -\alpha(H_\alpha)Y_\alpha, \quad [X_\alpha, Y_\alpha] = H_\alpha.$$
Thus they span a 3-dimensional complex subalgebra. If we normalize, i.e. define
$$H'_\alpha = \frac{2H_\alpha}{\alpha(H_\alpha)}, \qquad X'_\alpha = \frac{2X_\alpha}{\alpha(H_\alpha)}, \qquad Y'_\alpha = Y_\alpha,$$
then we see that
$$[H'_\alpha, X'_\alpha] = 2X'_\alpha, \qquad [H'_\alpha, Y'_\alpha] = -2Y'_\alpha, \qquad [X'_\alpha, Y'_\alpha] = H'_\alpha,$$
i.e. the triple $\{X'_\alpha, Y'_\alpha, H'_\alpha\}$ spans a Lie algebra isomorphic to $\mathfrak{sl}(2, \mathbb{C})$. Therefore we write $\mathfrak{sl}_\alpha := \operatorname{span}\{X'_\alpha, Y'_\alpha, H'_\alpha\}$.

Let $\alpha \in R$ and $\beta \in R \cup \{0\}$. By the $\alpha$-string containing $\beta$ we understand the set of elements of $R \cup \{0\}$ of the form $\beta + n\alpha$ for $n \in \mathbb{Z}$. This is also called a root string.

Proposition 4.14. Let $\alpha \in R$ and $\beta \in R \cup \{0\}$.
1) The $\alpha$-string containing $\beta$ is of the form
$$\{\beta + n\alpha \mid -p \leq n \leq q\} \tag{4.5}$$
for fixed $p, q \in \mathbb{N}_0$; that is, there are no gaps in the string. Furthermore
$$p - q = \frac{2\langle\beta, \alpha\rangle}{\langle\alpha, \alpha\rangle}, \quad\text{i.e. } \frac{2\langle\beta, \alpha\rangle}{\langle\alpha, \alpha\rangle} \text{ is an integer.}$$
2) If $\beta$ is not an integer multiple of $\alpha$, then we define a representation $\rho$ of $\mathfrak{sl}_\alpha$ on $\tilde{\mathfrak{g}} := \bigoplus_{n \in \mathbb{Z}} \mathfrak{g}_{\beta+n\alpha}$ by restriction of $\operatorname{ad}_{\mathfrak{g}}$ to $\mathfrak{sl}_\alpha$. This representation is irreducible.

Proof. As we saw in Proposition 4.12, the only integer multiples of $\alpha$ which are in $R$ are $\pm\alpha$. Thus, if $\beta$ is a non-zero integer multiple of $\alpha$ then $\beta = \pm\alpha$, and the root string is $\{-\alpha, 0, \alpha\}$, so $(p, q) = (2, 0)$ for $\beta = \alpha$ and $(p, q) = (0, 2)$ for $\beta = -\alpha$; if $\beta = 0$ the string is again $\{-\alpha, 0, \alpha\}$ with $(p, q) = (1, 1)$. In each case it is readily checked that 1) holds.

Under the assumption that $\beta$ is not an integer multiple of $\alpha$ we check 1) and 2) simultaneously. It is evident that $\operatorname{ad}H'_\alpha$ acts diagonally on $\tilde{\mathfrak{g}}$, with eigenvalues
$$(\beta + n\alpha)(H'_\alpha) = \frac{2(\beta + n\alpha)(H_\alpha)}{\alpha(H_\alpha)} = \frac{2\langle\beta, \alpha\rangle}{\langle\alpha, \alpha\rangle} + 2n.$$
Thus any $\operatorname{ad}H'_\alpha$-invariant subspace of $\tilde{\mathfrak{g}}$ is a sum of $\mathfrak{g}_{\beta+n\alpha}$-spaces. Since $\rho$-invariant subspaces are in particular $\operatorname{ad}H'_\alpha$-invariant, this holds for $\rho$-invariant subspaces as well. Now let $V \subseteq \tilde{\mathfrak{g}}$ be a $\rho$-invariant subspace on which $\rho$ is irreducible, and let $-p$ and $q$ be the smallest resp. greatest integer $n$ such that
$\mathfrak{g}_{\beta+n\alpha}$ is a summand in $V$. From the representation theory of $\mathfrak{sl}(2, \mathbb{C})$ we know that $V$ is the unique irreducible representation of $\mathfrak{sl}_\alpha$ of dimension $\dim V$, and thus that the eigenvalues of $\operatorname{ad}H'_\alpha$ on $V$ are
$$-\dim V + 1,\ -\dim V + 3,\ \ldots,\ \dim V - 3,\ \dim V - 1.$$
Comparing with the eigenvalues above we see that there can be no gaps, i.e. the $\alpha$-string contains the roots in (4.5). Furthermore we see that
$$\dim V - 1 = \frac{2\langle\beta, \alpha\rangle}{\langle\alpha, \alpha\rangle} + 2q \quad\text{and}\quad 1 - \dim V = \frac{2\langle\beta, \alpha\rangle}{\langle\alpha, \alpha\rangle} - 2p.$$
Adding these two equations yields $p - q = 2\langle\beta, \alpha\rangle/\langle\alpha, \alpha\rangle$.

By Weyl's Theorem $\rho$ can be decomposed into irreducible representations. Assume that $V'$ is an irreducible summand in $\tilde{\mathfrak{g}}$ different from $V$, and let again $-p'$ and $q'$ be the smallest resp. greatest $n$ such that $\mathfrak{g}_{\beta+n\alpha}$ is a summand in $V'$. As before, $p' - q' = p - q$. Since $V \cap V' = \{0\}$ and the spaces $\mathfrak{g}_{\beta+n\alpha}$ are 1-dimensional, the two index ranges $[-p, q]$ and $[-p', q']$ must be disjoint: either $-p' > q$ or $q' < -p$. Assuming the first, $-p' > q \geq -p$ gives $p' < p$, and then $p' - q' = p - q$ gives $q' < q < -p'$, contradicting $-p' \leq q'$. The assumption $q' < -p$ leads to a contradiction in the same way. Thus there cannot exist irreducible summands in $\tilde{\mathfrak{g}}$ other than $V$, and consequently $\tilde{\mathfrak{g}} = V$, so the representation is irreducible. Hence $\tilde{\mathfrak{g}}$ is the direct sum of the $\mathfrak{g}_{\beta+n\alpha}$ with $-p \leq n \leq q$, and thus (4.5) is indeed equal to the $\alpha$-string.

The last thing we need to see is that $p$ and $q$ are non-negative. If $\beta = 0$ then $\pm\alpha$ is in the root string, and hence $p, q \geq 1$. If $\beta \neq 0$ then $\beta$ is in the root string, and therefore $p, q \geq 0$. Thus in any case $p$ and $q$ are non-negative.

We can use this to elaborate on the result of Corollary 4.10:

Corollary 4.15. If $\alpha, \beta \in R \cup \{0\}$ and $\alpha + \beta \neq 0$, then $[\mathfrak{g}_\alpha, \mathfrak{g}_\beta] = \mathfrak{g}_{\alpha+\beta}$.

Proof. Since we cannot have $\alpha = \beta = 0$, we may assume $\alpha \neq 0$. If, at first, $\beta = n\alpha$, we only have two possibilities for $n$: $0$ and $1$ (we cannot have $\beta = -\alpha$). If $\beta = 0$, then $\mathfrak{g}_\beta = \mathfrak{h}$, and by definition of $\mathfrak{g}_\alpha$: $[\mathfrak{g}_\alpha, \mathfrak{h}] = \mathfrak{g}_\alpha = \mathfrak{g}_{\alpha+\beta}$. If $\beta = \alpha$, then $\mathfrak{g}_{\alpha+\beta} = \mathfrak{g}_{2\alpha} = \{0\}$, and the equality follows immediately from Corollary 4.10.

Assuming $\beta$ is not an integer multiple of $\alpha$, we use the preceding proposition. For each $n$ between $-p$ and $q$ we have a spanning vector $X_{\beta+n\alpha}$ of $\mathfrak{g}_{\beta+n\alpha}$. These correspond to the vectors $v_0, \ldots, v_N$ we found when we studied $\mathfrak{sl}(2, \mathbb{C})$, since $\operatorname{ad}X'_\alpha$ maps $X_{\beta+n\alpha}$ to a multiple of $X_{\beta+(n+1)\alpha}$. The only one of these listed elements which is mapped to $0$ by $\operatorname{ad}X'_\alpha$ is $X_{\beta+q\alpha}$. Now, if $[X_\alpha, X_\beta] \neq 0$ then $\mathfrak{g}_{\alpha+\beta} \neq \{0\}$, i.e. $\alpha + \beta \in R$, and since the root spaces are 1-dimensional we must have the desired equality. If $[X_\alpha, X_\beta] = 0$, then we must have $q = 0$ (for $X_{\beta+q\alpha}$ was the only element mapped to $0$ by $\operatorname{ad}X_\alpha$). But that means that $\beta + \alpha$ is not in the root string, i.e. $\beta + \alpha \notin R$, so that $\mathfrak{g}_{\alpha+\beta} = \{0\}$. Thus we must have equality.

Corollary 4.16. Let $\alpha, \beta \in R$ and assume them not to be multiples of each other. Let $p$ and $q$ be the integers from Proposition 4.14. Then for $X_\alpha \in \mathfrak{g}_\alpha$, $Y_\alpha \in \mathfrak{g}_{-\alpha}$ and $X_\beta \in \mathfrak{g}_\beta$:
$$[Y_\alpha, [X_\alpha, X_\beta]] = \frac{q(1 + p)}{2}\,\alpha(H_\alpha)B(X_\alpha, Y_\alpha)X_\beta. \tag{4.6}$$

Proof. Both sides equal $0$ if one of $X_\alpha$, $Y_\alpha$ and $X_\beta$ is $0$, so let us assume them to be non-zero. Since the root spaces are 1-dimensional (Proposition 4.12) and both sides scale in the same way, it is enough to verify the formula for one choice of $X_\alpha$ and $Y_\alpha$; we take $X_\alpha = X'_\alpha$ and $Y_\alpha = Y'_\alpha$, so that $B(X_\alpha, Y_\alpha) = 2/\alpha(H_\alpha)$ and the formula we want to verify reads
$$[Y_\alpha, [X_\alpha, X_\beta]] = q(1 + p)X_\beta.$$
Comparing with the situation for $\mathfrak{sl}(2, \mathbb{C})$, we have (up to constants) that $v_0$ corresponds to $X_{\beta+q\alpha}$, $v_1$ to $X_{\beta+(q-1)\alpha}$, and $v_N$ to $X_{\beta-p\alpha}$ (where $N = \dim\tilde{\mathfrak{g}} - 1 = p + q$), and $\operatorname{ad}Y_\alpha$ maps $X_{\beta+n\alpha}$ to $X_{\beta+(n-1)\alpha}$ (up to a scale factor). Therefore $X_\beta = c(\operatorname{ad}Y_\alpha)^q X_{\beta+q\alpha}$ for some constant $c$, i.e. $X_\beta$ corresponds to $v_q$. Hence, by Theorem 3.8 (with $m = N + 1$),
$$[Y_\alpha, [X_\alpha, X_\beta]] = (\operatorname{ad}Y_\alpha)(\operatorname{ad}X_\alpha)X_\beta = q(N - q + 1)X_\beta = q(1 + p)X_\beta.$$
This proves (4.6).

By a real form of a complex vector space $W$ we shall understand a real subspace $V \subseteq W$ such that $W = V \oplus iV$, i.e. $W$ is the complexification $V_{\mathbb{C}}$ of $V$.

Proposition 4.17. Let $\mathfrak{g}$ be a complex semisimple Lie algebra and $R(\mathfrak{g}, \mathfrak{h})$ the set of roots w.r.t. a Cartan subalgebra $\mathfrak{h}$. If $\mathfrak{h}_0$ denotes the $\mathbb{R}$-linear span of the co-roots, then $\mathfrak{h}_0$ is a real form of the vector space $\mathfrak{h}$, and all the roots are real-valued on $\mathfrak{h}_0$.

Proof. Let $\alpha$ be a root. By Corollary 4.13 we have
$$\langle\alpha, \alpha\rangle = B(H_\alpha, H_\alpha) = \sum_{\beta \in R} \beta(H_\alpha)\beta(H_\alpha) = \sum_{\beta \in R} \langle\beta, \alpha\rangle^2.$$
From Proposition 4.14 we get for each root $\beta$ integers $p_\beta$ and $q_\beta$ associated with the $\alpha$-string containing $\beta$, and these satisfy $\langle\beta, \alpha\rangle = \frac{1}{2}(p_\beta - q_\beta)\langle\alpha, \alpha\rangle$. Substituting this into the expression above yields
$$\langle\alpha, \alpha\rangle = \sum_{\beta \in R} \tfrac{1}{4}(p_\beta - q_\beta)^2\langle\alpha, \alpha\rangle^2, \quad\text{and thereby}\quad \frac{4}{\langle\alpha, \alpha\rangle} = \sum_{\beta \in R}(p_\beta - q_\beta)^2.$$
Hence $\langle\alpha, \alpha\rangle$ is a rational number. Since $\langle\beta, \alpha\rangle$ is a rational multiple of $\langle\alpha, \alpha\rangle$ (cf. Lemma 4.11), also $\langle\beta, \alpha\rangle$ is rational.

By Lemma 4.11 7) the roots span $\mathfrak{h}^*$, and by non-degeneracy of the Killing form the co-roots then span $\mathfrak{h}$. Let $n$ be the complex dimension of $\mathfrak{h}$; we can therefore find co-roots $H_{\alpha_1}, \ldots, H_{\alpha_n}$ which constitute a basis for $\mathfrak{h}$. Let $\omega_1, \ldots, \omega_n$ denote the corresponding dual basis for $\mathfrak{h}^*$, and set $V := \operatorname{span}_{\mathbb{R}}\{\omega_1, \ldots, \omega_n\}$. As $\{\omega_1, \ldots, \omega_n\}$ is a complex basis for $\mathfrak{h}^*$, $V$ is a real form of $\mathfrak{h}^*$. Furthermore all the roots lie in $V$: indeed, let $\alpha$ be a root and write $\alpha = \sum_{i=1}^n c_i\omega_i$ with $c_i \in \mathbb{C}$; then
$$\alpha(H_{\alpha_j}) = \sum_{i=1}^n c_i\,\omega_i(H_{\alpha_j}) = c_j,$$
and by the arguments above $c_j = \langle\alpha, \alpha_j\rangle$ is rational, in particular real. Thus the roots lie in $V$, and they even span it, being a $\mathbb{C}$-spanning subset of the real form $V$.

Now put $\mathfrak{h}_0 := \operatorname{span}_{\mathbb{R}}\{H_\alpha \mid \alpha \in R\}$. We have an isomorphism $\mathfrak{h}^* \to \mathfrak{h}$ given by $\varphi \mapsto H_\varphi$. This isomorphism obviously maps $V$ onto $\mathfrak{h}_0$; thus $\mathfrak{h}_0$ is a real form of $\mathfrak{h}$. For $H = \sum_{i=1}^n c_iH_{\alpha_i} \in \mathfrak{h}_0$ the coefficients $c_i$ are real, and therefore
$$\alpha(H) = \sum_{i=1}^n c_i\,\alpha(H_{\alpha_i}),$$
which is real, since each $\alpha(H_{\alpha_i})$ is rational, as proved above.

Restricted to the real subspace $\mathfrak{h}_0^* := V = \operatorname{span}_{\mathbb{R}}R$ of $\mathfrak{h}^*$ (the real form corresponding to $\mathfrak{h}_0$ under $\varphi \mapsto H_\varphi$), the bilinear form $\langle\cdot,\cdot\rangle$ introduced in (4.4) (which was, in some sense, just the dual of the Killing form) is a genuine inner product. Indeed it is positive definite, for if $\varphi \in \mathfrak{h}_0^*$ is non-zero then
$$\langle\varphi, \varphi\rangle = B(H_\varphi, H_\varphi) = \sum_{\alpha \in R} \alpha(H_\varphi)^2,$$
and this is strictly positive: each $\alpha(H_\varphi)$ is real since $H_\varphi \in \mathfrak{h}_0$, and they cannot all vanish, for then every root would vanish on $H_\varphi$, and as the roots span $\mathfrak{h}^*$ this would force $H_\varphi = 0$, i.e. $\varphi = 0$.

Lemma 4.18. The space $\mathfrak{h}_0^*$ equipped with the bilinear form (4.4) is a real inner product space.

Finally we introduce a set of orthogonal transformations known as root reflections. Let $\alpha$ be a root and define the map $s_\alpha : \mathfrak{h}_0^* \to \mathfrak{h}_0^*$ by
$$s_\alpha(\varphi) = \varphi - \frac{2\langle\varphi, \alpha\rangle}{\langle\alpha, \alpha\rangle}\alpha.$$
The action of $s_\alpha$ on $\varphi$ is simply that it reflects $\varphi$ in the hyperplane orthogonal to $\alpha$, hence the name. Clearly it is an orthogonal linear map, for $s_\alpha(\alpha) = -\alpha$, whereas for $\varphi \in \{\alpha\}^\perp$ we have $s_\alpha(\varphi) = \varphi$. Thus, by choosing a proper orthonormal basis, the matrix of $s_\alpha$ is $\operatorname{diag}(-1, 1, \ldots, 1)$, i.e. $s_\alpha$ is orthogonal.

Proposition 4.19. The root reflections map $R$ into $R$.

Proof. For $\alpha, \beta \in R$ we see (by Proposition 4.14) that
$$s_\alpha(\beta) = \beta - \frac{2\langle\beta, \alpha\rangle}{\langle\alpha, \alpha\rangle}\alpha = \beta - (p - q)\alpha = \beta + (q - p)\alpha,$$
where $p$ and $q$ are the integers given by Proposition 4.14. Since $-p \leq q - p \leq q$, we see that $\beta + (q - p)\alpha$ is in the $\alpha$-string containing $\beta$, and hence it is in $R \cup \{0\}$. But as $\beta \neq 0$, also $s_\alpha(\beta) \neq 0$, i.e. $s_\alpha(\beta) \in R$.

4.3 Abstract Root Systems

To be able to handle root systems in more abstract terms we introduce the notion of an abstract root system, and in the next section we define the Weyl group. The purpose of these sections is not to give a complete exposition of the vast theory of abstract root systems and the classification of semisimple Lie algebras; rather, it is intended to serve as a tool box providing the results necessary to prove the Highest Weight Theorem in the next chapter.

Definition 4.20 (Abstract Root System). Let $V$ be a real finite-dimensional inner product space. By an abstract root system we understand a finite set $R \subseteq V$ of non-zero elements satisfying:
1) $R$ spans $V$.
2) $R$ is invariant under root reflections, i.e. if $\alpha, \beta \in R$ then
$$s_\alpha(\beta) := \beta - \frac{2\langle\beta, \alpha\rangle}{\|\alpha\|^2}\alpha \in R.$$
3) If $\alpha, \beta \in R$ then $\dfrac{2\langle\beta, \alpha\rangle}{\|\alpha\|^2}$ is an integer.

Elements of $R$ are called roots. The dimension of $V$ is called the rank of the root system. The root system is called reduced if $\alpha \in R$ implies $2\alpha \notin R$. An element $\alpha \in R$ is called reduced if $\frac{1}{2}\alpha \notin R$.

Phrased in this new language, a root system $R(\mathfrak{g}, \mathfrak{h})$ of a complex semisimple Lie algebra w.r.t. a Cartan subalgebra is an abstract reduced root system in $\mathfrak{h}_0^*$ (with the inner product given by the dual of the Killing form): 1) follows from
Lemma 4.11 7), 2) follows from Proposition 4.19, 3) follows from Proposition 4.14, and finally $R(\mathfrak{g}, \mathfrak{h})$ is reduced as a consequence of Proposition 4.12.

Let $R \subseteq V$ and $R' \subseteq V'$ be two abstract root systems. We say that the two root systems are isomorphic if there is an orthogonal linear map $\varphi : V \to V'$ which maps $R$ onto $R'$. A root system $R$ is called reducible if there is a decomposition $R = R' \cup R''$ where every element of $R'$ is orthogonal to all the elements of $R''$. If this is not possible, the root system is called irreducible. It is a non-trivial fact that the root system of a complex semisimple Lie algebra is irreducible if and only if the Lie algebra is simple.

In the next lemma we present some basic facts about abstract root systems.

Lemma 4.21. Let $R \subseteq V$ be an abstract root system.
1) If $\alpha \in R$, then $-\alpha \in R$.
2) If $\alpha \in R$, then $0, \pm\frac{1}{2}\alpha, \pm\alpha, \pm 2\alpha$ are the only possible elements of $R \cup \{0\}$ which are proportional to $\alpha$. If $\alpha$ is reduced, only $0$, $\pm\alpha$ and $\pm 2\alpha$ are possible. If $R$ is reduced, then only $0$ and $\pm\alpha$ are possible.
3) If $\alpha \in R$ and $\beta \in R \cup \{0\}$, then the integer $\frac{2\langle\beta, \alpha\rangle}{\|\alpha\|^2}$ equals $0$, $\pm 1$, $\pm 2$, $\pm 3$ or $\pm 4$. The possibility $\pm 4$ only occurs when $R$ is non-reduced and $\beta = \pm 2\alpha$.
4) If $\alpha, \beta \in R$ are non-proportional and $\|\alpha\| \leq \|\beta\|$, then $\frac{2\langle\beta, \alpha\rangle}{\|\beta\|^2}$ equals $0$ or $\pm 1$.
5) Let $\alpha, \beta \in R$. If $\langle\alpha, \beta\rangle > 0$, then $\alpha - \beta \in R \cup \{0\}$. If $\langle\alpha, \beta\rangle < 0$, then $\alpha + \beta \in R \cup \{0\}$.

Proof. 1) This follows since
$$s_\alpha(\alpha) = \alpha - \frac{2\langle\alpha, \alpha\rangle}{\|\alpha\|^2}\alpha = -\alpha.$$
2) By 1), $0$ and $\pm\alpha$ are elements of $R \cup \{0\}$ proportional to $\alpha$, and if $R$ is non-reduced then also $\pm\frac{1}{2}\alpha$ and $\pm 2\alpha$ are possible. Now assume $c\alpha \in R$ for some non-zero real number $c$. Then
$$\frac{2\langle c\alpha, \alpha\rangle}{\|\alpha\|^2} = 2c \quad\text{and}\quad \frac{2\langle\alpha, c\alpha\rangle}{\|c\alpha\|^2} = \frac{2}{c}$$
are integers. The last condition says that $c$ is of the form $c = \frac{2}{n}$ for some $n \in \mathbb{Z} \setminus \{0\}$, whereas the first condition further reduces the possible values to $-2, -1, -\frac{1}{2}, \frac{1}{2}, 1, 2$. If $\alpha$ is reduced, $c = \pm\frac{1}{2}$ cannot occur.
3) If $\beta = 0$ then $\frac{2\langle\beta, \alpha\rangle}{\|\alpha\|^2} = 0$, so assume $\beta \neq 0$. From the Cauchy–Schwarz inequality we have
$$\left|\frac{2\langle\beta, \alpha\rangle}{\|\alpha\|^2}\right|\left|\frac{2\langle\alpha, \beta\rangle}{\|\beta\|^2}\right| = \frac{4\langle\alpha, \beta\rangle^2}{\|\alpha\|^2\|\beta\|^2} \leq 4.$$
If $\frac{2\langle\beta, \alpha\rangle}{\|\alpha\|^2}$ is non-zero, then so is $\frac{2\langle\alpha, \beta\rangle}{\|\beta\|^2}$, and as their product has absolute value at most $4$, both integers have absolute value at most $4$. If $\alpha$ and $\beta$ are non-proportional, the inequality is strict, hence the integers have absolute value at most $3$. Therefore, for $\pm 4$ to occur we must have $\beta = c\alpha$, and from 2) we see that the only possibility is $\beta = \pm 2\alpha$. This, of course, can only happen if $R$ is non-reduced.
4) Again, since the inequality is strict for non-proportional roots, we get
$$\left|\frac{2\langle\beta, \alpha\rangle}{\|\beta\|^2}\right| = \frac{2|\langle\beta, \alpha\rangle|}{\|\beta\|^2} < \frac{2\|\beta\|\,\|\alpha\|}{\|\beta\|^2} = \frac{2\|\alpha\|}{\|\beta\|} \leq 2,$$
and so the conclusion follows.
5) If $\alpha$ and $\beta$ are proportional, i.e. $\alpha = c\beta$, then from 2) we know that $c$ can only be $\pm\frac{1}{2}$, $\pm 1$ or $\pm 2$. If $\langle\alpha, \beta\rangle < 0$ then $c < 0$. For $c = -2$ we have $\alpha + \beta = -\beta \in R$. If $c = -1$, then $\alpha + \beta = 0 \in R \cup \{0\}$, and if $c = -\frac{1}{2}$, then $\alpha + \beta = \frac{1}{2}\beta = -\alpha \in R$. For $\langle\alpha, \beta\rangle > 0$ we argue in exactly the same way. Now suppose $\alpha$ and $\beta$ are non-proportional. W.l.o.g. we may assume $\|\alpha\| \leq \|\beta\|$; then by 4) $\frac{2\langle\alpha, \beta\rangle}{\|\beta\|^2}$ is either $0$ or $\pm 1$. If $\langle\alpha, \beta\rangle > 0$ then we must have $\frac{2\langle\alpha, \beta\rangle}{\|\beta\|^2} = 1$, and consequently $s_\beta(\alpha) = \alpha - \beta \in R$. Conversely, if $\langle\alpha, \beta\rangle < 0$ then $\frac{2\langle\alpha, \beta\rangle}{\|\beta\|^2} = -1$ and $s_\beta(\alpha) = \alpha + \beta \in R$.

For the rest of this section we will introduce the two closely connected notions of a positive system and a fundamental system for a root system in $V$, and show how they are related. For this purpose we divide $V$ into half spaces: by an open half space of $V$ we mean a subset of $V$ of the form $\{v \in V \mid \langle v, \xi\rangle > 0\}$, where $\xi$ is some fixed non-zero vector.

Definition 4.22 (Positive System). Let $R$ be a root system in $V$. By a positive system, or a system of positive roots, we understand a subset $R^+ \subseteq R$ satisfying:
1) there exists an open half space in $V$ containing $R^+$;
2) $R = R^+ \cup (-R^+)$.
The elements of $R^+$ are called positive roots.

If $R$ is a root system and $R^+$ a system of positive roots, then a root is called simple if it is positive and cannot be expressed as a sum of positive roots.

Lemma 4.23. If $\alpha$ and $\beta$ are distinct simple roots, then $\alpha - \beta$ is not a root. Consequently $\langle\alpha, \beta\rangle \leq 0$. Geometrically this amounts to saying that the angle between $\alpha$ and $\beta$ is at least $\pi/2$.

Proof. Assume $\alpha - \beta$ to be a root. If it is positive, then $\alpha = (\alpha - \beta) + \beta$ is a sum of positive roots. If it is not positive, then $\beta - \alpha$ is positive and $\beta = (\beta - \alpha) + \alpha$ is a sum of positive roots. In either case we get a contradiction, so $\alpha - \beta$ is not a root. The last claim is an immediate consequence of Lemma 4.21 5).

Now for the related notion:

Definition 4.24 (Fundamental System). Let $R$ be a root system in $V$. A subset $\Pi \subseteq R$ is called a fundamental system (or a simple system, or a basis) for $R$ if
1) $\Pi$ is a basis for $V$;
2) $R \subseteq \mathbb{N}_0\Pi \cup (-\mathbb{N}_0\Pi)$.
An equivalent formulation of the last condition is that any root is expressible as a linear combination $\alpha = \sum n_i\alpha_i$, where $\alpha_i \in \Pi$ and the coefficients $n_i$ are all either non-negative or all non-positive integers. If $\alpha = \sum n_i\alpha_i$ is positive, then $\sum n_i$ is a positive integer called the level of $\alpha$ w.r.t. the positive system $R^+$.
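Definitions 4.22 and 4.24 are easy to experiment with numerically. The sketch below (Python/numpy; the vector xi and the helper names are my own choices, not the text's) takes the root system $\{e_i - e_j\}$ of Example 4.26 below, selects the positive roots by a half space as in Definition 4.22, and extracts the simple roots as those positive roots which are not sums of two positive roots.

\begin{verbatim}
import numpy as np

# The A2 root system realized in {x in R^3 : x_1 + x_2 + x_3 = 0}:
# the six roots e_i - e_j, i != j (cf. Example 4.26).
e = np.eye(3)
roots = [e[i] - e[j] for i in range(3) for j in range(3) if i != j]

# A half space: alpha is positive iff <alpha, xi> > 0.  This xi
# reproduces the ordering of Example 4.26 (e_i - e_j positive, i < j).
xi = np.array([3.0, 1.0, -4.0])
positive = [a for a in roots if a @ xi > 0]

# Simple roots: positive roots that are not sums of two positive roots.
simple = [a for a in positive
          if not any(np.allclose(a, b + c)
                     for b in positive for c in positive)]

print("positive:", [p.astype(int).tolist() for p in positive])
print("simple:  ", [s.astype(int).tolist() for s in simple])
# positive = {e1-e2, e1-e3, e2-e3}, simple = {e1-e2, e2-e3}.
\end{verbatim}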
Proposition 4.25. Let $R$ be a root system in $V$. If $R^+$ is a positive system, then the set $\Pi$ of simple roots is a fundamental system. Conversely, if $\Pi$ is a fundamental system, then $R^+ := \mathbb{N}_0\Pi \cap R$ is a positive system.

Proof. We show first that $\Pi$ is a fundamental system. Let $\alpha$ be a positive root. Either $\alpha$ is a simple root or it can be decomposed as $\alpha = \beta + \gamma$ with $\beta, \gamma$ positive. Now $\beta$ and $\gamma$ are either simple or can in turn be decomposed into sums of positive roots, and so on, until $\alpha$ is written as an $\mathbb{N}_0$-linear combination of simple roots. Thus $R^+ \subseteq \mathbb{N}_0\Pi$, and therefore also $R = R^+ \cup (-R^+) \subseteq \mathbb{N}_0\Pi \cup (-\mathbb{N}_0\Pi)$; as $R$ spans $V$, $\Pi$ is a spanning set.

To see that $\Pi$ is linearly independent: $\Pi$ is contained in an open half space, meaning that there exists a vector $\xi$ with $\langle\xi, \alpha\rangle > 0$ for all $\alpha \in \Pi$. Assume $\sum_{\alpha \in \Pi} \lambda_\alpha\alpha = 0$, put $\Pi_\pm := \{\alpha \in \Pi \mid \pm\lambda_\alpha > 0\}$ and
$$e_\pm := \sum_{\alpha \in \Pi_\pm} |\lambda_\alpha|\,\alpha,$$
where $e_\pm$ is defined to be $0$ if $\Pi_\pm$ is empty. We see that $e_+ = e_-$. Since $\Pi_+$ and $\Pi_-$ are disjoint, Lemma 4.23 gives $\langle\alpha, \beta\rangle \leq 0$ for $\alpha \in \Pi_+$ and $\beta \in \Pi_-$. From this it follows that
$$\langle e_+, e_+\rangle = \langle e_+, e_-\rangle \leq 0$$
(as $\langle e_+, e_-\rangle$ is a sum of terms $|\lambda_\alpha||\lambda_\beta|\langle\alpha, \beta\rangle$ with $\alpha \in \Pi_+$ and $\beta \in \Pi_-$), and consequently $e_+ = e_- = 0$. Therefore
$$0 = \langle\xi, e_\pm\rangle = \sum_{\alpha \in \Pi_\pm} |\lambda_\alpha|\langle\xi, \alpha\rangle.$$
Since each $\langle\xi, \alpha\rangle$ is strictly positive, we must have $\lambda_\alpha = 0$ for all $\alpha \in \Pi$. Hence $\Pi$ is a linearly independent set and is thus a fundamental system for $R$.

Now let $\Pi = \{\alpha_1, \ldots, \alpha_n\}$ be a fundamental system. To see that $R^+ := \mathbb{N}_0\Pi \cap R$ is contained in an open half space, note that since $\Pi$ is a basis of $V$ and $\langle\cdot,\cdot\rangle$ is non-degenerate, there is a $\xi \in V$ with $\langle\alpha_i, \xi\rangle = 1$ for all $i = 1, \ldots, n$. Then $\langle\alpha, \xi\rangle > 0$ for every $\alpha \in \mathbb{N}_0\Pi \setminus \{0\}$, hence $R^+$ is contained in an open half space. We obviously have $R^+ \cup (-R^+) \subseteq R$. Conversely, let $\alpha \in R$. By property 2) of a fundamental system, $\alpha$ is of the form $\alpha = \sum n_i\alpha_i$, where the $n_i$ are all either non-negative or all non-positive. But that means exactly that $\alpha \in R^+ \cup (-R^+)$.

Positive systems, and consequently also fundamental systems, do exist. One way of defining a positive system is by virtue of the so-called lexicographic ordering: let $R$ be a root system in $V$ and $\{v_1, \ldots, v_n\}$ a spanning set for $V$. We say that $\alpha \in R$ is positive if there exists $1 \leq k \leq n$ such that $\langle\alpha, v_i\rangle = 0$ for $i < k$ and $\langle\alpha, v_k\rangle > 0$. As one can check, this defines a positive system; consequently, any root system has a positive system and thus also a fundamental system.

Example 4.26. Let us consider the Lie algebra $\mathfrak{sl}(3, \mathbb{C})$. It is conceptually easy (though tedious) to verify that the subspace
$$\mathfrak{h} := \left\{\begin{pmatrix} h_1 & 0 & 0 \\ 0 & h_2 & 0 \\ 0 & 0 & h_3 \end{pmatrix} \;\middle|\; h_1, h_2, h_3 \in \mathbb{C},\ h_1 + h_2 + h_3 = 0\right\}$$
is in fact a Cartan subalgebra of $\mathfrak{sl}(3, \mathbb{C})$. Obviously it has dimension 2, and a basis is given by $H_1 := \operatorname{diag}(1, -1, 0)$ and $H_2 := \operatorname{diag}(0, 1, -1)$. Defining for $i = 1, 2, 3$ the functional $e_i \in \mathfrak{h}^*$ by
$$e_i\begin{pmatrix} h_1 & 0 & 0 \\ 0 & h_2 & 0 \\ 0 & 0 & h_3 \end{pmatrix} = h_i,$$
it can be seen that the roots of $\mathfrak{sl}(3, \mathbb{C})$ w.r.t. the Cartan subalgebra $\mathfrak{h}$ are exactly $e_i - e_j$ for $i \neq j$. This set of roots is given the ordering in which $e_i - e_j$ is positive when $i < j$, i.e. the positive roots are $e_1 - e_2$, $e_1 - e_3$ and $e_2 - e_3$. Since $e_1 - e_3 = (e_1 - e_2) + (e_2 - e_3)$, we see that $e_1 - e_2$ and $e_2 - e_3$ are the simple roots.

We now want to calculate the Killing form and the co-roots. Calculating the Killing form is easy thanks to Corollary 4.13 (in light of our notion of positive roots, the formula simplifies to $B(H, H') = 2\sum_{\alpha \in R^+}\alpha(H)\alpha(H')$), and we get for the basis elements:
$$\|H_1\|^2 = B(H_1, H_1) = 12, \qquad B(H_1, H_2) = -6, \qquad \|H_2\|^2 = B(H_2, H_2) = 12.$$
The co-root of $e_1 - e_2$ is the unique vector $H_{e_1-e_2}$ satisfying $(e_1 - e_2)(X) = B(H_{e_1-e_2}, X)$ for all $X \in \mathfrak{h}$. Writing $H_{e_1-e_2} = aH_1 + bH_2$, this condition with $X = H_1$ gives us the equation $2 = 12a - 6b$, and with $X = H_2$ it reads $-1 = -6a + 12b$. This system of equations has the solution $a = \frac{1}{6}$, $b = 0$; hence $H_{e_1-e_2} = \frac{1}{6}H_1$. In a similar fashion we calculate $H_{e_2-e_3} = \frac{1}{6}H_2$. Since $e_1 - e_3$ was just $(e_1 - e_2) + (e_2 - e_3)$, the corresponding co-root is $\frac{1}{6}(H_1 + H_2)$. The remaining three roots are just the negatives of these three, and hence we have calculated all the co-roots.

4.4 The Weyl Group

Definition 4.27 (Weyl Group). Let $R$ be a root system in $V$. The subgroup $W(R)$ (or just $W$ for short) of $O(V)$ generated by the root reflections $s_\alpha$ for $\alpha \in R$ is called the Weyl group of the root system. If $R$ is the root system of a Lie algebra $\mathfrak{g}$ w.r.t. a Cartan subalgebra $\mathfrak{h}$, we denote it $W(\mathfrak{g}, \mathfrak{h})$.

The Weyl group is a finite group: indeed, any element of the Weyl group maps $R$ to $R$, and if two elements agree on the roots, they agree on a spanning set and hence are equal. Since $R$ is finite, there are only finitely many orthogonal transformations of $V$ mapping $R$ into $R$, so the group is finite.

If $r$ is any orthogonal linear map on $V$, we see that
$$s_{r\alpha}(\varphi) = \varphi - \frac{2\langle\varphi, r\alpha\rangle}{\|r\alpha\|^2}r\alpha = r\Big(r^{-1}\varphi - \frac{2\langle r^{-1}\varphi, \alpha\rangle}{\|\alpha\|^2}\alpha\Big) = rs_\alpha(r^{-1}\varphi).$$
In particular, if $\beta = r\alpha$ then
$$s_\beta = rs_\alpha r^{-1}. \tag{4.7}$$

Lemma 4.28. Let $R$ be a root system and $\Pi = \{\alpha_1, \ldots, \alpha_n\}$ the set of simple roots. If $\alpha$ is a positive root proportional to $\alpha_i$ (i.e. $\alpha = \alpha_i$ or $\alpha = 2\alpha_i$), then $s_{\alpha_i}(\alpha) = -\alpha$; otherwise $s_{\alpha_i}(\alpha)$ is positive.

Proof. As $\alpha$ is positive, $\alpha = \sum_k n_k\alpha_k$ with $n_k \in \mathbb{N}_0$. If $\alpha = \alpha_i$ then of course $s_{\alpha_i}(\alpha) = -\alpha$. Similarly, if $\alpha = 2\alpha_i$, one calculates that $s_{\alpha_i}(\alpha) = -\alpha$. If $\alpha$ is not proportional to $\alpha_i$, then there exists a $j \neq i$ such that $n_j > 0$. But then the $j$'th coefficient in
$$s_{\alpha_i}(\alpha) = \sum_{k=1}^n n_k\alpha_k - \frac{2\langle\alpha, \alpha_i\rangle}{\|\alpha_i\|^2}\alpha_i$$
equals $n_j > 0$. Hence, by the defining property of $\Pi$, all coefficients are automatically non-negative, and thus $s_{\alpha_i}(\alpha)$ is a positive root.

Proposition 4.29. If $R$ is a root system and $\Pi = \{\alpha_1, \ldots, \alpha_n\}$ the set of simple roots, then $\{s_\alpha \mid \alpha \in \Pi\}$ is a set of generators for $W(R)$. Furthermore, if $\alpha \in R$ is reduced, then there exist $\alpha_j \in \Pi$ and $s \in W(R)$ such that $\alpha = s(\alpha_j)$.

Proof. We prove the theorem backwards. Let $W' \subseteq W$ be the subgroup generated by the root reflections of the simple roots. Let $\alpha = \sum n_i\alpha_i$ be a reduced root, and assume first that it is positive. We do induction on the level of $\alpha$. If the level of $\alpha$ equals 1, then $\alpha = \alpha_i$ for some $i$, and we can pick the identity element $\mathrm{id}$ of $W'$, so that $\alpha = \mathrm{id}(\alpha_i)$. Now let the level of $\alpha$ be strictly greater than 1 and assume the result to hold for every positive reduced root of strictly lower level. We have
$$0 < \|\alpha\|^2 = \sum_i n_i\langle\alpha, \alpha_i\rangle,$$
hence there must exist an index $i_0$ such that $\langle\alpha, \alpha_{i_0}\rangle > 0$. We cannot have $\alpha = \alpha_{i_0}$ (since the level is greater than 1), and we cannot have $\alpha = 2\alpha_{i_0}$ (since $\alpha$ is reduced). By the preceding lemma, $\beta := s_{\alpha_{i_0}}(\alpha)$ is then positive. But as
$$\beta = \sum_{i \neq i_0} n_i\alpha_i + \Big(n_{i_0} - \frac{2\langle\alpha, \alpha_{i_0}\rangle}{\|\alpha_{i_0}\|^2}\Big)\alpha_{i_0}$$
and $\langle\alpha, \alpha_{i_0}\rangle > 0$, $\beta$ is of strictly lower level than $\alpha$ (and $\beta$ is reduced, since $s_{\alpha_{i_0}}$ maps $R$ onto $R$). Hence the induction hypothesis gives $\beta = s(\alpha_j)$ with $s \in W'$, and $\alpha = s_{\alpha_{i_0}}^{-1}(\beta) = s_{\alpha_{i_0}}s(\alpha_j)$. Since $s_{\alpha_{i_0}}s \in W'$, we have the desired result for positive $\alpha$; for negative reduced $\alpha$, apply the above to $-\alpha = s(\alpha_j)$ and use $\alpha = s\,s_{\alpha_j}(\alpha_j)$.

Now let $\alpha \in R$ be arbitrary. We may assume $\alpha$ to be reduced, for if not, $\frac{1}{2}\alpha$ is reduced, and a quick calculation reveals that $s_\alpha = s_{\frac{1}{2}\alpha}$. Thus $\alpha = s(\alpha_j)$ for some $s \in W'$, and by (4.7), $s_\alpha = ss_{\alpha_j}s^{-1}$, hence $s_\alpha \in W'$. As all the generators of $W$ lie in $W'$, the two groups are equal.

Proposition 4.30. Let $R$ be a root system and $\Pi$ and $\Pi'$ two fundamental systems for $R$. Then there exists $s \in W(R)$ such that $\Pi' = s\Pi$.

Proof. Let $R^+$ and $R'^+$ denote the sets of positive roots w.r.t. $\Pi$ resp. $\Pi'$. Then $|R^+| = |R'^+| = \frac{1}{2}|R| =: q$, and of course $R^+ = R'^+$ if and only if $\Pi = \Pi'$. Now put $r := |R^+ \cap R'^+| = q - n$ for some $n \in \mathbb{N}_0$. We verify the claim of the proposition by induction over $n$. If $n = 0$ then $R^+ = R'^+$, consequently $\Pi = \Pi'$, and we can choose $s = \mathrm{id} \in W$. Assume then that $n > 0$, that is, $R^+ \neq R'^+$. As $\Pi$ generates $R^+$, we cannot have $\Pi \subseteq R'^+$, for then $R^+ \subseteq R'^+$ and hence they would be equal. Thus we can pick an $\alpha \in \Pi$ which is not in $R'^+$; but then $-\alpha \in R'^+$. If $\beta \in R^+ \cap R'^+$, then by Lemma 4.28 $s_\alpha(\beta) \in R^+$ (for, as $\alpha \notin R'^+$, $\beta$ cannot be proportional to $\alpha$), i.e. $s_\alpha(\beta) \in R^+ \cap s_\alpha(R'^+)$. Consequently $R^+ \cap s_\alpha(R'^+)$ contains at least $q - n$ elements. But also $\alpha = s_\alpha(-\alpha) \in R^+ \cap s_\alpha(R'^+)$. Thus $|R^+ \cap s_\alpha(R'^+)| \geq q - (n - 1)$, and so, as $s_\alpha(R'^+)$ has $s_\alpha\Pi'$ as its fundamental system, the induction hypothesis yields an element $s \in W$ with $s_\alpha\Pi' = s\Pi$, i.e. $\Pi' = s_\alpha s\Pi$.

Definition 4.31 (Dominant Element). An element $\lambda \in V$ is called dominant w.r.t. a fundamental system $\Pi$ if $\langle\lambda, \alpha\rangle \geq 0$ for all $\alpha \in \Pi$ (and hence also for all $\alpha \in R^+$).

Proposition 4.32. For any $\lambda \in V$ there is a fundamental system with respect to which $\lambda$ is dominant.
Proof. If $\lambda = 0$ then it is dominant w.r.t. any fundamental system, so we assume $\lambda \neq 0$. Put $v_1 := \lambda$ and complete this to an orthogonal basis $\{v_1, \ldots, v_n\}$ for $V$. Via the lexicographic ordering this yields a positive system and hence a fundamental system $\Pi = \{\alpha_1, \ldots, \alpha_n\}$. By definition we have, in particular, $\langle\alpha_i, v_1\rangle \geq 0$ for all $i$, and thus $\lambda = v_1$ is dominant w.r.t. $\Pi$.

Proposition 4.33. Let $R \subseteq V$ be a root system and $R^+$ a positive system with fundamental system $\Pi$. If $\lambda \in V$, then there exists $s \in W(R)$ such that $s(\lambda)$ is dominant w.r.t. $\Pi$.

Proof. Since $R$ has a positive system, it has a fundamental system $\Pi$. According to Proposition 4.32 there is a fundamental system $\Pi'$ w.r.t. which $\lambda$ is dominant. Proposition 4.30 yields an element $s \in W$ which maps $\Pi$ to $\Pi'$, i.e. if $\Pi = \{\alpha_1, \ldots, \alpha_n\}$ then $\Pi' = \{s\alpha_1, \ldots, s\alpha_n\}$. Since
$$\langle s^{-1}\lambda, \alpha_i\rangle = \langle\lambda, s\alpha_i\rangle \geq 0,$$
we see that $s^{-1}\lambda$ is dominant w.r.t. $\Pi$.

The final result of this chapter is:

Proposition 4.34. Let $R$ be a reduced root system and define
$$\delta := \frac{1}{2}\sum_{\alpha \in R^+}\alpha.$$
If $\alpha$ is a simple root then $s_\alpha(\delta) = \delta - \alpha$, and consequently
$$\frac{2\langle\delta, \alpha\rangle}{\|\alpha\|^2} = 1. \tag{4.8}$$

Proof. From Lemma 4.28 we have that $s_\alpha$ maps positive roots to positive roots, with the single exception of $\alpha$ itself, which is mapped to $-\alpha$ (as $R$ is reduced, $\pm\alpha$ are the only roots proportional to $\alpha$). We see that
$$s_\alpha(2\delta) = s_\alpha(2\delta - \alpha) + s_\alpha(\alpha).$$
Here $2\delta - \alpha$ is just the sum of all positive roots except $\alpha$. By the remarks above, $s_\alpha$ permutes these positive roots, so $s_\alpha(2\delta - \alpha) = 2\delta - \alpha$. Therefore $s_\alpha(2\delta) = (2\delta - \alpha) - \alpha = 2(\delta - \alpha)$, i.e. $s_\alpha(\delta) = \delta - \alpha$. On the other hand $s_\alpha(\delta) = \delta - \frac{2\langle\delta, \alpha\rangle}{\|\alpha\|^2}\alpha$, and comparing the two expressions yields (4.8), so the conclusion is reached.
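Propositions 4.29 and 4.34 can both be observed concretely. The sketch below (Python with numpy and itertools; all helper names are mine) generates the Weyl group of the A2 root system of Example 4.26 from its two simple reflections, finds the expected 6 elements (the symmetric group $S_3$ permuting the three coordinates), and then verifies $s_\alpha(\delta) = \delta - \alpha$ for the half-sum $\delta$ of the positive roots.

\begin{verbatim}
import numpy as np
from itertools import product

# Simple roots of A2 inside the plane x_1 + x_2 + x_3 = 0 in R^3.
e = np.eye(3)
alpha1, alpha2 = e[0] - e[1], e[1] - e[2]
positive = [alpha1, alpha2, alpha1 + alpha2]

def reflect(alpha):
    """Matrix of the reflection s_alpha on R^3."""
    a = alpha.reshape(3, 1)
    return np.eye(3) - 2 * (a @ a.T) / float(alpha @ alpha)

s1, s2 = reflect(alpha1), reflect(alpha2)

# Generate the group from the simple reflections (Proposition 4.29).
elements = set()
for n in range(6):
    for word in product([0, 1], repeat=n):
        M = np.eye(3)
        for i in word:
            M = M @ (s1 if i == 0 else s2)
        elements.add(tuple(np.round(M, 6).ravel()))
print("order of W(A2):", len(elements))   # 6, i.e. S_3

# Proposition 4.34: s_alpha(delta) = delta - alpha for simple alpha.
delta = 0.5 * sum(positive)
assert np.allclose(s1 @ delta, delta - alpha1)
assert np.allclose(s2 @ delta, delta - alpha2)
print("s_alpha(delta) = delta - alpha holds for both simple roots")
\end{verbatim}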
Chapter 5

The Highest Weight Theorem

5.1 Highest Weights

The main result of this chapter will be, as promised, the Theorem of the Highest Weight, a classification theorem for the irreducible representations of complex semisimple Lie algebras. Apart from its importance in representation theory, it plays a prominent role in particle physics, in that the bound states of quarks known as hadrons are modeled by representations with certain highest weights.

At first we set out to define an ordering of the roots. The first step in this process was the construction of the real form of the Cartan subalgebra carried out in Proposition 4.17. The next step is to introduce the notion of a Weyl chamber. Let $\mathfrak{g}$ be a complex semisimple Lie algebra, $R(\mathfrak{g}, \mathfrak{h})$ its root system relative to the Cartan subalgebra $\mathfrak{h}$, and $\mathfrak{h}_0$ the real form of $\mathfrak{h}$. The roots are real-valued on $\mathfrak{h}_0$, and thus for each $\alpha \in R$ the kernel of $\alpha|_{\mathfrak{h}_0}$ is a hyperplane in $\mathfrak{h}_0$. The connected components of $\mathfrak{h}_0 \setminus \bigcup_{\alpha \in R}\ker\alpha$ (which are open and finite in number) are called the Weyl chambers of the root system $R$.

Now pick a Weyl chamber $C$ and let $\alpha \in R$ be an arbitrary root. Then $\alpha$ is real and non-vanishing on $C$, and since $C$ is connected, $\alpha$ is either strictly positive or strictly negative on $C$. The roots which are positive on $C$ are called positive roots relative to the Weyl chamber $C$, and their set is denoted $R^+(C)$, or just $R^+$. This is a positive system as defined in the preceding chapter: let $x \in C$ be nonzero and let $\varphi$ be the functional in $\mathfrak{h}_0^*$ corresponding to $x$ through the inner product $\langle\cdot,\cdot\rangle$ (i.e. $H_\varphi = x$). Then for $\alpha \in R^+$ we have $\langle\alpha, \varphi\rangle = \alpha(x) > 0$, so the positive roots are contained in an open half space of $\mathfrak{h}_0^*$. Furthermore Lemma 4.11 3) says that $R = R^+ \cup (-R^+)$, and hence $R^+$ is a positive system. We denote the corresponding set of simple roots by $\Pi$. For two functionals $\alpha$ and $\beta$ on $\mathfrak{h}_0$ we write $\alpha \prec \beta$ if $\beta - \alpha$ is a non-empty sum of positive roots. This is the promised ordering.

Lemma 5.1. Let $\mathbb{N}_0R^+$ be the set of linear combinations of positive roots with non-negative integer coefficients. Then $\mathbb{N}_0R^+ \cap (-\mathbb{N}_0R^+) = \{0\}$.

Proof. If $\mu \in \mathbb{N}_0R^+$ then $\mu \geq 0$ on the Weyl chamber $C$. If $\mu \in -\mathbb{N}_0R^+$, then likewise $\mu \leq 0$ on $C$. Hence $\mu$ is zero on $C$, and since the Weyl chamber is a non-empty open subset of $\mathfrak{h}_0$, $\mu$ is zero as a linear functional.

Lemma 5.2. The spaces
$$\mathfrak{g}^+ := \bigoplus_{\alpha \in R^+}\mathfrak{g}_\alpha \quad\text{and}\quad \mathfrak{g}^- := \bigoplus_{\alpha \in R^+}\mathfrak{g}_{-\alpha}$$
are Lie subalgebras of $\mathfrak{g}$, and we have the decomposition
$$\mathfrak{g} = \mathfrak{g}^+ \oplus \mathfrak{h} \oplus \mathfrak{g}^-. \tag{5.1}$$
Proof. We show that $\mathfrak{g}^+$ is a subalgebra of $\mathfrak{g}$; the argument for $\mathfrak{g}^-$ is the same. Let $\alpha, \beta \in R^+$ and assume $[\mathfrak{g}_\alpha, \mathfrak{g}_\beta] \neq \{0\}$. This means that $\alpha + \beta$ is a root, and as it is obviously positive on $C$, we have $\alpha + \beta \in R^+$, hence $[\mathfrak{g}_\alpha, \mathfrak{g}_\beta] \subseteq \mathfrak{g}^+$, so it is a subalgebra. Now (4.3) readily translates into (5.1).

Now we can define the important notions of a highest weight and a highest weight vector.

Definition 5.3 (Highest Weight Vector). By a highest weight vector for a Lie algebra representation $(\rho, V)$ of $\mathfrak{g}$ we understand a nonzero weight vector $v \in V_\lambda$ (relative to some Cartan subalgebra $\mathfrak{h}$) such that $\rho(X)v = 0$ for all $X \in \mathfrak{g}^+$. A weight whose weight space contains a highest weight vector is called a highest weight.

Well, do such objects exist at all? Yes, they do:

Proposition 5.4. If $(\rho, V)$ is a finite-dimensional representation of $\mathfrak{g}$, then a highest weight vector, and thus a highest weight $\lambda$, exists.

Proof. Let $C \subseteq \mathfrak{h}_0$ be the Weyl chamber determining $R^+$ and pick an element $X_0 \in C$. Then $\alpha(X_0) > 0$ for all $\alpha \in R^+$. As $\Lambda(\rho)$ is finite, there is a weight $\lambda$ such that $\operatorname{Re}(\lambda(X_0))$ is maximal. Then $\lambda + \alpha$ cannot be a weight when $\alpha \in R^+$, i.e. $V_{\lambda+\alpha} = \{0\}$. By Proposition 4.9 we get $\rho(\mathfrak{g}_\alpha)V_\lambda = \{0\}$ for all $\alpha \in R^+$, so that $\rho(\mathfrak{g}^+)V_\lambda = \{0\}$. Therefore any nonzero $v \in V_\lambda$ is a highest weight vector, with highest weight $\lambda$.

For a general representation we cannot hope for uniqueness of the highest weight: a representation can have several highest weights. However, for irreducible representations we do have uniqueness. This is the first step towards the classification of irreducible representations through their highest weights; the rest will follow in the sequel. To prove uniqueness we introduce the following technical tool, which is used only in proofs.

Definition 5.5 (Cyclic Vector). A highest weight vector $v$ for a representation $(\rho, V)$ is called cyclic if the only $\rho$-invariant subspace containing $v$ is $V$ itself.

Lemma 5.6. Let $(\rho, V)$ be a finite-dimensional representation of $\mathfrak{g}$ and let $v \in V$ be a cyclic highest weight vector for $\rho$ associated with the weight $\lambda$. Then:
1) The weight space is 1-dimensional: $V_\lambda = \mathbb{C}v$.
2) $V = \mathbb{C}v \oplus \operatorname{span}\{\rho(X_1)\cdots\rho(X_n)v \mid X_k \in \mathfrak{g}^-, n \in \mathbb{N}\}$.
3) Every weight $\mu \in \Lambda(\rho)$ is of the form $\lambda - \nu$ for some $\nu \in \mathbb{N}_0R^+$.

Proof. We prove 2) first. Define $V_0 := \mathbb{C}v$ and inductively $V_{n+1} := V_n + \rho(\mathfrak{g}^-)V_n$, and let $W$ be the union of all the $V_n$'s. It should be clear that $W$ actually equals $\mathbb{C}v + \operatorname{span}\{\rho(X_1)\cdots\rho(X_n)v \mid X_k \in \mathfrak{g}^-, n \in \mathbb{N}\}$, so we need to show that $W = V$. This we can accomplish by showing that $W$, which contains $v$, is invariant. Since we have the decomposition (5.1), it is enough to show that $W$ is $\mathfrak{g}^\pm$- and $\mathfrak{h}$-invariant (meaning that $\rho(X)W \subseteq W$ for $X \in \mathfrak{g}^\pm$ or $X \in \mathfrak{h}$). Since by construction $\rho(\mathfrak{g}^-)V_n \subseteq V_{n+1}$, it is clear that $W$ is $\mathfrak{g}^-$-invariant. That was the easy part. To show the other invariances we use induction. As $v$ is a weight vector, $\rho(\mathfrak{h})V_0 \subseteq V_0$ and $\rho(\mathfrak{g}^+)V_0 = \{0\}$; in particular $V_0$ is $\mathfrak{h}$- and $\mathfrak{g}^+$-invariant. Now assume that $V_n$ is $\mathfrak{h}$- and $\mathfrak{g}^+$-invariant. As $V_{n+1} = V_n + \rho(\mathfrak{g}^-)V_n$, it is enough to show that $\rho(\mathfrak{h})(\rho(\mathfrak{g}^-)V_n) \subseteq V_{n+1}$ and $\rho(\mathfrak{g}^+)(\rho(\mathfrak{g}^-)V_n) \subseteq V_{n+1}$. For the first one, let $H \in \mathfrak{h}$, $Y \in \mathfrak{g}^-$ and $w \in V_n$. Then
$$\rho(H)\rho(Y)w = \rho(Y)\rho(H)w + \rho([H, Y])w.$$
5.1 Highest Weights 77 By induction we had that V n was h- and g + -invariant, implying that ρ(h)w V n and ρ([h, Y ])w V n+1 (the last following from [H, Y ] g ), hence ρ(y )ρ(h)w V n+1, thus ρ(h)ρ(y )w V n+1. Therefore W is h-invariant. Now let w V n, X g + and Y g. Then again ρ(x)ρ(y )w = ρ(y )ρ(x)w + ρ([x, Y ])w. As V n is g + -invariant, we have ρ(x)w V n, so that ρ(y )ρ(x)w V n+1. Since V n is h- and g + -invariant and ρ(g )V n V n+1 then the decomposition (5.1) says that ρ(g)v n V n+1, i.e. ρ([x, Y ])w V n+1. This proves the induction step and hence 2). Now we prove 3). Let s consider the vector w = ρ(x 1 ) ρ(x n )v for X j with α j R +. For H h we then see that g αj ρ(h)w = ρ(h)ρ(x 1 ) ρ(x n )v = (ρ(x 1 ) ρ([h, X j ]) ρ(x n )v) + ρ(x 1 ) ρ(x n )ρ(h)v j=1 ( n ) = α j (H) ρ(x 1 ) ρ(x n )v + λ(h)ρ(x 1 ) ρ(x n )v j=1 = (λ ν)(h)w where ν = α 1 + +α n N 0 R +. Thus if w 0 then λ ν is a weight. Now if w is an arbitrary weight vector with weight µ it must, by 2) be a linear combination of elements of the form as above. These are all weight vectors and since different weight spaces are linearly independent they must be weight vectors of the same weight. Thus µ = λ ν for some ν as above. This proves 3) Now for 1). Let w V λ and assume for contradiction that w / Cv. Then if w = ρ(x 1 ) ρ(x n )v for n > 0 and a calculation as above shows that ρ(h)w = (λ α 1 α n )(H)w. If w is a linear combination of such elements they must all be weight vectors of the same weight λ ν where ν 0 since w / Cv. Thus we cannot possibly have ρ(h)w = λ(h)w for all H h, hence a contradiction. With this technical lemma at our disposal we can prove the first part of the Theorem of the Highest Weight Proposition 5.7. Let (ρ, V ) be an irreducible representation of g. Then ρ has a unique highest weight λ. If v V λ is non-zero, then the following hold 1) V λ = Cv. 2) V = Cv span{ρ(x 1 ) ρ(x n )v X k g, n N}. 3) Every weight µ Λ(ρ) is of the form λ ν for ν N 0 R +. 4) For α R + and each X g α we have ρ(x)v = 0. Proof. By Proposition 5.4 a highest weight λ and highest weight vector v for ρ exist. Since ρ is irreducible it is automatically a cyclic highest weight vector and by the previous lemma, 1)-3) are valid. Assume that µ is another highest weight for ρ. By 3), since λ is a highest weight, µ = λ ν for ν N 0 R +. Similarly, as µ is a highest weight λ = µ ν for ν N 0 R +. Thus we both have µ λ = ν N 0 R + and µ λ = ν N 0 R +, hence µ λ = 0 by Lemma 5.1. The proof of point 4 is easy: ρ(x)v λ V λ+α and since α is positive λ + α is strictly higher than λ and therefore V λ+α = 0.
78 Chapter 5 The Highest Weight Theorem Note that 3) justifies the name highest weight, indeed it is higher than any of the other weights. It also shows that the highest weight is independent of the ordering of the roots, i.e. independent of the choice of Weyl chamber. It is also worth mentioning that the weight spaces for non-highest weights need not be 1-dimensional. As a partial converse to point 4) of the proposition above we have Proposition 5.8. Let (ρ, V ) be a representation with highest weight λ. If v V satisfies ρ(x)v = 0 for each X g α with α R +, then v V λ. Proof. Assume for contradiction that v / V λ satisfies that ρ(x)v = 0 for each root vector X g α with α R +. Without loss of generality we can assume that v has no component in V λ. Now consider the weight spaces in which v has a component and let λ 0 be the highest of these weights. λ 0 is (by uniqueness of the highest weight for irreducible representations) strictly lower than λ. As in the proof of Lemma 4.11 we see that Cv span{ρ(x 1 ) ρ(x n )v X k g, n N} is a non-trivial invariant subspace of V and hence by irreducibility equals V. But all the weight vectors are associated with weights which are lower than λ 0 hence they are all strictly lower than λ and that s a contradiction. Earlier we saw that weights are preserved under equivalence of representations. The same is true for highest weights: equivalent representations have the same highest weights. This is not hard to see: Let (ρ, V ) and (ρ, V ) be representations which are equivalent through an intertwiner T : V V. If λ is a highest weight for ρ, then ρ(g + )V λ = {0}. Now λ was also a weight for ρ with weight space T (V λ ). Since ρ (g + )(T V λ ) = T (ρ(g + )V λ ) = {0}, λ is also a highest weight for ρ. A similar argument works for the other direction. For irreducible representations the situation is interesting. Not only will two equivalent irreducible representations have the same (unique) highest weight, also the converse is true: two irreducible representations having the same highest weight are equivalent. Proposition 5.9. Let (ρ, V ) and (ρ, V ) be two irreducible representations of a complex semisimple Lie algebra g. If they have the same highest weight, then they are equivalent. Proof. Let v and v be non-zero highest weight vectors for ρ and ρ respectively. Form the subspace S := C(v, v ) span{(ρ ρ )(X 1 ) (ρ ρ )(X n )(v, v ) X i g, n N} By the same arguments as of Lemma 5.6 this space equals C(v, v ) span{(ρ ρ )(X 1 ) (ρ ρ )(X n )(v, v ) X i g, n N} Obviously this is a ρ ρ -invariant subspace of V V and we will now show that (ρ ρ ) S is irreducible. To this end let T S be a nontrivial invariant subspace on which ρ ρ is irreducible. This exists by Weyl s Theorem. Then (ρ ρ ) T has a unique highest weight by Proposition 5.7. Let (v 0, v 0) 0 be an associated highest weight vector. Now for each X g α with α R + we have by Proposition 5.7 4) that 0 = (ρ ρ )(X)(v 0, v 0) = (ρ(x)v, ρ (X)v ), i.e. ρ(x)v = ρ (X)v = 0 and by the same proposition we have v 0 = cv and v 0 = c v with c, c C. Therefore we have (v, v ) T and hence also in S. But
5.2 Verma Modules 79 since v 0 and v 0 are highest weight vectors and the X i s in g push the weights down we must have (v 0, v 0) C(v, v ). Proposition 5.7 says that T = C(v 0, v 0) span{(ρ ρ )(X 1 ) (ρ ρ )(X n )(v 0, v 0) X i g, n N} but since (v 0, v 0) C(v, v ) we see that T = S and thus that (ρ ρ ) S is irreducible. Now it is not hard to see that the projection π 1 : V V V intertwines (ρ ρ ) S and ρ and that π 2 : V V V intertwines (ρ ρ ) S and ρ. Since these are all irreducible the are all mutually equivalent by Schur s Lemma, in particular ρ and ρ are equivalent. 5.2 Verma Modules Let s introduce an important class of functionals in h Definition 5.10 (Integral Element). An element λ h is called an integral element if the number λ, α 2 α, α is an integer for all roots α R. By Proposition 4.14 all roots are integral elements. The importance of integral elements is due to the following Proposition 5.11. Let ρ be a complex representation of a complex semisimple Lie algebra g, then the weights of ρ w.r.t. a Cartan subalgebra are integral elements. The highest weights of ρ are dominant integral elements. Proof. Let α R, and consider the Lie subalgebra sl α = span{x α, Y α, H α} which is isomorphic to sl(2, C). From Corollary 3.13 we know that ρ(h α) has only integer eigenvalues. Since 0 v V λ implies ρ(h α)v = λ(h α)v = 2 λ(h α) λ, α v = 2 α, α α, α v, 2 λ,α α,α is an eigenvalue for ρ(h α), and must therefore be an integer. Let λ be a highest weight and v V λ a nonzero weight vector. Let furthermore α be a simple root and consider the subspace sl α = span{h α, X α, Y α} which is isomorphic to sl(2, C). Let W be the span of elements of the form ρ(y α) n1 ρ(h α) n2 ρ(x α) n3 v. Since v is a highest weight vector, W equals the span of the elements ρ(y α) n v. But on elements of this form ρ(h α) acts by the eigenvalues (λ nα)(h α) λ, α = 2 α 2 2n. Obviously 2 λ,α α is the greatest of these eigenvalues and from Theorem 3.8 this 2 greatest eigenvalue has to be non-negative. Our final task concerning the Highest Weight Theorem is to construct for each dominant integral element in h an irreducible representation of g having the integral element as highest weight. We retain the notation g + = α R g + α and g = α R g + α and define b := h g + and δ := 1 2 α R + α. Before we proceed we need the following lemma linking the universal enveloping algebra of g to representation theory.
80 Chapter 5 The Highest Weight Theorem Proposition 5.12. Let g be a complex Lie algebra. There is a 1-1 correspondence between (possibly infinite-dimensional) complex representations of g and unital left modules over U(g). By a unital module we mean a U(g)-module such that 1v = v. Proof. Let ρ : g End C (V ) be a complex representation of g. Then by the universal property it factorizes through U(g) to a unital algebra homomorphism ρ : U(g) End C (V ), so V is given the structure of a unital left U(g)-module by uv = ρ(u)v. Conversely, if V is a left U(g)-module, then it is a complex vector space since C sits in U(g), and we define a representation ρ by ρ(x)v = ι(x)v (ι : g U(g) is the embedding of g). These two constructions are easily seen to be inverses of each other. Let V be a complex vector space and a U(g)-module. Referring to Proposition 5.12, to V corresponds a unique complex representation of g on V and therefore we already have the notions of irreducibility, weights, weight vectors and weight spaces for U(g) modules (relative to some Cartan subalgebra of g). We let Λ(V ) denote the set of weights. A direct translation of Proposition 4.9 tells us that g α V µ V α+µ when µ Λ(V ). Hence µ Λ(V ) V µ is a g-invariant subspace of V. In the same spirit we define a highest weight vector for V to be a nonzero weight vector v V µ for some µ such that g + v = {0}. A weight whose weight space contains a highest weight vector is called a highest weight. If v V is a highest weight vector for some U(g)-module the highest weight module generated by v is the U(g)-submodule U(g)v of V. The following lemma on highest weight modules is somewhat an analog to Proposition 5.7. The only (but important) difference is that we now consider infinite-dimensional representations/modules as well. Lemma 5.13. Let V be a U(g)-module, v V be a highest weight vector associated with highest weight λ and let W = U(g)v be the highest weight module generated by v. Then we have 1) W = U(g )v. 2) W = µ h W µ where dim W µ < and dim W λ = 1. 3) Every weight of W is of the form λ n i=1 n iα i with α i R + and n i N 0. Proof. 1) We have that g = g h g + and from the PBW-Theorem we see that any element of U(g) is a (a linear combination) of elements of the form Y HX where X U(g + ), H U(h) and Y U(g ). Since we have U(g + ) = g + U(g) C (by the PBW-Theorem) and Xv = 0 for X g + U(g + ) (since v is a highest weight vector) we get U(g + )v = Cv. Similarly, Hv is just a constant times v, so U(h)v = Cv. Thus, only elements in U(g ) give something new and therefore U(g)v = U(g )v. 2) and 3) We clearly have W µ W. By Proposition 4.9 we get for α R {0} that g α W µ W α+µ and hence (by the PBW-Theorem) U(g)( W µ ) Wµ. As v W µ we see W = U(g)v U(g)( W µ ) W µ. Thus W = W µ. U(g ) is generated by {X α α R + } and hence a basis for W = U(g )v is {X α n1 1 X n k α k v}. In particular any weight vector must be of this form and for H h we see H(X n1 α 1 X n k α k v) = (λ n 1 α 1 n k α k )(H)v.
5.2 Verma Modules 81 This proves 3). Observe that for any given µ only finitely many combinations give µ and therefore dim W µ <. There is only one possibility to get λ, and thus dim W λ = 1, in fact W λ = Cv. This proves 2). Certain infinite-dimensional modules, called Verma modules are necessary to construct the irreducible representations we seek. The construction of the Verma module goes as follows: Let V 1 and V 2 be two complex vector spaces and let A and B be complex, associative unital algebras. Assume furthermore that V 1 is a right B-module, V 2 is a left B-module and that V 1 is a left A-module such that (av)b = a(vb) for all a A, b B and v V 1. Denote by I the two-sided ideal in V 1 C V 2 generated by all elements of the form v 1 b v 2 v 1 bv 2, and define the tensor product V 1 B V 2 := (V 1 C V 2 )/I. What we do is that we identify v 1 b v 2 with v 1 bv 2, one might say we have made it associative. The equivalence class in V 1 B V 2 containing v 1 v 2 will still be denoted v 1 v 2, now we just have the above identification. V 1 B V 2 is given the structure of an A-module by defining a(v 1 v 2 ) = av 1 v 2. With this V 1 B V 2 has the following universal property: Proposition 5.14. Let W be a complex vector space and ψ : V 1 V 2 W a bilinear map satisfying ψ(v 1 b, v 2 ) = ψ(v 1, bv 2 ), then there exists a unique linear map ψ : V 1 B V 2 W such that ψ(v 1, v 2 ) = ψ(v 1 v 2 ). Proof. The proposition is easily proved when considering the following commuting diagram V 1 V 2 V 1 V 2 V 1 B V 1 ψ ψ ψ W As ψ is bilinear it descends uniquely to the linear map ψ, which is 0 on the ideal I since ψ (v 1 b v 2 ) = ψ(v 1 b, v 2 ) = ψ(v 1, bv 2 ) = ψ (v 1 bv 2 ) and therefore descends uniquely to ψ such that ψ(v 1, v 2 ) = ψ(v 1 v 2 ). Let λ h be arbitrary. We define a representation ρ of b = h g + on C by ρ(h)z = (λ δ)(h)z for H h ρ(x)z = 0 for X g +. and this gives C the structure of a U(b)-module. If we want to stress the module structure of C we write it as C λ δ. Now multiplication turns U(g) into both a left U(g)-module and a right U(b)-module, and therefore it makes sense to define the Verma module associated with λ: V (λ) := U(g) U(b) C λ δ. (5.2) This is an infinite-dimensional left U(g)-module and thus corresponds to some infinite-dimensional representation of g. In the following proposition we outline some properties of the Verma module Proposition 5.15. Let λ h be arbitrary. 1) The Verma module V (λ) is a highest weight module generated by 1 1 (called the canonical generator of V (λ)) which is a highest weight vector with weight λ δ.
82 Chapter 5 The Highest Weight Theorem 2) The map U(g ) V (λ) given by u u(1 1) is a linear bijection. 3) If M is another highest weight module over U(g) generated by a highest weight vector v of weight λ δ, then there exists a unique U(g)-module homomorphism Ψ : V (λ) M with Ψ(1 1) = v. This map is surjective and it is injective if and only if u 0 in U(g ) implies uv 0 in M. Proof. 1) Since by definition V (λ) = U(g) U(b) C λ δ it should be clear from the module structure that V (λ) = U(g)(1 1). For X g + and 1 the unit in C we have that X 1 = 0 (this is how we defined the module structure on C) and therefore X(1 1) = (X 1) 1 = (1 X) 1 = 1 X 1 = 0 (in the third equality we used that X b U(b)). Thus 1 1 is a highest weight vector. For H h we have H(1 1) = 1 H 1 = 1 (λ δ)(h) = (λ δ)(h)(1 1) and therefore 1 1 is a weight vector of weight λ δ. 2) Since g = g b we have from Corollary 2.43 a vector space isomorphism U(g) = U(g ) C U(b). If X = X 1 X k X k+1 X n where X 1,..., X k g and X k+1,..., X n b the isomorphism is simply given by X X 1 X k X k+1 X n (and extended by linearity of course). We now get a string of vector space isomorphisms V (λ) = U(g) U(b) C λ δ = (U(g ) C U(b)) U(b) C λ δ = U(g ) C (U(b) U(b) C λ δ ) = U(g ) C C λ δ = U(g ) given by composition of the maps u(1 1) u 1 (u 1) 1 u (1 1) u 1 u. 3) First we define a map ψ : U(g) C λ δ M by (u, z) u(zv). For X g + or X h we see ψ(ux, z) = ux(zv) = zu(xv) whereas ψ(u, Xz) = u((xz)v) = zu(xv) i.e. ψ(ux, z) = ψ(u, Xz) for all X U(b). By the universal property of U(b) (Proposition 5.14) there is a unique map Ψ : U(g) U(b) C λ δ M satisfying Ψ(u z) = ψ(u, z) = u(zv). Phrased a little differently, Ψ is the unique map satisfying Ψ(u(1 1)) = Ψ(u 1) = uv. Thus Ψ is a U(b)-module homomorphism and Ψ(1 1) = v, and existence and uniqueness is verified. Since M is generated by v, any element in M is of the form uv for some u U(g), and from this it follows that Ψ is surjective. Now assume that uv = 0 for some nonzero u U(g) then Ψ(u(1 1)) = uv = 0 and as u(1 1) is nonzero, Ψ is not injective. Conversely, assume that u 0 implies uv 0. Since V (λ) is, by Lemma 5.13 1), generated by the elements u(1 1) for u U(g ), then u 0 implies u(1 1) is nonzero and Ψ(u(1 1)) = uv 0. Hence the only element which is mapped to zero by Ψ is 0, and hence Ψ is injective. Another way of formulating property (3) of this proposition is that any highest weight module with a certain weight is a quotient of the Verma module with
5.2 Verma Modules 83 that same weight. Thus the Verma modules are in some sense the biggest among the highest weight modules. We continue to let λ h and put V (λ) + = µ h µ λ δ V (λ) µ. We claim that any proper U(g)-submodule of V (λ) is contained in V (λ) + : By Lemma 5.13 V (λ) λ δ = C(1 1), so if a submodule contains V (λ) λ δ then it contains the canonical generator 1 1 and hence equals all of V (λ). Let S denote the sum of all proper submodules of V (λ). Obviously this is a submodule contained in V (λ) +, so we can form the quotient L(λ) = V (λ)/s which is equipped with the module structure X[v] = [Xv] (which is well-defined as S is a submodule) for v V (λ). Let q : V (λ) L(λ) denote the quotient map. L(λ) is an irreducible module, for if W L(λ) is a proper submodule then q 1 (W ) is a proper submodule of V (λ) and hence contained in S, i.e. W = {0}. Finally if X g + and H h then we have [1 1] 0 and X[1 1] = [X(1 1)] = 0 and H[1 1] = [H(1 1)] = [(λ δ)(h)1 1] = (λ δ)(h)[1 1] hence we have proved Proposition 5.16. L(λ) is an irreducible highest weight module over U(g) and [1 1] is a highest weight vector with highest weight λ δ. Thus if the Verma module V (λ) is the biggest highest weight module with highest weight λ δ, then L(λ) is the smallest highest weight module with weight λ δ and this one is irreducible. Hence, for each λ h we can produce an irreducible (possibly infinite-dimensional) representation of g 1. The final task is to seek out those which are actually finite-dimensional. We will show that if λ is an integral element which is real on h 0 then L(λ + δ) is actually finite-dimensional and hence the module we set out to find. We break up the proof in some intermediate lemmas. Recall that in the universal enveloping algebra U(g) (or indeed in any algebra) we can define a bracket [, ] by [X, Y ] = XY Y X. If X and Y happens to be in g then the value of this bracket is the same as the Lie bracket of X and Y (this is part of the definition of U(g)). Lemma 5.17. In U(sl(2, C)) we have that [E, F n ] = nf n 1 (H (n 1)) where E, F and H denote the canonical basis vectors for sl(2, C). Proof. Let us by R F denote the map on U(g) which multiplies by F from the right, and similarly by L F the map that multiplies with F from the left, and define ad(f )E := (L F R F )E = [F, E]. By the binomial formula applied to (R F ) n = (L F ad(f )) n we get ( (R F ) n n E = (L j) F ) n j ( ad(f )) j E. j=0 This sum terminates after 3 terms since Therefore we get ad(f ) 3 E = [F, [F, [F, E]]] = [F, [F, H]] = [F, 2F ] = 0. (R F ) n E = (L F ) n E n(lf ) n 1 n(n 1) [F, E] (L F ) n 2 [F, [F, E]] 2 = (L F ) n E + nh(l F ) n 1 n(n 1)(L F ) n 2 F = (L F ) n E + nf n 1 (H (n 1)). 1 It can be shown that any irreducible representation of g is in fact equivalent to one of the form L(λ) but proving that is a little outside the scope of these notes.
84 Chapter 5 The Highest Weight Theorem Subtract (L F ) n E and we have the result we want. Lemma 5.18. Let g be a complex semisimple Lie algebra and h a Cartan subalgebra. Let λ h and α Π be chosen such that m := 2 λ,α α is a positive integer. 2 Let v λ δ denote the canonical generator of the Verma module V (λ) and M be the U(g)-submodule of V (λ) generated by Y m v λ δ, where Y := Y α g α U(g). Then M is isomorphic, as a U(g)-module, to V (s α λ). Proof. As Y g we have Y m U(b) and consequently v := Y m v λ δ 0 according to Proposition 5.15 2). We have v λ δ V (λ) λ δ and hence that v V (λ) λ δ mα. But λ mα = s α (λ) and therefore v V (λ) sα(λ) δ. We now show that Xv = 0 for X := X β where β is a simple root, for then v will be a highest weight vector with weight s α (λ) δ. As M is the corresponding highest weight module generated by v it follows from Proposition 5.15 3) that M and V (s α (λ)) are isomorphic. If β α then g β α = {0} according to Lemma 4.23 and therefore [X, Y ] = 0 and thus we get Xv = XY m v λ δ = Y m Xv λ δ = 0. If β = α then we have the triple sl α = {X, Y, H α} and the preceding lemma tells us that Xv = XY m v λ δ = [X, Y m ]v λ δ = my m 1 (H α (m 1))v λ δ = m ((λ δ)(h α) (m 1)) Y m 1 v λ δ ( ) λ δ, α = m 2 α 2 (m 1) Y m 1 v λ δ = 0 where the last equality is a consequence of Proposition 4.34. Proposition 5.19. Let λ h be a dominant integral element such that λ h0 is real-valued. Then L(λ + δ) is a finite-dimensional irreducible U(g)-module with highest weight λ. Proof. From Proposition 5.16 we know already that L(λ + δ) is an irreducible highest weight module with highest weight λ. Thus we can find a non-zero element [v λ ] L(λ + δ) λ. Let α be a simple root and put Y := Y α and n := 2 λ+δ,α α. As λ is a dominant integral element and 2 δ,α 2 α = 1 (Proposition 2 4.34) n is a positive integer. From the preceding lemma we have that Y n v λ V (s α (λ + δ)) V (λ + δ). But s α (λ + δ) = (λ + δ) nα and thus V (s α (λ + δ)) has highest weight λ nα which is strictly lower than λ. Thus V (s α (λ + δ)) cannot equal V (λ + δ) and therefore the former is a proper submodule of the latter. Consequently Y n v λ S (which was the sum of all proper submodules) and therefore in the quotient, i.e. Y n [v λ ] = [Y n v λ ] = 0. Next we prove that the set of weights for L(λ+δ) is invariant under the action of the Weyl group. For now we just write v λ instead of [v λ ]. Let α be a simple root and sl α = {X, Y, H} denote the corresponding sl(2, C)-triple. Put v i := Y i v λ. As we have just seen, for i big enough, v i is zero. Therefore there is a maximal n such that v n 0 and v n+1 = 0. Consider the space W := Cv 0 + + Cv n. Obviously this space is Y - and H-invariant. Since Xv k = XY k v λ = k Y i 1 [X, Y ]Y k i + Y k Xv λ. i=1 The last term is 0, as v λ is a highest weight, while the first k terms give a multiple of v k 1 (recall that [X, Y ] = H). Thus W is sl α -invariant, and thus it
5.2 Verma Modules 85 is a finite-dimensional U(sl α )-submodule of L(λ + δ). Consider the sum of all finite-dimensional U(sl α )-submodules of L(λ + δ). The space is g-invariant, for if T is a finite-dimensional U(sl α )-module then gt is finite-dimensional (it has dimension at most dim g dim T ) and it is sl α -invariant for if t T, A sl α and B g then ABt = BAt + [A, B]t = Bt + [A, B]t gt, i.e. gt is itself a finite-dimensional U(sl α )-submodule. Thus as L(λ + δ) is irreducible this sum of finite-dimensional U(sl α )-modules (which is non-empty since it contains v λ for example) must equal L(λ + δ). This implies that each vector in L(λ + δ) has components in finitely many finite-dimensional U(sl α )-modules. The sum of these modules is again a finite-dimensional U(sl α )-module, conclusion: every vector in L(λ + δ) is in a finite-dimensional sl α -invariant subspace. Corollary 3.14 then gives a decomposition of L(λ + δ) into finite-dimensional irreducible U(sl α )-modules. Let µ be an arbitrary weight for L(λ + δ), let w L(λ + δ) µ be a nonzero weight vector and α a fixed simple root. Then w will have components in finitely many U(sl α )-submodules and we decompose accordingly: w = w 1 + + w k. Then we have k k Hw i = Hw = µ(h)w = µ(h)w i i=1 i.e. Hw i = 2 µ,α α w 2 i. Observe that m := 2 µ,α α is an integer since µ is, in 2 particular, an integral element. If µ, α > 0 then w i is a weight vector of weight m for the corresponding representation of sl α. The weights of this representation is distributed symmetrically around 0 and acting on w i by Y m we push it down to a nonzero weight vector of weight m. Thus Y m w i 0 and therefore Y m w 0. But as Y m w L(λ + δ) µ mα we see that this space is nontrivial i.e. µ mα = s α (µ) is a weight. If, on the other hand, µ, α < 0 then likewise X m w 0 is an element of L(λ + δ) sα(µ). If µ, α = 0 then s α (µ) = µ. In any case we see that s α (µ) is again a weight. As the Weyl group is generated by root reflections from simple roots (cf. Proposition 4.29) we see that the set of weights is invariant under the action of the Weyl group. Finally we show that the set of weights of L(λ + δ) is finite. Since by 2) of Lemma 5.13 the weight spaces are finite-dimensional the weight space decomposition tells us that L(λ + δ) is finite-dimensional. From Proposition 4.29 any functional in h 0 is of the form w(ϕ) where w is an element of the Weyl group and ϕ is dominant. In particular any weight can be written as w(µ) where µ is a dominant weight. Thus the number of weights is at most the number of elements of W (which is finite) times the number of dominant weights. Any weight is of the form λ k i=1 n iα i (Lemma 5.13) where {α 1,..., α k } is the set of simple roots. If, in addition, λ k i=1 n iα i is dominant then it follows that k λ n i α i, δ 0 and hence λ, δ i=1 k n i α i, δ = i=1 i=1 k n i α i, δ. As α i is simple, we get from Proposition 4.34 that α i, δ = 1 2 α i 2 > 0. Therefore we must have that k i=1 n i is bounded by a constant which is independent of the specific weight in question. Thus there can only be finitely many dominant weights. i=1
86 Chapter 5 The Highest Weight Theorem Wrapping up, the results of this chapter can be stated as Theorem 5.20 (Highest Weight Theorem). Let g be a complex semisimple Lie algebra and h a Cartan subalgebra. There is a 1-1 correspondence between finite-dimensional complex irreducible representations of g and dominant integral elements in h 0. 5.3 The Case sl(3, C) In this last section of the chapter we apply the machinery developed thus far in this chapter to the special case of the Lie algebra sl(3, C). This is a complex semisimple Lie algebra as we have seen. It is the complexification of su(3), hence by Proposition 7.22 the representation theory of these two Lie algebras is the same. Furthermore SU(3) is simply connected (Theorem B.6) and thus there is a 1-1 correspondence between finite-dimensional representations of SU(3) and ditto representations of su(3). Combining these we see that a complete knowledge of the representation theory of sl(3, C) yields complete knowledge of the representation theory of SU(3). In particular, if we can determine the irreducible representations of sl(3, C) we have determined the irreducible representations of SU(3). The group SU(3) is important due to its relation to quantum field theory and particle physics. We start by picking the basis 1 0 0 0 0 0 H 1 = 0 1 0, H 2 = 0 1 0, 0 0 0 0 0 1 0 1 0 0 0 0 0 0 1 X 1 = 0 0 0, X 2 = 0 0 1, X 3 = 0 0 0, 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Y 1 = 1 0 0, Y 2 = 0 0 0, Y 3 = 0 0 0. 0 0 0 0 1 0 1 0 0 Note, that physicists would usually replace H 2 by 1 0 0 1 0 1 0. 3 0 0 2 Below we state the commutation relations involving H 1 and/or H 2 [H 1, H 2 ] = 0 [H 1, X 1 ] = 2X 1 [H 1, Y 1 ] = 2Y 1 [H 2, X 1 ] = X 1 [H 2, Y 1 ] = Y 1 [H 1, X 2 ] = X 2 [H 1, Y 2 ] = Y 2 [H 2, X 2 ] = 2X 2 [H 2, Y 2 ] = 2Y 2 [H 1, X 3 ] = X 3 [H 1, Y 3 ] = Y 3 [H 2, X 3 ] = X 3 [H 2, Y 3 ] = Y 3 It is not hard to see that span{h 1, X 1, Y 1 } and span{h 2, X 2, Y 2 } are subalgebras of sl(3, C) and that they are both isomorphic to sl(2, C). Define h := span{h 1, H 2 }. Since [H 1, H 2 ] = 0 this is an abelian subalgebra. Assume that X = a 1 X 1 + a 2 X 2 + a 3 X 3 + b 1 Y 1 + b 2 Y 2 + b 3 Y 3
5.3 The Case sl(3, C) 87 commutes with H 1. Then 0 = [H 1, X] = 2a 1 X 1 a 2 X 2 + a 3 X 3 2b 1 Y 1 + b 2 Y 2 b 3 Y 3 and since the X i s and Y i s are linear independent, we must have a 1 = a 2 = a 3 = b 1 = b 2 = b 3 = 0. From this it follows that h is a maximal torus. From the commutation relations above we see that ad(h 1 ) and ad(h 2 ) are diagonal in the chosen basis and hence that h is a Cartan subalgebra of sl(3, C). We identify h with C 2 by identifying ϕ h with the pair (ϕ(h 1 ), ϕ(h 2 )). Let s determine the set of roots R(sl(3, C), h), i.e. pairs (a 1, a 2 ) for which there exists an X 0 such that [H 1, X] = a 1 X and [H 2, X] = a 2 X. From the commutation relations above we see that we, at least, have the following roots (2, 1), ( 1, 2), (1, 1), ( 2, 1), (1, 2), ( 1, 1) with corresponding root vectors X 1, X 2, X 3, Y 1, Y 2 and Y 3. By the root space decomposition there can be no more roots than these. As fundamental system we pick Π := {α 1, α 2 } where α 1 := (2, 1) and α 2 := ( 1, 2). These are obviously linearly independent and as (2, 1) = α 1 ( 1, 1) = α 1 α 2 (1, 1) = α 1 + α 2 ( 2, 1) = α 1 ( 1, 2) = α 2 (1, 2) = α 2. Hence R N 0 Π ( N 0 Π) and thus Π is a genuine fundamental system. The positive roots are α 1, α 2 and α 1 +α 2 with corresponding root vectors X 1, X 2, X 3 and the negative roots are α 1, α 2 and α 1 α 2 with corresponding root vectors Y 1, Y 2 and Y 3. Therefore sl(3, C) + = span{x 1, X 2, X 3 } and sl(3, C) = span{y 1, Y 2, Y 3 }. Now consider a finite-dimensional complex representation (ρ, V ) of sl(3, C). A weight λ for this is (under the identification above) a pair (λ 1, λ 2 ) C 2 for which there exists 0 v V with ρ(h 1 )v = λ 1 v and ρ(h 2 )v = λ 2 v. We can restrict ρ to {H 1, X 1, Y 1 } thus obtaining a representation of sl(2, C). From Corollary 3.13 we know that the eigenvalues of ρ(h 1 ) are integers. Similarly, the eigenvalues of ρ(h 2 ) are integers. Thus a weight is necessarily a pair of integers. If we, moreover, require the functional (λ 1, λ 2 ) to be dominant, i.e. λ, α 1 + α 2 0 for all positive roots α, then we get that λ, α = λ 1 + λ 2 0 and that λ, α 1 = 2λ 1 λ 2 0 and λ, α 2 = λ 1 + 2λ 2 0. Hence λ 1 = 1 3 ((2λ 1 λ 2 ) + (λ 1 + λ 2 )) 0, λ 2 = 1 3 (( λ 1 + 2λ 2 ) + (λ 1 + λ 2 )) 0. In other words the dominant integral elements of h are precisely the (functionals represented by the) pairs (λ 1, λ 2 ) with λ i N 0. Now the Highest Weight Theorem tells us that there is a 1-1-correspondence between these pairs and irreducible representations of sl(3, C). But how do we, given a dominant integral element, compute the corresponding irreducible representation? In principle
88 Chapter 5 The Highest Weight Theorem we could compute the Verma module and then take the quotient as described in the previous section. However, this is too complicated, so here we outline a somewhat simpler method which is applicable in this case. The first thing we do is to find the so-called fundamental representations of sl(3, C), namely the irreducible representations having (1, 0) and (0, 1) as highest weights. The first of these is simply the defining representation i.e. the representation on C 3 given by ρ 1 (X)v = Xv. This representation is irreducible: assume that W is a non-zero invariant subspace and 0 v = (v 1, v 2, v 3 ) W. We see that ρ(x 1 )v = X 1 v = (v 2, 0, 0) and X 3 v = (v 3, 0, 0), so if either v 2 0 or v 2 0 then e 1 W. If v 2 = v 3 = 0 then v 1 0 and ρ 1 (H 1 )v = (v 1, 0, 0). Thus in any case e 1 W. Likewise one shows that e 2 and e 3 are in W as well and thereby that ρ 1 is irreducible. We easily calculate that (1, 0), ( 1, 1) and (0, 1) are weights of ρ 1 with corresponding weight vectors e 1, e 2 and e 3 (the standard basis vectors for C 3 ). By the weight space decomposition the number of weights can t exceed 3, so we have found all the weights. Since ( 1, 1) = (1, 0) α 1 and (0, 1) = (1, 0) α 2 we see that (1, 0) is the highest weight. Thus ρ 1 is the irreducible representation having (1, 0) as highest weight. Now consider the representation ρ 2 on C 3 given by ρ 2 (X)v = X t v. Since ρ 2 ([X, Y ]) = [X, Y ] t = [ X t, Y t ] = [ρ 2 (X), ρ 2 (X)] this is really a representation. By the same kind of reasoning as before, this representation is irreducible and e 1, e 2 and e 3 are weight vectors of ρ 2 with corresponding weights ( 1, 0), (1, 1) and (0, 1). This time (0, 1) is the highest weight since ( 1, 0) = (0, 1) α 1 α 2 and (1, 1) = (0, 1) α 2. Thus ρ 2 is the irreducible representation having (0, 1) as highest weight. Now let a dominant integral element (m 1, m 2 ) be given. Form the tensor product representation ρ m1m 2 := ρ m1 1 ρ m2 2 (where by ρ m1 1 we mean the m 1 -fold tensor product of ρ 1 with itself). It is not hard to see that the vector v m1m 2 := e m1 1 e m2 3 is a highest weight vector for ρ m1m 2 with weight (m 1, m 2 ). Let W denote the smallest invariant subspace of (C 3 ) (m1+m2) containing v m1m 2, then v m1m 2 is a cyclic highest weight vector and ρ m1m 2 W is an irreducible representation having (m 1, m 2 ) as highest weight. Let s be even more concrete in the case of highest weight (1, 1). To make things a little easier we define a new basis by f 1 := e 3, f 2 := e 2, f 3 := e 1 for then ρ 2 (Y 2 )f 1 = f 2 and ρ 2 (Y 1 )f 2 = f 3 whereas all other possible actions of the Y i s on the f j s are 0. The highest weight vector for ρ 2 is f 1. We should form the tensor product ρ 1 ρ 2 on C 3 C 3 and find the smallest invariant subspace containing e 1 f 1. We find this subspace simply by acting on e 1 f 1 with (ρ 1 ρ 2 )(Y 1 ) and (ρ 1 ρ 2 )(Y 2 ) (acting by (ρ 1 ρ 2 )(X i ) just gives 0 as e 1 and f 1 are highest weight vectors). The resulting vectors we act on again by (ρ 1 ρ 2 )(Y 1 ) and (ρ 1 ρ 2 )(Y 2 ) and so forth until we hit 0. If ones does so, one would find that the smallest invariant subspace W is the span of the following 8 elements (for the sake of brevity we have omitted the tensor product symbol) e 1 f 1, e 2 f 1, e 1 f 2, e 3 f 1 + e 2 f 2, e 2 f 2 + e 1 f 3, e 2 f 3, e 3 f 2, e 3 f 3. Since the vectors e i f j constitute a basis for C 3 C 3 it shouldn t be too hard to see that the vectors above are linearly independent. 
ρ 1 ρ 2 restricted to this set is thus the irreducible representation corresponding to the highest weight (1, 1).
5.3 The Case sl(3, C) 89 But this representation is nothing but the adjoint representation of sl(3, C). Indeed consider the isomorphism ϕ : W sl(3, C) which maps e 1 f 1 X 3, e 1 f 2 X 1, e 2 f 1 X 2 e 3 f 1 + e 2 f 2 H 2, e 2 f 2 + e 1 f 3 H 1, e 2 f 3 Y 1, e 3 f 2 Y 2, e 3 f 3 Y 3. This is an intertwiner of (ρ 1 ρ 2 ) W and ad. For instance we see that (ρ 1 ρ 2 )(Y 1 )e 1 f 1 = ρ 1 (Y 1 )e 1 f 1 + e 1 ρ 2 (Y 1 )f 1 = e 2 f 1 + e 1 0 = e 2 f 1 whereas ad(y 1 )(ϕ(e 1 f 1 )) = ad(y 1 )(X 3 ) = [Y 1, X 3 ] = X 2 = ϕ(e 2 f 1 ). Thus the adjoint representation of sl(3, C) is (equivalent to) the irreducible representation with highest weight (1, 1).
90 Chapter 5 The Highest Weight Theorem
Chapter 6 Infinite-dimensional Representations 6.1 Gårding Subspace In Chapter 3 we discussed finite-dimensional representations of Lie groups and their induced Lie algebra representations. All that was based on the fact that Aut(V ) is a Lie group when V is a finite-dimensional vector space. It s a natural question to ask what will happen if we consider representations on infinitedimensional Hilbert spaces. Do they descend to representations of the associated Lie algebra? Under certain modifications the answer is yes, but it is clear that the situation is not as simple as in the previous chapter. Recall the situation: Given a finite-dimensional representation (π, V ) of an arbitrary Lie group G, it induces a representation π of g on V given explicitly by π (X)v = d dt π(exp(tx))v. (6.1) t=0 This is well-defined, since π was automatically smooth (Lemma 3.1). In the case of a representation on an infinite-dimensional Hilbert space H we would like to define a g-representation by (6.1) but a priori this expression does not make sense for all vectors in H. Actually, we have to restrict the representation of g to a subspace of H, namely to the space of so-called C -vectors. To define these we need to discuss differentiability of H-valued maps. Definition 6.1. Let U R n be an open set, and f : U H be a map into a Banach space. f is said to be differentiable at x 0 U if the limit f(x 0 + x) f(x 0 ) lim x 0 x exists. In the affirmative case we define f (x 0 ) to be that limit. The function f is said to be differentiable if it is differentiable at all the points of U. We say that f is C k if f is k times differentiable and f (k) is continuous. We say that f is C if it is C k for all k. We define the directional derivative of f at x 0 in the direction of v R n to be lim t 0 f(x 0 + tv) f(x 0 ). t Like in ordinary calculus one can show that f is C k if and only if all the directional derivatives are C k 1. 91
92 Chapter 6 Infinite-dimensional Representations Now we consider a map f : G H from a Lie group G (or, for that matter, any smooth manifold) to H. We can define differentiability of this by demanding that its composition with any inverse coordinate map is differentiable in the sense of Definition 6.1, i.e. if x 0 G and (U, ϕ) is an arbitrary coordinate map for G around x 0 then we demand the map f ϕ 1 : ϕ(u) H to be differentiable. If X g is a left-invariant vector field on G we can interpret (Xf)(g) as the directional derivative of f in g in the direction of X g because we have f(g exp tx) f(g) (Xf)(g) = lim. t 0 t Why is this true? Let θ be the flow of X, then 1 we have θ(t, g) = g exp tx. Hence X g = t θ(t, g) = d t=0 dt g exp tx, t=0 and therefore ( ) d (Xf)(g) = X g f = dt g exp tx f = d t=0 dt f(g exp tx) t=0 = lim t 0 f(g exp tx) f(g) t which was to be proved. To any left-invariant vector field X there is an associated right-invariant vector field Y. Left invariance of X simply means that X g = (L g ) X e, (where L g is the Lie group automorphism h g 1 h) i.e. X is determined from its value at e. Similarly any right-invariant vector field Y satisfies Y g = (R g ) R e where R g is the automorphism h hg. Thus, from X we get a right-invariant vector field Y by Y g = (R g ) X e. We see that Y can also be regarded as a directional derivative: (Y f)(g) = Y g f = (R g ) X e f = X e (f R g ) = d dt (f R g )(exp tx) = d t=0 dt f(exp(tx)g) t=0 f(exp(tx)g) f(g) = lim. t 0 t We can now define the subspace of C -vectors. Definition 6.2. Let (π, H) be a representation of G on a Hilbert space H. A vector v H is called a C -vector for π if the map G H given by g π(g)v is C. The space of C -vectors for π is denoted H π. The goal of the rest of this section is to prove that Hπ is dense in H. The strategy of the proof is to introduce a subspace of Hπ, the so-called Gårding subspace, which is more manageable than Hπ itself, and show that this is dense. A key ingredient in the proof will be the following lemma: Lemma 6.3. Let (π, H) be a representation of G and let S be a subspace of H such that for all v S the limit ϕ(x)v := lim t 0 π(exp(tx))v v t exists and is in S for all X g. Then S H π. 1 See [9] Proposition 20.8 g). (6.2)
6.1 Gårding Subspace 93 Proof. Let v S. We should verify that v H π i.e. that the function f v : G H given by f v (g) := π(g)v is C. We will use induction. First, let s show that it is C 1. The requirement that the limit (6.2) exists for all X g, says that f v is differentiable in e G. We need to show that it is differentiable in any arbitrary point g G. But as π(g) is a continuous map f v (g exp(tx)) f v (g) π(g exp(tx))v π(g)v lim = lim t 0 t t 0 t π(exp(tx))v v = π(g) lim = π(g)ϕ(x)v. t 0 t Thus, all the directional derivatives exist and are continuous (by Proposition 1.2). Hence f v is C 1. Now for the induction step: Assuming that f v is C k, we will show that it is C k+1. Since v is in S, also ϕ(x)v is in S, hence, by assumption f ϕ(x)v is C k. But f ϕ(x)v (g) = π(g)ϕ(x)v is the directional derivative of f v. Since all the directional derivatives of f v are C k, f v itself must be C k+1. Before introducing the Gårding subspace we need to be able to talk about integration in infinite-dimensional Hilbert spaces. So let (X, µ) be a measure space, H a Hilbert space, and assume that f : X H is a compactly supported continuous map, then there exists a unique vector v H satisfying 2 v, w = f(x), w dµ(x). X This unique vector v is what we shall understand by the integral of the function f, and thus we write X f(x)dµ(x) := v. This is integration in the weak sense, and sometimes also called the weak integral For this kind of integration we have the usual inequality 3 f(x)dµ(x) f(x) dµ(x). (6.3) X Furthermore, if T : H H is a continuous linear map, then 4 T f(x)dµ(x) = T f(x)dµ(x). (6.4) We also have dominated convergence: X Theorem 6.4 (Dominated Convergence). Let f n : X H be a sequence of integrable functions converging pointwise to f, and suppose g : X R + is an integrable function such that f n (x) g(x) for all x X, then f is integrable and lim f n dµ = fdµ. n X X Proof. For any x X we have f n (x) g(x) for all n and thus we must also have f(x) g(x), i.e. f is integrable. We see that f n dµ fdµ = (f n f)dµ f n (x) f(x) dµ(x) X X 2 For this implication, see [Rudin] Definition 3.26 and Theorem 3.27. 3 See [Rudin] Theorem 3.29. 4 See [Rudin] Exercise 3.24. X X X X
94 Chapter 6 Infinite-dimensional Representations By the usual dominated convergence we know that f n (x) f(x) dµ(x) = lim n X X lim f n(x) f(x) dµ(x) n and this is 0, since the function x lim n f n (x) f(x) is identically 0. With this in mind we can, given a unitary representation π on H and a f Cc (G), define a linear map π(f) : H H by π(f)v := f(g)π(g)vdµ(g) where µ is the left Haar measure on G. This map is continuous for π(f)v = f(g)π(g)v dg f(g) v dg = f 1 v. G I.e. π(f) f 1 < since f is smooth and compactly supported. G Definition 6.5 (Gårding Subspace). Let (π, H) be a unitary representation of G on the Hilbert space H. The Gårding subspace Hπ G of π is the subspace of H spanned by all elements of the form π(f)v for f Cc (G) and v H. Elements of this space are called Gårding vectors. Lemma 6.6. The Gårding subspace is a subset of H π. Proof. We will show that Hπ G that for all v Hπ G the limit G satisfies the requirements of Lemma 6.3, i.e. ϕ(x)v = lim t 0 π(exp(tx))v v t exists and is of the form π(f)v. We will show that we actually have the explicit expression ϕ(x) π(f)v = π(y f)v (6.5) where Y is the right-invariant vector field corresponding to X and Y f is the C -function on G obtained by letting Y act on f. Now fix a t 0, then π(exp(tx)) id H t π(f)v = 1 t π(exp(tx)) f(g)π(g)v dg 1 f(g)π(g)v dg G t G = 1 f(g)π(exp(tx)g)v dg 1 f(g)π(g)v dg. t G t G In the last equality we used (6.4) to put π(exp(tx)) inside the integral. The Haar measure on G is left-invariant, and hence the first integral will be unaltered if we change the variable from g to exp( tx)g. Then the expression above becomes G f(exp( tx)g) f(g) π(g)v dg. (6.6) t We want to investigate the limit as t 0. The fraction inside the integral sign looks familiar, it converges to (Y f)(g) when t 0, so if we can exchange limit and integration we are done. Because of this convergence we can, given an ε > 0, find a δ such that for t < δ we have f(exp( tx)g) f(g) t (Y f)(g) f(exp( tx)g) f(g) (Y f)(g) t < ε
6.2 Induced Lie Algebra Representations 95 which implies that f(exp( tx)g) f(g) (Y f)(g) + ε. t Y f is a smooth compactly supported function (just like f), hence the right hand side is a t-independent integrable majorant for the expression (6.6). Thus we have dominated convergence and for t 0 (6.6) tends to (Y f)(g)π(g)v dg = π(y f)v G which proves the statement. Theorem 6.7 (Gårding). The Gårding subspace is dense in H. In particular H π is dense in H. Proof. Let v H be arbitrary. For any given ε > 0 we should find a Gårding vector which is ε-close to v. By Proposition 1.2 the map G g π(g)v v H is continuous for all v H. Hence the set U ε := {g G π(g)v v < ε} H is open. It is then possible to find a compactly supported smooth positive function f in G which is supported in U and satisfies G f(g)dg = 1 5. We claim that the Gårding vector π(f)v is ε-close to v: π(f)v v = f(g)π(g)vdg v = f(g)(π(g)v v)dg G G f(g) π(g)v v dg < ε f(g)dg = ε. G In the last inequality we used that f was supported in U where, by definition, π(g)v v < ε. 6.2 Induced Lie Algebra Representations For the rest of this chapter π denotes a unitary representation of G on a Hilbert space H. We have for each X g a map ϕ(x) defined by (6.2) on a dense subset of H, namely H π. In this section we will see that in fact, ϕ is a representation of g on H π. We will develop this result in a few propositions. The first thing to be verified is that ϕ(x) is linear: Proposition 6.8. ϕ(x) is a linear map and im ϕ(x) H π, i.e. ϕ(x) End(H π ). Proof. To show linearity we just we simply calculate ϕ(x)(av + bw) = lim t 0 π(exp(tx))(av + bw) (av + bw) t π(exp(tx))v v = a lim t 0 t = aϕ(x)v + bϕ(x)w. G + b lim t 0 π(exp(tx))w w t 5 Since G is locally compact, we can find an open set V U with V compact and contained in U. Then by standard manifold theory, we can find a smooth positive function f on G which is supported in V ([9] Proposition 2.26). Since f is supported in V and V was compact, f is compactly supported, hence integrable. By scaling f with a proper normalizing constant we obtain a smooth positive compactly supported function f with G fdg = 1.
96 Chapter 6 Infinite-dimensional Representations For the second claim define the map f : G H by f(g) = π(g)v for v H π. Thus constructed, f is a C -map. That ϕ(x)v is a C -vector means by definition that g π(g)ϕ(x)v is C. To see that this is true, we calculate: π(exp(tx))v v π(g)ϕ(x)v = π(g) lim t 0 t = (Xf)(g). But Xf is C, hence ϕ(x)v is a C -vector. = lim t 0 π(g exp(tx))v π(g)v t In short, π(x) is an endomorphism of H π, or a densely defined operator on H. Proposition 6.9. The map ϕ : g End(H π ) mapping X ϕ(x) is a Lie algebra representation of g. Proof. There are two things to prove: showing that ϕ is linear, and showing that it respects Lie brackets. Verifying linearity of ϕ is straightforward: ϕ(x + Y )v = lim t 0 π(exp(tx + ty ))v v t and π(exp(tx) exp(ty ))v π(exp(ty ))v π(exp(ty ))v v = lim + lim t 0 t t 0 t π(exp(tx)) id H = lim lim π(exp(ty ))v + ϕ(y )v t 0 t t 0 π(exp(tx)) id H = lim v + ϕ(y )v = (ϕ(x) + ϕ(y ))v, t 0 t π(exp t(cx))v v ϕ(cx)v = lim t 0 t = cϕ(x)v. = c lim t 0 π(exp(ctx))v v ct Hence ϕ is linear. Now we should prove that ϕ([x, Y ]) = ϕ(x)ϕ(y ) ϕ(y )ϕ(x). Note that since ϕ(x) and ϕ(y ) map H π into H π it makes sense to compose them. To find out how the compositions ϕ(x)ϕ(y ) and ϕ(y )ϕ(x) act, we see that ϕ(x)ϕ(y )v, w = lim s 0 lim t 0 π(exp sx) idh s π(exp ty )v v, w t (6.7) (6.8) In ordinary calculus one shows that if a function has continuous partial derivatives, the function is differentiable. By a similar argument one can show that the equation above implies that ϕ(x)ϕ(y )v, w = π(exp sx) idh lim (s,t) (0,0) s In fact we can put s = t, and hence we get π(exp ty )v v, w t π(exp tx)π(exp ty )v π(exp tx)v π(exp ty )v + v, w ϕ(x)ϕ(y )v, w = lim t 0 t 2. This is valid for all w H and so we conclude that ϕ(x)ϕ(y )v = lim t 0 1 t 2 ( π(exp tx)π(exp ty )v π(exp tx)v π(exp ty )v + v ).
6.3 Self-Adjointness 97 A similar calculation and argument render 1 ( ) ϕ(y )ϕ(x)v = lim π(exp ty )π(exp tx)v π(exp tx)v π(exp ty )v + v. t 0 t 2 Therefore by subtracting we get (π(exp tx exp ty ))v π(exp ty exp tx)v [ϕ(x),ϕ(y )]v = lim t 0 t 2 π(exp( tx) exp( ty ) exp tx exp ty )v v = lim π(exp tx exp ty ) t 0 t 2 = lim t 0 π(exp tx exp ty ) π(exp(t2 [X, Y ] + O(t 3 )))v v t 2 = lim t 0 π(exp(t[x, Y ] + O(t 3/2 )))v v t = ϕ([x, Y ])v in that O(t n ) denotes terms which are bounded by a positive constant times t n. Why does the last equality hold? If θ is an O(t 3/2 )-function then it is easy to see that θ (0) = 0. Thus if F is a real function then d dt F (t + θ(t)) = F (0 + θ(0)) d t=0 dt (t + θ(t)) = F (0). t=0 Thus an O(t 3/2 )-term does not influence the derivative at 0. An elaboration of this argument gives the same result for derivations of functions in a Hilbert space. Thus ϕ(x) is a densely defined operator on H. Proposition 6.10. The operator ϕ(x) is anti-symmetric, i.e. ϕ(x) ϕ(x). Proof. It makes sense to talk about the adjoint of ϕ(x) as it is densely defined. For w in the domain of ϕ(x), ϕ(x) w is the unique vector satisfying ϕ(x)v, w = v, ϕ(x) w for all v H π. In the following computation we exploit unitarity of π(g). Let t 0 and v, w H π : 1 t (π(exp tx)v v), w = π(exp tx)v, 1 t w v, 1 t w and so in the limit t 0 we get = v, 1 t π(exp( tx))w v, 1 t w = v, 1 t( π(exp( tx))w w ), ϕ(x)v, w = v, ϕ(x)w. This precisely states that ϕ(x) is anti-symmetric. 6.3 Self-Adjointness In this section we proceed to investigate the operator ϕ(x). Our goal is to apply it to the quantum mechanical theory of momentum and angular momentum, and for that we would like it to be self-adjoint. But that cannot be, that s what Proposition 6.10 says. What we will show at the end, however, is that ϕ(x) is essentially skew-adjoint i.e. that the closure is skew-adjoint: ϕ(x) = ϕ(x). This implies that the operator iϕ(x) will be self-adjoint.
98 Chapter 6 Infinite-dimensional Representations Proving this, however, is rather complicated and we have to go through a series of technical lemmas before we can prove the result. The first result revolves around how π(f) acts on ϕ(x)(hπ G ). Recall from Chapter 1 that we have defined the modular function : G R + by f(gh 1 )dg = (h) f(g)dg G where dg is the left Haar measure on G. is a Lie group homomorphism, and therefore it makes sense to define a map γ : g R + by γ(x) = d dt (exp(tx)). t=0 G (G) be arbi- Lemma 6.11. Let y = π(k)v be a Gårding vector and letf Cc trary. Then we have π(f)ϕ(x)y = π(xf)y + γ(x) π(f)y. (6.9) Proof. Recall (6.5) by which we have π(f)ϕ(x) π(k)v = π(f)( π(y k)v) = f(g)(y k)(h)π(g)π(h)v dhdg G ( ) 1 = f(g) lim t 0 t k(exp(tx)h) k(h) π(g)π(h)v dhdg. G As in the proof of Lemma 6.6 we have dominated convergence so we can interchange limit and integration, and hence get 1 ( ) lim f(g) k(exp(tx)h) k(h) π(g)π(h)v dhdg. t 0 t G Now, fix a nonzero t and split the integral in two: 1 f(g)k(exp(tx)h)π(g)π(h)v dhdg t G 1 f(g)k(h)π(g)π(h)v dhdg. t G In the first we replace h by exp( tx)h (which does not alter the integral, since the Haar measure is left-invariant), and in the second we replace g by g exp( tx) (which we can do if we compensate by introducing (exp( tx)) in the expression). Thus we get 1 f(g)k(h)π(g)π(exp( tx)h)π(h)v dhdg t G 1 (exp( tx))f(g exp( tx))k(h)π(g)π(exp( tx))π(h)v dhdg t G 1( ) = f(g) f(g exp( tx)) k(h)π(g)π(exp( tx))π(h)v dhdg G t (exp( tx)) 1 f(g exp( tx))k(h)π(g)π(exp( tx))π(h)v dhdg. t Inside the first integral we have G f(g) f(g exp( tx)) lim = (Xf)(g), t 0 t and
6.3 Self-Adjointness 99 and in the second we have (exp( tx)) 1 lim = γ(x). t 0 t Thus, by using dominated convergence once again to bring the limit inside the integrals, we get in the first case (Xf)(g)k(h)π(g)π(h)v dhdg = π(xf) π(k)v = π(xf)y, G while the second gives γ(x) f(g)k(h)π(g)π(h)v dhdg = γ(x) π(f) π(k)v = γ(x) π(f)y. G This completes the proof. To find the adjoint of the continuous linear map π(f) put f (g) := f(g 1 ) (g): π(f)v, w = f(g)π(g)v dg, w = f(g)π(g)v, w dg G G = v, f(g)π(g 1 )w dg = v, f(g 1 ) (g)π(g)w dg G G = v, f (g)π(g)w dg = f (g)π(g)w, v dg G = G f (g)π(g)w dg, v = v, π(f )w. Hence the adjoint is π(f ). In the next lemma we calculate the adjoint in the case of a function of the form Y f for a right-invariant vector field Y. Lemma 6.12. Let f Cc (G) and X g and let Y be the right-invariant vector field corresponding to X. Then the adjoint of π(y f) is given by G π(y f) = π(xf ) γ(x) π(f ). (6.10) Proof. First of all f is smooth, as is a Lie group homomorphism, and like f it is compactly supported. Now we want to show that π(y f)v, w = v, π(xf )w γ(x) v, π(f )w for all v, w H, but since π(y f), π(xf ) and π(f ) are all continuous linear maps, it is sufficient to verify the statement for Gårding vectors v and w. For v Hπ G we have by (6.5) that π(y f)v = ϕ(x) π(f)v, and so π(y f)v, w = ϕ(x) π(f)v, w = π(f)v, ϕ(x)w = v, π(f )ϕ(x)w = v, π(xf )w + γ(x) π(f )w. In the last equation we invoked Lemma 6.11. Lemma 6.13. There exists a sequence of real, non-negative functions f n Cc (G) and a corresponding descending sequence (U n ) of neighborhoods around e, such that f n 1 = 1, supp f n U n and n U n = {e}, and such that the sequence Y f n Xf n 1 is a bounded sequence for each X g with corresponding right-invariant vector field Y.
100 Chapter 6 Infinite-dimensional Representations We will skip the proof of this lemma. It is long, technical and not very illuminating. For a proof consult [3] Lemma 7.26. However we see that the requirements on f n implies that f n converges to the delta function δ e in the sense of distributions: namely let ϕ Cc (G) be arbitrary. It is continuous in e and since the supports of f n shrink to {e} we can, given an ε > 0 find an n large enough such that ϕ(g) ϕ(e) < ε for g U n and therefore f n (g)ϕ(g) dg ϕ(e) = f n (g)(ϕ(g) ϕ(e))dg G U n f n (g) ϕ(g) ϕ(e) dg U n < ε f n (g)dg = ε. U n Lemma 6.14. Let (f n ) be a sequence of compactly supported smooth functions satisfying the demands of Lemma 6.13. Then the sequence of bounded operators ( π(y f n ) π(xf n ) γ(x) π(f n )) converges strongly to the zero-operator. Proof. Strong convergence is the same as pointwise convergence so we need to show that ( π(y f n ) π(xf n ) γ(x) π(f n ))v = (Y f n Xf n γ(x)f n )(g)π(g)v dg converges to 0 for all v H. Putting ψ(g) := π(g)v then ψ : G H is a continuous function and ψ(e) = v. We split the above expression in two (Y f n Xf n γ(x)f n )(g)(ψ(g) ψ(e))dg+ψ(e) (Y f n Xf n γ(x)f n )(g)dg G To see that the first term tends to 0 we remark that f n and Y f n Xf n 1 are bounded sequences by Lemma 6.13 and hence by the triangle inequality Y f n Xf n γ(x)f n 1 is bounded by some constant M. Therefore (Y f n Xf n γ(x)f n )(g)(ψ(g) ψ(e))dg G (Y f n Xf n γ(x)f n )(g) (ψ(g) ψ(e)) dg G sup ψ(g) ψ(e) (Y f n Xf n γ(x)f n )(g) dg g U n G G M sup g U n ψ(g) ψ(e). We only need to take the sup over U n since this is where f n is supported. As ψ is continuous we can to a given ε > 0 find a neighborhood U around e so that ψ(g) ψ(e) < ε/m for g U. As U n is a descending sequence of sets we will eventually have U n U, and thus sup g Un ψ(g) ψ(e) < ε/m, i.e. (Y f n Xf n γ(x)f n )(g)(ψ(g) ψ(e))dg < ε. G Therefore the first term tends to zero. Now we need to show that G (Y f n Xf n γ(x)f n )(g)dg tends to zero. The first term converges to zero: 1 (Y f n )(g)dg = lim t (f n(exp(tx)g) f n (g))dg G G t 0 1 = lim t 0 t ( G G f n (exp(tx)g)dg G ) f n (g)dg = 0,
6.3 Self-Adjointness 101 where the second equality follows from dominated convergence and the last one follows from left invariance of the Haar measure. The two last terms happen to cancel each other: 1 (Xf n )(g)dg = lim t (f n(g exp(tx)g) f n (g))dg G proving the lemma. G t 0 1 = lim t 0 t ( ) (exp( tx)) 1 f n (g)dg = γ(x) G Now for the final lemma, before the main results of this section. Lemma 6.15. For all X g we have ϕ(x) ϕ(x). G f n (g)dg, Proof. We will show that ϕ(x) x, y + x, ϕ(x) y = 0 for all x, y D(ϕ(X) ). The strategy is to write this as a sum of terms which are all zero (the function f Cc (G) is arbitrary): ϕ(x) x, y + x, ϕ(x) y = ϕ(x) x, y ϕ(x) x, π(f )y (6.11) + ϕ(x) x, π(f )y x, ϕ(x) π(f )y (6.12) + x, ϕ(x) π(f )y x, π(y f )y (6.13) + x, π(y f )y x, π(xf + γ(x)f )y (6.14) + x, π(xf + γ(x)f )y + π(y f)x, y (6.15) π(y f)x, y + ϕ(x) π(f)x, y (6.16) ϕ(x) π(f)x, y + π(f)x, ϕ(x) y (6.17) π(f)x, ϕ(x) y + x, ϕ(x) y. (6.18) The terms (8.2) and (6.17) are 0 simply by definition of the adjoint. The terms (6.13) and (6.16) are 0 by Eq. (6.5). The term (6.15) is 0 by Lemma 6.12. In (6.14) we put f = f n and let n and the two terms converge to zero by Lemma 6.14. For (8.1) we do something similar: we put f = f n and show that π(f n )y y for n. Namely for any v H we have lim π(f)y, v = lim π(f n)y, v = lim n n n f n (g)π(g)y dg, v G = lim f n (g)π(g)y dg, v dg n G = lim f n (g) π(g)y, v dg n G = δ e ( π(g)y, v ) = y, v. Thus for n the two terms in (8.1) go to zero. An identical argument shows that the two terms in (6.18) cancel as well. Finally, we have collected enough results to prove the following theorem Theorem 6.16. ϕ(x) is essentially skew-adjoint for all X g. Proof. We will see that ϕ(x) = ϕ(x). Since ϕ(x) is densely defined, ϕ(x) is closed, and as ϕ(x) ϕ(x) by Proposition 6.10 also we have ϕ(x) ϕ(x) (6.19) because ϕ(x) is the smallest closed extension of ϕ(x). Also ϕ(x) is densely defined, so ϕ(x) is closed and thus equal to ϕ(x). (6.19) now implies ϕ(x) ϕ(x) i.e. ϕ(x) ϕ(x). Conversely, Lemma 6.15 yields ϕ(x) = ϕ(x) ϕ(x) which immediately renders ϕ(x) ϕ(x) = ϕ(x). Thus, ϕ(x) is skew-adjoint.
102 Chapter 6 Infinite-dimensional Representations Thus if we define π (X) = ϕ(x) we get a map π from g to the set of skewadjoint operators on H. This latter space we denote O(H). Theorem 6.17. For any X g we have exp(π (X)) = π(exp(x)), i.e. the following diagram is commutative (compare with (3.1)) g π O(H) exp exp G π Aut(H) Proof. Consider the following two families of unitary operators V t := π(exp tx) and W t := exp(tπ (X)). By Proposition 1.2 V t is a strong continuous 1-parameter group of unitaries, thus by Stone s Theorem there exists a self-adjoint operator T on H such that V t = exp(itt ) and it can be constructed in the following way: Let D(A) := {v H lim t 0 1 t (V tv v) exists} and put for v D(A): 1 Av = lim t 0 t (V tv v). Then T := ia will do the job. Since we have for v H π : 1 lim t 0 t (V π(exp tx)v v tv v) = lim = π (X)v t 0 t we have that H π D(A) and that Av = π (X)v. This implies π (X) it. Since both sides are skew-adjoint we get π (X) = π (X) ( it ) = it and thus it π (X), i.e. T = iπ (X). Thus V t = W t. In particular this holds for t = 1 and this is the desired formula. 6.4 Applications to Quantum Mechanics In this final chapter we bring the Peter-Weyl Theorem, the Highest Weight Theorem and the theory from the previous section into play in a brief description of the quantum mechanical theory of momentum and angular momentum 6. First let s discuss quantum mechanical momentum. Quantum mechanics is modeled on operators on Hilbert spaces: to each quantum mechanical system there is associated a Hilbert space (usually an L 2 -space) and to any observable for that system such as momentum, position or energy, there is a unique selfadjoint operator on the associated Hilbert space. The only possible values of a given observable one can measure lie in the spectrum of the given operator, and since the operator is self-adjoint only real values occur in the spectrum. Using the theory from the previous section we will motivate why the momentum operators look as they do and find a domain of essentially self-adjointness. Let s consider a free particle in space. Then the proper Hilbert space is L 2 (R 3 ). On this we consider the operator T (x 0 )f(x) = f(x x 0 ) which translates the coordinate system along x 0. This corresponds to an actual translation of the system along x 0. It s called the translation operator. It is not hard to see (using Proposition 1.2) that T is a continuous representation of the additive group R and translation invariance of the Lebesgue measure on R 3 yields unitarity of 6 For a more physical but highly non-rigorous discussion of this, the reader is referred to Chapter 3 of [12].
Hence it induces a Lie algebra representation $T'$ of $\mathbb{R}^3$ on $L^2(\mathbb{R}^3)$ (or rather on the space of $C^\infty$-vectors $L^2(\mathbb{R}^3)^\infty_T$), where the Lie algebra of $\mathbb{R}^3$ is just $\mathbb{R}^3$ with the zero bracket and the exponential map is the identity $\mathbb{R}^3 \to \mathbb{R}^3$. Choosing the standard basis $\{e_1, e_2, e_3\}$ for $\mathbb{R}^3$ we can calculate $T'(e_i)$:
\[ T'(e_1)f(x) = \frac{d}{dt}\Big|_{t=0} T(\exp te_1)f(x) = \frac{d}{dt}\Big|_{t=0} f(x - te_1) = \frac{d}{dt}\Big|_{t=0} f(x_1 - t, x_2, x_3) = -\frac{\partial f}{\partial x_1}, \]
that is, $T'(e_1) = -\partial/\partial x_1$ and likewise $T'(e_i) = -\partial/\partial x_i$.

Well, what are the $C^\infty$-vectors anyway? That is not an easy question, and we won't answer it completely. First of all, a $C^\infty$-vector for $T$ is an $L^2$-function $f$ for which the map $x \mapsto T(x)f$ is $C^\infty$ from $\mathbb{R}^3$ to $L^2(\mathbb{R}^3)$. More concretely this means, in particular, that the expression
\[ \frac{T(te_i)f - f}{t} \]
converges, in $L^2$, to some $L^2$-function $h$ as $t$ tends to $0$. Let's show that this implies that $-h$ is the weak derivative of $f$ w.r.t. $x_i$, i.e. that $\langle h, \varphi\rangle = \langle f, \partial_i\varphi\rangle$ for all test functions $\varphi$. Obviously, we have (for fixed $t$) that
\[ \Big\langle \frac{T(te_i)f - f}{t}, \varphi \Big\rangle \longrightarrow \langle h, \varphi\rangle. \]
On the other hand we see that
\[ \Big\langle \frac{T(te_i)f - f}{t}, \varphi \Big\rangle = \frac1t\big(\langle T(te_i)f, \varphi\rangle - \langle f, \varphi\rangle\big) = \frac1t\big(\langle f, T(-te_i)\varphi\rangle - \langle f, \varphi\rangle\big) = \Big\langle f, \frac{T(-te_i)\varphi - \varphi}{t}\Big\rangle \]
(the first equality being a consequence of translation invariance of the Lebesgue measure), and this converges to $\langle f, \partial_i\varphi\rangle$ by dominated convergence, since the fraction inside the inner product converges to $\partial_i\varphi$. Thus $-h$ is the weak derivative of $f$ w.r.t. $x_i$, and it lies in $L^2$. This we can repeat as often as we like, i.e. $f$ has weak derivatives in $L^2$ of any order. By the Sobolev Embedding Theorem this implies that $f$ is a $C^\infty$-function. In other words: if $f$ is a $C^\infty$-vector for $T$, it is a $C^\infty$-function.

Next we will show that the $C^\infty$-functions with compact support are actually Gårding vectors, and thus in particular $C^\infty$-vectors. First let $f, h \in C_c^\infty(\mathbb{R}^3)$. We want to show that $T(f)h = h * f$ (i.e. the convolution of $h$ with $f$). By definition,
\[ T(f)h = \int_{\mathbb{R}^3} f(x)T(x)h\,dx \]
is the unique $L^2$-function satisfying
\[ \langle T(f)h, \varphi\rangle = \int_{\mathbb{R}^3} f(x)\langle T(x)h, \varphi\rangle\,dx \]
for all $\varphi \in L^2$. We see (by a proper use of Fubini's Theorem) that
\[ \int_{\mathbb{R}^3} f(x)\langle T(x)h, \varphi\rangle\,dx = \int_{\mathbb{R}^3}\int_{\mathbb{R}^3} f(x)h(y - x)\overline{\varphi(y)}\,dy\,dx = \langle h * f, \varphi\rangle, \]
and hence $T(f)h = h * f$. Thus, the Gårding vectors are convolutions. Now a quite deep theorem due to Dixmier and Malliavin states that any function in $C_c^\infty(\mathbb{R}^3)$ can be written as a finite sum
\[ \sum_{i=1}^{N} u_i * v_i \]
where $u_i, v_i \in C_c^\infty(\mathbb{R}^3)$. By the remarks above this means that $C^\infty$-functions with compact support are Gårding vectors, and thus our operators $T'(e_i)$ are at least defined on this space.
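To make the identification $T'(e_i) = -\partial/\partial x_i$ concrete, one can check it numerically. The following Python snippet (a sketch, not part of the text; it assumes NumPy is available) compares the difference quotient $(T(te_1)f - f)/t$ with $-f'$ in the $L^2$-norm for a Gaussian $f$; the error shrinks like $O(t)$:

\begin{verbatim}
import numpy as np

# Difference quotient of the translation group (T(t)f)(x) = f(x - t)
# versus the candidate generator -f'.
f = lambda x: np.exp(-x**2)                    # a Garding-type vector
fprime = lambda x: -2.0 * x * np.exp(-x**2)

x = np.linspace(-6.0, 6.0, 4001)
dx = x[1] - x[0]
for t in (1e-1, 1e-2, 1e-3):
    quotient = (f(x - t) - f(x)) / t           # (T(t)f - f)/t, computed exactly
    err = np.sqrt(np.sum((quotient + fprime(x))**2) * dx)
    print(f"t = {t:g}: L2 distance to -f' = {err:.2e}")
\end{verbatim}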
But why are these operators interesting at all? For the motivation we give the following slightly non-rigorous argument based on Theorem 6.17: if $\varepsilon$ is a small number, we can consider what physicists call an infinitesimal translation $\varepsilon e_i$. Then by Theorem 6.17
\[ T(\varepsilon e_i) = \exp(\varepsilon T'(e_i)). \]
By series expansion of $\exp$, and neglecting terms with $\varepsilon$ to a power higher than 1, we get approximately
\[ T(\varepsilon e_i) \approx \operatorname{id} + \varepsilon T'(e_i). \]
Thus $T'(e_i)$ is a measure of how much the system changes when translated a little bit along the $x_i$-axis. In the Hamiltonian formulation of classical mechanics, this is precisely the interpretation of the momentum of the system. In generalizing to quantum mechanics, we therefore define momentum in this way. However, the operator $T'(e_i)$ is only skew-symmetric. To make it self-adjoint, and to give it the right units, we define $p_j := i\hbar\,T'(e_j) = -i\hbar\,\partial/\partial x_j$\,$^8$. These are the momentum operators of quantum mechanics.

Now, following the same pattern, we analyze rotation and how it influences a quantum mechanical system. We consider the rotation group $SO(3)$, which is a compact Lie group. Its Lie algebra is $\mathfrak{so}(3)$, i.e. the $3\times 3$ skew-symmetric real matrices. An obvious basis for $\mathfrak{so}(3)$ is the following:
\[
A_1 = \begin{pmatrix} 0 & 0 & 0\\ 0 & 0 & -1\\ 0 & 1 & 0\end{pmatrix},\quad
A_2 = \begin{pmatrix} 0 & 0 & 1\\ 0 & 0 & 0\\ -1 & 0 & 0\end{pmatrix},\quad
A_3 = \begin{pmatrix} 0 & -1 & 0\\ 1 & 0 & 0\\ 0 & 0 & 0\end{pmatrix},
\]
and it is easily checked that they possess the following commutation relations:
\[ [A_1, A_2] = A_3,\qquad [A_2, A_3] = A_1,\qquad [A_3, A_1] = A_2. \]
The exponential map is no longer as trivial as in the previous example. To calculate $\exp A$ for a matrix $A \in \mathfrak{so}(3)$ we have to calculate the Taylor series $\sum_{n=0}^\infty \frac{1}{n!}A^n$. If we apply this formula to the matrices $tA_i$ for some real number $t$, we get
\[
\exp(tA_1) = \begin{pmatrix} 1 & 0 & 0\\ 0 & \cos t & -\sin t\\ 0 & \sin t & \cos t\end{pmatrix},\qquad
\exp(tA_2) = \begin{pmatrix} \cos t & 0 & \sin t\\ 0 & 1 & 0\\ -\sin t & 0 & \cos t\end{pmatrix}, \tag{6.20a}
\]
\[
\exp(tA_3) = \begin{pmatrix} \cos t & -\sin t & 0\\ \sin t & \cos t & 0\\ 0 & 0 & 1\end{pmatrix}. \tag{6.20b}
\]
That is, the element $\exp(tA_i) \in SO(3)$ is the rotation by the angle $t$ around the $x_i$-axis.

$^8$$\hbar = 1.0546\cdot 10^{-34}\,\mathrm{Js}$ is the so-called reduced Planck constant.
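These matrices and their exponentials are easy to check by machine. The snippet below (a sketch, not part of the text; it assumes NumPy and SciPy) verifies the commutation relations and that $\exp(tA_3)$ is the rotation (6.20b):

\begin{verbatim}
import numpy as np
from scipy.linalg import expm

A1 = np.array([[0., 0., 0.], [0., 0., -1.], [0., 1., 0.]])
A2 = np.array([[0., 0., 1.], [0., 0., 0.], [-1., 0., 0.]])
A3 = np.array([[0., -1., 0.], [1., 0., 0.], [0., 0., 0.]])

comm = lambda X, Y: X @ Y - Y @ X
assert np.allclose(comm(A1, A2), A3)           # [A1, A2] = A3
assert np.allclose(comm(A2, A3), A1)           # [A2, A3] = A1
assert np.allclose(comm(A3, A1), A2)           # [A3, A1] = A2

t = 0.7                                        # exp(t*A3) = rotation by t
R3 = np.array([[np.cos(t), -np.sin(t), 0.],
               [np.sin(t),  np.cos(t), 0.],
               [0.,         0.,        1.]])
assert np.allclose(expm(t * A3), R3)
print("so(3) relations and (6.20) verified")
\end{verbatim}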
Consider a particle (without spin!) which can move inside some massive sphere $\Omega \subseteq \mathbb{R}^3$, or for that matter inside $\Omega = \mathbb{R}^3$. The proper Hilbert space is then $L^2(\Omega)$. A rotation $R \in SO(3)$ of the system corresponds to a rotation $R^{-1}$ of the coordinate system; hence the rotation operator $D(R)$ (where $D$ stands for the German word Drehung) changing the system according to the rotation is
\[ D(R)f(x) = f(R^{-1}x). \]
The map $D : SO(3) \to \mathrm{Aut}(L^2(\Omega))$ is easily seen to be a unitary representation of $SO(3)$ (unitarity is a consequence of the transformation theorem for integrals). Let's calculate the induced Lie algebra representation $D'$ on the basis elements $A_1$, $A_2$ and $A_3$:
\begin{align*}
D'(A_3)f(x) &= \frac{d}{dt}\Big|_{t=0} f(\exp(-tA_3)x) = \frac{d}{dt}\Big|_{t=0} f(x_1\cos t + x_2\sin t,\ -x_1\sin t + x_2\cos t,\ x_3)\\
&= \frac{d}{dt}\Big|_{t=0}(x_1\cos t + x_2\sin t)\cdot\frac{\partial f}{\partial x_1} + \frac{d}{dt}\Big|_{t=0}(-x_1\sin t + x_2\cos t)\cdot\frac{\partial f}{\partial x_2}\\
&= x_2\frac{\partial f}{\partial x_1} - x_1\frac{\partial f}{\partial x_2}.
\end{align*}
By similar calculations for $A_1$ and $A_2$ we thus get
\begin{align}
D'(A_1) &= x_3\frac{\partial}{\partial x_2} - x_2\frac{\partial}{\partial x_3}, \tag{6.21a}\\
D'(A_2) &= x_1\frac{\partial}{\partial x_3} - x_3\frac{\partial}{\partial x_1}, \tag{6.21b}\\
D'(A_3) &= x_2\frac{\partial}{\partial x_1} - x_1\frac{\partial}{\partial x_2}. \tag{6.21c}
\end{align}
From the commutation relations of the basis elements we immediately get the same commutation relations for the operators above.

What are the $C^\infty$-vectors in this case? Assume $f \in L^2(\Omega)$ to be a $C^\infty$-vector and write its argument in polar coordinates, $f(r, \theta, \varphi)$. Then the effect of a rotation of the coordinate system (i.e. of the action of $D(R)$) is, loosely speaking, to add constants to the angular variables:
\[ (D(R)f)(r, \theta, \varphi) = f(r, \theta - \theta_0, \varphi - \varphi_0). \]
By the same argument as for translations, $C^\infty$-vectors have to be $C^\infty$ in the two angular variables.

To motivate why these operators are interesting we once again take a small number $\varepsilon$ and look at the infinitesimal rotation $r = \exp(\varepsilon A_i)$. Exploiting Theorem 6.17 we get
\[ D(r) = \exp(\varepsilon D'(A_i)) \approx \operatorname{id} + \varepsilon D'(A_i), \]
i.e. $D'(A_i)$ is the rate of change when rotating the system a little bit. This we interpret as orbital angular momentum. Again, to obtain self-adjoint operators with the right units, we define the orbital angular momentum operators to be
\[ L_x := i\hbar\,D'(A_1),\qquad L_y := i\hbar\,D'(A_2),\qquad L_z := i\hbar\,D'(A_3). \]
Since $SO(3)$ is a compact group, the Peter-Weyl Theorem says that $D$ can be decomposed into irreducible representations. So first of all let's determine the irreducible representations of $SO(3)$. First we observe that the following matrices, known in physics as the Pauli spin matrices,
\[
\sigma_1 = \frac12\begin{pmatrix} 0 & 1\\ -1 & 0\end{pmatrix},\qquad
\sigma_2 = \frac12\begin{pmatrix} 0 & i\\ i & 0\end{pmatrix},\qquad
\sigma_3 = \frac12\begin{pmatrix} i & 0\\ 0 & -i\end{pmatrix}
\]
constitute a basis for $\mathfrak{su}(2)$.
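The commutation relations claimed for these operators can be verified symbolically. The following sketch (not from the text; it assumes SymPy) checks $[L_x, L_y] = i\hbar L_z$ on an arbitrary smooth function:

\begin{verbatim}
import sympy as sp

x1, x2, x3, hbar = sp.symbols('x1 x2 x3 hbar')
f = sp.Function('f')(x1, x2, x3)

# L_i = i*hbar*D'(A_i) with D'(A_i) as in (6.21)
Lx = lambda u: sp.I*hbar*(x3*sp.diff(u, x2) - x2*sp.diff(u, x3))
Ly = lambda u: sp.I*hbar*(x1*sp.diff(u, x3) - x3*sp.diff(u, x1))
Lz = lambda u: sp.I*hbar*(x2*sp.diff(u, x1) - x1*sp.diff(u, x2))

assert sp.simplify(Lx(Ly(f)) - Ly(Lx(f)) - sp.I*hbar*Lz(f)) == 0
print("[L_x, L_y] = i*hbar*L_z verified")
\end{verbatim}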
One can calculate that
\[
\exp(t\sigma_1) = \begin{pmatrix} \cos(\tfrac12 t) & \sin(\tfrac12 t)\\ -\sin(\tfrac12 t) & \cos(\tfrac12 t)\end{pmatrix},\qquad
\exp(t\sigma_2) = \begin{pmatrix} \cos(\tfrac12 t) & i\sin(\tfrac12 t)\\ i\sin(\tfrac12 t) & \cos(\tfrac12 t)\end{pmatrix},\qquad
\exp(t\sigma_3) = \begin{pmatrix} e^{\frac12 it} & 0\\ 0 & e^{-\frac12 it}\end{pmatrix}.
\]
Letting $H$, $E$ and $F$ denote the usual basis for $\mathfrak{sl}(2,\mathbb{C})$ (the complexification of $\mathfrak{su}(2)$) we have the relations
\[ E = \sigma_1 - i\sigma_2,\qquad F = -\sigma_1 - i\sigma_2,\qquad H = -2i\sigma_3. \tag{6.22} \]
Furthermore we see that the Pauli matrices satisfy the commutation relations
\[ [\sigma_1, \sigma_2] = \sigma_3,\qquad [\sigma_2, \sigma_3] = \sigma_1,\qquad [\sigma_3, \sigma_1] = \sigma_2, \]
and thus we get a Lie algebra isomorphism $\varphi : \mathfrak{su}(2) \to \mathfrak{so}(3)$ by $\sigma_i \mapsto A_i$. Since $SU(2)$ is simply connected there is a unique Lie group homomorphism $\Phi : SU(2) \to SO(3)$ which induces $\varphi$. Since $\varphi$ is an isomorphism, $\Phi$ is a smooth covering map, namely the universal double covering of $SO(3)$. In particular we have $\ker\Phi = \{\pm I\}$.
\[
\begin{array}{ccc}
SU(2) & \xrightarrow{\ \Phi\ } & SO(3)\\
 & {\scriptstyle F}\searrow & \downarrow{\scriptstyle \tilde F}\\
 & & G
\end{array}
\]
If $F : SU(2) \to G$ is any Lie group homomorphism, it induces a Lie group homomorphism $\tilde F : SO(3) \to G$ if and only if $\{\pm I\} \subseteq \ker F$ ($\Phi$ is a surjective submersion, so $\tilde F$ is smooth if and only if $F$ is smooth).

Now, the irreducible representations of $\mathfrak{su}(2)$ are in 1-1 correspondence with the irreducible representations of $\mathfrak{sl}(2,\mathbb{C})$, and these we know: for each $k \in \mathbb{N}$ there is exactly one representation $\rho^{\mathbb{C}}_k$ of $\mathfrak{sl}(2,\mathbb{C})$ of dimension $k$, and thus for $\mathfrak{su}(2)$ there is for each $k \in \mathbb{N}$ one irreducible representation $\rho_k$ of dimension $k$. Since $SU(2)$ is simply connected these lift to irreducible representations of $SU(2)$; let's call them $\Pi_k$. From above we know that $\Pi_k$ induces a representation of $SO(3)$ if and only if $\{\pm I\} \subseteq \ker\Pi_k$, and the claim is that this happens exactly when $k$ is odd. To see this, observe that $\exp(2\pi\sigma_3) = -I$ and therefore
\[ \Pi_k(-I) = \Pi_k(\exp(2\pi\sigma_3)) = \exp(\rho_k(2\pi\sigma_3)) = \exp(i\pi\rho^{\mathbb{C}}_k(H)), \]
where we used that $\sigma_3 = \frac{i}{2}H$. But $\rho^{\mathbb{C}}_k(H)$ is diagonal with eigenvalues $k-1, k-3, \dots, -k+3, -k+1$, and therefore
\[ \exp(i\pi\rho^{\mathbb{C}}_k(H)) = \exp\big(\operatorname{diag}(i\pi(k-1), \dots, i\pi(-k+1))\big) = \operatorname{diag}\big(e^{i\pi(k-1)}, \dots, e^{i\pi(-k+1)}\big) = \begin{cases} I, & k \text{ odd},\\ -I, & k \text{ even}.\end{cases} \]
The induced representation of $SO(3)$ is irreducible, for the corresponding representation of $\mathfrak{so}(3) \cong \mathfrak{su}(2)$ is just $\rho_k$, which is irreducible. Furthermore there can be no other irreducible representations of $SO(3)$, for the existence of such a representation would give an irreducible representation of $\mathfrak{so}(3)$ which is not among the $\rho_k$'s. Wrapping up: for each positive odd integer $k$ there is exactly one irreducible representation of $SO(3)$ of dimension $k$.

We now return to the representation $D$. By the Peter-Weyl Theorem we can decompose this representation into irreducible representations. We don't know how many times a specific irreducible representation occurs in this decomposition; it depends on the particular space $\Omega$. Let $V_k \subseteq L^2(\Omega)$ denote an invariant irreducible subspace of dimension $k$. Then we know that $k$ is odd.
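The parity computation for $\Pi_k(-I)$ can be replayed numerically. The snippet below (a sketch, not from the text; it assumes NumPy) exponentiates the diagonal matrix $i\pi\rho^{\mathbb{C}}_k(H)$ for small $k$:

\begin{verbatim}
import numpy as np

for k in range(1, 7):
    weights = np.arange(k - 1, -k, -2)     # k-1, k-3, ..., -k+1
    Pk = np.diag(np.exp(1j * np.pi * weights))
    if np.allclose(Pk, np.eye(k)):
        print(f"k = {k}: Pi_k(-I) = +I (descends to SO(3))")
    else:
        print(f"k = {k}: Pi_k(-I) = -I (does not descend)")
\end{verbatim}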
If the state vector of a particle happens to lie in $V_k$, the physical interpretation is that the square of the total orbital angular momentum of the particle is
\[ \frac{\hbar^2}{4}(k-1)(k+1). \]
Writing $k = 2\ell + 1$, this is the familiar value $\hbar^2\ell(\ell+1)$ from the physics literature. The values above, with $k$ odd, are the only possible values of the total orbital angular momentum of a particle one can measure.

But what about the angular momentum along the axes? Let $D_k$ and $D'_k$ denote the restrictions of $D$ and $D'$ to the subspace $V_k$. On this subspace we see that
\[ L_z = i\hbar\,D'_k(A_3) = i\hbar\,D'_k(\varphi(\sigma_3)) = -\frac{\hbar}{2}(D'_k\circ\varphi)^{\mathbb{C}}(H), \]
and we know the eigenvalues of $(D'_k\circ\varphi)^{\mathbb{C}}(H)$: they are $k-1, k-3, \dots, -k+3, -k+1$. Therefore the eigenvalues of $L_z$ on $V_k$ are
\[ \frac{k-1}{2}\hbar,\ \frac{k-3}{2}\hbar,\ \dots,\ \frac{-k+3}{2}\hbar,\ \frac{-k+1}{2}\hbar. \]
Observe that, $k$ being odd, they are all integer multiples of $\hbar$. In physics, a state in $V_k$ which is an eigenvector for $L_z$ with eigenvalue $m\hbar$ is denoted $|k\,m\rangle$. By symmetry, similar results hold for the other axes. But observe that $L_x$, $L_y$ and $L_z$ do not commute, due to the commutation relations on $\mathfrak{so}(3)$, and therefore the angular momenta along the axes cannot be simultaneously measured unless they are all 0. Indeed, assume that $v \in V_k$ is a common eigenvector for $L_x$, $L_y$ and $L_z$, and that the eigenvalue of, say, $L_z$ is non-zero. Since $[L_x, L_y] = i\hbar L_z$, we get
\[ 0 \neq L_zv = \frac{1}{i\hbar}[L_x, L_y]v = \frac{1}{i\hbar}(L_xL_y - L_yL_x)v = 0, \]
and we have a contradiction.
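As a small illustration of these integer eigenvalues (a sketch, not from the text; it assumes SymPy): the harmonic polynomial $(x_1 + ix_2)^m$ is an eigenvector of $L_z$ with eigenvalue $m\hbar$:

\begin{verbatim}
import sympy as sp

x1, x2, hbar = sp.symbols('x1 x2 hbar')
for m in range(1, 5):
    psi = (x1 + sp.I*x2)**m
    # L_z = i*hbar*(x2*d/dx1 - x1*d/dx2), as in (6.21c)
    Lz_psi = sp.I*hbar*(x2*sp.diff(psi, x1) - x1*sp.diff(psi, x2))
    assert sp.simplify(Lz_psi - m*hbar*psi) == 0
print("L_z (x1 + i x2)^m = m*hbar*(x1 + i x2)^m for m = 1, ..., 4")
\end{verbatim}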
Part II

Geometric Analysis and Spin Geometry
Chapter 7

Clifford Algebras

7.1 Elementary Properties

In this chapter we will introduce the Clifford algebra and discuss some of its elementary properties. The setting is the following: let $V$ be a finite-dimensional vector space over the field $\mathbb{K}$ (predominantly $\mathbb{R}$ or $\mathbb{C}$) and $\varphi : V \times V \to \mathbb{K}$ a symmetric bilinear form on $V$. $\varphi$ is said to be positive (resp. negative) definite if for all $0 \neq v \in V$ we have $\varphi(v,v) > 0$ (resp. $\varphi(v,v) < 0$). $\varphi$ is called non-degenerate if $\varphi(v,w) = 0$ for all $v \in V$ implies $w = 0$. From a bilinear form we construct a quadratic form $\Phi : V \to \mathbb{K}$ given by $\Phi(v) := \varphi(v,v)$. We can recover the original bilinear form by the polarization identity: from
\[ \Phi(u+v) = \varphi(u+v, u+v) = \Phi(u) + \Phi(v) + 2\varphi(u,v) \]
we get
\[ \varphi(u,v) = \tfrac12\big(\Phi(u+v) - \Phi(u) - \Phi(v)\big). \tag{7.1} \]
Thus we have a 1-1 correspondence between symmetric bilinear forms and quadratic forms, and a quadratic form $\Phi$ is called positive definite/negative definite/non-degenerate if $\varphi$ is. Very important examples of non-degenerate bilinear forms on $\mathbb{R}^n$ are the forms $\varphi_{p,q}$ with $p + q = n$, where $\varphi_{p,q}$ relative to the standard basis for $\mathbb{R}^n$ is given by the diagonal matrix $\operatorname{diag}(1, \dots, 1, -1, \dots, -1)$ with $p$ entries $1$ and $q$ entries $-1$.

Definition 7.1 (Clifford Algebra). Let $(V, \Phi)$ be a vector space with a quadratic form $\Phi$. The associated Clifford algebra $Cl(V, \Phi)$ (abbreviated $Cl(\Phi)$) is an associative, unital algebra over $\mathbb{K}$ together with a linear map $i_\Phi : V \to Cl(\Phi)$ obeying the relation $i_\Phi(v)^2 = \Phi(v)\,1$ ($1$ being the unit element of $Cl(\Phi)$). Furthermore $(Cl(\Phi), i_\Phi)$ should have the following universal property: for every associative unital algebra $A$ and every linear map $f : V \to A$ satisfying $f(v)^2 = \Phi(v)\,1$ there exists a unique algebra homomorphism $\hat f : Cl(\Phi) \to A$ such that $f = \hat f \circ i_\Phi$.

Two questions immediately arise: given $(V, \Phi)$, do such objects exist, and if they do, are they unique? Fortunately, the answer to both questions is yes:

Proposition 7.2. For any vector space $V$ with a quadratic form $\Phi$, let $I$ be the two-sided ideal in the tensor algebra $T(V)$ spanned by all elements of the form $a \otimes (v \otimes v - \Phi(v)1) \otimes b$ (for $a, b \in T(V)$ and $v \in V$). Then $T(V)/I$, with the map $i_\Phi : V \to T(V)/I$ given by $\pi \circ \iota$, where $\iota : V \to T(V)$ is the injection of $V$ into $T(V)$ and $\pi : T(V) \to T(V)/I$ is the quotient map, is a Clifford algebra, and any other Clifford algebra over $(V, \Phi)$ is isomorphic to this one.
Proof. Uniqueness: Assume that $Cl_1(\Phi)$ and $Cl_2(\Phi)$, with linear maps $i_1 : V \to Cl_1(\Phi)$ and $i_2 : V \to Cl_2(\Phi)$, are Clifford algebras. Since $Cl_1(\Phi)$ is a Clifford algebra, and $i_2$ is linear and satisfies $i_2(v)^2 = \Phi(v)1$, it induces an algebra homomorphism $\hat i_2 : Cl_1(\Phi) \to Cl_2(\Phi)$; likewise $i_1$ induces an algebra homomorphism $\hat i_1 : Cl_2(\Phi) \to Cl_1(\Phi)$, such that the corresponding diagram over $V$ commutes. We see that
\[ i_2 = \hat i_2 \circ i_1 = \hat i_2 \circ \hat i_1 \circ i_2, \]
and since $\hat i_2 \circ \hat i_1$ is the unique map satisfying this, it must be $\operatorname{id}_{Cl_2(\Phi)}$. Likewise $\hat i_1 \circ \hat i_2 = \operatorname{id}_{Cl_1(\Phi)}$, which means that the two Clifford algebras are isomorphic. This proves uniqueness.

Existence: We now show that $Cl(\Phi) := T(V)/I$ is indeed a Clifford algebra. $i_\Phi$ is easily seen to satisfy $i_\Phi(v)^2 = \Phi(v)1$, where $1 \in Cl(\Phi)$ is the coset containing the unit element of $T(V)$. Now let $f : V \to A$ be linear with $f(v)^2 = \Phi(v)1$. By the universal property of the tensor algebra this map factorizes uniquely through $T(V)$ to an algebra homomorphism $f' : T(V) \to A$ such that $f = f' \circ \iota$. $f'$ inherits the property $(f'(v))^2 = \Phi(v)1$, and consequently
\[ f'(v \otimes v - \Phi(v)1) = (f'(v))^2 - \Phi(v)f'(1) = (f'(v))^2 - \Phi(v)1 = 0, \]
so $f'$ vanishes on $I$. Therefore it factorizes uniquely through $Cl(\Phi)$ to $\hat f : Cl(\Phi) \to A$ such that $f = \hat f \circ \pi \circ \iota = \hat f \circ i_\Phi$. Thus $Cl(\Phi) = T(V)/I$ is a Clifford algebra.

We immediately see that if the quadratic form $\Phi$ on $V$ is identically 0, the Clifford algebra $Cl(\Phi)$ is nothing but the well-known exterior algebra $\Lambda^* V$. From now on we will write $i$ instead of $i_\Phi$ where no confusion is possible. A simple calculation reveals that for all $u, v \in V$:
\[ i(u+v)^2 = i(u)^2 + i(v)^2 + i(u)i(v) + i(v)i(u). \]
A comparison of this with the polarization identity, using that $\Phi(u+v)1 = i(u+v)^2$, $\Phi(u)1 = i(u)^2$ and $\Phi(v)1 = i(v)^2$, yields the following useful formula:
\[ i(u)i(v) + i(v)i(u) = 2\varphi(u,v)\,1. \tag{7.2} \]
It can be used to prove the following:

Proposition 7.3. Let $\{e_1, \dots, e_n\}$ be an orthogonal basis for $V$. Then the set consisting of $1$ and all products of the form $i(e_{j_1})\cdots i(e_{j_k})$, where $j_1 < \cdots < j_k$ and $1 \leq k \leq n$, is a basis for $Cl(\Phi)$. In particular $i : V \to Cl(\Phi)$ is injective, and the dimension of $Cl(\Phi)$ is $2^n$.

Proof. Since the products $e_{i_1} \otimes \cdots \otimes e_{i_k}$ span the tensor algebra, and since the $i(e_j)$ are subject to the relations (7.2), we see that elements of the form $e_{i_1}\cdots e_{i_k}$ with $i_1 < \cdots < i_k$ span $Cl(\Phi)$. In particular $\dim Cl(\Phi) \leq 2^n$. For each $v \in V$ define endomorphisms $\varepsilon(v)$, $\iota(v)$ and $c(v)$ on $\Lambda^* V$ by
\begin{align*}
\varepsilon(v) &: v_1 \wedge \cdots \wedge v_k \mapsto v \wedge v_1 \wedge \cdots \wedge v_k,\\
\iota(v) &: v_1 \wedge \cdots \wedge v_k \mapsto \sum_{j=1}^k (-1)^{j+1}\varphi(v, v_j)\, v_1 \wedge \cdots \wedge \hat v_j \wedge \cdots \wedge v_k \quad (\hat v_j \text{ denoting omission}),\\
c(v) &:= \varepsilon(v) + \iota(v).
\end{align*}
It's a matter of calculation to show that $\varepsilon(v)^2 = \iota(v)^2 = 0$ and that we have
\[ c(v)^2 = \Phi(v)\,1, \qquad c(u)c(v) + c(v)c(u) = 2\varphi(u,v)\,1. \]
By the universal property of Clifford algebras there exists an algebra homomorphism $\hat c : Cl(\Phi) \to \operatorname{End}_{\mathbb{K}}(\Lambda^* V)$ extending $c$. In particular we get a linear map $\sigma : Cl(\Phi) \to \Lambda^* V$ by $x \mapsto \hat c(x)1$, called the symbol map. We observe that
\[ \sigma(e_{i_1}\cdots e_{i_k}) = c(e_{i_1})\cdots c(e_{i_k})1 = e_{i_1}\wedge\cdots\wedge e_{i_k}, \]
and thus that $\sigma$ is surjective. Hence $\dim Cl(\Phi) \geq 2^n$.

Since $i$ is injective we can imagine $V$ as sitting as a subspace of $Cl(\Phi)$; henceforth we will write $v$ instead of $i(v)$. The inverse of the symbol map, $Q : \Lambda^* V \to Cl(\Phi)$, mapping $e_{i_1}\wedge\cdots\wedge e_{i_k}$ to $e_{i_1}\cdots e_{i_k}$, is called the quantization map.

If $(V, \varphi)$ and $(W, \psi)$ are two vector spaces with bilinear forms, a linear map $f : V \to W$ is called orthogonal w.r.t. the bilinear forms if $\psi(f(v), f(w)) = \varphi(v, w)$ (or equivalently, if $\Psi(f(v)) = \Phi(v)$). We say that $(V, \varphi)$ and $(W, \psi)$ are isomorphic if there exists an orthogonal linear map $f : V \to W$ which is also a vector space isomorphism; such a map we call an orthogonal isomorphism. If $\varphi$ is non-degenerate, one can show that an orthogonal endomorphism $f : V \to V$ is automatically an isomorphism. Thus the set of orthogonal endomorphisms of $V$ is a group $O(V, \Phi)$ (or just $O(\Phi)$), called the orthogonal group. This is a closed subgroup of the Lie group $GL(V)$, and thus itself a Lie group. Picking only those orthogonal endomorphisms having determinant 1 gives the special orthogonal group $SO(V, \Phi)$, or just $SO(\Phi)$. This is again a Lie group, being a closed subgroup of $O(\Phi)$.

A curious fact is that there exist isomorphisms among the orthogonal groups. To see this, let $\phi : (\mathbb{R}^{p+q}, \Phi_{p,q}) \to (\mathbb{R}^{p+q}, \Phi_{q,p})$ be an anti-orthogonal linear map, i.e. one satisfying $\Phi_{q,p}(\phi(x)) = -\Phi_{p,q}(x)$, and consider the map $\Psi : O(p,q) \to O(q,p)$ given by $\Psi(A) = \phi \circ A \circ \phi^{-1}$. It is easily checked that this is a Lie group isomorphism. Restricting $\Psi$ to $SO(p,q)$, it maps into $SO(q,p)$, since the determinant is unaltered by a conjugation. Thus we also have an isomorphism $SO(p,q) \cong SO(q,p)$.

The next proposition shows that $Cl$ is a covariant functor from the category of vector spaces with bilinear forms and orthogonal linear maps to the category of associative unital algebras and algebra homomorphisms.

Proposition 7.4. An orthogonal linear map $f : V \to W$ between $(V, \Phi)$ and $(W, \Psi)$ induces a unique algebra homomorphism $\bar f : Cl(\Phi) \to Cl(\Psi)$ satisfying $\bar f(i_\Phi(v)) = i_\Psi(f(v))$. If $f$ is an orthogonal isomorphism, then $\bar f$ is an algebra isomorphism. If, furthermore, we have a vector space $U$ with quadratic form $\Theta$ and linear maps $f : V \to U$ and $g : U \to W$ satisfying $\Theta(f(v)) = \Phi(v)$ and $\Psi(g(u)) = \Theta(u)$, then $\overline{g \circ f} = \bar g \circ \bar f$.

Proof. It is a simple application of the universal property of the Clifford algebra. Consider the commutative diagram
\[
\begin{array}{ccc}
V & \xrightarrow{\ f\ } & W\\
{\scriptstyle i_\Phi}\downarrow & & \downarrow{\scriptstyle i_\Psi}\\
Cl(\Phi) & \xrightarrow{\ \bar f\ } & Cl(\Psi)
\end{array}
\]
The map $i_\Psi \circ f : V \to Cl(\Psi)$ satisfies the condition to factorize through $Cl(\Phi)$:
\[ ((i_\Psi \circ f)(v))^2 = i_\Psi(f(v))^2 = \Psi(f(v))\,1 = \Phi(v)\,1. \]
This gives the unique map $\bar f$. If $f$ is bijective, $f^{-1}$ gives the unique homomorphism $\overline{f^{-1}} : Cl(\Psi) \to Cl(\Phi)$. An argument similar to the one in the proof of Proposition 7.2 shows that $\bar f$ and $\overline{f^{-1}}$ are inverses of each other. The last claim follows from the uniqueness, in that $\bar g \circ \bar f : Cl(\Phi) \to Cl(\Psi)$ clearly satisfies $(\bar g \circ \bar f)(i_\Phi(v)) = i_\Psi((g \circ f)(v))$.
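Proposition 7.3 also suggests a concrete computer model of $Cl(\Phi)$: represent the basis monomials $e_{j_1}\cdots e_{j_k}$ by increasing index tuples and multiply using only the relations (7.2). The following Python sketch (not part of the text; it assumes an orthogonal basis with $e_i^2 = s_i = \pm 1$) implements this product and checks the relations in $Cl_{0,3}$:

\begin{verbatim}
from itertools import combinations

def basis_product(a, b, sig):
    """Product of basis monomials a, b (increasing index tuples).

    Uses e_i e_j = -e_j e_i (i != j) and e_i^2 = sig[i] * 1;
    returns (coefficient, monomial)."""
    coeff, result = 1, list(a)
    for i in b:
        pos = len(result)
        while pos > 0 and result[pos - 1] > i:
            pos -= 1
            coeff = -coeff                  # one sign per transposition
        if pos > 0 and result[pos - 1] == i:
            result.pop(pos - 1)             # e_i e_i = sig[i] * 1
            coeff *= sig[i]
        else:
            result.insert(pos, i)
    return coeff, tuple(result)

sig = {1: -1, 2: -1, 3: -1}                 # Cl_{0,3}: Phi negative definite
for i in (1, 2, 3):
    for j in (1, 2, 3):
        c1, m1 = basis_product((i,), (j,), sig)
        c2, m2 = basis_product((j,), (i,), sig)
        # e_i e_j + e_j e_i = 2 phi(e_i, e_j) 1 = -2 delta_ij
        anticomm = c1 + c2 if m1 == () and m2 == () else 0
        assert anticomm == (-2 if i == j else 0)

monomials = [m for k in range(4) for m in combinations((1, 2, 3), k)]
print(len(monomials), "basis monomials, as in Proposition 7.3")
\end{verbatim}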
The most interesting examples of Clifford algebras show up when the bilinear form is non-degenerate. For the time being we consider only real vector spaces and Clifford algebras. A prominent example of a vector space with non-degenerate bilinear form is $(\mathbb{R}^{p+q}, \varphi_{p,q})$, where $\varphi_{p,q}(e_i, e_j) = 0$ if $i \neq j$ and
\[ \varphi_{p,q}(e_i, e_i) = \begin{cases} 1, & i \leq p,\\ -1, & i > p \end{cases} \]
($e_1, \dots, e_{p+q}$ being the standard basis of $\mathbb{R}^{p+q}$). The associated quadratic form is denoted $\Phi_{p,q}$ and the corresponding Clifford algebra is denoted $Cl_{p,q}$. In this case $O(\Phi_{p,q})$ and $SO(\Phi_{p,q})$ are the well-known orthogonal groups $O(p,q)$ and $SO(p,q)$.

Example 7.5. 1) Let us consider the vector space $\mathbb{R}$ with the single basis element $e_1 := 1$ and the quadratic form $\Phi_{0,1}(x_1e_1) = -x_1^2$. By Proposition 7.3, $\{1, e_1\}$ (where $1$ is now the unit element of $Cl_{0,1}$) is a basis for $Cl_{0,1}$. The fact that $e_1^2 = \Phi_{0,1}(e_1)1 = -1$ shows that the linear map $Cl_{0,1} \to \mathbb{C}$, $1 \mapsto 1$, $e_1 \mapsto i$, defines an algebra isomorphism $Cl_{0,1} \cong \mathbb{C}$, and that the injection $\mathbb{R} \to Cl_{0,1}$ is given by $x \mapsto ix$, i.e. $\mathbb{R}$ sits inside $Cl_{0,1} \cong \mathbb{C}$ as the imaginary axis. Thus, $Cl_{0,1}$ is just the field of complex numbers.

2) As another example, consider $\mathbb{R}^2$ with the standard basis $\{e_1, e_2\}$ and the quadratic form $\Phi_{0,2}(x_1e_1 + x_2e_2) = -x_1^2 - x_2^2$. The Clifford algebra has the basis $\{1, e_1, e_2, e_1e_2\}$. It is an easy application of formula (7.2) to show that the linear map from $Cl_{0,2}$ to the algebra of quaternions $\mathbb{H}$ given by $1 \mapsto 1$, $e_1 \mapsto i$, $e_2 \mapsto j$ and $e_1e_2 \mapsto k$ is in fact an algebra isomorphism $Cl_{0,2} \cong \mathbb{H}$.

These two will be our role models throughout the text. Because of the last example, elements of Clifford algebras are sometimes called generalized quaternions.
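The quaternion model of Example 7.5 can likewise be checked by machine; below is a sketch (not from the text; it assumes NumPy) using the standard $2\times 2$ complex matrix model of $\mathbb{H}$:

\begin{verbatim}
import numpy as np

I2 = np.eye(2)
qi = np.array([[1j, 0], [0, -1j]])                # quaternion i
qj = np.array([[0, 1], [-1, 0]], dtype=complex)   # quaternion j
qk = qi @ qj                                      # quaternion k = ij

# The images of e1, e2, e1e2 under Cl_{0,2} -> H satisfy the defining
# relations: e1^2 = e2^2 = Phi_{0,2}(e_i) = -1 and e1e2 = -e2e1.
assert np.allclose(qi @ qi, -I2)
assert np.allclose(qj @ qj, -I2)
assert np.allclose(qi @ qj + qj @ qi, 0 * I2)
assert np.allclose(qk @ qk, -I2)                  # (e1e2)^2 = -1
print("Cl_{0,2} relations of Example 7.5 verified in the quaternion model")
\end{verbatim}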
Now, let $(V, \varphi)$ be any real vector space with non-degenerate bilinear form $\varphi$, and let $\{e_1, \dots, e_n\}$ be a basis for $V$. We consider the matrix $(\varphi_{ij})$ where $\varphi_{ij} = \varphi(e_i, e_j)$. Since $\varphi$ is symmetric, the matrix $(\varphi_{ij})$ is symmetric as well, i.e. it can be diagonalized. Let $\lambda_1, \dots, \lambda_n$ be the eigenvalues and $f_1, \dots, f_n$ a basis of diagonalizing eigenvectors. This means that $\varphi(f_i, f_j) = 0$ if $i \neq j$ and $\varphi(f_i, f_i) = \lambda_i$. Observe that none of the eigenvalues is 0 (a 0 eigenvalue would violate non-degeneracy of $\varphi$). Arrange the eigenvalues so that $\lambda_1, \dots, \lambda_k$ are all strictly positive and $\lambda_{k+1}, \dots, \lambda_n$ are all strictly negative, and define
\[ \tilde f_i := \frac{1}{\sqrt{|\lambda_i|}} f_i. \]
Then we see that
\[ \varphi(\tilde f_i, \tilde f_j) = 0 \ \text{ if } i \neq j \qquad\text{and}\qquad \varphi(\tilde f_i, \tilde f_i) = \begin{cases} 1, & i \leq k,\\ -1, & i > k.\end{cases} \]
A basis satisfying this is called a (real) orthonormal basis w.r.t. $\varphi$. Thus we have proven:

Theorem 7.6 (Classification of Real Bilinear Forms). Let $(V, \varphi)$ be a real vector space with non-degenerate bilinear form. Then there exists an orthonormal basis for $V$, and the map sending this basis to the standard basis for $\mathbb{R}^n$ is an orthogonal isomorphism $(V, \varphi) \cong (\mathbb{R}^n, \varphi_{k,n-k})$.

In particular, any real Clifford algebra originating from a non-degenerate quadratic form is isomorphic to $Cl_{p,q}$ for certain $p$ and $q$ (Proposition 7.4).

The complex case is a bit different. On $\mathbb{C}^n$ we have the bilinear forms $\varphi_n$ given by $\varphi_n(e_i, e_j) = \delta_{ij}$. The corresponding quadratic form is denoted $\Phi_n$. For an arbitrary complex vector space $(V, \varphi)$, choose a basis $\{e_1, \dots, e_n\}$ and write $\varphi$ in this basis as $(\varphi_{ij})$. Again, since it is symmetric, it can be diagonalized. Let $\lambda_1, \dots, \lambda_n$ be the eigenvalues and $\{f_1, \dots, f_n\}$ a diagonalizing basis of eigenvectors. None of the eigenvalues is 0, and hence we can define $\tilde f_i := \frac{1}{\sqrt{\lambda_i}} f_i$ (for some choice of complex square root). This basis satisfies $\varphi(\tilde f_i, \tilde f_j) = \delta_{ij}$, and is thus called a (complex) orthonormal basis. Thus we have shown:

Theorem 7.7 (Classification of Complex Bilinear Forms). Let $(V, \varphi)$ be a complex vector space with non-degenerate bilinear form. Then there exists an orthonormal basis for $V$, and the map sending this basis to the standard basis for $\mathbb{C}^n$ is an orthogonal isomorphism $(V, \varphi) \cong (\mathbb{C}^n, \varphi_n)$.

Appealing to Proposition 7.4, we see that a complex Clifford algebra is isomorphic to $Cl(\Phi_n)$ for some $n$.

Again we consider the general situation of Clifford algebras over $\mathbb{K}$. We now want to equip the Clifford algebra with two involutions $t$ and $\alpha$, which we will need later in the construction of various subgroups of $Cl(\Phi)$.

Proposition 7.8. Each Clifford algebra $Cl(\Phi)$ admits a canonical anti-automorphism, i.e. a linear map $t : Cl(\Phi) \to Cl(\Phi)$ that for all $x, y \in Cl(\Phi)$ satisfies
\[ t(x \cdot y) = t(y) \cdot t(x), \qquad t \circ t = \operatorname{id}_{Cl(\Phi)}, \qquad t|_V = \operatorname{id}_V. \]

Proof. Consider the involution $J$ of the tensor algebra given by $v_1 \otimes \cdots \otimes v_k \mapsto v_k \otimes \cdots \otimes v_1$ (and extended by linearity). Now $\pi \circ J : T(V) \to Cl(\Phi)$ is an anti-homomorphism that vanishes on $I$, since
\[ J\big(a \otimes v \otimes v \otimes b - a \otimes (\Phi(v)1) \otimes b\big) = J(b) \otimes v \otimes v \otimes J(a) - J(b) \otimes (\Phi(v)1) \otimes J(a) = J(b) \otimes (v \otimes v - \Phi(v)1) \otimes J(a) \in I. \]
116 Chapter 7 Clifford Algebras But then π J induces a unique anti-homomorphism t : Cl(Φ) Cl(Φ) determined by t[x] = π J(x) (where [x] Cl(Φ) is the coset containing x T (V )). It is easy to see that J(x y) = J(y) J(x), so we also have t(x y) = t(y) t(x). J is clearly an involution, i.e. J J = id T (V ). On one hand id T (V ) induces the map id Cl(Φ), and on the other hand J J induces the map t t. By uniqueness we conclude t t = id Cl(Φ). The last property, t V = id V follows from the fact that J(v) = v for v V. Now for the construction of the second involution: Proposition 7.9. Each Clifford algebra Cl(Φ) admits a canonical automorphism α : Cl(Φ) Cl(Φ) which satisfies α α = id Cl(Φ) and α V = id V. Furthermore we have α(1) = 1 and α(e i1 e ik ) = ( 1) k e i1 e ik. Proof. Consider the linear bijection α : V V given by v v. By the functorial property of Cl it induces an automorphism α : Cl(Φ) Cl(Φ), and we see that α α = α α = id V = id Cl(Φ). The second property of α is obtained from the identity α i = i α which is seen from the commutative diagram in the proof of Proposition 7.4. This gives α(i(v)) = i( v) = i(v), and by considering v an element of Cl(Φ) (by virtue of the injectivity of i) we have proven the claim. The the calculation of t and α on our model algebras C and H we refer to Example 8.6 in the next chapter. Due to involutivity of α, we can split the the Clifford algebra in a direct sum of two subspaces: Cl(Φ) = Cl 0 (Φ) Cl 1 (Φ) (7.3) where Cl i (Φ) = { x Cl(Φ) α(x) = ( 1) i x } for i = 0, 1. It is easily seen that Cl i (Φ) Cl j (Φ) Cl i+j (mod 2) (Φ) which says that the Clifford algebra is a Z 2 - graded algebra or a super algebra. Cl 0 (Φ) is called the bosonic subalgebra (note that it is actually a subalgebra), and Cl 1 (Φ) is called the fermionic subspace. We see that a product of the form v 1 v k for v i V is bosonic if k is even and fermionic if k is odd. Elements of Cl 0 (Φ) Cl 1 (Φ) are called homogenous elements and the degree of a homogenous element x is denoted by x. If A and B are two Z 2 -graded algebras we could form the tensor product A B with the usual product (a b)(a b ) = (aa ) (bb ). However, this is usually not of great interest since the resulting algebra is not Z 2 -graded (at least not non-trivially). Therefore we define the so-called graded tensor product or super tensor product A B in the following way: As a vector space it is just the ordinary tensor product A B but the product is given on homogenous elements a, a A and b, b B by (a b)(a b ) = ( 1) a b (aa ) (bb ). This gives A B a natural grading by defining (A B) 0 = (A 0 B 0 ) (A 1 B 1 ) (A B) 1 = (A 0 B 1 ) (A 1 B 0 ). With this at hand we can accomplish out final task of this section, showing how Cl reacts to a direct sum of vector spaces. By an orthogonal decomposition of (V, Φ), we understand a decomposition V = V 1 V 2 such that if v = v 1 + v 2 we have Φ(v) = Φ 1 (v 1 ) + Φ 2 (v 2 ) (equivalently if ϕ(v 1, V 2 ) = 0).
Proposition 7.10. Assume we have an orthogonal decomposition $V = V_1 \oplus V_2$ of $(V, \Phi)$. Then
\[ Cl(V, \Phi) \cong Cl(V_1, \Phi_1) \mathbin{\hat\otimes} Cl(V_2, \Phi_2) \]
as algebras.

Proof. We use the universal property of Clifford algebras to cook up a map. First, define $g : V \to Cl(V_1, \Phi_1) \mathbin{\hat\otimes} Cl(V_2, \Phi_2)$ by $g(v) = v_1 \otimes 1 + 1 \otimes v_2$. A quick calculation shows that $g(v)^2 = \Phi(v)(1 \otimes 1)$, and thus by the universal property of $Cl$ there exists an algebra homomorphism $\hat g : Cl(V, \Phi) \to Cl(V_1, \Phi_1) \mathbin{\hat\otimes} Cl(V_2, \Phi_2)$ extending $g$. To see that this is indeed an isomorphism, choose a basis $\{e_1, \dots, e_m\}$ for $V_1$ and a basis $\{f_1, \dots, f_n\}$ for $V_2$, and put them together to a basis $\{e_1, \dots, f_n\}$ for $V$. A basis for $Cl(V, \Phi)$ consists of the elements of the form $e_{i_1}\cdots e_{i_k}f_{j_1}\cdots f_{j_l}$ where $i_1 < \cdots < i_k$, $k \leq m$ and $j_1 < \cdots < j_l$, $l \leq n$. Similarly a basis for $Cl(V_1, \Phi_1) \mathbin{\hat\otimes} Cl(V_2, \Phi_2)$ is given by $e_{i_1}\cdots e_{i_k} \otimes f_{j_1}\cdots f_{j_l}$ (with the same restrictions on the indices as above). One can verify that
\[ \hat g(e_{i_1}\cdots e_{i_k}f_{j_1}\cdots f_{j_l}) = e_{i_1}\cdots e_{i_k} \otimes f_{j_1}\cdots f_{j_l}, \]
i.e. it maps basis to basis. Thus it is an algebra isomorphism.

7.2 Classification of Clifford Algebras

In this section we set out to classify Clifford algebras originating from non-degenerate bilinear forms. At first we concentrate on real Clifford algebras. We have taken the first step in the classification already, in that we have shown in the previous section that it is enough to restrict our treatment of Clifford algebras to the case where the vector space is $\mathbb{R}^{p+q}$ equipped with the quadratic form
\[ \Phi_{p,q}(x_1e_1 + \cdots + x_{p+q}e_{p+q}) := x_1^2 + \cdots + x_p^2 - (x_{p+1}^2 + \cdots + x_{p+q}^2) \]
(where $\{e_1, \dots, e_{p+q}\}$ denotes the usual standard basis for $\mathbb{R}^{p+q}$). The associated Clifford algebra was denoted $Cl_{p,q}$. As we saw in Example 7.5, we have
\[ Cl_{0,1} \cong \mathbb{C} \qquad\text{and}\qquad Cl_{0,2} \cong \mathbb{H}. \tag{7.4} \]
Likewise, one can show that the following isomorphisms hold:
\[ Cl_{1,0} \cong \mathbb{R} \oplus \mathbb{R} \qquad\text{and}\qquad Cl_{2,0} \cong Cl_{1,1} \cong \mathbb{R}(2) \tag{7.5} \]
($\mathbb{K}(n)$ is the algebra of $n \times n$ matrices over $\mathbb{K}$, where $\mathbb{K}$ can be either $\mathbb{R}$, $\mathbb{C}$ or $\mathbb{H}$). As is apparent, all four Clifford algebras are isomorphic to either an algebra of matrices over $\mathbb{R}$, $\mathbb{C}$ or $\mathbb{H}$, or to a direct sum of two such algebras. The goal of this section is to show that this is no coincidence: it is a consequence of the Cartan-Bott Periodicity Theorem, which we will prove at the end of this section, that this holds for every Clifford algebra $Cl_{p,q}$. Before proving it, we need two lemmas.

Lemma 7.11. We have the following algebra isomorphisms:
\[ \mathbb{R}(m) \otimes_{\mathbb{R}} \mathbb{R}(n) \cong \mathbb{R}(mn), \qquad \mathbb{C} \otimes_{\mathbb{R}} \mathbb{C} \cong \mathbb{C} \oplus \mathbb{C}, \qquad \mathbb{C} \otimes_{\mathbb{R}} \mathbb{H} \cong \mathbb{C}(2), \qquad \mathbb{H} \otimes_{\mathbb{R}} \mathbb{H} \cong \mathbb{R}(4). \]
If $\mathbb{K}$ denotes either $\mathbb{C}$ or $\mathbb{H}$, then $\mathbb{R}(n) \otimes_{\mathbb{R}} \mathbb{K} \cong \mathbb{K}(n)$.
118 Chapter 7 Clifford Algebras This is well-known so we won t prove it here. 1 lemma which really does most of the work: Instead we show the next Lemma 7.12. We have the following three algebra isomorphisms: Cl 0,n+2 = Cln,0 Cl 0,2, Cl n+2,0 = Cl0,n Cl 2,0 and for all p, q, n N {0}. Cl p+1,q+1 = Clp,q Cl 1,1. Proof. To prove the first isomorphism, the strategy is the following: we will construct a linear map f : R n+2 Cl n,0 Cl 0,2, show that f factorizes through Cl 0,n+2, and that the induced map f : Cl 0,n+2 Cl n,0 Cl 0,2 is an isomorphism. Letting {e 1,..., e n+2 } denote the standard basis for R n+2, {e 1,..., e n} the usual basis for R n, and {e 1, e 2} the usual basis for R 2, and thereby generators for the Clifford algebras Cl 0,n+2, Cl n,0 and Cl 0,2 respectively, we define f by { e i f(e i ) = e 1e 2 if 1 i n 1 e i n if n + 1 i n + 2 For 1 i, j n we compute, using the rules for multiplication in a Clifford algebra, that f(e i ) f(e j ) + f(e j ) f(e i ) = (e i e 1e 2) (e j e 1e 2) + (e j e 1e 2) (e i e 1e 2) = (e ie j) (e 1e 2e 1e 2) + (e je i) (e 1e 2e 1e 2) = (e ie j + e je i) (e 1e 1e 2e 2) = 2δ ij 1 1 because e i e j + e j e i = 2δ ij 1, as {e 1,..., e n} is basis for R n orthonormal w.r.t. Φ n,0. For n + 1 i, j n + 2 we have f(e i ) f(e j ) + f(e j ) f(e i ) = (1 e i n) (1 e j n) + (1 e j n) (1 e i n) = 1 e i ne j n + 1 e j ne i n = 1 (e i ne j n + e j ne i n) = 2δ ij 1 1 where the last minus is due to {e 1, e 2} being a basis for R 2 orthonormal w.r.t. Φ 0,2. A similar computation shows that f(e i )f(e j ) + f(e j )f(e i ) = 0 if 1 i n and n + 1 j n + 2. But then for x = x 1 e 1 + + x n+2 e n+2 we have by linearity of f that f(x) 2 = (x 2 1 + + x 2 n+2)1 1 = Φ 0,n+2 (x) 1 1. Therefore f factorizes uniquely through Cl 0,n+2 to an algebra homomorphism f : Cl 0,n+2 Cl n,0 Cl 0,2. f maps to a set of generators for Cl n,0 Cl 0,2 ; thus f maps to a set of generators for Cl n,0 Cl 0,2. Since f is an algebra homomorphism, f must then be surjective. Since dim Cl 0,n+2 = 2 n+2 = 2 n 2 2 = (dim Cl n,0 )(dim Cl 0,2 ) = dim(cl n,0 Cl 0,2 ), the Dimension Theorem from linear algebra tells us that f is also injective. Thus f is the desired isomorphism. The second isomorphism is proved in exactly the same way, and we avoid repeating ourselves. 1 For a proof consult for instance [Lawson and Michelson], pp 26-27, Proposition 4.2.
7.2 Classification of Clifford Algebras 119 The proof of the third isomorphism is essentially the same as the two first. Let {e 1,..., e p+1, ε 1,..., ε q+1 } be an orthogonal basis for R p+q+2 (i.e. ϕ(v, w) = 0 when v w) with the quadratic form Φ p+1,q+1, such that Φ p+1,q+1 (e i ) = 1, Φ p+1,q+1 (ε i ) = 1, and let {e 1,..., e p, ε 1,..., ε q} and {e 1, ε 1} be similar bases for R p+q and R 1+1 (and thereby generators for the Clifford algebras Cl p+1,q+1, Cl p,q and Cl 1,1 respectively). We now define a linear map f : R p+q+2 Cl p,q Cl 1,1 by { e i f(e i ) = e 1ε 1 if 1 i p 1 e 1 if i = p + 1 Just like before it can be shown that {, f(ε j ) = ε j e 1ε 1 if 1 j q 1 ε 1 if j = q + 1. f(x) 2 = Φ p+1,q+1 (x) 1 1, and thus f induces an isomorphism f : Cl p+1,q+1 Clp,q Cl 1,1. Now we are ready to state and prove the Cartan-Bott Theorem: Theorem 7.13 (Cartan-Bott I). We have the following isomorphisms: Cl 0,n+8 = Cl0,n R(16) and Cl n+8,0 = Cln,0 R(16). Proof. Using the two first isomorphisms from Lemma 7.12 a couple of times yields Cl 0,n+8 = Cln+6,0 Cl 0,2 = = Cl0,n Cl 2,0 Cl 0,2 Cl 2,0 Cl 0,2 = Cl 0,n Cl 2,0 Cl 2,0 Cl 0,2 Cl 0,2 where the last isomorphism follows from the fact that for arbitrary real algebras A and B we have A B = B A. From (7.4) we have Cl 0,2 = H, and from (7.5) that Cl 2,0 = R(2). Thus, using H R H = R(4) from Lemma 7.11, we get Cl 2,0 Cl 2,0 Cl 0,2 Cl 0,2 = R(2) R(2) (H R H) = R(4) R(4) = R(16). This completes the proof of the first isomorphism. The proof of the second isomorphism is identical to this. Now it s evident that once we know the Clifford algebras Cl 0,0, Cl 0,1,..., Cl 0,7 and Cl 0,0, Cl 1,0,..., Cl 7,0 we know all of them: Consider Cl p,q and assume that p q. Then by the third isomorphism in Lemma 7.12 we have that Cl p,q is isomorphic to Cl p q,0 (Cl 1,1 ) q = Cl p q,0 R(2 q ), and Cl p q,0 can be expressed as a tensor product of Cl k,0 (with 0 k 7) and some copies of R(16). We can do the same if q p. Using the isomorphisms from Lemma 7.11 one obtains the table of Clifford algebras in Appendix A. From this table we see that any Clifford algebra is either a matrix algebra or a sum of two such algebras, as we pointed out in the beginning of this section. Example 7.14. As an example, let us show that Cl 2,11 = C(64): Cl 2,11 = Cl0,9 Cl 1,1 Cl 1,1 = Cl0,1 R(16) R(2) R(2) = C R R(64) = C(64). As yet another example, let us show that Cl 3,2 = R(4) R(4): Cl 3,2 = Cl1,0 Cl 1,1 Cl 1,1 = (R R) R(4) = (R R(4)) (R R(4)) = R(4) R(4), where we have used that the tensor product is distributive w.r.t. and that for any real algebra A the isomorphism R R A = A holds.
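The reductions of Example 7.14 are entirely mechanical, and can be automated. The following Python sketch (not part of the text; the encoding of the table of $Cl_{0,n}$ and $Cl_{n,0}$ for $0 \leq n \leq 7$ is this snippet's assumption, taken to agree with Appendix A) combines the third isomorphism of Lemma 7.12 with the periodicity of Theorem 7.13 and Lemma 7.11 to express any $Cl_{p,q}$ as a matrix algebra:

\begin{verbatim}
def classify(p, q):
    """Cl_{p,q} as a matrix algebra string, e.g. 'C(64)' (a sketch)."""
    base_neg = {0: ('R', 1), 1: ('C', 1), 2: ('H', 1), 3: ('H+H', 1),
                4: ('H', 2), 5: ('C', 4), 6: ('R', 8), 7: ('R+R', 8)}
    base_pos = {0: ('R', 1), 1: ('R+R', 1), 2: ('R', 2), 3: ('C', 2),
                4: ('H', 2), 5: ('H+H', 2), 6: ('H', 4), 7: ('C', 8)}
    # Cl_{p,q} ~ Cl_{p-q,0} (x) R(2^q) if p >= q (Lemma 7.12), and
    # symmetrically if q >= p.
    if p >= q:
        n, size, table = p - q, 2 ** q, base_pos
    else:
        n, size, table = q - p, 2 ** p, base_neg
    size *= 16 ** (n // 8)        # Cartan-Bott: one factor R(16) per 8
    field, s = table[n % 8]
    size *= s
    if '+' in field:
        return f"{field[0]}({size}) + {field[0]}({size})"
    return f"{field}({size})"

print(classify(2, 11))            # C(64),        as in Example 7.14
print(classify(3, 2))             # R(4) + R(4),  as in Example 7.14
print(classify(1, 1))             # R(2),         as in (7.5)
\end{verbatim}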
120 Chapter 7 Clifford Algebras By repeating the arguments of Example 7.14 in a more general setting we obtain: Corollary 7.15. Cl p,q p q 1 (mod 4). is a direct sum of two matrix algebras exactly when We conclude this treatment by proving Proposition 7.16. We have the following algebra isomorphism Cl p,q = Cl 0 p,q+1. Proof. Let {e 1,..., e p+q+1 } denote an orthogonal basis for R p+q+1 which satisfies Φ(e i ) = 1 for i = 1,..., p and Φ(e j ) = 1 for j = p + 1,..., p + q + 1. Assume the basis has been chosen so that {e 1,..., e p+q } is a basis for R p+q. Now define a linear map f : R p+q Cl 0 p,q+1 by f(e i ) = e p+q+1 e i for i p+q. Like in the proof of Lemma 7.12 one checks that f satisfies f(x) 2 = Φ(x) 1 and thus factorizes to an algebra homomorphism f : Cl p,q Cl 0 p,q+1. By inspection, this is the desired isomorphism. For the rest of this section we will consider complex Clifford algebras. It turns out that complex Clifford algebras behave even nicer than their real counterparts. As we have already seen a complex Clifford algebra associated with a non-degenerate bilinear form is isomorphic to Cl(Φ n ). The fact that there is only one index on Φ and not two as in the real case, indicates some sort of simplification. But first we introduce a way of turning real vector spaces/algebras into complex ones: Definition 7.17 (Complexification). By the complexification of a real vector space V we mean the real tensor product V C := V R C. If If the vector space V carries a bilinear form φ, then the complexification of the bilinear form is ϕ C (v λ, v λ ) := λλ ϕ(v, v ). The complexification of the corresponding quadratic form Φ is then Φ C (v λ) = λ 2 Φ(v). In the same way we define the complexification of a real algebra A by A C := A R C carrying the product (a λ)(a λ ) = (aa ) (λλ ). Assume (V, ϕ) to be a real vector space with ϕ a non-degenerate bilinear form. Then ϕ C is a non-degenerate bilinear form on V C for assume v 0 λ 0 to satisfy ϕ C (v 0 λ 0, v λ) = λ 0 λϕ(v 0, v) 0 for all v λ. Then λ 0 0 and ϕ(v 0, v) 0 for all v and by non-degeneracy of ϕ this implies that v 0 0. Thus v 0 λ 0 0. In particular, if p + q = n we have that ϕ C p,q is equivalent to ϕ n and thus Φ C p,q is equivalent to Φ n. Now, one can pose the question: is the complexification of a real Clifford algebra a complex Clifford algebra? By the following lemma the answer is yes. Lemma 7.18. For p + q = n we have Cl(Φ C p,q) = Cl(Φ n ) = Cl C 0,n. Proof. The first isomorphism is due to the fact that the complexification of Φ p,q is equivalent to Φ n and thus the corresponding Clifford algebras are isomorphic. To verify the second isomorphism we construct a linear map ϕ : C n Cl C 0,n and show that it factorizes to an isomorphism ϕ : Cl(Φ n ) Cl C 0,n. At first,
7.3 Representation Theory 121 we remark that C n = R n R C. We then define ϕ by ϕ(v z) = i(v) z where i : R n Cl 0,n denotes the usual embedding. Since ϕ(v z) 2 = (i(v) z) 2 = i(v) 2 z 2 = Φ 0,n (v)z 2 1 1 = Φ C n(v z) 1 1, ϕ factorizes uniquely to an algebra homomorphism ϕ : Cl(Φ n ) Cl C 0,n. Both algebras have complex dimension 2 n so it s enough to show that ϕ is surjective. But ϕ is surjective since ϕ is an algebra homomorphism and ϕ maps onto a set of generators of Cl C 0,n. Namely, the set of elements of the form i(v) generate Cl 0,n, and 1 C generates C. Thus, henceforth we will stick to the notation Cl C n for the complex Clifford algebra over C n equipped with any non-degenerate quadratic form, since the preceding lemma guarantees that they are all isomorphic. This result, in combination with the classification results for real Clifford algebras, we get the complex version of the Cartan-Bott Theorem: Theorem 7.19 (Cartan-Bott II). We have the following2- periodicity : Cl C n+2 = Cl C n C Cl C 2, and furthermore that Cl C 2 = C(2). Proof. Invoking Lemma 7.12 and Lemma 7.18 we obtain the following chain of isomorphisms: Cl C n+2 = Cl 0,n+2 R C = Cl n,0 R C R Cl 0,2 = Cl n,0 R (C C C) R Cl 0,2 = (Cln,0 R C) C (C R Cl 0,2 ) = Cl C n C Cl C 2. For the second isomorphism, just recall that Cl 0,2 = H and H R C = C(2). Remembering that Cl 1,0 = R R so that Cl C 1 = C C, we obtain: Corollary 7.20. If n = 2k, then Cl C n = C(2 k ). If n = 2k + 1, then Cl C n = C(2 k ) C(2 k ). 7.3 Representation Theory In this section we will turn our attention to the representation theory of Clifford algebras. For the following definition, let us denote by K either R, C or H: Definition 7.21 (Algebra Representation). Let A be an algebra over K and V a vector space over K. A K-representation of A is an algebra homomorphism ρ : A End K (V ). A subspace U of V is called invariant under ρ if ρ(x)u U for all x A. The representation ρ is called irreducible if the only invariant subspaces are {0} and V. By an intertwiner of two representations ρ and ρ of A on V and V we understand a linear map f : V V satisfying ρ (x) f = f ρ(x) for all x A. Two representations are called equivalent if there exists an intertwiner between them which is also an isomorphism of vector spaces. Just as we had complexification of a real algebra, we can complexify complex representations: If ρ : A End(V ) is a representation of a real algebra on a complex vector space V, we define the complexification ρ C : A C End(V ) by ρ C (x λ) = λρ(v). The notions of invariant subspaces and irreducibility of a representation and its complexified are closely related as the following proposition shows
122 Chapter 7 Clifford Algebras Proposition 7.22. Let A be a real unital algebra, and ρ an algebra representations on the complex V. Let A C and ρ C denote the associated complexifications. Then the following hold: 1) A subspace W V is ρ-invariant if and only if it is ρ C -invariant. 2) ρ is irreducible if and only if ρ C is irreducible. Proof. 1) If W is ρ-invariant then ρ C (x λ)w = λρ(x)w W. Conversely, if W is ρ C -invariant then for x A we have ρ(x)w = ρ C (x 1)W W. This proves 1). 2) Follows immediately from 1). A representation ρ of A on V gives V the structure of a left A-module simply by defining a v := ρ(a)v. This is compatible with addition in V. The next proposition actually contains all the information we need to determine all the irreducible representations (up to equivalence) of the Clifford algebras: Proposition 7.23. The matrix algebra K(n) has only one irreducible K-representation, namely the defining representation i.e. the natural isomorphism π n : K(n) End K (K n ). The algebra K(n) K(n) has exactly 2 inequivalent irreducible K-representations, namely: π 0 n(x 1, x 2 ) := π n (x 1 ) and π 1 n(x 1, x 2 ) := π n (x 2 ). (7.6) We saw in the previous section that Cl p,q is of the form K(n) K(n) iff p q 1 (mod 4) and a matrix algebra otherwise. This observation along side with the preceding proposition yields the number of irreducible representations of the real Clifford algebras. But there is a slight problem here, in that the real Clifford algebra is not always a real matrix algebra or a sum of real matrix algebras. Thus the irreducible representations of Proposition 7.23 need not be real! For instance Cl 1,4 = H(2) H(2), and Proposition 7.23 gives us two irreducible H-representations over H 2! But fortunately we can always turn a complex or quaternionic representation into a real representation if we just remember to adjust the dimension. 2 Without going further into details with keeping track of the dimensions we have shown: Theorem 7.24. Consider the real Clifford algebra Cl p,q. If p q 1 (mod 4) Cl p,q has up to equivalence two real irreducible representations and up to equivalence exactly one real irreducible representation otherwise. In particular if n 1 (mod 4) Cl 0,n has two irreducible representations, ρ 0 n and ρ 1 n, and otherwise only one ρ n. If n 1 (mod 4) we define ρ n := ρ 0 n such that to each real Clifford algebra Cl 0,n we associate a real irreducible representation ρ n called the real spin representation. The elements of the corresponding vector spaces are called spinors. The similar situation for complex Clifford algebras is simpler due to the fact that each complex Clifford algebra decomposes into complex matrix algebras (cf. Corollary 7.20). Thus all irreducible representations are complex. This will make it a lot easier to keep track of the dimensions. 2 C n is naturally isomorphic to R 2n via the isomorphism ϕ : C n R 2n, (λ 1,..., λ n) (Re λ 1, Im λ 1,..., Re λ n, Im λ n). If π is a complex representation of an algebra A, then we just define a real representation on R 2n by π(x)(v) = ϕ(π(x)ϕ 1 (v)). It s easy to see that π is irreducible iff π is. Likewise with a quaternionic representation we exploit the natural isomorphism H n = R 4n.
Theorem 7.25. Consider the complex Clifford algebra $Cl^{\mathbb{C}}_n$. If $n = 2k$ we have (up to equivalence) exactly one irreducible complex representation $\kappa_n$ on $\mathbb{C}^{2^k}$, namely the isomorphism $Cl^{\mathbb{C}}_{2k} \cong \operatorname{End}(\mathbb{C}^{2^k})$. If $n = 2k+1$ we have (up to equivalence) exactly two irreducible representations $\kappa^0_n$ and $\kappa^1_n$ on $\mathbb{C}^{2^k}$.

For $n = 2k$ or $n = 2k+1$ as above, define $\Delta_n := \mathbb{C}^{2^k}$. The elements of $\Delta_n$ are called Dirac spinors or complex $n$-spinors. The irreducible representations of $Cl^{\mathbb{C}}_n$ are representations on $\Delta_n$. In the case where $n$ is odd we want to single out $\kappa^0_n$ and define $\kappa_n := \kappa^0_n$, which is just the composition
\[ \kappa_n = \kappa^0_n : Cl^{\mathbb{C}}_n \xrightarrow{\ \cong\ } \operatorname{End}_{\mathbb{C}}(\Delta_n) \oplus \operatorname{End}_{\mathbb{C}}(\Delta_n) \xrightarrow{\ \pi_1\ } \operatorname{End}_{\mathbb{C}}(\Delta_n) \]
of the isomorphism with the projection $\pi_1$ onto the first component. Hence, for each $n$ we have an irreducible complex representation on $\Delta_n$, called the complex spin representation.

Finally, let's try to break up the action of the Clifford algebra into smaller pieces and see how they act on the spinors. This requires the introduction of the so-called volume element. It is well known that $\Lambda^*(\mathbb{R}^n)$ has a volume element $\Omega$, given unambiguously, in any positively oriented orthonormal basis $\{e_1, \dots, e_n\}$, by $\Omega = e_1\wedge\cdots\wedge e_n$. Applying the quantization map to this yields an element $\omega := Q(\Omega) \in Cl_{0,n}$, also called the volume element, given by $\omega = e_1\cdots e_n$. For the complex Clifford algebra $Cl^{\mathbb{C}}_n$ we define the volume element by $\omega_{\mathbb{C}} := i^{\lfloor (n+1)/2 \rfloor}\omega$.

In the case $n = 2k$ we note that $\omega^2 = (-1)^k$ and that $\omega$ commutes with every element of $Cl^0_{0,2k}$, while $\omega$ anti-commutes with $Cl^1_{0,2k}$; for instance we see that
\[ \omega e_1 = (e_1\cdots e_{2k})e_1 = (-1)^{2k-1}e_1^2e_2\cdots e_{2k} = -e_1\omega. \]
If $n = 2k+1$ then $\omega$ commutes with everything. We are only interested in the even case, so assume $n = 2k$; then $\omega_{\mathbb{C}} = i^k\omega$, and we consider the map
\[ f := \kappa_{2k}(\omega_{\mathbb{C}}) = i^k\kappa_{2k}(\omega) : \Delta_{2k} \to \Delta_{2k}. \]
From this we see that $f$ commutes with $\kappa_{2k}(\xi)$ for $\xi \in Cl^0_{0,2k}$:
\[ f(\kappa_{2k}(\xi)\psi) = i^k\kappa_{2k}(\omega)(\kappa_{2k}(\xi)\psi) = i^k\kappa_{2k}(\omega\xi)\psi = i^k\kappa_{2k}(\xi\omega)\psi = \kappa_{2k}(\xi)\,i^k\kappa_{2k}(\omega)\psi = \kappa_{2k}(\xi)f(\psi) \]
for $\psi \in \Delta_{2k}$. Furthermore $f$ is an involution:
\[ f \circ f = i^{2k}\kappa_{2k}(\omega^2) = (-1)^k\kappa_{2k}\big((-1)^k 1\big) = (-1)^{2k}\operatorname{id}_{\Delta_{2k}} = \operatorname{id}_{\Delta_{2k}}. \]
Then it is well known that $f$ has the eigenvalues $\pm 1$ with corresponding eigenspaces $\Delta^{\pm}_{2k}$ (of equal dimension), such that $\Delta_{2k} = \Delta^+_{2k} \oplus \Delta^-_{2k}$. The elements of $\Delta^{\pm}_{2k}$ are called positive and negative Weyl spinors, or even and odd chiral spinors, respectively. We can use the map $f$ to induce another splitting of the complexified Clifford algebra (in even dimension!), for left multiplication by $\omega_{\mathbb{C}}$ on $Cl^{\mathbb{C}}_{2k}$ is an involution; hence the algebra splits into eigenspaces
\[ Cl^{\mathbb{C}}_{2k} = (Cl^{\mathbb{C}}_{2k})_+ \oplus (Cl^{\mathbb{C}}_{2k})_- \]
124 Chapter 7 Clifford Algebras where, in fact (Cl C 2k) ± = 1 2 (1 ± ω C) Cl C 2k. We can combine this splitting with the splitting given by the involution α to obtain the spaces (Cl C 2k) 0 ± := 1 2 (1 ± ω C)(Cl C 2k) 0 and (Cl C 2k) 1 ± := 1 2 (1 ± ω C)(Cl C 2k) 1. For the next proposition we will identify End( + 2k ) as the subspace of End( 2k) consisting of maps 2k 2k which map + 2k to itself and which map 2k to 0, and in a similar way we identify End( 2k ) and Hom( ± 2k, 2k ) as subspaces of End( 2k ). Proposition 7.26. The spin representation κ 2k : Cl C 2k End( 2k ) restricts to the following isomorphisms (Cl C 2k) 0 + = End( + 2k ), (ClC 2k) 0 = End( 2k ) (Cl C 2k) 1 + = Hom( 2k, + 2k ), (ClC 2k) 1 = Hom( + 2k, 2k ). Proof. First assume ψ + 2k, i.e. f(ψ) = ψ, then for ξ (ClC 2k) 0 +: f(κ 2k (ξ)ψ) = κ 2k (ω C ξ)ψ = κ 2k (ξω C )ψ = κ 2k (ξ)f(ψ) = κ 2k (ξ)ψ, i.e. κ 2k (ξ)ψ + 2k. For this result we only used that (ξ ClC 2k) 0, but to show that κ 2k (ξ) is 0 on 2k we need that ξ = 1 2 (1 + ω C)ξ. Let ψ 2k, i.e. κ 2k(1 + ω C )ψ = 0, then κ 2k (ξ)ψ = 1 2 κ 2k((1 + ω C )ξ)ψ = 1 2 κ 2k(ξ(1 + ω C )) = 0. This shows that κ 2k maps (Cl C 2k) 0 + into End( + 2k ). The reasoning is the same in the other 4 cases. But then, since κ 2k is an isomorphism, the restricted maps must be isomorphisms as well, and this proves the proposition.
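In the lowest even dimension these splittings can be seen very explicitly. The sketch below (not from the text; it assumes NumPy) realizes $\kappa_2 : Cl^{\mathbb{C}}_2 \cong \mathbb{C}(2)$ by $e_1 \mapsto i\tau_1$, $e_2 \mapsto i\tau_2$ (with $\tau_j$ the physicists' Pauli matrices, so that $e_j^2 = -1$) and exhibits the chirality operator $f$ and the Weyl spinor spaces $\Delta^{\pm}_2$:

\begin{verbatim}
import numpy as np

tau1 = np.array([[0, 1], [1, 0]], dtype=complex)
tau2 = np.array([[0, -1j], [1j, 0]])
e1, e2 = 1j * tau1, 1j * tau2            # e_j^2 = -1, e1 e2 = -e2 e1

omega = e1 @ e2                          # volume element, k = 1
assert np.allclose(omega @ omega, -np.eye(2))   # omega^2 = (-1)^k
chi = 1j * omega                         # f = kappa_2(omega_C) = i^k kappa_2(omega)
assert np.allclose(chi @ chi, np.eye(2))        # f is an involution
assert np.allclose(chi @ e1, -e1 @ chi)         # f anti-commutes with Cl^1

evals, evecs = np.linalg.eigh(chi)
print("eigenvalues of f:", np.round(evals, 6))  # -1 and +1: Weyl spinors
\end{verbatim}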
Chapter 8

Spin Groups

8.1 The Clifford Group

Having introduced the Clifford algebra $Cl(\Phi)$, we proceed to define its Clifford group $\Gamma(\Phi)$. The point of doing this is that the Clifford group has two particularly interesting subgroups, the pin and spin groups. Let $V$ be a finite-dimensional real vector space, and let $\Phi$ be a quadratic form on $V$.

Definition 8.1 (Clifford Group). Let $Cl^*(\Phi)$ denote the multiplicative group of invertible elements of $Cl(\Phi)$. The Clifford group (by some also called the Lipschitz group) of $Cl(\Phi)$ is the group
\[ \Gamma(\Phi) := \{x \in Cl^*(\Phi) \mid \alpha(x)vx^{-1} \in V \text{ for all } v \in V\}. \]

One mechanically verifies that $\Gamma(\Phi)$ is truly a group. The group $Cl^*(\Phi)$ is an open subgroup of $Cl(\Phi)$, just as $\operatorname{Aut}(V)$ is an open subgroup of $\operatorname{End}(V)$ (at least when $V$ is finite-dimensional). In the latter case the Lie algebra of $\operatorname{Aut}(V)$ is just $\operatorname{End}(V)$ with the commutator bracket. In the same way the Lie algebra $\mathfrak{cl}^*(\Phi)$ of the group $Cl^*(\Phi)$ is just $Cl(\Phi)$ with the commutator bracket.

It is very conspicuous from the definition that we are interested in a particular representation $\Lambda : \Gamma(\Phi) \to \operatorname{Aut}(V)$, namely $\Gamma(\Phi) \ni x \mapsto \Lambda_x$, where $\Lambda_x : V \to V$ is given by $\Lambda_x(v) = \alpha(x)vx^{-1}$; it is called the twisted adjoint representation (indeed, the form of $\Lambda_x$ is reminiscent of the adjoint representation of a Lie group). One reason for considering $\Lambda_x(v) = \alpha(x)vx^{-1}$ instead of $\operatorname{Ad}_x(v) = xvx^{-1}$ is that the twisted adjoint representation keeps track of an otherwise annoying sign.

W.r.t. the bilinear form $\varphi$ we define, for $x \in V$ with $\Phi(x) \neq 0$, the reflection $s_x$ through the hyperplane orthogonal to $x$ by
\[ s_x(v) := v - 2\frac{\varphi(v,x)}{\Phi(x)}x. \]
We then have the following geometric interpretation of $\Lambda_x$, which in addition to being a pretty fact is crucial in the proof of Lemma 8.10.

Proposition 8.2. For any $x \in V$ with $\Phi(x) \neq 0$ we have $x \in \Gamma(\Phi)$, and the map $\Lambda_x : V \to V$ given by $\Lambda_x(v) = \alpha(x)vx^{-1}$ is the reflection through the hyperplane orthogonal to $x$.

Proof. First, since $x^2 = \Phi(x)1 \neq 0$, $x$ is invertible in $Cl(\Phi)$ with inverse $x^{-1} = \frac{1}{\Phi(x)}x \in V$. Using this, $\alpha(x) = -x$ and Eq. (7.2), we see that
\[ \Phi(x)\,\alpha(x)vx^{-1} = -xv(\Phi(x)x^{-1}) = -xvx = -(2\varphi(v,x)1 - vx)x = -2\varphi(v,x)x + \Phi(x)v \in V, \]
i.e. we have $x \in \Gamma(\Phi)$ and
\[ \Lambda_x(v) = v - 2\frac{\varphi(v,x)}{\Phi(x)}x = s_x(v). \]
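Proposition 8.2 is also easy to test numerically. The following sketch (not from the text; it assumes NumPy) models $Cl_{0,3}$ by $e_j \mapsto i\tau_j$, with $\tau_j$ the Pauli matrices, so that $e_j^2 = -1$, and compares $\Lambda_x$ with the Euclidean reflection:

\begin{verbatim}
import numpy as np

tau = [np.array([[0, 1], [1, 0]], dtype=complex),
       np.array([[0, -1j], [1j, 0]]),
       np.array([[1, 0], [0, -1]], dtype=complex)]
e = [1j * t for t in tau]                 # e_j^2 = -1, e_i e_j = -e_j e_i

embed = lambda v: v[0]*e[0] + v[1]*e[1] + v[2]*e[2]

rng = np.random.default_rng(0)
x = rng.normal(size=3); x /= np.linalg.norm(x)   # unit vector, Phi(x) = -1
v = rng.normal(size=3)

X, V = embed(x), embed(v)
X_inv = -X                                # x^{-1} = x / Phi(x) = -x
Lambda_x = (-X) @ V @ X_inv               # alpha(x) v x^{-1}, alpha(x) = -x

# Phi is negative definite, so s_x(v) = v - 2 <v,x> x (Euclidean reflection)
s_x = v - 2 * np.dot(v, x) * x
assert np.allclose(Lambda_x, embed(s_x))
print("Lambda_x = s_x verified")
\end{verbatim}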
The following proposition is extremely useful:

Proposition 8.3. Let $\varphi$ be a non-degenerate bilinear form on $V$. Then for the twisted adjoint representation, $\ker\Lambda = \mathbb{R}^*\,1$ (where $\mathbb{R}^* = \mathbb{R}\setminus\{0\}$).

Proof. Since $\varphi$ is non-degenerate, we can choose an orthogonal basis $e_1, \dots, e_n$ for $V$ such that $\Phi(e_i) = \pm 1$ and $\varphi(e_i, e_j) = 0$ when $i \neq j$. Let $x \in \ker\Lambda$; this means $\alpha(x)v = vx$ for all $v \in V$. Because of the $\mathbb{Z}_2$-grading of $Cl(\Phi)$ we can write $x = x^0 + x^1$, where $x^0$ and $x^1$ belong to $Cl^0(\Phi)$ and $Cl^1(\Phi)$, respectively. This gives us the equations
\begin{align}
vx^0 &= x^0v, \tag{8.1}\\
vx^1 &= -x^1v. \tag{8.2}
\end{align}
The terms $x^0$ and $x^1$ can be written as polynomials in $e_1, \dots, e_n$. By successively applying the identity $e_ie_j + e_je_i = 2\varphi(e_i, e_j)$ we can express $x^0$ in the form $x^0 = a^0 + e_1a^1$, where $a^0$ and $a^1$ are both polynomial expressions in $e_2, \dots, e_n$. Applying $\alpha$ to this equality shows that $a^0 \in Cl^0(\Phi)$ and $a^1 \in Cl^1(\Phi)$. Setting $v = e_1$ in Eq. (8.1) we get
\[ e_1a^0 + e_1^2a^1 = a^0e_1 + e_1a^1e_1 = e_1a^0 - e_1^2a^1, \]
where the last equality follows from $a^0 \in Cl^0(\Phi)$ and $a^1 \in Cl^1(\Phi)$ (neither containing $e_1$). We deduce $0 = e_1^2a^1 = \Phi(e_1)a^1$; since $\Phi(e_1) \neq 0$, we have $a^1 = 0$. So the polynomial expression for $x^0$ does not contain $e_1$. Proceeding inductively, we realize that $x^0$ does not contain any of the terms $e_1, \dots, e_n$, and so it must have the form $x^0 = t\,1$ where $t \in \mathbb{R}$. We can apply an analogous argument to $x^1$, using Eq. (8.2), to conclude that neither does the polynomial expression for $x^1$ contain any of the terms $e_1, \dots, e_n$. However, $x^1 \in Cl^1(\Phi)$, so $x^1 = 0$. Thus $x = x^0 + x^1 = t\,1$, and since $x \neq 0$ we must have $t \in \mathbb{R}^*$. This shows $\ker\Lambda \subseteq \mathbb{R}^*\,1$; the reverse inclusion is obvious.

The assumption that $\Phi$ is non-degenerate is not redundant. Consider a real vector space $V$ with $\dim V \geq 2$. If $\Phi \equiv 0$, then $Cl(V, \Phi) = \Lambda^* V$, the exterior algebra of $V$. Consider the element $x = 1 + e_1e_2$. Clearly $x^{-1} = 1 - e_1e_2$, and we have
\[ \alpha(1 + e_1e_2)\,v\,(1 + e_1e_2)^{-1} = (1 + e_1e_2)\,v\,(1 - e_1e_2) = v, \]
i.e. $1 + e_1e_2 \in \ker\Lambda$, yet $1 + e_1e_2$ is not a scalar multiple of $1$. Thus, since the following propositions use Proposition 8.3 in their proofs, we will from this point on always assume $\Phi$ to be non-degenerate. This is not a severe restriction, since practically all interesting Clifford algebras originate from non-degenerate bilinear forms.

We now introduce the important notions of conjugation and norm.
8.1 The Clifford Group 127 Definition 8.4. For any x Cl(Φ), the conjugate of x is defined as x := t(α(x)). Moreover, the norm of x is defined as N(x) := xx. Note that t α = α t (it clearly holds on e i1 e ir, and by linearity on any element of Cl(Φ)). Also note that x = x. The term norm is justified in the following lemma, displaying some elementary properties of the norm Lemma 8.5. When Φ is non-degenerate, the norm possesses the following properties: 1) If v V then N(v) = Φ(v) 1, i.e. N is a negative extension of Φ to the algebra. 2) If x Γ(Φ), then N(x) R 1. 3) When restricted to Γ(Φ), the norm N : Γ(Φ) R 1 is a homomorphism. Moreover, N(α(x)) = N(x). Proof. 1) This is just a simple calculation N(v) = v(t α(v)) = v 2 = Φ(v) 1. 2) According to Proposition 8.3, it s enough to show that N(x) ker Λ. By definition of the Clifford group, x Γ(Φ) implies As t V = id V, we thus have α(x)vx 1 V for all v V. α(x)vx 1 = t(α(x)vx 1 ) = t(x 1 )vt(α(x)). Isolating the v on the right-hand-side we get, using t α = α t, v = t(x)α(x)v(t(α(x))x) 1 = α(xx)v(xx) 1, i.e. xx ker Λ. But then xx = x x ker Λ. 3) Two simple calculations: N(xy) = xyy x = xn(y)x = xxn(y) = N(x)N(y) and, since α(x) = α(x), (following from t α = α t) N(α(x)) = α(x)α(x) = α(xx) = α(n(x)) = N(x). Example 8.6. Let s calculate the conjugate and the norm on the two model Clifford algebras. First Cl 0,1 = C. Recall that R sits inside C as the imaginary line and that α is the involution satisfying α(1) = 1 and α(λ) = λ for λ R. Thus α is just conjugation: α(z) = z. Since t is just the identity on Cl 0,1 it follows that conjugation in the Clifford sense is just usual conjugation t(α(z)) = z. Therefore the norm becomes N(z) = zz = z 2 = Φ 0,1 (z), i.e. the square of the usual norm and minus the quadratic form. For Cl 0,2 = H the situation is similar. For Cl0,2 we have the basis {1, e 1, e 2, e 1 e 2 } where {e 1, e 2 } is the usual basis for R 2, and R 2 sits inside Cl 0,2 as span{e 1, e 2 }. We have that α(1) = 1, α(e i ) = e i and α(e 1 e 2 ) = α(e 1 )α(e 2 ) = e 1 e 2. Furthermore t(1) = 1, t(e i ) = e i and t(e 1 e 2 ) = t(e 2 )t(e 1 ) = e 2 e 1 = e 1 e 2. Thus t(α(q 0 1 + q 1 e 1 + q 2 e 2 + q 3 e 1 e 2 )) = q 0 1 q 1 e 1 q 2 e 2 q 3 e 1 e 2. As before, conjugation in the Clifford sense is just usual conjugation q. Furthermore we see that N(q) = qq = q 2 again the square of the usual norm.
128 Chapter 8 Spin Groups For the next proposition recall the definition of the orthogonal group O(Φ) (when Φ is non-degenerate!) as the endomorphisms f : V V satisfying Φ(f(v)) = Φ(v). Proposition 8.7. For any x Γ(Φ), the map Λ x is an orthogonal transformation of V. That is, Λ(Γ(Φ)) O(Φ). Proof. Let x Γ(Φ) and use the fact that N is a homomorphism: N(Λ x v) = N(α(x)vx 1 ) = N(α(x))N(v)N(x 1 ) = N(x)N(v)N(x) 1 = N(v), Since N(v) = Φ(v) 1, this shows that Λ x is Φ-preserving. Along with the linearity, this exactly shows that Λ x O(Φ). 8.2 Pin and Spin Groups Definition 8.8 (Spin Group). The pin group is defined as Pin(Φ) := {x Γ(Φ) N(x) = ±1}. The spin group consists of those elements of Pin(Φ) that are linear combinations of even-degree elements: Spin(Φ) := Pin(Φ) Cl 0 (Φ). It s not too difficult to verify that Pin(Φ) and Spin(Φ) really are groups. We will write Pin(p, q) and Spin(p, q) for the pin and spin groups associated with Cl p,q and likewise Pin(n) := Pin(0, n) and Spin(n) := Spin(0, n) (not to be confused with the complex pin and spin groups Pin(Φ n ) and Spin(Φ n ) sitting inside Cl C n!). Recall that the algebra structure in Cl 0,n is that generated by the relations v v = v 2 1. Since Φ is assumed to be non-degenerate, any real Clifford algebra is (up to isomorphism) of the form Cl p,q and thus we can in fact always assume the real pin and spin groups to be of the form Pin(p, q) and Spin(p, q). In addition to the complex pin and spin groups there are also complexifications of the real pin/spin groups, namely let (V, Φ) be a real quadratic vector space and define Pin c (Φ) Cl(Φ) C = Cl(Φ) C to be the subgroup of invertible elements in Cl(Φ) C generated by Pin(Φ) 1 and 1 U(1). Similarly, Spin c (Φ) is defined as the subgroup generated by Spin(Φ) 1 and 1 U(1). These are not to be confused with Spin(Φ C )! Proposition 8.9. There is a Lie group isomorphism Spin c (Φ) Spin(Φ) ±1 U(1) i.e. the Spin c (Φ) is quotient of Spin(Φ) U(1) where we identify (1, 1) and ( 1, 1). In particular Spin c (Φ) is connected if Spin(Φ) is connected. Proof. We consider the smooth map Spin(Φ) U(1) Spin c (Φ) given by (g, z) gz. This is a surjective Lie group homomorphism. The kernel consists of elements (g, z) such that gz = 1 i.e. g = z 1 U(1). But these elements are rare, there are only ±1. Thus the kernel equals {±(1, 1)}, and the map above descends to an isomorphism. The group Spin c (Φ) is important in the study of complex manifolds. We have the following important lemma, to be used in the proof of Theorem 8.16.
8.2 Pin and Spin Groups 129 Lemma 8.10. The map Λ : Pin(Φ) O(Φ) is a surjective homomorphism with kernel { 1, 1}. Similarly, the map Λ : Spin(Φ) SO(Φ) is a surjective homomorphism with kernel { 1, 1}. Proof. First, Proposition 8.7 guarantees Λ(Pin(Φ)) O(Φ). It is trivial to verify that Λ is a homomorphism. To prove that Λ is surjective we use the Cartan-Dieudonne Theorem 1 which states that any element T of O(Φ) can be written as the composition T = s 1 s p of reflections where s j is the reflection through the hyperplane orthogonal to some vector u j V But according to u Proposition 8.2 s j = Λ uj. Since N(u j ) = Φ(u j ) 1, replacing u j by j Φ(uj) (this doesn t change the reflection) leaves N(u j ) = 1 and thus N(u 1 u p ) = ±1, i.e. u 1 u p Pin(Φ). Hence T = Λ u1 Λ up = Λ u1 u p which proves that Λ Pin(Φ) is surjective. The kernel is easily calculated: ker Λ Pin(Φ) = ker Λ Pin(Φ) = {t R 1 N(t) = ±1} = { 1, 1}. To prove the analogous statement for Spin(Φ) we need first to show that Λ maps Spin(Φ) to SO(Φ). Assume, for contradiction, that this is not the case, i.e. that an element f O(Φ) \ SO(Φ) exists such that Λ x = f for some x Spin(Φ). By the Cartan-Dieudonne Theorem f can be written as an odd number of reflections f = s 1 s 2k+1, and to each such reflection s j corresponds a vector u j so that s j = Λ uj. In other words we have Λ x = Λ u1 u 2k+1 or id = Λ x 1 u 1 u 2k+1. By Proposition 8.3 x 1 u 1 u 2k+1 = λ 1 for some λ R, i.e. x = λ 1 u 1 u 2k+1 and this is a contradiction, since u 1 u 2k+1 Cl 1 (Φ). Thus Λ maps Spin(Φ) to SO(Φ) and since {±1} Spin(Φ) the kernel is still Z 2. As a consequence of the proof of this lemma we note, that the set {v V Φ(v) = 1} of unit vectors generate Pin(Φ) as a subgroup of Γ(Φ), i.e. any element of Pin(Φ) can be written as a product of unit vectors in V. Similarly elements of Spin(Φ) can be written as a product of an even number of unit vectors in V. Let s warm up by doing a few simple examples Example 8.11. Let s calculate Pin(1) and Spin(1). They are subgroups of Cl 0,1 = C and the vector space, from which the Clifford algebra originates, is R which sits inside C as the imaginary line (cf. Example 7.5). In R we have just two unit vectors, namely ±1. They sit in C as ±i and they generate Pin(1) which are thus seen to be isomorphic to Z 4 (the fourth roots of unity). Spin(1) is then generated by products of two such unit vectors, i.e. Spin(1) = Z 2. Now, let s calculate Pin(2) and Spin(2). Recall from Example 7.5 that Cl 0,2 = H and that the vector space R 2 sits inside H as the i, j-coordinates. Thus Pin(2) is generated by elements in H of the form ai + bj with a 2 + b 2 = 1. We see that (ai + bj)(a i + b j) = (aa + bb )1 + (ab a b)k, and one can check that (aa + bb ) 2 + (ab a b) 2 = 1. Thus Pin(2) is the set Pin(2) = {ai + bj a 2 + b 2 = 1} {c1 + dk c 2 + d 2 = 1}, and the product is the one induced from the product on the quaternions. Spin(2) is the subgroup of this consisting of a product of an even number of unit vectors, i.e. specifically Spin(2) = {c1 + dk c 2 + d 2 = 1}. One can check that the product inherited from H is the same as the one on the circle group so that Spin(2) = U(1). 1 See for instance [Gallier] ([5]), Theorem 7.2.1.
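The double covering behind Example 8.11 can be made quite tangible. The sketch below (not from the text; it assumes NumPy, with quaternions modelled as $2\times 2$ complex matrices) shows that $x = \cos s + \sin s\,k \in \mathrm{Spin}(2)$ rotates the $(i,j)$-plane by the angle $2s$, so that $x$ and $-x$ induce the same rotation:

\begin{verbatim}
import numpy as np

I2 = np.eye(2)
qi = np.array([[1j, 0], [0, -1j]])
qj = np.array([[0, 1], [-1, 0]], dtype=complex)
qk = qi @ qj

s = 0.3
x = np.cos(s) * I2 + np.sin(s) * qk       # an element of Spin(2)
expected = np.cos(2*s) * qi + np.sin(2*s) * qj

for sign in (+1, -1):                     # x and -x give the same Lambda
    g = sign * x
    out = g @ qi @ np.linalg.inv(g)       # Lambda_g(e1); alpha(g) = g (g even)
    assert np.allclose(out, expected)
print("Spin(2) -> SO(2) is two-to-one on this rotation")
\end{verbatim}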
At this point it is probably not very clear why there should be anything particularly interesting about pin and spin groups. The explanation is that they are double coverings of $O(\Phi)$ and $SO(\Phi)$ (Theorem 8.16). When $n\geq3$, $\mathrm{Spin}(n)$ is even the universal double covering of $SO(n)$ (Corollary 8.20). The following more tedious example can be thought of as a special case of this fact: it shows that $\mathrm{Spin}(3)$ is isomorphic to $SU(2)$, which is known to be the universal double covering of $SO(3)$.

Example 8.12. Calculation of $\mathrm{Spin}(3)$. Choose an orthonormal basis $e_1,e_2,e_3$ of $\mathbb{R}^3$. By Proposition 7.3, the elements $1, e_1, e_2, e_3, e_1e_2, e_1e_3, e_2e_3, e_1e_2e_3$ form a basis for the Clifford algebra $\mathrm{Cl}_{0,3}$. The group $\mathrm{Spin}(3)$ can be written
$$\mathrm{Spin}(3) = \{x\in\mathrm{Cl}^0_{0,3}\mid \forall v\in V:\ \alpha(x)vx^{-1}\in V \text{ and } N(x) = 1\},$$
as $N(x) = 1$ implies that $x$ is invertible (explicitly, $x^{-1} = \bar x$). The first thing we want to show is that $x\in\mathrm{Spin}(3)$ if and only if $x\in\mathrm{Cl}^0_{0,3}$ and $N(x) = 1$. The only non-trivial statement to be proven is that the conditions $x\in\mathrm{Cl}^0_{0,3}$ and $N(x) = 1$ imply $\alpha(x)vx^{-1}\in V$ for any $v\in V$. Since $x\in\mathrm{Cl}^0_{0,3}$ and $v\in\mathrm{Cl}^1_{0,3}$, we have $xvx^{-1}\in\mathrm{Cl}^1_{0,3}$. Thereby $xvx^{-1} = u + \lambda e_1e_2e_3$ with $u\in V$ and $\lambda\in\mathbb{R}$. Moreover, observe that $\bar v = -v$ for all $v\in V$, and $\bar x = x^{-1}$ since $N(x) = 1$, so
$$\overline{xvx^{-1}} = \overline{x^{-1}}\,\bar v\,\bar x = x(-v)x^{-1} = -xvx^{-1}.$$
But $\overline{e_1e_2e_3} = -e_3e_2e_1 = e_1e_2e_3$, so
$$-u + \lambda e_1e_2e_3 = \overline{xvx^{-1}} = -xvx^{-1} = -u - \lambda e_1e_2e_3;$$
hence $\lambda = 0$, and so $xvx^{-1}\in V$. This is equivalent to $\alpha(x)vx^{-1}\in V$ (as $\alpha(x) = x$ for even $x$), as desired.

Knowing that $x\in\mathrm{Spin}(3)$ iff $x\in\mathrm{Cl}^0_{0,3}$ and $N(x) = 1$, we can characterize the elements $x$ of $\mathrm{Spin}(3)$ in a very handy way. Namely, since $x\in\mathrm{Cl}^0_{0,3}$, $x$ must have the form $x = a1 + be_1e_2 + ce_1e_3 + de_2e_3$ where $a^2+b^2+c^2+d^2 = N(x) = 1$. Thus $\mathrm{Spin}(3)$ consists of all elements $x$ of the form $x = a1 + be_1e_2 + ce_1e_3 + de_2e_3$ with $a^2+b^2+c^2+d^2 = 1$. This allows us to establish an isomorphism $\mathrm{Spin}(3)\cong SU(2)$ as follows:
$$e_1e_2\mapsto\begin{pmatrix}i&0\\0&-i\end{pmatrix},\qquad e_1e_3\mapsto\begin{pmatrix}0&1\\-1&0\end{pmatrix},\qquad e_2e_3\mapsto\begin{pmatrix}0&i\\i&0\end{pmatrix},$$
i.e. $x\mapsto\begin{pmatrix}a+ib&c+id\\-c+id&a-ib\end{pmatrix}$. Thus, $\mathrm{Spin}(3)$ is isomorphic to $SU(2)$ (a numerical verification of these assignments is sketched at the end of the chapter). One can use similar arguments to prove $\mathrm{Spin}(4)\cong SU(2)\times SU(2)$ and $\mathrm{Spin}(3,1)^0\cong SL(2,\mathbb{C})$ (cf. [Lawson and Michelson], p. 56, Theorem 8.4).

Example 8.13. Let's calculate some of the spin$^c$ groups as well. This is rather easy, given Proposition 8.9 and the calculations above. First we show that $\mathrm{Spin}^c(3)\cong U(2)$. Define the map
$$\mathrm{Spin}(3)\times U(1) = SU(2)\times U(1)\to U(2)$$
by $(A,z)\mapsto zA$. It is well-known that this map is a surjective Lie group homomorphism, and it is easily seen that the kernel is $\pm(I_2,1)$ ($I_2$ is just the $2\times2$ identity matrix). Thus a Lie group isomorphism is induced on the quotient, i.e. an isomorphism $\mathrm{Spin}^c(3)\cong U(2)$.

We can argue almost similarly to calculate $\mathrm{Spin}^c(4)$. For brevity, put
$$S(U(2)\times U(2)) := \left\{\begin{pmatrix}A_1&0\\0&A_2\end{pmatrix}\ \middle|\ A_1,A_2\in U(2),\ \det A_1\det A_2 = 1\right\}.$$
Define a map
$$\mathrm{Spin}(4)\times U(1) = SU(2)\times SU(2)\times U(1)\to S(U(2)\times U(2)),\qquad (A_1,A_2,z)\mapsto\begin{pmatrix}zA_1&0\\0&\bar zA_2\end{pmatrix}.$$
This map is again a surjective Lie group homomorphism, as above, and the kernel is $\pm(I_2,I_2,1)$; thus it induces an isomorphism $\mathrm{Spin}^c(4)\cong S(U(2)\times U(2))$.

Note that $\mathrm{Pin}(\Phi)$ and $\mathrm{Spin}(\Phi)$ are Lie groups. This is because the multiplicative group $\mathrm{Cl}^*(\Phi)$ of invertible elements is an open subset of $\mathrm{Cl}(\Phi)$ (this is a general result for finite-dimensional algebras), which is a finite-dimensional linear space, hence a manifold. Thus $\mathrm{Cl}^*(\Phi)$ is a manifold, and since multiplication and inversion are smooth maps, it is a Lie group. As $\mathrm{Pin}(\Phi)$ is a closed subgroup of $\mathrm{Cl}^*(\Phi)$ (since $N$ is continuous) and $\mathrm{Spin}(\Phi)$ is a closed subgroup of $\mathrm{Pin}(\Phi)$ (since $\mathrm{Cl}^0(\Phi)$ is a closed subspace of $\mathrm{Cl}(\Phi)$), they are Lie groups.

8.3 Double Coverings

In this section we will prove that $\mathrm{Pin}(\Phi)$ and $\mathrm{Spin}(\Phi)$ are double coverings of $O(\Phi)$ and $SO(\Phi)$, respectively. This will allow us to prove furthermore that $\mathrm{Spin}(n)$ is the universal double covering of $SO(n)$, which is our main result. We first recall the notion of a covering space in the general setting of two topological spaces:

Definition 8.14 (Covering Map). Let $Y$ and $X$ be topological spaces. A covering map is a continuous surjective map $p : Y\to X$ with the property that for any $x\in X$ there is an open neighborhood $U$ of $x$ such that $p^{-1}(U)$ can be written as a disjoint union of open subsets $V_\alpha$ (called the sheets), with each restriction $p|_{V_\alpha} : V_\alpha\to U$ a homeomorphism. We say that $U$ is evenly covered by $p$ and call $Y$ a covering space of $X$. When all the fibers $p^{-1}(x)$ have the same finite cardinality $n$, we call $p$ an $n$-covering. If $Y$ is simply connected, the covering is called a universal covering.

When $X$ is pathwise connected, all fibers $p^{-1}(x)$ have the same cardinality. If the covering is the universal covering, this is precisely the cardinality of the fundamental group $\pi_1(X)$.

Quite often covering maps between groups arise from group actions $G\times Y\to Y$. We now introduce the notion of an even action, or covering action, as it allows an elegant proof of Theorem 8.16. A group $G$ is said to act evenly on the topological space $Y$ if each point $y\in Y$ has a neighborhood $U$ such that $g\cdot U\cap h\cdot U = \emptyset$ whenever $g\neq h$. As usual, $Y/G$ denotes the orbit space under the action of $G$, equipped with the quotient topology.

Lemma 8.15. Let $G$ be a finite group acting evenly on a topological space $Y$. Then the canonical map $p : Y\to Y/G$ is a $|G|$-covering.
Proof. $p$ is obviously continuous and surjective. Let $[y] = p(y)\in Y/G$ for some $y\in Y$. We shall produce a neighborhood of $[y]$ which is evenly covered by $p$. As $G$ acts evenly, there exists a neighborhood $V\subseteq Y$ of $y$ such that $g\cdot V\cap h\cdot V = \emptyset$ whenever $g\neq h$; define $U := p(V)$. $U$ is open, for we have $p^{-1}(U) = \bigcup_{g\in G}g\cdot V$, where the sets $g\cdot V$ are all open (as the map $Y\ni x\mapsto g\cdot x$ is a homeomorphism for each $g\in G$). Consequently $p^{-1}(U)$ is open, and by definition of the quotient topology $U$ is open. Thus $U$ is a neighborhood of $[y]$, and $p^{-1}(U)$ is a disjoint union of sets homeomorphic to $V$, the number of sheets being equal to the order of $G$. The only thing left to show is that $p|_V : V\to U$ is a homeomorphism. The map is obviously surjective. To show injectivity, assume that $p(x) = p(y)$ for $x,y\in V$, that is, there exists $g\in G$ such that $y = g\cdot x$. But then $y\in V\cap g\cdot V$. From the fact that $V\cap g\cdot V = \emptyset$ if $g\neq e$, we deduce $g = e$ and consequently $x = y$. Continuity of $p|_V$ is obvious, since it is the restriction of a continuous map. We only need to show that $p|_V$ is an open map. But $p$ is itself an open map: let $O\subseteq Y$ be any open subset. By definition of the quotient topology, $p(O)$ is open if and only if $p^{-1}(p(O))$ is open. But $p^{-1}(p(O)) = \bigcup_{g\in G}g\cdot O$ is open, being a union of the open sets $g\cdot O$. Being the restriction of the open map $p$ to the open set $V$, $p|_V$ is an open map, and $p|_V$ is therefore a homeomorphism.

Theorem 8.16. The map $\Lambda : \mathrm{Pin}(\Phi)\to O(\Phi)$ is a double covering. Moreover, $\Lambda : \mathrm{Spin}(\Phi)\to SO(\Phi)$ is a double covering.

Proof. By Lemma 8.10 the homomorphism $\Lambda : \mathrm{Pin}(\Phi)\to O(\Phi)$ is surjective with kernel $\{1,-1\}\cong\mathbb{Z}_2$. By standard results, $O(\Phi)$ and $\mathrm{Pin}(\Phi)/\mathbb{Z}_2$ are isomorphic as Lie groups. We let $\mathbb{Z}_2$ act on $\mathrm{Pin}(\Phi)$ by multiplication, which is obviously an even action. By the preceding Lemma 8.15 the quotient map $\mathrm{Pin}(\Phi)\to\mathrm{Pin}(\Phi)/\mathbb{Z}_2$ is a double covering. Since $\Lambda : \mathrm{Pin}(\Phi)\to O(\Phi)$ can be identified with this map via the above isomorphism of Lie groups, $\Lambda$ is a double covering. The proof for $\Lambda : \mathrm{Spin}(\Phi)\to SO(\Phi)$ is completely analogous.

Corollary 8.17. The groups $\mathrm{Pin}(n)$ and $\mathrm{Spin}(n)$ are compact.

Proof. Since $O(n)$ and $SO(n)$ are compact and $\mathrm{Pin}(n)$ and $\mathrm{Spin}(n)$ are finite coverings of these, the result follows from standard covering space theory.

This does not hold in general: for instance $\mathrm{Spin}(3,1)^0\cong SL(2,\mathbb{C})$, which is definitely not compact. If the identity component is non-compact, the entire group must be non-compact.

In Chapter 7 we constructed explicit isomorphisms $\Psi : O(p,q)\to O(q,p)$ and $\Psi : SO(p,q)\to SO(q,p)$. A natural question is whether these isomorphisms lift to isomorphisms on the level of pin and spin groups. For the pin groups the answer is no, in general. But for the spin groups the answer is affirmative:
Proposition 8.18. There exists a Lie group isomorphism $\widetilde\Psi : \mathrm{Spin}(p,q)\to\mathrm{Spin}(q,p)$ such that the following diagram commutes:
$$\begin{array}{ccc}\mathrm{Spin}(p,q)&\xrightarrow{\ \widetilde\Psi\ }&\mathrm{Spin}(q,p)\\ {\scriptstyle\Lambda}\downarrow&&\downarrow{\scriptstyle\Lambda}\\ SO(p,q)&\xrightarrow{\ \Psi\ }&SO(q,p)\end{array}$$

Proof. Let $\varphi$ be an anti-orthogonal linear map as mentioned in Chapter 7. By the same sort of argument as in the proof of Proposition 7.4, $\varphi$ extends to a map on the corresponding Clifford algebras, also denoted $\varphi$. Define $\widetilde\Psi : \mathrm{Spin}(p,q)\to\mathrm{Spin}(q,p)$ by (recall that $\mathrm{Spin}(p,q)$ is generated by products of an even number of unit vectors)
$$\widetilde\Psi(v_1\cdots v_{2k}) = (-1)^k\varphi(v_1)\cdots\varphi(v_{2k})$$
where $v_1,\dots,v_{2k}\in\mathbb{R}^{p+q}$. This is easily seen to be a Lie group isomorphism, so we only need to check commutativity of the diagram. First we observe
$$\Psi(\Lambda_{v_i})v = \varphi\circ\Lambda_{v_i}\circ\varphi^{-1}(v) = \varphi(-v_i\varphi^{-1}(v)v_i^{-1}) = -\varphi(v_i)v\varphi(v_i^{-1}) = -\varphi(v_i)v\varphi(v_i)^{-1} = \Lambda_{\varphi(v_i)}v.$$
Then we immediately get (the scalar $(-1)^k$ lies in the kernel of $\Lambda$):
$$\Lambda(\widetilde\Psi(v_1\cdots v_{2k})) = \Lambda((-1)^k\varphi(v_1)\cdots\varphi(v_{2k})) = \Lambda_{\varphi(v_1)}\circ\cdots\circ\Lambda_{\varphi(v_{2k})} = \Psi(\Lambda_{v_1})\circ\cdots\circ\Psi(\Lambda_{v_{2k}}) = \Psi(\Lambda_{v_1}\circ\cdots\circ\Lambda_{v_{2k}}) = \Psi(\Lambda(v_1\cdots v_{2k})).$$

Restricting attention to $\mathrm{Spin}(n)$ we can prove another delicious fact concerning its topology:

Theorem 8.19. $\mathrm{Spin}(n)$ is path-connected when $n\geq2$ and simply connected when $n\geq3$.

Proof. We first remark that $\mathrm{Spin}(n)$ is pathwise connected. Consider an element $v_1\cdots v_{2k}$ where $v_j\in\mathbb{R}^n$ and $\Phi_{0,n}(v_j) = 1$, and note that each $v_j$ can be connected to $v_1$ by a continuous path running on the unit sphere $S^{n-1} = \{v\in\mathbb{R}^n\mid\Phi_{0,n}(v) = 1\}$ in $\mathbb{R}^n$. Thus there is a continuous path from $v_1\cdots v_{2k}$ to $v_1\cdots v_1 = v_1^{2k} = \pm1$, and the curve $t\mapsto\cos(2t)+\sin(2t)e_1e_2$ (cf. the proof of Proposition 8.21 below) connects $1$ to $-1$ inside $\mathrm{Spin}(n)$. Thereby $\mathrm{Spin}(n)$ is path-connected.

Next we need to show that the fundamental group at each point of $\mathrm{Spin}(n)$ is trivial. Since $\mathrm{Spin}(n)$ is path-connected, it suffices to show this for a single point, so let $x_0\in\mathrm{Spin}(n)$ and let $\rho$ denote the covering map $\Lambda$. By standard covering space theory $\rho_* : \pi_1(\mathrm{Spin}(n),x_0)\to\pi_1(SO(n),\rho(x_0))$ is injective, and the index of the subgroup $\rho_*(\pi_1(\mathrm{Spin}(n),x_0))$ in $\pi_1(SO(n),\rho(x_0))$ is equal to the number of sheets of the covering $\rho$, which we have just showed is 2. For $n\geq3$ the fundamental group $\pi_1(SO(n),\rho(x_0))$ is $\mathbb{Z}_2$, so as $\rho_*(\pi_1(\mathrm{Spin}(n),x_0))$ has index 2, it must be the trivial subgroup. Since $\rho_*$ is injective, also $\pi_1(\mathrm{Spin}(n),x_0)$ is trivial, proving that $\mathrm{Spin}(n)$ is simply connected.

Putting $p = 0$ in Theorem 8.16 and combining with Theorem 8.19 we get the main result:

Corollary 8.20. For $n\geq3$, the group $\mathrm{Spin}(n)$ is the universal (double) covering of $SO(n)$.

It's a classical fact from differential geometry that for connected Lie groups $G$ and $H$, if $F : G\to H$ is a covering map, then the induced map $F_* : \mathfrak{g}\to\mathfrak{h}$ is an isomorphism.² $\mathrm{Spin}(n)$ is connected and simply connected (Theorem 8.19),

²See for instance [Warner], Proposition 3.26.
and $SO(n)$ is connected. By Theorem 8.16 the homomorphism $\Lambda : \mathrm{Spin}(n)\to SO(n)$ is a covering map, and thus we have an isomorphism $\Lambda_* : \mathfrak{spin}(n)\to\mathfrak{so}(n)$; in particular
$$\dim\mathfrak{spin}(n) = \dim\mathfrak{so}(n) = \frac{n(n-1)}{2}.$$
Let's investigate this map a little further. Recall that the Lie algebra of the Lie group $\mathrm{Cl}^*_{0,n}$ is just the Clifford algebra $\mathrm{Cl}_{0,n}$ itself, equipped with the commutator bracket. $\mathrm{Spin}(n)$ is a Lie subgroup of $\mathrm{Cl}^*_{0,n}$, and hence the Lie algebra $\mathfrak{spin}(n)$ is a Lie subalgebra of $\mathrm{Cl}_{0,n}$.

Proposition 8.21. Let $\{e_1,\dots,e_n\}$ be an orthonormal basis for $\mathbb{R}^n$. Then $\mathfrak{spin}(n)\subseteq\mathrm{Cl}_{0,n}$ is spanned by the elements $e_ie_j$, $1\leq i<j\leq n$. Furthermore, $\Lambda_*$ maps $e_ie_j$ to the matrix $2B_{ij}\in\mathfrak{so}(n)$, where $B_{ij}$ is the $n\times n$ matrix which is $-1$ in its $ij$-th entry and $1$ in its $ji$-th entry.

Proof. Consider the curve
$$t\mapsto(e_i\cos t + e_j\sin t)(-e_i\cos t + e_j\sin t) = \cos(2t) + \sin(2t)\,e_ie_j.$$
It is a curve in $\mathrm{Spin}(n)$, since it is a product of two unit vectors, and its value at $t = 0$ is the neutral element 1. Differentiating at $t = 0$ we get $2e_ie_j$, which is then an element of $T_1\mathrm{Spin}(n) = \mathfrak{spin}(n)$. The elements $e_ie_j$ are all linearly independent in $\mathrm{Cl}_{0,n}$, hence also in $\mathfrak{spin}(n)$, and there are exactly $n(n-1)/2$ of them, i.e. they span $\mathfrak{spin}(n)$.

Now, $\Lambda$ is the restriction of the twisted adjoint representation to $\mathrm{Spin}(n)$, and since $\mathrm{Spin}(n)\subseteq\mathrm{Cl}^0_{0,n}$ we get $\Lambda(g)v = gvg^{-1}$ for $g\in\mathrm{Spin}(n)$ and $v\in\mathbb{R}^n$. As for the usual adjoint representation one can calculate
$$(\Lambda_*X)v = Xv - vX \tag{8.3}$$
and in particular we get
$$\Lambda_*(e_ie_j)e_k = e_ie_je_k - e_ke_ie_j = \begin{cases}0, & k\neq i,j\\ 2e_j, & k = i\\ -2e_i, & k = j.\end{cases}$$
We see that $\Lambda_*(e_ie_j)$ acts on $\mathbb{R}^n$ in the same way as the matrix $2B_{ij}$; thus we may identify $\Lambda_*(e_ie_j) = 2B_{ij}$.

We can rephrase the first part of this proposition by saying that under the symbol map $\sigma : \mathrm{Cl}_{0,n}\to\Lambda^\bullet\mathbb{R}^n$, the Lie algebra $\mathfrak{spin}(n)$ is mapped onto $\Lambda^2\mathbb{R}^n$.

We end the section with a short description of some covering properties of $\mathrm{Spin}^c(\Phi)$.

Proposition 8.22. The map $\Lambda^c : \mathrm{Spin}^c(\Phi)\to SO(\Phi)\times U(1)$ given by $[g,z]\mapsto(\Lambda(g),z^2)$ is a double covering.

Proof. It is easy to see that the map is well-defined (since $\Lambda(-g) = \Lambda(g)$ and $(-z)^2 = z^2$). It is a covering map because it is the quotient map of the even $\mathbb{Z}_2$-action on $\mathrm{Spin}^c(\Phi)$ given by $(-1)\cdot[g,z] := [-g,z] = [g,-z]$. Thus it follows from Lemma 8.15.

Since $SO(n)\times U(1)$ is compact, $\mathrm{Spin}^c(n)$ is also compact. Furthermore, for $n\geq2$, $\mathrm{Spin}^c(n)$ is connected (cf. Proposition 8.9 and connectivity of $\mathrm{Spin}(n)$). Thus $\pi_1(\mathrm{Spin}^c(n))$ can be identified with a subgroup of $\pi_1(SO(n)\times U(1)) = \mathbb{Z}_2\times\mathbb{Z}$ of index 2, i.e. $\pi_1(\mathrm{Spin}^c(n)) = \mathbb{Z}$.
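The bracket formula (8.3) and the identification $\Lambda_*(e_ie_j) = 2B_{ij}$ can be checked by brute force. The following is a minimal sketch (not part of the text, plain Python): a small dictionary-based implementation of the product of basis blades of $\mathrm{Cl}_{0,n}$ (so $e_i^2 = -1$), used to verify the displayed case analysis for $n = 4$.

```python
def mul_blades(a, b):
    """Product of basis blades of Cl_{0,n}; blades are sorted index tuples."""
    coeff, seq = 1, list(a)
    for g in b:
        pos = len(seq)
        while pos > 0 and seq[pos-1] > g:
            pos -= 1
            coeff = -coeff          # each transposition of generators costs a sign
        if pos > 0 and seq[pos-1] == g:
            seq.pop(pos-1)
            coeff = -coeff          # e_g * e_g = -1 in Cl_{0,n}
        else:
            seq.insert(pos, g)
    return coeff, tuple(seq)

def mul(x, y):                      # product of multivectors {blade: coeff}
    z = {}
    for ba, ca in x.items():
        for bb, cb in y.items():
            s, blade = mul_blades(ba, bb)
            z[blade] = z.get(blade, 0) + s * ca * cb
    return {b: c for b, c in z.items() if c != 0}

def sub(x, y):
    out = dict(x)
    for b, c in y.items():
        out[b] = out.get(b, 0) - c
    return {b: c for b, c in out.items() if c != 0}

def e(*idx):                        # the basis blade e_{i1}...e_{ik}
    return {tuple(idx): 1}

n = 4
for i in range(1, n + 1):
    for j in range(i + 1, n + 1):
        for k in range(1, n + 1):
            bracket = sub(mul(e(i, j), e(k)), mul(e(k), e(i, j)))
            if k == i:   expected = {(j,): 2}    # 2 e_j
            elif k == j: expected = {(i,): -2}   # -2 e_i
            else:        expected = {}           # 0
            assert bracket == expected
print("(8.3): Lambda_*(e_i e_j) acts as 2B_ij on R^n")
```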
8.4 Spin Group Representations

In this section we treat the basics of the representation theory of spin groups, restricting our attention to a particular complex representation of $\mathrm{Spin}(n)$, the spinor representation, defined as follows:

Definition 8.23 (Complex Spinor Representation). By the complex spinor representation $\kappa_n$ of $\mathrm{Spin}(n)$ we understand the restriction
$$\kappa_n|_{\mathrm{Spin}(n)} : \mathrm{Spin}(n)\to\mathrm{Aut}_{\mathbb{C}}(\Delta_n)$$
of the complex spin representation $\kappa_n$ of $\mathrm{Cl}^{\mathbb{C}}_n$ to $\mathrm{Spin}(n)$. Similarly, we define the spin$^c$ representation $\kappa^c_n$ of $\mathrm{Spin}^c(n)$ by restricting the spin representation to $\mathrm{Spin}^c(n)\subseteq\mathrm{Cl}^{\mathbb{C}}_n$.

We stress that we use the term spin representation for the irreducible Clifford algebra representations and spinor representation for the associated spin group representations. Of course we could in a similar way define the real spinor representation of $\mathrm{Spin}(n)$ by restricting $\rho_n$ to $\mathrm{Spin}(n)$, but we will not consider these here.

Theorem 8.24. For each $n$ the complex spinor representation $\kappa_n$ is a faithful representation of $\mathrm{Spin}(n)$.

Proof. If $n = 2k$ is even, then $\kappa_n$ is the restriction of the isomorphism $\kappa_n : \mathrm{Cl}^{\mathbb{C}}_n\to\mathrm{End}_{\mathbb{C}}(\Delta_{2k})$ and therefore injective. So let's assume that $n = 2k+1$. By definition we have $\Delta_{2k} = \Delta_{2k+1}$ and consequently $\mathrm{Aut}(\Delta_{2k}) = \mathrm{Aut}(\Delta_{2k+1})$. We can think of $\mathrm{Spin}(2k)$ as sitting inside $\mathrm{Spin}(2k+1)$.³ Denoting the injection by $\iota$, the following diagram commutes:
$$\begin{array}{ccc}\mathrm{Spin}(2k)&\xrightarrow{\ \kappa_{2k}\ }&\mathrm{Aut}(\Delta_{2k})\\ {\scriptstyle\iota}\downarrow&&\downarrow{\scriptstyle\mathrm{id}}\\ \mathrm{Spin}(2k+1)&\xrightarrow{\ \kappa_{2k+1}\ }&\mathrm{Aut}(\Delta_{2k+1})\end{array}$$
Now put $H := \ker\kappa_{2k+1}\subseteq\mathrm{Spin}(2k+1)$. The goal is to verify $H = \{1\}$, but first we show that $H\cap\mathrm{Spin}(2k) = \{1\}$. The inclusion $\supseteq$ holds since 1 clearly sits in $\ker\kappa_{2k+1}$. Now assume that $h\in H\cap\mathrm{Spin}(2k)$. In particular $h\in H$ and $h = \iota(\tilde h)$ for some $\tilde h\in\mathrm{Spin}(2k)$. Since $h$ sits in $H$, $\kappa_{2k+1}(h) = \mathrm{id}_{\Delta_{2k+1}}$. From the commutativity of the diagram it follows that $\kappa_{2k}(\tilde h) = \mathrm{id}_{\Delta_{2k}}$. But since $\kappa_{2k}$ is injective, $\tilde h$ must be 1, and so must $h$. This shows $H\cap\mathrm{Spin}(2k)\subseteq\{1\}$.

Identifying elements $A\in SO(2k)$ with elements of $SO(2k+1)$ of the form $\mathrm{diag}(A,1)$, we obtain $SO(2k)\subseteq SO(2k+1)$, just like the spin groups. Recall that $\Lambda : \mathrm{Spin}(2k+1)\to SO(2k+1)$ is a surjective homomorphism. Thus $\Lambda(H)$ is a normal subgroup of $SO(2k+1)$, since $H$, being a kernel, is normal in $\mathrm{Spin}(2k+1)$. Now we claim
$$\Lambda(H\cap\mathrm{Spin}(2k)) = \Lambda(H)\cap SO(2k).$$
The inclusion $\subseteq$ is obvious, and $\supseteq$ follows from the surjectivity of $\Lambda : \mathrm{Spin}(2k)\to SO(2k)$ together with $\{\pm1\}\subseteq\mathrm{Spin}(2k)$. Hence we have $\Lambda(H)\cap SO(2k) = \{I\}$ (here $I$ denotes the identity matrix). We want to show that $\Lambda(H) = \{I\}$, so let $A\in\Lambda(H)\subseteq SO(2k+1)$. Its characteristic polynomial is of odd degree and thus has a real root. As $A\in SO(2k+1)$, all eigenvalues have modulus 1, so any real eigenvalue is $\pm1$; and since $\det A = 1$ and the non-real eigenvalues come in conjugate pairs, the eigenvalue $-1$ occurs with even multiplicity. Hence 1 is an eigenvalue. Denote a corresponding eigenvector by $v_0$ and choose an ordered, positively oriented orthonormal basis for $\mathbb{R}^{2k+1}$ containing $v_0$ as the last vector.

³If $\mathrm{Cl}_{0,2k}$ is generated by $\{e_1,\dots,e_{2k}\}$ and $\mathrm{Cl}_{0,2k+1}$ is generated by $\{e_1,\dots,e_{2k+1}\}$, then we have a linear injection $\iota : \mathrm{Cl}_{0,2k}\to\mathrm{Cl}_{0,2k+1}$ defined by $\iota(e_j) = e_j$. This restricts to an injection $\iota : \mathrm{Spin}(2k)\to\mathrm{Spin}(2k+1)$.
If $B$ denotes the change-of-basis matrix, then $BAB^{-1}$ is the block diagonal matrix $BAB^{-1} = \mathrm{diag}(\tilde A,1)$, where $\tilde A\in SO(2k)$ and 1 is the unit of $\mathbb{R}$. We can now identify $\tilde A$ with $BAB^{-1}$. Hence $BAB^{-1}\in SO(2k)$, and since $\Lambda(H)$ is normal, we also have $BAB^{-1}\in\Lambda(H)$. All together, $BAB^{-1}\in\Lambda(H)\cap SO(2k) = \{I\}$, and so $A = I$. Now we have $\Lambda(H) = \{I\}$, leaving two possibilities: $H = \{1\}$ or $H = \{\pm1\}$. But $-1$ cannot be in the kernel of the spinor representation (because it is not in the kernel of the spin representation, from which it came). Therefore $H = \{1\}$, and $\kappa_{2k+1}$ is injective.

This theorem is not as innocent as it might look. It actually tells us that the spinor representations do not arise as lifts of $SO(n)$-representations, since a lift of an $SO(n)$-representation necessarily contains $\{\pm1\}$ in its kernel.

We now want to decompose the spinor representations into irreducible representations. To this end we need:

Lemma 8.25. For any complex vector space $V$ the endomorphism algebra $\mathrm{End}(V)$ is a simple algebra, i.e. its only two-sided ideals are the trivial ones. In particular, if $\dim W<\dim V$, then any algebra homomorphism $\varphi : \mathrm{End}(V)\to\mathrm{End}(W)$ is trivial: $\varphi\equiv0$.

Proof. Let $n$ be the dimension of $V$ and fix a basis for $V$; then we can think of $\mathrm{End}(V)$ as the algebra of complex $n\times n$ matrices. Let $I\subseteq\mathrm{End}(V)$ be any non-zero ideal and let $0\neq a\in I$, say with entry $a_{jk}\neq0$. Denoting by $E_{ij}$ the matrix units (1 in the $ij$-th entry, 0 elsewhere), we have $E_{ij}aE_{kl} = a_{jk}E_{il}$, so all the $E_{il}$ lie in $I$. In particular the sum $E_{11}+\cdots+E_{nn}$, the identity matrix, lies in $I$; thus $I = \mathrm{End}(V)$. If $\varphi : \mathrm{End}(V)\to\mathrm{End}(W)$ is an algebra homomorphism, $\ker\varphi$ is an ideal of $\mathrm{End}(V)$, thus $\ker\varphi = \{0\}$ or $\ker\varphi = \mathrm{End}(V)$. But since $\dim W<\dim V$, injectivity of $\varphi$ is impossible. Therefore $\ker\varphi = \mathrm{End}(V)$ and $\varphi\equiv0$.

Decomposing $\kappa_{2k+1}$ into irreducibles is easy:

Theorem 8.26. The spinor representation $\kappa_{2k+1}$ of $\mathrm{Spin}(2k+1)$ is irreducible.

Proof. Assume that $\{0\}\neq W\subsetneq\Delta_{2k+1}$ is a $\mathrm{Spin}(2k+1)$-invariant subspace, i.e. $\kappa_{2k+1}(g)W\subseteq W$ for each $g\in\mathrm{Spin}(2k+1)$. Consider an element of the form $e_ie_j$, $i<j$ (where $\{e_1,\dots,e_{2k+1}\}$ is an orthonormal basis for the vector space $\mathbb{R}^{2k+1}$ underlying $\mathrm{Cl}_{0,2k+1}$). It is an element of $\mathrm{Spin}(2k+1)$, and therefore $\kappa_{2k+1}(e_ie_j)W\subseteq W$. Taking products, all the elements $e_{i_1}\cdots e_{i_m}$ with $i_1<\cdots<i_m$ and $m$ even map $W$ into $W$. On the other hand, these elements constitute a complex basis for $(\mathrm{Cl}^{\mathbb{C}}_{2k+1})^0$. Hence, extending $\kappa_{2k+1}$ linearly, we get an algebra representation $\varphi : (\mathrm{Cl}^{\mathbb{C}}_{2k+1})^0\to\mathrm{End}(W)$. But recall that $(\mathrm{Cl}^{\mathbb{C}}_{2k+1})^0\cong\mathrm{Cl}^{\mathbb{C}}_{2k}\cong\mathrm{End}(\Delta_{2k})$ (Proposition 7.16), so that we get an algebra homomorphism $\varphi : \mathrm{End}(\Delta_{2k})\to\mathrm{End}(W)$. $W$ was a proper subspace of $\Delta_{2k+1} = \Delta_{2k}$, so $\dim W<\dim\Delta_{2k}$. Lemma 8.25 now guarantees that $\varphi\equiv0$. Since $\varphi$ is an extension of $\kappa_{2k+1}$, the latter would also be zero. As $\kappa_{2k+1}$ is injective by Theorem 8.24, this is a contradiction, so $W$ cannot be invariant.
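The step in Lemma 8.25 that manufactures the identity from a single nonzero ideal element can be made completely explicit: $E_{ij}aE_{kl} = a_{jk}E_{il}$, so $\sum_i E_{ij}\,a\,E_{ki} = a_{jk}I$. Below is a small numerical illustration (not part of the text, assuming numpy); note that the chosen $a$ is nilpotent, so no eigenvalue argument would apply to it.

```python
import numpy as np

def E(n, i, j):
    """Matrix unit E_ij: 1 in entry (i, j), zero elsewhere."""
    m = np.zeros((n, n), dtype=complex)
    m[i, j] = 1.0
    return m

n = 4
a = np.zeros((n, n), dtype=complex)
a[0, 2] = 3.0 - 1.0j        # a nonzero *nilpotent* element of the ideal
j, k = 0, 2                 # position of a nonzero entry a_jk

# sum_i E_ij a E_ki = a_jk * I, so the identity lies in any ideal containing a
I_mat = sum(E(n, i, j) @ a @ E(n, k, i) for i in range(n)) / a[j, k]
assert np.allclose(I_mat, np.eye(n))
print("identity recovered from a single nonzero ideal element")
```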
Example 8.27. Again we consider our favorite spin group, $\mathrm{Spin}(3)$. What is the spinor representation of $\mathrm{Spin}(3)$? Recall that $\mathrm{Spin}(3)\cong SU(2)$ and that for each $n$, $SU(2)$ has exactly one irreducible representation $\pi_n$ of dimension $n+1$, realized on the space of homogeneous polynomials of degree $n$ in two variables. The spinor representation $\kappa_3$ is a 2-dimensional irreducible representation, thus it must be equivalent to $\pi_1$.

$\kappa_{2k}$ is not an irreducible representation, but it can quite easily be decomposed into irreducibles. To do this, recall the volume element, the element $\omega\in\mathrm{Cl}_{0,2k}$ given by $\omega = e_1\cdots e_{2k}$. It commutes with everything in the even part of $\mathrm{Cl}_{0,2k}$ and anti-commutes with the odd part. The map $f = i^k\kappa_{2k}(\omega)$, which is an involution, gives rise to a splitting $\Delta_{2k} = \Delta^+_{2k}\oplus\Delta^-_{2k}$.

Lemma 8.28. $\Delta^+_{2k}$ and $\Delta^-_{2k}$ are $\kappa_{2k}$-invariant subspaces. Thus $\kappa_{2k}$ induces representations $\kappa^\pm_{2k}$ on $\Delta^\pm_{2k}$ such that $\kappa_{2k} = \kappa^+_{2k}\oplus\kappa^-_{2k}$.

Proof. We want to show $\kappa_{2k}(g)\Delta^\pm_{2k}\subseteq\Delta^\pm_{2k}$ for any $g\in\mathrm{Spin}(2k)$, so let $\psi$ be a positive Weyl spinor. Since $g$ is even, it commutes with $\omega$, so
$$f(\kappa_{2k}(g)\psi) = \kappa_{2k}(g)f(\psi) = \kappa_{2k}(g)\psi,$$
i.e. $\kappa_{2k}(g)\psi\in\Delta^+_{2k}$. Likewise if $\psi$ is a negative Weyl spinor.

Theorem 8.29. $\kappa^\pm_{2k}$ are irreducible representations of $\mathrm{Spin}(2k)$.

Proof. As in the proof of Theorem 8.26, a $\kappa^+_{2k}$-invariant subspace $\{0\}\neq W\subsetneq\Delta^+_{2k}$ gives rise to an algebra representation $\varphi : (\mathrm{Cl}^{\mathbb{C}}_{2k})^0\to\mathrm{End}(W)$. Again, by Proposition 7.16 and Corollary 7.20,
$$(\mathrm{Cl}^{\mathbb{C}}_{2k})^0\cong\mathrm{Cl}^{\mathbb{C}}_{2k-1}\cong\mathrm{End}(\Delta_{2k-1})\oplus\mathrm{End}(\Delta_{2k-1}).$$
We get homomorphisms $\varphi_1,\varphi_2 : \mathrm{End}(\Delta_{2k-1})\to\mathrm{End}(W)$ simply by $\varphi_1(x) = \varphi(x,1)$ and $\varphi_2(y) = \varphi(1,y)$. By assumption $\dim W<\dim\Delta^+_{2k} = \dim\Delta_{2k-1}$, and so by Lemma 8.25, $\varphi_1,\varphi_2\equiv0$. This means
$$\varphi(x,y) = \varphi((x,1)(1,y)) = \varphi_1(x)\varphi_2(y) = 0,$$
hence $\varphi\equiv0$ and thus $\kappa^+_{2k}\equiv0$, which is a contradiction. The same argument applies to $\kappa^-_{2k}$.

The covering space results of the previous section yield the following:

Corollary 8.30. Let $\kappa : \mathrm{Spin}(n)\to\mathrm{Aut}(V)$ be a finite-dimensional representation of $\mathrm{Spin}(n)$ which is the restriction of an algebra representation $\rho : \mathrm{Cl}_{0,n}\to\mathrm{End}(V)$ (e.g. the spinor representation $\kappa_n$). Then the induced representation $\kappa_* : \mathfrak{spin}(n)\to\mathrm{End}(V)$ satisfies
$$\kappa_*(X)v = \rho(X)v$$
for $X\in\mathfrak{spin}(n)\subseteq\mathrm{Cl}_{0,n}$ and $v\in V$.

Proof. Note that $\rho$ is a linear map; hence the representation induced by the restriction $\rho|_{\mathrm{Cl}^*_{0,n}}$ is just $\rho$ itself (where we have identified $\mathfrak{cl}_{0,n} = \mathrm{Cl}_{0,n}$). The representation induced by $\kappa = \rho|_{\mathrm{Spin}(n)}$ is then the restriction of this to $\mathfrak{spin}(n)$, hence the formula follows.

A close examination of the proofs above reveals that nothing has been used which does not hold for $\mathrm{Spin}^c(n)$ as well. We may therefore summarize the results above in the following statement about the spin$^c$ representations:
Theorem 8.31. For each $n$ the spin$^c$ representation $\kappa^c_n$ is faithful. If $n$ is odd, the representation is irreducible, and if $n$ is even it splits into a direct sum of two irreducible representations $\kappa^c_n = (\kappa^c_n)^+\oplus(\kappa^c_n)^-$, where $(\kappa^c_n)^\pm$ are representations on the spaces $\Delta^\pm_n$.
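As promised in Example 8.12, the matrix assignment defining $\mathrm{Spin}(3)\cong SU(2)$ can be checked mechanically. The following quick numerical sanity check is not part of the text (it assumes numpy): it verifies the even Clifford relations for the assigned matrices and that each $x = a1+be_1e_2+ce_1e_3+de_2e_3$ with $a^2+b^2+c^2+d^2 = 1$ is sent into $SU(2)$.

```python
import numpy as np

# The matrices assigned to e1e2, e1e3, e2e3 in Example 8.12.
E12 = np.array([[1j, 0], [0, -1j]])
E13 = np.array([[0, 1], [-1, 0]], dtype=complex)
E23 = np.array([[0, 1j], [1j, 0]])
I2 = np.eye(2, dtype=complex)

# Even-part Clifford relations (with e_i^2 = -1): each square is -1 ...
for M in (E12, E13, E23):
    assert np.allclose(M @ M, -I2)
# ... and products match, e.g. (e1e2)(e1e3) = e2e3, (e1e3)(e2e3) = e1e2.
assert np.allclose(E12 @ E13, E23)
assert np.allclose(E13 @ E23, E12)

# A random point (a, b, c, d) on S^3 gives x in Spin(3); its image lies in SU(2).
rng = np.random.default_rng(7)
v = rng.normal(size=4)
a, b, c, d = v / np.linalg.norm(v)
X = a*I2 + b*E12 + c*E13 + d*E23
assert np.allclose(X.conj().T @ X, I2)       # unitary
assert np.isclose(np.linalg.det(X), 1.0)     # determinant one
print("x maps into SU(2); as (a,b,c,d) sweeps S^3 this gives all of SU(2)")
```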
Chapter 9

Topological K-Theory

9.1 The K-Functors

In this chapter we will describe the topological K-theory of topological spaces, a theory developed in the 1950s, first by Alexander Grothendieck in his attempt to generalize the Riemann-Roch Theorem, later by Michael Atiyah, Friedrich Hirzebruch and Raoul Bott, culminating with the proof of Bott periodicity in the late 1950s. The "K" stems from the German word Klassen, i.e. classes. Later the theory has, with great success, been vastly generalized to a non-commutative version, a K-theory for C*-algebras. In this chapter we shall deal with the K-theory of topological spaces.

A short outline of the present chapter is as follows: we begin by defining K-theory for compact spaces and describing some elementary properties. In the section to follow we discuss some cohomological aspects of K-theory, such as relative K-theory and the long exact sequence, and then we state, without proof, Bott periodicity. After this we more or less start all over again, generalizing to locally compact spaces carrying a $G$-action, and finally we state the Thom Isomorphism Theorem and derive from it Bott periodicity.

Since K-theory is defined in terms of vector bundles, we begin with a short review, without proofs, of some of the most important results in the theory of vector bundles. Proofs of these results can be found in [Ha], Chapter 1.

Definition 9.1 (Vector Bundle). Let $X$ be a topological space. A complex vector bundle over $X$ is a topological space $E$ together with a continuous projection map $\pi : E\to X$, such that for each point $x\in X$ there exist a neighborhood $U\subseteq X$ around $x$, a natural number $k$ and a homeomorphism $\Phi : \pi^{-1}(U)\to U\times\mathbb{C}^k$ such that $\mathrm{pr}_1\circ\Phi = \pi$. The pair $(U,\Phi)$ is called a trivialization of $E$, and a cover $(U_i,\Phi_i)$ of trivializations is called a trivializing cover. The trivial bundle $X\times\mathbb{C}^k$ will be denoted by the shorthand notation $I^k$.

Note that we do not require all fibers to have the same dimension. On each component the dimension of the fibers is constant, since the fiber dimension is, by local triviality, a continuous map $X\to\mathbb{N}_0$, but on different components the dimensions need not be equal.

The first proposition, which is really one of the founding pillars of K-theory, explains why we here at the beginning restrict our attention to compact topological spaces.
Proposition 9.2. Let $X$ be compact Hausdorff and $E$ a vector bundle over $X$. Then there exists a vector bundle $E'$ over $X$ such that $E\oplus E'$ is a trivial vector bundle.

In the next chapter it will be of great importance that this result also holds when $X$ is a smooth manifold, compact or not.¹

If $\pi : E\to X$ is a vector bundle and $f : Y\to X$ is a continuous map, we can form a vector bundle $f^*E$ over $Y$, called the pullback bundle, defined by
$$f^*E = \{(x,v)\in Y\times E\mid f(x) = \pi(v)\}$$
with projection map $\pi' : (x,v)\mapsto x$. This is the unique bundle over $Y$ making the diagram
$$\begin{array}{ccc}f^*E&\xrightarrow{\ F\ }&E\\ {\scriptstyle\pi'}\downarrow&&\downarrow{\scriptstyle\pi}\\ Y&\xrightarrow{\ f\ }&X\end{array}$$
commutative, where $F$ is the map $(x,v)\mapsto v$. The pullback construction satisfies the following properties:
$$(f\circ g)^*E\cong g^*(f^*E),\qquad f^*(E\oplus F)\cong f^*E\oplus f^*F,\qquad f^*(E\otimes F)\cong f^*E\otimes f^*F.$$

Proposition 9.3. Let $E$ be a vector bundle over $X\times[0,1]$, where $X$ is a paracompact space. Then the restrictions of $E$ to $X\times\{0\}$ and $X\times\{1\}$ are isomorphic.

From this one can deduce the following homotopy invariance of pullback bundles:

Corollary 9.4. Let $X$ be a paracompact space and $E$ a vector bundle over $X$. If two maps $f_0,f_1 : Y\to X$ are homotopic, then $f_0^*E$ and $f_1^*E$ are isomorphic.

Let $\mathrm{Vect}_{\mathbb{C}}(X)$ denote the set of isomorphism classes of complex vector bundles over $X$; the isomorphism class containing $E$ will be denoted $[E]$. With the operation $+$ defined by $[E]+[F] = [E\oplus F]$ (one readily checks that this is well-defined), $\mathrm{Vect}_{\mathbb{C}}(X)$ becomes an abelian semigroup (recall that $E\oplus F\cong F\oplus E$). A word of warning: this semigroup need not have the cancellation property. If $[E]+[H] = [F]+[H]$, we may choose a bundle $H'$ complementary to $H$, i.e. such that $H\oplus H'\cong I^k$; adding $[H']$ on both sides we get $E\oplus I^k\cong F\oplus I^k$, which says only that $E$ and $F$ are stably isomorphic — in general a strictly weaker statement than $E\cong F$. The Grothendieck construction below works for any abelian semigroup, but as a consequence stably isomorphic bundles become identified in $K(X)$.

Definition 9.5. Let $X$ be compact. We define the (complex) K-group of $X$ to be the Grothendieck group² $K(X)$ of the semigroup $\mathrm{Vect}_{\mathbb{C}}(X)$. To be a little more specific, consider $\mathrm{Vect}_{\mathbb{C}}(X)\times\mathrm{Vect}_{\mathbb{C}}(X)$ and define an equivalence relation on this product by $([E],[F])\sim([G],[H])$ if and only if there is a bundle $L$ with $E\oplus H\oplus L\cong G\oplus F\oplus L$ (the auxiliary summand $L$ is what makes the relation transitive in the absence of cancellation). Then define $K(X) = (\mathrm{Vect}_{\mathbb{C}}(X)\times\mathrm{Vect}_{\mathbb{C}}(X))/\sim$.

¹Cf. [MT], Exercise 15.10.
²A good account of the Grothendieck construction can be found in N. J. Laustsen, F. Larsen and M. Rørdam: An Introduction to K-Theory for C*-Algebras.
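Definition 9.5 is pure algebra, so it can be prototyped directly. The sketch below is not part of the text (plain Python; the semigroup encoding, `add` and the `witnesses` list are hypothetical inputs): it implements the pair construction and checks it on the semigroup $\mathbb{N}_0 = \mathrm{Vect}_{\mathbb{C}}(\mathrm{pt})$ of Example 9.8 below, where cancellation does hold and a single witness suffices.

```python
# Grothendieck construction for an abelian semigroup whose elements are
# hashable values with addition `add`.  A pair (e, f) stands for the formal
# difference [E] - [F]; two pairs are identified iff
#     e + h + k == g + f + k   for some witness k
# (the auxiliary k is what makes the relation transitive without cancellation).
def equal(p, q, add, witnesses):
    e, f = p
    g, h = q
    return any(add(add(e, h), k) == add(add(g, f), k) for k in witnesses)

# Toy instance: N_0 = Vect_C(pt), trivial bundles recorded by their dimension.
add = lambda a, b: a + b
witnesses = [0]                                  # cancellation holds here

assert equal((3, 5), (0, 2), add, witnesses)     # [I^3]-[I^5] = [I^0]-[I^2] = -2
assert not equal((3, 5), (2, 0), add, witnesses) # -2 != 2
# classes add componentwise, and (f, e) is inverse to (e, f):
assert equal((3 + 5, 5 + 3), (0, 0), add, witnesses)
print("Grothendieck group of (N_0, +) is Z, i.e. K(pt) = Z")
```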
It is a standard result that every element of the Grothendieck group is a formal difference of two elements of the semigroup (and that for a semigroup with the cancellation property, the semigroup moreover sits inside the group). Thus any element $\xi$ of the K-group is of the form $\xi = [E]-[F]$, although this representation of the element need not be unique. In fact we have:

Lemma 9.6. Let $[E]-[F]$ be an element of $K(X)$. Then there exist $H$ and $I^k$ such that $[E]-[F] = [H]-[I^k]$.

Proof. Use Proposition 9.2 to get a vector bundle $F'$ such that $F\oplus F'\cong I^k$. Put $H = E\oplus F'$; then
$$[E]-[F] = ([E]+[F'])-([F]+[F']) = [H]-[I^k].$$

Addition in this group is, of course, given by
$$([E_1]-[F_1])+([E_2]-[F_2]) = [E_1\oplus E_2]-[F_1\oplus F_2],$$
the neutral element is $[I^0]$, the isomorphism class of the trivial zero-dimensional vector bundle over $X$, which equals the formal difference $[E]-[E]$ for any vector bundle $E$. The inverse of $[E]-[F]$ is then seen to be $[F]-[E]$.

As a matter of fact, $K(X)$ is not just a mere group; it is a commutative ring. It should come as no surprise that the product in $K(X)$ originates in the tensor product of vector bundles: we define
$$([E]-[F])([G]-[H]) := ([E\otimes G]+[F\otimes H])-([E\otimes H]+[F\otimes G]).$$
This is, as one can check, well-defined, and since $E\otimes F\cong F\otimes E$ this product turns $K(X)$ into a commutative ring with unit element $[I^1]$. Although $K(X)$ is a ring, we will refer to it as the K-group.

If $f : X\to Y$ is a continuous map, then we get a map $f^* : K(Y)\to K(X)$ simply by pulling back vector bundles: $f^*([E]-[F]) = [f^*E]-[f^*F]$. This is well-defined and satisfies $(f\circ g)^* = g^*\circ f^*$. Since $f^*(E\oplus F)\cong f^*E\oplus f^*F$ and $f^*(E\otimes F)\cong f^*E\otimes f^*F$, it is a ring homomorphism; hence $K$ is a contravariant functor from the category of compact Hausdorff spaces to the category of commutative unital rings. Observe that if $f : X\to Y$ and $g : X\to Y$ are homotopic, then by Corollary 9.4, $f^* = g^*$. Thus we have shown:

Proposition 9.7. Homotopic maps induce identical maps in K-theory. In particular, homotopy equivalences induce isomorphisms in K-theory.

Observe that everything we have done so far could equally well have been done for real vector bundles or, for that matter, quaternionic vector bundles. The corresponding real K-group is denoted $KO(X)$ and the quaternionic K-group $KSp(X)$ (in fact, complex K-theory is sometimes denoted $KU(X)$, but as we will mostly be interested in complex K-theory, this notation is unnecessarily cumbersome). The difference will become visible when discussing Bott periodicity, where in fact complex K-theory is substantially simpler than real K-theory.

Example 9.8. Let's calculate the K-group of a one-point space $\{\mathrm{pt}\}$. As the only vector bundles over $\{\mathrm{pt}\}$ are product bundles, and since these are distinguished only by their dimension, we have a natural identification $\mathrm{Vect}_{\mathbb{C}}(\mathrm{pt}) = \mathbb{N}_0$, the identification being $I^k\mapsto k$. This is an isomorphism of semirings. If we
Grothendieck this semiring we get $\mathbb{Z}$; hence $K(\mathrm{pt})\cong\mathbb{Z}$, where the identification is $[I^m]-[I^n]\mapsto m-n$. If $X$ is a contractible space, then all vector bundles over $X$ are trivial, thus $\mathrm{Vect}_{\mathbb{C}}(X) = \mathbb{N}_0$. Hence $K(X)\cong\mathbb{Z}$, and the identification is again $[I^m]-[I^n]\mapsto m-n$. By exactly the same arguments we get $KO(\mathrm{pt}) = KSp(\mathrm{pt}) = \mathbb{Z}$.

With this example in mind we may proceed to define so-called reduced K-theory. To this end, let $(X,x_0)$ be a based compact Hausdorff space, i.e. a topological space with a distinguished point $x_0$, called the base point. The class of such spaces forms a category, with morphisms the continuous maps $X\to Y$ mapping base point to base point. Consider the inclusion $i : \{x_0\}\to X$. This gives rise to a ring homomorphism $i^* : K(X)\to K(\{x_0\})$. If we compose $i^*$ with the canonical isomorphism $K(\{x_0\})\cong\mathbb{Z}$, then it maps $[E]-[I^k]$ to the integer $\dim E_{x_0}-k$ or, since fiber dimensions are constant on components, it maps $[E]-[I^k]$ to the fiber dimension of $E$ on the component containing $x_0$, minus $k$.

Definition 9.9. Let $X$ be a compact based space, and define the reduced K-group of $X$ by
$$\widetilde K(X) := \ker i^*.$$

Since $\widetilde K(X)$ is an ideal in $K(X)$, it is itself a commutative ring, albeit not necessarily a unital one.

Similarly, the map $i$ induces homomorphisms on real and quaternionic K-theory, $i^*_O : KO(X)\to\mathbb{Z}$ and $i^*_{Sp} : KSp(X)\to\mathbb{Z}$, and we define $\widetilde{KO}(X) := \ker i^*_O$ and $\widetilde{KSp}(X) := \ker i^*_{Sp}$.

Obviously, these definitions are independent of the choice of one-point set. Later we will discuss how to get rid of the base point dependence. For now, let's verify that $\widetilde K$ is a functor on the category of based spaces: assume $f : X\to Y$ to be a continuous map between based spaces preserving base points. It induces a map in unreduced K-theory, $f^* : K(Y)\to K(X)$, and if $\xi\in\widetilde K(Y)$, then
$$i_X^*(f^*(\xi)) = (f\circ i_X)^*(\xi) = i_Y^*(\xi) = 0,$$
i.e. $f^*(\xi)\in\widetilde K(X)$. Thus by restriction we obtain a homomorphism $f^* : \widetilde K(Y)\to\widetilde K(X)$. This obviously satisfies the composition rule $(f\circ g)^* = g^*\circ f^*$, and thus $\widetilde K$ is a functor.

Proposition 9.10. Let $X$ be a compact based space. Then we have a natural split exact sequence
$$0\longrightarrow\widetilde K(X)\longrightarrow K(X)\ \underset{c^*}{\overset{i^*}{\rightleftarrows}}\ K(x_0)\longrightarrow 0$$
where $i : \{x_0\}\to X$ is the inclusion and $c : X\to\{x_0\}$ is the trivial map. Thus we get a natural group isomorphism
$$K(X)\cong\widetilde K(X)\oplus K(x_0) = \widetilde K(X)\oplus\mathbb{Z}.\tag{9.1}$$

Proof. Split exactness of the sequence follows since $c\circ i = \mathrm{id}_{\{x_0\}}$. To see that the splitting is natural, let $Y$ be another compact based space and let $f : X\to Y$ be a based map. Then we get homomorphisms
$$f^* : K(Y)\to K(X),\qquad f^* : \widetilde K(Y)\to\widetilde K(X),\qquad f^* : K(y_0)\to K(x_0).$$
We need to see that the diagram
$$\begin{array}{ccccccccc}0&\to&\widetilde K(Y)&\to&K(Y)&\underset{c_Y^*}{\overset{i_Y^*}{\rightleftarrows}}&K(y_0)&\to&0\\ &&\downarrow{\scriptstyle f^*}&&\downarrow{\scriptstyle f^*}&&\downarrow{\scriptstyle f^*}&&\\ 0&\to&\widetilde K(X)&\to&K(X)&\underset{c_X^*}{\overset{i_X^*}{\rightleftarrows}}&K(x_0)&\to&0\end{array}$$
is commutative. The first square commutes since $f^* : \widetilde K(Y)\to\widetilde K(X)$ is just the restriction of $f^* : K(Y)\to K(X)$ to $\widetilde K(Y)$. The second square commutes since $i_X^*\circ f^* = (f\circ i_X)^* = (i_Y\circ\bar f)^* = \bar f^*\circ i_Y^*$ (where $\bar f : \{x_0\}\to\{y_0\}$ denotes the restriction of $f$), and likewise $f^*\circ c_Y^* = c_X^*\circ\bar f^*$. Thus the splitting is natural. It is well known that a split exact sequence of groups induces an isomorphism $K(X)\cong\widetilde K(X)\oplus K(x_0)$, and this isomorphism is natural since the splitting is natural.

Example 9.11. Again we consider a one-point set or, more generally, a compact, contractible based space $X$. Then $K(X)\cong\mathbb{Z}$, and by the isomorphism in the proposition above we get $\widetilde K(X) = 0$. Similarly $\widetilde{KO}(X) = \widetilde{KSp}(X) = 0$.

One can put a quite different view on the reduced K-group, one that will, to a certain extent, allow us to disregard base points. We consider the set of complex vector bundles over $X$ and define an equivalence relation $\sim_s$, called stable equivalence, by
$$E\sim_s F\iff\text{there exist positive integers } m \text{ and } n \text{ such that } E\oplus I^m\cong F\oplus I^n.$$
The stable equivalence class containing $E$ is denoted $[E]_s$. Observe that a bundle is stably equivalent to the zero bundle if and only if it is trivial. The following lemma is an immediate consequence of Proposition 9.2:

Lemma 9.12. Direct sum in $\mathrm{Vect}_{\mathbb{C}}(X)$ gives $\mathrm{Vect}_{\mathbb{C}}(X)/\sim_s$ the structure of an abelian group.

Proposition 9.13. If $X$ is a compact based space, then there is a group isomorphism $\widetilde K(X)\cong\mathrm{Vect}_{\mathbb{C}}(X)/\sim_s$.

Proof. Consider the group homomorphism $K(X)\to\mathrm{Vect}_{\mathbb{C}}(X)/\sim_s$ mapping the element $[E]-[I^k]$ to $[E]_s$. It is easily checked that the map is well-defined and surjective. Since $[E]_s = 0$ if and only if $E$ is trivial, we see that the kernel of the map is $\{[I^m]-[I^k]\mid m,k\in\mathbb{N}_0\}$. But this is isomorphic to $K(x_0)$. By the standard Isomorphism Theorem from group theory, the homomorphism above induces an isomorphism $K(X)/K(x_0)\cong\mathrm{Vect}_{\mathbb{C}}(X)/\sim_s$. But since $K(X)/K(x_0)\cong\widetilde K(X)$, the result follows.

Thus we see that $\widetilde K(X)$ is (as a group) independent of the choice of base point: it is canonically isomorphic to $\mathrm{Vect}_{\mathbb{C}}(X)/\sim_s$, which is defined without reference to base points. However, if we want to view $\widetilde K(X)$ as a subgroup of $K(X)$, the base point is needed. The way $\widetilde K(X)$ sits inside $K(X)$ does in fact depend on the base point; also the ring structure on $\widetilde K(X)$ depends on the base point!

Armed with this alternative description of $\widetilde K(X)$ we are able to prove the following nice result:
Proposition 9.14. Let $X$ and $Y$ be compact based spaces and assume furthermore that $X$ and $Y$ are connected. Then there is a group isomorphism
$$\widetilde K(X\vee Y)\cong\widetilde K(X)\oplus\widetilde K(Y).$$
In fact, if we define $\pi_X : X\vee Y\to X$ to be the map satisfying $\pi_X|_X = \mathrm{id}_X$ and $\pi_X(Y) = x_0$, and similarly $\pi_Y : X\vee Y\to Y$, then the inverse $\widetilde K(X)\oplus\widetilde K(Y)\to\widetilde K(X\vee Y)$ is given by
$$(\xi_1,\xi_2)\mapsto\pi_X^*(\xi_1)+\pi_Y^*(\xi_2).\tag{9.2}$$

Proof. The idea is to show that the groups fit into a split short exact sequence. Let $i_X : X\to X\vee Y$ and $i_Y : Y\to X\vee Y$ be the inclusions. The claim is that the following sequence is split exact:
$$0\longrightarrow\widetilde K(X)\xrightarrow{\ \pi_X^*\ }\widetilde K(X\vee Y)\ \underset{\pi_Y^*}{\overset{i_Y^*}{\rightleftarrows}}\ \widetilde K(Y)\longrightarrow 0.$$
First we see that $i_Y^*\circ\pi_Y^* = \mathrm{id}_{\widetilde K(Y)}$ since $\pi_Y\circ i_Y = \mathrm{id}_Y$; thus the sequence is split, and $i_Y^*$ is surjective. Moreover, $i_Y^*\circ\pi_X^* = (\pi_X\circ i_Y)^*$, and $\pi_X\circ i_Y$ factorizes as $Y\to\{x_0\}\to X$. Since $\widetilde K(x_0) = 0$, we get $i_Y^*\circ\pi_X^* = 0$, thus $\mathrm{im}\,\pi_X^*\subseteq\ker i_Y^*$.

$\pi_X^*$ is injective: if $p : E\to X$ is a bundle over $X$, then $\pi_X^*(E)$ is the bundle over $X\vee Y$ which is $E$ over $X$ and the product bundle $Y\times p^{-1}(x_0)$ over $Y$. Assume this to be 0 in $\widetilde K(X\vee Y)$; this means that the bundle is stably trivial, in particular $E$ is stably trivial over $X$, i.e. $[E]_s = 0$. Thus $\pi_X^*$ is injective. Finally, let $E$ be a bundle over $X\vee Y$ and assume $[E]_s\in\ker i_Y^*$. Then, when restricted to $Y$, the bundle $E$ is stably trivial; after adding a suitable trivial bundle we may assume $E|_Y$ is actually trivial. But then it is easy to see that $E\cong\pi_X^*(i_X^*E)$, i.e. $[E]_s\in\mathrm{im}\,\pi_X^*$. Thus split exactness of the sequence has been shown, and the isomorphism follows.

9.2 The Long Exact Sequence

In this section we will investigate the K-theory of a pair. Let $X$ be a compact Hausdorff space and $A\subseteq X$ a closed subset. We call such a pair $(X,A)$ a compact pair. For such a pair, the quotient space $X/A$ is a compact space with base point $A/A$. Note that we do not require $A$ to be non-empty: if $A = \emptyset$ we interpret $X/\emptyset$ as $X^+$, i.e. $X$ with a disjoint base point added. Given two compact pairs $(X,A)$ and $(Y,B)$, we consider continuous maps $f : X\to Y$ mapping $A$ into $B$. As one can check, such a map gives rise to a map $\tilde f : X/A\to Y/B$ between the quotient spaces.

Definition 9.15. Given a compact pair $(X,A)$, we define the relative K-group by
$$K(X,A) := \widetilde K(X/A).$$

Lemma 9.16. There are isomorphisms $K(X,\emptyset)\cong\widetilde K(X^+)\cong K(X)$.

Proof. The first follows simply by definition. To verify the other one, let $\iota : X\to X^+ = X\sqcup\{x_0\}$ denote the inclusion. This induces a group homomorphism $\widetilde K(X^+)\subseteq K(X^+)\to K(X)$, which we denote by $\iota^*$. We will show that this is the desired isomorphism. To see that it is injective, assume that we have elements $[E]-[I^k], [E']-[I^l]\in\widetilde K(X^+)$ (which just means that they lie in the kernel of the map $K(X^+)\to K(x_0)$, i.e. that $\dim E_{x_0} = k$ and $\dim E'_{x_0} = l$) such that
$$\iota^*([E]-[I^k]) = \iota^*([E']-[I^l]).$$
Since $\iota^*$ is just restriction to $X$, this simply means that we have an identity $E|_X\oplus I^l\oplus I^m\cong E'|_X\oplus I^k\oplus I^m$ for a suitable $m$ (using Proposition 9.2 to replace an arbitrary auxiliary bundle by a trivial one). But this identity extends to all of $X^+$ by virtue of the dimension relations $\dim E_{x_0} = k$ and $\dim E'_{x_0} = l$ above, since $x_0$ is an isolated point, and this just means that we have the identity $[E]-[I^k] = [E']-[I^l]$, i.e. $\iota^*$ is injective.

To check that $\iota^*$ is surjective, let $[E]-[I^k]\in K(X)$. Define $\widetilde E$ over $X^+$ by $\widetilde E|_X = E$ and $\widetilde E_{x_0} = \{x_0\}\times\mathbb{C}^k$, and let $\widetilde I^k$ be the extension of $I^k$ to $X^+$. Then $[\widetilde E]-[\widetilde I^k]\in\widetilde K(X^+)$, since the restrictions of $\widetilde E$ and $\widetilde I^k$ to $x_0$ have the same dimension, namely $k$. Thus $\iota^*$ is surjective, hence an isomorphism.

A map $f : (X,A)\to(Y,B)$ induces a map $\tilde f : X/A\to Y/B$ and hence a ring homomorphism $f^* : K(Y,B)\to K(X,A)$. The composition rule is easily verified, and thus we have extended the K-functor to a functor on the category of compact pairs.

In homology and cohomology theories we have higher order groups, like $H_n(X)$ or $H^n(X)$, and we want to construct something similar in K-theory. For this, recall the notion of suspension: for a space $X$, the (unreduced) suspension $SX$ is the quotient of $X\times I$ in which $X\times\{0\}$ and $X\times\{1\}$ are each collapsed to a point. We take the image of $X\times\{0\}$ under the quotient map as the base point. If $X$ is a based space, we can define the reduced suspension $\Sigma X$ as the quotient of $SX$ obtained by collapsing $\{x_0\}\times I$ to a point. It becomes a based space by taking this collapsed point as the base point. It is easy to see that $S$ is a functor from compact spaces to compact based spaces and that $\Sigma$ is a functor from the category of compact based spaces to itself. We can apply the suspension to the suspension, thus obtaining iterated suspensions $S^nX$ and $\Sigma^nX$ (where, of course, we define $S^0X = \Sigma^0X := X$). For the use in K-theory it is really of no concern which suspension we use, for as the quotient map $SX\to\Sigma X$ collapses a contractible subspace, Proposition 9.20 below will guarantee that it induces ring isomorphisms $K(\Sigma X)\to K(SX)$ and $\widetilde K(\Sigma X)\to\widetilde K(SX)$.

Recall also the cone on $X$: it is the functor mapping a compact space $X$ to the quotient $CX$ obtained from $X\times I$ by collapsing $X\times\{0\}$ to a point. We take this collapsed set to be the base point. By identifying $X$ with the subset $X\times\{1\}$ of $CX$ we get a natural inclusion $X\subseteq CX$. Collapsing $X\subseteq CX$ to a point we obtain $SX$, i.e. $SX = CX/X$.

Definition 9.17. Let $X$ be compact, $A\subseteq X$ a closed subset and $Y$ a compact based space. For $n\geq0$ define
$$K^{-n}(X) := \widetilde K(\Sigma^n(X^+)),\qquad \widetilde K^{-n}(Y) := \widetilde K(\Sigma^nY),\qquad K^{-n}(X,A) = \widetilde K^{-n}(X,A) := \widetilde K(\Sigma^n(X/A)).$$
Furthermore we define the total K-groups
$$K^*(X) := \bigoplus_{n\geq0}K^{-n}(X),\qquad \widetilde K^*(Y) := \bigoplus_{n\geq0}\widetilde K^{-n}(Y),\qquad K^*(X,A) := \bigoplus_{n\geq0}K^{-n}(X,A).$$

Since suspension and $\widetilde K$ are functors, the groups defined above are functorial as well. Thus a map $f : X\to Y$ will induce maps $f^* : K^{-n}(Y)\to K^{-n}(X)$, and similarly in the reduced and relative cases.
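As a small aside (not in the original text), Definition 9.17 can already be unwound in the simplest case, anticipating (9.7) below:
$$K^{-1}(\mathrm{pt}) = \widetilde K(\Sigma(\mathrm{pt}^+)) = \widetilde K(\Sigma S^0) = \widetilde K(S^1) = 0,$$
the last equality because a rank-$k$ bundle over $S^1$ is determined, up to isomorphism, by the homotopy class of a clutching function with values in $GL(k,\mathbb{C})$, and $GL(k,\mathbb{C})$ is path-connected; hence every bundle over $S^1$ is trivial and $\widetilde K(S^1) = 0$ by Proposition 9.13.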
As we shall see in the next section, most of these groups coincide. In fact they repeat themselves with a period of 2. This is the celebrated Bott periodicity.

Example 9.18. Let's play around with these new definitions. Assuming $(X,A)$ to be a compact pair, we identify $A$ with a subset of $CA$ and thus may form the union $X\cup CA$. It is intuitively clear that we have a homeomorphism $(X\cup CA)/X\cong CA/A = SA$. Thus we get a chain of isomorphisms
$$K(X\cup CA, X) = \widetilde K((X\cup CA)/X)\cong\widetilde K(SA)\cong\widetilde K(\Sigma A) = \widetilde K^{-1}(A).$$

Our next goal is to obtain a long exact sequence for pairs. A great step is taken by the following lemma:

Lemma 9.19. Let $(X,A)$ be a compact pair with $A$ a based space (the base point of $A$ will then also be the base point of $X$), let $i : A\to X$ be the inclusion and $q : X\to X/A$ the quotient map. Then the following sequence is exact:
$$\widetilde K(X/A)\xrightarrow{\ q^*\ }\widetilde K(X)\xrightarrow{\ i^*\ }\widetilde K(A).$$

Proof. We see that $i^*\circ q^* = (q\circ i)^*$ and that $q\circ i$ factorizes as the composition $A\to A/A\to X/A$. Since $\widetilde K(A/A) = 0$, $i^*\circ q^*$ factors through the trivial group, i.e. it is the zero map. Consequently $\mathrm{im}\,q^*\subseteq\ker i^*$.

Conversely, assume that $[E]_s\in\ker i^*$, i.e. that $[i^*E]_s = 0$. This means that $E$ is stably trivial when restricted to $A$; after adding a trivial bundle to $E$ we may assume $E|_A$ is actually trivial. We need to show that $E$ is the pullback along $q$ of some vector bundle over $X/A$. Let $\pi : E\to X$ denote the projection. As $E$ is trivial over $A$, we have a trivialization map $\varphi : \pi^{-1}(A)\to A\times\mathbb{C}^n$, i.e. a homeomorphism which is linear on fibers, mapping fibers isomorphically onto $\mathbb{C}^n$. Consider the space $\widetilde E$, the quotient of $E$ under the equivalence relation $\varphi^{-1}(x,v)\sim\varphi^{-1}(y,v)$ for $x,y\in A$ and $v\in\mathbb{C}^n$, i.e. we collapse all the fibers over $A$ to one fiber. Let $\tilde q : E\to\widetilde E$ denote the quotient map. We get an obvious projection map $\tilde\pi : \widetilde E\to X/A$ rendering the following diagram commutative:
$$\begin{array}{ccc}E&\xrightarrow{\ \tilde q\ }&\widetilde E\\ {\scriptstyle\pi}\downarrow&&\downarrow{\scriptstyle\tilde\pi}\\ X&\xrightarrow{\ q\ }&X/A\end{array}$$
To show that $\widetilde E$ is a vector bundle we need a trivialization in a neighborhood of the point $A/A$ (around all other points of $X/A$ we have trivializations coming directly from the bundle $E$). From the trivialization $\varphi$ we get a local frame, i.e. continuous maps $s_1,\dots,s_n : A\to\mathbb{C}^n$ such that $x\mapsto\varphi^{-1}(x,s_i(x))$ are sections of $E$ over $A$ and such that at each point $x\in A$ the set $\{s_1(x),\dots,s_n(x)\}$ is a basis for $\mathbb{C}^n$. Now let $(U_j)$ be a covering of $A$ by open sets of $X$ on which $E$ is trivial, and let $\varphi_j : \pi^{-1}(U_j)\to U_j\times\mathbb{C}^n$ be the trivializations. Since $A$ is compact, we may assume this covering to be finite. As $A\cap U_j$ is closed in $U_j$ and $X$ is normal, we can (by Tietze's Extension Theorem) extend $s_i$ from $U_j\cap A$ to $U_j$; this extended map we call $s_{ij}$. It gives rise to a section $x\mapsto\varphi_j^{-1}(x,s_{ij}(x))$ of $E$ over $U_j$. Let $\{\psi_j\}$ be a partition of unity subordinate to the cover $(U_j)$; then we can patch these sections together to $n$ sections over $\bigcup_jU_j$. These need not constitute a basis of $\pi^{-1}(x)$ at each point of $\bigcup_jU_j$, but at least in a neighborhood of $A$ they will. Thus we have a local frame in a neighborhood of $A$, and this is the same as having a local trivialization in a neighborhood of $A$. This gives the desired trivialization of $\widetilde E$ around $A/A$, and hence $\tilde\pi : \widetilde E\to X/A$ is a vector bundle.
Finally, consider the map $\Phi : E\to q^*(\widetilde E)$ given by $v\mapsto(\pi(v),\tilde q(v))$. This is easily seen to be an isomorphism of vector bundles. Thus $[E]_s = q^*[\widetilde E]_s$, i.e. $[E]_s\in\mathrm{im}\,q^*$.

From the proof of this lemma we can extract the following result, which reflects a kind of excision property of the K-groups:

Proposition 9.20. Let $A\subseteq X$ be a closed contractible subspace, and let $q : X\to X/A$ denote the quotient map. Then the induced maps
$$q^* : K(X/A)\to K(X)\qquad\text{and}\qquad q^* : \widetilde K(X/A)\to\widetilde K(X)$$
are ring isomorphisms.

Proof. We will show that the pullback map $q^* : \mathrm{Vect}_{\mathbb{C}}(X/A)\to\mathrm{Vect}_{\mathbb{C}}(X)$ is an isomorphism of semirings, and this we do by constructing an explicit inverse. Let $E\in\mathrm{Vect}_{\mathbb{C}}(X)$. Since $A$ is contractible, $E|_A$ is trivial, and as in the proof above we can find a trivialization $\varphi$ on an open set $U\supseteq A$ and from this construct a vector bundle $\widetilde E$ over $X/A$. We will show that $E\mapsto\widetilde E$ is the inverse of $q^*$, but first we have to check that it is well-defined, i.e. independent of the choice of trivialization. So assume that $\varphi'$ is another trivialization over $U\supseteq A$ and let $\widetilde E'$ be the corresponding bundle. We see that $\varphi' = (\varphi'\circ\varphi^{-1})\circ\varphi$, where $g := \varphi'\circ\varphi^{-1}$ corresponds to a transition map $U\to GL(n,\mathbb{C})$. Since $A$ is contractible, $g|_A$ is null-homotopic, and since $GL(n,\mathbb{C})$ is path-connected, $g|_A$ is homotopic to the constant map $A\ni x\mapsto I_n$. This implies that we get a homotopy $H$ from $\varphi|_A$ to $\varphi'|_A$, where for each $t\in[0,1]$ the map $H_t$ is a trivialization of $E|_A$. Thus for each $t$ we get a bundle $\widetilde E_t$ over $X/A$, and these we clutch together to a bundle $\widetilde E_I$ over $(X/A)\times I$. Since $\widetilde E_0 = \widetilde E$ and $\widetilde E_1 = \widetilde E'$, it follows from Proposition 9.3 that $\widetilde E$ and $\widetilde E'$ are isomorphic, i.e. the map is well-defined.

We saw in the proof of the lemma that $q^*(\widetilde E)\cong E$, and as $\widetilde{q^*F}\cong F$ for any bundle $F$ over $X/A$, the map is the desired inverse.

From the lemma we can furthermore derive the main result of this section, namely the existence of long exact sequences in K-theory:

Theorem 9.21 (Long Exact Sequences). Let $(X,A)$ be a compact pair. Then there exist connecting homomorphisms $\delta : K^{-n-1}(A)\to K^{-n}(X,A)$ such that the following sequence is exact:
$$\cdots\to K^{-2}(A)\xrightarrow{\ \delta\ }K^{-1}(X,A)\xrightarrow{\ q^*\ }K^{-1}(X)\xrightarrow{\ i^*\ }K^{-1}(A)\xrightarrow{\ \delta\ }K^0(X,A)\xrightarrow{\ q^*\ }K^0(X)\xrightarrow{\ i^*\ }K^0(A).$$
Likewise, if $(X,A)$ is a compact pair and $A$ is based, then the following sequence is exact:
$$\cdots\to\widetilde K^{-2}(A)\xrightarrow{\ \delta\ }\widetilde K^{-1}(X,A)\xrightarrow{\ q^*\ }\widetilde K^{-1}(X)\xrightarrow{\ i^*\ }\widetilde K^{-1}(A)\xrightarrow{\ \delta\ }\widetilde K^0(X,A)\xrightarrow{\ q^*\ }\widetilde K^0(X)\xrightarrow{\ i^*\ }\widetilde K^0(A).$$

Proof. The core of the proof is to establish exactness of the following sequence:
$$K^{-1}(X)\xrightarrow{\ i^*\ }K^{-1}(A)\xrightarrow{\ \delta\ }K^0(X,A)\xrightarrow{\ q^*\ }K^0(X)\xrightarrow{\ i^*\ }K^0(A).\tag{9.3}$$
From Lemma 9.19 we get exactness at $K^0(X)$. To show exactness at the other two spots we use Lemma 9.19 for the pairs $(X\cup CA, X)$ and $((X\cup CA)\cup CX, X\cup CA)$. Namely, consider the sequence of inclusion and quotient
$$X\hookrightarrow X\cup CA\xrightarrow{\ \pi\ }(X\cup CA)/X.$$
From Lemma 9.19 we get an exact sequence
$$\widetilde K((X\cup CA)/X)\xrightarrow{\ \pi^*\ }\widetilde K(X\cup CA)\to\widetilde K(X).$$
We have $(X\cup CA)/X\cong SA$, and by Proposition 9.20 a natural isomorphism $\varphi_1 : \widetilde K(\Sigma A)\to\widetilde K(SA)$. The map $X\cup CA\to X/A$ collapsing the cone induces an isomorphism $\varphi_2 : \widetilde K(X/A)\to\widetilde K(X\cup CA)$ (since the cone is contractible). Define
$$\delta := \varphi_2^{-1}\circ\pi^*\circ\varphi_1 : \widetilde K^{-1}(A)\to K(X,A).$$
Since the quotient map $q : X\to X/A$ factorizes as the composition $X\to X\cup CA\to X/A$, we see that $\delta$ and $q^*$ fit into the following commutative diagram:
$$\begin{array}{ccccc}\widetilde K((X\cup CA)/X)&\xrightarrow{\ \pi^*\ }&\widetilde K(X\cup CA)&\to&\widetilde K(X)\\ {\scriptstyle\varphi_1}\uparrow&&{\scriptstyle\varphi_2}\uparrow&&\parallel\\ \widetilde K(\Sigma A)&\xrightarrow{\ \delta\ }&\widetilde K(X/A)&\xrightarrow{\ q^*\ }&\widetilde K(X)\end{array}$$
Since the top row is exact, the bottom row is exact as well, yielding exactness at $K^0(X,A)$ in (9.3). Exactness at $K^{-1}(A)$ is proved in exactly the same way.

We can now extend the sequence to the left by replacing $X$ and $A$ by $\Sigma X$ and $\Sigma A$, thus obtaining the exact sequence
$$K^{-2}(X)\to K^{-2}(A)\to K^{-1}(X,A)\to K^{-1}(X)\to K^{-1}(A),$$
which coincides with (9.3) at the last two spots. Continuing this process gives us the infinite long exact sequence. To get the corresponding sequence for unreduced K-theory, just replace $X$ and $A$ by $X^+$ and $A^+$, and recall that $K(X)\cong\widetilde K(X^+)$ and that $K(X,A) = K(X^+,A^+)$.

The reader may wonder about the missing 0 at the end of the long exact sequence; after all, the long exact sequences of homology and cohomology theories do end in 0. In general it is not true that the map $K(X)\to K(A)$ is surjective. It is true, however, if $A$ is a retract of $X$, i.e. if there is a map $r : X\to A$ such that $r|_A = \mathrm{id}_A$.

Corollary 9.22. For compact based spaces $X$ and $Y$ there exists a group isomorphism
$$\widetilde K(X\times Y)\cong\widetilde K(X)\oplus\widetilde K(Y)\oplus\widetilde K(X\wedge Y).\tag{9.4}$$
To be more specific, if $\pi_1 : X\times Y\to X$ and $\pi_2 : X\times Y\to Y$ are the projection maps and $q : X\times Y\to X\wedge Y$ the quotient map, then the inverse $\widetilde K(X)\oplus\widetilde K(Y)\oplus\widetilde K(X\wedge Y)\to\widetilde K(X\times Y)$ is given by
$$(\xi_1,\xi_2,\xi_3)\mapsto\pi_1^*(\xi_1)+\pi_2^*(\xi_2)+q^*(\xi_3).\tag{9.5}$$

Proof. Since we have $\Sigma(X\vee Y) = \Sigma X\vee\Sigma Y$, we get $\widetilde K^{-1}(X\vee Y)\cong\widetilde K^{-1}(X)\oplus\widetilde K^{-1}(Y)$ by Proposition 9.14, so the last part of the reduced long exact sequence for the pair $(X\times Y, X\vee Y)$ takes the following form:
$$\widetilde K(\Sigma(X\times Y))\to\widetilde K(\Sigma X)\oplus\widetilde K(\Sigma Y)\to\widetilde K(X\wedge Y)\to\widetilde K(X\times Y)\to\widetilde K(X)\oplus\widetilde K(Y).$$
The sequence splits, for we have a map $\widetilde K(X)\oplus\widetilde K(Y)\to\widetilde K(X\times Y)$ given by $(\xi_1,\xi_2)\mapsto\pi_1^*(\xi_1)+\pi_2^*(\xi_2)$, where $\pi_1 : X\times Y\to X$ and $\pi_2 : X\times Y\to Y$ are
the projections. This means in particular that the map $\widetilde K(X\times Y)\to\widetilde K(X)\oplus\widetilde K(Y)$ is surjective, and hence we can extend the sequence above with a zero. Similarly the map $\widetilde K(\Sigma(X\times Y))\to\widetilde K(\Sigma X)\oplus\widetilde K(\Sigma Y)$ splits; in particular it is surjective. Consequently, the map $\widetilde K(\Sigma X)\oplus\widetilde K(\Sigma Y)\to\widetilde K(X\wedge Y)$ is the zero map. Thus the exact sequence above is equivalent to the following split short exact sequence:
$$0\to\widetilde K(X\wedge Y)\to\widetilde K(X\times Y)\to\widetilde K(X)\oplus\widetilde K(Y)\to0.$$
The statement of the corollary now follows from elementary results on split short exact sequences.

The results of Lemma 9.19 as well as Theorem 9.21 and Corollary 9.22 hold also in the real and quaternionic cases.

9.3 Exterior Products and Bott Periodicity

In this section we want to introduce some extra algebraic structure on $K^*(X)$ and $\widetilde K^*(X)$. The way this is accomplished is to introduce exterior products, one for reduced K-theory and later an unreduced analogue.

Let $X$ and $Y$ be compact based spaces. The exterior product in reduced K-theory is a map
$$\mu : \widetilde K(X)\otimes_{\mathbb{Z}}\widetilde K(Y)\to\widetilde K(X\wedge Y)$$
defined in the following way: let $\xi_1\in\widetilde K(X)$ and $\xi_2\in\widetilde K(Y)$. Retaining the notation $\pi_1$ and $\pi_2$ for the projections from $X\times Y$ onto $X$ and $Y$ respectively, we can form the product $\xi := \pi_1^*(\xi_1)\pi_2^*(\xi_2)\in\widetilde K(X\times Y)$. By (9.5) we know that $\xi$ can be written in a unique way in the form
$$\pi_1^*(\eta_1)+\pi_2^*(\eta_2)+q^*(\eta_3),\tag{9.6}$$
and we want to show that the first two terms are 0. Let $\iota_X : X\to X\times Y$ denote the map $x\mapsto(x,y_0)$, and similarly let $\iota_Y$ denote the map $y\mapsto(x_0,y)$. We see that $\pi_2\circ\iota_X : X\to Y$ maps $x\mapsto y_0$, i.e. $\iota_X^*\circ\pi_2^* : \widetilde K(Y)\to\widetilde K(X)$ factorizes through $\widetilde K(y_0) = 0$, i.e. $\iota_X^*\circ\pi_2^* = 0$. Similarly $\iota_X^*\circ q^* = 0$. Furthermore, since $\pi_1\circ\iota_X = \mathrm{id}_X$, we get by applying the map $\iota_X^*$ to (9.6) that
$$\iota_X^*(\pi_1^*(\xi_1))\,\iota_X^*(\pi_2^*(\xi_2)) = \eta_1.$$
But the left-hand side is 0, since $\iota_X^*\circ\pi_2^* = 0$, and thus $\eta_1 = 0$. In a similar fashion we get $\eta_2 = 0$, and therefore $\xi = q^*(\eta_3)$. Now we simply define $\mu(\xi_1\otimes\xi_2) := \eta_3$; this is the unique element of $\widetilde K(X\wedge Y)$ whose pullback under $q$ is $\pi_1^*(\xi_1)\pi_2^*(\xi_2)$.

Since $\Sigma^iX = S^i\wedge X$, this exterior product is easy to extend to the higher K-groups: namely, define
$$\mu : \widetilde K^{-i}(X)\otimes_{\mathbb{Z}}\widetilde K^{-j}(Y)\to\widetilde K^{-i-j}(X\wedge Y)$$
to be the exterior product $\widetilde K(S^i\wedge X)\otimes_{\mathbb{Z}}\widetilde K(S^j\wedge Y)\to\widetilde K(S^{i+j}\wedge X\wedge Y)$. In particular, if $X = Y = \{\mathrm{pt}\}$, the exterior product turns $K^*(\mathrm{pt})$ into a graded ring, and since $S^0\wedge X = X$, the exterior product $K^{-i}(\mathrm{pt})\otimes_{\mathbb{Z}}\widetilde K^{-j}(X)\to\widetilde K^{-i-j}(X)$ turns $\widetilde K^*(X)$ into a graded module³ over $K^*(\mathrm{pt})$.

³A graded module $M$ over a graded ring $R = \bigoplus_iR_i$ is a module over $R$ with a grading $M = \bigoplus_jM_j$ such that $R_iM_j\subseteq M_{i+j}$.
This exterior product can easily be extended to the unreduced case: if we replace $X$ and $Y$ by $X^+$ and $Y^+$ and recall Lemma 9.16, we get a map
$$\mu : K^{-i}(X)\otimes_{\mathbb{Z}}K^{-j}(Y)\to\widetilde K^{-i-j}(X^+\wedge Y^+) = K^{-i-j}(X\times Y).$$
In the same way as above, this exterior product turns $K^*(\mathrm{pt})$ into a graded ring and $K^*(X)$ into a graded module over $K^*(\mathrm{pt})$.

All this works equally well for real K-theory: we can define exterior products $\widetilde{KO}^{-i}(X)\otimes_{\mathbb{Z}}\widetilde{KO}^{-j}(Y)\to\widetilde{KO}^{-i-j}(X\wedge Y)$ and $KO^{-i}(X)\otimes_{\mathbb{Z}}KO^{-j}(Y)\to KO^{-i-j}(X\times Y)$ in exactly the same way as above. These exterior products give $KO^*(\mathrm{pt})$ and $\widetilde{KO}^*(\mathrm{pt})$ the structure of graded rings and turn $KO^*(X)$ and $\widetilde{KO}^*(X)$ into graded modules over the rings $KO^*(\mathrm{pt})$ and $\widetilde{KO}^*(\mathrm{pt})$, respectively.

One may have observed that we have not yet been able to calculate the higher K-groups, not even for a one-point space, not to mention more complicated spaces. In fact, to do so requires a genius like Bott or Atiyah:

Theorem 9.23 (Bott Periodicity I). Let $H$ be the canonical line bundle over $\mathbb{C}P^1 = S^2$, and let $\xi\in K^{-2}(\mathrm{pt}) = \widetilde K(S^2)$ be the element $[H]-[I^1]$. Then there is a ring isomorphism $K^*(\mathrm{pt})\cong\mathbb{Z}[\xi]$. In particular $K^{-i}(\mathrm{pt}) = 0$ if $i$ is odd, $K^{-i}(\mathrm{pt})\cong\mathbb{Z}$ if $i$ is even, and for any $i$ multiplication by $\xi$ gives an isomorphism $K^{-i}(\mathrm{pt})\to K^{-i-2}(\mathrm{pt})$.

We will not prove it here. In a later section we shall deduce it, partly, from the so-called Thom isomorphism.

Our next aim is to calculate the K-groups of spheres. Since $\mathrm{pt}^+ = S^0$, examining the definitions gives
$$K^{-i}(\mathrm{pt}) = \widetilde K(\Sigma^iS^0) = \widetilde K(S^i).\tag{9.7}$$
Thus the K-groups of spheres are closely related to those of a point. From (9.7) we read off the reduced K-groups of the spheres immediately:
$$\widetilde K(S^n) = \begin{cases}\mathbb{Z}&\text{if }n\text{ is even}\\ 0&\text{if }n\text{ is odd.}\end{cases}$$
Thanks to Proposition 9.10 we get the unreduced K-groups as well:
$$K(S^n) = \begin{cases}\mathbb{Z}\oplus\mathbb{Z}&\text{if }n\text{ is even}\\ \mathbb{Z}&\text{if }n\text{ is odd.}\end{cases}$$
We may even calculate the zeroth K-group of the torus $T^2 = S^1\times S^1$ by virtue of (9.4):
$$\widetilde K(T^2) = \widetilde K(S^1\wedge S^1)\oplus\widetilde K(S^1)\oplus\widetilde K(S^1) = \widetilde K(S^2) = \mathbb{Z},$$
and by Proposition 9.10 we have also $K(T^2) = \mathbb{Z}\oplus\mathbb{Z}$. For the higher K-groups we get
$$\widetilde K^{-i}(S^n) = \widetilde K(\Sigma^iS^n) = \widetilde K(S^i\wedge S^n) = \widetilde K(S^{n+i}).$$
In particular, these K-groups repeat themselves with a period of 2, just as in the case of a point. This is not a coincidence; in fact this remarkable property holds for any space:
Theorem 9.24 (Bott Periodicity II). Let $X$ be a compact space. Then module multiplication by $\xi$ as defined above gives isomorphisms
$$K^{-i}(X)\to K^{-i-2}(X)\qquad\text{and}\qquad\widetilde K^{-i}(X)\to\widetilde K^{-i-2}(X).$$

It is an immediate consequence of the definitions that the relative K-groups also repeat themselves with a period of 2:
$$K^{-i}(X,A) = \widetilde K(\Sigma^i(X/A)) = \widetilde K^{-i}(X/A)\cong\widetilde K^{-i-2}(X/A) = \widetilde K(\Sigma^{i+2}(X/A)) = K^{-i-2}(X,A).$$
At this point it is very important to stress that this is where complex, real and quaternionic K-theory go their separate ways: there are periodicity results in those theories as well, but in the real and quaternionic cases the period is 8 (the quaternionic groups repeating the real ones with a shift of 4). The lower period in complex K-theory makes the complex K-groups substantially simpler to compute.

Corollary 9.25. Composing the isomorphism $K^0(A)\to K^{-2}(A)$ with the connecting homomorphism $\delta : K^{-2}(A)\to K^{-1}(X,A)$ from the long exact sequence, we can reduce the long exact sequence of the pair $(X,A)$ to the following six-term exact sequence:
$$\begin{array}{ccccc}K^0(X,A)&\to&K^0(X)&\to&K^0(A)\\ \uparrow&&&&\downarrow\\ K^{-1}(A)&\leftarrow&K^{-1}(X)&\leftarrow&K^{-1}(X,A)\end{array}$$
A similar statement holds in the reduced case.

9.4 Equivariant K-theory

In this section we generalize the K-theory described thus far to topological spaces $X$ carrying a continuous action of a topological group $G$, where by a continuous action we mean a continuous map $\theta : G\times X\to X$, written $(g,x)\mapsto\theta_g(x) = g\cdot x$, satisfying $g_1\cdot(g_2\cdot x) = (g_1g_2)\cdot x$. For short we say that $X$ is a $G$-space. A $G$-map or equivariant map between two $G$-spaces $X$ and $Y$ is a map $f : X\to Y$ such that $f(g\cdot x) = g\cdot f(x)$.

Definition 9.26. Let $X$ be a $G$-space. A (complex) $G$-vector bundle, or simply a $G$-bundle (not to be confused with a principal $G$-bundle, which is something completely different), over $X$ is a complex vector bundle $\pi : E\to X$ over $X$, the total space of which is a $G$-space in such a way that $\pi$ is a $G$-map (i.e. if $v\in E_x$ then $g\cdot v\in E_{g\cdot x}$) and such that the action $\theta_g : E_x\to E_{g\cdot x}$ is linear.

Note that all this reduces to the usual definition of a vector bundle if $G$ is the trivial group. Note, however, that the total space of a $G$-bundle over a space $X$ with a trivial $G$-action need not be a trivial $G$-space.

Example 9.27. 1) The product bundle: Let $X$ be a $G$-space and $V$ a $G$-module (i.e. a complex vector space with a representation $\rho : G\to\mathrm{Aut}(V)$). Then $\underline V := X\times V$ is a $G$-bundle with the action $g\cdot(x,v) = (g\cdot x,\rho(g)v)$. This obviously satisfies the requirements in the definition.

2) Complexified tangent bundle: Let $X$ be a smooth manifold with a smooth $G$-action $\theta$ and consider the complexified tangent bundle $TX_{\mathbb{C}} := TX\otimes_{\mathbb{R}}\mathbb{C}$. Then the diffeomorphism $\theta_g : X\to X$ induces a push-forward bundle map $(\theta_g)_* : TX\to TX$ and hence also a self-map on the complexified bundle.
This map is easily seen to be a smooth action on the space $TX_{\mathbb{C}}$, and since it is linear on fibers and maps $(\theta_g)_* : T_xX\otimes_{\mathbb{R}}\mathbb{C}\to T_{g\cdot x}X\otimes_{\mathbb{R}}\mathbb{C}$, the projection map is equivariant with respect to these $G$-actions, hence turning $TX_{\mathbb{C}}$ into a $G$-bundle.

Definition 9.28. Let $E$ be a $G$-bundle. A section $s$ of $E$ which is also a $G$-map is called equivariant. The vector space of equivariant sections is denoted $\Gamma_G(E)$. If $E$ and $F$ are two $G$-bundles over $X$, then a bundle map $\varphi : E\to F$ is called equivariant, or a $G$-bundle map, if it is a $G$-map. The set of $G$-bundle maps is denoted $\mathrm{Hom}_G(E,F)$. In particular, a $G$-bundle isomorphism is a $G$-bundle map which is also a bundle isomorphism. The set of $G$-isomorphism classes of $G$-bundles over $X$ is denoted $\mathrm{Vect}^G_{\mathbb{C}}(X)$; by $[E]$ we denote the $G$-isomorphism class containing the $G$-bundle $E$.

All the usual constructions on vector bundles carry over to $G$-bundles: we can form the direct sum $E\oplus F$ by endowing it with the action $g\cdot(v,w) = (g\cdot v,g\cdot w)$. We can form the tensor product $E\otimes F$ by giving it the action $g\cdot(v\otimes w) = g\cdot v\otimes g\cdot w$ (extended linearly). Finally, given a $G$-map $f : Y\to X$ and a $G$-bundle $E$ over $X$, we give the pullback bundle $f^*E = \{(x,v)\in Y\times E\mid f(x) = \pi(v)\}$ the action $g\cdot(x,v) = (g\cdot x,g\cdot v)$. This is well-defined since $f(g\cdot x) = g\cdot f(x) = g\cdot\pi(v) = \pi(g\cdot v)$, i.e. $g\cdot(x,v)\in f^*E$. It turns $f^*E$ into a $G$-bundle.

As for ordinary vector bundles (cf. Corollary 9.4) one can show:

Proposition 9.29. Let $f_0,f_1 : Y\to X$ be two $G$-maps which are $G$-homotopic, i.e. homotopic through $G$-maps, and let $E$ be a $G$-bundle over $X$. Then the $G$-bundles $f_0^*E$ and $f_1^*E$ are $G$-isomorphic.

We won't prove it here.⁴ An important corollary is of course that any $G$-bundle over a $G$-contractible base space is trivial. Another important result that will become useful to us is the following generalization of Proposition 9.2:⁵

Proposition 9.30. Let $X$ be a compact $G$-space and $E$ a $G$-bundle over $X$. Then there exist another $G$-bundle $E'$ over $X$ and a $G$-module $V$ such that $E\oplus E'\cong\underline V$.

Thus, over compact spaces we always have complementary $G$-bundles. The set of $G$-isomorphism classes of $G$-bundles over $X$ is an abelian semigroup with respect to direct sum; thus we may define:

Definition 9.31. Let $X$ be a compact $G$-space. The equivariant K-group $K_G(X)$ of $X$ is the Grothendieck group of the semigroup of $G$-bundles over $X$.

As in ordinary K-theory, the elements of $K_G(X)$ are formal differences $[E_0]-[E_1]$ of isomorphism classes, two such differences $[E_0]-[E_1]$ and $[F_0]-[F_1]$ being equal if $E_0\oplus F_1\oplus H\cong F_0\oplus E_1\oplus H$ for some $G$-bundle $H$ (equivalently, by Proposition 9.30, for some product $G$-bundle $\underline V$). Direct sum of $G$-bundles gives $K_G(X)$ the structure of an abelian group, and tensor product gives $K_G(X)$ the structure of a commutative ring.

⁴For a proof see [Seg], Proposition 1.3.
⁵A proof is given in [Seg], Proposition 2.4.
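Looking back at Definition 9.26 and Example 9.27(1), the $G$-bundle axioms are easy to instantiate concretely. The following toy sketch is not part of the text (plain Python; the two-point base, the swap action and the sign representation are illustrative assumptions): it checks, for $G = \mathbb{Z}/2$, that the product bundle $X\times\mathbb{C}$ with the sign action satisfies both requirements of the definition.

```python
# G = Z/2 acting on X = {0, 1} by swapping, V = C with the sign representation.
# The product bundle E = X x V carries the action g.(x, v) = (g.x, rho(g) v).
X = [0, 1]
swap = lambda x: 1 - x          # the nontrivial element of G acting on X
rho = -1.0                      # ... and on the fibre V = C (sign representation)
act = lambda xv: (swap(xv[0]), rho * xv[1])
proj = lambda xv: xv[0]

# Axiom 1: the projection is a G-map (v in E_x implies g.v in E_{g.x}).
for x in X:
    for v in (1.0, 2.5 - 1j):
        assert proj(act((x, v))) == swap(x)

# Axiom 2: theta_g : E_x -> E_{g.x} is linear on the fibre.
a, b, u, w = 2.0, -3.0, 1.0 + 1j, 0.5
assert act((0, a*u + b*w))[1] == a * act((0, u))[1] + b * act((0, w))[1]
print("X x C with the sign action is a Z/2-bundle")
```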
If $f : Y \to X$ is a $G$-map, the pullback of $G$-bundles induces a map in K-theory $f^* : K_G(X) \to K_G(Y)$, and as in ordinary K-theory this is a ring homomorphism. Thus $K_G$ becomes a contravariant functor from the category of $G$-spaces and $G$-maps to the category of commutative rings. From Proposition 9.29 we see that two $G$-homotopic maps induce the same map in equivariant K-theory.

Example 9.32. Let us calculate the equivariant K-group of a one-point $G$-space. By Example 9.27 we see that $\operatorname{Vect}_{\mathbb{C}}^G(\mathrm{pt})$ equals the semigroup of finite-dimensional $G$-modules. The Grothendieck group of this is the so-called representation ring $R(G)$, which for compact $G$ is, as an abelian group, free abelian on the isomorphism classes of irreducible finite-dimensional $G$-modules. Thus $K_G(\mathrm{pt}) = R(G)$. This is in accordance with the result from usual K-theory, $K(\mathrm{pt}) = \mathbb{Z}$, since the trivial group $\{e\}$ has only one irreducible representation, so that its representation ring is the ring of integers.

Let $X$ be a space. We say that $X$ is a based or pointed $G$-space if it has a base point $x_0$ satisfying $g \cdot x_0 = x_0$ for all $g \in G$. The inclusion of the base point into the based space is readily seen to be a $G$-map.

Definition 9.33. Let $X$ be a compact based $G$-space, the base point of which is denoted $x_0$. Letting $i : \{x_0\} \to X$ denote the inclusion, we denote by $\widetilde{K}_G(X)$ the subring $\ker i^*$ of $K_G(X)$. This is called the reduced equivariant K-group of $X$.

Thanks to Proposition 9.30 we have another description of this reduced group. We say that two $G$-bundles $E$ and $F$ are $G$-stably equivalent if there exist $G$-modules $V$ and $W$ such that $E \oplus \underline{V} \cong F \oplus \underline{W}$. This equivalence relation on $\operatorname{Vect}_{\mathbb{C}}^G(X)$ we denote $\approx_s$. Now the proof of Proposition 9.13 carries over more or less verbatim to yield:

Proposition 9.34. The quotient $\operatorname{Vect}_{\mathbb{C}}^G(X)/\approx_s$ is a group which is canonically isomorphic to the group $\widetilde{K}_G(X)$.

Thus, again, the reduced group itself does not depend on the base point, but the ring structure and the way it sits inside $K_G(X)$ do.

Before we can define the relative and higher order equivariant K-groups, we need to investigate how group actions behave with respect to quotients. Let $X$ be a compact space with $G$-action $\theta$ and $A \subseteq X$ a closed subspace which is invariant under the group action, i.e. $g \cdot A \subseteq A$ for all $g \in G$. Such a pair is called a compact $G$-pair. We want to give the quotient space $X/A$ a $G$-action. Let $q : X \to X/A$ denote the quotient map. Then there is a unique continuous map $\bar{\theta} : G \times X/A \to X/A$ making the diagram
$$\begin{array}{ccc}
G \times X & \xrightarrow{\;\theta\;} & X \\
{\scriptstyle \operatorname{id}_G \times q}\downarrow & & \downarrow{\scriptstyle q} \\
G \times X/A & \xrightarrow{\;\bar{\theta}\;} & X/A
\end{array}$$
commute, and this is easily seen to be a $G$-action on $X/A$.

If $X$ is a compact based $G$-space, then we can equip $X \times I$ with the $G$-action $g \cdot (x,t) = (g \cdot x, t)$. With this $G$-action the subspaces $X \times \{0\}$, $X \times \{1\}$ and $\{x_0\} \times I$ are all $G$-invariant closed subspaces of $X \times I$, and thus we can collapse them and obtain a $G$-action on the quotient space, which is of course just $\Sigma X$. In complete analogy with ordinary K-theory we can now define:
Definition 9.35. Let $(X,A)$ be a compact $G$-pair and $Y$ a compact based $G$-space. For $n \geq 0$ define
$$K_G^{-n}(X) := \widetilde{K}_G(\Sigma^n(X^+)), \qquad \widetilde{K}_G^{-n}(Y) := \widetilde{K}_G(\Sigma^nY), \qquad K_G^{-n}(X,A) = \widetilde{K}_G^{-n}(X,A) := \widetilde{K}_G(\Sigma^n(X/A)).$$
Furthermore we define
$$K_G^*(X) := \bigoplus_{n \geq 0} K_G^{-n}(X), \qquad \widetilde{K}_G^*(Y) := \bigoplus_{n \geq 0} \widetilde{K}_G^{-n}(Y), \qquad K_G^*(X,A) := \bigoplus_{n \geq 0} K_G^{-n}(X,A).$$

We even have long exact sequences:

Theorem 9.36. Let $(X,A)$ be a compact $G$-pair. Then there exist connecting homomorphisms $\delta : K_G^{-n}(A) \to K_G^{-n+1}(X,A)$ such that the following sequence is exact:
$$\cdots \to K_G^{-2}(A) \xrightarrow{\;\delta\;} K_G^{-1}(X,A) \xrightarrow{\;q^*\;} K_G^{-1}(X) \xrightarrow{\;i^*\;} K_G^{-1}(A) \xrightarrow{\;\delta\;} K_G^0(X,A) \xrightarrow{\;q^*\;} K_G^0(X) \xrightarrow{\;i^*\;} K_G^0(A).$$
Likewise, if $(X,A)$ is a compact $G$-pair and $A$ is furthermore a based $G$-space, then the following sequence in reduced K-theory is exact:
$$\cdots \to \widetilde{K}_G^{-2}(A) \xrightarrow{\;\delta\;} K_G^{-1}(X,A) \xrightarrow{\;q^*\;} \widetilde{K}_G^{-1}(X) \xrightarrow{\;i^*\;} \widetilde{K}_G^{-1}(A) \xrightarrow{\;\delta\;} K_G^0(X,A) \xrightarrow{\;q^*\;} \widetilde{K}_G^0(X) \xrightarrow{\;i^*\;} \widetilde{K}_G^0(A).$$
The proof doesn't really involve the group $G$, and so the earlier proof for usual K-theory holds in this case as well.

If $X$ is a locally compact space, we can form the one-point compactification $X \cup \{\infty\}$, which we denote by $X^+$. This is in accordance with the previous use of the same notation, for if $X$ happens to be compact, then the one-point compactification is indeed just $X$ with a disjoint point added. If $X$ is a $G$-space, we can extend the $G$-action to $X^+$ by defining $g \cdot \infty = \infty$. Since $\theta_g : X \to X$ is a homeomorphism, hence a proper map, the extension $X^+ \to X^+$ is continuous. We thus extend equivariant K-theory to the category of locally compact spaces in the following way:

Definition 9.37. Let $X$ be a locally compact $G$-space and $A \subseteq X$ a closed $G$-invariant subspace. Then we define
$$K_G^{-n}(X) := \widetilde{K}_G(\Sigma^n(X^+)) \quad\text{and}\quad K_G^{-n}(X,A) := K_G^{-n}(X^+, A^+).$$
This is called equivariant K-theory with compact support.

If $X$ is already compact, then one can show (as in Lemma 9.16) that $\widetilde{K}_G(X^+) \cong K_G(X)$, and hence that this new definition is a genuine extension of equivariant K-theory to the category of locally compact spaces. However, the functorial properties are a bit more complicated than before. In fact, K-theory with compact support is a functor in two different ways: First, it is a contravariant functor with respect to proper maps (i.e. maps for which pre-images of compact sets are compact). For if $f : X \to Y$ is a proper map, then it extends in an obvious way to a continuous map $f^+ : X^+ \to Y^+$, and this induces a map $(f^+)^* : \widetilde{K}_G(Y^+) \to \widetilde{K}_G(X^+)$, i.e. a map $K_G(Y) \to K_G(X)$.
Second, it is a covariant functor with respect to inclusions: if $X$ is a locally compact space and $U \subseteq X$ is an open subset of $X$, then we get a homomorphism $K_G(U) \to K_G(X)$ induced by the map $X^+ \to X^+/(X^+ \setminus U) \cong U^+$.

The following identities will become useful:

Lemma 9.38. Let $X$ be a locally compact $G$-space and $A \subseteq X$ a closed $G$-subset. Then we have
$$K_G^{-n}(X) \cong K_G(X \times \mathbb{R}^n) \quad\text{and}\quad K_G^{-n}(X,A) \cong K_G(X \times \mathbb{R}^n, A \times \mathbb{R}^n).$$

Proof. This is an exercise in the definitions. Since $\Sigma^nX^+ = S^n \wedge X^+ = (\mathbb{R}^n \times X)^+$ we see that
$$K_G^{-n}(X) = \widetilde{K}_G(\Sigma^nX^+) = \widetilde{K}_G((X \times \mathbb{R}^n)^+) = K_G(X \times \mathbb{R}^n).$$
To see that the second identity is true, we note that $\Sigma^n(X^+ \cup_{A^+} C(A^+)) = \Sigma^nX^+ \cup_{\Sigma^nA^+} C(\Sigma^n(A^+))$ and that $K_G^{-n}(X,A) = \widetilde{K}_G(\Sigma^n(X^+ \cup_{A^+} CA^+))$, and therefore
$$K_G^{-n}(X,A) = \widetilde{K}_G\big(\Sigma^nX^+ \cup_{\Sigma^nA^+} C(\Sigma^nA^+)\big) = \widetilde{K}_G\big(X^+ \wedge S^n \cup C(A^+ \wedge S^n)\big) = K_G(X^+ \wedge S^n, A^+ \wedge S^n) = K_G(X \times \mathbb{R}^n, A \times \mathbb{R}^n).$$

9.5 The Thom Isomorphism

Now we will take a quite different approach to equivariant K-theory, a description which goes via complexes of $G$-bundles. This will help us define the so-called Thom isomorphism and ultimately lead us to Bott periodicity. Our starting point will be the following: Let $X$ be a $G$-space and consider a finite complex $E^\bullet$ of $G$-bundles over $X$
$$\cdots \longrightarrow E^{i-1} \xrightarrow{\;d^{i-1}\;} E^i \xrightarrow{\;d^i\;} E^{i+1} \xrightarrow{\;d^{i+1}\;} \cdots$$
where the $d^i$'s are $G$-bundle maps such that $d^i \circ d^{i-1} = 0$, and finite means that $E^i = 0$ when $|i| \geq N$ for some $N$. Such an object we call a $G$-complex, and the maps $d^i$ we call differentials. We define the support $\operatorname{supp} E^\bullet$ to be the set of $x \in X$ for which the sequence of vector spaces and linear maps
$$\cdots \longrightarrow E^{i-1}_x \xrightarrow{\;d^{i-1}_x\;} E^i_x \xrightarrow{\;d^i_x\;} E^{i+1}_x \xrightarrow{\;d^{i+1}_x\;} \cdots$$
fails to be exact.

Lemma 9.39. The support of a complex is a closed set.

Proof. First note that the set $\{x \mid \dim\operatorname{im} d^i_x > k\}$ is open, for if $d^i_{x_0}$ has rank greater than $k$, it has a matrix representation in which there is a $(k+1) \times (k+1)$-submatrix with non-zero determinant. Since this matrix depends continuously on $x$, there is at least a neighborhood around $x_0$ on which the
same $(k+1) \times (k+1)$-submatrix has non-zero determinant. Thus the set above is open. In the same way we see that $\{x \mid \dim\ker d^i_x < l\}$ is open. Thus the sets
$$\{x \mid \dim\operatorname{im} d^i_x \leq k\} \quad\text{and}\quad \{x \mid \dim\ker d^i_x \geq l\}$$
are closed. We see that
$$\operatorname{supp} E^\bullet = \bigcup_i \{x \mid \dim\ker d^i_x > \dim\operatorname{im} d^{i-1}_x\}$$
and that
$$\{x \mid \dim\ker d^i_x > \dim\operatorname{im} d^{i-1}_x\} = \bigcup_l \big(\{x \mid \dim\ker d^i_x \geq l\} \setminus \{x \mid \dim\operatorname{im} d^{i-1}_x > l-1\}\big)$$
is closed (the union is finite, and each term is the intersection of a closed set with the complement of an open set).

If the complex is exact at every point (i.e. if the support is empty) we say that the complex is acyclic. An acyclic complex consisting of only two non-zero bundles is called a simple acyclic complex. It is an elementary fact that any acyclic complex can be written as a direct sum of simple acyclic complexes. A morphism of complexes $f : E^\bullet \to F^\bullet$ is a collection of $G$-bundle maps $f^i : E^i \to F^i$ commuting with the differentials, i.e. making the following diagram commute:
$$\begin{array}{ccccc}
\cdots \to E^{i-1} & \xrightarrow{\;d^{i-1}\;} & E^i & \xrightarrow{\;d^i\;} & E^{i+1} \to \cdots \\
{\scriptstyle f^{i-1}}\downarrow & & {\scriptstyle f^i}\downarrow & & \downarrow{\scriptstyle f^{i+1}} \\
\cdots \to F^{i-1} & \xrightarrow{\;d^{i-1}\;} & F^i & \xrightarrow{\;d^i\;} & F^{i+1} \to \cdots
\end{array}$$
A morphism $f$ of complexes for which each of the $G$-bundle maps $f^i$ is a bundle isomorphism is called an isomorphism of complexes.

Example 9.40. An example which will become important shortly, when we construct the Thom isomorphism, is the so-called Koszul complex. Let $E$ be a $G$-bundle over $X$ and $s$ an equivariant section of $E$. From this we construct the following complex
$$0 \longrightarrow \underline{\mathbb{C}} \xrightarrow{\;d^0\;} E \xrightarrow{\;d^1\;} \Lambda^2E \xrightarrow{\;d^2\;} \cdots \xrightarrow{\;d^{n-1}\;} \Lambda^nE \longrightarrow 0$$
consisting of exterior powers of $E$, which are given the natural $G$-action $g \cdot (\xi_1 \wedge \cdots \wedge \xi_k) = g \cdot \xi_1 \wedge \cdots \wedge g \cdot \xi_k$, and with differentials $d^k(\xi) = \xi \wedge s(x)$ for $\xi \in \Lambda^kE_x$. The maps $d^k$ are easily seen to be $G$-bundle maps: they obviously map fibers to fibers, and if $(U,\Phi)$ is a trivialization for $E$, then it also trivializes $\Lambda^kE$, i.e. $\Lambda^kE|_U \cong U \times \Lambda^k\mathbb{C}^n$, and locally the map $d^k$ takes the form
$$U \times \Lambda^k\mathbb{C}^n \ni (x,\xi) \mapsto (x, \xi \wedge \Phi(s(x))),$$
which is continuous. Thus $d^k$ is a bundle map, and clearly $d^k \circ d^{k-1} = 0$. To see that the differentials respect the $G$-action, let $\xi$ be in the fiber over $x$:
$$d^k(g \cdot \xi) = (g \cdot \xi) \wedge s(g \cdot x) = (g \cdot \xi) \wedge g \cdot s(x) = g \cdot (\xi \wedge s(x)) = g \cdot d^k(\xi).$$
Since wedging with a non-zero vector is known to yield an exact sequence, we see that the support of the Koszul complex equals the set of zeros of $s$. If $s$ is everywhere non-zero, the complex is acyclic.
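For concreteness, here is the rank-two case of the exactness statement just used; this verification is standard linear algebra and is not spelled out in the text. Let $V$ be a two-dimensional fiber and $v \in V$ non-zero, and extend $v$ to a basis $(v,w)$. In the sequence
$$0 \longrightarrow \mathbb{C} \xrightarrow{\;z \mapsto zv\;} V \xrightarrow{\;\xi \mapsto \xi \wedge v\;} \Lambda^2V \longrightarrow 0$$
the first map is injective since $v \neq 0$; for $\xi = av + bw$ we have $\xi \wedge v = b\,w \wedge v$, so the kernel of the second map is exactly $\mathbb{C}v$, the image of the first; and $w \wedge v$ spans $\Lambda^2V$, so the second map is surjective. The sequence is thus exact precisely because $v \neq 0$; at a zero of $s$ all differentials vanish and exactness fails.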
The usual operations on $G$-vector bundles can be generalized to $G$-complexes: If $E^\bullet$ and $F^\bullet$ are two $G$-complexes, we can form their direct sum $E^\bullet \oplus F^\bullet$
$$\cdots \longrightarrow E^i \oplus F^i \xrightarrow{\;d^i \oplus d'^i\;} E^{i+1} \oplus F^{i+1} \longrightarrow \cdots$$
where we simply form the direct sum of the corresponding vector bundles and differentials. We see that $\operatorname{supp}(E^\bullet \oplus F^\bullet) = \operatorname{supp} E^\bullet \cup \operatorname{supp} F^\bullet$, for if $x$ is some point where $E^\bullet$ or $F^\bullet$ fails to be exact, then the direct sum complex fails to be exact at that point, and conversely, if both $E^\bullet$ and $F^\bullet$ are exact at $x$, then also $E^\bullet \oplus F^\bullet$ is exact at $x$.

In almost the same way we define the tensor product $E^\bullet \otimes F^\bullet$ of two $G$-complexes. The $k$th bundle in this complex should be
$$(E \otimes F)^k = \bigoplus_{i+j=k} E^i \otimes F^j$$
and the differential $D^k : (E \otimes F)^k \to (E \otimes F)^{k+1}$ is given by
$$D^k = \sum_{i+j=k} \big(d^i \otimes \operatorname{id} + (-1)^i\operatorname{id} \otimes d^j\big),$$
meaning that on the component $E^i \otimes F^j$ the differential maps to $E^{i+1} \otimes F^j$ by $d^i \otimes \operatorname{id}$ and to $E^i \otimes F^{j+1}$ by $(-1)^i\operatorname{id} \otimes d^j$ (the sign is needed to ensure $D^{k+1} \circ D^k = 0$). This time we have
$$\operatorname{supp}(E^\bullet \otimes F^\bullet) = \operatorname{supp} E^\bullet \cap \operatorname{supp} F^\bullet,$$
i.e. only at points $x$ where both complexes fail to be exact will the tensor product fail to be exact.

The third and final construction is the pullback. Let $E^\bullet$ be a $G$-complex over $X$ and let $f : Y \to X$ be a $G$-map. We define a $G$-complex $f^*(E^\bullet)$ over $Y$ by $f^*(E^\bullet)^k = f^*E^k$ with the differential $d'^k : f^*E^k \to f^*E^{k+1}$ given by $d'^k(y,\xi) = (y, d^k(\xi))$. We see that $\operatorname{supp}(f^*(E^\bullet)) = f^{-1}(\operatorname{supp}(E^\bullet))$.

Let $(X,A)$ be a compact $G$-pair and let $L_G(X,A)$ denote the set of isomorphism classes of $G$-complexes having compact support inside $X \setminus A$, i.e. on elements of $A$ the complex is exact. In particular, all acyclic complexes live in $L_G(X,A)$ for all $A$. If $A$ happens to be the empty set, we will write $L_G(X)$ instead of $L_G(X,\emptyset)$ for the set of isomorphism classes of $G$-complexes with compact support. If the complexes $E^\bullet$ and $F^\bullet$ both have support inside $X \setminus A$, then as
$$\operatorname{supp}(E^\bullet \oplus F^\bullet) = \operatorname{supp} E^\bullet \cup \operatorname{supp} F^\bullet \subseteq X \setminus A,$$
the set $L_G(X,A)$ is closed under direct sum; in fact it is a semigroup w.r.t. direct sum. Furthermore, as taking the tensor product of complexes decreases support, $L_G(X,A)$ is closed under tensor products as well, thus turning it into a semi-ring.

Inside $L_G(X,A)$ we define the notion of homotopy: Two complexes $E^\bullet$ and $F^\bullet$ in $L_G(X,A)$ are said to be homotopic if there exists a $G$-complex $H^\bullet \in L_G(X \times [0,1], A \times [0,1])$ (the product space $X \times [0,1]$ is equipped with the usual $G$-action $g \cdot (x,t) = (g \cdot x, t)$) such that $H^\bullet|_{X \times \{0\}} \cong E^\bullet$ and $H^\bullet|_{X \times \{1\}} \cong F^\bullet$. This equivalence relation is denoted $\simeq_h$. It is easy to see that
$$E^\bullet \simeq_h \widetilde{E}^\bullet \implies E^\bullet \oplus F^\bullet \simeq_h \widetilde{E}^\bullet \oplus F^\bullet. \tag{9.8}$$

Lemma 9.41. Let $X$ be compact. Then any complex
$$\cdots \longrightarrow E^{i-1} \xrightarrow{\;d^{i-1}\;} E^i \xrightarrow{\;d^i\;} E^{i+1} \xrightarrow{\;d^{i+1}\;} \cdots$$
is homotopic to the corresponding complex in which the differentials are all 0.
Proof. Let $p$ denote the projection $X \times [0,1] \to X$ and put $\widetilde{E}^i = p^*E^i$; then the fiber $\widetilde{E}^i_{(x,t)}$ over $(x,t)$ is isomorphic to $E^i_x$. Define a differential $\widetilde{d}^i : \widetilde{E}^i \to \widetilde{E}^{i+1}$ by $\widetilde{d}^i(v) = t\,d^i(v)$ for $v \in \widetilde{E}^i_{(x,t)} = E^i_x$. Then $\widetilde{E}^\bullet$ becomes a complex over $X \times [0,1]$, and we see that $\widetilde{E}^\bullet|_{X \times \{1\}}$ is isomorphic to $E^\bullet$ and that $\widetilde{E}^\bullet|_{X \times \{0\}}$ is a complex with 0-differentials.

Introduce also in $L_G(X,A)$ the following equivalence relation: $E^\bullet \sim F^\bullet$ iff there exist acyclic complexes $H_0^\bullet$ and $H_1^\bullet$ on $X$ such that $E^\bullet \oplus H_0^\bullet \simeq_h F^\bullet \oplus H_1^\bullet$. The equivalence class containing the complex $E^\bullet$ is denoted $\langle E^\bullet \rangle$. Observe that all acyclic complexes are equivalent.

Theorem 9.42. The quotient $L_G(X,A)/\sim$ is naturally an abelian group which is naturally isomorphic to $K_G(X,A)$.

Proof. We show it only in the very special case where $X$ is compact and $A = \emptyset$ (for a full proof, see [Seg] Proposition 3.1 and Appendix A). The group $G$ doesn't really matter, so without loss of generality we may assume that $G$ is the trivial group.

First we show that $L(X)/\sim$ is a group. We define addition by $\langle E^\bullet \rangle + \langle F^\bullet \rangle = \langle E^\bullet \oplus F^\bullet \rangle$. From (9.8) it is easy to deduce that this operation is well-defined. If $F^\bullet$ is an acyclic complex, then by definition of $\sim$ one sees that $\langle E^\bullet \rangle + \langle F^\bullet \rangle = \langle E^\bullet \rangle$. Thus we have identified our neutral element. The most complicated part is to construct inverses: Let a complex
$$E^\bullet : 0 \longrightarrow E^1 \xrightarrow{\;d^1\;} E^2 \xrightarrow{\;d^2\;} \cdots \xrightarrow{\;d^{n-1}\;} E^n \longrightarrow 0$$
be given. To $E^1$ we pick a complementary bundle $F^1$, i.e. $E^1 \oplus F^1 \cong X \times \mathbb{C}^{k_1}$. To $E^2$ we pick a bundle $F^2$ such that $E^2 \oplus F^2 \cong X \times \mathbb{C}^{k_2}$ with $k_2 \geq k_1$, i.e. so that an injective linear map $\varphi_1 : \mathbb{C}^{k_1} \to \mathbb{C}^{k_2}$ exists. Pick a bundle $F^3$ such that $E^3 \oplus F^3 \cong X \times \mathbb{C}^{k_3}$, where $k_3$ is so big that an injective linear map $(\operatorname{im}\varphi_1)^\perp \to \mathbb{C}^{k_3}$ exists, and let $\varphi_2 : \mathbb{C}^{k_2} \to \mathbb{C}^{k_3}$ be the linear extension which is 0 on $\operatorname{im}\varphi_1$. Continue in this way to find bundles $F^i$ such that $E^i \oplus F^i \cong X \times \mathbb{C}^{k_i}$ and such that the sequence
$$0 \longrightarrow \mathbb{C}^{k_1} \xrightarrow{\;\varphi_1\;} \mathbb{C}^{k_2} \xrightarrow{\;\varphi_2\;} \cdots \xrightarrow{\;\varphi_{n-1}\;} \mathbb{C}^{k_n} \longrightarrow 0$$
is exact. Let $F^\bullet$ denote the complex consisting of the bundles $F^i$ and zero differentials, and let $H^\bullet$ denote the complex with $H^i = X \times \mathbb{C}^{k_i}$ and differentials $d^i(x,v) = (x, \varphi_i(v))$. By construction $H^\bullet$ is acyclic. Since $E^\bullet \oplus F^\bullet$ and $H^\bullet$ are homotopic (by the lemma above they are both homotopic to the corresponding complex with zero differentials), we have $\langle E^\bullet \rangle + \langle F^\bullet \rangle = \langle H^\bullet \rangle = 0$, i.e. $\langle F^\bullet \rangle$ is an inverse to $\langle E^\bullet \rangle$. Thus $L(X)/\sim$ is a group.

Consider the map $\Phi : L(X) \to K(X)$ given by
$$E^\bullet \longmapsto \sum_k (-1)^k[E^k].$$
This map is easily seen to be additive. If $E^\bullet$ and $F^\bullet$ are homotopic complexes, then by Proposition 9.3 we get $\Phi(E^\bullet) = \Phi(F^\bullet)$. Let $H^\bullet$ be an acyclic complex. This we write as a sum of simple acyclic complexes $H^\bullet = H_1^\bullet \oplus \cdots \oplus H_n^\bullet$, but since each $H_i^\bullet$ consists solely of two (isomorphic) non-zero bundles, $H_i^\bullet$ is mapped to zero by $\Phi$. Thus if $E^\bullet \sim F^\bullet$ we have $\Phi(E^\bullet) = \Phi(F^\bullet)$. Thus we may define $\overline{\Phi} : L(X)/\sim \;\to K(X)$ by
$$\overline{\Phi}(\langle E^\bullet \rangle) = \sum_k (-1)^k[E^k],$$
and this is a group homomorphism. It is trivially surjective, for the element $[E] - [F] \in K(X)$ is hit by the complex $0 \to E \xrightarrow{0} F \to 0$ (with $E$ sitting in even degree and $F$ in odd degree). To see that it is injective, assume that the complex
$$E^\bullet : 0 \longrightarrow E^1 \xrightarrow{\;d^1\;} E^2 \xrightarrow{\;d^2\;} \cdots \xrightarrow{\;d^{n-1}\;} E^n \longrightarrow 0$$
is mapped to 0 by $\Phi$. Let $F^k$ be bundles such that $E^k \oplus F^k$ is trivial. Then the assumption implies that the bundle $H := F^1 \oplus E^2 \oplus F^3 \oplus E^4 \oplus \cdots$ is trivial. By Lemma 9.41 the complex $E^\bullet$ is homotopic to the complex
$$0 \longrightarrow E^1 \xrightarrow{\;0\;} E^2 \xrightarrow{\;0\;} \cdots \xrightarrow{\;0\;} E^n \longrightarrow 0.$$
To this we add the acyclic complex $0 \to F^1 \xrightarrow{\operatorname{id}} F^1 \to 0$ (placed in degrees 1 and 2), yielding the new complex
$$0 \longrightarrow I^{i_1} \longrightarrow E^2 \oplus F^1 \longrightarrow E^3 \longrightarrow \cdots \longrightarrow E^n \longrightarrow 0,$$
where $I^{i_1} = E^1 \oplus F^1$ denotes a trivial bundle. To this we add the complex $0 \to E^1 \oplus F^2 \xrightarrow{\operatorname{id}} E^1 \oplus F^2 \to 0$, where the first copy of $E^1 \oplus F^2$ is added to $E^2 \oplus F^1$ (yielding a trivial bundle $I^{i_2}$) and the second is added to $E^3$. Continuing in this way will ultimately give us the complex
$$0 \longrightarrow I^{i_1} \longrightarrow I^{i_2} \longrightarrow \cdots \longrightarrow I^{i_{n-1}} \longrightarrow H \longrightarrow 0.$$
Thus, by adding acyclic complexes to $E^\bullet$ we have obtained a complex consisting solely of trivial bundles. In much the same way as we constructed inverses in the beginning of this proof, one can add acyclic complexes to this trivial complex so that it becomes homotopic to an acyclic trivial complex. Thus $\langle E^\bullet \rangle$ represents 0 in $L(X)/\sim$, and $\overline{\Phi}$ is injective.

Thus we have obtained a new picture of equivariant K-theory involving $G$-complexes instead of $G$-bundles. Being able to switch back and forth between
the two pictures can often be useful, especially when we come to discuss index theory. The first asset of this new picture is seen in the following, where we construct the Thom isomorphism.

The problem is as follows: Given a $G$-vector bundle $\pi : E \to X$ we would like somehow to obtain a homomorphism between $K_G(E)$ and $K_G(X)$, and preferably (since $E$ and $X$ are homotopy equivalent topological spaces) an isomorphism. However, homotopy equivalence alone can't save us: if $X$ is compact, then $E$ is non-compact, so they belong to two different categories of topological spaces, and thus the projection map $\pi$ does not induce a homomorphism in K-theory; and if $X$ is only locally compact, the projection map $\pi$ is neither an inclusion nor a proper map, so in this case we do not get an induced homomorphism in K-theory either. So in order to obtain such a map we have to proceed differently, and this is where the complexes enter the stage:

Let $\pi : E \to X$ be some fixed $G$-bundle over $X$ (which is still assumed to be at least locally compact). Pull $E$ back along $\pi$ to obtain the bundle $\pi^*E$ over $E$. This bundle has a natural section $s$, namely $E \ni \xi \mapsto (\xi,\xi) \in \pi^*E$, and this is easily seen to be equivariant. From the bundle $\pi^*E$ and the section $s$ we get a Koszul complex over $E$ (cf. Example 9.40):
$$0 \longrightarrow \underline{\mathbb{C}} \xrightarrow{\;\wedge s\;} \pi^*E \xrightarrow{\;\wedge s\;} \Lambda^2(\pi^*E) \xrightarrow{\;\wedge s\;} \cdots \xrightarrow{\;\wedge s\;} \Lambda^n(\pi^*E) \longrightarrow 0,$$
which in the sequel will be denoted $\Lambda_E^\bullet$. From a given $G$-complex $F^\bullet \in L_G(X)$ we pull back to $E$ and tensor with the Koszul complex to obtain $\pi^*(F^\bullet) \otimes \Lambda_E^\bullet$, which is a complex over $E$. The support of this is $\pi^{-1}(\operatorname{supp} F^\bullet) \cap E_0$, where $E_0$ denotes the zero-section of $E$, which equals the set of zeros of $s$. This set is homeomorphic to $\operatorname{supp} F^\bullet$, which is compact, i.e. $\pi^*(F^\bullet) \otimes \Lambda_E^\bullet \in L_G(E)$. Thus we get a map $L_G(X) \to L_G(E)$, and this map is additive. One can check that it induces a map on the quotients, thus giving a map
$$\varphi_* : K_G(X) \longrightarrow K_G(E).$$
This is the Thom homomorphism.

We can extend this to the higher K-groups in the following way: Given a $G$-bundle $\pi : E \to X$ we define a $G$-bundle $\widetilde{\pi} : E \times \mathbb{R}^n \to X \times \mathbb{R}^n$ by $\widetilde{\pi}(v,w) = (\pi(v),w)$, i.e. the fiber over the point $(x,w) \in X \times \mathbb{R}^n$ is $E_x \times \{w\}$. If we give $X \times \mathbb{R}^n$ the $G$-action $g \cdot (x,w) = (g \cdot x, w)$ and $E \times \mathbb{R}^n$ the same $G$-action $g \cdot (v,w) = (g \cdot v, w)$, then $\widetilde{\pi}$ becomes an equivariant map, and $E \times \mathbb{R}^n$ is a $G$-bundle. Thus we get a Thom homomorphism $K_G(X \times \mathbb{R}^n) \to K_G(E \times \mathbb{R}^n)$, and composing with the isomorphisms $K_G^{-n}(X) \cong K_G(X \times \mathbb{R}^n)$ and $K_G^{-n}(E) \cong K_G(E \times \mathbb{R}^n)$ we obtain a map
$$\varphi_* : K_G^{-n}(X) \longrightarrow K_G^{-n}(E).$$
Putting all these together gives us a Thom homomorphism
$$\varphi_* : K_G^*(X) \longrightarrow K_G^*(E).$$
Actually, the Thom homomorphism can be shown to be more than just a homomorphism:

Theorem 9.43 (Thom Isomorphism). For a $G$-bundle $E$ over a locally compact space $X$, the Thom homomorphism $K_G^*(X) \to K_G^*(E)$ is a natural group isomorphism.
Of course, since $\varphi_*$ by construction respects the grading, $\varphi_* : K_G^{-n}(X) \to K_G^{-n}(E)$ is also an isomorphism for each $n$.

Example 9.44. Let's see what information we can extract from the Thom isomorphism in the case of a trivial group $G$ and a trivial bundle $E = X \times \mathbb{C}^n$. Then we get an isomorphism
$$K(X) \cong K(X \times \mathbb{C}^n) = \widetilde{K}((X \times \mathbb{C}^n)^+) = \widetilde{K}(X^+ \wedge S^{2n}) = \widetilde{K}(\Sigma^{2n}X^+) = K^{-2n}(X).$$
In fact, if we consider the trivial bundle $(X \times \mathbb{R}^i) \times \mathbb{C}^n$ over $X \times \mathbb{R}^i$ and recall Lemma 9.38, we get
$$K^{-i}(X) \cong K(X \times \mathbb{R}^i) \cong K(X \times \mathbb{R}^i \times \mathbb{C}^n) \cong K(X \times \mathbb{R}^{2n+i}) \cong K^{-i-2n}(X),$$
that is, we have retrieved Bott periodicity from the Thom isomorphism. That this is possible is a mark of how deep a result the existence of the Thom isomorphism really is.

Assume we have two bundles $\pi : E \to X$ and $\pi' : F \to X$. Then $\pi \oplus \pi' : E \oplus F \to X$ is a bundle over $X$, but it can also be viewed as a bundle over $E$ by the obvious map $p : E \oplus F \to E$. How do the different Thom isomorphisms interact?

Corollary 9.45 (Transitivity of the Thom isomorphism). Let the situation be as above. Then the Thom isomorphisms for the bundles $E \to X$, $E \oplus F \to X$ and $E \oplus F \to E$ make the following diagram commute:
$$\begin{array}{ccc}
K_G(X) & \longrightarrow & K_G(E \oplus F) \\
& \searrow & \uparrow \\
& & K_G(E)
\end{array}$$

Proof. First of all, note that $E \oplus F \to E$ is isomorphic to the bundle $\pi^*F$; thus we will reserve the notation $E \oplus F$ to mean the bundle over $X$. Secondly, observe that
$$\Lambda_{E \oplus F}^\bullet \cong p^*\Lambda_E^\bullet \otimes q^*\Lambda_F^\bullet,$$
where $p : E \oplus F \to E$ and $q : E \oplus F \to F$ denote the projections. This is a simple consequence of the fact that $(\pi \oplus \pi')^*(E \oplus F) \cong p^*\pi^*E \oplus q^*\pi'^*F$ and that
$$\Lambda^k(E \oplus F) \cong \bigoplus_{j=0}^k \Lambda^jE \otimes \Lambda^{k-j}F$$
(note how this fits with the definition of tensor products of complexes). Now for the Thom isomorphisms: The isomorphism $K_G(X) \to K_G(E \oplus F)$ is given by
$$H \longmapsto (\pi \oplus \pi')^*(H) \otimes \Lambda_{E \oplus F}^\bullet = (\pi \oplus \pi')^*(H) \otimes p^*\Lambda_E^\bullet \otimes q^*\Lambda_F^\bullet.$$
The composition $K_G(X) \to K_G(E) \to K_G(E \oplus F)$ is given by
$$H \longmapsto \pi^*H \otimes \Lambda_E^\bullet \longmapsto p^*\pi^*H \otimes p^*\Lambda_E^\bullet \otimes \Lambda_{\pi^*F}^\bullet.$$
We see that $p^*\pi^*H$ equals $(\pi \oplus \pi')^*H$ simply because $\pi \circ p = \pi \oplus \pi'$, and that $\Lambda_{\pi^*F}^\bullet \cong q^*\Lambda_F^\bullet$ since $\pi \circ p = \pi' \circ q$.

Finally, how does the Thom isomorphism for a pullback bundle relate to the Thom isomorphism of the original bundle? Fortunately, they are related in the nicest possible way:
Proposition 9.46. Let $\pi : E \to X$ be a vector bundle and $f : Y \to X$ a continuous map. Let $\varphi_* : K(X) \to K(E)$ denote the Thom isomorphism for $E$ and $\psi_* : K(Y) \to K(f^*E)$ the Thom isomorphism for $f^*E$. The natural map $F : f^*E \to E$ is proper and hence induces a map $F^* : K(E) \to K(f^*E)$, and the following diagram commutes:
$$\begin{array}{ccc}
K(X) & \xrightarrow{\;\varphi_*\;} & K(E) \\
{\scriptstyle f^*}\downarrow & & \downarrow{\scriptstyle F^*} \\
K(Y) & \xrightarrow{\;\psi_*\;} & K(f^*E)
\end{array}$$
Unfortunately, I've not been able to find a proof of this claim.
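Although I cannot offer a proof either, the claim can at least be checked in the simplest situation; the following sanity check is mine (with $x$ and $E_x$ ad hoc notation). Take $X$ compact, $Y = \{x\}$ a point and $f$ the inclusion of $x$ into $X$, so that $f^*E = E_x \cong \mathbb{C}^n$ and $F : E_x \to E$ is the (proper) inclusion of a fiber. The diagram then asserts that
$$F^*\varphi_*(1) = F^*[\Lambda_E^\bullet] = [\Lambda_{E_x}^\bullet] = \psi_*(1) \in K(E_x),$$
i.e. that the Thom class of $E$ restricts on each fiber to the Bott class of that fiber, which is immediate from the construction of the Koszul complex.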
Chapter 10 Characteristic Classes 10.1 Connections on Vector Bundles In a smooth vector bundle there is a priori no way of taking the derivative of sections. Well, for the tangent bundle and tensor products of such there is of course the Lie derivative, but this has the serious drawback of not providing us with a well-defined way of taking derivatives along curves. Therefore we need the notion of a connection which, intuitively speaking, is a device which tells us how the fibers are packed together. Before we get started, let us fix the notation. Throughout this chapter, M and N will denote smooth manifolds without boundary. By C (M) = C (M, R) is meant the algebra of smooth real-valued functions on M, and by C (M, C) is meant the algebra of smooth complex-valued functions on M. It is easy to see that a smooth complex-valued function f can be written on the form f = f 1 +if 2 where f 1, f 2 C (M), and thus that C (M, C) = C (M) R C. If it is clear from the context that the scalar field is K, we will use the short-hand notation C (M) to mean C (M, K). Thus C (M) need not just refer to the algebra of real-valued functions. T M and T M denote the tangent bundle and cotangent bundle over M respectively, and TC M denotes the complexified cotangent bundle. A section of the complexified cotangent bundle is just a complex 1-form ω, i.e. if X is a smooth vector field, then ω(x) is a smooth complex-valued function on M. The set of sections of TC M is denoted Ω1 (M, C), and it is easy to see that Ω 1 (M, C) = Ω 1 (M, R) R C, i.e. we can split ω = ω 1 + iω 2 where ω 1 and ω 2 are real-valued 1-forms. In general we can form the bundle Λ k (TK M) and the sections of this, are K-valued k-forms. The set of such is denoted Ω k (M, K). It is well-known that this is a C (M)-module. Now we can introduce connections on vector bundles. The setup is as follows: Let M be a smooth manifold and E a smooth vector bundle over K (i.e. over R or C). By TK M we simply mean the cotangent bundle if K = R and Hom R (T M, C) = T M R C (i.e. the complexified cotangent bundle) if K = C. By Ω k (M, E) we shall denote the module over C (M) (recall that this means C (M, C) if K = C) of E-valued k-forms, i.e. sections of the bundle Λ k (TK M) K E. Any such section is a linear combination of elements of the form θ s where θ is an ordinary k-form and s is a section of E. It is easy to see that if E = M K n then elements of Ω k (M, E) are just n-tuples of k-forms. 163
Definition 10.1 (Connection). By a connection on $E$ we understand a $\mathbb{K}$-linear map $\nabla : \Gamma(E) \to \Omega^1(M,E)$ satisfying the Leibniz rule
$$\nabla(fs) = df \otimes s + f\nabla s \tag{10.1}$$
for $f \in C^\infty(M)$ and $s \in \Gamma(E)$. $\nabla s$ is called the covariant derivative of $s$. The set of connections on $E$ is denoted $\mathcal{A}_E$.

For each $s \in \Gamma(E)$, $\nabla s$ is a smooth section of the bundle $T^*_{\mathbb{K}}M \otimes E \cong \operatorname{Hom}_{\mathbb{K}}(TM,E)$, i.e. $(\nabla s)_p$ is a $\mathbb{K}$-linear map $T_pM \to E_p$. But smooth sections of the bundle $\operatorname{Hom}_{\mathbb{K}}(TM,E)$ are in 1-1 correspondence with smooth bundle maps $TM \to E$, which again are in 1-1 correspondence with $C^\infty(M)$-linear maps $\mathfrak{X}(M) \to \Gamma(E)$. Thus if $X$ is a vector field on $M$ we get a smooth section of $E$ by defining $\nabla_Xs := \nabla s(X)$, i.e. $(\nabla_Xs)(p) := (\nabla s)_p(X_p)$. As mentioned, this is $C^\infty(M)$-linear in $X$, i.e. $\nabla_{gX}s = g\nabla_Xs$ (but this can also be seen from a direct computation). In this description the Leibniz rule takes the following form:
$$\nabla_X(fs) = \nabla(fs)(X) = (df \otimes s + f\nabla s)(X) = df(X)s + f\nabla s(X) = (Xf)s + f\nabla_Xs.$$
Phrased in this way, a connection is a map $\nabla : \mathfrak{X}(M) \times \Gamma(E) \to \Gamma(E)$ which is $C^\infty(M)$-linear in the first variable, $\mathbb{K}$-linear in the second and satisfies $\nabla_X(fs) = (Xf)s + f\nabla_Xs$. Sometimes this is taken as the definition of a connection.

Of course, one might wonder: does such an object exist at all? The answer is yes. Assume first that $E$ is a product bundle. This means that we have a canonical global frame $\{s_1,\dots,s_n\}$. If $v_1,\dots,v_n \in \Omega^1(M,E)$ are chosen arbitrarily, then define a connection by
$$\nabla(f_1s_1 + \cdots + f_ns_n) = \sum_{i=1}^n (df_i \otimes s_i + f_iv_i). \tag{10.2}$$
Clearly, this is a $\mathbb{K}$-linear map $\Gamma(E) \to \Omega^1(M,E)$, and we see
$$\nabla(fs) = \nabla(ff_1s_1 + \cdots + ff_ns_n) = \sum_{i=1}^n \big(d(ff_i) \otimes s_i + ff_iv_i\big) = \sum_{i=1}^n \big(f(df_i \otimes s_i) + ff_iv_i + f_i\,df \otimes s_i\big) = f\nabla(f_1s_1 + \cdots + f_ns_n) + df \otimes \sum_{i=1}^n f_is_i = f(\nabla s) + df \otimes s.$$
Thus, at least product bundles have connections, and lots of them. If $v_1 = \cdots = v_n = 0$, the corresponding connection is called the trivial connection.

If $E$ is an arbitrary vector bundle and $(U_\alpha)$ is a trivializing cover of $M$, then the results above apply to the trivial vector bundles $E|_{U_\alpha}$, giving a connection $\nabla^\alpha$ on $E|_{U_\alpha}$. Let $(\varphi_\alpha)$ be a partition of unity subordinate to the cover; then a calculation as the one above will show that $\sum_\alpha \varphi_\alpha\nabla^\alpha$ is a connection on $E$. Therefore $\mathcal{A}_E$ is non-empty. In fact, $\mathcal{A}_E$ is quite plentiful: Let $\operatorname{End}(E)$ denote
the bundle $E^* \otimes E$, and let $A \in \Omega^1(M,\operatorname{End}(E))$ act on a section $s \in \Gamma(E)$ by $A(s)(X) := A(X)s$. This is obviously $\mathbb{K}$-linear in $s$, and we have
$$A(fs)(X) = A(X)(fs) = fA(X)s = fA(s)(X).$$
Thus we are in a position to show:

Proposition 10.2. If $\nabla$ is a connection on $E$ and $A \in \Omega^1(M,\operatorname{End}(E))$, then $\nabla + A$ is a connection. Conversely, any connection $\nabla'$ is of the form $\nabla' = \nabla + A$ where $A \in \Omega^1(M,\operatorname{End}(E))$.

Proof. At first we check that $\nabla + A$ is a connection. By the identity above we have
$$(\nabla + A)(fs) = \nabla(fs) + A(fs) = df \otimes s + f(\nabla s) + fA(s) = df \otimes s + f(\nabla + A)s,$$
so $\nabla + A$ is a connection. Now, let $\nabla'$ be any connection. For $f \in C^\infty(M)$ and $s \in \Gamma(E)$ we see
$$(\nabla' - \nabla)(fs) = (df \otimes s + f\nabla's) - (df \otimes s + f\nabla s) = f(\nabla' - \nabla)s,$$
i.e. $\nabla' - \nabla$ is a $C^\infty(M)$-linear map $\Gamma(E) \to \Omega^1(M,E)$. Thus it corresponds to a bundle map $E \to T^*_{\mathbb{K}}M \otimes E$, which again corresponds to a smooth section of the bundle $\operatorname{Hom}(E, T^*_{\mathbb{K}}M \otimes E) \cong E^* \otimes T^*_{\mathbb{K}}M \otimes E \cong T^*_{\mathbb{K}}M \otimes \operatorname{End}(E)$. Thus $\nabla' - \nabla \in \Omega^1(M,\operatorname{End}(E))$.

Observe that connections are local operators, i.e. if the section $s$ vanishes on an open set $U$, then $\nabla s$ also vanishes on $U$. In other words, $\nabla$ decreases support. To see this, let $s$ be a section which is 0 on $U$. Let $p \in U$ be arbitrary and let $f$ be a smooth bump function with $\operatorname{supp} f \subseteq U$ which is identically 1 in a small neighborhood of $p$. Then $fs$ is the zero section, hence $\nabla(fs) = 0$, and therefore
$$(\nabla s)_p = f(p)(\nabla s)_p = \nabla(fs)_p - (df \otimes s)_p = 0.$$
Another way of formulating locality is that $(\nabla s)_p$ depends on $s$ only in an arbitrarily small neighborhood around $p$. Therefore, if $U$ is any open set in $M$ and $E|_U$ is the restriction of $E$ to $U$, it makes sense to restrict $\nabla$ from $E$ to $E|_U$.

Definition 10.3 (Connection 1-forms). Given a connection $\nabla$ on the trivial bundle, define the connection 1-forms $\omega_{ij}$ to be the unique differential 1-forms satisfying
$$\nabla s_j = \sum_{i=1}^n \omega_{ij} \otimes s_i.$$
They are smooth, for if $X$ is a smooth vector field, then
$$\nabla_Xs_j = \sum_{i=1}^n \omega_{ij}(X)s_i$$
is a smooth section of $E$, i.e. the $\omega_{ij}(X)$ are smooth functions and hence the $\omega_{ij}$ are smooth 1-forms. Conversely, if $(\omega_{ij})$ is a matrix of 1-forms, then $v_j = \sum_{i=1}^n \omega_{ij} \otimes s_i$ are sections of $T^*_{\mathbb{K}}M \otimes E$ and hence determine a connection by (10.2). Thus, given a local frame, there is a 1-1 correspondence between connections on $E|_U$ and matrices $(\omega_{ij})$ of 1-forms over $U$.
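For a rank-one bundle this correspondence is particularly transparent; the following two-line example is mine, not the text's. On a trivial line bundle ($n = 1$) with global frame $s$, a connection is determined by the single 1-form $\omega := \omega_{11}$ through $\nabla s = \omega \otimes s$, and the Leibniz rule gives, for an arbitrary section $fs$,
$$\nabla(fs) = df \otimes s + f\omega \otimes s = (df + f\omega) \otimes s.$$
Thus, on coefficient functions, $\nabla$ acts locally as $d + \omega$.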
For the trivial connection on a product bundle, the connection forms relative to the canonical global frame are indeed trivial: by definition $\nabla s_j = d(1) \otimes s_j = 0$, so $\omega_{ij} = 0$.

The next result describes how the connection 1-forms react to a change of trivialization.

Lemma 10.4. Let $(U_\alpha,\varphi_\alpha)$ and $(U_\beta,\varphi_\beta)$ be two trivializations of $E$ with non-trivial overlap, and let $g_{\alpha\beta} : U_\alpha \cap U_\beta \to GL(n,\mathbb{K})$ be the transition function. If $\omega_\alpha$ and $\omega_\beta$ denote the matrices of connection 1-forms on $U_\alpha$ and $U_\beta$ respectively, then for any $p \in U_\alpha \cap U_\beta$:
$$\omega_\beta(p) = g_{\alpha\beta}^{-1}(p)\omega_\alpha(p)g_{\alpha\beta}(p) + g_{\alpha\beta}^{-1}(p)dg_{\alpha\beta}(p),$$
where $dg_{\alpha\beta}$ is the entrywise exterior derivative.

Proof. First, we know that the trivializations $\varphi_\alpha$ and $\varphi_\beta$ yield smooth local frames $(s_i)$ and $(t_i)$ on $U_\alpha$ and $U_\beta$ respectively by
$$s_i(p) = \varphi_\alpha^{-1}(p,e_i), \qquad t_i(p) = \varphi_\beta^{-1}(p,e_i),$$
where $(e_i)$ denotes the standard basis for $\mathbb{K}^n$. If $p \in U_\alpha \cap U_\beta$ then
$$t_j(p) = \varphi_\beta^{-1}(p,e_j) = \varphi_\alpha^{-1}\big(\varphi_\alpha\varphi_\beta^{-1}(p,e_j)\big) = \varphi_\alpha^{-1}(p, g_{\alpha\beta}(p)e_j) = \sum_{i=1}^n (g_{\alpha\beta})_{ij}(p)\,\varphi_\alpha^{-1}(p,e_i),$$
in other words $t_j = \sum_{i=1}^n (g_{\alpha\beta})_{ij}s_i$. We act on this by $\nabla_X$ and get
$$\sum_{k=1}^n (\omega_\beta)_{kj}(X)t_k = \sum_{i=1}^n \nabla_X\big((g_{\alpha\beta})_{ij}s_i\big) = \sum_{i=1}^n d(g_{\alpha\beta})_{ij}(X)s_i + \sum_{i,k=1}^n (g_{\alpha\beta})_{kj}(\omega_\alpha)_{ik}(X)s_i.$$
Now we plug the expression for $t_k$ into the left-hand side and get
$$\sum_{i,k=1}^n (\omega_\beta)_{kj}(X)(g_{\alpha\beta})_{ik}s_i = \sum_{i=1}^n d(g_{\alpha\beta})_{ij}(X)s_i + \sum_{i,k=1}^n (g_{\alpha\beta})_{kj}(\omega_\alpha)_{ik}(X)s_i.$$
Comparing the coefficients of $s_i$ we get the equation
$$\sum_{k=1}^n (g_{\alpha\beta})_{ik}(\omega_\beta)_{kj}(X) = d(g_{\alpha\beta})_{ij}(X) + \sum_{k=1}^n (\omega_\alpha)_{ik}(X)(g_{\alpha\beta})_{kj},$$
which states exactly that $g_{\alpha\beta}\omega_\beta = dg_{\alpha\beta} + \omega_\alpha g_{\alpha\beta}$. From this the desired formula is an immediate consequence.

10.2 Connections on Associated Vector Bundles*

From principal $G$-bundles we can construct certain associated vector bundles of great importance, particularly in spin geometry (spinor bundles). In this section, which may be skipped at a first reading, we will investigate how connections on the principal bundle give rise to connections on the associated vector bundles. First, however, we give a detailed outline of the construction of the associated bundles:
10.2 Connections on Associated Vector Bundles* 167 Lemma 10.5. Consider a smooth G-principal bundle (P, π, σ) over a manifold M, and assume that G is acting smoothly from the left on a manifold F. Consider the following smooth right action of G on P F by (x, ξ) g = (x g, g 1 ξ), put P G F := (P F )/G the orbit space under this action, and define a map π G : P G F M by π G ([p, ξ]) = π(p). Then (P G F, π G ) has the structure of a smooth fiber bundle over M with fiber F. Proof. It is easy to see, that the action mentioned in the theorem really is a right action, and it is smooth since the coordinate maps ((p, ξ), g) p g and ((p, ξ), g) g ξ are smooth, by assumption. We then consider the orbit space P G F = (P F )/G where accordingly two points (p 1, ξ 1 ) and (p 2, ξ 2 ) are identified if there is a g with (p 2, ξ 2 ) = (p 1 g, g 1 ξ 2 ). The equivalence class containing (p, ξ) we denote by [p, ξ]. Define a map q : P F P G F simply by (p, ξ) q [p, ξ], and equip P G F with the quotient topology w.r.t. q. Now, P G F is a topological space and we show that it has the structure of a topological fiber bundle over M. To do this we need a projection π G from P G F to M. We define this map on an equivalence class by π G ([p, ξ]) = π(p). This is well-defined, since any other element of [p, ξ] has the form (p g, g 1 ξ) and π G ([p g, g 1 ξ]) = π(p g) = π(p). To ensure continuity of π G we remark that i renders the following diagram commutative (π here, is the obvious extension of π from the fiber bundle, by π(p, ξ) = π(p), it should not cause confusion to use the same notation for both maps) P F π q P G F π G M q is a quotient map, π is continuous, and thus by the characteristic property of quotient maps, π G is continuous. The next thing we need to do is to construct trivializations. In order to do so, we need to remark the following intermediate result π 1 G (x) = { [p, ξ] ξ F }. for some fixed p π 1 (x). The inclusion is clear, since π G ([p, ξ]) = π(p) = x. Conversely, any element in P G F is of the form [p, ξ]. Assume, that [p, ξ] is in π 1 G (x), then x = π G([p, ξ]) = π(p), so follows. From this it is easily seen that π 1 G (V ) = {[p, ξ] π(p) V, ξ F }. (10.3) Back on track, we shall now construct trivializations. Fix a point x 0 M, pick a trivializing neighborhood V for the principal bundle around x 0, let Φ : π 1 (V ) V G be the trivialization of the principal bundle, and let s : V π 1 (V ), s(x) = Φ 1 (x, e) be the canonical local section associated with the trivialization. Define a map Ψ : V F π 1 G (V ) by Ψ(x, ξ) = [s(x), ξ].
We will eventually see that this is the inverse of the trivialization we want. To see that $\Psi$ is injective, assume that $\Psi(x,\xi) = \Psi(x',\xi')$, i.e. $[s(x),\xi] = [s(x'),\xi']$, which means that there exists a $g$ such that $s(x) = s(x') \cdot g$ and $\xi = g^{-1} \cdot \xi'$. But then $x = \pi(s(x)) = \pi(s(x') \cdot g) = \pi(s(x')) = x'$, and since the $G$-action on $P$ is free, $g = e$, so $(x,\xi) = (x',\xi')$. To prove surjectivity, let $[p,\xi] \in \pi_G^{-1}(V)$ be arbitrary (every element of $\pi_G^{-1}(V)$ is of this form by (10.3)) and put $x = \pi(p)$. As both $p$ and $s(x)$ are in the same fiber (namely $\pi^{-1}(x)$), we have $p = s(x) \cdot g$ for some $g$. Accordingly $[s(x) \cdot g, g^{-1} \cdot \xi'] = [s(x),\xi']$ for all $\xi' \in F$. For the choice $\xi' = g \cdot \xi$ we get
$$\Psi(x, g \cdot \xi) = [s(x), g \cdot \xi] = [s(x) \cdot g, g^{-1}g \cdot \xi] = [p,\xi],$$
thus proving surjectivity and hence bijectivity. $\Psi$ is continuous, since it is just the composition $(x,\xi) \mapsto (s(x),\xi) \mapsto [s(x),\xi]$. We denote the inverse by $\overline{\Phi} : \pi_G^{-1}(V) \to V \times F$. To see that this is continuous, consider the commutative diagram
$$\begin{array}{ccc}
q^{-1}(\pi_G^{-1}(V)) & \subseteq & P \times F \\
{\scriptstyle q}\downarrow & \searrow{\scriptstyle \overline{\Phi}\circ q} & \\
\pi_G^{-1}(V) & \xrightarrow{\;\overline{\Phi}\;} & V \times F
\end{array}$$
Then $\overline{\Phi} \circ q$ on $q^{-1}(\pi_G^{-1}(V))$ is given by $(p,\xi) \mapsto [p,\xi] \mapsto (\pi(p), g \cdot \xi)$, which is continuous. Again, by the characteristic property of quotient maps, $\overline{\Phi}$ is continuous, and hence the homeomorphism we want. In short, we have shown that a trivialization $(V,\Phi)$ for the principal bundle yields a trivialization $(V,\overline{\Phi})$ for $P \times_G F$.

Up till now we have verified that $(P \times_G F, \pi_G)$ is a topological fiber bundle over $M$. Our final goal is to show that it is actually a smooth bundle, i.e. to equip $P \times_G F$ with a smooth structure such that $\pi_G$ is smooth and the trivializations are diffeomorphisms. To this end, consider a trivialization cover $\{V_\alpha, \Phi_\alpha\}$ for the principal bundle. This gives us a trivialization cover $\{V_\alpha, \overline{\Phi}_\alpha\}$ for $P \times_G F$. Now we simply declare $P \times_G F$ to have the unique smooth structure which makes all $\overline{\Phi}_\alpha : \pi_G^{-1}(V_\alpha) \to V_\alpha \times F$ diffeomorphisms ($V_\alpha \times F$ has a natural smooth structure as an open subset of the manifold $M \times F$). In order for this to be a well-defined smooth structure on $P \times_G F$ we only need to show that
$$\overline{\Phi}_\alpha \circ \overline{\Phi}_\beta^{-1} : (V_\alpha \cap V_\beta) \times F \to (V_\alpha \cap V_\beta) \times F$$
is smooth (it then follows automatically that it is a diffeomorphism):
$$\overline{\Phi}_\alpha \circ \overline{\Phi}_\beta^{-1}(x,\xi) = \overline{\Phi}_\alpha([s_\beta(x),\xi]) = \overline{\Phi}_\alpha([s_\alpha(x) \cdot g_{\alpha\beta}(x), \xi]) = \overline{\Phi}_\alpha([s_\alpha(x), g_{\alpha\beta}(x) \cdot \xi]) = (x, g_{\alpha\beta}(x) \cdot \xi).$$
This map is smooth. Hence $P \times_G F$ has a well-defined smooth structure. The only thing left is to show that $\pi_G$ is smooth. We do that by showing that $\pi_G \circ \overline{\Phi}^{-1} : V \times F \to M$ is smooth for any trivialization $\overline{\Phi}$. We observe
$$\pi_G \circ \overline{\Phi}^{-1}(x,\xi) = \pi_G([s(x),\xi]) = \pi(s(x)) = x.$$
This is smooth, and since $\overline{\Phi}$ is a diffeomorphism, $\pi_G$ is smooth as well.
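A concrete instance of the bundle $P \times_G F$ just constructed may be helpful at this point; the following example (the Möbius band) is standard, though not part of the text. Let $P = S^1$, $M = S^1$ and $G = \mathbb{Z}/2 = \{\pm 1\}$, with $\pi : P \to M$ the double cover $\pi(z) = z^2$ and $G$ acting on $P$ by $z \cdot (\pm 1) = \pm z$. Take $F = \mathbb{R}$ with $G$ acting by $(\pm 1) \cdot \xi = \pm\xi$. Then $P \times_G F$ is the open Möbius band, and the formula
$$\overline{\Phi}_\alpha \circ \overline{\Phi}_\beta^{-1}(x,\xi) = (x, g_{\alpha\beta}(x) \cdot \xi)$$
from the proof reproduces the familiar description of the Möbius band by transition functions $g_{\alpha\beta} = \pm 1$ on the two components of the overlap of two trivializing arcs.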
The fiber bundle whose existence is guaranteed by the preceding lemma is called the fiber bundle associated to the principal bundle $(P,\pi,\sigma)$. This associated bundle depends, of course, on the choice of the manifold $F$ and on the $G$-action on $F$. If instead of a mere group action on some manifold $F$ we have a representation on a vector space, then we get a vector bundle:

Lemma 10.6. Let $(P,\pi,\sigma)$ be a smooth principal $G$-bundle, $V$ a finite-dimensional vector space and $\rho : G \to \operatorname{Aut}(V)$ a Lie group representation. Then the associated fiber bundle $(P \times_\rho V, \pi_\rho)$ has the structure of a smooth vector bundle over $M$.

Proof. As mentioned in the proof above, the fibers are of the form
$$\pi_\rho^{-1}(x) = \{[p,v] \mid v \in V\}$$
for some fixed $p \in \pi^{-1}(x)$. These have natural vector space structures
$$a[p,v_1] + b[p,v_2] = [p, av_1 + bv_2],$$
which turn the fibers into $\dim V$-dimensional vector spaces. Now let's consider $\overline{\Phi}$ restricted to the fiber $\pi_\rho^{-1}(x)$, mapping injectively into $\{x\} \times V$, which has the vector space structure $a(x,v_1) + b(x,v_2) = (x, av_1 + bv_2)$ turning it into a $\dim V$-dimensional vector space as well. There is a $g$ such that $s(x) = p \cdot g^{-1}$, and so
$$\overline{\Phi}|_{\pi_\rho^{-1}(x)}\big(a[p,v_1] + b[p,v_2]\big) = \overline{\Phi}([p, av_1 + bv_2]) = \overline{\Phi}([p \cdot g^{-1}, \rho(g)(av_1 + bv_2)]) = \overline{\Phi}([s(x), a\rho(g)v_1 + b\rho(g)v_2]) = (x, a\rho(g)v_1 + b\rho(g)v_2) = a(x,\rho(g)v_1) + b(x,\rho(g)v_2) = a\overline{\Phi}([p,v_1]) + b\overline{\Phi}([p,v_2]).$$
$\overline{\Phi}$ restricted to the fibers is accordingly a linear map, and hence a vector space isomorphism $\pi_\rho^{-1}(x) \to \{x\} \times V$. This proves the assertion.

As is apparent, the fibers in the associated vector bundle can be a bit tricky to handle; thus it may cause some problems to work with sections of associated vector bundles. Therefore it is a relief that we have alternative ways to deal with such sections. If we still want to work globally, we can think of them as certain functions on the total space of the principal bundle. Recall that a function $f : P \to V$ is called equivariant w.r.t. the representation $\rho$ if $f(p \cdot g) = \rho(g^{-1})f(p)$.

Proposition 10.7. To each section $\psi \in \Gamma(P \times_\rho V)$ corresponds an equivariant function $\widetilde{\psi} : P \to V$, determined by $\psi(\pi_\rho(p)) = [p,\widetilde{\psi}(p)]$, and this correspondence between smooth sections and smooth equivariant functions is bijective.

Proof. To be a bit more precise on the definition of $\widetilde{\psi}$: let $p \in P$; then we define $\widetilde{\psi}(p)$ to be the unique element in $V$ such that $[p,\widetilde{\psi}(p)] = \psi(\pi_\rho(p))$. Equivariance of this function is seen as follows: On the one hand
$$[p \cdot g, \widetilde{\psi}(p \cdot g)] = [p, \rho(g)\widetilde{\psi}(p \cdot g)].$$
On the other hand
$$[p \cdot g, \widetilde{\psi}(p \cdot g)] = \psi(\pi(p \cdot g)) = \psi(\pi(p)) = [p,\widetilde{\psi}(p)].$$
Thus, by uniqueness, we get $\rho(g)\widetilde{\psi}(p \cdot g) = \widetilde{\psi}(p)$ and hence equivariance.
170 Chapter 10 Characteristic Classes To see that it is smooth, let Φ be a trivialization of P over U, let s be the associated local section, i.e. s(x) = Φ 1 (x, e) and let Φ be the corresponding trivialization of the associated vector bundle. We note that ψ(x) = [s(x), ψ(s(x))] and hence that Φ(ψ(x)) = (x, ψ(s(x))). Thus the map x ψ(s(x)) is smooth. But then also ψ Φ 1 : U G V given by (x, g) ψ(s(x) g) = ρ(g 1 ) ψ(s(x)) is smooth, and hence ψ is smooth. Given a smooth equivariant function f we define a corresponding section of P ρ V by ψ f (x) = [s(x), f(s(x))]. It is easy to see that these two operations are inverses of each other. Another description of the section is the local one. Observe the elementary fact that for any vector bundle E with trivializations (U α, Φ α ) α I and transition functions g αβ : U α U β GL(n, K) there is a 1-1 correspondence between smooth sections of E and collections (ψ α ) α I of smooth functions ψ α : U α R n satisfying ψ α (x) = g αβ (x)ψ β (x) for x U α U β : Given a section ψ Γ(E) the obvious definition would be Φ α ψ(x) = (x, ψ α (x)), i.e. which is obviously smooth. Since ψ α (x) = pr 2 Φ α (ψ(x)) ψ α (x) = pr 2 Φ α (ψ(x)) = pr 2 Φ α Φ 1 β Φ β(ψ(x)) = pr 2 (Φ α Φ 1 β (x, ψ β(x))) = pr 2 (x, g αβ (x)ψ β (x)) = g αβ (x)ψ β (x). Conversely, given such a collection (ψ α ) we define a section ψ by Φ α ψ(x) = (x, ψ α (x)). A calculation as the one above will reveal that this is well-defined. Now, let E = P ρ V be an associated vector bundle and ψ Γ(E) a smooth section. Furthermore, let (U α, Φ α ) be a set of trivializations of P and (U α, Φ α ) the corresponding trivializations of E. Relative to these we have the local functions ψ α (x) = (pr 2 Φ α )(ψ(x)) Since Φ α ([s α (x), v]) = (x, v) we see that locally ψ(x) = [s α (x), ψ α (x)]. Moreover we can express ψ α in terms of the associated equivariant function Proposition 10.8. For any ψ Γ(E) we have ψ α (x) = ψ(s α (x)) Proof. Since ψ(x) = [s α (x), ψ(s α (x))] on U α, the result follows immediately by the calculations above. Assume now that our principal G-bundle π : P M has a connection, i.e. a choice of horizontal subspace H p P T p P for each p P, or equivalently a connection 1-form ω. Given a connection, the push-forward π : T p P T π(p) M restricts to an isomorphism H p P T π(p) M. Any vector field X on M can be lifted to a unique horizontal vector field X on P. More specifically, X p = π 1 (X π(p) ). The first step in constructing a connection on the associated bundle E = P ρ V is the following lemma
Lemma 10.9. Let $\psi \in \Gamma(E)$ be a section of the associated bundle. Then for each $X \in \mathfrak{X}(M)$ the map $P \to V$ given by $p \mapsto \widetilde{X}_p(\widetilde{\psi})$ is $\rho$-equivariant.

Proof. This is an elementary calculation:
$$\widetilde{X}_{p \cdot g}(\widetilde{\psi}) = \big((\sigma_g)_*\widetilde{X}_p\big)(\widetilde{\psi}) = \widetilde{X}_p(\widetilde{\psi} \circ \sigma_g) = \widetilde{X}_p(\rho(g^{-1})\widetilde{\psi}) = \rho(g^{-1})\widetilde{X}_p(\widetilde{\psi}),$$
the third identity being equivariance of $\widetilde{\psi}$.

Theorem 10.10 (Connection on Associated Bundle). The map $\nabla : \mathfrak{X}(M) \times \Gamma(E) \to \Gamma(E)$ given by $(X,\psi) \mapsto \nabla_X\psi$, where $\nabla_X\psi$ is the section corresponding to the equivariant function $p \mapsto \widetilde{X}_p(\widetilde{\psi})$, defines a connection on $E$. In short,
$$\widetilde{\nabla_X\psi}(p) = \widetilde{X}_p(\widetilde{\psi}). \tag{10.4}$$

Since we spent some time discussing local expressions of sections, it is natural to ask: how does $\nabla_X\psi$ look locally? The answer is given in the following proposition.

Proposition 10.11. Given a local section $s_\alpha : U_\alpha \to P$ of the principal bundle, the local function of $\nabla_X\psi$ relative to the corresponding trivialization of $E$ is given by
$$(\nabla_X\psi)_\alpha(x) = X_x\psi_\alpha + \rho_*(A^\alpha(X_x))\psi_\alpha(x), \tag{10.5}$$
where $\rho_* : \mathfrak{g} \to \operatorname{End}(V)$ is the induced Lie algebra representation of $\rho$ and $A^\alpha = s_\alpha^*\omega$ is the local gauge potential.

We can phrase this in a slightly different way, when we view a connection as a map $\Gamma(E) \to \Gamma(T^*M \otimes E)$. If $\psi \in \Gamma(E)$ is a section of $E$ and $\psi_\alpha$ are the local functions, then $\nabla\psi$ is a section of $\Gamma(T^*M \otimes E)$ whose local functions are given by
$$(\nabla\psi)_\alpha(x) = d\psi_\alpha(x) + \rho_*(A^\alpha_x)\psi_\alpha(x). \tag{10.6}$$
This formula has to be interpreted correctly: to $x \in U_\alpha$ it assigns an element of $T^*_xM \otimes V \cong (\mathbb{R}^k)^* \otimes \mathbb{R}^m$ (if $\dim M = k$ and $\dim V = m$); $d\psi_\alpha$ should be understood componentwise, i.e. pick a basis for $V$ and apply $d$ to the components relative to this basis; and $\rho_*(A^\alpha_x)\psi_\alpha$ acts on a vector $X_x \in T_xM$ by $\rho_*(A^\alpha_x(X_x))\psi_\alpha(x)$. Letting (10.6) act on a vector field $X$ we thus get (10.5).

Let's end with a prominent example of an associated connection, namely the Levi-Civita connection. Let $M$ be an $n$-dimensional pseudo-Riemannian manifold, i.e. a manifold with a metric $g$ of signature $(r,s)$. Then we can consider the orthonormal frame bundle $\pi : P_O(M) \to M$. This is a principal $O(r,s)$-bundle over $M$. It carries what we will call the canonical 1-form $\varphi$, an $\mathbb{R}^n$-valued 1-form on $P_O(M)$, defined in the following way: let $X_p \in T_pP_O(M)$, where $p \in P_O(M)$ is an isometric isomorphism $p : \mathbb{R}^n \to T_{\pi(p)}M$; then
$$\varphi(X_p) = p^{-1}(\pi_*X_p).$$
One can check that relative to the defining representation $\rho : O(r,s) \to GL(n,\mathbb{R})$ the 1-form satisfies $\varphi((\sigma_A)_*X_p) = \rho(A^{-1})\varphi(X_p)$, and that it is zero on vertical vectors (this is obvious, since vertical vectors, by definition, satisfy $\pi_*X_p = 0$). Thus $\varphi \in \Omega^1(P_O(M),\mathbb{R}^n)$. Given a connection $\omega$ on $P_O(M)$ we have a covariant exterior derivative $D^\omega : \Omega^1(P_O(M),\mathbb{R}^n) \to \Omega^2(P_O(M),\mathbb{R}^n)$, and the 2-form $\Theta^\omega := D^\omega\varphi$ is called the torsion of the connection.
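Before turning to the Levi-Civita connection, it is worth noting what (10.6) says in the simplest gauge-theoretic situation; the following sketch is mine, and sign conventions for the coupling differ across the literature. Take $G = U(1)$ and let $\rho$ be the defining representation on $V = \mathbb{C}$, so that $\rho_*$ is just the inclusion $\mathfrak{u}(1) = i\mathbb{R} \hookrightarrow \mathbb{C} = \operatorname{End}(\mathbb{C})$. A section $\psi$ of the associated line bundle is locally a function $\psi_\alpha : U_\alpha \to \mathbb{C}$, the gauge potential $A^\alpha = s_\alpha^*\omega$ is a purely imaginary 1-form, and (10.6) becomes
$$(\nabla\psi)_\alpha = d\psi_\alpha + A^\alpha\psi_\alpha.$$
Writing $A^\alpha = iA_\mu\,dx^\mu$ with $A_\mu$ real, this is the electromagnetic covariant derivative $\nabla_\mu\psi = \partial_\mu\psi + iA_\mu\psi$ familiar from physics.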
Theorem 10.12 (Fundamental Theorem of Riemannian Geometry I). There exists a unique connection $\omega$ on $P_O(M)$, called the Levi-Civita connection or the Riemannian connection, with vanishing torsion 2-form.

We will not prove this theorem; for a proof consult [Bleecker] Theorem 6.2.5. Exactly the same holds true in the case of an oriented pseudo-Riemannian manifold, where we replace the orthonormal frame bundle $P_O(M)$ with the oriented orthonormal frame bundle $P_{SO}(M)$.

We can retrieve the tangent bundle $TM$ from the frame bundle: it is simply the vector bundle associated to the defining representation $\rho : O(r,s) \to GL(n,\mathbb{R})$, i.e. $TM \cong P_O(M) \times_\rho \mathbb{R}^n$, and the Levi-Civita connection induces on $TM$ a connection which is also called the Levi-Civita or Riemannian connection. If $s_\alpha : U_\alpha \to P_O(M)$ is some local section and $A^\alpha = s_\alpha^*\omega$ is the gauge potential of the Levi-Civita connection, formula (10.5) above gives us the following local formula for the connection on $TM$:
$$(\nabla_XY)_\alpha(x) = X_xY_\alpha + \Big(\sum_{i \in I} A^\alpha_i(X_x)B_i\Big)Y_\alpha(x), \tag{10.7}$$
where $(B_i)_{i \in I}$ is some basis for the Lie algebra $\mathfrak{o}(r,s)$ and $A^\alpha = \sum_{i \in I} A^\alpha_iB_i$.

10.3 Pullback Bundles and Pullback Connections

Assume that $f : N \to M$ is a smooth map and $\pi : E \to M$ is a vector bundle. Recall from Chapter 9 the pullback bundle
$$f^*E = \{(p,v) \in N \times E \mid f(p) = \pi(v)\}$$
over $N$. $f$ induces a pullback map on sections $f^* : \Gamma(E) \to \Gamma(f^*E)$ by $(f^*s)(p) = (p, s(f(p)))$. If we turn $C^\infty(N)$ into a $C^\infty(M)$-module by defining scalar multiplication $\varphi \cdot \psi := f^*(\varphi)\psi$ (where $f^*\varphi$ just means $\varphi \circ f$), we can prove the following:

Lemma 10.13. The map $\Phi : C^\infty(N) \otimes_{C^\infty(M)} \Gamma(E) \to \Gamma(f^*E)$ given by the $C^\infty(N)$-linear extension of
$$\varphi \otimes s \mapsto \varphi f^*(s)$$
is an isomorphism of $C^\infty(N)$-modules.

Proof. It is easy to see that it is well-defined: if $\theta \in C^\infty(M)$, then
$$\Phi(\varphi \otimes \theta s) = \varphi f^*(\theta s) = (\varphi f^*\theta)f^*s = (\theta \cdot \varphi)f^*s = \Phi(\theta \cdot \varphi \otimes s).$$
We prove the lemma first in the case where $E$ is the trivial bundle, $E = M \times \mathbb{K}^n$. This being the case, $f^*E$ is also trivial: $f^*E = N \times \mathbb{K}^n$. Thus, sections of these bundles are just smooth maps $M \to \mathbb{K}^n$ or $N \to \mathbb{K}^n$. Let $s_i : M \to \mathbb{K}^n$ denote the section $p \mapsto e_i$ of $E$, mapping $p$ to the $i$th standard basis vector, and let $t_i : N \to \mathbb{K}^n$ denote the corresponding section of $f^*E$. It is easy to see that $f^*s_i = t_i$. Consider the section $s = \varphi_1t_1 + \cdots + \varphi_nt_n \in \Gamma(f^*E)$; then
$$s = \varphi_1f^*s_1 + \cdots + \varphi_nf^*s_n = \Phi(\varphi_1 \otimes s_1 + \cdots + \varphi_n \otimes s_n),$$
i.e. $\Phi$ is surjective. To see that it is injective, assume $\Phi(\varphi \otimes s) = 0$. As before, $s$ is of the form $\varphi_1s_1 + \cdots + \varphi_ns_n$, so that
$$\Phi\big(\varphi \otimes (\varphi_1s_1 + \cdots + \varphi_ns_n)\big) = \Phi\big((f^*\varphi_1)\varphi \otimes s_1\big) + \cdots + \Phi\big((f^*\varphi_n)\varphi \otimes s_n\big) = \big((f^*\varphi_1)\varphi, \dots, (f^*\varphi_n)\varphi\big).$$
Thus we must have $(f^*\varphi_1)\varphi = \cdots = (f^*\varphi_n)\varphi = 0$. Therefore also
$$\varphi \otimes s = (f^*\varphi_1)\varphi \otimes s_1 + \cdots + (f^*\varphi_n)\varphi \otimes s_n = 0,$$
i.e. $\Phi$ is injective. In the general case: Let $F$ be a bundle such that $E \oplus F$ is trivial (for smooth vector bundles such a complementary bundle exists over any manifold, compact or not, cf. [MT] Exercise 15.10). Then the result holds for $E \oplus F$, and by restriction, also for $E \subseteq E \oplus F$.

From the pullback map of ordinary $k$-forms we get a pullback map $f^* : \Omega^k(M,E) \to \Omega^k(N,f^*E)$ by linear extension of
$$f^*(\omega \otimes s) = f^*\omega \otimes f^*s.$$
One can check that this is well-defined.

Proposition 10.14. Given a connection $\nabla$ on $E$ there is a unique connection $\nabla'$ on $f^*E$ rendering the following diagram commutative:
$$\begin{array}{ccc}
\Gamma(E) & \xrightarrow{\;\nabla\;} & \Omega^1(M,E) \\
{\scriptstyle f^*}\downarrow & & \downarrow{\scriptstyle f^*} \\
\Gamma(f^*E) & \xrightarrow{\;\nabla'\;} & \Omega^1(N,f^*E)
\end{array}$$
This connection, denoted $f^*\nabla$, is called the pullback connection on $f^*E$.

Proof. Uniqueness first. As shown above, any section of $f^*E$ is a linear combination of elements of the form $\varphi f^*(s)$ where $s \in \Gamma(E)$ and $\varphi \in C^\infty(N)$. Assume we have two connections $\nabla'$ and $\nabla''$ making the diagram commutative, i.e.
$$\nabla'(f^*s) = f^*(\nabla s) = \nabla''(f^*s).$$
On elements of the form $\varphi f^*(s)$ we have
$$\nabla'(\varphi f^*s) = d\varphi \otimes f^*s + \varphi\nabla'(f^*s) = d\varphi \otimes f^*s + \varphi\nabla''(f^*s) = \nabla''(\varphi f^*s).$$
Thus the two connections are equal. To show existence we simply construct a connection fitting into the diagram. Lemma 10.13 gave us an isomorphism $\Phi : C^\infty(N) \otimes_{C^\infty(M)} \Gamma(E) \to \Gamma(f^*E)$. Thus we have
$$\Omega^k(N,f^*E) \cong \Omega^k(N) \otimes_{C^\infty(N)} \Gamma(f^*E) \cong \Omega^k(N) \otimes_{C^\infty(N)} C^\infty(N) \otimes_{C^\infty(M)} \Gamma(E) \cong \Omega^k(N) \otimes_{C^\infty(M)} \Gamma(E).$$
This isomorphism we call $\Psi_k$. It is not hard to see that the inverse is given by $\Psi_k^{-1}(\omega \otimes s) = \omega \otimes f^*s$.
As in Lemma 10.13, the pullback map on forms $f^* : \Omega^k(M) \to \Omega^k(N)$ gives a $C^\infty(M)$-linear map $C^\infty(N) \otimes_{C^\infty(M)} \Omega^k(M) \to \Omega^k(N)$ upon defining $\theta \otimes \omega \mapsto \theta f^*\omega$. This is no longer an isomorphism. Tensoring (over the ring $C^\infty(M)$) with $\Gamma(E)$ gives a map
$$C^\infty(N) \otimes_{C^\infty(M)} \Omega^k(M) \otimes_{C^\infty(M)} \Gamma(E) \longrightarrow \Omega^k(N) \otimes_{C^\infty(M)} \Gamma(E)$$
(the map on the second factor is, of course, just the identity). Composing this with the isomorphism $C^\infty(N) \otimes_{C^\infty(M)} \Omega^k(M,E) \cong C^\infty(N) \otimes_{C^\infty(M)} \Omega^k(M) \otimes_{C^\infty(M)} \Gamma(E)$ yields a map
$$\rho : C^\infty(N) \otimes_{C^\infty(M)} \Omega^k(M,E) \longrightarrow \Omega^k(N) \otimes_{C^\infty(M)} \Gamma(E)$$
which is given explicitly by $\theta \otimes (\omega \otimes s) \mapsto \theta(f^*\omega) \otimes s$. Now we define $f^*\nabla : \Gamma(f^*E) \to \Omega^1(N,f^*E)$ by
$$f^*\nabla = \Psi_1^{-1} \circ \big(d \otimes \operatorname{id} + \rho(\operatorname{id} \otimes \nabla)\big) \circ \Phi^{-1}. \tag{10.8}$$
To check that it fits into the diagram, note that $\Phi^{-1}(f^*s) = 1 \otimes s$, and assuming $\nabla s$ to be of the form $\omega \otimes s'$ we get
$$f^*\nabla(f^*s) = \Psi_1^{-1}\big(d \otimes \operatorname{id} + \rho(\operatorname{id} \otimes \nabla)\big)(1 \otimes s) = \Psi_1^{-1}(d(1) \otimes s) + \Psi_1^{-1}\rho(1 \otimes \omega \otimes s') = \Psi_1^{-1}(f^*\omega \otimes s') = f^*\omega \otimes f^*s' = f^*(\omega \otimes s') = f^*(\nabla s).$$
Finally, we need to check that it is indeed a connection. Obviously it is $\mathbb{K}$-linear. To see that it satisfies (10.1), let $\theta \in C^\infty(N)$ and $t \in \Gamma(f^*E)$. By Lemma 10.13 we may assume that $t = \theta'f^*s$, where $\theta' \in C^\infty(N)$. Combining $\theta\theta'$ into a single function $\theta$, we only need to check (10.1) on sections of the form $\theta f^*s$, corresponding under the isomorphism $\Phi$ to $\theta \otimes s$. Still assuming $\nabla s$ to be of the form $\omega \otimes s'$, we calculate
$$(f^*\nabla)(\theta f^*s) = \Psi_1^{-1}\big(d \otimes \operatorname{id} + \rho(\operatorname{id} \otimes \nabla)\big)(\theta \otimes s) = \Psi_1^{-1}\big(d\theta \otimes s + \rho(\theta \otimes \omega \otimes s')\big) = d\theta \otimes f^*s + \Psi_1^{-1}\big(\theta(f^*\omega) \otimes s'\big) = d\theta \otimes f^*s + \theta f^*\omega \otimes f^*s',$$
and since $f^*\omega \otimes f^*s' = f^*(\omega \otimes s') = f^*(\nabla s) = (f^*\nabla)(f^*s)$ (the last equality following from the commuting diagram), we see that $f^*\nabla$ satisfies (10.1).

Proposition 10.15. Let $E$ be a vector bundle over $M$ and $\nabla$ a connection.
1) We have $\operatorname{id}_M^*\nabla = \nabla$.
2) For maps $f : N \to M$ and $g : K \to N$ we have, under the isomorphism $g^*f^*E \cong (fg)^*E$, that $g^*f^*\nabla = (fg)^*\nabla$.
3) If $(\omega_{ij})$ are the connection forms relative to some frame $(s_i)$ over a trivialization $U \subseteq M$, then $(f^*\omega_{ij})$ are the connection forms for $f^*\nabla$ relative to the frame $(f^*s_i)$ over $f^{-1}(U)$.
Proof. Point 1) follows trivially from the uniqueness part of Proposition 10.14. For part 2) consider the following diagram
$$\begin{array}{ccccc}
\Gamma(E) & \xrightarrow{\;g^*f^*\;} & \Gamma(g^*f^*E) & \xrightarrow{\;\Phi\;} & \Gamma((fg)^*E) \\
{\scriptstyle \nabla}\downarrow & & {\scriptstyle g^*f^*\nabla}\downarrow & & \downarrow{\scriptstyle (fg)^*\nabla} \\
\Omega^1(M,E) & \xrightarrow{\;g^*f^*\;} & \Omega^1(K,g^*f^*E) & \xrightarrow{\;\Psi\;} & \Omega^1(K,(fg)^*E)
\end{array}$$
where $\Phi$ and $\Psi$ are the natural isomorphisms induced by the bundle isomorphism $g^*f^*E \cong (fg)^*E$. The square involving $g^*f^*\nabla$ is commutative and so is the outer diagram. By uniqueness, this forces $g^*f^*\nabla = \Psi^{-1} \circ ((fg)^*\nabla) \circ \Phi$, and this is exactly what it means for $g^*f^*\nabla$ and $(fg)^*\nabla$ to be equal under the isomorphism.

3) For this part we can assume $E$ to be trivial and the frame $(s_i)$ to be global. If $E = M \times \mathbb{K}^n$ then also $f^*E = N \times \mathbb{K}^n$. Let us abbreviate $s'_j = f^*s_j$. This is easily seen to be a global frame for $f^*E$. Then we see that
$$(f^*\nabla)s'_j = (f^*\nabla)(f^*s_j) = f^*(\nabla s_j) = f^*\Big(\sum_{i=1}^n \omega_{ij} \otimes s_i\Big) = \sum_{i=1}^n (f^*\omega_{ij}) \otimes (f^*s_i) = \sum_{i=1}^n (f^*\omega_{ij}) \otimes s'_i,$$
which is precisely what we wanted.

10.4 Curvature

We define a wedge product $\wedge : \Omega^k(M) \times \Omega^l(M,E) \to \Omega^{k+l}(M,E)$ by
$$\theta \wedge (\omega \otimes s) := (\theta \wedge \omega) \otimes s.$$
In general we would need some kind of bilinear map on $E$ in order to define a wedge product $\Omega^k(M,E) \times \Omega^l(M,E) \to \Omega^{k+l}(M,E)$. This is for instance the case for the bundle $\operatorname{End}(E)$, where we have composition, or for the trivial bundle $M \times M(n,\mathbb{K})$, where we have matrix multiplication. Specifically, if $\omega, \eta$ are differential forms with values in $M(n,\mathbb{K})$, they are matrices of usual differential forms, and the wedge product is defined to be the matrix of forms whose $ij$th entry is given by
$$(\omega \wedge \eta)_{ij} := \sum_{k=1}^n \omega_{ik} \wedge \eta_{kj}.$$
The most obvious example at hand of a matrix-valued form is of course the matrix of connection 1-forms. For the spaces $\Omega^k(M)$ of $\mathbb{K}$-valued forms on $M$ we have the usual exterior derivative $d : \Omega^k(M) \to \Omega^{k+1}(M)$. We want to construct something similar for $\Omega^k(M,E)$:
Definition 10.16 (Covariant Derivative). Let $\nabla$ be a connection on the vector bundle $E$. For $k \geq 1$ we define a map $\nabla^k : \Omega^k(M,E) \to \Omega^{k+1}(M,E)$, called the covariant exterior derivative, given on generators by
$$\nabla^k(\theta \otimes s) = d\theta \otimes s + (-1)^k\theta \wedge \nabla s.$$
This is well-defined, for on one hand
$$\nabla^k(f\theta \otimes s) = d(f\theta) \otimes s + (-1)^kf\theta \wedge \nabla s = (df \wedge \theta + f\,d\theta) \otimes s + (-1)^k\theta \wedge f\nabla s \tag{10.9}$$
and on the other
$$\nabla^k(\theta \otimes fs) = d\theta \otimes fs + (-1)^k\theta \wedge \nabla(fs) = d\theta \otimes fs + (-1)^k\theta \wedge (df \otimes s + f\nabla s) = f\,d\theta \otimes s + (-1)^k(\theta \wedge df) \otimes s + (-1)^k\theta \wedge f\nabla s = (f\,d\theta + df \wedge \theta) \otimes s + (-1)^k\theta \wedge f\nabla s.$$
Thus the two sides are equal, and from (10.9) we read off:

Lemma 10.17. The covariant exterior derivative $\nabla^k$ satisfies the Leibniz rule, i.e. if $f \in C^\infty(M)$ and $\omega \in \Omega^k(M,E)$ then
$$\nabla^k(f\omega) = df \wedge \omega + f\nabla^k(\omega). \tag{10.10}$$

The important map is $\nabla^1$, which we abbreviate to $\nabla$.

Definition 10.18 (Curvature). Let $E$ be a $\mathbb{K}$-vector bundle and $\nabla$ a connection. The $\mathbb{K}$-linear map $R^\nabla := \nabla^1 \circ \nabla : \Gamma(E) \to \Omega^2(M,E)$ is called the curvature of the connection. A connection with vanishing curvature is called a flat connection.

First we observe that $R^\nabla$ (or just $R$ for brevity) is $C^\infty(M)$-linear:
$$R(fs) = \nabla^1(\nabla(fs)) = \nabla^1(df \otimes s + f\nabla s) = d(df) \otimes s - df \wedge \nabla s + df \wedge \nabla s + f\nabla^1\nabla s = fR(s).$$
Thus $R$ induces a bundle map $E \to \Lambda^2(T^*_{\mathbb{K}}M) \otimes E$, which again corresponds to a section of the bundle $E^* \otimes \Lambda^2(T^*_{\mathbb{K}}M) \otimes E \cong \Lambda^2(T^*_{\mathbb{K}}M) \otimes \operatorname{End}(E)$. Therefore we can think of $R$ either as a map $\Gamma(E) \to \Omega^2(M,E)$, or we can think of it as an $\operatorname{End}(E)$-valued 2-form on $M$. If we adopt this last view of the curvature, we write $R_{X,Y}(s)$ for the smooth section of $E$ which we get by acting on $s$ with the vector fields $X$ and $Y$.

For the trivial connection on a product bundle, the curvature is identically 0; indeed, since $R$ is $C^\infty(M)$-linear and $Rs_i = \nabla^1(\nabla s_i) = \nabla^1(d(1) \otimes s_i) = 0$:
$$R(f_1s_1 + \cdots + f_ns_n) = f_1R(s_1) + \cdots + f_nR(s_n) = 0.$$
Thus the trivial connection is a flat connection. Next we investigate how the curvature looks locally.
Proposition 10.19. Let $U$ be a trivialization neighborhood for $E$ and $\omega_{ij}$ the connection 1-forms for $\nabla$ relative to some local frame $s_1,\dots,s_n$. Put $\Omega_{ij} := d\omega_{ij} + \sum_{k=1}^n \omega_{ik} \wedge \omega_{kj}$ (or in compact form $\Omega = d\omega + \omega \wedge \omega$). Then
$$Rs_j = \sum_{i=1}^n \Omega_{ij} \otimes s_i.$$
The 2-forms $\Omega_{ij}$ are called the curvature 2-forms.

Proof. This hangs on the fact that $\nabla s_j = \sum_i \omega_{ij} \otimes s_i$, so that
$$Rs_j = \nabla^1(\nabla s_j) = \nabla^1\Big(\sum_{i=1}^n \omega_{ij} \otimes s_i\Big) = \sum_{i=1}^n \big(d\omega_{ij} \otimes s_i - \omega_{ij} \wedge \nabla s_i\big) = \sum_{i=1}^n d\omega_{ij} \otimes s_i - \sum_{i,k=1}^n \omega_{ij} \wedge (\omega_{ki} \otimes s_k) = \sum_{i=1}^n d\omega_{ij} \otimes s_i + \sum_{i,k=1}^n (\omega_{ik} \wedge \omega_{kj}) \otimes s_i = \sum_{i=1}^n \Omega_{ij} \otimes s_i.$$

And then globally:

Proposition 10.20. For vector fields $X$ and $Y$ and $s \in \Gamma(E)$ we have
$$R_{X,Y}(s) = (\nabla_X\nabla_Y - \nabla_Y\nabla_X - \nabla_{[X,Y]})s.$$

Proof. We will show it locally, so let $U$ be a trivialization for $E$, and let $\omega$ and $\Omega$ be the matrices of connection 1-forms and curvature 2-forms relative to the corresponding local frame $s_1,\dots,s_n$. Then
$$\nabla_X\nabla_Ys_j = \nabla_X\Big(\sum_{i=1}^n \omega_{ij}(Y)s_i\Big) = \sum_{i=1}^n X(\omega_{ij}(Y))s_i + \sum_{i=1}^n \omega_{ij}(Y)\nabla_Xs_i = \sum_{i,k=1}^n \omega_{ij}(Y)\omega_{ki}(X)s_k + \sum_{i=1}^n X(\omega_{ij}(Y))s_i.$$
And therefore
$$(\nabla_X\nabla_Y - \nabla_Y\nabla_X - \nabla_{[X,Y]})s_j = \sum_{i,k=1}^n \big(\omega_{kj}(Y)\omega_{ik}(X) - \omega_{kj}(X)\omega_{ik}(Y)\big)s_i + \sum_{i=1}^n \big(X(\omega_{ij}(Y)) - Y(\omega_{ij}(X)) - \omega_{ij}([X,Y])\big)s_i = \sum_{i,k=1}^n (\omega_{ik} \wedge \omega_{kj})(X,Y)s_i + \sum_{i=1}^n d\omega_{ij}(X,Y)s_i = \sum_{i=1}^n \Omega_{ij}(X,Y)s_i = R_{X,Y}(s_j),$$
and by $C^\infty(M)$-linearity of $R_{X,Y}$ the result follows.

As for the connection forms, we also have a transition rule for the curvature forms:

Lemma 10.21. Let $(U_\alpha,\varphi_\alpha)$ and $(U_\beta,\varphi_\beta)$ be two trivializations of $E$ with non-trivial overlap, and let $g_{\alpha\beta} : U_\alpha \cap U_\beta \to GL(n,\mathbb{K})$ be the transition function. If $\Omega_\alpha$ and $\Omega_\beta$ denote the curvature 2-forms on $U_\alpha$ and $U_\beta$ respectively, then for $p \in U_\alpha \cap U_\beta$:
$$\Omega_\beta(p) = g_{\alpha\beta}^{-1}(p)\Omega_\alpha(p)g_{\alpha\beta}(p).$$
Proof. From Lemma 10.4 we know that the connection 1-forms are related by
$$\omega_\beta(p) = g_{\alpha\beta}^{-1}(p)\omega_\alpha(p)g_{\alpha\beta}(p) + g_{\alpha\beta}^{-1}(p)dg_{\alpha\beta}(p).$$
For brevity we write $g$ instead of $g_{\alpha\beta}$. By $dg$ we understand the entrywise exterior differentiation of the matrix of functions. Since $g^{-1}g = I$ we get $d(g^{-1})g + g^{-1}dg = 0$, and hence $d(g^{-1}) = -g^{-1}(dg)g^{-1}$. Inserting this into the definition of $\Omega_\beta$ we get
$$\Omega_\beta = d\omega_\beta + \omega_\beta \wedge \omega_\beta = d(g^{-1}\omega_\alpha g + g^{-1}dg) + (g^{-1}\omega_\alpha g + g^{-1}dg) \wedge (g^{-1}\omega_\alpha g + g^{-1}dg) = -(g^{-1}(dg)g^{-1}) \wedge \omega_\alpha g + g^{-1}(d\omega_\alpha)g - g^{-1}\omega_\alpha \wedge dg - (g^{-1}(dg)g^{-1}) \wedge dg + (g^{-1}\omega_\alpha g + g^{-1}dg) \wedge (g^{-1}\omega_\alpha g + g^{-1}dg) = g^{-1}(d\omega_\alpha)g + g^{-1}(\omega_\alpha \wedge \omega_\alpha)g = g^{-1}\Omega_\alpha g.$$

10.5 Metric Connections

Before we embark on the construction of characteristic classes, we will in this section briefly discuss some of the interplay between metrics and connections on vector bundles. Recall the definition of a metric or fiber metric: it is a choice of inner product (i.e. a conjugate-symmetric, positive definite sesquilinear form) $g_p$ on each of the fibers, varying smoothly in the sense that if $s, s'$ are smooth sections of $E$, then $p \mapsto g_p(s_p,s'_p)$ is a smooth map. We will use the notation $g$ or $\langle\cdot,\cdot\rangle$ for such a metric. A vector bundle endowed with a metric is called a Riemannian vector bundle.

Definition 10.22 (Metric Connection). Let $E$ be a vector bundle equipped with a smooth fiber metric. A connection $\nabla$ is said to be compatible with the metric, or a metric connection, if for all $X \in \mathfrak{X}(M)$ and $s, s' \in \Gamma(E)$ the following equation is satisfied:
$$X\langle s,s' \rangle = \langle \nabla s(X), s' \rangle + \langle s, \nabla s'(X) \rangle. \tag{10.11}$$

Proposition 10.23. A metric connection always exists on a Riemannian vector bundle.

Proof. We only need to show that a metric connection exists on a trivial bundle, for then a partition of unity subordinate to a trivialization cover of $M$ will provide the rest of the argument. So let $E = M \times \mathbb{K}^n$ be the trivial bundle with the metric coming from the usual inner product on $\mathbb{K}^n$, let $\{e_1,\dots,e_n\}$ be the standard basis for $\mathbb{K}^n$ and consider the associated global orthonormal frame $s_i(p) = (p,e_i)$. We show that the trivial connection is metric: For sections
$$s = \sum_{i=1}^n a_is_i \quad\text{and}\quad s' = \sum_{i=1}^n b_is_i,$$
where $a_i, b_i \in C^\infty(M)$, we see that $\langle s,s' \rangle = \sum_{i=1}^n \overline{a_i}b_i$, so that
$$X\langle s,s' \rangle = \sum_{i=1}^n X(\overline{a_i}b_i) = \sum_{i=1}^n (X\overline{a_i})b_i + \sum_{i=1}^n \overline{a_i}(Xb_i),$$
whereas
$$\langle \nabla s(X), s' \rangle = \Big\langle \sum_{i=1}^n (Xa_i)s_i, \sum_{j=1}^n b_js_j \Big\rangle = \sum_{i=1}^n (X\overline{a_i})b_i$$
and similarly $\langle s, \nabla s'(X) \rangle = \sum_{i=1}^n \overline{a_i}(Xb_i)$. Thus the trivial connection is a metric connection.
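Before continuing with metric connections, one observation, mine rather than the text's, which foreshadows the construction of characteristic classes. For a line bundle ($n = 1$) the single connection 1-form satisfies $\omega \wedge \omega = 0$, so $\Omega = d\omega$; moreover $GL(1,\mathbb{K}) = \mathbb{K}^*$ is abelian, so Lemma 10.21 reads
$$\Omega_\beta = g_{\alpha\beta}^{-1}\Omega_\alpha g_{\alpha\beta} = \Omega_\alpha \quad\text{on } U_\alpha \cap U_\beta,$$
i.e. the locally defined curvature 2-forms agree on overlaps and patch together to a global 2-form on $M$. Producing globally defined forms out of the curvature in this fashion is precisely what the invariant polynomials of Section 10.6 accomplish in higher rank.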
If $E$ is a Riemannian vector bundle and $U$ is a trivialization neighborhood, we can always assume the local sections over $U$ to be orthonormal: if they are not, we simply apply the Gram-Schmidt procedure to obtain the desired orthonormal frame. The impact on the connection forms of a connection being metric is seen in the following (where by $\overline{\omega_{ij}}$ we understand the conjugate 1-form, acting on vector fields by $\overline{\omega_{ij}}(X) := \overline{\omega_{ij}(X)}$).

Lemma 10.24. Let $\nabla$ be a connection on the Riemannian bundle $E$. Then $\nabla$ is metric if and only if for any local orthonormal frame $(s_i)$ the associated matrix of connection 1-forms $(\omega_{ij})$ is skew-adjoint, i.e.
\[ \omega_{ij} + \overline{\omega_{ji}} = 0. \]
Furthermore the matrix of curvature forms of a metric connection associated with an orthonormal frame is skew-adjoint. In the case of a real bundle, the matrices will, of course, turn out to be antisymmetric.

Proof. Assume first that the connection is metric. We have $\langle s_i, s_j\rangle = \delta_{ij}$ and thus
\[ 0 = X\langle s_i, s_j\rangle = \langle \nabla s_i(X), s_j\rangle + \langle s_i, \nabla s_j(X)\rangle = \Big\langle \sum_{k=1}^n \omega_{ki}(X)s_k, s_j\Big\rangle + \Big\langle s_i, \sum_{k=1}^n \omega_{kj}(X)s_k\Big\rangle = \omega_{ji}(X) + \overline{\omega_{ij}(X)}. \]
Thus they are skew-adjoint. Conversely, assume that all matrices of connection 1-forms are skew-adjoint. We want to show that (10.11) is satisfied. We do it locally, so let $U$ be a trivialization neighborhood, let $\{s_1,\dots,s_n\}$ be a local orthonormal frame and $(\omega_{ij})$ the connection 1-forms, which are by assumption skew-adjoint. It suffices to show (10.11) for the local sections $s_i$ and $s_j$. Since they are orthonormal we have $X\langle s_i, s_j\rangle = 0$. On the right-hand side,
\[ \langle \nabla s_i(X), s_j\rangle + \langle s_i, \nabla s_j(X)\rangle = \Big\langle \sum_{k=1}^n \omega_{ki}(X)s_k, s_j\Big\rangle + \Big\langle s_i, \sum_{l=1}^n \omega_{lj}(X)s_l\Big\rangle = \sum_{k=1}^n \omega_{ki}(X)\langle s_k, s_j\rangle + \sum_{l=1}^n \overline{\omega_{lj}(X)}\langle s_i, s_l\rangle = (\omega_{ji} + \overline{\omega_{ij}})(X) = 0. \]
Thus the two sides are equal. The last assertion follows from the simple calculation
\[ \overline{\Omega_{ij}} = d\overline{\omega_{ij}} + \sum_{k=1}^n \overline{\omega_{ik}}\wedge\overline{\omega_{kj}} = -d\omega_{ji} + \sum_{k=1}^n \omega_{ki}\wedge\omega_{jk} = -d\omega_{ji} - \sum_{k=1}^n \omega_{jk}\wedge\omega_{ki} = -\Omega_{ji}. \]

One last bit of information on metric connections will be needed: how do metric connections react to pullbacks? First of all, if $E$ is a vector bundle over $M$ with metric $g$, and $f : N\to M$ is smooth, we can pull $g$ back to a metric $f^*g$ on $f^*E$ in the following way. Bearing in mind that sections $t, t'$ of $f^*E$ are of the form $t(p) = (p, \varphi(p))$ and $t'(p) = (p, \varphi'(p))$, where $\varphi, \varphi' : N\to E$ are functions satisfying $\pi(\varphi(p)) = f(p)$, we can define the pullback by
\[ (f^*g)_p(t, t') = g_{f(p)}(\varphi(p), \varphi'(p)). \]
This is well-defined since $\varphi(p)$ and $\varphi'(p)$ are in the fiber over $f(p)$, and it is easily seen to be sesquilinear, conjugate-symmetric and positive definite; since the right-hand side depends smoothly on $p$, $f^*g$ is a fiber metric on $f^*E$. Since $(f^*s)(p) = (p, s(f(p)))$ we have in particular that
\[ (f^*g)_p(f^*s, f^*s') = g_{f(p)}\big(s(f(p)), s'(f(p))\big). \qquad (10.12) \]
Thus, for instance, if $\{s_1,\dots,s_n\}$ is a local orthonormal frame w.r.t. $g$, then $\{f^*s_1,\dots,f^*s_n\}$ is orthonormal w.r.t. $f^*g$.

Lemma 10.25. Let $(E,g)$ be a Riemannian vector bundle and $\nabla$ a metric connection on $E$. Then $f^*\nabla$ is metric relative to the pullback metric $f^*g$.

Proof. Let $\{s_1,\dots,s_n\}$ be a local orthonormal frame for $E$; then by the only-if part of Lemma 10.24 the matrix of connection 1-forms $(\omega_{ij})$ is skew-adjoint. $\{f^*s_1,\dots,f^*s_n\}$ is a local orthonormal frame for $f^*E$ relative to $f^*g$, and by Proposition 10.15 3) the $f^*\omega_{ij}$ are the connection 1-forms for $f^*\nabla$ relative to this frame. But we easily see that $\overline{f^*\omega_{ij}} = -f^*\omega_{ji}$, and thus by the if part of Lemma 10.24 the connection $f^*\nabla$ is metric.

On the tangent bundle of a Riemannian manifold the situation is quite delicate. Not only do metric connections exist, but we can in fact single out one particular connection if we impose one further condition, namely symmetry. Let $\nabla$ be a connection on $TM$ and define the torsion tensor $\tau$ to be the covariant 2-tensor given by
\[ \tau(X,Y) = \nabla_X Y - \nabla_Y X - [X,Y]. \]
The connection is said to be symmetric or torsion-free if the torsion tensor vanishes identically, i.e. if $\nabla_X Y = \nabla_Y X + [X,Y]$.

Theorem 10.26 (Fundamental Theorem of Riemannian Geometry II). On the tangent bundle of a Riemannian manifold there exists a unique connection, called the Levi-Civita connection or Riemannian connection, which is symmetric and compatible with the metric.

The proof of this important fact can be found in any textbook on Riemannian geometry, for instance [Lee].

10.6 Characteristic Classes

Definition 10.27 (Invariant Polynomial). Let $G$ be a matrix Lie group and $\mathfrak{g}$ its Lie algebra. A $G$-invariant polynomial is a map $P : \mathfrak{g}\to\mathbb{K}$ which is polynomial in the entries of the elements of $\mathfrak{g}$ and which satisfies
\[ P(AXA^{-1}) = P(X) \]
for all $X\in\mathfrak{g}$ and $A\in G$. The set of $G$-invariant polynomials is denoted $I(G)$ and the set of $G$-invariant polynomials of homogeneous degree $k$ is denoted $I^k(G)$. Clearly we must have $I(G) = \bigoplus_k I^k(G)$.

In this section our interest will primarily be in $GL(n,\mathbb{K})$-invariant polynomials. We have some obvious examples of such polynomials, the simplest being the trace and the determinant; they are homogeneous of degree 1 and $n$ respectively. Another example is $t_i(X) := \operatorname{Tr} X^i$, as well as the following: define for $k = 1,\dots,n$ the map $\sigma_k : M_n(\mathbb{K})\to\mathbb{K}$ by
\[ \det(I + \lambda X) = 1 + \lambda\sigma_1(X) + \lambda^2\sigma_2(X) + \dots + \lambda^n\sigma_n(X). \]
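To make this definition concrete, the smallest nontrivial case is worth writing out (a routine expansion, recorded here for orientation): for $X\in M_2(\mathbb{K})$,
\[ \det(I + \lambda X) = (1 + \lambda x_{11})(1 + \lambda x_{22}) - \lambda^2 x_{12}x_{21} = 1 + \lambda\operatorname{Tr}X + \lambda^2\det X, \]
so $\sigma_1 = \operatorname{Tr}$ and $\sigma_2 = \det$. More generally, if $X$ has eigenvalues $\lambda_1,\dots,\lambda_n$ (counted with multiplicity), then $\sigma_k(X)$ is the $k$th elementary symmetric function of the eigenvalues — an observation which will reappear in the dictionary with the symmetric polynomials of Section 10.8.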
It's not hard to check that $\sigma_k$ is in fact a $GL(n,\mathbb{K})$-invariant polynomial of homogeneous degree $k$. Note that $\sigma_n(X) = \det X$. Surprisingly enough, the $t_k$'s and $\sigma_k$'s generate the set of $GL(n,\mathbb{K})$-invariant polynomials:

Theorem 10.28. The set $\{t_1,\dots,t_n\}$ of invariant polynomials is independent, in the sense that none of them can be written as a polynomial of the others. The same is true for the set $\{\sigma_1,\dots,\sigma_n\}$. Furthermore
\[ I(GL(n,\mathbb{R})) = \mathbb{R}[t_1,\dots,t_n] = \mathbb{R}[\sigma_1,\dots,\sigma_n], \qquad (10.13) \]
i.e. any $GL(n,\mathbb{R})$-invariant polynomial can be written as a polynomial of the $t_i$'s or the $\sigma_i$'s. Upon defining $c_k := (\frac{i}{2\pi})^k\sigma_k$ we get
\[ I(GL(n,\mathbb{C})) = \mathbb{R}[c_1,\dots,c_n]. \qquad (10.14) \]

An immediate consequence of (10.13) is that the $\sigma_k$'s can be written as polynomials in the $t_k$'s. These relations are called Newton relations, and the first of them read
\[ \sigma_1 = t_1, \qquad \sigma_2 = \tfrac{1}{2}(t_1^2 - t_2), \qquad \sigma_3 = \tfrac{1}{6}(t_1^3 - 3t_1t_2 + 2t_3). \]

If $R_\nabla$ is the curvature of some connection on a vector bundle $E$ and $(U_\alpha)$ is a trivialization cover, then we have the matrices $\Omega_\alpha$ of curvature 2-forms. If $P$ is a $GL(n,\mathbb{K})$-invariant polynomial we can form the $\mathbb{K}$-valued form $P(\Omega_\alpha)$ over $U_\alpha$. If $U_\alpha$ and $U_\beta$ are two trivialization neighborhoods that intersect nontrivially then, by Lemma 10.21, we have $\Omega_\beta = g_{\alpha\beta}^{-1}\Omega_\alpha g_{\alpha\beta}$ on $U_\alpha\cap U_\beta$, where $g_{\alpha\beta}(p)\in GL(n,\mathbb{K})$, and hence by invariance of $P$ we have $P(\Omega_\alpha) = P(\Omega_\beta)$ on $U_\alpha\cap U_\beta$. Thus we can piece the forms $P(\Omega_\alpha)$ together to a global differential form $P(R_\nabla)$ satisfying $P(R_\nabla)|_{U_\alpha} = P(\Omega_\alpha)$.

Lemma 10.29. $P(R_\nabla)$ is a closed differential form.

Proof. We will show that this is true locally, i.e. that $d(P(\Omega)) = 0$ whenever $\Omega$ is the curvature 2-form on some neighborhood $U$. Recall the definition $\Omega = d\omega + \omega\wedge\omega$. We then have
\[ d\Omega = d\omega\wedge\omega - \omega\wedge d\omega = \Omega\wedge\omega - \omega\wedge\omega\wedge\omega - \omega\wedge\Omega + \omega\wedge\omega\wedge\omega = \Omega\wedge\omega - \omega\wedge\Omega \qquad (10.15) \]
(this is also known as the Bianchi identity). Now we consider $d(t_i(\Omega))$:
\[ d(t_i(\Omega)) = d(\operatorname{Tr}(\Omega^i)) = \operatorname{Tr}(d\Omega^i) = \operatorname{Tr}\big(d\Omega\wedge\Omega^{i-1} + \Omega\wedge d\Omega\wedge\Omega^{i-2} + \dots + \Omega^{i-1}\wedge d\Omega\big) \]
\[ = \operatorname{Tr}\big((\Omega\wedge\omega - \omega\wedge\Omega)\wedge\Omega^{i-1} + \Omega\wedge(\Omega\wedge\omega - \omega\wedge\Omega)\wedge\Omega^{i-2} + \dots + \Omega^{i-1}\wedge(\Omega\wedge\omega - \omega\wedge\Omega)\big) = \operatorname{Tr}(\Omega^i\wedge\omega - \omega\wedge\Omega^i). \]
During the calculation we exploited the Bianchi identity along with the fact that the second-to-last sum is telescoping. Now since $\operatorname{Tr}$ is an invariant polynomial we have $\operatorname{Tr}(\omega\wedge\Omega^i) = \operatorname{Tr}(\Omega^i\wedge\omega)$ and hence $d(t_i(\Omega)) = 0$. As $P$ is a polynomial in the $t_i$'s we get $d(P(\Omega)) = 0$.
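In the simplest case the lemma can be seen directly — a quick sanity check rather than anything new. For a line bundle ($n = 1$) the matrices are $1\times 1$, so $\omega\wedge\omega = 0$ and
\[ \Omega = d\omega, \]
which is visibly closed. Moreover $GL(1,\mathbb{K})$ is abelian, so the transition rule gives $\Omega_\beta = g_{\alpha\beta}^{-1}\Omega_\alpha g_{\alpha\beta} = \Omega_\alpha$: the curvature 2-form is itself already globally defined, and $P(R_\nabla)$ is simply a polynomial expression in this global closed 2-form.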
By Lemma 10.29, if $P$ is a homogeneous invariant polynomial of degree $k$, then $P(R_\nabla)$ is a closed $2k$-form on $M$. Therefore it represents a cohomology class $[P(R_\nabla)]$ in $H^{2k}_{dR}(M;\mathbb{K})$.²

Lemma 10.30. The cohomology class $[P(R_\nabla)]\in H^{2k}_{dR}(M;\mathbb{K})$ is independent of the choice of connection.

Proof. Let $\nabla_0$ and $\nabla_1$ be two connections on $E$ and $R_{\nabla_0}$ and $R_{\nabla_1}$ the corresponding curvature tensors. We will see that $[P(R_{\nabla_0})] = [P(R_{\nabla_1})]$. Consider the projection $\pi_1 : M\times\mathbb{R}\to M$, let $\tilde E$ be the bundle over $M\times\mathbb{R}$ induced from $E$ by this projection and let $\tilde\nabla_0$ and $\tilde\nabla_1$ be the pullback connections on $\tilde E$. We define a new connection $\tilde\nabla$ on $\tilde E$ by
\[ \tilde\nabla(s)(p,t) = (1-t)\,\tilde\nabla_0(s)(p,t) + t\,\tilde\nabla_1(s)(p,t), \]
i.e. $\tilde\nabla(s)$ is of the form $f_0\tilde\nabla_0(s) + f_1\tilde\nabla_1(s)$ where in this case $f_0(p,t) = 1-t$ and $f_1(p,t) = t$. Let $R_{\tilde\nabla}$ denote the corresponding curvature; then $P(R_{\tilde\nabla})$ is a closed $2k$-form over $M\times\mathbb{R}$. Now let, for $k = 0,1$, $\iota_k : M\to M\times\mathbb{R}$ denote the map $p\mapsto(p,k)$. As $\pi_1\circ\iota_k = \operatorname{id}_M$, we have $\iota_k^*\tilde E = E$. By 2) of Proposition 10.15 we see that
\[ \iota_0^*(\tilde\nabla(s)) = \iota_0^*\big(f_0\tilde\nabla_0(s) + f_1\tilde\nabla_1(s)\big) = (f_0\circ\iota_0)\,\iota_0^*(\tilde\nabla_0(s)) + (f_1\circ\iota_0)\,\iota_0^*(\tilde\nabla_1(s)) = \iota_0^*(\tilde\nabla_0(s)) = \iota_0^*\pi_1^*\nabla_0(s) = \nabla_0(s), \]
i.e. $\iota_0^*\tilde\nabla = \nabla_0$, and similarly $\iota_1^*\tilde\nabla = \nabla_1$. Therefore (since the pullback map is multiplicative with respect to the wedge product)
\[ P(R_{\nabla_0}) = P(R_{\iota_0^*\tilde\nabla}) = \iota_0^* P(R_{\tilde\nabla}), \qquad P(R_{\nabla_1}) = P(R_{\iota_1^*\tilde\nabla}) = \iota_1^* P(R_{\tilde\nabla}). \]
But as $\iota_0$ and $\iota_1$ are obviously homotopic, they induce the same maps on cohomology and thus we have
\[ [P(R_{\nabla_0})] = [\iota_0^* P(R_{\tilde\nabla})] = \iota_0^*[P(R_{\tilde\nabla})] = \iota_1^*[P(R_{\tilde\nabla})] = [P(R_{\nabla_1})]. \]

By this last result $[P(R_\nabla)]$ does not depend on the choice of connection but only on the (isomorphism class of the) vector bundle $E$, and therefore we may write it $P(E)$. A cohomology class obtained in this way is called a characteristic class for the vector bundle $E$.

Proposition 10.31 (Naturality of Characteristic Classes). Let $E$ be a vector bundle over $M$ and $f : N\to M$ a smooth map. Then for any invariant polynomial $P$ we have $P(f^*E) = f^*P(E)$.

Proof. Let $\nabla$ be a connection on $E$; then $f^*\nabla$ is a connection on $f^*E$ and by point 3) of Proposition 10.15 we have $P(R_{f^*\nabla}) = f^*P(R_\nabla)$, and therefore
\[ P(f^*E) = [P(R_{f^*\nabla})] = [f^*P(R_\nabla)] = f^*([P(R_\nabla)]) = f^*P(E). \]

Proposition 10.32 (Isomorphism Invariance). Two isomorphic bundles have the same characteristic classes.

² $H^n_{dR}(M;\mathbb{R})$ is just the usual de Rham cohomology, i.e. the cohomology of the cocomplex $(\Omega^k(M,\mathbb{R}), d)$. In the complex case we define $H^n_{dR}(M;\mathbb{C})$ to be the cohomology of the cocomplex $(\Omega^k(M,\mathbb{C}), d)$. Since $\Omega^k(M,\mathbb{C}) = \Omega^k(M,\mathbb{R})\otimes\mathbb{C}$, it is not hard to see that $H^n_{dR}(M;\mathbb{C}) \cong H^n_{dR}(M;\mathbb{R})\otimes\mathbb{C}$.
Proof. Let $\Phi : E\to F$ be a bundle isomorphism. It induces an isomorphism $\Phi_* : \Gamma(E)\to\Gamma(F)$ by $\Phi_*(s) = \Phi\circ s$ and similarly an isomorphism $\Phi_* : \Omega^k(M,E)\to\Omega^k(M,F)$ by $\Phi_*(\omega\otimes s) = \omega\otimes(\Phi\circ s)$. Given a connection $\nabla$ on $E$ there exists a connection $\nabla'$ on $F$ such that the square
\[ \Phi_*\circ\nabla = \nabla'\circ\Phi_* : \Gamma(E)\to\Omega^1(M,F) \]
commutes; simply define $\nabla'$ by $\nabla'(\Phi_*s) := \Phi_*(\nabla s)$. Indeed, $\nabla'$ is $\mathbb{K}$-linear, and for any $f\in C^\infty(M)$ and $s\in\Gamma(E)$ we have
\[ \nabla'(f\,\Phi_*s) = \nabla'(\Phi_*(fs)) = \Phi_*(\nabla(fs)) = \Phi_*(df\otimes s + f\nabla s) = df\otimes\Phi_*s + f\,\Phi_*(\nabla s) = df\otimes\Phi_*s + f\,\nabla'(\Phi_*s), \]
so $\nabla'$ is a connection on $F$. Now let $\{s_1,\dots,s_n\}$ be a local frame for $E$ and let $\omega_{ij}$ be the corresponding connection 1-forms for $\nabla$. Put $t_i = \Phi_*s_i$; then
\[ \nabla' t_j = \nabla'(\Phi_*s_j) = \Phi_*(\nabla s_j) = \Phi_*\Big(\sum_{i=1}^n \omega_{ij}\otimes s_i\Big) = \sum_{i=1}^n \omega_{ij}\otimes t_i, \]
i.e. $\nabla'$ has the same connection 1-forms relative to the frame $\{t_1,\dots,t_n\}$. Thus the two connections also have the same curvature 2-forms, and hence the bundles have the same characteristic classes.

Example 10.33. 1) For a bundle over a contractible manifold $M$ all characteristic classes in positive degree vanish, since $H^k(M;\mathbb{K}) = 0$ for $k\geq 1$.
2) Product bundles have vanishing characteristic classes as well: on the product bundle we have the trivial connection, which is flat, i.e. its curvature is zero. Thus all curvature 2-forms are zero and so are the characteristic classes. Another way to see this is the following: consider the map $f : M\to\{p_0\}$ mapping all of $M$ to some point $p_0$. Then the product bundle $M\times\mathbb{K}^n$ equals the pullback $f^*(\{p_0\}\times\mathbb{K}^n)$, and by naturality the characteristic classes of $M\times\mathbb{K}^n$ are pullbacks of the characteristic classes of $\{p_0\}\times\mathbb{K}^n$, which are zero by 1).

For the next proposition it is crucial for the vector bundles in question to be real. The statement is not true for complex vector bundles!

Proposition 10.34. If $E$ is a real vector bundle and $P$ is an invariant polynomial of odd degree, then $P(E) = 0$.

Proof. As mentioned above we have $I(GL(n,\mathbb{R})) = \mathbb{R}[t_1,\dots,t_n]$, so if $P$ is an invariant polynomial of odd degree then every monomial of $P$ must contain one of $t_1, t_3, t_5,\dots$ Thus it suffices to show that $t_k(E) = 0$ for $k$ odd. The characteristic class is independent of the connection, so equip $E$ with a fiber metric and choose a metric connection. On each of the trivialization neighborhoods we pick orthonormal frames; thus, by Lemma 10.24, the matrices of curvature forms are antisymmetric. If $A$ is an antisymmetric matrix and $k$ is odd, then $(A^k)^T = (A^T)^k = (-A)^k = -A^k$, so $A^k$ is again antisymmetric and thus $t_k(A) = \operatorname{Tr}A^k = 0$. Thus also $t_k(\Omega) = 0$ and the result follows.
In view of Proposition 10.34 we define:

Definition 10.35 (Pontrjagin Class). Let $E$ be a real vector bundle of dimension $n$, and define for $k = 1,\dots,[n/2]$
\[ p_k := \frac{1}{(2\pi)^{2k}}\,\sigma_{2k}. \]
The characteristic class $p_k(E)\in H^{4k}_{dR}(M) = H^{4k}(M;\mathbb{R})$ is called the $k$th Pontrjagin class of $E$.

The map $M_n(\mathbb{R})\ni X\mapsto\det(I + \frac{1}{2\pi}X)$ is an invariant polynomial as well, and by definition of $\sigma_k$ we have
\[ \det\Big(I + \frac{1}{2\pi}X\Big) = 1 + \sum_{k=1}^n \frac{1}{(2\pi)^k}\,\sigma_k(X). \]
Since $\sigma_k(E) = 0$ for $k$ odd (Proposition 10.34) we have
\[ p(E) := \det\Big(I + \frac{1}{2\pi}R_\nabla\Big) = 1 + p_1(E) + p_2(E) + \dots + p_{[n/2]}(E) \in H^*(M;\mathbb{R}), \]
and this characteristic class is called the total Pontrjagin class.

Recalling the definition $c_k = (\frac{i}{2\pi})^k\sigma_k$, in the complex case we define:

Definition 10.36 (Chern Class). Let $E$ be a complex vector bundle of dimension $n$. The characteristic class $c_k(E)\in H^{2k}_{dR}(M;\mathbb{C}) = H^{2k}(M;\mathbb{C})$ is called the $k$th Chern class of $E$. The class
\[ c(E) := \det\Big(I + \frac{i}{2\pi}R_\nabla\Big) = 1 + c_1(E) + c_2(E) + \dots + c_n(E)\in H^*(M;\mathbb{C}) \]
is called the total Chern class.

Even though they are defined in terms of complex polynomials and complex connection forms, it turns out that the Chern classes are real cohomology classes:

Lemma 10.37. Let $E$ be a complex vector bundle; then $c_k(E)\in H^{2k}(M;\mathbb{R})$ and hence $c(E)\in H^*(M;\mathbb{R})$.

Proof. The Chern class is independent of the choice of connection, so we can pick a metric connection associated to some hermitian fiber metric on $E$. Then, relative to some orthonormal frame, the curvature matrix $\Omega$ will be skew-adjoint, and hence $I + \frac{i}{2\pi}\Omega$ will be self-adjoint. The determinant of a self-adjoint matrix is a real number, and thus the differential form $\det(I + \frac{i}{2\pi}\Omega)$ does indeed take values in $\mathbb{R}$: when acting on vector fields it produces a real number, and thus it defines a real cohomology class.

In Section 10.8 we go one step further to show that both Chern and Pontrjagin classes are in fact integer cohomology classes.
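Unwinding the definitions in the lowest degrees gives formulas that are worth having on record (a routine expansion, using $\sigma_1 = \operatorname{Tr}$ and $\sigma_2(X) = \frac{1}{2}\big((\operatorname{Tr}X)^2 - \operatorname{Tr}(X^2)\big)$):
\[ c_1(E) = \Big[\frac{i}{2\pi}\operatorname{Tr}\Omega\Big], \qquad c_2(E) = \Big[\frac{1}{8\pi^2}\big(\operatorname{Tr}(\Omega\wedge\Omega) - \operatorname{Tr}\Omega\wedge\operatorname{Tr}\Omega\big)\Big], \]
and, for a real bundle with antisymmetric curvature matrix (so that $\operatorname{Tr}\Omega = 0$),
\[ p_1(E) = \frac{1}{(2\pi)^2}[\sigma_2(\Omega)] = -\frac{1}{8\pi^2}\big[\operatorname{Tr}(\Omega\wedge\Omega)\big]. \]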
The total Pontrjagin and Chern classes behave nicely with respect to direct sums:

Theorem 10.38 (Whitney Sum Formulas). Let $E$ and $F$ be vector bundles over $\mathbb{K}$.
1) If $\mathbb{K} = \mathbb{R}$: $p(E\oplus F) = p(E)\smile p(F)$, or explicitly $p_k(E\oplus F) = \sum_{i=0}^k p_i(E)\smile p_{k-i}(F)$.
2) If $\mathbb{K} = \mathbb{C}$: $c(E\oplus F) = c(E)\smile c(F)$, or explicitly $c_k(E\oplus F) = \sum_{i=0}^k c_i(E)\smile c_{k-i}(F)$.

Proof. We show it for Chern classes; the proof for Pontrjagin classes is completely analogous. Let $\nabla$ be a connection on $E$ and $\nabla'$ a connection on $F$. Then, under the isomorphisms $\Gamma(E\oplus F)\cong\Gamma(E)\oplus\Gamma(F)$ and $\Omega^1(M,E\oplus F)\cong\Omega^1(M,E)\oplus\Omega^1(M,F)$, it is a routine calculation to show that the map $\nabla\oplus\nabla'$ defined by
\[ (\nabla\oplus\nabla')(s,s') = (\nabla s, \nabla's'), \]
mapping $\Gamma(E)\oplus\Gamma(F)\to\Gamma(T^*M\otimes E)\oplus\Gamma(T^*M\otimes F)\cong\Gamma(T^*M\otimes(E\oplus F))$, is a connection on $E\oplus F$. This is called the direct sum connection. Consider an open set $U$ which is a trivialization for both $E$ and $F$. Let $(s_1,\dots,s_m)$ be a local frame for $E$ and $(t_1,\dots,t_n)$ a local frame for $F$, both over $U$, and let $\omega^E$ and $\omega^F$ be the corresponding connection matrices. Then the connection matrix for $E\oplus F$ relative to the frame $(s_1,\dots,s_m,t_1,\dots,t_n)$ is
\[ \omega = \begin{pmatrix} \omega^E & 0 \\ 0 & \omega^F \end{pmatrix}. \qquad (10.16) \]
Similarly the curvature matrix for $E\oplus F$ is the block diagonal matrix of the curvature matrices of $E$ and $F$ respectively:
\[ \Omega = \begin{pmatrix} \Omega^E & 0 \\ 0 & \Omega^F \end{pmatrix}. \]
Thus over $U$ we get
\[ \det\Big(I + \frac{i}{2\pi}\Omega\Big) = \det\Big(I + \frac{i}{2\pi}\Omega^E\Big)\wedge\det\Big(I + \frac{i}{2\pi}\Omega^F\Big). \]
For any point $p\in U$ there is a neighborhood where this identity holds; thus it holds pointwise over all of $M$. Passing to the level of cohomology classes, the wedge product is transformed into the cup product, and the formula is proved.

Example 10.39. Let's calculate the Pontrjagin classes of the tangent bundle of a sphere $S^n$. Upon adding to $TS^n$ the normal bundle $NS^n$ (which is a trivial line bundle) we get the product bundle $S^n\times\mathbb{R}^{n+1}$. But by Example 10.33 the total Pontrjagin class of a product bundle is just 1, so by the Whitney sum formula:
\[ 1 = p(TS^n\oplus NS^n) = p(TS^n)\smile p(NS^n) = p(TS^n)\smile 1 = p(TS^n), \]
i.e. all the Pontrjagin classes of $TS^n$ are zero. This, however, does not imply that the tangent bundles of the spheres are all trivial. Indeed, it is a highly non-trivial fact that only $TS^1$, $TS^3$ and $TS^7$ are trivial.

From a vector space $V$ we can form the dual space $V^*$. Similarly, for a vector bundle $E$ we can form the dual bundle $E^*$: this is simply the bundle whose fibers are the dual spaces $E_p^*$, and it can be given a natural vector bundle structure. A closely related object is the conjugate bundle $\overline{E}$, whose fibers $\overline{E}_p$ are just the fibers $E_p$ equipped with the conjugated scalar multiplication $(z,v)\mapsto\bar z v$. Of course, if $E$ is a real bundle, $\overline{E} = E$. In any case, the conjugate bundle is isomorphic to the dual bundle: pick a fiber metric on $E$; then the map $\overline{E}\to E^*$, $v\mapsto\varphi_v$ where $\varphi_v(w) = \langle w,v\rangle$, is easily seen to be a bundle isomorphism. In particular, if $E$ is a real bundle, it is isomorphic to its dual, although the explicit isomorphism depends on the choice of metric. In the complex case $E$ and $E^*$ are in general not isomorphic, as is seen from the following proposition.
Proposition 10.40. The Chern classes of the conjugate and dual bundles of $E$ are given by
\[ c_k(E^*) = c_k(\overline{E}) = (-1)^k c_k(E). \qquad (10.17) \]

Proof. The first identity is a consequence of the isomorphism $\overline{E}\cong E^*$. The second one can be seen as follows: let $\nabla$ be a metric connection on $E$ and $\Omega$ the matrix of curvature forms relative to some orthonormal frame. Then $\nabla$ is again a connection on $\overline{E}$, and $\overline{\Omega}$ is the curvature matrix of this connection. But as $\nabla$ was a metric connection, we have $\overline{\Omega} = -\Omega^T$. As $\sigma_k$ is a polynomial of homogeneous degree $k$ and is invariant under transposition, the result follows.

We end this section with a few words on the relation between the Pontrjagin and Chern classes. Let $E$ be a real vector bundle. From this we can form a complex vector bundle $E_\mathbb{C} := E\otimes_\mathbb{R}\mathbb{C}$, the complexification of $E$. If $\nabla$ is a connection on $E$ we can define a connection on $E_\mathbb{C}$ in the following way: observe that $\Gamma(E\otimes_\mathbb{R}\mathbb{C}) = \Gamma(E)\otimes_{C^\infty(M,\mathbb{R})}C^\infty(M,\mathbb{C})$, so that a section of $E_\mathbb{C}$ is a sum of sections of the form $s\otimes z$, where $z : M\to\mathbb{C}$ is a smooth function. A connection $\nabla^\mathbb{C}$ is then given by
\[ \nabla^\mathbb{C}(s\otimes z) = (\nabla s)\otimes z + s\otimes dz \]
(and extended linearly). Checking that this is well-defined and a connection is trivial. Assume now that $(s_1,\dots,s_n)$ is a local frame for $E$ and $\omega$ is the corresponding connection matrix. Then $(s_1\otimes 1,\dots,s_n\otimes 1)$ is a local frame for $E_\mathbb{C}$ (now we allow the coefficient functions to be complex-valued). We see that
\[ \nabla^\mathbb{C}(s_j\otimes 1) = (\nabla s_j)\otimes 1 = \Big(\sum_{i=1}^n \omega_{ij}\otimes s_i\Big)\otimes 1 = \sum_{i=1}^n \omega_{ij}\otimes(s_i\otimes 1), \]
i.e. the connection matrix of $\nabla^\mathbb{C}$ is just $\omega$, now viewed as a matrix of complex forms. In the same manner, the curvature matrix of $\nabla^\mathbb{C}$ is just the curvature matrix of $\nabla$, viewed as a complex matrix of forms. Therefore
\[ p_k(E) = \Big[\frac{1}{(2\pi)^{2k}}\,\sigma_{2k}(R_\nabla)\Big] = \Big[(-1)^k\Big(\frac{i}{2\pi}\Big)^{2k}\sigma_{2k}(R_{\nabla^\mathbb{C}})\Big] = (-1)^k c_{2k}(E_\mathbb{C}). \]
Thus we have shown the promised relation:

Proposition 10.41. Let $E$ be a real vector bundle and $E_\mathbb{C}$ its complexification. Then
\[ p_k(E) = (-1)^k c_{2k}(E_\mathbb{C}). \qquad (10.18) \]

10.7 Orientation and the Euler Class

In the previous section we were mostly interested in $GL(n,\mathbb{K})$-invariant polynomials. In the following our group will be $SO(2n)$ and the Lie algebra will be $\mathfrak{so}(2n)$, the set of skew-symmetric $2n\times 2n$-matrices. We define an $SO(2n)$-invariant polynomial, called the Pfaffian $\operatorname{Pf} : \mathfrak{so}(2n)\to\mathbb{R}$, in the following way. Let $A\in\mathfrak{so}(2n)$ and define
\[ \omega(A) := \sum_{1\leq i<j\leq 2n} A_{ij}\, e_i\wedge e_j \in \Lambda^2(\mathbb{R}^{2n}). \]
Now the $n$-fold wedge product $\frac{1}{n!}\,\omega(A)\wedge\dots\wedge\omega(A)$ is a $2n$-form, i.e. it is proportional to the volume form $\mu := e_1\wedge\dots\wedge e_{2n}$ on $\mathbb{R}^{2n}$. We define the Pfaffian of $A$ to be exactly this proportionality factor, i.e.
\[ \frac{1}{n!}\,\omega(A)\wedge\dots\wedge\omega(A) = \operatorname{Pf}(A)\,\mu. \]
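The smallest cases, computed directly from this definition, are worth recording as a sanity check (a routine expansion, not needed later). For
\[ A = \begin{pmatrix} 0 & a \\ -a & 0 \end{pmatrix}\in\mathfrak{so}(2) \]
we have $\omega(A) = a\,e_1\wedge e_2$, so $\operatorname{Pf}(A) = a$, while $\det A = a^2 = \operatorname{Pf}(A)^2$. For $n = 2$ one finds in the same way
\[ \operatorname{Pf}(A) = A_{12}A_{34} - A_{13}A_{24} + A_{14}A_{23}, \]
in accordance with the properties listed in the proposition below.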
One can (but we will not) show the following properties of the Pfaffian³:

Proposition 10.42. If $A\in\mathfrak{so}(2n)$ and $B\in M_{2n}(\mathbb{R})$ then:
1) $\operatorname{Pf}(A) = \frac{1}{2^n n!}\sum_{\sigma\in S_{2n}}\operatorname{sign}(\sigma)\,A_{\sigma(1)\sigma(2)}\cdots A_{\sigma(2n-1)\sigma(2n)}$.
2) $\operatorname{Pf}(BAB^T) = \operatorname{Pf}(A)\det B$.
3) $\operatorname{Pf}(A)^2 = \det A$.
4) If $A$ and $B$ are antisymmetric, then the Pfaffian of the block-diagonal matrix $C = \operatorname{diag}(A,B)$ satisfies
\[ \operatorname{Pf}(C) = \operatorname{Pf}(A)\operatorname{Pf}(B). \qquad (10.19) \]

Note that the content of the first two is that the Pfaffian is an $SO(2n)$-invariant polynomial. From this invariant polynomial we will construct a characteristic class. However, this cannot be done for all vector bundles, only for a privileged class: the orientable bundles.

Definition 10.43 (Orientable Vector Bundle). Let $E$ be a real vector bundle over $M$. $E$ is said to be orientable if there is a trivialization cover $(U_i,\Phi_i)$ such that the corresponding transition functions $g_{ij}$ map into $GL^+(n,\mathbb{R})$, i.e. have positive determinant. A choice of such a specific set of trivializations is called an orientation on $E$. If $E$ is endowed with an orientation it is said to be oriented.

It is easy to see that if $M$ is connected, there are only two possible orientations. If $M$ has $n$ components there are two possibilities for each component, i.e. a total of $2^n$ possible orientations. A fancy way of phrasing this is by saying that the set of orientations is in bijective correspondence with $H^0(M;\mathbb{Z}_2)$, the zeroth singular cohomology group of $M$ with coefficients in $\mathbb{Z}_2$.

Let $E$ be a complex vector bundle of rank $n$ and let $E_\mathbb{R}$ denote the realification of $E$, i.e. the same bundle but where we forget the complex structure. This is a real vector bundle of rank $2n$ and, more interestingly, it is orientable. This is for the following reason: let $V$ be a complex vector space and $V_\mathbb{R}$ its realification. If $\{v_1,\dots,v_n\}$ is a complex basis for $V$, then $\{v_1, iv_1,\dots,v_n, iv_n\}$ is a real basis for $V_\mathbb{R}$. Let $\{w_1,\dots,w_n\}$ be another complex basis for $V$ and let $A = (b_{kl} + ic_{kl})$ be the transition matrix. It takes but a trivial calculation to see that the corresponding transition matrix between the real bases is the matrix $A_\mathbb{R}$ obtained from $A$ by replacing each entry $b_{kl} + ic_{kl}$ by the $2\times 2$ block
\[ \begin{pmatrix} b_{kl} & c_{kl} \\ -c_{kl} & b_{kl} \end{pmatrix}. \qquad (10.20) \]
The transition matrix $A$ is an element of $GL(n,\mathbb{C})$, which is path-connected; in particular there exists a continuous curve joining $A$ to the identity $I\in GL(n,\mathbb{C})$. Realifying this curve gives a curve in $GL(2n,\mathbb{R})$ joining $A_\mathbb{R}$ with $I_{2n}$. Thus $A_\mathbb{R}$ has positive determinant. The upshot of all this is that two different complex bases on a complex vector space induce the same orientation on the realification, i.e. the realification has a canonical orientation. If $E$ is a complex vector bundle, we give each fiber in the realification the canonical orientation. This is easily seen to determine an orientation on $E_\mathbb{R}$, and thus we have proved:

³ For proofs consult [MT] Appendix B.
Lemma 10.44. The realification of a complex vector bundle is orientable and has a canonical orientation.

Assume now that $E$ is an oriented real vector bundle of rank $2n$, and let $(U_\alpha,\Phi_\alpha)$ be an oriented trivialization cover. Pick a fiber metric on $E$ and let $\nabla$ be a metric connection; we may assume that the associated local frames are orthonormal. With respect to the orthonormal frames in question, the matrices of curvature 2-forms $\Omega_\alpha$ are skew-symmetric. Thus we can apply the Pfaffian to $\Omega_\alpha$ to obtain a $2n$-form $\operatorname{Pf}(\frac{1}{2\pi}\Omega_\alpha)$ over $U_\alpha$. On the overlap $U_\alpha\cap U_\beta$ we have $\Omega_\beta = g_{\alpha\beta}^{-1}\Omega_\alpha g_{\alpha\beta}$, and since the trivializations are oriented and the associated local frames are orthonormal, $g_{\alpha\beta}$ takes values in $SO(2n)$, so that
\[ \operatorname{Pf}\big(\tfrac{1}{2\pi}\Omega_\beta\big) = \det(g_{\alpha\beta})\operatorname{Pf}\big(\tfrac{1}{2\pi}\Omega_\alpha\big) = \operatorname{Pf}\big(\tfrac{1}{2\pi}\Omega_\alpha\big). \]
Thus we can piece the individual forms together to a global $2n$-form, which we will denote by $\operatorname{Pf}(R_\nabla)$.

Lemma 10.45. $\operatorname{Pf}(R_\nabla)$ is a closed $2n$-form.

Proof. By using the Bianchi identity on the explicit formula for the Pfaffian, Proposition 10.42 1), a direct computation like the one in the proof of Lemma 10.29 shows that $\operatorname{Pf}(R_\nabla)$ is closed.

Thus $\operatorname{Pf}(R_\nabla)$ determines a class in $H^{2n}(M;\mathbb{R})$.

Lemma 10.46. The cohomology class $[\operatorname{Pf}(R_\nabla)]$ is independent of the choice of metric and metric connection.

Proof. Assume that we have two metrics $g_0$ and $g_1$ on $E$ and two metric connections $\nabla_0$ and $\nabla_1$ (relative, of course, to $g_0$ and $g_1$ respectively). Let $\pi : M\times\mathbb{R}\to M$ be the projection and $i_k : M\to M\times\mathbb{R}$ the inclusions $p\mapsto(p,k)$ for $k = 0,1$. We want to construct a metric on $\pi^*E$ which pulls back to $g_0$ and $g_1$ via $i_0$ and $i_1$ (note that $i_k^*(\pi^*E) = E$). Consider the open cover $M\times\,]{-\infty},\frac34[$ and $M\times\,]\frac14,\infty[$ of $M\times\mathbb{R}$ and let $(\varphi_0,\varphi_1)$ be a partition of unity subordinate to the cover. Put $g = \varphi_0\,\pi^*g_0 + \varphi_1\,\pi^*g_1$; this is a metric on $\pi^*E$. Over $M\times\{0\}$ it is just $g_0$ and over $M\times\{1\}$ it is just $g_1$, and since the pullback via $i_k$ is just restriction, we get $i_k^*g = g_k$. Now let $\nabla'$ be a connection compatible with $g$; in particular it is compatible with $g$ on $(\pi^*E)|_{M\times]1/8,7/8[}$ (this is Lemma 10.25 applied to the inclusion map $M\times\,]\frac18,\frac78[\;\to M\times\mathbb{R}$). Furthermore $\pi^*\nabla_0$ and $\pi^*\nabla_1$ are compatible with $\pi^*g_0$ and $\pi^*g_1$ respectively, and thus $\pi^*\nabla_0$ will be compatible with $g$ over $M\times\,]{-\infty},\frac14[$ and $\pi^*\nabla_1$ will be compatible with $g$ over $M\times\,]\frac34,\infty[$. Thus by using a partition of unity subordinate to this new cover of $M\times\mathbb{R}$ we get a connection $\nabla$ on $\pi^*E$, and this is compatible with $g$ since (10.11), by construction of $\nabla$, holds in a neighborhood of every point. The pullback $i_k^*\nabla$ is the unique connection on $i_k^*(\pi^*E) = E$ which satisfies $(i_k^*\nabla)(i_k^*s) = i_k^*(\nabla s)$. But this equation is, by construction of $\nabla$, satisfied by $\nabla_k$; thus by uniqueness $i_k^*\nabla = \nabla_k$. Now, by Proposition 10.15 3) we have
\[ i_k^*(\operatorname{Pf}(R_\nabla)) = \operatorname{Pf}(R_{i_k^*\nabla}) = \operatorname{Pf}(R_{\nabla_k}). \]
Since $i_0$ and $i_1$ are homotopic, they induce the same map in cohomology; thus $[\operatorname{Pf}(R_{\nabla_0})] = [\operatorname{Pf}(R_{\nabla_1})]$.
Definition 10.47 (Euler Class). The cohomology class $e(E) := [\operatorname{Pf}(R_\nabla)]\in H^{2n}(M;\mathbb{R})$, which depends only on $E$, is called the Euler class of the oriented bundle $E$.

The Euler class possesses some properties which are analogous to the properties of the Chern and Pontrjagin classes:

Proposition 10.48. The Euler class satisfies
1) $e(E\oplus F) = e(E)\smile e(F)$.
2) If $E$ is a bundle over $M$ and $f : N\to M$ is smooth, then $e(f^*E) = f^*(e(E))$, where $f^*E$ is given the induced orientation.

Proof. 1) This is a simple consequence of (10.16) in combination with (10.19). 2) Let $\nabla$ be any metric connection on $E$ and pick $f^*\nabla$ on $f^*E$; then by the same sort of argument as in the proof of Proposition 10.31 we get
\[ e(f^*E) = [\operatorname{Pf}(R_{f^*\nabla})] = [f^*\operatorname{Pf}(R_\nabla)] = f^*[\operatorname{Pf}(R_\nabla)] = f^*e(E). \]

If $E$ is a complex bundle of rank $n$, we saw above that its realification is an orientable bundle of rank $2n$. Thus it makes sense to define the Euler class of an arbitrary complex vector bundle by $e(E) := e(E_\mathbb{R})$.

Proposition 10.49. The relations to the other characteristic classes are as follows:
1) If $E$ is an oriented rank $2n$ real vector bundle, then $e(E)^2 = p_n(E)$.
2) If $E$ is a complex vector bundle of rank $n$, then $e(E) = c_n(E)$.

Proof. 1) Let $g$ be a metric on $E$ and $\nabla$ a metric connection. Let $\Omega$ be the matrix of curvature 2-forms relative to some local orthonormal frame. Now we complexify $E$ to obtain $E_\mathbb{C}$, a rank $2n$ complex bundle. Let $g_\mathbb{C}$ be the complexified metric on $E_\mathbb{C}$; it is defined fiber-wise by $g_\mathbb{C}(v\otimes z, v'\otimes z') = g(v,v')\,z\overline{z'}$. The connection $\nabla$, extended by $\mathbb{C}$-linearity to a map defined on $\Gamma(E_\mathbb{C}) = \Gamma(E)\otimes_\mathbb{R}\mathbb{C}$, becomes a connection on $E_\mathbb{C}$, and it is compatible with $g_\mathbb{C}$. The local frame mentioned above is still an orthonormal frame for $E_\mathbb{C}$, and $\Omega$ is the curvature matrix of this connection relative to this frame. Now we have locally that
\[ c_{2n}(E_\mathbb{C}) = \Big[\Big(\frac{1}{2\pi i}\Big)^{2n}\det(\Omega)\Big]. \]
Similarly we have
\[ e(E)^2 = \Big[\operatorname{Pf}\big(\tfrac{1}{2\pi}\Omega\big)^2\Big] = \Big[\Big(\frac{1}{2\pi}\Big)^{2n}\det(\Omega)\Big]. \]
Since $c_{2n}(E_\mathbb{C}) = (-1)^n p_n(E)$ by Proposition 10.41, we reach the conclusion.
2) Choose a hermitian metric $g$ on $E$ and let $\nabla$ be a corresponding metric connection. It maps into $\Omega^1(M,E)$, viewed as $\Omega^1(M,\mathbb{C})\otimes_{C^\infty(M,\mathbb{C})}\Gamma(E)$. On $E_\mathbb{R}$ we have a metric $g_\mathbb{R}$ given by
\[ g_\mathbb{R}(s,s') := \operatorname{Re} g(s,s'). \]
Under the isomorphisms
\[ \Omega^1(M,E) \cong \Omega^1(M,\mathbb{C})\otimes_{C^\infty(M,\mathbb{C})}\Gamma(E) \cong \Omega^1(M,\mathbb{R})\otimes_{C^\infty(M,\mathbb{R})}C^\infty(M,\mathbb{C})\otimes_{C^\infty(M,\mathbb{C})}\Gamma(E) \cong \Omega^1(M,\mathbb{R})\otimes_{C^\infty(M,\mathbb{R})}\Gamma(E_\mathbb{R}) \]
we can perceive $\nabla$ as a connection on $E_\mathbb{R}$, and under this identification all connection and curvature forms become real differential forms. Obviously $\nabla$ is compatible with $g_\mathbb{R}$; simply take the real part of (10.11). Let $\{s_1,\dots,s_n\}$ be a local $g$-orthonormal frame for $E$. Then $\{s_1, is_1,\dots,s_n, is_n\}$ is a local $g_\mathbb{R}$-orthonormal frame for $E_\mathbb{R}$. Let $\Omega_{ij}$ be the curvature 2-forms of $\nabla$ relative to the frame $(s_i)$. They are complex differential forms, so we write them as $\Omega_{ij} = a_{ij} + ib_{ij}$. Just as in the beginning of this section one can show that the matrix of curvature 2-forms for $E_\mathbb{R}$ relative to the frame $\{s_1, is_1,\dots,s_n, is_n\}$ is the matrix obtained by replacing each entry $a_{kl} + ib_{kl}$ by the $2\times 2$ block
\[ \begin{pmatrix} a_{kl} & b_{kl} \\ -b_{kl} & a_{kl} \end{pmatrix}; \qquad (10.21) \]
this is exactly the way we turned a complex matrix into a real matrix above. So in order to show the desired identity we only need to verify the purely algebraic identity $\operatorname{Pf}(A_\mathbb{R}) = (-i)^n\det A$ for a skew-adjoint $n\times n$-matrix $A$. Since $A$ is skew-adjoint it can be diagonalized: there exists a unitary matrix $U$ such that $UAU^{-1} = \operatorname{diag}(i\lambda_1,\dots,i\lambda_n)$ with $\lambda_j\in\mathbb{R}$. Realifying, we get $U_\mathbb{R}\in SO(2n)$ and
\[ U_\mathbb{R} A_\mathbb{R} U_\mathbb{R}^{-1} = \operatorname{diag}\bigg(\begin{pmatrix} 0 & \lambda_1 \\ -\lambda_1 & 0\end{pmatrix},\dots,\begin{pmatrix} 0 & \lambda_n \\ -\lambda_n & 0\end{pmatrix}\bigg). \]
Thus by 2) and 4) of Proposition 10.42 we get
\[ \operatorname{Pf}(A_\mathbb{R}) = \operatorname{Pf}(U_\mathbb{R} A_\mathbb{R} U_\mathbb{R}^{-1}) = \lambda_1\cdots\lambda_n. \]
On the other hand $\det(A) = i^n\lambda_1\cdots\lambda_n$, and the identity is proved.

The importance of the Euler class (except for its use in the Atiyah-Singer index formula) lies in the fact that it provides an obstruction to the existence of an everywhere nonzero section of the bundle:

Theorem 10.50. Let $E\to M$ be a real oriented vector bundle. If an everywhere nonzero section $s : M\to E$ exists, then $e(E) = 0$.

We will not use this theorem and hence not prove it.

10.8 Splitting Principle, Multiplicative Sequences

In this section we define the Chern character, a map which gives an important link between the cohomology of a space and its K-theory. There are several ways to accomplish this, but the most elegant way, I think, involves cohomology with integer coefficients. Thus the first task before us is to show that the Chern classes as defined above are not just $\mathbb{R}$-cohomology classes but in fact integer cohomology classes. An important ingredient in this is the following, which I do not intend to prove⁴:

⁴ For a proof consult [MT] Chapter 20 or [Ha] Proposition 3.3.
Theorem 10.51 (The Splitting Principle I). Let $E$ be a real or complex vector bundle of rank $n$ over the manifold $M$. Then there exists a smooth manifold $T$ (depending on $E$) and a smooth map $f : T\to M$ such that
1) $f^*E \cong L_1\oplus\dots\oplus L_n$, where $L_1,\dots,L_n$ are real or complex line bundles over $T$;
2) for each $k$ and each coefficient ring $R$ the map $f^* : H^k(M;R)\to H^k(T;R)$ is injective.

For real vector bundles there is an alternative version of the splitting principle:

Theorem 10.52 (The Splitting Principle II). Let $E$ be an oriented real vector bundle of rank $n$ over the manifold $M$. Then there exists a smooth manifold $T$ (depending on $E$) and a smooth map $f : T\to M$ such that $f^* : H^*(M;\mathbb{R})\to H^*(T;\mathbb{R})$ is injective and such that
1) $f^*(E_\mathbb{C}) \cong L_1\oplus\overline{L}_1\oplus\dots\oplus L_k\oplus\overline{L}_k$ if $n = 2k$;
2) $f^*(E_\mathbb{C}) \cong L_1\oplus\overline{L}_1\oplus\dots\oplus L_k\oplus\overline{L}_k\oplus I_1$ if $n = 2k+1$;
where the $L_i$ are complex line bundles, $\overline{L}_i$ is the conjugate bundle and $I_1$ is the trivial complex line bundle.

Thanks to this splitting principle we can prove a uniqueness result for the Chern classes. In what follows, $H_n$ will denote the canonical complex line bundle over $\mathbb{C}P^n$.

Theorem 10.53. There exists a unique set of maps $c_k : \operatorname{Vect}_\mathbb{C}(M)\to H^{2k}(M;\mathbb{R})$, $k\in\mathbb{N}$, satisfying
1) $\int_{\mathbb{C}P^1} c_1(H_1) = 1$, and $c_k(H_1) = 0$ for $k > 1$.
2) $f^*(c_k(E)) = c_k(f^*E)$ for any smooth map $f : N\to M$.
3) $c_k(E\oplus F) = \sum_{j=0}^k c_j(E)\smile c_{k-j}(F)$.

Proof. We do know that they exist — after all, that was what all the effort in the previous sections was for. The only thing missing is the first point, but this can be seen in [MT] Theorem 18.4. The proof of uniqueness is divided into three steps: first we show it for a line bundle, then for a sum of line bundles and finally for a general bundle.
First, let $\pi : L\to M$ be a smooth line bundle over $M$ and assume we have maps $c_1, c_2,\dots$ satisfying the three requirements above. There exists an integer $n$ and a rank $n$ bundle $L'$ over $M$ such that $L\oplus L' \cong M\times\mathbb{C}^{n+1}$. Thus we may view $L$ as sitting inside $M\times\mathbb{C}^{n+1}$. We define a map $p : M\to\mathbb{C}P^n$ in the following (slightly complicated) way: for a point $x\in M$ the fiber $L_x$ of $L$ is a complex line in $\mathbb{C}^{n+1}$ and thus represents an element of $\mathbb{C}P^n$. To see that this map is smooth we see how it looks locally. Around any point in $M$ there is a neighborhood $U$ and a nowhere vanishing section $s : U\to L$ of $L$. On $U$ the map $p$ is given by $x\mapsto[\pi_2(s(x))]$, where $\pi_2 : M\times\mathbb{C}^{n+1}\to\mathbb{C}^{n+1}$ is the projection map and $[\,\cdot\,] : \mathbb{C}^{n+1}\setminus\{0\}\to\mathbb{C}P^n$ is the quotient map. These are all smooth maps, and thus $p$ is smooth over $U$. In the same spirit we define a bundle map $\tilde p : L\to H_n$ in the following way: let $U\subseteq\mathbb{C}P^n$ be a trivialization neighborhood for $H_n$ such that $p^{-1}(U)\subseteq M$ is a trivialization neighborhood for $L$. Then we define $\tilde p$ over $\pi^{-1}(p^{-1}(U))\subseteq L$ as the composition
\[ L|_{p^{-1}(U)} \cong p^{-1}(U)\times\mathbb{C} \xrightarrow{\;p\times\operatorname{id}_\mathbb{C}\;} U\times\mathbb{C} \cong H_n|_U. \]
This gives a well-defined smooth bundle map which makes the following diagram commutative:

  L --p̃--> H_n
  |          |
  M --p--> CP^n

Since the pullback $p^*H_n$ is the unique bundle over $M$ making the diagram above commutative, we have $L \cong p^*H_n$. But then
\[ c_1(L) = c_1(p^*H_n) = p^*(c_1(H_n)), \]
and for $k > 1$
\[ c_k(L) = p^*(c_k(H_n)) = 0. \]
Thus the $c_k$'s are uniquely determined on a line bundle.
Consider now a sum $L_1\oplus\dots\oplus L_n$ of line bundles. Then inductive use of requirement 3) gives $c_k$ in terms of $c_1(L_1),\dots,c_1(L_n)$, which were uniquely determined. Thus the $c_k$'s are uniquely determined on a sum of line bundles. Finally, assume $E$ to be a generic vector bundle over $M$. By the splitting principle there exists a manifold $T$ and a map $f : T\to M$ such that $f^*E = L_1\oplus\dots\oplus L_n$ is a sum of line bundles. Assume we have cohomology classes $c_k(E), c'_k(E)\in H^{2k}(M;\mathbb{R})$ satisfying the three conditions; then by point 2) we have
\[ f^*(c_k(E)) = c_k(f^*E) = c'_k(f^*E) = f^*(c'_k(E)), \]
and since $f^* : H^{2k}(M;\mathbb{R})\to H^{2k}(T;\mathbb{R})$ is injective, $c_k(E) = c'_k(E)$.

This uniqueness statement can be exploited to show that the Chern classes are in fact integer cohomology classes, but we will not need that; for our purpose it suffices that they are real cohomology classes.

In the previous sections we have constructed characteristic classes from invariant polynomials. However, in order to construct the characteristic classes needed for the Atiyah-Singer Index Theorem we cannot confine ourselves to polynomials; we need to be able to construct characteristic classes out of infinite power series. The remainder of this section will create the proper framework, and in the next we will consider some very concrete and important examples. We begin by introducing symmetric polynomials.

Definition 10.54 (Symmetric Polynomial). A symmetric polynomial $p$ of degree $k$ is a polynomial of homogeneous degree $k$ satisfying
\[ p(x_1,\dots,x_n) = p(x_{\sigma(1)},\dots,x_{\sigma(n)}) \]
for all $\sigma\in S_n$.

An obvious example of such a polynomial is $t_k(x_1,\dots,x_n) := x_1^k + \dots + x_n^k$. Furthermore we have the so-called elementary symmetric polynomials given by
\[ \sigma_1(x_1,\dots,x_n) = x_1 + \dots + x_n, \quad \sigma_2(x_1,\dots,x_n) = \sum_{i_1<i_2} x_{i_1}x_{i_2}, \quad \sigma_3(x_1,\dots,x_n) = \sum_{i_1<i_2<i_3} x_{i_1}x_{i_2}x_{i_3}, \quad \dots, \quad \sigma_n(x_1,\dots,x_n) = x_1\cdots x_n. \]
The main result on symmetric polynomials is the following.
Theorem 10.55. The set of elementary symmetric polynomials is algebraically independent (i.e. no $\sigma_i$ can be written as a polynomial of the rest), and furthermore any symmetric polynomial can be expressed as a polynomial in the $\sigma_i$'s.

In particular the symmetric polynomial $t_k$ can be written as a polynomial $s_k(\sigma_1,\dots,\sigma_k)$. These particular polynomials $s_k$ are called the Newton polynomials. The first Newton polynomials are
\[ s_1 = \sigma_1, \qquad s_2 = \sigma_1^2 - 2\sigma_2, \qquad s_3 = \sigma_1^3 - 3\sigma_1\sigma_2 + 3\sigma_3, \qquad s_4 = \sigma_1^4 - 4\sigma_1^2\sigma_2 + 4\sigma_1\sigma_3 + 2\sigma_2^2 - 4\sigma_4. \]
In general one has the following recursion formula:
\[ s_k = \sigma_1 s_{k-1} - \sigma_2 s_{k-2} + \dots + (-1)^{k-2}\sigma_{k-1}s_1 + (-1)^{k-1}k\sigma_k. \qquad (10.22) \]
We will return to these in the following section. For now, consider
\[ \overline{\mathbb{R}[x]} := \{1 + a_1x + a_2x^2 + \dots \mid a_i\in\mathbb{R}\}, \]
the set of formal power series in the indeterminate $x$ with real coefficients and constant term 1 (the bar is to resemble the closure). Fixing $f\in\overline{\mathbb{R}[x]}$ we construct a formal power series in $n$ indeterminates by $f(x_1)\cdots f(x_n)$ and group the terms according to their degree. The sum of the terms of degree $k$ is a homogeneous symmetric polynomial and can therefore be written as a polynomial $F_k$ in the elementary symmetric polynomials, i.e.
\[ f(x_1)\cdots f(x_n) = 1 + F_1(\sigma_1) + F_2(\sigma_1,\sigma_2) + F_3(\sigma_1,\sigma_2,\sigma_3) + \dots \]
The polynomials $F_k$ are independent of the number $n$ of indeterminates, because $\sigma_k(x_1,\dots,x_n,0,\dots,0) = \sigma_k(x_1,\dots,x_n)$ if $k\leq n$ and $\sigma_k(x_1,\dots,x_n,0,\dots,0) = 0$ if $k > n$. In this way we obtain an infinite sequence $(F_k)_{k=1}^\infty$ of polynomials, called the multiplicative sequence characteristic of the formal power series $f$.
A great source of power series is of course Taylor series of holomorphic functions, and the Taylor series of a product of holomorphic functions is, of course, just the product of the Taylor series. Therefore, if $f = gh$ is a product, and $F$, $G$ and $H$ denote the associated series $F(x_1,\dots,x_n) := 1 + \sum_k F_k(\sigma_1,\dots,\sigma_k)$ etc., we get
\[ F(x_1,\dots,x_n) = G(x_1,\dots,x_n)\,H(x_1,\dots,x_n). \qquad (10.23) \]
Let $B = \bigoplus_k B^k$ be a commutative $\mathbb{N}_0$-graded algebra over $\mathbb{R}$; examples include $\mathbb{R}[x]$, the polynomial algebra, and $H^{2\bullet}(M;\mathbb{R})$, the even cohomology with real coefficients. Let $\overline{B}$ be the subset consisting of elements of the form $1 + b_1 + b_2 + \dots + b_n$ for $b_i\in B^i$ and varying $n$. To a fixed $f\in\overline{\mathbb{R}[x]}$ we associate a map $\mathbf{F} : \overline{B}\to\overline{B}$ by defining
\[ \mathbf{F}(1 + b_1 + \dots + b_n) = 1 + F_1(b_1) + F_2(b_1,b_2) + \dots \]
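The simplest choice of $f$ already produces a familiar object and serves as a useful sanity check: take $f(x) = 1 + x$. Then
\[ f(x_1)\cdots f(x_n) = (1+x_1)\cdots(1+x_n) = 1 + \sigma_1 + \sigma_2 + \dots + \sigma_n, \]
so $F_k(\sigma_1,\dots,\sigma_k) = \sigma_k$ and $\mathbf{F}$ is the identity map on $\overline{B}$. With the definitions of the next paragraph, the total $F$-class attached to $f(x) = 1+x$ is thus just the total Chern class itself (resp. the total Pontrjagin class in the real case).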
For the purpose of constructing characteristic classes of complex vector bundles we take as our algebra $B$ the even cohomology $H^{2\bullet}(M;\mathbb{R})$ of the manifold. To a given $f\in\overline{\mathbb{R}[x]}$ we get a map $\mathbf{F} : \overline{H^{2\bullet}(M;\mathbb{R})}\to\overline{H^{2\bullet}(M;\mathbb{R})}$. If $E$ is a complex vector bundle, we define its total $F$-class by
\[ F_\mathbb{C}(E) := \mathbf{F}(c(E)) = 1 + F_1(c_1(E)) + F_2(c_1(E),c_2(E)) + \dots \]
If $E$ is instead a real vector bundle, we take as our algebra $H^{4\bullet}(M;\mathbb{R})$, identify the total Pontrjagin class $p(E)$ as an element of $\overline{H^{4\bullet}(M;\mathbb{R})}$ and define its total $F$-class by $F_\mathbb{R}(E) := \mathbf{F}(p(E))$.
A prime example of a real vector bundle is of course the tangent bundle $TM$; in this case we write $F(M) := F_\mathbb{R}(TM)$. If we assume furthermore that $M$ is compact, connected and oriented of dimension $n$, then $H^n(M;\mathbb{R})\cong\mathbb{R}$, and the homology class corresponding to 1 is called the fundamental class of $M$ and is denoted $[M]$ (the isomorphism depends on the orientation; thus changing the orientation changes the fundamental class). We can evaluate the $F$-class $F(M)$ on $[M]$ to obtain a real number $F(M)[M]$, called the $F$-genus of $M$. It is easy to see that
\[ F(M)[M] = \begin{cases} F_k(p_1(TM),\dots,p_k(TM))[M], & n = 4k, \\ 0, & n\not\equiv 0 \bmod 4. \end{cases} \]

Proposition 10.56. Assume $f\in\overline{\mathbb{R}[x]}$ is fixed. The total $F$-class has the following properties:
1) It is isomorphism invariant: if $E\cong E'$, then $F_\mathbb{K}(E) = F_\mathbb{K}(E')$.
2) It is natural: if $g : N\to M$ is a smooth map, then $F_\mathbb{K}(g^*E) = g^*F_\mathbb{K}(E)$.
3) It is multiplicative: $F_\mathbb{K}(E\oplus E') = F_\mathbb{K}(E)\,F_\mathbb{K}(E')$.

Proof. The first assertion follows trivially from the fact that the Chern and Pontrjagin classes are isomorphism invariant. The second claim follows from a small calculation (for which we assume $E$ to be a complex bundle):
\[ F_\mathbb{C}(g^*E) = \mathbf{F}(c(g^*E)) = \mathbf{F}(g^*c(E)) = 1 + F_1(g^*c_1(E)) + F_2(g^*c_1(E), g^*c_2(E)) + \dots = g^*\big(1 + F_1(c_1(E)) + F_2(c_1(E),c_2(E)) + \dots\big). \]
To prove the third claim (as before we assume the bundles to be complex; the real case is analogous), assume first that $E = L_1\oplus\dots\oplus L_n$ and $E' = L_{n+1}\oplus\dots\oplus L_{n+m}$ are sums of line bundles. Then by the Whitney sum formula
\[ c(E\oplus E') = c(L_1)\cdots c(L_{n+m}) = (1+x_1)\cdots(1+x_{n+m}) = 1 + \sigma_1 + \dots + \sigma_{n+m}, \]
where $x_i = c_1(L_i)$ and the $\sigma_k$'s are the elementary symmetric polynomials in the $n+m$ variables. But then
\[ F_\mathbb{C}(E\oplus E') = \mathbf{F}(c(E\oplus E')) = \mathbf{F}(1 + \sigma_1 + \dots + \sigma_{n+m}) = f(x_1)\cdots f(x_{n+m}) = \big(f(x_1)\cdots f(x_n)\big)\big(f(x_{n+1})\cdots f(x_{n+m})\big) \]
\[ = \mathbf{F}(1 + \sigma'_1 + \dots + \sigma'_n)\,\mathbf{F}(1 + \sigma''_1 + \dots + \sigma''_m) = \mathbf{F}(c(E))\,\mathbf{F}(c(E')) = F_\mathbb{C}(E)\,F_\mathbb{C}(E'), \]
where $(\sigma'_k)$ and $(\sigma''_k)$ are the elementary symmetric polynomials in the variables $x_1,\dots,x_n$ resp. $x_{n+1},\dots,x_{n+m}$.
If $E$ and $E'$ are arbitrary bundles we apply the splitting principle. By the splitting principle there exists a manifold $T_1$ and a smooth map $g_1 : T_1\to M$ such that $g_1^*E$ is a direct sum of line bundles. The pullback $g_1^*E'$ need not be a sum of line bundles, but applying the splitting principle once again we get a manifold $T_2$ and a smooth map $g_2 : T_2\to T_1$ such that $g_2^*(g_1^*E')$ is a sum of line bundles. Obviously $g_2^*(g_1^*E)$ is still a sum of line bundles, and thus, putting $T := T_2$ and $g := g_1\circ g_2 : T\to M$, both $g^*E$ and $g^*E'$ are sums of line bundles over the same manifold. Hence by naturality
\[ g^*F_\mathbb{C}(E\oplus E') = F_\mathbb{C}(g^*E\oplus g^*E') = F_\mathbb{C}(g^*E)\,F_\mathbb{C}(g^*E') = g^*\big(F_\mathbb{C}(E)\,F_\mathbb{C}(E')\big). \]
Since $g^* : H^*(M;\mathbb{R})\to H^*(T;\mathbb{R})$ is injective, the assertion is proven.

By the splitting principle we can in many cases reduce to the case of line bundles; therefore these deserve a special treatment. Let $E$ be a real vector bundle such that $E_\mathbb{C} = L_1\oplus\overline{L}_1\oplus\dots\oplus L_n\oplus\overline{L}_n$, a sum of complex line bundles. By the Whitney sum formula and (10.17) we get
\[ c(E_\mathbb{C}) = \prod_{k=1}^n (1 - x_k^2) = 1 - \sigma_1(x_1^2,\dots,x_n^2) + \sigma_2(x_1^2,\dots,x_n^2) - \sigma_3(x_1^2,\dots,x_n^2) + \dots \]
and from this we get that $c_{2k}(E_\mathbb{C}) = (-1)^k\sigma_k(x_1^2,\dots,x_n^2)$. Thus
\[ p(E) = 1 + p_1(E) + p_2(E) + \dots = 1 - c_2(E_\mathbb{C}) + c_4(E_\mathbb{C}) - c_6(E_\mathbb{C}) + \dots = 1 + \sigma_1(x_1^2,\dots,x_n^2) + \sigma_2(x_1^2,\dots,x_n^2) + \dots = \prod_{k=1}^n (1 + x_k^2). \]
In particular, multiplicativity of the $F$-class above gives
\[ F_\mathbb{R}(E) = f(x_1^2)\cdots f(x_n^2), \qquad (10.24) \]
still under the assumption that $E_\mathbb{C}$ splits as a direct sum of line bundles as above.

Example 10.57. 1) The Todd Class. Consider the formal power series
\[ \operatorname{td}(x) = \frac{x}{1 - e^{-x}} = 1 + \frac{1}{2}x + \frac{1}{12}x^2 - \frac{1}{720}x^4 + \dots \]
The coefficients in this Taylor series are closely related to the so-called Bernoulli numbers; in fact these are the coefficients in the Taylor expansion of the holomorphic function $\frac{z}{e^z - 1}$. The corresponding multiplicative sequence is called the Todd sequence $(\operatorname{Td}_m)$. From
\[ \operatorname{td}(x_1)\operatorname{td}(x_2) = 1 + \frac{1}{2}(x_1 + x_2) + \Big(\frac{1}{12}x_1^2 + \frac{1}{4}x_1x_2 + \frac{1}{12}x_2^2\Big) + \dots \]
we see, by comparing with the expressions of the elementary symmetric polynomials, that
\[ \operatorname{Td}_1(\sigma_1) = \frac{1}{2}\sigma_1, \qquad \operatorname{Td}_2(\sigma_1,\sigma_2) = \frac{1}{12}(\sigma_2 + \sigma_1^2), \qquad \operatorname{Td}_3(\sigma_1,\sigma_2,\sigma_3) = \frac{1}{24}\sigma_1\sigma_2, \]
and so on.
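For a single line bundle $L$ the whole construction collapses to substituting $x = c_1(L)$ into the power series — a small worked case, useful to keep in mind when reading the definition below:
\[ \operatorname{Td}_\mathbb{C}(L) = \frac{c_1(L)}{1 - e^{-c_1(L)}} = 1 + \frac{1}{2}c_1(L) + \frac{1}{12}c_1(L)^2 + \dots, \]
the series terminating because $c_1(L)^k\in H^{2k}(M;\mathbb{R})$ vanishes once $2k$ exceeds $\dim M$.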
If $E$ is a complex vector bundle, its total Todd class is then defined by
\[ \operatorname{Td}_\mathbb{C}(E) = 1 + \frac{1}{2}c_1(E) + \frac{1}{12}\big(c_2(E) + c_1(E)^2\big) + \dots \qquad (10.25) \]
As a matter of fact the Todd class has an inverse; namely, consider the inverse of $\operatorname{td}(x)$:
\[ \operatorname{td}^{-1}(x) = \frac{1 - e^{-x}}{x} = 1 - \frac{x}{2} + \frac{x^2}{6} - \frac{x^3}{24} + \dots \]
Given a complex vector bundle $E$, we denote the corresponding characteristic class by $\operatorname{Td}_\mathbb{C}^{-1}(E)$. Since $\operatorname{td}^{-1}(x)\operatorname{td}(x) = 1$, we have by (10.23) that $\operatorname{Td}_\mathbb{C}^{-1}(E)\operatorname{Td}_\mathbb{C}(E) = 1$ in $\overline{H^{2\bullet}(M;\mathbb{R})}$. Sometimes we will use the slightly abusive notation $\operatorname{Td}_\mathbb{C}(E)^{-1}$ for the inverse Todd class.

2) The Total $\hat{A}$-Class. In this example we consider the formal power series given by
\[ \hat{a}(x) = \frac{\sqrt{x}/2}{\sinh(\sqrt{x}/2)} = 1 - \frac{1}{24}x + \frac{7}{5760}x^2 + \dots \]
The corresponding sequence, called the $\hat{A}$-sequence, has the first terms
\[ \hat{A}_1(\sigma_1) = -\frac{1}{24}\sigma_1, \qquad \hat{A}_2(\sigma_1,\sigma_2) = \frac{1}{5760}(-4\sigma_2 + 7\sigma_1^2), \qquad \hat{A}_3(\sigma_1,\sigma_2,\sigma_3) = \frac{1}{967680}(-16\sigma_3 + 44\sigma_2\sigma_1 - 31\sigma_1^3). \]
For a real bundle $E$, the associated characteristic class $\hat{A}_\mathbb{R}(E)$ is called the total $\hat{A}$-class.

3) The Hirzebruch L-Class. Finally, consider the formal power series
\[ l(x) = \frac{\sqrt{x}}{\tanh\sqrt{x}} = 1 + \frac{1}{3}x - \frac{1}{45}x^2 + \dots \]
The first terms of the corresponding Hirzebruch L-sequence are
\[ L_1(\sigma_1) = \frac{1}{3}\sigma_1, \qquad L_2(\sigma_1,\sigma_2) = \frac{1}{45}(7\sigma_2 - \sigma_1^2), \qquad L_3(\sigma_1,\sigma_2,\sigma_3) = \frac{1}{945}(62\sigma_3 - 13\sigma_1\sigma_2 + 2\sigma_1^3). \]
For a real bundle $E$ the characteristic class
\[ L_\mathbb{R}(E) = 1 + L_1(p_1(E)) + L_2(p_1(E), p_2(E)) + \dots \]
is called the Hirzebruch L-class or just the total L-class.

Proposition 10.58. Let $E$ be an oriented real vector bundle. Then
\[ \operatorname{Td}_\mathbb{C}(E_\mathbb{C}) = \big(\hat{A}_\mathbb{R}(E)\big)^2. \qquad (10.26) \]

Proof. By the splitting principle we only need to show it in the case where $E_\mathbb{C}$ is a sum of line bundles. Assuming that $E$ has dimension $2n$, then
\[ E_\mathbb{C} = L_1\oplus\overline{L}_1\oplus\dots\oplus L_n\oplus\overline{L}_n. \]
In this case we get (thanks to (10.17))
\[ \operatorname{Td}_\mathbb{C}(E_\mathbb{C}) = \prod_{j=1}^n \frac{x_j}{1 - e^{-x_j}}\cdot\frac{-x_j}{1 - e^{x_j}}, \]
where $x_j = c_1(L_j)$. Using $1 - e^{-x_j} = e^{-x_j/2}(e^{x_j/2} - e^{-x_j/2})$ and $1 - e^{x_j} = -e^{x_j/2}(e^{x_j/2} - e^{-x_j/2})$ we obtain
\[ \operatorname{Td}_\mathbb{C}(E_\mathbb{C}) = \prod_{j=1}^n \Big(\frac{x_j}{e^{x_j/2} - e^{-x_j/2}}\Big)^2 = \prod_{j=1}^n \Big(\frac{x_j/2}{\sinh(x_j/2)}\Big)^2 = \big(\hat{A}_\mathbb{R}(E)\big)^2, \]
where the last equality follows from (10.24). The case where $E$ is of odd dimension is analogous.

10.9 The Chern Character

We end this chapter by introducing the Chern character. This is again a characteristic class defined from a formal power series; however, its construction and its properties are slightly different from those of the $F$-classes. Let $L$ be a complex line bundle over a manifold $M$ and define the Chern character of $L$ by
\[ \operatorname{ch}(L) = \exp(c_1(L)) = 1 + c_1(L) + \frac{1}{2}c_1(L)^2 + \dots + \frac{1}{n!}c_1(L)^n + \dots \]
Since $\frac{1}{k!}c_1(L)^k\in H^{2k}(M;\mathbb{R})$ and since $H^n(M;\mathbb{R}) = 0$ for $n$ large enough, this series terminates after finitely many terms. Thus the Chern character is a well-defined real cohomology class. For a direct sum $L_1\oplus\dots\oplus L_n$ of complex line bundles we have
\[ c(L_1\oplus\dots\oplus L_n) = \prod_{i=1}^n (1 + x_i) = 1 + \sigma_1(x_1,\dots,x_n) + \dots + \sigma_n(x_1,\dots,x_n) \]
(where $x_i := c_1(L_i)$), in other words
\[ c_k(L_1\oplus\dots\oplus L_n) = \sigma_k(x_1,\dots,x_n). \qquad (10.27) \]
Upon defining the Chern character of the sum to be the sum of the Chern characters of the line bundles (this is where we depart from the $F$-classes),
\[ \operatorname{ch}(L_1\oplus\dots\oplus L_n) = \sum_{i=1}^n \operatorname{ch}(L_i) = \sum_{i=1}^n \exp(c_1(L_i)) = n + t_1(x_1,\dots,x_n) + \dots + \frac{1}{k!}t_k(x_1,\dots,x_n) + \dots \qquad (10.28) \]
and substituting the Newton relations (10.22) into this and using (10.27), we get
\[ \operatorname{ch}(E) = n + \sum_{k=1}^\infty \frac{s_k(c_1(E),\dots,c_n(E))}{k!}. \qquad (10.29) \]
This formula was derived under the assumption that $E$ is a direct sum of line bundles, but the right-hand side is expressed in terms of $E$ only. Therefore we may take (10.29) to be the definition of the Chern character in the general case:

Definition 10.59 (Chern Character). For a complex vector bundle $E$ of rank $n$, define the Chern character of $E$ by
\[ \operatorname{ch}(E) := n + \sum_{k=1}^\infty \frac{s_k(c_1(E),\dots,c_n(E))}{k!}. \qquad (10.30) \]
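Written out in low degrees, using the Newton polynomials $s_1 = \sigma_1$, $s_2 = \sigma_1^2 - 2\sigma_2$ and $s_3 = \sigma_1^3 - 3\sigma_1\sigma_2 + 3\sigma_3$ recorded above, the definition reads
\[ \operatorname{ch}(E) = n + c_1(E) + \frac{1}{2}\big(c_1(E)^2 - 2c_2(E)\big) + \frac{1}{6}\big(c_1(E)^3 - 3c_1(E)c_2(E) + 3c_3(E)\big) + \dots, \]
so for a line bundle ($n = 1$ and $c_k = 0$ for $k\geq 2$) one recovers $\operatorname{ch}(L) = e^{c_1(L)}$, as it must be.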
Compared to the Chern classes and $F$-classes, the Chern character has even nicer properties:

Proposition 10.60. Let $E$ and $F$ be complex vector bundles over a manifold $M$ and $f : N\to M$ a smooth map. Then the Chern character satisfies the following:
1) Naturality: $\operatorname{ch}(f^*E) = f^*\operatorname{ch}(E)$.
2) Additivity: $\operatorname{ch}(E\oplus F) = \operatorname{ch}(E) + \operatorname{ch}(F)$.
3) Multiplicativity: $\operatorname{ch}(E\otimes F) = \operatorname{ch}(E)\operatorname{ch}(F)$.

Proof. 1) This follows immediately from the definition (10.30) as well as from the naturality of the Chern classes and the multiplicativity of $f^*$.
2) By definition of the Chern character this is true if $E$ and $F$ are direct sums of line bundles. If $E$ and $F$ are generic vector bundles we use the splitting principle to reduce to the case of sums of line bundles: by the splitting principle there exists a manifold $T$ and a smooth map $f : T\to M$ such that $f^*E$ and $f^*F$ are direct sums of line bundles. By naturality of the Chern character we have $\operatorname{ch}(f^*E) = f^*\operatorname{ch}(E)$ and therefore
\[ f^*\operatorname{ch}(E\oplus F) = \operatorname{ch}(f^*E\oplus f^*F) = \operatorname{ch}(f^*E) + \operatorname{ch}(f^*F) = f^*\big(\operatorname{ch}(E) + \operatorname{ch}(F)\big). \]
Since $f^* : H^*(M;\mathbb{R})\to H^*(T;\mathbb{R})$ is injective, we have $\operatorname{ch}(E\oplus F) = \operatorname{ch}(E) + \operatorname{ch}(F)$.
3) This can be proved in exactly the same way. All we need to know is that if $L_1$ and $L_2$ are line bundles, then $c_1(L_1\otimes L_2) = c_1(L_1) + c_1(L_2)$⁵, and therefore
\[ \operatorname{ch}(L_1\otimes L_2) = e^{c_1(L_1\otimes L_2)} = e^{c_1(L_1) + c_1(L_2)} = e^{c_1(L_1)}e^{c_1(L_2)} = \operatorname{ch}(L_1)\operatorname{ch}(L_2). \]
Extending this formula to arbitrary sums of line bundles and using the splitting principle as in the proof of 2), we get 3).

Example 10.61. Let $E$ be a complex vector bundle. Let's try to derive a formula for the alternating sum of the Chern characters of the $k$th exterior powers of $E$, i.e. for $\sum_k (-1)^k\operatorname{ch}(\Lambda^k E)$ (artificial as this may seem, it will turn out useful later on). First assume that $E = L_1\oplus\dots\oplus L_n$ is a direct sum of complex line bundles. By an iteration of the formula $\Lambda^k(V\oplus W) = \bigoplus_{i=0}^k \Lambda^i V\otimes\Lambda^{k-i}W$ we get
\[ \Lambda^k E = \bigoplus_{1\leq i_1<\dots<i_k\leq n} L_{i_1}\otimes\dots\otimes L_{i_k}. \]
Thus (writing $x_i := c_1(L_i)$):
\[ \operatorname{ch}\Lambda^k E = \sum_{1\leq i_1<\dots<i_k\leq n} \operatorname{ch}(L_{i_1})\cdots\operatorname{ch}(L_{i_k}) = \sum_{1\leq i_1<\dots<i_k\leq n} e^{x_{i_1}}\cdots e^{x_{i_k}} = \sum_{1\leq i_1<\dots<i_k\leq n} e^{x_{i_1}+\dots+x_{i_k}}, \]

⁵ A proof of this statement may be found in [Ha], the proof of Proposition 3.10.
and thereby
\[ \sum_{k=0}^n (-1)^k\operatorname{ch}(\Lambda^k E) = \sum_{k=0}^n (-1)^k\sum_{1\leq i_1<\dots<i_k\leq n} e^{x_{i_1}+\dots+x_{i_k}} = \prod_{i=1}^n (1 - e^{x_i}). \]
By the Whitney formula $c(E) = \prod(1 + x_i)$ we easily see that $c_n(E) = x_1\cdots x_n$ (still, of course, under the assumption that $E$ is a sum of line bundles). By Proposition 10.40 we have $c_n(E^*) = (-1)^n x_1\cdots x_n$ and hence
\[ \sum_{k=0}^n (-1)^k\operatorname{ch}(\Lambda^k E) = \prod_{i=1}^n (1 - e^{x_i}) = \big((-1)^n x_1\cdots x_n\big)\prod_{i=1}^n \frac{1 - e^{x_i}}{-x_i} = c_n(E^*)\operatorname{Td}_\mathbb{C}^{-1}(E^*) \qquad (10.31) \]
(note that (10.23) guarantees that these manipulations are legal). By virtue of the splitting principle this formula holds for any complex vector bundle, whether it is a direct sum of line bundles or not.

We can immediately extend the definition of the Chern character to K-theory. Let $M$ be a compact manifold. Since the Chern character $\operatorname{ch} : \operatorname{Vect}_\mathbb{C}(M)\to H^*(M;\mathbb{R})$ is additive, it follows from standard properties of the Grothendieck group that it descends to a group homomorphism $K(M)\to H^*(M;\mathbb{R})$ (or, as the Chern classes are even-dimensional, $\operatorname{ch} : K(M)\to H^{2\bullet}(M;\mathbb{R})$); explicitly it is given by
\[ \operatorname{ch}([E] - [F]) = \operatorname{ch}(E) - \operatorname{ch}(F). \]
The properties 2) and 3) of Proposition 10.60 ensure that this is a ring homomorphism, and property 1) says that if $f : N\to M$ is a smooth map, the following diagram is commutative:

  K(M) --ch--> H^*(M;ℝ)
   |f^*            |f^*
  K(N) --ch--> H^*(N;ℝ)

We will also need a Chern character over non-compact manifolds. The construction is as follows. Let $M$ be a manifold and $M^+$ its 1-point compactification. Let $j : \{\mathrm{pt}\}\to M^+$ denote the inclusion of the point at infinity in $M^+$; then by definition $K(M) = \ker\big(j^* : K(M^+)\to K(\mathrm{pt})\big)$. Furthermore we have the isomorphism $H^*_c(M;\mathbb{R})\cong\ker j^*$, where $j^* : H^*(M^+;\mathbb{R})\to H^*(\mathrm{pt};\mathbb{R})$ is now the induced map in cohomology. We have a Chern character $\operatorname{ch} : K(M^+)\to H^*(M^+;\mathbb{R})$, and by naturality it commutes with $j^*$. Thus the Chern character maps kernel to kernel, i.e. restricting it to $K(M) = \ker j^*$ produces a ring homomorphism
\[ \operatorname{ch} : K(M)\to H^*_c(M;\mathbb{R}). \qquad (10.32) \]
This is our Chern character in the non-compact case.
Chapter 11
Differential Operators

11.1 Differential Operators on Manifolds

Let $U$ be an open set in $\mathbb{R}^n$. By a differential operator of order $k$ on $U$ we will understand an $n_2\times n_1$-matrix $(A_{ij})$ of operators
\[ A_{ij} = \sum_{|\alpha|\leq k} a^{ij}_\alpha(x)\,\partial^\alpha \qquad (11.1) \]
where the $a^{ij}_\alpha$ are smooth complex-valued functions on $U$. Let $M$ be a manifold, and let $(U, x^1,\dots,x^n)$ be a chart on $M$. For a smooth function $f\in C^\infty(U)$ it makes sense to write $\partial^\alpha f$; it simply means consecutive actions of the coordinate vector fields $\partial_i = \partial/\partial x^i$ on $f$. Thus we can talk about a differential operator over $U$, namely a matrix of operators of the form (11.1). A differential operator on $M$ is an operator which locally looks like the operators described above. More precisely, let $\pi_E : E\to M$ and $\pi_F : F\to M$ be complex vector bundles over $M$ (they could have been real as well). We say that an open set $U\subseteq M$ is a proper neighborhood if it is the domain of a smooth chart $(U, x^1,\dots,x^n)$ for $M$ and if there exist trivializations $\Phi_E : \pi_E^{-1}(U)\to U\times\mathbb{C}^{n_1}$ and $\Phi_F : \pi_F^{-1}(U)\to U\times\mathbb{C}^{n_2}$ for $E$ and $F$ over $U$. We formalize the notion of a differential operator in the following:

Definition 11.1 (Differential Operator). A $\mathbb{C}$-linear map $A : \Gamma(E)\to\Gamma(F)$ is called a differential operator provided:
1) The operator is local, i.e. if $u\in\Gamma(E)$ is zero on some open set $U$ then $Au|_U = 0$ as well, i.e. $A$ decreases support.
2) For each proper neighborhood $U$ with trivializations $\Phi_E$ and $\Phi_F$ there exists a matrix of differential operators $A^\Phi_U$ (called the local representative of $A$) as in (11.1) such that for any $u\in\Gamma(E)$:
\[ Au|_U = \big(\Phi_F^{-1}\circ(\operatorname{id}_U\times A^\Phi_U)\circ\Phi_E\big)u|_U. \]

Note how point 1) is necessary for point 2) to make sense.
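Before proceeding, here is the standard example to keep in mind (stated for the trivial line bundles $E = F = M\times\mathbb{C}$, so that the matrices are $1\times 1$): on $U\subseteq\mathbb{R}^n$ the Laplacian
\[ \Delta = -\sum_{i=1}^n \partial_i^2 \]
is a differential operator of order 2, with $a_\alpha = -1$ for the multi-indices $\alpha = (0,\dots,0,2,0,\dots,0)$ and $a_\alpha = 0$ otherwise. (The sign, which makes $\Delta$ a non-negative operator, is the convention common in index theory.) Its principal symbol will be computed at the end of Section 11.2 below.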
For later use we will investigate how the local representative $A^\Phi_U$ depends on the trivializations. Suppose we have two alternative trivializations
\[ \Psi_E : \pi_E^{-1}(U)\to U\times\mathbb{C}^{n_1} \quad\text{and}\quad \Psi_F : \pi_F^{-1}(U)\to U\times\mathbb{C}^{n_2}, \]
and let $A^\Psi_U$ be the local representative of $A$ relative to these trivializations. We have transition functions $g_E : U\to GL(n_1,\mathbb{C})$ and $g_F : U\to GL(n_2,\mathbb{C})$ satisfying
\[ \Psi_E\circ\Phi_E^{-1}(p,\xi) = (p, g_E(p)\xi), \qquad \Psi_F\circ\Phi_F^{-1}(p,\xi) = (p, g_F(p)\xi). \]
From the equation
\[ \Psi_F^{-1}\circ(\operatorname{id}_U\times A^\Psi_U)\circ\Psi_E = \Phi_F^{-1}\circ(\operatorname{id}_U\times A^\Phi_U)\circ\Phi_E \]
(they are both equal to $u\mapsto Au|_U$) we get, by isolating,
\[ \operatorname{id}_U\times A^\Psi_U = (\Psi_F\circ\Phi_F^{-1})\circ(\operatorname{id}_U\times A^\Phi_U)\circ(\Phi_E\circ\Psi_E^{-1}) = \operatorname{id}_U\times(g_F A^\Phi_U g_E^{-1}), \]
that is, $A^\Psi_U = g_F A^\Phi_U g_E^{-1}$.

In the following example we consider a number of differential operators on manifolds.

Example 11.2. 1) Vector Fields. Let $X$ be a vector field, and let $L_X : C^\infty(M)\to C^\infty(M)$ be the Lie derivative on functions, i.e. $L_Xf = Xf$. It is an operator on the trivial bundle $E = M\times\mathbb{R}$. In a coordinate chart $(U, x^1,\dots,x^n)$ we have $X = \sum_{i=1}^n X^i\partial_i$ and
\[ L_Xf = X^1\frac{\partial f}{\partial x^1} + \dots + X^n\frac{\partial f}{\partial x^n}; \]
thus in this chart $L_X$ has local representative $X^1\,\partial/\partial x^1 + \dots + X^n\,\partial/\partial x^n$ (a $1\times 1$-matrix), and so it is a first order differential operator on $E$.

2) The Exterior Derivative. Consider the vector bundles $M\times\mathbb{R}$ and $T^*M$ (these are now real vector bundles!), as well as the exterior derivative $d : C^\infty(M)\to\Omega^1(M)$ between the sets of sections of these bundles. It is a local $\mathbb{R}$-linear operator, and in a coordinate patch $(U, x^1,\dots,x^n)$ it is given by
\[ df = \frac{\partial f}{\partial x^1}dx^1 + \dots + \frac{\partial f}{\partial x^n}dx^n. \]
Consequently, relative to the trivialization of $T^*M$ given by the chart, this operator is represented by the column of differential operators
\[ \Big(\frac{\partial}{\partial x^1},\dots,\frac{\partial}{\partial x^n}\Big)^T. \]
Thus the exterior derivative is a first order differential operator.

3) Connections. Let $E$ be a vector bundle, $\nabla$ a connection on $E$ and $X$ a vector field on $M$; then $\nabla_X : \Gamma(E)\to\Gamma(E)$ is a differential operator. Trivially, this is $\mathbb{K}$-linear and local. To check that it is locally a differential operator we might just as well assume $E$ to be trivial. Furthermore, in order to keep the calculations as simple as possible, we impose the additional assumption that $E$ is 1-dimensional, i.e. is the trivial line bundle $M\times\mathbb{K}$. In this case $\Gamma(E) = C^\infty(M)$ ($\mathbb{K}$-valued smooth functions) and the connection is given on a section $f\in C^\infty(M)$ by (cf. (10.2))
\[ \nabla f = df + fv, \]
where $v$ is a 1-form. Thus
\[ \nabla_X f = Xf + v(X)f. \]
In a local coordinate patch $(U, x^1,\dots,x^n)$ we write $X = \sum_{i=1}^n X^i\partial_i$; then
\[ \nabla_X f = \sum_{i=1}^n X^i(\partial_i f) + v(X)f, \]
thus locally $\nabla_X$ is represented by the differential operator
\[ X^1\frac{\partial}{\partial x^1} + \dots + X^n\frac{\partial}{\partial x^n} + v(X). \]
In fact, the connection itself, $\nabla : \Gamma(E)\to\Gamma(T^*M\otimes E)$, is a differential operator. Again we assume that $E = M\times\mathbb{K}^m$ is trivial; then the sections are of course just $m$-tuples of smooth functions (the sections $s_i(p) = (p, e_i)$ constitute a global frame) and $\Gamma(T^*M\otimes E) = \Omega^1(M,E)$ consists of $m$-tuples of 1-forms (the identification being $\omega\otimes s_1\mapsto(\omega, 0,\dots,0)$, $\omega\otimes s_2\mapsto(0,\omega,0,\dots,0)$ and so on). The connection acts on $fs_1 = (f,0,\dots,0)$ in the following way:
\[ \nabla(fs_1) = df\otimes s_1 + fv, \]
where
\[ v = \begin{pmatrix} \eta_1 \\ \vdots \\ \eta_m \end{pmatrix} = \begin{pmatrix} \eta_{11}dx^1 + \dots + \eta_{1n}dx^n \\ \vdots \\ \eta_{m1}dx^1 + \dots + \eta_{mn}dx^n \end{pmatrix} \in \Omega^1(M,E) \]
is an $m$-tuple of 1-forms expressed in terms of the chart $(U, x^1,\dots,x^n)$. Identifying $df\otimes s_1$ with the tuple $(df, 0,\dots,0)$, we get
\[ \nabla(fs_1) = \begin{pmatrix} df \\ 0 \\ \vdots \\ 0 \end{pmatrix} + f\begin{pmatrix} \eta_{11}dx^1 + \dots + \eta_{1n}dx^n \\ \vdots \\ \eta_{m1}dx^1 + \dots + \eta_{mn}dx^n \end{pmatrix} = \begin{pmatrix} \big(f\eta_{11} + \frac{\partial f}{\partial x^1}\big)dx^1 + \dots + \big(f\eta_{1n} + \frac{\partial f}{\partial x^n}\big)dx^n \\ \vdots \\ f\eta_{m1}dx^1 + \dots + f\eta_{mn}dx^n \end{pmatrix}. \]
The first row of this vector of 1-forms corresponds to
\[ \Big(f\eta_{11} + \frac{\partial f}{\partial x^1}\Big)(dx^1\otimes s_1) + \dots + \Big(f\eta_{1n} + \frac{\partial f}{\partial x^n}\Big)(dx^n\otimes s_1), \]
the $m$th row corresponds to
\[ f\eta_{m1}(dx^1\otimes s_m) + \dots + f\eta_{mn}(dx^n\otimes s_m), \]
and similarly with the other rows. Thus, by choosing the trivialization of $T^*M\otimes E$ corresponding to the local frame $\{dx^i\otimes s_j\}$, we see that $\nabla$ acts on the components as a differential operator.

For the remainder of this chapter we suppose (unless otherwise specified) that $(M,g)$ is an oriented Riemannian manifold with volume form $dv_g$. For a vector bundle $E$ over $M$ we let $\Gamma_c(E)$ denote the set of smooth sections of $E$ with compact support. If $E$ is a Riemannian vector bundle we can define on $\Gamma_c(E)$ an inner product $(\cdot\mid\cdot)_E$ by
\[ (u\mid v)_E := \int_M \langle u(p), v(p)\rangle_p\, dv_g. \]
Definition 11.3 (Formal Adjoint). Let $A : \Gamma(E)\to\Gamma(F)$ be a linear operator. If there exists an operator $A^* : \Gamma(F)\to\Gamma(E)$ such that
\[ (Au\mid v)_F = (u\mid A^*v)_E \]
for all $u\in\Gamma_c(E)$ and $v\in\Gamma_c(F)$, we say that $A^*$ is the formal adjoint of $A$.

It is called a formal adjoint since it is not an adjoint in the Hilbert space sense. We record some elementary properties:

Proposition 11.4. Let $A : \Gamma(E)\to\Gamma(F)$ be a linear operator. Then the following holds:
1) $A$ has at most one formal adjoint.
2) If $A$ and $B : \Gamma(E)\to\Gamma(F)$ have formal adjoints $A^*$ and $B^*$, then $A+B$ has a formal adjoint and $(A+B)^* = A^* + B^*$.
3) If $A : \Gamma(E)\to\Gamma(F)$ and $B : \Gamma(F)\to\Gamma(H)$ have formal adjoints, then $BA$ has a formal adjoint and $(BA)^* = A^*B^*$.
4) If $A$ is a differential operator, a formal adjoint exists and $A^*$ is a differential operator.

Proof. The only non-trivial statement in this proposition is 4). We can prove this by using a partition of unity to reduce to the trivial case. First, if $A$ is a usual differential operator $A = \sum_{|\alpha|\leq k} a_\alpha(x)\partial^\alpha$ over $\mathbb{R}^n$ or an open subset hereof, integration by parts yields the existence of a formal adjoint as well as the formula
\[ A^*u = \sum_{|\alpha|\leq k} (-1)^{|\alpha|}\,\partial^\alpha(\overline{a_\alpha}\,u). \]
If $A = (A_{ij})$ is a differential operator $C^\infty(U)^{n_1}\to C^\infty(U)^{n_2}$, it is easy to see that the formal adjoint exists and equals $A^* = (A^*_{ji})$, i.e. the transpose of the matrix of formal adjoints. Now suppose $A$ is a differential operator on a manifold $M$, and suppose $(U_i)_{i\in I}$ is a cover of $M$ by proper neighborhoods. Let $A_{U_i}$ denote the local representatives of $A$ (w.r.t. some trivializations which we don't bother to include in the notation); then the formal adjoints $A^*_{U_i}$ exist, as we have just seen. Furthermore, if two neighborhoods $U_i$ and $U_j$ intersect non-trivially, then both $A^*_{U_i}$ and $A^*_{U_j}$, when restricted to the intersection, are formal adjoints of $A$ restricted to $U_i\cap U_j$. Thus, by uniqueness of formal adjoints, they agree on the overlap. Consequently, there exists a differential operator $B$ having the $A^*_{U_i}$ as local representatives. To see that this is the formal adjoint of $A$, let $(\psi_i)$ be a partition of unity subordinate to the cover $(U_i)$; then we have (by definition of the integral over $M$, in fact) that
\[ (Au\mid v) = \int_M \langle Au, v\rangle\,dv_g = \sum_{i\in I}\int_{U_i}\psi_i\langle A_{U_i}u, v\rangle\,dv_g = \sum_{i\in I}\int_{U_i}\psi_i\langle u, A^*_{U_i}v\rangle\,dv_g = \int_M\langle u, Bv\rangle\,dv_g = (u\mid Bv), \]
from which we conclude that $B$ is the formal adjoint of $A$.
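Before turning to geometric examples, here is the integration-by-parts formula of the proof in its simplest concrete instance (on the trivial line bundle over $\mathbb{R}$ with the Lebesgue measure). For the first order operator
\[ A = \frac{d}{dx} + a(x), \qquad a\in C^\infty(\mathbb{R},\mathbb{C}), \]
and compactly supported $u, v$, integration by parts gives
\[ \int_\mathbb{R} (u' + au)\,\overline{v}\,dx = \int_\mathbb{R} u\,\overline{(-v' + \overline{a}\,v)}\,dx, \qquad\text{so}\qquad A^* = -\frac{d}{dx} + \overline{a(x)}, \]
in agreement with the general formula $A^*u = \sum_{|\alpha|\leq k}(-1)^{|\alpha|}\partial^\alpha(\overline{a_\alpha}u)$ for $k = 1$.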
Example 11.5.
1) Exterior Derivative. On a compact, oriented Riemannian manifold it is well known that we have a map $d^* : \Omega^k(M) \to \Omega^{k-1}(M)$ given by $d^* = (-1)^{n(k+1)+1} * d\, *$, where $*$ is the Hodge star operator and $n$ is the dimension of $M$. This map satisfies
$$(d^*\omega \mid \eta) = (\omega \mid d\eta)$$
for all $\omega \in \Omega^k(M)$ and $\eta \in \Omega^{k-1}(M)$, where $(\cdot\mid\cdot)$ is the inner product on $\Omega^k(M)$ induced from the metric on $M$. Thus $d^*$ is the formal adjoint of $d$.

2) Vector Fields. First we record the following formula for $u, v \in C^\infty(M)$ (we use $L_X$ to denote the Lie derivative on both functions and differential forms):
$$L_X(uv\,dV_g) = (L_Xu)v\,dV_g + u(L_Xv)\,dV_g + uv(L_X dV_g) = (L_Xu)v\,dV_g + u\big((L_X + \operatorname{div}X)v\big)\,dV_g.$$
The Cartan formula states that $L_X = i_Xd + d\,i_X$, where $i_X : \Omega^k(M) \to \Omega^{k-1}(M)$ is contraction with $X$. Since $uv\,dV_g$ is already a top-dimensional form, $L_X(uv\,dV_g) = d\,i_X(uv\,dV_g)$, and substituting this into the equation above yields
$$d\,i_X(uv\,dV_g) = (L_Xu)v\,dV_g + u\big((L_X + \operatorname{div}X)v\big)\,dV_g. \tag{11.2}$$
Integration on the left-hand side gives 0 by Stokes' Theorem (since our manifold by assumption has no boundary). Hence
$$\int_M (L_Xu)v\,dV_g = \int_M u\big((-L_X - \operatorname{div}X)v\big)\,dV_g,$$
i.e. $L_X^* = -L_X - \operatorname{div}(X)$, and this is again a differential operator on the trivial bundle $M \times \mathbb{R}$.

3) Connections. Let $\nabla$ be a metric connection on a bundle $E$, i.e. for any $u, v \in \Gamma(E)$ and $X \in \mathfrak{X}(M)$ it satisfies
$$L_X\langle u, v\rangle = \langle \nabla_Xu, v\rangle + \langle u, \nabla_Xv\rangle.$$
Upon integrating we get
$$\int_M \langle\nabla_Xu, v\rangle + \langle u, \nabla_Xv\rangle\, dV_g = \int_M L_X(\langle u, v\rangle)\,dV_g = -\int_M (\operatorname{div}X)\langle u, v\rangle\,dV_g = -\int_M \langle u, (\operatorname{div}X)v\rangle\,dV_g,$$
where the second identity follows from (11.2) (simply replace $u$ by $\langle u, v\rangle$ and $v$ by 1). Consequently $\nabla_X^* = -\nabla_X - \operatorname{div}X$.

11.2 The Principal Symbol

Let $U \subseteq \mathbb{R}^n$ be some open set and let $A = \sum_{|\alpha| \le k} a_\alpha(x)\partial^\alpha$ be a usual differential operator of order $k$, acting on functions $f \in C^\infty(U)$ (which can be either real- or complex-valued, but in this setting are assumed complex-valued).
The principal symbol, or just symbol¹, $\sigma(A)$ of $A$ is the map $\sigma(A) : U \times \mathbb{R}^n \to \mathbb{C}$ obtained by formally replacing $\partial^\alpha$ by $i^{|\alpha|}\xi^\alpha$ in the top degree terms of $A$:
$$\sigma(A)(x, \xi) := i^k\sum_{|\alpha| = k} a_\alpha(x)\xi^\alpha.$$
More generally, if we have a differential operator $A : C^\infty(U)^{n_1} \to C^\infty(U)^{n_2}$, i.e. an $n_2 \times n_1$ matrix of differential operators
$$(A)_{ij} = \sum_{|\alpha| \le k} a^{ij}_\alpha(x)\partial^\alpha$$
as in (11.1), we again have a principal symbol $\sigma(A)$, though this time a matrix-valued one:
$$(\sigma(A))_{ij}(x, \xi) := i^k\sum_{|\alpha| = k} a^{ij}_\alpha(x)\xi^\alpha.$$
We see that for a given $x$ this is a homogeneous polynomial in $\xi$ with coefficients in $\operatorname{Mat}(n_2, n_1)$, the set of $n_2 \times n_1$ matrices.

In order to motivate the definition of the symbol for a differential operator on an arbitrary manifold below, let's give the symbol map a more formal guise: let $E = U \times \mathbb{C}^{n_1}$ and $F = U \times \mathbb{C}^{n_2}$ denote trivial bundles; then $A$ is an operator $\Gamma(E) \to \Gamma(F)$. Let $(x, \xi_0) \in U \times \mathbb{R}^n$ be fixed; then the complex matrix $\sigma(A)(x, \xi_0)$ may be viewed as a linear map $E_x \to F_x$. Identifying $U \times \mathbb{R}^n$ with the cotangent bundle $\pi : T^*U \to U$ (shortly it will become apparent why we have identified it with the cotangent bundle and not the tangent bundle), the remarks above translate into the following: for each point $(x, \xi) \in T^*U$ we have a linear map $E_x \to F_x$, or (since, by construction of the pullback bundle, the fiber $(\pi^*E)_{(x,\xi)}$ equals the fiber $E_x$) a linear map $(\pi^*E)_{(x,\xi)} \to (\pi^*F)_{(x,\xi)}$. Thus we see that we may view the symbol as a smooth section of the bundle $\operatorname{Hom}(\pi^*E, \pi^*F)$, or equivalently as a bundle map $\pi^*E \to \pi^*F$.

With this in mind, we generalize to a differential operator $A : \Gamma(E) \to \Gamma(F)$ over an arbitrary manifold $M$. Let $\pi : T^*M \to M$ denote the cotangent bundle, and let $\pi^*E$ and $\pi^*F$ be the pullbacks over $T^*M$. Let $U$ be a proper neighborhood and let $\Phi^E$ and $\Phi^F$ be trivializations of $E$ and $F$ over $U$.

Definition 11.6 (Principal Symbol). Define the principal symbol (or just symbol) $\sigma(A)$ of the differential operator $A$ to be the smooth bundle map $\pi^*E \to \pi^*F$ corresponding to the smooth section (also denoted $\sigma(A)$) of the bundle $\operatorname{Hom}(\pi^*E, \pi^*F)$ given locally by
$$\sigma(A)(\xi_p) = (\Phi^F_p)^{-1}\circ\sigma(A^\Phi_U)(p, \xi)\circ\Phi^E_p : E_p \to F_p \tag{11.3}$$
where $\xi_p$ is an element of $T^*_pM$, $\xi = (\xi_1, \dots, \xi_n)$ are its coordinates relative to the basis $(dx^i)$, and $A^\Phi_U$ is the local representative of $A$.

There are several things we need to check in order to verify that this is well-defined. First of all we note that $(\pi^*E)_{\xi_p} = E_p$ and $(\pi^*F)_{\xi_p} = F_p$, so $\sigma(A)$ is indeed a section of $\operatorname{Hom}(\pi^*E, \pi^*F)$ as stated. Secondly we need to check that it is independent of the choice of trivializations. To this end let $\Psi^E$ and $\Psi^F$ be different trivializations over $U$. Then
$$\sigma(A)(\xi_p) = (\Phi^F_p)^{-1}\sigma(A^\Phi_U)(p,\xi)\Phi^E_p = (\Psi^F_p)^{-1}\Psi^F_p(\Phi^F_p)^{-1}\sigma(A^\Phi_U)(p,\xi)\Phi^E_p(\Psi^E_p)^{-1}\Psi^E_p = (\Psi^F_p)^{-1}g^F(p)\,\sigma(A^\Phi_U)(p,\xi)\,g^E(p)^{-1}\Psi^E_p = (\Psi^F_p)^{-1}\sigma(A^\Psi_U)(p,\xi)\Psi^E_p.$$

¹ In analysis on $\mathbb{R}^n$ one distinguishes between the symbol and the principal symbol. In this manifold setting, however, only the principal part of the symbol makes coordinate-independent sense, and thus we will use the terms symbol and principal symbol interchangeably.
Finally we need to check that it is independent of the choice of smooth coordinates $(x^1, \dots, x^n)$ on $U$. Assume we have two charts $\varphi = (x^1, \dots, x^n)$ and $\tilde\varphi = (\tilde x^1, \dots, \tilde x^n)$. We let $D(\tilde\varphi\circ\varphi^{-1})$ denote the $GL(n,\mathbb{R})$-valued function $p \mapsto (\partial\tilde x^j/\partial x^i)(p)$, and $A^\Phi_U$ and $\tilde A^\Phi_U$ the local representatives of $A$ with respect to the charts $\varphi$ and $\tilde\varphi$. It is well known that the corresponding principal symbols are related by²
$$\sigma(\tilde A^\Phi_U)(p, \tilde\xi) = \sigma(A^\Phi_U)\big(p, D(\tilde\varphi\circ\varphi^{-1})(p)^T\tilde\xi\big) \tag{11.4}$$
for $\tilde\xi \in \mathbb{R}^n$. If $\xi_p = \xi_1\,dx^1|_p + \cdots + \xi_n\,dx^n|_p = \tilde\xi_1\,d\tilde x^1|_p + \cdots + \tilde\xi_n\,d\tilde x^n|_p$, we know that the coefficients are related by $(\xi_1, \dots, \xi_n) = D(\tilde\varphi\circ\varphi^{-1})(p)^T(\tilde\xi_1, \dots, \tilde\xi_n)$, i.e. by exactly the same transformation rule as in (11.4). This shows that the expression (11.3) is independent of the chart, and thus that it is well-defined.

Proposition 11.7. The symbol possesses the following properties:
1) If $a, b \in \mathbb{C}$ and $A$ and $B$ are differential operators of the same order, then $\sigma(aA + bB) = a\sigma(A) + b\sigma(B)$.
2) If $A : \Gamma(E) \to \Gamma(F)$ and $B : \Gamma(F) \to \Gamma(H)$ are differential operators, then $\sigma(BA)(\xi_p) = \sigma(B)(\xi_p)\circ\sigma(A)(\xi_p)$.
3) If $A$ is of order $k$, then
$$\sigma(A^*)(\xi_p) = \sigma(A)(\xi_p)^* \tag{11.5}$$
where $\sigma(A)(\xi_p)^* : F_p \to E_p$ is the adjoint of $\sigma(A)(\xi_p)$.

Proof. 1) is obvious. To prove 2), consider first two local representatives $A^\Phi_U = \sum_{|\alpha|\le k} a_\alpha\partial^\alpha$ and $B^\Phi_U = \sum_{|\beta|\le l} b_\beta\partial^\beta$. Then, of course,
$$B^\Phi_U A^\Phi_U v = \sum_{|\beta|\le l,\ |\alpha|\le k} b_\beta\,\partial^\beta(a_\alpha\,\partial^\alpha v).$$
By the Leibniz rule, the term of highest order is just
$$\sum_{|\alpha| = k,\ |\beta| = l} b_\beta a_\alpha\,\partial^{\alpha+\beta}.$$
From this we deduce that $\sigma(B_UA_U) = \sigma(B_U)\sigma(A_U)$. Globally we have
$$\sigma(BA)(\xi_p) = (\Phi^H_p)^{-1}\sigma(B^\Phi_UA^\Phi_U)(p,\xi)\Phi^E_p = (\Phi^H_p)^{-1}\sigma(B^\Phi_U)(p,\xi)\sigma(A^\Phi_U)(p,\xi)\Phi^E_p = (\Phi^H_p)^{-1}\sigma(B^\Phi_U)(p,\xi)\Phi^F_p(\Phi^F_p)^{-1}\sigma(A^\Phi_U)(p,\xi)\Phi^E_p = \sigma(B)(\xi_p)\sigma(A)(\xi_p).$$
For 3), recall from the proof of Proposition 11.4 that the top order term of $A^*$ is $(-1)^k\,\overline{a_\alpha}^T\partial^\alpha$; hence $\sigma(A^*)(x,\xi) = (-i)^k\sum_{|\alpha|=k}\overline{a_\alpha}^T\xi^\alpha$, which is exactly the adjoint matrix of $\sigma(A)(x,\xi)$ for real $\xi$.

Let's calculate the symbols of some of the differential operators discussed above.

Example 11.8.
1) Vector Fields. We claim that $\sigma(L_X)(\xi_p) = i\xi_p(X)$. Let $(U, x^1, \dots, x^n)$ be a chart on $M$ and write $X = \sum_j X^j\partial_j$; thus locally
$$L_Xf = \sum_{j=1}^n X^j\frac{\partial f}{\partial x^j}.$$

² See for instance [GG] (8.5).
Formally replacing $\partial_j$ with $i\xi_j$ gives the local symbol
$$\sigma(L_X)_U(\xi_1, \dots, \xi_n) = i\sum_{j=1}^n X^j\xi_j.$$
If $\xi_p$ is a covector over a point $p \in U$ with $\xi_p = \sum_{j=1}^n\xi_j\,dx^j|_p$, then $\xi_p(X) = \sum_j X^j\xi_j$, i.e. $\sigma(L_X)(\xi_p) = i\xi_p(X)$.

2) Exterior Derivative. Let's calculate the symbol of the exterior derivative $d : \Omega^1(M) \to \Omega^2(M)$. To keep the calculations at a manageable level, let's assume $M$ to have dimension 3. Let $(U, x^1, x^2, x^3)$ be a chart on $M$ and write $\omega = \sum_{i=1}^3\omega_i\,dx^i$. It is well known that
$$d\omega = \sum_{i<j}\Big(\frac{\partial\omega_j}{\partial x^i} - \frac{\partial\omega_i}{\partial x^j}\Big)dx^i\wedge dx^j,$$
and from this it is easily seen that the exterior derivative, relative to the local frames $(dx^1, dx^2, dx^3)$ and $(dx^1\wedge dx^2, dx^1\wedge dx^3, dx^2\wedge dx^3)$, is represented by the matrix
$$\begin{pmatrix}-\partial/\partial x^2 & \partial/\partial x^1 & 0\\ -\partial/\partial x^3 & 0 & \partial/\partial x^1\\ 0 & -\partial/\partial x^3 & \partial/\partial x^2\end{pmatrix}. \tag{11.6}$$
The claim is that the symbol of $d$ is given by $\sigma(d)(\xi_p) = i\,e(\xi_p)$, where $e(\xi_p)$ is the map $\Lambda^kT^*_pM \to \Lambda^{k+1}T^*_pM$ given by $\omega_p \mapsto \xi_p\wedge\omega_p$. To verify this, note that if $\xi_p = \sum_{i=1}^3\xi_i\,dx^i|_p$ and $T^*_pM \ni \eta_p = \sum_{j=1}^3\eta_j\,dx^j|_p$, then
$$\xi_p\wedge\eta_p = \sum_{i<j}(\xi_i\eta_j - \xi_j\eta_i)\,dx^i_p\wedge dx^j_p,$$
i.e. the linear map $e(\xi_p)$ is represented, in the bases above, by the matrix
$$\begin{pmatrix}-\xi_2 & \xi_1 & 0\\ -\xi_3 & 0 & \xi_1\\ 0 & -\xi_3 & \xi_2\end{pmatrix},$$
and comparing with (11.6) we see that the symbol of $d$ is indeed equal to $i\,e(\xi_p)$ as stated above. This can easily be generalized to manifolds of arbitrary dimension and to $d$ as an operator $d : \Omega^*(M) \to \Omega^*(M)$.

3) Hodge-de Rham Operator. Consider the operator $d + d^* : \Omega^*(M) \to \Omega^*(M)$. By Proposition 11.7 we need only calculate the adjoint of the linear map $e(\xi_p)$ in order to obtain $\sigma(d^*)(\xi_p)$. One can check that the adjoint of $e(\xi_p)$ is $c(\xi_p^\sharp)$, where $c(\xi_p^\sharp) : \Lambda^kT^*_pM \to \Lambda^{k-1}T^*_pM$ is contraction with the metric dual of $\xi_p$. Thus by Proposition 11.7
$$\sigma(d + d^*)(\xi_p) = \sigma(d)(\xi_p) + \sigma(d)(\xi_p)^* = i\,e(\xi_p) - i\,c(\xi_p^\sharp).$$

4) Hodge Laplacian. Since $dd = d^*d^* = 0$ we see that $(d + d^*)^2 = dd^* + d^*d = \Delta : \Omega^*(M) \to \Omega^*(M)$, i.e. the Hodge-de Rham operator is a square root of the Hodge Laplacian. By Proposition 11.7
$$\sigma(\Delta)(\xi_p) = \sigma(d)(\xi_p)\sigma(d)(\xi_p)^* + \sigma(d)(\xi_p)^*\sigma(d)(\xi_p) = i\,e(\xi_p)\big(-i\,c(\xi_p^\sharp)\big) + \big(-i\,c(\xi_p^\sharp)\big)i\,e(\xi_p) = e(\xi_p)c(\xi_p^\sharp) + c(\xi_p^\sharp)e(\xi_p).$$
Thus, to compute the symbol, we need only compute $e(\xi_p)c(\xi_p^\sharp) + c(\xi_p^\sharp)e(\xi_p)$. To do so, let $\{\xi^1, \dots, \xi^n\}$ be an orthonormal basis for $T^*_pM$ and $a \in \mathbb{R}$ arbitrary; then
$$\big(e(a\xi^1)c(a\xi^{1\sharp}) + c(a\xi^{1\sharp})e(a\xi^1)\big)(\xi^{j_1}\wedge\cdots\wedge\xi^{j_k}) = a^2\,\xi^1\wedge c(\xi^{1\sharp})(\xi^{j_1}\wedge\cdots\wedge\xi^{j_k}) + a^2\,c(\xi^{1\sharp})(\xi^1\wedge\xi^{j_1}\wedge\cdots\wedge\xi^{j_k}).$$
If $j_1 = 1$, the second term vanishes and the first term equals $a^2\,\xi^{j_1}\wedge\cdots\wedge\xi^{j_k}$. If $j_1 \neq 1$, the first term vanishes, while the second term equals $a^2\,\xi^{j_1}\wedge\cdots\wedge\xi^{j_k}$. From this and linearity we get in general the following Cartan formula:
$$e(\xi_p)c(\xi_p^\sharp) + c(\xi_p^\sharp)e(\xi_p) = |\xi_p|^2. \tag{11.7}$$
In conclusion we obtain
$$\sigma(\Delta)(\xi_p) = e(\xi_p)c(\xi_p^\sharp) + c(\xi_p^\sharp)e(\xi_p) = |\xi_p|^2.$$

5) Connections and the Connection Laplacian. Let $\{s_1, \dots, s_m\}$ be a local frame for a bundle $E$ and let $\nabla$ be a connection. A local expression for the order-1 terms of $\nabla(f^1s_1 + \cdots + f^ms_m)$ is then
$$\sum_{i=1}^m\sum_{j=1}^n\frac{\partial f^i}{\partial x^j}\,dx^j\otimes s_i.$$
As in 3) we see that the symbol of $\nabla$ is then $\sigma(\nabla)(\xi_p) = i\,\xi_p\otimes{-}$, i.e. tensoring with $i\xi_p$ from the left. The adjoint of $\xi_p\otimes{-}$ is contraction with $\xi_p^\sharp$; therefore $\sigma(\nabla^*)(\xi_p) = -i\,c(\xi_p^\sharp)$. Since $c(\xi_p^\sharp)(\xi_p\otimes s) = |\xi_p|^2s$, we get by Proposition 11.7 that $\sigma(\nabla^*\nabla)(\xi_p) = |\xi_p|^2$. The operator $\nabla^*\nabla$ is known as the connection Laplacian.

Inspired by 4), a second order differential operator $A$ with symbol $\sigma(A)(\xi_p) = |\xi_p|^2$ is called a generalized Laplacian. The Hodge Laplacian and the connection Laplacian are examples of generalized Laplacians. A first order differential operator $A$ for which $A^*A$ and $AA^*$ are generalized Laplacians is called a Dirac type operator (the reason for this will become apparent in Section 11.3). Examples of Dirac type operators include the Hodge-de Rham operator and the connection.

Definition 11.9 (Elliptic Operator). A differential operator $A$ is called elliptic if the symbol map $\sigma(A) : \pi^*E \to \pi^*F$ is a pointwise isomorphism off the zero section of $T^*M$.

By the results of Example 11.8 we see that the Hodge Laplacian is an elliptic operator. In fact any generalized Laplacian, as well as any formally self-adjoint Dirac type operator, is elliptic. Thus the Hodge-de Rham operator is elliptic, whereas a connection need not be. The composition of two elliptic operators is again elliptic. If $A$ is elliptic of order $k$ and $K$ is a differential operator of order strictly less than $k$, then, as $\sigma(A + K) = \sigma(A)$, the operator $A + K$ is elliptic as well.

Elliptic operators have a lot of very beautiful analytic properties. In order to shed some light on these we have to introduce Sobolev spaces. This is the topic of Section 11.4. Before delving into that, we want to devote a section to the description of a type of operator which plays an immensely important part in modern geometry: the Dirac operator.
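The Cartan formula (11.7) and the resulting ellipticity of the Hodge-de Rham symbol are pure finite-dimensional linear algebra, so they can be checked numerically. A minimal sketch in dimension 3 (the covector xi is an arbitrary choice; the helper names are ours, not from the text) builds the matrices of $e(\xi)$ and $c(\xi^\sharp)$ on $\Lambda^*\mathbb{R}^3$ and verifies (11.7) as well as $\big(i(e - c)\big)^2 = |\xi|^2\,\mathrm{id}$:

import itertools
import numpy as np

n = 3
basis = [s for k in range(n + 1) for s in itertools.combinations(range(n), k)]
idx = {s: i for i, s in enumerate(basis)}        # basis of Lambda* R^3, dim 8

def sign(j, s):
    """Sign picked up moving e_j past the elements of s smaller than j."""
    return (-1)**sum(1 for t in s if t < j)

def e_op(xi):                                     # exterior multiplication e(xi)
    E = np.zeros((len(basis), len(basis)))
    for s in basis:
        for j in range(n):
            if j not in s:
                t = tuple(sorted(s + (j,)))
                E[idx[t], idx[s]] += sign(j, s) * xi[j]
    return E

def c_op(xi):                                     # contraction c(xi), the adjoint of e(xi)
    C = np.zeros((len(basis), len(basis)))
    for s in basis:
        for j in s:
            t = tuple(u for u in s if u != j)
            C[idx[t], idx[s]] += sign(j, t) * xi[j]
    return C

xi = np.array([0.7, -1.2, 0.4])
E, C = e_op(xi), c_op(xi)
I = np.eye(len(basis))
assert np.allclose(C, E.T)                        # c(xi) is indeed the adjoint of e(xi)
assert np.allclose(E @ C + C @ E, np.dot(xi, xi) * I)   # Cartan formula (11.7)
S = 1j * (E - C)                                  # symbol of d + d* at xi
assert np.allclose(S @ S, np.dot(xi, xi) * I)     # invertible off xi = 0: ellipticity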
11.3 Dirac Bundles and the Dirac Operator

In this section we let $M$ denote an oriented Riemannian manifold. No compactness condition is imposed unless specified. Let $E$ be a real oriented Riemannian vector bundle over $M$ of rank $n$ (often we will take it to be the tangent bundle). Thus we can construct its oriented frame bundle $P_{SO}(E)$, the principal $SO(n)$-bundle over $M$ whose fiber at $p \in M$ consists of the oriented orthonormal bases of the vector space $E_p$.

We want to construct the so-called Clifford bundle over $E$, i.e. an algebra bundle over $M$ whose fiber at $x$ is isomorphic to the Clifford algebra $Cl(E_x)$. The construction is accomplished as an associated bundle in the following way: consider the Clifford algebra $Cl_{0,n}$, the Clifford algebra over $\mathbb{R}^n$ with the usual negative definite inner product. We have a representation of $SO(n)$ on $Cl_{0,n}$: any $A \in SO(n)$, viewed as a linear map $\mathbb{R}^n \to \mathbb{R}^n$, preserves the inner product and hence induces an algebra homomorphism $\tilde A : Cl_{0,n} \to Cl_{0,n}$, so the representation is given as $\rho(A) = \tilde A$.

Definition 11.10 (Clifford Bundle). The Clifford bundle of $E$ is the associated bundle
$$Cl(E) := P_{SO}(E)\times_\rho Cl_{0,n}.$$

Elements of $Cl(E)$ are equivalence classes $[p, \xi]$, where $p \in P_{SO}(E)$ and $\xi \in Cl_{0,n}$, and the equivalence relation on $P_{SO}(E)\times Cl_{0,n}$ is $(p, \xi) \sim (p\cdot A^{-1}, \rho(A)\xi)$. The projection map is $\pi : Cl(E) \to M$, $[p, \xi] \mapsto \pi(p)$, where $\pi : P_{SO}(E) \to M$ is the projection of the frame bundle. The vector space structure on the fibers is given by $a[p, \xi] + b[p, \xi'] = [p, a\xi + b\xi']$ (note that by transitivity of the right $SO(n)$-action on each fiber of $P_{SO}(E)$ we can always assume the $p$'s to be equal). Similarly, the algebra structure is given by $[p, \xi]\cdot[p, \xi'] = [p, \xi\xi']$, and the identity element is $[p, 1]$. It is easy to check that these operations are well-defined. Since $\xi^2 = -|\xi|^2\cdot 1$ for $\xi \in \mathbb{R}^n$ (by definition of the Clifford algebra), we get
$$[p, \xi][p, \xi] = [p, \xi^2] = [p, -|\xi|^2\,1] = -|\xi|^2[p, 1],$$
thus each fiber is indeed a Clifford algebra of type $(0, n)$: $Cl(E)_x$ (the fiber of the Clifford bundle) is isomorphic to $Cl(E_x)$ (the Clifford algebra of the vector space $E_x$).

Observe that we have $\mathbb{R}^n \subseteq Cl_{0,n}$, that $\mathbb{R}^n$ is a $\rho$-invariant subspace, and that $\rho(A)|_{\mathbb{R}^n} = A$, i.e. $\rho$ restricted to this invariant subspace is just the defining representation of $SO(n)$ on $\mathbb{R}^n$; we write it as $\mathrm{id}$. But this means that we have the subbundle $P_{SO}(E)\times_{\mathrm{id}}\mathbb{R}^n = E$ sitting inside $Cl(E)$ (elements of $E \subseteq Cl(E)$ are characterized by being of the form $[p, v]$ where $v \in \mathbb{R}^n$). In particular we may view $\Gamma(E)$ as sitting inside $\Gamma(Cl(E))$.

For the purpose of studying spinor bundles, as we will do later in this section, we need another description of the Clifford bundle. Assume that the oriented Riemannian vector bundle $E$ has a spin structure $\pi : P_{Spin}(E) \to M$ with double covering bundle map $\Phi : P_{Spin}(E) \to P_{SO}(E)$. Consider the representation $\mathrm{Ad} : Spin(n) \to \operatorname{Aut}(Cl_{0,n})$ given by $\mathrm{Ad}(g)\xi = g\xi g^{-1}$ (recall that $Spin(n)$ sits inside $Cl_{0,n}$, so the multiplication makes sense). Let $\Lambda : Spin(n) \to SO(n)$ denote the double covering; then the following diagram
commutes (simply because $\mathrm{Ad}(g)$ is the unique extension of $\Lambda(g)$ to $Cl_{0,n}$): $\mathrm{Ad} = \rho\circ\Lambda$ as maps $Spin(n) \to \operatorname{Aut}(Cl_{0,n})$.

From the principal $Spin(n)$-bundle $P_{Spin}(E)$ and the representation $\mathrm{Ad}$ we can form the associated bundle $P_{Spin}(E)\times_{\mathrm{Ad}}Cl_{0,n}$.

Lemma 11.11. The map $\Psi : P_{Spin}(E)\times_{\mathrm{Ad}}Cl_{0,n} \to Cl(E) = P_{SO}(E)\times_\rho Cl_{0,n}$ given by $[p, \xi] \mapsto [\Phi(p), \xi]$ is a well-defined smooth algebra bundle isomorphism.

Proof. It is well-defined, since
$$\Psi([p\cdot g^{-1}, \mathrm{Ad}(g)\xi]) = [\Phi(p\cdot g^{-1}), \mathrm{Ad}(g)\xi] = [\Phi(p)\cdot\Lambda(g)^{-1}, \rho(\Lambda(g))\xi] = [\Phi(p), \xi] = \Psi([p, \xi]);$$
the second identity is a consequence of the equivariance of $\Phi$ and of the commuting diagram just above. Restricted to the fiber over $x$, the map is an algebra isomorphism (we skip checking linearity):
$$\Psi_x([p, \xi][p, \xi']) = \Psi_x([p, \xi\xi']) = [\Phi(p), \xi\xi'] = [\Phi(p), \xi]\cdot[\Phi(p), \xi'] = \Psi_x([p, \xi])\Psi_x([p, \xi']).$$
It is injective, for if $0 = \Psi_x([p, \xi]) = [\Phi(p), \xi]$, then $\xi$ must be 0, but then $[p, \xi] = 0$. Also $\Psi_x$ is surjective, since $\Phi$ is. Thus $\Psi$ is an algebra bundle isomorphism.

Definition 11.12 (Dirac Bundle). Let $E$ be a real Riemannian vector bundle over $M$ with a metric connection $\nabla$. A complex vector bundle $S$ over $M$ which is a left $Cl(E)$-module (i.e. for each $x \in M$ there is a representation of the algebra $Cl(E)_x$ on $S_x$) is called a Dirac bundle, provided it is equipped with a fiber metric and a compatible connection $\nabla$ satisfying the following two additional conditions:
1) Clifford multiplication is skew-adjoint, i.e. for each $x \in M$, each $V_x \in E_x$ and $\psi_1, \psi_2 \in S_x$:
$$\langle V_x\cdot\psi_1, \psi_2\rangle + \langle\psi_1, V_x\cdot\psi_2\rangle = 0.$$
2) The connection on $S$ is compatible with the connection on $E$ in the following sense:
$$\nabla_X(V\cdot\psi) = (\nabla_XV)\cdot\psi + V\cdot(\nabla_X\psi)$$
for $X \in \mathfrak{X}(M)$, $V \in \Gamma(E)$ and $\psi \in \Gamma(S)$.

Example 11.13. Exterior Bundle. Let's prove that the exterior bundle $\Lambda^*T^*M$ is a Dirac bundle. We extend the metric $g$ on $M$ to a fiber metric on $\Lambda^*T^*M$ in the usual way. On $TM$ we have the Levi-Civita connection $\nabla$. This can be extended to $T^*M$ upon defining $\nabla_X\omega := (\nabla_X\omega^\sharp)^\flat$ for $\omega \in \Omega^1(M)$. This connection on $T^*M$ is metric:
$$X\langle\omega_1, \omega_2\rangle = X\langle\omega_1^\sharp, \omega_2^\sharp\rangle = \langle\nabla_X\omega_1^\sharp, \omega_2^\sharp\rangle + \langle\omega_1^\sharp, \nabla_X\omega_2^\sharp\rangle = \langle(\nabla_X\omega_1^\sharp)^\flat, \omega_2\rangle + \langle\omega_1, (\nabla_X\omega_2^\sharp)^\flat\rangle = \langle\nabla_X\omega_1, \omega_2\rangle + \langle\omega_1, \nabla_X\omega_2\rangle.$$
We can extend the connection to the exterior bundle by the requirement
$$\nabla_X(\omega\wedge\eta) = (\nabla_X\omega)\wedge\eta + \omega\wedge(\nabla_X\eta).$$
One can check that this is indeed a metric connection on this bundle, and that it satisfies
$$\nabla_X(c(Y)\omega) = c(\nabla_XY)\omega + c(Y)(\nabla_X\omega) \tag{11.8}$$
where $c(Y)$ is contraction with the vector field $Y$.

Next, we need the exterior bundle to be a bundle of Clifford modules; that is, for each point $x \in M$ we need an action of the Clifford algebra $Cl(T_xM)$ on $\Lambda^*T^*_xM$. To this end it is fortunate that we have the quantization map, the vector space isomorphism $Q : \Lambda^*T^*_xM \to Cl(T_xM)$ (covectors being identified with vectors via the metric) given by
$$\varphi_1\wedge\cdots\wedge\varphi_k \mapsto \frac{1}{k!}\sum_{\sigma\in S_k}\operatorname{sign}\sigma\;\varphi_{\sigma(1)}\varphi_{\sigma(2)}\cdots\varphi_{\sigma(k)}.$$
We simply define the Clifford action by
$$Cl(T_xM)\times\Lambda^*T^*_xM \ni (v, \omega) \mapsto v\cdot\omega := Q^{-1}(vQ(\omega)).$$
If $v$ happens to be an element of $T_xM \subseteq Cl(T_xM)$, we have an explicit formula for this product³:
$$v\cdot\omega = v^\flat\wedge\omega - c(v)\omega. \tag{11.9}$$
As has been mentioned previously, the adjoint of $v^\flat\wedge$ is $c(v)$, and so by (11.9) condition 1 in Definition 11.12 is satisfied:
$$\langle v\cdot\omega, \eta\rangle = \langle v^\flat\wedge\omega, \eta\rangle - \langle c(v)\omega, \eta\rangle = \langle\omega, c(v)\eta\rangle - \langle\omega, v^\flat\wedge\eta\rangle = -\langle\omega, v^\flat\wedge\eta - c(v)\eta\rangle = -\langle\omega, v\cdot\eta\rangle.$$
Finally we need to check compatibility of the connection with the Clifford action. Let $X$ and $Y$ be vector fields on $M$, and view $Y$ as a section of $Cl(M)$ (whose values happen to lie in the subbundle $TM \subseteq Cl(M)$). Using (11.9) and (11.8) and the defining property of the connection on $\Lambda^*T^*M$, we get
$$\nabla_X(Y\cdot\omega) = \nabla_X(Y^\flat\wedge\omega - c(Y)\omega) = (\nabla_XY)^\flat\wedge\omega + Y^\flat\wedge\nabla_X\omega - c(\nabla_XY)\omega - c(Y)(\nabla_X\omega) = (\nabla_XY)\cdot\omega + Y\cdot(\nabla_X\omega).$$
Thus we have proved that the exterior bundle is a Dirac bundle.

Spinor Bundle. The next, and single most important, example of a Dirac bundle is the spinor bundle. The setup is as follows: let $E$ be a real oriented Riemannian vector bundle over a Riemannian manifold $M$ and suppose it has a spin structure, i.e. its oriented frame bundle lifts to a principal $Spin(n)$-bundle $P_{Spin}(E) \to M$ with a double covering map $\Phi : P_{Spin}(E) \to P_{SO}(E)$. Let $\kappa_n : Spin(n) \to \operatorname{Aut}(\Delta_n)$ be the spinor representation of $Spin(n)$; then the spinor bundle is the associated bundle
$$S(E) := P_{Spin}(E)\times_{\kappa_n}\Delta_n.$$
Smooth sections of this bundle are called spinor fields or Dirac spinor fields.

³ See [Lawson, Michelson] Proposition 3.9. Note the sign difference in their formula; this is due to their convention $v^2 = |v|^2$, which is different from ours.
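In low dimensions the Clifford algebra and its spin representation can be written down explicitly, which makes the axioms above concrete. A minimal numerical sketch (the Pauli matrices give a representation of $Cl_{0,3}$ on $\mathbb{C}^2$; the vectors $v$ and $\xi$ are arbitrary choices for illustration):

import numpy as np

# A model of Cl_{0,3}: e_k = i*sigma_k satisfies e_j e_k + e_k e_j = -2 delta_jk.
s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
e = [1j*s1, 1j*s2, 1j*s3]
I2 = np.eye(2)

for j in range(3):
    for k in range(3):
        assert np.allclose(e[j] @ e[k] + e[k] @ e[j], -2*(j == k)*I2)

v = 0.3*e[0] - 1.1*e[1] + 0.8*e[2]            # a vector v in R^3 inside Cl_{0,3}
norm2 = 0.3**2 + 1.1**2 + 0.8**2
assert np.allclose(v @ v, -norm2 * I2)        # our convention: v^2 = -|v|^2 * 1

# Clifford multiplication by a vector is skew-adjoint w.r.t. the standard
# Hermitian inner product on C^2 (condition 1 of Definition 11.12):
assert np.allclose(v.conj().T, -v)

# The symbol of a Dirac type operator, S(xi) = i * sum_k xi_k e_k, squares
# to |xi|^2, so it is invertible for xi != 0 (compare Proposition 11.16).
xi = np.array([0.5, -0.2, 0.9])
S = 1j*sum(x*g for x, g in zip(xi, e))
assert np.allclose(S @ S, np.dot(xi, xi)*I2)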
Our present task is to show that it is a Dirac bundle. First we equip it with an action of the Clifford bundle $Cl(E) = P_{Spin}(E)\times_{\mathrm{Ad}}Cl_{0,n}$. Consider on the threefold product $P_{Spin}(E)\times Cl_{0,n}\times\Delta_n$ the equivalence relation
$$(p, \xi, v) \sim (p\cdot g^{-1}, \mathrm{Ad}(g)\xi, \kappa_n(g)v)$$
for any $g \in Spin(n)$. Elements of the quotient space $P_{Spin}(E)\times Cl_{0,n}\times\Delta_n/\!\sim$ are denoted $[p, \xi, v]$. As in the proof of Lemma 11.11 one can show that the map
$$Cl(E)\oplus S(E) \to P_{Spin}(E)\times Cl_{0,n}\times\Delta_n/\!\sim$$
given by $([p, \xi], [p, v]) \mapsto [p, \xi, v]$ is a well-defined bundle isomorphism. This allows us to define the Clifford action in the following way: define $\mu : P_{Spin}(E)\times Cl_{0,n}\times\Delta_n \to P_{Spin}(E)\times\Delta_n$ by $\mu(q, \xi, v) = (q, \rho_n(\xi)v)$ (where $\rho_n : Cl_{0,n} \to \operatorname{End}(\Delta_n)$ is the spin representation of the Clifford algebra), and note that for any $g \in Spin(n)$ the map $\mu$ intertwines the action $(p, \xi, v) \mapsto (p\cdot g^{-1}, \mathrm{Ad}(g)\xi, \kappa_n(g)v)$ on the source with the action $(p, v) \mapsto (p\cdot g^{-1}, \kappa_n(g)v)$ on the target. Thus it induces a map $\mu : Cl(E)\oplus S(E) \to S(E)$, the desired Clifford action. It is given explicitly by the formula
$$[p, \xi]\cdot[p, v] := \mu([p, \xi], [p, v]) = [p, \rho_n(\xi)v].$$

Next we want to give $S(E)$ a metric. Inside $Cl_{0,n}$ we have the finite group
$$G_n := \{\pm e_{i_1}\cdots e_{i_k} \mid 0 \le k \le n,\ 1 \le i_1 < \cdots < i_k \le n\}$$
(where $\{e_1, \dots, e_n\}$ is some orthonormal basis for $\mathbb{R}^n$). Restricting the spin representation $\rho_n$ of $Cl_{0,n}$ to $G_n$ gives a representation of $G_n$ on $\Delta_n$, also denoted $\rho_n$. By a well-known result from representation theory, there exists an inner product $\langle\cdot,\cdot\rangle_n$ on $\Delta_n$ relative to which $\rho_n$ is a unitary representation, i.e.
$$\langle\rho_n(e_{i_1}\cdots e_{i_k})v, \rho_n(e_{i_1}\cdots e_{i_k})w\rangle_n = \langle v, w\rangle_n.$$
(To make the notation in the following less cumbersome, we will simply write the action of $\rho_n(\xi)$ on $v$ as $\xi\cdot v$.) If $\xi = \sum_{i=1}^na_ie_i$ is a unit vector in $\mathbb{R}^n\subseteq Cl_{0,n}$, then $\rho_n(\xi)$ is a unitary operator as well. First we observe, for $i \neq j$,
$$\langle e_i\cdot v, e_j\cdot w\rangle_n = \langle e_j\cdot(e_i\cdot v), e_j^2\cdot w\rangle_n = -\langle(e_je_i)\cdot v, w\rangle_n = \langle(e_ie_j)\cdot v, w\rangle_n = \langle(e_i^2e_j)\cdot v, e_i\cdot w\rangle_n = -\langle e_j\cdot v, e_i\cdot w\rangle_n,$$
and from this we get
$$\langle\xi\cdot v, \xi\cdot w\rangle_n = \sum_{i,j=1}^na_ia_j\langle e_i\cdot v, e_j\cdot w\rangle_n = \sum_{i=1}^na_i^2\langle e_i\cdot v, e_i\cdot w\rangle_n + \sum_{i<j}a_ia_j\big(\langle e_i\cdot v, e_j\cdot w\rangle_n + \langle e_j\cdot v, e_i\cdot w\rangle_n\big) = \sum_{i=1}^na_i^2\langle v, w\rangle_n = \langle v, w\rangle_n.$$
Since $Spin(n)$ is generated by unit vectors, we see immediately that $\kappa_n$ is a unitary representation w.r.t. this inner product. Furthermore, for any $\xi \in \mathbb{R}^n$, unit vector or not, $\rho_n(\xi)$ is a skew-adjoint map:
$$\langle\xi\cdot v, w\rangle_n = \Big\langle\frac{\xi}{|\xi|}\cdot(\xi\cdot v), \frac{\xi}{|\xi|}\cdot w\Big\rangle_n = \frac{1}{|\xi|^2}\langle\xi^2\cdot v, \xi\cdot w\rangle_n = -\langle v, \xi\cdot w\rangle_n,$$
using $\xi^2 = -|\xi|^2$. More generally, for a product $\xi = v_1\cdots v_k$ of vectors this gives $\rho_n(\xi)^* = (-1)^k\rho_n(v_k\cdots v_1)$; for $g \in Spin(n)$, a product of an even number of unit vectors, this is just $\rho_n(g^{-1})$, recovering the unitarity of $\kappa_n = \rho_n|_{Spin(n)}$.

We can readily extend this inner product to a fiber metric on $S(E)$, simply by defining $\langle[p, v], [p, w]\rangle := \langle v, w\rangle_n$. This is well-defined due to the unitarity of $\kappa_n(g)$:
$$\langle[p\cdot g^{-1}, \kappa_n(g)v], [p\cdot g^{-1}, \kappa_n(g)w]\rangle = \langle\kappa_n(g)v, \kappa_n(g)w\rangle_n = \langle v, w\rangle_n = \langle[p, v], [p, w]\rangle.$$
Checking condition 1 in the definition of a Dirac bundle is not hard. Let $V_x \in E_x \subseteq Cl(E)_x$ and $\psi_1, \psi_2 \in S_x(E)$. We have presentations $V_x = [p, v]$ and $\psi_i = [p, w_i]$, where $v \in \mathbb{R}^n\subseteq Cl_{0,n}$, $p \in P_{Spin}(E)$ and $w_i \in \Delta_n$, and hence:
$$\langle V_x\cdot\psi_1, \psi_2\rangle = \langle[p, v]\cdot[p, w_1], [p, w_2]\rangle = \langle[p, v\cdot w_1], [p, w_2]\rangle = \langle v\cdot w_1, w_2\rangle_n = -\langle w_1, v\cdot w_2\rangle_n = -\langle[p, w_1], [p, v]\cdot[p, w_2]\rangle = -\langle\psi_1, V_x\cdot\psi_2\rangle.$$
Next we want to equip $S(E)$ with a connection. Consider a connection on $P_{SO}(E)$, given either as a horizontal tangent distribution on the total space or as an $\mathfrak{so}(n)$-valued 1-form $\omega$ on $P_{SO}(E)$. This connection can be lifted to a connection on the spin bundle $P_{Spin}(E)$ in the following way. We have the double covering map $\Phi : P_{Spin}(E) \to P_{SO}(E)$, and the induced map $\Phi_* : T_qP_{Spin}(E) \to T_{\Phi(q)}P_{SO}(E)$ is an isomorphism. Therefore we can define a horizontal subspace $H_qP_{Spin}(E) := \Phi_*^{-1}(H_{\Phi(q)}P_{SO}(E))$. This is easily seen to be a connection. Alternatively, we can define a connection 1-form $\tilde\omega$ on $P_{Spin}(E)$ as follows: $\tilde\omega := \Lambda_*^{-1}\circ\Phi^*\omega$. This is a connection 1-form; we only have to check the usual two requirements. The first one:
$$\sigma_g^*\tilde\omega = \Lambda_*^{-1}\circ\sigma_g^*\Phi^*\omega = \Lambda_*^{-1}\circ\Phi^*\sigma_{\Lambda(g)}^*\omega = \Lambda_*^{-1}\circ\Phi^*\big(\mathrm{Ad}(\Lambda(g^{-1}))\circ\omega\big) = \Lambda_*^{-1}\circ\mathrm{Ad}(\Lambda(g^{-1}))\circ\Phi^*\omega = \mathrm{Ad}(g^{-1})\circ\Lambda_*^{-1}\circ\Phi^*\omega = \mathrm{Ad}(g^{-1})\circ\tilde\omega.$$
For the second one, let $p \in P_{Spin}(E)$ and denote by $\sigma_p : Spin(n) \to P_{Spin}(E)$ the map $g \mapsto p\cdot g$, and note that $\Phi\circ\sigma_p = \sigma_{\Phi(p)}\circ\Lambda$, where $\sigma_{\Phi(p)}$ is the similar map on the bundle $P_{SO}(E)$. Then:
$$\tilde\omega_p\circ(\sigma_p)_* = \Lambda_*^{-1}\circ(\Phi^*\omega)_p\circ(\sigma_p)_* = \Lambda_*^{-1}\circ\omega_{\Phi(p)}\circ\Phi_*\circ(\sigma_p)_* = \Lambda_*^{-1}\circ\omega_{\Phi(p)}\circ(\sigma_{\Phi(p)})_*\circ\Lambda_* = \Lambda_*^{-1}\circ\Lambda_* = \mathrm{id}_{\mathfrak{spin}(n)}.$$
This defines the same connection as before, for one can easily check that $\ker(\tilde\omega)_p = H_pP_{Spin}(E)$. Now we can transfer this connection to the spinor bundle as done in Section 10.2, obtaining a connection $\nabla$ on $S(E)$. In the special case where $E = TM$ is the tangent bundle and $\omega$ on $P_{SO}(M)$ is the Levi-Civita connection, the lifted connection (both $\tilde\omega$ on $P_{Spin}(E)$ and $\nabla$ on $S(E)$) is called the spin connection.
Proposition 10.11 gives us local expressions for an induced connection on an associated bundle. Let's see what it looks like in the case of the spin connection. Let $t_\alpha : U_\alpha \to P_{Spin}(E)$ be a local section of the spin bundle and put $s_\alpha := \Phi\circ t_\alpha$; this is a local section of the frame bundle $P_{SO}(E)$. Let $\tilde A^\alpha := t_\alpha^*\tilde\omega$ and $A^\alpha := s_\alpha^*\omega$ denote the local gauge potentials of the connection $\tilde\omega$ on $P_{Spin}(E)$, resp. $\omega$ on $P_{SO}(E)$. Then we have
$$\tilde A^\alpha = t_\alpha^*\tilde\omega = t_\alpha^*(\Lambda_*^{-1}\circ\Phi^*\omega) = \Lambda_*^{-1}\circ t_\alpha^*\Phi^*\omega = \Lambda_*^{-1}\circ s_\alpha^*\omega = \Lambda_*^{-1}\circ A^\alpha,$$
so the gauge potentials are related in the nicest possible way. Putting this into (10.5) we obtain
$$(\nabla_X\psi)_\alpha(x) = X_x\psi_\alpha + (\kappa_n)_*\big(\tilde A^\alpha(X_x)\big)\psi_\alpha(x) = X_x\psi_\alpha + \rho_n\big(\Lambda_*^{-1}(A^\alpha(X_x))\big)\psi_\alpha(x)$$
for $x \in U_\alpha$ (recall that the action of $(\kappa_n)_*$ is just $\rho_n$ itself, cf. Corollary 8.30). Now we pick the usual basis $(B_{ij})_{i<j}$ for $\mathfrak{so}(n)$ and write the $\mathfrak{so}(n)$-valued 1-form $A^\alpha$ in terms of this basis: $A^\alpha = \sum_{i<j}A^\alpha_{ij}B_{ij}$. Then (remembering Proposition 8.21) we get
$$(\nabla_X\psi)_\alpha(x) = X_x\psi_\alpha + \rho_n\Big(\Lambda_*^{-1}\Big(\sum_{i<j}A^\alpha_{ij}(X_x)B_{ij}\Big)\Big)\psi_\alpha(x) = X_x\psi_\alpha + \rho_n\Big(\frac12\sum_{i<j}A^\alpha_{ij}(X_x)e_ie_j\Big)\psi_\alpha(x) = X_x\psi_\alpha + \frac12\sum_{i<j}A^\alpha_{ij}(X_x)\,e_ie_j\cdot\psi_\alpha(x).$$
Now let us show compatibility of the spin connection with the fiber metric. We will use the local expression above; note that since locally $\psi(x) = [t_\alpha(x), \psi_\alpha(x)]$ (for some section $t_\alpha : U_\alpha \to P_{Spin}(E)$), we get by definition of the fiber metric $\langle\psi(x), \psi'(x)\rangle = \langle\psi_\alpha(x), \psi'_\alpha(x)\rangle_n$ for $x \in U_\alpha$. Thus we get
$$\langle\nabla_X\psi(x), \psi'(x)\rangle = \Big\langle X_x\psi_\alpha + \frac12\sum_{i<j}A^\alpha_{ij}(X_x)e_ie_j\cdot\psi_\alpha(x),\ \psi'_\alpha(x)\Big\rangle_n = \langle X_x\psi_\alpha, \psi'_\alpha(x)\rangle_n + \frac12\sum_{i<j}A^\alpha_{ij}(X_x)\langle e_ie_j\cdot\psi_\alpha(x), \psi'_\alpha(x)\rangle_n.$$
Note that
$$\langle e_ie_j\cdot\psi_\alpha(x), \psi'_\alpha(x)\rangle_n = -\langle e_j\cdot\psi_\alpha(x), e_i\cdot\psi'_\alpha(x)\rangle_n = \langle\psi_\alpha(x), e_je_i\cdot\psi'_\alpha(x)\rangle_n = -\langle\psi_\alpha(x), e_ie_j\cdot\psi'_\alpha(x)\rangle_n,$$
and therefore
$$\langle\nabla_X\psi(x), \psi'(x)\rangle + \langle\psi(x), \nabla_X\psi'(x)\rangle = \langle X_x\psi_\alpha, \psi'_\alpha(x)\rangle_n + \langle\psi_\alpha(x), X_x\psi'_\alpha\rangle_n.$$
As mentioned, $X_x\psi_\alpha$ should be interpreted componentwise, i.e. pick a complex basis $\{v_1, \dots, v_N\}$ for $\Delta_n$ (of course $N = 2^{\lfloor n/2\rfloor}$) and write
$$\psi_\alpha = \sum_{i=1}^N\psi_{\alpha,i}v_i,$$
where the $\psi_{\alpha,i}$, the components of $\psi_\alpha$, are complex-valued functions on $U_\alpha$; then
$$X_x\psi_\alpha = \sum_{i=1}^N(X_x\psi_{\alpha,i})v_i \in \Delta_n.$$
Since we have a metric in play, it would be wise of us to assume the basis $\{v_1, \dots, v_N\}$ to be orthonormal. Then
$$\langle X_x\psi_\alpha, \psi'_\alpha(x)\rangle_n + \langle\psi_\alpha(x), X_x\psi'_\alpha\rangle_n = \sum_{i=1}^N\Big((X_x\psi_{\alpha,i})\overline{\psi'_{\alpha,i}(x)} + \psi_{\alpha,i}(x)\overline{X_x\psi'_{\alpha,i}}\Big) = X_x\Big(\sum_{i=1}^N\psi_{\alpha,i}\overline{\psi'_{\alpha,i}}\Big) = X_x\langle\psi_\alpha, \psi'_\alpha\rangle_n.$$
Thus we have proved that the spin connection is compatible with the metric.

Finally, we need to check condition 2 in the definition of a Dirac bundle, that is,
$$\nabla_X(Y\cdot\psi)(x) = (\nabla_XY)(x)\cdot\psi(x) + Y_x\cdot(\nabla_X\psi(x)) \tag{11.10}$$
for each $x \in M$. Again we use the local expressions, i.e. we consider a cover $(U_\alpha)$ consisting of domains of trivializations of both $TM$ and $E$. Let $\Phi_\alpha$ denote the trivialization of the tangent bundle (we may assume it to preserve the metric on $M$, i.e. $(\Phi_\alpha)_x : T_xM \to \mathbb{R}^m$ is an isometry). Since $\nabla_X$ is $\mathbb{C}$-linear and satisfies the Leibniz rule, it is sufficient to verify the above condition for $Y = E_k$, where $E_k(x) = \Phi_\alpha^{-1}(x, e_k)$ are local orthonormal vector fields. But first, recall formula (8.3) and replace $X$ there by $\sum_{i<j}A^\alpha_{ij}(X_x)e_ie_j \in \mathfrak{spin}(n)$ to get
$$\Big(\sum_{i<j}A^\alpha_{ij}(X_x)e_ie_j\Big)e_k = e_k\Big(\sum_{i<j}A^\alpha_{ij}(X_x)e_ie_j\Big) + \Lambda_*\Big(\sum_{i<j}A^\alpha_{ij}(X_x)e_ie_j\Big)e_k = e_k\Big(\sum_{i<j}A^\alpha_{ij}(X_x)e_ie_j\Big) + 2\sum_{i<j}A^\alpha_{ij}(X_x)B_{ij}e_k. \tag{11.11}$$
Note that concatenation here means multiplication inside the Clifford algebra, not the Clifford action. Recall also formula (10.7) for the local form of the Levi-Civita connection; it will be used in the following calculation:
$$(\nabla_X(E_k\cdot\psi))_\alpha(x) = X_x(e_k\cdot\psi_\alpha) + \frac12\Big(\sum_{i<j}A^\alpha_{ij}(X_x)e_ie_j\Big)e_k\cdot\psi_\alpha(x) = e_k\cdot(X_x\psi_\alpha) + e_k\cdot\frac12\Big(\sum_{i<j}A^\alpha_{ij}(X_x)e_ie_j\Big)\cdot\psi_\alpha(x) + \Big(\sum_{i<j}A^\alpha_{ij}(X_x)B_{ij}e_k\Big)\cdot\psi_\alpha(x) = e_k\cdot(\nabla_X\psi)_\alpha(x) + (\nabla_XE_k)_\alpha(x)\cdot\psi_\alpha(x),$$
and this is precisely the local form of the right-hand side of (11.10). For the first identity we used that $(E_k\cdot\psi)_\alpha = e_k\cdot\psi_\alpha$, and in the second we used (11.11) as well as the fact that $e_k\cdot$ is a linear map and thus commutes with $X_x$. This verifies condition 2, and hence we have shown that the spinor bundle $S(E)$ is a Dirac bundle.
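The double covering $\Lambda$ and the relation $\mathrm{Ad} = \rho\circ\Lambda$ that underlie the lifted connection can be made concrete in low dimensions. A minimal numerical sketch in the Pauli-matrix model of $Cl_{0,3}$ from before (the rotation angle and plane are arbitrary choices): it checks that $g = \cos(\theta/2) + \sin(\theta/2)\,e_1e_2 \in Spin(3)$ conjugates vectors by the rotation $\Lambda(g)$ through the angle $\theta$ in the $(e_1, e_2)$-plane, and that $g$ and $-g$ cover the same rotation.

import numpy as np

s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
e = [1j*s1, 1j*s2, 1j*s3]                      # Cl_{0,3} generators as before
I2 = np.eye(2)

theta = 0.83                                   # arbitrary rotation angle
g     = np.cos(theta/2)*I2 + np.sin(theta/2)*(e[0] @ e[1])   # element of Spin(3)
g_inv = np.cos(theta/2)*I2 - np.sin(theta/2)*(e[0] @ e[1])
assert np.allclose(g @ g_inv, I2)

# Lambda(g): rotation by theta in the (e1, e2)-plane, fixing e3.
R = np.array([[np.cos(theta), -np.sin(theta), 0],
              [np.sin(theta),  np.cos(theta), 0],
              [0, 0, 1]])
for k in range(3):
    lhs = g @ e[k] @ g_inv                     # Ad(g) acting on the vector e_k
    rhs = sum(R[j, k]*e[j] for j in range(3))
    assert np.allclose(lhs, rhs)

# -g covers the same rotation: the two-to-one nature of Lambda.
assert np.allclose((-g) @ e[0] @ (-g_inv), g @ e[0] @ g_inv)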
Definition 11.14 (Dirac Operator). Let $S$ be a Dirac bundle. The Dirac operator $D\!\!\!/\,$ is then defined as the composition
$$\Gamma(S)\xrightarrow{\ \nabla\ }\Gamma(T^*M\otimes S)\xrightarrow{\ \sharp\ }\Gamma(TM\otimes S)\xrightarrow{\ \cdot\ }\Gamma(S) \tag{11.12}$$
where the middle map is the metric identification and the last map is Clifford multiplication $\mathfrak{X}(M)\otimes\Gamma(S) \to \Gamma(S)$.

Lemma 11.15. The Dirac operator is a first order differential operator, and given a local orthonormal frame $\{E_1, \dots, E_n\}$ for $TM$ over $U \subseteq M$, the Dirac operator takes the local form
$$D\!\!\!/\,\psi|_U = \sum_{j=1}^nE_j\cdot(\nabla_{E_j}\psi). \tag{11.13}$$

Proof. Let $\varphi^i = E_i^\flat \in \Omega^1(U)$ be the metric dual of $E_i$. Suppose $\nabla\psi = \omega\otimes\psi'$ for some local section $\psi'$ of $S$; then $D\!\!\!/\,\psi = \omega^\sharp\cdot\psi'$. Locally we have $\omega = \sum_i\omega_i\varphi^i$, and $\omega^\sharp = \sum_i\omega_iE_i$. Thus $\nabla_{E_i}\psi = \omega_i\psi'$, and therefore
$$D\!\!\!/\,\psi = \omega^\sharp\cdot\psi' = \sum_{i=1}^n(\omega_iE_i)\cdot\psi' = \sum_{i=1}^nE_i\cdot(\omega_i\psi') = \sum_{i=1}^nE_i\cdot(\nabla_{E_i}\psi).$$
From this formula, and the fact that $\nabla_{E_i}$ is a first order differential operator, it is evident that $D\!\!\!/\,$ is also a first order differential operator.

As a differential operator, we can compute the symbol of the Dirac operator:

Proposition 11.16. Let $\xi_x \in T^*_xM$, and let $\xi^\sharp_x\cdot : S_x \to S_x$ be Clifford multiplication with the metric dual $\xi^\sharp_x$ of $\xi_x$. Then we have
$$\sigma(D\!\!\!/\,)(\xi_x) = i\,\xi^\sharp_x\cdot \qquad\text{and}\qquad \sigma(D\!\!\!/\,^2)(\xi_x) = |\xi_x|^2.$$
Thus $D\!\!\!/\,^2$ (called the Dirac Laplacian) is a generalized Laplacian, and hence both $D\!\!\!/\,$ and $D\!\!\!/\,^2$ are elliptic.

Proof. Locally $D\!\!\!/\, = \sum_jE_j\cdot\nabla_{E_j}$. Choose normal coordinates $(x^1, \dots, x^n)$ centered at $x$, such that $\partial_j|_x = E_j(x)$ for all $j$. Pick a trivialization of $S$ around $x$; then by Example 11.2, $\nabla_{E_j}$ acts componentwise as $\partial_j$ plus lower order terms. Thus, at $x$, the first order part of $D\!\!\!/\,$ acts as $\sum_jE_j(x)\cdot\partial_j|_x$. Let $\{\varphi^1, \dots, \varphi^n\}$ be the metric duals of the $E_j$'s and decompose $\xi_x = \sum_j\xi_j\varphi^j|_x$. By formally replacing $\partial_j$ with $i\xi_j$ we get
$$\sigma(D\!\!\!/\,)(\xi_x) = i\sum_{j=1}^n\xi_jE_j(x)\cdot = i\,\xi^\sharp_x\cdot\,.$$
For $D\!\!\!/\,^2$, Proposition 11.7 says
$$\sigma(D\!\!\!/\,^2)(\xi_x) = \sigma(D\!\!\!/\,)(\xi_x)\sigma(D\!\!\!/\,)(\xi_x) = -\xi^\sharp_x\cdot\xi^\sharp_x\cdot = |\xi_x|^2,$$
using the Clifford relation $(\xi^\sharp_x)^2 = -|\xi_x|^2$. Clifford multiplication by $\xi^\sharp_x$ has as inverse multiplication by $-\xi^\sharp_x/|\xi_x|^2$ (as long as $\xi_x \neq 0$); thus the Dirac operator is elliptic.

Proposition 11.17. The Dirac operator is formally self-adjoint, i.e. for $\psi_1, \psi_2 \in \Gamma_c(S)$ (sections of $S$ with compact support)
$$(D\!\!\!/\,\psi_1 \mid \psi_2) = (\psi_1 \mid D\!\!\!/\,\psi_2). \tag{11.14}$$
Proof. First we consider the fiberwise inner product $\langle D\!\!\!/\,\psi_1, \psi_2\rangle_x$ at some point $x \in M$. Let $\{E_1, \dots, E_n\}$ be an orthonormal frame such that $\nabla E_j(x) = 0$ at this particular $x$. Then
$$\langle D\!\!\!/\,\psi_1, \psi_2\rangle_x = \sum_{j=1}^n\langle E_j\cdot\nabla_{E_j}\psi_1, \psi_2\rangle_x = -\sum_{j=1}^n\langle\nabla_{E_j}\psi_1(x), E_j(x)\cdot\psi_2(x)\rangle = -\sum_{j=1}^nE_j(x)\langle\psi_1, E_j\cdot\psi_2\rangle + \sum_{j=1}^n\langle\psi_1(x), \nabla_{E_j}(E_j\cdot\psi_2)(x)\rangle = -\sum_{j=1}^nE_j(x)\langle\psi_1, E_j\cdot\psi_2\rangle + \sum_{j=1}^n\langle\psi_1(x), E_j(x)\cdot\nabla_{E_j}\psi_2(x)\rangle,$$
where we used skew-adjointness of Clifford multiplication, metric compatibility and, in the last step, $\nabla_{E_j}E_j(x) = 0$.

Secondly, we need a formula involving the divergence. Let $X$ be the global vector field determined uniquely by
$$\langle X, Y\rangle = -\langle\psi_1, Y\cdot\psi_2\rangle\quad\text{for all }Y \in \mathfrak{X}(M).$$
Then at $x$ (the first identity following from [Lee] Problem 5-6)
$$(\operatorname{div}X)_x = \sum_{j=1}^n\langle\nabla_{E_j}X, E_j\rangle_x = \sum_{j=1}^n\big(E_j(x)\langle X, E_j\rangle - \langle X(x), \nabla_{E_j}E_j(x)\rangle\big) = \sum_{j=1}^nE_j(x)\langle X, E_j\rangle = -\sum_{j=1}^nE_j(x)\langle\psi_1, E_j\cdot\psi_2\rangle. \tag{11.15}$$
Plugging this into the expression above, we get
$$\langle D\!\!\!/\,\psi_1, \psi_2\rangle_x = (\operatorname{div}X)_x + \langle\psi_1, D\!\!\!/\,\psi_2\rangle_x,$$
and as $x$ was chosen arbitrarily, this holds for every $x$. Since $M$ has no boundary, it is a consequence of the divergence formula that integration of this identity yields (11.14).

Now it is clear that $D\!\!\!/\,^2 = D\!\!\!/\,^*D\!\!\!/\, = D\!\!\!/\,D\!\!\!/\,^*$, and this explains the term "Dirac type operator" introduced in the previous section.

Example 11.18. Dirac Operator of the Exterior Bundle. In this example we calculate the Dirac operator of the Dirac bundle $\Lambda^*T^*M$ and show that this operator is none other than the Hodge-de Rham operator. Let $\{E_1, \dots, E_n\}$ denote a local orthonormal frame for $TM$. We want to show the following local identities:
$$d = \sum_{j=1}^nE_j^\flat\wedge\nabla_{E_j}, \tag{11.16a}$$
$$d^* = -\sum_{j=1}^nc(E_j)\nabla_{E_j}. \tag{11.16b}$$
First we note that the expressions are defined independently of the choice of orthonormal frame. To be more precise, suppose $\tilde E_j = \sum_iA^i_jE_i$ is another orthonormal frame over the same neighborhood (i.e. $A^i_j$ is an $O(n)$-valued smooth
function); then
$$\sum_{j=1}^n\tilde E_j^\flat\wedge\nabla_{\tilde E_j} = \sum_{i,j,k=1}^nA^i_jE_i^\flat\wedge A^k_j\nabla_{E_k} = \sum_{i,k=1}^n(AA^T)_{ik}\,E_i^\flat\wedge\nabla_{E_k} = \sum_{i=1}^nE_i^\flat\wedge\nabla_{E_i}.$$
Thus we can choose whatever orthonormal frame suits our needs; it makes no difference. To show the actual identity (11.16a), we have to show that the right-hand side satisfies the axioms defining $d$, i.e. that $df(X) = Xf$ for any vector field $X$, that $d(\omega\wedge\eta) = d\omega\wedge\eta + (-1)^{|\omega|}\omega\wedge d\eta$, and that $d\circ d = 0$. The first one is easy: one readily checks that $df = \sum_j(E_jf)E_j^\flat$, and hence
$$\sum_{j=1}^nE_j^\flat\wedge\nabla_{E_j}f = \sum_{j=1}^n(E_jf)E_j^\flat = df;$$
thus the first axiom is satisfied. Axiom 2 is also satisfied:
$$\sum_{j=1}^nE_j^\flat\wedge\nabla_{E_j}(\omega\wedge\eta) = \sum_{j=1}^nE_j^\flat\wedge\big((\nabla_{E_j}\omega)\wedge\eta + \omega\wedge(\nabla_{E_j}\eta)\big) = \Big(\sum_{j=1}^nE_j^\flat\wedge\nabla_{E_j}\omega\Big)\wedge\eta + (-1)^{|\omega|}\omega\wedge\Big(\sum_{j=1}^nE_j^\flat\wedge\nabla_{E_j}\eta\Big)$$
(induction is tacit in these calculations). Checking axiom 3 is the hardest part. Let $p \in U$ and pick an orthonormal frame $(E_j)$ such that $\nabla E_j(p) = 0$, and hence also $\nabla E_j^\flat(p) = 0$. To simplify things we check this axiom on the form $fE_1^\flat\wedge\cdots\wedge E_k^\flat$ for some smooth function $f$:
$$\sum_{i=1}^nE_i^\flat\wedge\nabla_{E_i}\Big(\sum_{j=1}^nE_j^\flat\wedge\nabla_{E_j}(fE_1^\flat\wedge\cdots\wedge E_k^\flat)\Big) = \sum_{i,j=1}^nE_i^\flat\wedge(\nabla_{E_i}E_j^\flat)\wedge\nabla_{E_j}(fE_1^\flat\wedge\cdots\wedge E_k^\flat) + \sum_{i,j=1}^nE_i^\flat\wedge E_j^\flat\wedge\nabla_{E_i}\nabla_{E_j}(fE_1^\flat\wedge\cdots\wedge E_k^\flat).$$
Since $\nabla E_j(p) = 0$, most of these terms are zero when evaluated at $p$. The surviving terms are
$$\sum_{k<i<j}([E_i, E_j]f)\,E_i^\flat\wedge E_j^\flat\wedge E_1^\flat\wedge\cdots\wedge E_k^\flat.$$
By symmetry of the Levi-Civita connection we have $[E_i, E_j] = \nabla_{E_i}E_j - \nabla_{E_j}E_i$, and this is 0 at $p$. Thus the entire expression is zero when evaluated at $p$, and $p$ was chosen arbitrarily, so the third axiom is satisfied and the expression (11.16a) is verified.

Also the second expression (11.16b) is defined independently of the choice of orthonormal frame. Thus we can apply our standard trick: let $p \in M$ be arbitrary and pick an orthonormal frame $\{E_1, \dots, E_n\}$ such that $\nabla E_j(p) = 0$.
We can then derive (11.16b) from (11.16a) if we recall that $(E_j^\flat\wedge)^* = c(E_j)$ and $\nabla_{E_j}^* = -\nabla_{E_j} - \operatorname{div}E_j$, and that $(\operatorname{div}E_j)_p = \sum_i\langle\nabla_{E_i}E_j(p), E_i(p)\rangle = 0$. At $p$,
$$\langle d\omega, \eta\rangle_p = \sum_{j=1}^n\langle E_j^\flat(p)\wedge(\nabla_{E_j}\omega)(p), \eta(p)\rangle = \sum_{j=1}^n\langle\nabla_{E_j}\omega(p), c(E_j)\eta(p)\rangle = (\operatorname{div}W)_p - \sum_{j=1}^n\langle\omega(p), \nabla_{E_j}(c(E_j)\eta)(p)\rangle = (\operatorname{div}W)_p - \sum_{j=1}^n\langle\omega(p), c(E_j)(\nabla_{E_j}\eta)(p)\rangle,$$
where $W$ is the vector field determined by $\langle W, Y\rangle = \langle\omega, c(Y)\eta\rangle$ and the last identity follows from (11.8) together with $\nabla E_j(p) = 0$. Integrating over $M$ kills the divergence term, whence $(d\omega \mid \eta) = (\omega \mid -\sum_jc(E_j)\nabla_{E_j}\eta)$, which is exactly (11.16b). Finally, by (11.9) we get the Dirac operator
$$D\omega = \sum_{j=1}^nE_j\cdot(\nabla_{E_j}\omega) = \sum_{j=1}^nE_j^\flat\wedge\nabla_{E_j}\omega - \sum_{j=1}^nc(E_j)(\nabla_{E_j}\omega) = d\omega + d^*\omega,$$
and we realize that it equals the Hodge-de Rham operator.

Spin-Dirac Operator. We consider the spinor bundle $S(E)$ associated to some oriented Riemannian vector bundle $E$ carrying a spin structure. We saw earlier that this vector bundle is a Dirac bundle. Thus it carries a Dirac operator $D\!\!\!/\,$, called the spin-Dirac operator or just the Dirac operator (in [Lawson] it is called the Atiyah-Singer operator). Thanks to Lemma 11.15 and the calculations done in the previous example, we arrive at the following local description:
$$(D\!\!\!/\,\psi)_\alpha(x) = \Big(\sum_{k=1}^mE_k\cdot\nabla_{E_k}\psi\Big)_\alpha(x) = \sum_{k=1}^me_k\cdot(\nabla_{E_k}\psi)_\alpha(x) = \sum_{k=1}^me_k\cdot\Big((E_k)_x\psi_\alpha + \frac12\sum_{i<j}A^\alpha_{ij}\big((E_k)_x\big)e_ie_j\cdot\psi_\alpha(x)\Big) = \sum_{k=1}^me_k\cdot(E_k)_x\psi_\alpha + \frac12\sum_{k=1}^m\sum_{i<j}A^\alpha_{ij}\big((E_k)_x\big)\,e_ke_ie_j\cdot\psi_\alpha(x) \tag{11.17}$$
where $E_k(x) = \Phi_\alpha^{-1}(x, e_k)$ are local orthonormal vector fields (we pick the trivialization $\Phi_\alpha$ such that it preserves the metric) and $m = \dim M$.⁴

11.4 Sobolev Spaces

First of all we will introduce $L^p$-spaces; the Sobolev spaces will then be defined as certain subspaces of these. Most of the proofs in this section will be skipped; the reader is referred to [Nicolaescu] or [Lawson-Michelson].

Consider a smooth Riemannian vector bundle $E \to M$ over an oriented Riemannian manifold $(M, g)$. Both spaces $E$ and $M$ are, in particular, topological spaces, so it makes sense to talk about Borel subsets of $E$ and $M$. A (rough) section $u : M \to E$ is called measurable if $u^{-1}(B)$ is a Borel set in $M$ for each Borel set $B \subseteq E$. The set of measurable sections is denoted $\mathcal{M}(M, E)$.

The metric $g$ gives rise to a Radon measure $\mu_g$ on the Borel algebra of $M$.⁵

⁴ Note that often $(E_k)_x\psi_\alpha$ is written as $\frac{\partial\psi_\alpha}{\partial e_k}(x)$.
⁵ This is for the following reason: on a manifold one can integrate compactly supported continuous $n$-forms. Thus on a Riemannian manifold a linear form $I : C_c(M) \to \mathbb{R}$ is defined by $I(f) = \int_Mf\,dV_g$. By the Riesz Representation Theorem there exists a unique Radon measure $\mu_g$ such that $I(f) = \int_Mf\,d\mu_g$.
Definition 11.19. Let $1 \le p < \infty$. We say that a section $u \in \mathcal{M}(M, E)$ is $p$-integrable if
$$\int_M\|u(x)\|^p_E\,d\mu_g < \infty.$$
The space of equivalence classes (equivalence meaning equality $\mu_g$-a.e.) of $p$-integrable sections is denoted $L^p(M, E)$. A measurable section $u$ is called locally $p$-integrable if for each smooth compactly supported function $\varphi \in C_c^\infty(M)$ the section $\varphi u$ is $p$-integrable. The space of equivalence classes of locally $p$-integrable sections is denoted $L^p_{loc}(M, E)$.

Proposition 11.20. Under the norm
$$\|u\|_{L^p} = \Big(\int_M\|u\|^p_E\,d\mu_g\Big)^{1/p}$$
the space $L^p(M, E)$ is a Banach space. For $p = 2$ the space is a Hilbert space with the inner product
$$\langle u, v\rangle = \int_M\langle u(x), v(x)\rangle_E\,d\mu_g.$$
Moreover, $\Gamma_c(E)$ is dense in $L^p(M, E)$.

Proof. Let $(v_k)$ be a Cauchy sequence in $L^p(M, E)$ and $U \subseteq M$ a proper neighborhood. Then $(v_k|_U)$ is again a Cauchy sequence, with
$$v_k|_U = (v_k^1, \dots, v_k^m) \in L^p(U)\oplus\cdots\oplus L^p(U).$$
As $U$ is a chart neighborhood, it is isometrically diffeomorphic to an open set $V \subseteq \mathbb{R}^n$, and thus $L^p(U) \cong L^p(V)$, which is a Banach space. Hence $v_k|_U$ converges. By a partition of unity argument we obtain convergence of $v_k$; thus $L^p(M, E)$ is a Banach space. A partition of unity argument also applies to the second assertion; just recall the fact that $C_c^\infty(\Omega)$ is dense in $L^p(\Omega)$ for any open $\Omega \subseteq \mathbb{R}^n$.

If the volume of $M$ is finite, for instance if $M$ is compact, $p \le q$ implies $L^q(M, E) \subseteq L^p(M, E)$. Thus for any $M$ (finite volume or not) $L^q_{loc}(M, E) \subseteq L^p_{loc}(M, E)$. In particular, for any $p$:
$$L^p_{loc}(M, E) \subseteq L^1_{loc}(M, E). \tag{11.18}$$
Assume now that $\nabla$ is a metric connection on $E$. The tangent bundle $TM$ has the Levi-Civita connection, and via the bundle isomorphism $TM \cong T^*M$ we can turn it into a connection on $T^*M$, simply by defining $\nabla_X\omega = (\nabla_X\omega^\sharp)^\flat$. The tensor product of these two connections is a connection on $T^*M\otimes E$; it maps into $T^*M\otimes T^*M\otimes E$. Again, taking the tensor product with the Levi-Civita connection yields a connection on $T^*M\otimes T^*M\otimes E$. Hence we have maps
$$\Gamma(E)\xrightarrow{\ \nabla\ }\Gamma(T^*M\otimes E)\xrightarrow{\ \nabla\ }\Gamma(T^*M\otimes T^*M\otimes E)\xrightarrow{\ \nabla\ }\cdots$$
We let $\nabla^k$ denote the composition of the first $k$ of these maps. Furthermore, we have a fiber metric on $T^*M\otimes E$, namely the tensor product of the fiber metrics on $T^*M \cong TM$ and $E$. Specifically, if $\omega_p\otimes s_p,\ \omega'_p\otimes s'_p \in T^*_pM\otimes E_p$, then
$$\langle\omega_p\otimes s_p,\ \omega'_p\otimes s'_p\rangle := \langle\omega_p, \omega'_p\rangle\,\langle s_p, s'_p\rangle,$$
which gives the norm $\|\omega_p\otimes s_p\| = \|\omega_p\|\,\|s_p\|$ on $T^*_pM\otimes E_p$. Similarly, we get metrics on the higher bundles $(T^*M)^{\otimes k}\otimes E$.

We use this to construct two new types of Banach spaces. First, recall that a (rough) section $u$ of $E$ is said to be $k$ times differentiable if, relative to any smooth local frame $\{s_1, \dots, s_m\}$, $u$ can be written as $u = \sum u^is_i$ for some $k$ times differentiable functions $u^i$. The space of $k$ times differentiable sections of $E$ is denoted $\Gamma^k(E)$. When $M$ is compact this is also a Banach space when equipped with the norm
$$\|u\|_{\Gamma^k} := \sum_{j=0}^k\sup_{x\in M}\|\nabla^ju(x)\|_E.$$
Secondly, define for each $1 \le p < \infty$ and each $k \in \mathbb{N}_0$ the Sobolev norm $\|\cdot\|_{p,k}$ on $\Gamma_c(E)$ by
$$\|u\|_{p,k} := \Big(\sum_{j=0}^k\|\nabla^ju\|^p_{L^p}\Big)^{1/p} = \Big(\sum_{j=0}^k\int_M\|\nabla^ju\|^p_E\,d\mu_g\Big)^{1/p}.$$
The notation can be a bit misleading, for as a matter of fact the Sobolev spaces depend not only on $M$ and $E$ but also on the metric on $M$ as well as the metric and connection on $E$; thus a priori there is no canonical choice of norm!

Example 11.21. Consider the manifold $\Omega \subseteq \mathbb{R}^n$ and the trivial real line bundle $E := \Omega\times\mathbb{R}$. Equip $E$ with the trivial connection $d$ and the metric inherited from $\mathbb{R}$. Let's calculate the Sobolev norm $\|\cdot\|_{2,1}$ on $u \in \Gamma_c(E) = C_c^\infty(\Omega)$. We have
$$\|u\|^2_{2,1} = \|u\|^2_{L^2} + \|du\|^2_{L^2}.$$
Since
$$du = \frac{\partial u}{\partial x^1}dx^1 + \cdots + \frac{\partial u}{\partial x^n}dx^n$$
and since $(dx^i)$ is an orthonormal frame for $T^*\Omega$, we get
$$\|du\|^2_E = \Big|\frac{\partial u}{\partial x^1}\Big|^2 + \cdots + \Big|\frac{\partial u}{\partial x^n}\Big|^2.$$
Hence
$$\|u\|^2_{2,1} = \int_\Omega|u|^2 + \Big|\frac{\partial u}{\partial x^1}\Big|^2 + \cdots + \Big|\frac{\partial u}{\partial x^n}\Big|^2\,dx,$$
and this is just the usual Sobolev norm on $H^1(\Omega)$. Likewise, one can show that $\|\cdot\|_{2,k}$ equals the usual Sobolev norm on $H^k(\Omega)$.

Had we chosen a different connection on the trivial line bundle in the previous example, we would probably have ended up with a different norm than the usual one. In the case of a compact base manifold, however, the situation is much brighter.

Lemma 11.22. Let $M$ be a compact oriented Riemannian manifold and $E$ a vector bundle over $M$. Then the Sobolev norm is (up to equivalence) independent of the choices of metrics and connections.

Proof. First we show that the Sobolev norm $\|\cdot\|_{p,k}$ is independent of the choices of connections. Assume we have connections $\nabla$ and $\nabla'$ on $TM$ (one of them need not be the Levi-Civita connection) and connections $\nabla^E$ and $\nabla'^E$ on $E$. Let $\|\cdot\|_{p,k}$ be the Sobolev norm constructed from $\nabla$ and $\nabla^E$, and let $\|\cdot\|'_{p,k}$
be the norm constructed from $\nabla'$ and $\nabla'^E$. We want to show that a constant $C_{p,k}$ exists such that for each $\psi \in \Gamma(E)$
$$\|\psi\|'_{p,k} \le C_{p,k}\|\psi\|_{p,k}.$$
We do it by induction. Note first that for $k = 0$ we have $\|\psi\|'_{p,0} = \|\psi\|_{p,0}$, since this norm does not involve the connections; thus $C_{p,0} = 1$. Assume now that the conclusion holds for $k-1$, and note that we have
$$\|\psi\|^p_{p,k} = \|\psi\|^p_{p,0} + \|\nabla^E\psi\|^p_{p,k-1}\qquad\text{and}\qquad\|\psi\|'^p_{p,k} = \|\psi\|^p_{p,0} + \|\nabla'^E\psi\|'^p_{p,k-1}.$$
Let $A \in \Omega^1(M, \operatorname{End}(E))$ denote the difference $\nabla'^E - \nabla^E$. We estimate the term $\|\psi\|'^p_{p,k}$:
$$\|\psi\|'^p_{p,k} = \|\psi\|^p_{p,0} + \|\nabla'^E\psi\|'^p_{p,k-1} \le \|\psi\|^p_{p,0} + C^p_{p,k-1}\|\nabla^E\psi + A\psi\|^p_{p,k-1} \le \|\psi\|^p_{p,0} + 2^pC^p_{p,k-1}\big(\|\nabla^E\psi\|^p_{p,k-1} + \|A\psi\|^p_{p,k-1}\big) \le (1 + 2^pC^p_{p,k-1})\|\psi\|^p_{p,k} + 2^pC^p_{p,k-1}\|A\psi\|^p_{p,k-1}.$$
We may assume that $A\psi = \alpha\otimes\psi$ for some 1-form $\alpha$. By the Leibniz rule (which can easily be extended to hold for $\nabla^k$) we get
$$\nabla^{k-1}(\alpha\otimes\psi) = \sum_{i=0}^{k-1}\binom{k-1}{i}(\nabla^{k-1-i}\alpha)\otimes(\nabla^i\psi),$$
and by this we can estimate $\|\alpha\otimes\psi\|^p_{p,k-1}$:
$$\|\alpha\otimes\psi\|^p_{p,k-1} = \sum_{j=0}^{k-1}\int_M\Big\|\sum_{i=0}^j\binom{j}{i}(\nabla^{j-i}\alpha)\otimes(\nabla^i\psi)\Big\|^p_E\,d\mu_g \le \sum_{j=0}^{k-1}\int_M\Big(\sum_{i=0}^j\binom{j}{i}\|\nabla^{j-i}\alpha\|\,\|\nabla^i\psi\|\Big)^p\,d\mu_g \le 2^{kp}\|\alpha\|^p_{\Gamma^{k-1}}\sum_{j=0}^{k-1}\int_M\Big(\sum_{i=0}^j\|\nabla^i\psi\|\Big)^p\,d\mu_g \le C'\,\|\alpha\|^p_{\Gamma^{k-1}}\|\psi\|^p_{p,k-1},$$
where in the third estimate we used that $\binom{j}{i} \le 2^k$, and $C'$ depends only on $p$ and $k$. Combining this with the inequality above, we can find the desired constant $C_{p,k}$ such that $\|\cdot\|'_{p,k}$ is dominated by $\|\cdot\|_{p,k}$. Thus the Sobolev norm is independent of the choice of connections, up to equivalence.

Next we check that the norms are independent of the choices of metrics. A change of metric on $M$ has two effects: first, it changes the Levi-Civita connection, but by the above this yields an equivalent norm; secondly, it changes the measure by a positive function $\varphi \in C^\infty(M)$, i.e. $\int_Mf\,d\mu_{g'} = \int_M\varphi f\,d\mu_g$. But as $M$ is compact, $\varphi$ is bounded above and below by positive constants, and so the new Sobolev norm is equivalent to the original one. The effect of a change of metric on $E$ is likewise just a positive bounded function.

For this reason, on compact manifolds the following definition is well-posed.

Definition 11.23 (Sobolev Space). Let $1 \le p < \infty$ and $k \in \mathbb{N}_0$, and assume $E$ is a Riemannian vector bundle over a compact, orientable Riemannian manifold $M$. Then define the Sobolev space $W^{p,k}(M, E)$ as the completion of $\Gamma(E)$ in the Sobolev norm $\|\cdot\|_{p,k}$.
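Before moving on, note that the classical norm of Example 11.21 can also be computed spectrally, which previews the chart-based $H^s$ norms introduced below. A minimal numerical sketch on the circle (the test function is an arbitrary smooth periodic choice) checking Parseval's identity $\int u^2 + |u'|^2\,dx = 2\pi\sum_k(1 + k^2)|\hat u_k|^2$:

import numpy as np

N = 256
x = 2*np.pi*np.arange(N)/N                    # periodic grid on the circle
u = np.exp(np.sin(x))                         # arbitrary smooth periodic function

k = np.fft.fftfreq(N, d=1.0/N)                # integer frequencies
uhat = np.fft.fft(u)/N                        # Fourier coefficients
du = np.fft.ifft(1j*k*uhat*N).real            # spectral derivative

# Physical-space H^1 norm squared, via the (spectrally accurate) trapezoid rule.
h1_phys = (2*np.pi/N)*np.sum(u**2 + du**2)

# Spectral H^1 norm squared: 2*pi * sum (1 + k^2) |uhat_k|^2.
h1_spec = 2*np.pi*np.sum((1 + k**2)*np.abs(uhat)**2)

assert np.isclose(h1_phys, h1_spec)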
With the Sobolev norm these spaces are Banach spaces. The case $p = 2$ is particularly interesting, for in this case the spaces $H^k(M, E) := W^{2,k}(M, E)$ (there should be no risk of confusing this with the cohomology groups) are Hilbert spaces when equipped with the inner product
$$\langle u, v\rangle_k = \sum_{j=0}^k\langle\nabla^ju, \nabla^jv\rangle_{L^2} = \sum_{j=0}^k\int_M\langle\nabla^ju, \nabla^jv\rangle_E\,d\mu_g.$$
The Sobolev norm in this case will be written $\|\cdot\|_k$ instead of $\|\cdot\|_{2,k}$. Note that the Sobolev norm $\|\cdot\|_{p,0}$ is just the $L^p$-norm on $\Gamma(E)$; thus the zeroth Sobolev space $W^{p,0}(M, E)$ equals $L^p(M, E)$ for all $p$. In particular $H^0(M, E) = L^2(M, E)$.

There is another approach to defining Sobolev spaces, one that relates to the classical Sobolev spaces and norms over $\mathbb{R}^n$ and which makes it possible for us to define Sobolev spaces not just for $k \in \mathbb{N}_0$ but for $s \in \mathbb{R}$. Sobolev spaces of non-integer order are important in the study of pseudo-differential operators.

Continue to let $M$ denote a compact oriented Riemannian manifold and $E$ a smooth rank $N$ vector bundle over $M$. Pick a finite atlas $(U_i, \varphi_i)_{i=1}^J$ for $M$ for which each open set $U_i$ is a trivialization neighborhood of $E$, and let $(\rho_i)$ be a partition of unity with compact support subordinate to this cover. For any $\psi \in \Gamma(E)$ we have that $(\rho_i\psi)\circ\varphi_i^{-1}$ is a smooth map on $\mathbb{R}^n$ with compact support in $\varphi_i(U_i)$. Since $E$ is trivial over the support of $\rho_i\psi$, we may view $(\rho_i\psi)\circ\varphi_i^{-1}$ as a smooth compactly supported map $\mathbb{R}^n \to \mathbb{C}^N$. Thus for any $s \in \mathbb{R}$ it is an element of $H^s(\mathbb{R}^n)\oplus\cdots\oplus H^s(\mathbb{R}^n)$, and we define the Sobolev norm by
$$\|\psi\|_s := \sum_{i=1}^J\|(\rho_i\psi)\circ\varphi_i^{-1}\|_{H^s(\mathbb{R}^n)^N},$$
where of course the norm on the right-hand side is the product norm. We may therefore, for any $s \in \mathbb{R}$, define the Sobolev space $H^s(M, E)$ as the closure of $\Gamma(E)$ in the norm $\|\cdot\|_s$.

For $s = k \in \mathbb{N}_0$ this new Sobolev norm is equivalent to the one defined in terms of connections. Note first that, since $\varphi_i$ is a diffeomorphism, there is a constant $C$ with
$$C^{-1}\|\rho_i\psi\|_{2,k} \le \|(\rho_i\psi)\circ\varphi_i^{-1}\|_{H^k(\mathbb{R}^n)^N} \le C\|\rho_i\psi\|_{2,k}.$$
Hence we get
$$\|\psi\|_k = \sum_{i=1}^J\|(\rho_i\psi)\circ\varphi_i^{-1}\|_{H^k(\mathbb{R}^n)^N} \le C\sum_{i=1}^J\|\rho_i\psi\|_{2,k} \le JC'\|\psi\|_{2,k},$$
and the other way round:
$$\|\psi\|_{2,k} = \Big\|\sum_{i=1}^J\rho_i\psi\Big\|_{2,k} \le \sum_{i=1}^J\|\rho_i\psi\|_{2,k} \le C\sum_{i=1}^J\|(\rho_i\psi)\circ\varphi_i^{-1}\|_{H^k(\mathbb{R}^n)^N} = C\|\psi\|_k.$$
Thus $\|\cdot\|_k$ and $\|\cdot\|_{2,k}$ are equivalent. Note how this implies that $\|\cdot\|_k$ is (at least up to equivalence) independent of the choices of atlas and partition of unity.

Given this link to the classical Sobolev spaces, it should come as no surprise that many of the results there carry over verbatim to the manifold case. One of the most important such results is of course the Sobolev Embedding Theorem:
Theorem 11.24 (Sobolev Embedding Theorem). Let $M$ be a compact orientable Riemannian manifold of dimension $n$. For non-negative integers $k$ and $m$ such that $k - m > \frac{n}{2}$ we have $H^k(M, E) \subseteq \Gamma^m(E)$; more precisely, each class $u \in H^k(M, E)$ has a representative in $\Gamma^m(E)$. Moreover, the inclusion is a continuous operator.

The relations among the Sobolev spaces are given by the Rellich Lemma:

Theorem 11.25 (Rellich Lemma). If $s \le t$ then $H^t(M, E) \subseteq H^s(M, E)$, and if the inequality is strict, the inclusion is a compact operator.

When we have a differential operator $A : \Gamma(E) \to \Gamma(F)$ of order $k$ (let $N_1$ and $N_2$ denote the ranks of $E$ and $F$ respectively), we can extend it to Sobolev spaces. The way to do it is to show that it is continuous w.r.t. the Sobolev norms. Picking a cover of proper charts $(U_i, \varphi_i)$ and a partition of unity $(\rho_i)$, we get for any $u \in \Gamma(E)$:
$$\|Au\|_{s-k} = \sum_i\|(\rho_iAu)\circ\varphi_i^{-1}\|_{H^{s-k}(\mathbb{R}^n)^{N_2}}.$$
But over $U_i$ the operator $A$ acts as a matrix of usual differential operators on $\mathbb{R}^n$, and we know that a differential operator $\sum a_\alpha\partial^\alpha$ on $\mathbb{R}^n$ with coefficient functions in $\mathcal S(\mathbb{R}^n)$ is continuous $H^s(\mathbb{R}^n) \to H^{s-k}(\mathbb{R}^n)$ for any $s$. Thus for each $i$ we can find a constant $C_i$ (depending of course on $s$) such that
$$\|(\rho_iAu)\circ\varphi_i^{-1}\|_{H^{s-k}(\mathbb{R}^n)^{N_2}} \le C_i\|(\rho_iu)\circ\varphi_i^{-1}\|_{H^s(\mathbb{R}^n)^{N_1}}.$$
Letting $C$ denote the maximum of the $C_i$'s, we see that
$$\|Au\|_{s-k} \le C\|u\|_s.$$
Consequently:

Proposition 11.26. A differential operator $A : \Gamma(E) \to \Gamma(F)$ of order $k$ over a compact, oriented Riemannian manifold extends to a continuous operator $A : H^s(M, E) \to H^{s-k}(M, F)$ for each $s \in \mathbb{R}$.

One can easily show that the Hilbert space adjoint $A^* : H^{s-k}(M, F) \to H^s(M, E)$ is exactly the extension of the formal adjoint $A^*$ to $H^{s-k}(M, F)$; thus the use of the star is unambiguous.

When working with an order $k$ differential equation like $Au = v$, where $v$ is some fixed smooth section or an element of a Sobolev space, one is interested in how nicely $u$ behaves when compared to $v$. Unfortunately, one cannot conclude that if $v \in \Gamma^m(F)$ then $u \in \Gamma^{k+m}(E)$, but if $A$ is elliptic we have something similar:

Theorem 11.27 (Elliptic Regularity). Let $A$ be an elliptic differential operator of order $k$. If $Au = v$ for some $v \in H^s(M, F)$, then $u \in H^{s+k}(M, E)$.

Thus for elliptic operators Sobolev spaces are the natural spaces to work with. Combining elliptic regularity with the Sobolev Embedding Theorem, one can relate differentiability of $u$ to differentiability of $v$, though not as schematically as one could have hoped. However, we do have the following strong consequence:

Corollary 11.28 (Weyl Lemma). If $A$ is an elliptic order $k$ differential operator and $Au = v$ for $v \in \Gamma(F)$, then $u \in \Gamma(E)$. In particular, the kernel of $A : H^s(M, E) \to H^{s-k}(M, F)$ is a subspace of $\Gamma(E)$ and is thus independent of $s$.
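The gain of derivatives in Theorem 11.27 is easy to see on the circle, where the elliptic operator $1 - d^2/dx^2$ acts on Fourier coefficients as multiplication by $1 + k^2$. A minimal sketch (periodic setting, randomly chosen right-hand side; the helper name h_norm is ours) comparing Fourier $H^s$ norms of $v$ and of the solution $u$ of $(1 - d^2/dx^2)u = v$:

import numpy as np

N = 512
k = np.fft.fftfreq(N, d=1.0/N)                # integer frequencies on the circle

def h_norm(chat, s):
    """Fourier H^s norm: (sum (1 + k^2)^s |c_k|^2)^(1/2)."""
    return np.sqrt(np.sum((1 + k**2)**s * np.abs(chat)**2))

rng = np.random.default_rng(0)
# A right-hand side v that is barely in H^0: coefficients ~ (1+|k|)^(-0.51).
vhat = rng.standard_normal(N) * (1 + np.abs(k))**(-0.51)

# Solve (1 - d^2/dx^2) u = v spectrally: uhat = vhat / (1 + k^2).
uhat = vhat / (1 + k**2)

# u gains exactly two derivatives: its H^(s+2) norm equals the H^s norm of v.
for s in (-1.0, 0.0, 0.7):
    assert np.isclose(h_norm(uhat, s + 2), h_norm(vhat, s))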
Change of focus. In the rest of this section we will be concerned with so-called Fredholm operators and their theory. We present only some basic facts on Fredholm operators; for proofs we refer the reader to [Conway], an excellent presentation of the theory.

Definition 11.29 (Fredholm Operator). A linear operator $A : V_1 \to V_2$ between vector spaces is called a Fredholm operator if $\ker A$ and $\operatorname{coker}A := V_2/\operatorname{im}A$ are finite-dimensional. The Fredholm index of a Fredholm operator is then defined as
$$\operatorname{ind}(A) := \dim\ker A - \dim\operatorname{coker}A.$$

The main result, which we present later in this section (without proof, though), is that elliptic differential operators are Fredholm operators. However, before that we want to describe some key features of Fredholm operators and their indices. First, the so-called Atkinson Theorem, which gives some equivalent conditions for a bounded operator to be Fredholm:

Theorem 11.30 (Atkinson). Let $A : H_1 \to H_2$ be a bounded operator between Hilbert spaces. Then the following conditions are equivalent:
1) $A$ is Fredholm.
2) $\operatorname{im}A$ is closed and $\ker A$ and $\ker A^*$ are finite-dimensional.
3) There exists a bounded operator $B : H_2 \to H_1$ (called a parametrix) such that $I - AB$ and $I - BA$ are compact operators.

The set of bounded Fredholm operators $H_1 \to H_2$ is denoted $\mathcal F(H_1, H_2)$. This is an open set inside $\mathcal B(H_1, H_2)$ when the latter is given the norm topology. The Fredholm index turns out to be an impressively robust and well-behaved quantity.

Proposition 11.31. Let $A \in \mathcal F(H_1, H_2)$. Then:
1) The adjoint $A^*$ is again a Fredholm operator, and $\operatorname{ind}A^* = -\operatorname{ind}A$.
2) If $B \in \mathcal F(H_2, H_3)$ is a Fredholm operator, then $BA$ is a Fredholm operator and $\operatorname{ind}(BA) = \operatorname{ind}(A) + \operatorname{ind}(B)$.
3) If $K : H_1 \to H_2$ is a compact operator, then $A + K$ is a bounded Fredholm operator and $\operatorname{ind}(A + K) = \operatorname{ind}(A)$.
4) The Fredholm index is a continuous map $\operatorname{ind} : \mathcal F(H_1, H_2) \to \mathbb{Z}$; in particular it is constant on path components of $\mathcal F(H_1, H_2)$.

With the basic properties of Fredholm operators in place, we can return to the study of differential operators:

Theorem 11.32. An elliptic differential operator $A : H^s(M, E) \to H^{s-k}(M, F)$ is Fredholm.

As noted in the Weyl Lemma, $\ker A \subseteq \Gamma(E)$ is independent of the Sobolev space on which we choose to represent $A$; likewise with $\ker A^*$. Thus $\operatorname{ind}A = \dim\ker A - \dim\ker A^*$ is independent of the choice of Sobolev space, and therefore we can talk unambiguously about the index of an elliptic differential operator.

Let $A_t : \Gamma(E) \to \Gamma(F)$, $t \in [0,1]$, be a family of differential operators. We call this a continuous family of differential operators if, for any proper neighborhood
$U$, the coefficients of $\partial^\alpha$ in the local representative $(A_t)_U$ depend continuously on $t$. Two elliptic operators $A_0$ and $A_1$ for which there exists a continuous family $A_t$ of elliptic operators joining them are said to be homotopic. Let $J_m \subseteq [0,1]$ be the set of $t \in [0,1]$ for which $A_t$ is of order at least $m$. By continuity of the family, $J_m$ is an open set. Since $[0,1]$ is compact, there is a maximum $k = \max\{m \mid J_m \neq \emptyset\}$, and thus we can regard all the operators $A_t$ as operators of order $k$. Given such a continuous family of differential operators and a real number $s$, one can represent each operator as a continuous map $A_t : H^s(M, E) \to H^{s-k}(M, F)$, and one can show that $t \mapsto A_t$ is continuous in the norm topology on $\mathcal B(H^s(M, E), H^{s-k}(M, F))$. Combining with property 4) of Proposition 11.31 we get:

Corollary 11.33. The indices of two homotopic elliptic differential operators are equal.

Corollary 11.34. Two elliptic differential operators $A_0$ and $A_1$ having identical principal symbols satisfy $\operatorname{ind}A_0 = \operatorname{ind}A_1$.

Proof. Let $A_t = (1-t)A_0 + tA_1$; then this is a continuous family of differential operators, and obviously $\sigma(A_t) = \sigma(A_0)$, so each $A_t$ is elliptic. Thus $A_0$ and $A_1$ are homotopic, and they have the same index by the preceding corollary.

11.5 Elliptic Complexes

In the next chapter we will uncover a connection between differential operators and K-theory, a connection which is vital for index theory. As a preparation for this, we introduce in this section the notion of an elliptic complex. As an extra bonus we obtain a generalization of the classical Hodge Theorem.

Let's first discuss the following simple case. Consider the finite complex
$$0\to V^0\xrightarrow{d_0}V^1\xrightarrow{d_1}\cdots\xrightarrow{d_{n-1}}V^n\to0$$
of finite-dimensional inner product spaces and linear maps. Define
$$\mathcal H^k(V) := \ker d_k\cap\ker d^*_{k-1}\qquad\text{and}\qquad\mathcal H^*(V) := \bigoplus_{k=0}^n\mathcal H^k(V);$$
these are the harmonic elements of the complex. Furthermore, let $H^k(V)$ denote the cohomology of the complex, i.e. $H^k(V) := \ker d_k/\operatorname{im}d_{k-1}$. We put $V := \bigoplus V^k$, and define the map $d := \bigoplus d_i : V \to V$ as well as the map $\Delta := (d + d^*)^2$.

Lemma 11.35. The quotient map
$$q_k : \mathcal H^k(V) \to H^k(V)$$
is a linear isomorphism. Furthermore,
$$\mathcal H^*(V) = \ker\Delta = \ker(d + d^*).$$
In particular, the complex is exact if and only if $d + d^* : V \to V$ is an isomorphism.
Proof. First, let's show that $q_k$ is an isomorphism. To show injectivity, suppose $v \in \ker d_k\cap\ker d^*_{k-1}$ satisfies $q_k(v) = 0$, i.e. $v = d_{k-1}u$ for some $u \in V^{k-1}$. Since $v \in \ker d^*_{k-1}$ we have $0 = d^*_{k-1}v = d^*_{k-1}d_{k-1}u$, and therefore
$$0 = \langle d^*_{k-1}d_{k-1}u, u\rangle = \|d_{k-1}u\|^2,$$
i.e. $v = d_{k-1}u = 0$; thus $q_k$ is injective. Showing that it is surjective is a bit harder. Let $[v] \in H^k(V)$ be a cohomology class represented by $v$. We can identify this equivalence class with the affine subspace
$$C_v := \{v + d_{k-1}u \mid u \in V^{k-1}\} \subseteq V^k.$$
Let $v_0$ be the unique point in $C_v$ with minimal norm⁶. We claim that $d^*_{k-1}v_0 = 0$. To see this, let $u \in V^{k-1}$ be arbitrary and consider the smooth function $f_u : \mathbb{R} \to [0,\infty)$ given by $f_u(t) = \|v_0 + td_{k-1}u\|^2$. Since $v_0 + td_{k-1}u \in C_v$ and $v_0$ was the element of $C_v$ with minimal norm, $f_u$ must have a minimum at 0. Thus $f'_u(0) = 0$, i.e.
$$0 = \frac{d}{dt}\Big|_{t=0}\langle v_0 + td_{k-1}u,\ v_0 + td_{k-1}u\rangle = \langle d_{k-1}u, v_0\rangle + \langle v_0, d_{k-1}u\rangle = 2\operatorname{Re}\langle d^*_{k-1}v_0, u\rangle.$$
Replacing $u$ by $d^*_{k-1}v_0$, we see that $d^*_{k-1}v_0 = 0$. Moreover $d_k(v_0) = d_k(v + d_{k-1}u) = d_kv = 0$. Thus $v_0$ lies in $\mathcal H^k(V)$, and since $v_0 \in C_v$ we see that $[v] = [v_0] = q_k(v_0)$. Hence $q_k : \mathcal H^k(V) \to H^k(V)$ is surjective.

Next we show the identity $\ker(d + d^*) = \ker\Delta$. The inclusion $\subseteq$ is obvious. Conversely, assume that $v \in \ker\Delta$; then, since $d + d^*$ is self-adjoint, we get
$$0 = \langle(d + d^*)^2v, v\rangle = \langle(d + d^*)v, (d + d^*)v\rangle = \|(d + d^*)v\|^2,$$
i.e. $v \in \ker(d + d^*)$. The identity $\ker(d + d^*) = \mathcal H^*(V)$ follows from the decomposition
$$\ker(d + d^*) = \ker d_0\oplus(\ker d_1\cap\ker d^*_0)\oplus\cdots\oplus\ker d^*_{n-1};$$
indeed, $\Delta$ preserves degrees, so $\ker(d + d^*) = \ker\Delta$ is the direct sum of its homogeneous parts, and for homogeneous $v$ the condition $(d + d^*)v = 0$ forces $dv = 0$ and $d^*v = 0$ separately, since they sit in different degrees.

Definition 11.36 (Elliptic Complex). Let $E^0, \dots, E^n$ be complex Riemannian vector bundles over $M$, and let $D_k : \Gamma(E^k) \to \Gamma(E^{k+1})$ be differential operators such that $D_kD_{k-1} = 0$. The complex
$$0\to\Gamma(E^0)\xrightarrow{D_0}\Gamma(E^1)\xrightarrow{D_1}\cdots\xrightarrow{D_{n-1}}\Gamma(E^n)\to0 \tag{11.19}$$
is called an elliptic complex if the complex
$$0\to\pi^*E^0\xrightarrow{\sigma(D_0)}\pi^*E^1\xrightarrow{\sigma(D_1)}\cdots\xrightarrow{\sigma(D_{n-1})}\pi^*E^n\to0$$
over the cotangent bundle $\pi : T^*M \to M$ is exact off the zero section of $T^*M$.

⁶ It is a standard result in Hilbert space theory that if $A$ is a closed convex subset of a Hilbert space $V$ and $q$ is a point in $V$, there exists a unique point $p \in A$ such that $\|p - q\| = \operatorname{dist}(q, A)$. This applies to the situation here, the Hilbert space being of course $V^k$ and the convex set being $C_v$.
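Lemma 11.35 is easy to test numerically. A minimal sketch (random matrices with arbitrary dimensions; all names are ours) builds a two-step complex $V^0 \to V^1 \to V^2$ with $d_1d_0 = 0$ and compares $\dim\mathcal H^k = \dim\ker\Delta|_{V^k}$ with the Betti numbers $\dim\ker d_k - \operatorname{rank}d_{k-1}$:

import numpy as np

rng = np.random.default_rng(1)
n0, n1, n2 = 4, 7, 3
d0 = rng.standard_normal((n1, n0))
B  = rng.standard_normal((n2, n1))
# Force d1 d0 = 0 by composing B with the orthogonal projection onto (im d0)^perp.
q, _ = np.linalg.qr(d0)                       # orthonormal basis of im d0
d1 = B @ (np.eye(n1) - q @ q.T)
assert np.allclose(d1 @ d0, 0)

def dim_ker(A):
    return A.shape[1] - np.linalg.matrix_rank(A)

# Betti numbers dim ker d_k - rank d_{k-1} (with d_{-1} = 0, d_2 = 0).
betti = [dim_ker(d0),
         dim_ker(d1) - np.linalg.matrix_rank(d0),
         n2 - np.linalg.matrix_rank(d1)]

# Harmonic elements: ker of Delta restricted to each degree.
deltas = [d0.T @ d0,
          d0 @ d0.T + d1.T @ d1,
          d1 @ d1.T]
harmonic = [dim_ker(D) for D in deltas]

assert betti == harmonic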
Define the bundles
$$E := \bigoplus_{k=0}^nE^k,\qquad E^+ := \bigoplus_kE^{2k},\qquad E^- := \bigoplus_kE^{2k+1},$$
let $\hat D$ denote the differential operator $\bigoplus_kD_k : \Gamma(E) \to \Gamma(E)$, and put $D := \hat D + \hat D^*$ as well as $\Delta := D^2$. Furthermore, we let $D^+ : \Gamma(E^+) \to \Gamma(E^-)$ denote the restriction of $D$ to $\Gamma(E^+)$.

Proposition 11.37. The complex (11.19) is elliptic if and only if the operator $D : \Gamma(E) \to \Gamma(E)$ is an elliptic differential operator.

Proof. Let $0 \neq \xi_p \in T^*_pM$. It is obvious from the properties of the symbol (cf. Proposition 11.7) that
$$\sigma(D)(\xi_p) = \sum_{k=0}^n\sigma(D_k)(\xi_p) + \Big(\sum_{k=0}^n\sigma(D_k)(\xi_p)\Big)^*.$$
From Lemma 11.35 we see that the complex of vector spaces
$$0\to E^0_p\xrightarrow{\sigma(D_0)(\xi_p)}E^1_p\xrightarrow{\sigma(D_1)(\xi_p)}\cdots\xrightarrow{\sigma(D_{n-1})(\xi_p)}E^n_p\to0$$
is exact if and only if $\sigma(D)(\xi_p)$ is an isomorphism, i.e. the complex (11.19) is elliptic if and only if the operator $D$ is elliptic.

We are now ready to state and prove the promised generalized Hodge Theorem. As in the setup for Lemma 11.35, we let
$$\mathcal H^k(E^\bullet) := \ker D_k\cap\ker D^*_{k-1}\qquad\text{and}\qquad\mathcal H^*(E^\bullet) := \bigoplus_{k=0}^n\mathcal H^k(E^\bullet)$$
denote the harmonic sections of $E^k$, resp. $E$. It is easy to see that $\ker\Delta = \ker D = \mathcal H^*(E^\bullet)$. If the complex is elliptic, then so is the operator $D$; in particular it is Fredholm, implying that $\ker D$ (and thus also $\mathcal H^*(E^\bullet)$) is finite-dimensional. Furthermore, we let $H^k(E^\bullet) := \ker D_k/\operatorname{im}D_{k-1}$ denote the $k$-th cohomology group of the complex. If the vector spaces $H^k(E^\bullet)$ are finite-dimensional, the integer $\dim H^k(E^\bullet)$ (the dimension taken over $\mathbb{R}$ if the bundles $E^k$ are real, and over $\mathbb{C}$ if they are complex) is called the $k$-th Betti number of the complex, and we define the Euler characteristic $\chi(E^\bullet)$ of the complex to be the alternating sum of the Betti numbers:
$$\chi(E^\bullet) := \sum_{k=0}^n(-1)^k\dim H^k(E^\bullet).$$
One of the statements of the following theorem is that the cohomology groups of an elliptic complex are always finite-dimensional.

Theorem 11.38 (Generalized Hodge). For an elliptic complex
$$0\to\Gamma(E^0)\xrightarrow{D_0}\Gamma(E^1)\xrightarrow{D_1}\cdots\xrightarrow{D_{n-1}}\Gamma(E^n)\to0,$$
the natural quotient map $q_k : \mathcal H^k(E^\bullet) \to H^k(E^\bullet)$ is a linear isomorphism, i.e. any cohomology class is represented by exactly one harmonic section. Consequently, the cohomology groups of the complex are finite-dimensional.
Proof. Almost every argument from the proof of Lemma 11.35 carries over to this general case, except the one proving surjectivity of $q_k$. Since the spaces $\Gamma(E_k)$ are no longer finite-dimensional or complete, we have to proceed differently. The problem is still the same: let $[v] \in H^k(E_\bullet)$ where $v \in \ker D_k$; we need to find a representative for this class in $\mathcal{H}^k(E_\bullet)$. Since $D$ is an elliptic operator, the kernel $\ker D = \mathcal{H}(E_\bullet)$ is finite-dimensional; in particular $\mathcal{H}^k(E_\bullet)$ is finite-dimensional. Thus, there exists a continuous projection $p : \Gamma(E_k) \to \mathcal{H}^k(E_\bullet)$. Letting $v_0 = p(v)$, we assert that $v_0$ is the desired representative. Obviously, $v - v_0$ is orthogonal to $\ker D$. Extend $D$ to a continuous map $H^1(M, E) \to L^2(M, E)$; then we know that

$$L^2(M, E) = \operatorname{im} D \oplus \ker D$$

(since $D$ is self-adjoint and Fredholm, it has closed image), and so we must have $v - v_0 = Du$ for some $u \in H^1(M, E)$. From the Weyl Lemma we obtain smoothness of $u$, i.e. $u \in \Gamma(E)$. Recall that $v - v_0 \in \ker D_k \subseteq \ker \delta$, so acting with $\delta$ on the equation $v - v_0 = (\delta + \delta^*)u$ we get $\delta\delta^* u = 0$. Thus $0 = \langle \delta\delta^* u, u\rangle = \|\delta^* u\|^2$, i.e. $\delta^* u = 0$. Therefore $v - v_0 = Du = \delta u$, whose component in $\Gamma(E_k)$ reads $v - v_0 = D_{k-1}u_{k-1}$, i.e. $v$ and $v_0$ are cohomologous, and surjectivity is proven.

From the Hodge Theorem we derive our first index result.

Corollary 11.39. Let $E_\bullet$ be an elliptic complex. Then the Euler characteristic $\chi(E_\bullet)$ equals the index of the Fredholm operator $\hat{D} : \Gamma(E^+) \to \Gamma(E^-)$.

Proof. First of all, the operator $\hat{D}$ is Fredholm, since it is the restriction of the elliptic operator $D$. Furthermore, it is easy to see that

$$\operatorname{ind} \hat{D} = \dim\ker \hat{D} - \dim\ker \hat{D}^* = \sum_{k} \dim \mathcal{H}^{2k}(E_\bullet) - \sum_{k} \dim \mathcal{H}^{2k+1}(E_\bullet) = \sum_{k} \dim H^{2k}(E_\bullet) - \sum_{k} \dim H^{2k+1}(E_\bullet) = \chi(E_\bullet).$$

Theorem 11.38 is indeed a generalization of the classical Hodge Theorem, as seen in the following example.

Example 11.40. The de Rham Complex. Let $M$ be a compact, oriented Riemannian manifold and consider the de Rham complex

$$0 \to C^\infty(M) \xrightarrow{d_0} \Omega^1(M) \xrightarrow{d_1} \cdots \xrightarrow{d_{n-1}} \Omega^n(M) \to 0$$

where the $d_i$ are the exterior derivatives. Let $\xi_p \in T_p^* M$ be nonzero; then the corresponding symbol complex takes the form

$$0 \to \mathbb{C} \xrightarrow{i\xi_p \wedge\,\cdot} \Lambda^1(T_p^* M \otimes \mathbb{C}) \xrightarrow{i\xi_p \wedge\,\cdot} \cdots \xrightarrow{i\xi_p \wedge\,\cdot} \Lambda^n(T_p^* M \otimes \mathbb{C}) \to 0,$$

and it is well known that this is exact, i.e. the de Rham complex is elliptic.
The cohomology of this complex is the de Rham cohomology, and from the Hodge Theorem we derive that the de Rham cohomology groups are finite-dimensional and that there is an isomorphism from the space of harmonic $k$-forms to $H^k_{\mathrm{dR}}(M)$. This is indeed the content of the classical Hodge Theorem. The operator $D$ is just the Hodge-de Rham operator, and Proposition 11.37 states that it is elliptic. Finally, the statement of Corollary 11.39,

$$\operatorname{ind}(\hat{D}) = \chi(M), \tag{11.20}$$

is that the Fredholm index of the restricted Hodge-de Rham operator $\hat{D} : \Omega^+(M) \to \Omega^-(M)$ equals the Euler characteristic of the manifold. This is of course just a rephrasing of the de Rham Theorem.
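To make (11.20) concrete, here is a small worked instance (my own example, assuming the standard flat metric on the torus $T^2 = \mathbb{R}^2/\mathbb{Z}^2$). The harmonic forms are exactly the forms with constant coefficients:

$$\mathcal{H}^0 = \mathbb{R}\cdot 1, \qquad \mathcal{H}^1 = \mathbb{R}\,dx \oplus \mathbb{R}\,dy, \qquad \mathcal{H}^2 = \mathbb{R}\,dx \wedge dy,$$

so $\operatorname{ind}(\hat{D}) = (\dim\mathcal{H}^0 + \dim\mathcal{H}^2) - \dim\mathcal{H}^1 = (1+1) - 2 = 0 = \chi(T^2)$, in agreement with the theorem.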
Chapter 12

The Atiyah-Singer Index Theorem

12.1 K-Theoretic Version

Let $A$ be an elliptic differential operator between vector bundles $E$ and $F$ over a compact Riemannian manifold $M$. The assumption of compactness is important and will be retained throughout the rest of this chapter unless otherwise specified. The symbol map $\sigma(A)$ is part of the complex

$$0 \to \pi^* E \xrightarrow{\sigma(A)} \pi^* F \to 0$$

over $T^*M$, and since $A$ is elliptic, this complex is exact off the zero section of $T^*M$. The zero section is diffeomorphic to $M$, in particular it is compact, and thus this complex represents a class in $K(T^*M)$. Composing with the isomorphism $K(T^*M) \to K(TM)$ induced by the metric bundle isomorphism $TM \to T^*M$, we obtain a class $[\sigma(A)] \in K(TM)$ called the symbol class of $A$. The statement of the Index Theorem (in its K-theoretic form) is that there exists a homomorphism $K(TM) \to \mathbb{Z}$, called the topological index, which evaluated on the class $[\sigma(A)]$ equals the Fredholm index of $A$. Our present job is to define this topological index.

To this end, let $i : Y \to M$ be an inclusion of a proper submanifold $Y$ into $M$. Let $N_Y$ denote the normal bundle of $Y$ in $M$ and $U_Y$ the corresponding tubular neighborhood.^1 To the inclusion $Y \subseteq M$ there corresponds an inclusion $TY \subseteq TM$, a tubular neighborhood $U_{TY} \subseteq TM$ and a normal bundle $N_{TY}$ (which is now a vector bundle over $TY$) which happens to be (isomorphic to)

$$\pi^*(N_Y \oplus N_Y) = \pi^* N_Y \oplus \pi^* N_Y,$$

where $\pi : TY \to Y$ now denotes the projection in the tangent bundle. $N_{TY}$ is a real bundle, but we can equip it with a complex structure, namely the bundle map

$$\begin{pmatrix} 0 & -\mathrm{id} \\ \mathrm{id} & 0 \end{pmatrix}.$$

Thus, we may view $N_{TY}$ as a complex bundle over $TY$. We can now define a group homomorphism $i_! : K(TY) \to K(TM)$ as the composition

$$K(TY) \to K(N_{TY}) \to K(U_{TY}) \to K(TM) \tag{12.1}$$

^1 Given an inclusion $Y \subseteq M$ of a proper submanifold into a Riemannian manifold (compact or not), one can define the normal bundle $N_Y$ to be the smooth vector bundle over $Y$ whose fiber over $x \in Y$ is $N_x := (T_x Y)^\perp \subseteq T_x M$. A tubular neighborhood is an open subset $U_Y$ of $M$ containing $Y$ together with a diffeomorphism $N_Y \to U_Y$ which extends the diffeomorphism of the zero section of $N_Y$ with $Y$. One can show that a tubular neighborhood always exists; see for instance [Lang] Thm. IV.5.1.
where the first map is the Thom isomorphism for the complex bundle $N_{TY} \to TY$, the second is induced by the diffeomorphism $N_{TY} \cong U_{TY}$ and the last one is induced by the inclusion $U_{TY} \subseteq TM$.

Lemma 12.1. The map $i_!$ is independent of the choice of tubular neighborhood, and for a smooth family $i_t : Y \to M$ of inclusions (meaning that the map $I \times Y \to M$, $(t, x) \mapsto i_t(x)$ is smooth) we have $(i_0)_! = (i_1)_!$. Finally, if $i : X \to Y$ and $j : Y \to M$ are inclusions, then $(j \circ i)_! = j_! \circ i_!$.

Proof. Given two tubular neighborhoods $U_{TY}$ and $\tilde{U}_{TY}$, one can show that they are homotopic, i.e. there exists a smooth map $\varphi : I \times N_{TY} \to TM$ such that $\varphi_0$ maps $N_{TY}$ diffeomorphically onto $U_{TY}$ and $\varphi_1$ maps $N_{TY}$ diffeomorphically onto $\tilde{U}_{TY}$. But then, by homotopy invariance, the two maps $K(N_{TY}) \to K(TM)$ induced by $\varphi_0$ and $\varphi_1$ are identical.

The smooth family $i_t$ gives rise to a smooth family of inclusions $\tilde{i}_t : TY \to TM$. Let $TY_t$ denote the image $\tilde{i}_t(TY)$ in $TM$. This new family in turn gives rise to a smooth family of bundle isomorphisms $N_{TY_t} \cong U_{TY_t}$ between the corresponding normal bundles and tubular neighborhoods, and thus also to a smooth family of inclusions $U_{TY_t} \subseteq TM$. These smooth families also produce a diffeomorphism $TY_0 \to TY_1$: simply take a point $\tilde{i}_0(x)$ and move it to $\tilde{i}_1(x)$. Likewise, from the family of bundle isomorphisms we obtain a bundle isomorphism $N_{TY_0} \to N_{TY_1}$ which is compatible with the underlying diffeomorphism $TY_0 \to TY_1$. Upon composing with the diffeomorphisms $N_{TY_k} \cong U_{TY_k}$, $k = 0, 1$, we get a diffeomorphism $U_{TY_0} \to U_{TY_1}$. By construction this map makes the following diagram commute:

$$\begin{array}{ccc} N_{TY_0} & \longrightarrow & N_{TY_1} \\ \downarrow & & \downarrow \\ U_{TY_0} & \longrightarrow & U_{TY_1} \end{array}$$

Consider now the following commuting diagram, whose rows are the composites defining $(i_0)_!$ and $(i_1)_!$:

$$\begin{array}{ccccccccc} K(TY) & \to & K(TY_0) & \to & K(N_{TY_0}) & \to & K(U_{TY_0}) & \to & K(TM) \\ \| & & \downarrow & & \downarrow & & \downarrow & & \| \\ K(TY) & \to & K(TY_1) & \to & K(N_{TY_1}) & \to & K(U_{TY_1}) & \to & K(TM) \end{array}$$

The vertical maps are induced by the diffeomorphisms $TY_1 \to TY_0$, $N_{TY_1} \to N_{TY_0}$ and $U_{TY_1} \to U_{TY_0}$. The first square commutes since the diffeomorphisms between $TY$, $TY_0$ and $TY_1$ commute. The second square commutes thanks to naturality of the Thom isomorphism, and the third commutes since the preceding diagram does. The last square commutes due to homotopy invariance. Composing the upper maps gives $(i_0)_!$ while composing the lower maps yields $(i_1)_!$. From the commutativity of the diagram we see that $(i_0)_! = (i_1)_!$.

As an example, let $i : M \to E$ be the zero section of a vector bundle (real or complex). Then the push-forward map $i_* : TM \to TE$ is just the zero section of the complex bundle $TE \to TM$, and $TE$ will be a normal bundle for this inclusion as well as a tubular neighborhood. Thus, the two last maps in the definition of $i_!$ are just identity maps, i.e. $i_!$ is nothing but the Thom isomorphism for the bundle $TE \to TM$.

By the famous Nash Embedding Theorem we can embed any compact Riemannian manifold $M$ isometrically into some $\mathbb{R}^n$. Denoting the inclusion map by $i$, we have by
the construction above a map $i_! : K(TM) \to K(T\mathbb{R}^n)$. Similarly, if $j$ denotes the inclusion of the one-point set (a zero-dimensional manifold) $P$ as $\{0\} \subseteq \mathbb{R}^n$, we get a homomorphism $j_! : \mathbb{Z} = K(TP) \to K(T\mathbb{R}^n)$. But $j$ is just the zero section of the trivial bundle $\mathbb{C}^n$ over a point, and therefore, as noted above, $j_! : K(TP) \to K(T\mathbb{R}^n)$ is just the Thom isomorphism. In particular it has an inverse $(j_!)^{-1} : K(T\mathbb{R}^n) \to K(TP) = \mathbb{Z}$.

Definition 12.2 (Topological Index). Let $M$ be a compact Riemannian manifold and $i : M \to \mathbb{R}^n$ an isometric embedding. Define the topological index $\operatorname{ind}_T : K(TM) \to \mathbb{Z}$ as the homomorphism $(j_!)^{-1} \circ i_!$.

Lemma 12.3. The topological index is independent of the choice of embedding.

Proof. Assume we have two embeddings $i : M \to \mathbb{R}^n$ and $i' : M \to \mathbb{R}^m$ and consider the diagonal embedding $k(x) = (i(x), i'(x))$ of $M$ into $\mathbb{R}^{n+m}$, as well as the smooth family of embeddings $k_t(x) = (i(x), t\,i'(x))$. Let us compare $i$ and $k_0$ (these are not identical, as $i$ maps into $\mathbb{R}^n$ whereas $k_0$ maps into $\mathbb{R}^{n+m}$). Let $N$ denote the normal bundle of $i(M)$ in $\mathbb{R}^n$; then obviously the normal bundle of $k_0(M)$ in $\mathbb{R}^{n+m}$ is $N \oplus \mathbb{R}^m$. Viewing $T\mathbb{R}^n$ and $T\mathbb{R}^{n+m}$ as $\mathbb{C}^n$ and $\mathbb{C}^{n+m}$ respectively, and viewing $\mathbb{C}^{n+m}$ as a (trivial) complex rank-$m$ vector bundle over $\mathbb{C}^n$, we get a Thom isomorphism $\varphi : K(T\mathbb{R}^n) \to K(T\mathbb{R}^{n+m})$. Let $P$ denote the one-point space, and let $j : TP \to T\mathbb{R}^n$ and $l : TP \to T\mathbb{R}^{n+m}$ denote the inclusions. Then, by transitivity of the Thom isomorphism, each of the triangles in the following diagram commutes:

$$\begin{array}{ccc} & K(TM) & \\ {\scriptstyle i_!}\swarrow & & \searrow{\scriptstyle (k_0)_!} \\ K(T\mathbb{R}^n) & \xrightarrow{\ \varphi\ } & K(T\mathbb{R}^{n+m}) \\ {\scriptstyle j_!}\nwarrow & & \nearrow{\scriptstyle l_!} \\ & K(TP) & \end{array}$$

Thus we see that $(l_!)^{-1}(k_0)_! = (j_!)^{-1} i_!$, which shows that $i$ and $k_0$ give the same index map. By Lemma 12.1, $k_0$ and $k_1 = k$ give the same index map; thus $i$ and $k$ produce the same index map. By symmetry, the same is true for $i'$ and $k$, and hence $i$ and $i'$ produce the same index.

Without further ado, here is the theorem we have all been waiting for.

Theorem 12.4 (Atiyah-Singer Index Theorem I). For an elliptic differential operator $A : \Gamma(E) \to \Gamma(F)$ on a compact Riemannian manifold between complex vector bundles, the Fredholm index of $A$ equals the topological index of its symbol class:

$$\operatorname{ind}(A) = \operatorname{ind}_T([\sigma(A)]). \tag{12.2}$$

Sketch of the proof. We give here a very rough sketch of the proof as it is presented in the classical article by Atiyah and Singer (although they work slightly more generally, with equivariant K-theory). First of all, in Section 4 they introduce the notion of an index function together with two axioms, namely (1): if $M$ is a point, then the index is just the isomorphism $K(TM) = K(\mathrm{pt}) \cong \mathbb{Z}$, and (2): given an inclusion $i : N \to M$ of $N$ into $M$, the index maps $\operatorname{ind}_N$ and $\operatorname{ind}_M$ are compatible with the map $i_! : K(TN) \to K(TM)$, i.e. $\operatorname{ind}_N = \operatorname{ind}_M \circ\, i_!$. The topological index is an index function and satisfies these two axioms. Then they show that any index function satisfying these axioms coincides with the topological index; thus these axioms characterize the topological index uniquely. This is rather straightforward. The remainder of the section is spent on breaking the second axiom down into a set of axioms which are easier to handle. In Section 6, after having reviewed some theory of pseudodifferential operators, they proceed to define the so-called analytic index map. This is defined in the following way: in Section 2 they have shown that any element of $K(TM)$ can be obtained as the symbol class of some elliptic pseudodifferential operator $P$ over $M$. The index of $P$ depends only on this symbol class, and hence we obtain the analytic index $\operatorname{ind}_A : K(TM) \to \mathbb{Z}$ by sending an element of $K(TM)$, which is of the form $[\sigma(P)]$, to $\operatorname{ind}(P)$. This is an index function, and it is relatively easy to see that it satisfies the first axiom. The real core of the proof is to show that it satisfies the second axiom, or rather the other axioms which implied the second one. This is carried out in Sections 8 and 9. But then the analytic index and the topological index coincide! Since $\operatorname{ind}_A([\sigma(A)]) = \operatorname{ind}(A)$ by construction, (12.2) is true.
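Schematically (my own compact restatement of the sketch above, glossing over the fact that $\mathbb{R}^n$ is not compact; Atiyah and Singer work with a suitable compactly supported version), the two axioms read

$$\text{(A1)}\quad \operatorname{ind}_P = \big(K(TP) \xrightarrow{\ \cong\ } \mathbb{Z}\big)\ \text{for a point}\ P, \qquad \text{(A2)}\quad \operatorname{ind}_N = \operatorname{ind}_M \circ\, i_!\ \text{for every inclusion}\ i : N \hookrightarrow M,$$

and uniqueness falls out in one line: for an embedding $i : M \to \mathbb{R}^n$ and the inclusion $j : P \to \mathbb{R}^n$ of a point, (A2) gives $\operatorname{ind}_M = \operatorname{ind}_{\mathbb{R}^n} \circ\, i_!$ and $\operatorname{ind}_P = \operatorname{ind}_{\mathbb{R}^n} \circ\, j_!$, so with (A1), $\operatorname{ind}_M = (j_!)^{-1} \circ i_! = \operatorname{ind}_T$.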
12.2 Cohomological Version

Our present task is to give a different (and, some would say, more applicable) formula for the topological index, one involving cohomology classes instead of K-classes. In this game the Chern character will play a prominent role. First, however, we need to discuss a cohomological version of the Thom isomorphism.

Let $M$ be an oriented manifold of dimension $m$ and $N$ an oriented manifold of dimension $n$. Then we have the Poincaré duality isomorphisms

$$D_M : H^p_c(M;\mathbb{R}) \to H_{m-p}(M;\mathbb{R}) \qquad\text{and}\qquad D_N : H^p_c(N;\mathbb{R}) \to H_{n-p}(N;\mathbb{R}),$$

where $H_c$ denotes cohomology with compact support. If $f : M \to N$ is a smooth map, we obtain a homomorphism $f_! : H^p_c(M;\mathbb{R}) \to H^{p-(m-n)}_c(N;\mathbb{R})$, called the Gysin homomorphism or integration along the fiber, as the composition

$$H^p_c(M;\mathbb{R}) \xrightarrow{\ D_M\ } H_{m-p}(M;\mathbb{R}) \xrightarrow{\ f_*\ } H_{m-p}(N;\mathbb{R}) \xrightarrow{\ D_N^{-1}\ } H^{p-(m-n)}_c(N;\mathbb{R})$$

where $f_*$ is the induced map in homology. In particular we can consider an oriented vector bundle $\pi : E \to M$ of rank $k$ (if the bundle is oriented, then $E$ as a manifold is also oriented). We also assume $M$ to be compact, so that $H^p_c(M;\mathbb{R}) = H^p(M;\mathbb{R})$. If $i : M \to E$ denotes the inclusion as the zero section, then $\pi$ and $i$ are homotopy inverses of each other, i.e. $\pi_*$ and $i_*$ are isomorphisms in homology; thus

$$\pi_! : H^{p+k}_c(E;\mathbb{R}) \to H^p(M;\mathbb{R}) \qquad\text{and}\qquad i_! : H^p(M;\mathbb{R}) \to H^{p+k}_c(E;\mathbb{R})$$

are isomorphisms. In fact they are inverses of each other, as is easily seen.

Definition 12.5 (Thom Isomorphism). Let $E$ be an oriented rank-$k$ bundle over a compact oriented manifold $M$. The map $\Phi := i_! : H^p(M;\mathbb{R}) \to H^{p+k}_c(E;\mathbb{R})$ is called the Thom isomorphism in cohomology.

Note that the bundle $E$ need not be complex, contrary to the situation for the K-theoretic Thom isomorphism. Some basic and useful properties of the Thom isomorphism are listed in the following proposition.

Proposition 12.6. Let $\pi : E \to M$ be an oriented vector bundle of rank $k$ over an oriented manifold $M$.
1) The class $\Phi(1)$, called the Thom class, is the unique class in $H^k_c(E;\mathbb{R})$ with integral 1 over each fiber; it satisfies

$$\Phi(\xi) = \pi^*(\xi) \wedge \Phi(1) \tag{12.3}$$

and is related to the Euler class of $E$ by

$$e(E) = i^*(\Phi(1)). \tag{12.4}$$

2) If $H^\bullet_c(E;\mathbb{R})$ is given the structure of a right $H^\bullet(M;\mathbb{R})$-module by $\omega \cdot \eta = \omega \wedge \pi^*(\eta)$ (for $\omega \in H^\bullet_c(E;\mathbb{R})$ and $\eta \in H^\bullet(M;\mathbb{R})$), then $\Phi^{-1} = \pi_!$ is a module homomorphism, i.e. $\Phi^{-1}(\xi \wedge \pi^*(\eta)) = \Phi^{-1}(\xi)\,\eta$.

3) If $f : N \to M$ is a smooth map, then the natural map $F : f^* E \to E$ is proper, and if $\Psi : H^p(N;\mathbb{R}) \to H^{p+k}_c(f^*E;\mathbb{R})$ denotes the Thom isomorphism for the pullback bundle, the following diagram commutes:

$$\begin{array}{ccc} H^p(M;\mathbb{R}) & \xrightarrow{\ \Phi\ } & H^{p+k}_c(E;\mathbb{R}) \\ \downarrow{\scriptstyle f^*} & & \downarrow{\scriptstyle F^*} \\ H^p(N;\mathbb{R}) & \xrightarrow{\ \Psi\ } & H^{p+k}_c(f^*E;\mathbb{R}) \end{array}$$

The first claim is proved in [Madsen, Tornehave] and the second one in [Bott, Tu]. Applying $i^*$ to the formula (12.3) and using (12.4) we obtain

$$i^*\Phi(\xi) = \xi \wedge e(E) = e(E) \wedge \xi. \tag{12.5}$$

The last identity holds since either $e(E)$ sits in even degree or $e(E) = 0$ (the latter when the rank $k$ is odd).

For the remainder of this section $E \to M$ will denote a complex vector bundle of rank $k$ over a compact oriented Riemannian manifold (thus, considered as a real vector bundle, it has rank $2k$ and is orientable). We have two Thom isomorphisms

$$\varphi : K(M) \to K(E) \qquad\text{and}\qquad \Phi : H^p(M;\mathbb{R}) \to H^{p+2k}_c(E;\mathbb{R}),$$

and we have Chern characters $\operatorname{ch} : K(M) \to H^\bullet(M;\mathbb{R})$ and $\operatorname{ch} : K(E) \to H^\bullet_c(E;\mathbb{R})$. How are they related? Unfortunately, the Chern character does not intertwine the two Thom isomorphisms, but almost: introduce the Thom defect

$$I(E) := \Phi^{-1}\operatorname{ch}(\varphi(1)).$$

One of the properties of the Thom defect below states that, up to the Thom defect, the Chern character intertwines the Thom isomorphisms:

Lemma 12.7. Under the above conditions the following hold:

1) For all $\xi \in K(M)$ the Thom Defect Formula holds:

$$\Phi^{-1}\operatorname{ch}(\varphi\,\xi) = I(E) \operatorname{ch}\xi. \tag{12.6}$$

2) The Thom defect is natural, i.e. if $f : N \to M$ is a smooth map, then $I(f^*E) = f^* I(E)$.

3) The Thom defect is multiplicative, i.e. $I(E \oplus F) = I(E)\,I(F)$.

4) If $E$ is a trivial bundle, then $I(E) = 1$.
Proof. 1) First note that by construction of the K-theoretic Thom isomorphism we have $\varphi\,\xi = \varphi(1)\,\pi^*\xi$ (since $M$ is compact, $K(M)$ is a unital ring). Using that the Chern character is natural and multiplicative and that $\Phi^{-1}$ is a module homomorphism, we get

$$\Phi^{-1}\operatorname{ch}(\varphi\,\xi) = \Phi^{-1}\operatorname{ch}(\varphi(1)\,\pi^*\xi) = \Phi^{-1}\big(\operatorname{ch}\varphi(1)\operatorname{ch}(\pi^*\xi)\big) = \Phi^{-1}\big(\operatorname{ch}\varphi(1)\,\pi^*(\operatorname{ch}\xi)\big) = \big(\Phi^{-1}\operatorname{ch}\varphi(1)\big)\operatorname{ch}\xi = I(E)\operatorname{ch}\xi.$$

2) This follows from part 3) of Proposition 12.6 and the corresponding result for the K-theoretic Thom isomorphism, Proposition 9.46. Concretely, let $\varphi : K(M) \to K(E)$ be the Thom isomorphism for $E$ and $\psi : K(N) \to K(f^*E)$ the Thom isomorphism for $f^*E$. If $1_M \in K(M)$ and $1_N \in K(N)$ denote the identities, note that $1_N = f^*(1_M)$, and so we have

$$\psi(1_N) = \psi(f^* 1_M) = F^* \varphi(1_M).$$

Therefore

$$I(f^*E) = \Psi^{-1}\operatorname{ch}(F^*\varphi(1_M)) = \Psi^{-1} F^* \operatorname{ch}(\varphi(1_M)) = f^* \Phi^{-1}(\operatorname{ch}\varphi(1_M)) = f^* I(E).$$

3) and 4) are immediate consequences of (12.11), proved below.

Next, recall that if $E$ and $F$ are two vector bundles, then

$$\Lambda^k(E \oplus F) = \bigoplus_{i+j=k} (\Lambda^i E) \otimes (\Lambda^j F). \tag{12.7}$$

Define a map $\lambda_t : \operatorname{Vect}_{\mathbb{C}}(M) \to K(M)[t]$ by

$$\lambda_t(E) := \sum_{k \geq 0} [\Lambda^k E]\, t^k.$$

Since $\Lambda^k E = 0$ if $k$ is larger than the rank of $E$, $\lambda_t(E)$ is a polynomial in $t$ with coefficients in $K(M)$; if $t = n$ is an integer, then $\lambda_n(E)$ is an element of $K(M)$. Note that

$$\lambda_t(E \oplus F) = \sum_{k \geq 0} [\Lambda^k(E \oplus F)]\,t^k = \sum_{k \geq 0}\,\sum_{i+j=k} [\Lambda^i E][\Lambda^j F]\,t^i t^j = \sum_{i \geq 0}\sum_{j \geq 0} [\Lambda^i E][\Lambda^j F]\,t^i t^j = \lambda_t(E)\,\lambda_t(F).$$

Example 12.8. Let $L$ be a complex line bundle over $M$. Then $\Lambda^0 L = M \times \mathbb{C}$, $\Lambda^1 L = L$ and $\Lambda^k L = 0$ for $k > 1$. Since the trivial line bundle $M \times \mathbb{C}$ represents the identity in $K(M)$, we have $\lambda_t(L) = 1 + [L]t$. As $\lambda_t$ is multiplicative, we see that

$$\lambda_t(L_1 \oplus \cdots \oplus L_n) = \prod_{k=1}^{n} (1 + [L_k]t). \tag{12.8}$$

Thus we have determined $\lambda_t$ for any sum of line bundles.

In the following calculations our goal is to obtain a formula for the Thom defect in terms of $\lambda_{-1}(E)$ and the Euler
class (since $E$ is complex, the Euler class is always defined). At first, suppose that $E = L_1 \oplus \cdots \oplus L_k$ is a sum of line bundles. Apply the Chern character to (12.8):

$$\operatorname{ch}(\lambda_t(E)) = \operatorname{ch}\Big(\prod_{j=1}^{k}(1 + [L_j]t)\Big) = \prod_{j=1}^{k}\operatorname{ch}(1 + [L_j]t) = \prod_{j=1}^{k}(1 + t\operatorname{ch} L_j) = \prod_{j=1}^{k}(1 + t\,e^{x_j})$$

where, as usual, $x_j = c_1(L_j)$. For $t = -1$ we get

$$\operatorname{ch}(\lambda_{-1}(E)) = \prod_{j=1}^{k}(1 - e^{x_j}). \tag{12.9}$$

The importance of the map $\lambda_{-1}$ is that it produces a formula for the K-theoretic Thom isomorphism which resembles (12.5):

Proposition 12.9. For a complex vector bundle $E \to M$ over a compact oriented manifold, with $i : M \to E$ denoting the zero section, the following holds for any $\xi \in K(M)$:

$$i^* \varphi(\xi) = \lambda_{-1}(E)\,\xi. \tag{12.10}$$

This is proved in [Lawson, Michelson]. Combining this with (12.5) we get

$$e(E)\,I(E) = i^*\Phi(I(E)) = i^*\operatorname{ch}\varphi(1) = \operatorname{ch}(i^*\varphi(1)) = \operatorname{ch}(\lambda_{-1}(E)).$$

Now split the Euler class $e(E) = c_k(E) = x_1 \cdots x_k$ and recall formula (12.9) to obtain

$$x_1 \cdots x_k\, I(E) = \prod_{j=1}^{k}(1 - e^{x_j}).$$

Since $x \mapsto \frac{1}{x}(1 - e^{x})$ is a perfectly holomorphic function (the singularity at $x = 0$ is removable), we see that

$$I(E) = \prod_{j=1}^{k} \frac{1 - e^{x_j}}{x_j} = (-1)^k\, \mathrm{Td}_{\mathbb{C}}^{-1}(\overline{E}). \tag{12.11}$$

But since both $I$ and $\mathrm{Td}_{\mathbb{C}}^{-1}$ are natural with respect to smooth maps, the Splitting Principle guarantees that this formula holds for arbitrary vector bundles.

A final notion we need to introduce is that of the fundamental class or orientation class. In the case of an oriented manifold $M$, one can show that there exists a homology class $[M]$ in the top-degree homology group which in a certain sense determines the orientation (a change of orientation will produce a different orientation class). Recall that $H^k(M;\mathbb{R}) = H_k(M;\mathbb{R})^*$ (an identity which holds over any field, not just $\mathbb{R}$); thus, if $M$ is compact (without boundary), we can let a cohomology class $\omega$ in the top degree act on $[M]$. As a matter of fact, this is nothing but integration:

$$\omega([M]) = \int_M \omega,$$

which is well defined (i.e. independent of the choice of representative for the cohomology class $\omega$) due to Stokes' theorem and the fact that $M$ has no boundary.
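Returning for a moment to (12.11): as a quick consistency check in the simplest case $k = 1$, one can compare power series. The following sketch is my own (variable names are ad hoc); it takes the Todd series of a Chern root $y$ to be $y/(1 - e^{-y})$, as in Chapter 10, so that its inverse at the conjugate root $-x$ should reproduce, up to the sign $(-1)^1$, the single factor $(1 - e^{x})/x$ of $I(E)$.

```python
import sympy as sp

x = sp.symbols('x')
factor = (1 - sp.exp(x)) / x               # one root's contribution to I(E)
todd_inv = lambda y: (1 - sp.exp(-y)) / y  # inverse Todd series of a root y

# (12.11) for a line bundle: I(L) = (-1) * Td^{-1} at the conjugate root -x
print(sp.series(factor, x, 0, 5))          # -1 - x/2 - x**2/6 - x**3/24 - ...
print(sp.series(-todd_inv(-x), x, 0, 5))   # the same series
```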
In the case of $M$ being non-compact we can let elements of the top cohomology with compact support act on the fundamental class, and the above formula still holds. In fact, we can let a compactly supported cohomology class of arbitrary degree act on the fundamental class: simply define this action to be 0 unless the cohomology class is of top degree.

Consider a tangent bundle $\pi : TM \to M$ and assume $M$ is orientable. $TM$ is always orientable (whether $M$ is orientable or not), and in local coordinates $(x^1, \ldots, x^n, v^1, \ldots, v^n)$ the orientation is given by $dx^1 \wedge dv^1 \wedge \cdots \wedge dx^n \wedge dv^n$. Thus the fundamental class $[TM]$ always exists. Given an element $\omega \in H^{2n}_c(TM;\mathbb{R})$, assume for the moment that $M$ has a global chart, so that $\omega$ appears as $f\, dx^1 \wedge dv^1 \wedge \cdots \wedge dx^n \wedge dv^n$ (we allow ourselves to identify the cohomology class with its representative). Then we see

$$\omega[TM] = \int_{TM} f\, dx^1 \wedge dv^1 \wedge \cdots \wedge dx^n \wedge dv^n = (-1)^{n(n-1)/2} \int_{TM} f\, dx^1 \cdots dx^n\, dv^1 \cdots dv^n.$$

Integrating first along the $v^k$'s is just integration along the fiber, i.e. it yields $\Phi^{-1}(\omega)$, where $\Phi$ is the Thom isomorphism for the tangent bundle. Integrating over the $x^k$'s is then just evaluation on the fundamental class $[M]$. Using a partition-of-unity argument in the case where $M$ is not covered by a single chart, we get

$$\omega[TM] = (-1)^{n(n-1)/2}\,(\Phi^{-1}\omega)[M]. \tag{12.12}$$

Now we are in position to deduce the cohomological index formula.

Theorem 12.10 (Atiyah-Singer Index Theorem II). For an elliptic differential operator $A : \Gamma(E) \to \Gamma(F)$ between complex vector bundles on a compact, oriented Riemannian manifold of dimension $n$ we have

$$\operatorname{ind}(A) = (-1)^n \big(\operatorname{ch}[\sigma(A)]\; \pi^*\hat{A}(M)^2\big)[TM] \tag{12.13}$$

where $\pi : TM \to M$ is the projection in the tangent bundle.

Note that $\pi : TM \to M$ is not a proper map. Thus $\pi^*\hat{A}(M)^2$ is an element of $H^\bullet(TM;\mathbb{R})$, and not necessarily of $H^\bullet_c(TM;\mathbb{R})$. However, $\operatorname{ch}[\sigma(A)]$ does have compact support, and hence the product has compact support.

Proof. We begin by collecting some preliminary results which will become useful later in the proof. Let $T\mathbb{R}^N \to P$ be the collapse map onto the one-point space $P$. Under the isomorphism $T\mathbb{R}^N \cong \mathbb{C}^N$ we may view this as a complex vector bundle over a point. Thus we have the Thom isomorphisms $(\varphi')^{-1} : K(\mathbb{C}^N) \to K(P) = \mathbb{Z}$ and $\Phi' : H^\bullet(P;\mathbb{R}) \to H^\bullet_c(\mathbb{C}^N;\mathbb{R})$. By the Thom defect formula with $\xi = (\varphi')^{-1}u$ for $u \in K(\mathbb{C}^N)$ we get

$$\Phi'^{-1}\operatorname{ch} u = I(\mathbb{C}^N)\operatorname{ch}((\varphi')^{-1}u) = \operatorname{ch}((\varphi')^{-1}u)$$

(as $I$ is 1 on trivial bundles). Note that $\operatorname{ch} : K(P) \to H^\bullet(P;\mathbb{R})$ is just the identity $\mathbb{Z} \to \mathbb{Z}$, for the following reason: view $\mathbb{C}$ as the trivial line bundle over the point; then $c_k(\mathbb{C}) = 0$ for $k \geq 1$ and therefore $\operatorname{ch}(\mathbb{C}) = 1$. Since $\mathbb{C}$ represents the identity in $K(P)$, the Chern character maps 1 to 1, but then it has to be the identity. Thus we have $\Phi'^{-1}\operatorname{ch} u = (\varphi')^{-1}u$. Now, $\Phi'^{-1}$ is integration along the fiber, but here the fiber is all of $\mathbb{C}^N = T\mathbb{R}^N$; thus the left-hand side equals $(\operatorname{ch} u)[T\mathbb{R}^N]$, where $[T\mathbb{R}^N]$ is the fundamental class of
$T\mathbb{R}^N$ (recall that $TM$ is an orientable manifold for any manifold $M$: if $(x^1, \ldots, x^n, v^1, \ldots, v^n)$ are local coordinates for $TM$, then an orientation form is given locally by $dx^1 \wedge dv^1 \wedge \cdots \wedge dx^n \wedge dv^n$; thus the fundamental class exists). Thus

$$(\varphi')^{-1} u = (\operatorname{ch} u)[T\mathbb{R}^N] \tag{12.14}$$

where, as noted in the previous section, $\varphi' = j_!$. This is our first preliminary result.

Consider now a real vector bundle $p : E \to M$. Then the push-forward $p_* : TE \to TM$ is also a vector bundle; in fact it can be shown to be isomorphic to $\pi^* E \oplus \pi^* E = \pi^* E \otimes_{\mathbb{R}} \mathbb{C}$, where $\pi : TM \to M$ is the projection in the tangent bundle. Thus it is a complex vector bundle over $TM$. The Thom isomorphism in K-theory is denoted $\psi$, whereas the Thom isomorphism in cohomology is called $\Psi$. Since the inclusion $i : TM \to TE$ of $TM$ as the zero section is a proper map, the Thom defect formula still holds and in this case yields

$$\Psi^{-1}\operatorname{ch}(\psi\,\zeta) = I(\pi^* E \otimes \mathbb{C})\operatorname{ch}\zeta$$

for any $\zeta \in K(TM)$. Evaluate this on the fundamental class $[TM]$:

$$\Psi^{-1}\operatorname{ch}(\psi\,\zeta)[TM] = \big(I(\pi^* E \otimes \mathbb{C})\operatorname{ch}\zeta\big)[TM].$$

Since $\Psi^{-1}$ is just integration along the fibers, we get

$$\operatorname{ch}(\psi\,\zeta)[TE] = \big(I(\pi^* E \otimes \mathbb{C})\operatorname{ch}\zeta\big)[TM]. \tag{12.15}$$

This is our second preliminary result.

Recall now how we defined the topological index in the preceding section: we took an embedding $i : M \to \mathbb{R}^N$ as well as an embedding $j : P \to \mathbb{R}^N$, where $P$ is a one-point space. Then we formed the maps $i_! : K(TM) \to K(T\mathbb{R}^N)$ and $j_! : K(TP) \to K(T\mathbb{R}^N)$, and we noted that $j_!$ is nothing but the Thom isomorphism $\varphi'$ mentioned above. Let $E \to M$ denote the normal bundle of $i(M)$ and let, as above, $\psi$ denote the Thom isomorphism for the bundle $TE \to TM$. For a given $\zeta \in K(TM)$ the class $\psi\,\zeta \in K(TE)$ has compact support. The inclusion $E \subseteq \mathbb{R}^N$ extends to an inclusion $TE \subseteq T\mathbb{R}^N$, and under the map in K-theory induced by this inclusion, $\psi\,\zeta$ is mapped to a class in $K(T\mathbb{R}^N)$ with support inside $TE$. Since $i_!\,\zeta$ is exactly this extended class, we obtain

$$(\operatorname{ch}\psi\,\zeta)[TE] = \operatorname{ch}(i_!\,\zeta)[T\mathbb{R}^N]. \tag{12.16}$$

But then we can combine the formulas obtained so far (replacing $\zeta$ and $u$ by $[\sigma(A)]$) to get, using Theorem 12.4 in the first step,

$$\operatorname{ind} A = (j_!)^{-1} i_![\sigma(A)] = (\varphi')^{-1} i_![\sigma(A)] = \operatorname{ch}(i_![\sigma(A)])[T\mathbb{R}^N] = (\operatorname{ch}\psi[\sigma(A)])[TE] = \big(I(\pi^* E \otimes \mathbb{C})\operatorname{ch}[\sigma(A)]\big)[TM],$$

the third identity being (12.14), the fourth being (12.16) and the final one being (12.15). Finally, we only need to calculate $I(\pi^* E \otimes \mathbb{C})$. First, note that it equals $\pi^* I(E \otimes \mathbb{C})$, since $I$ is natural. Secondly, we note that $TM \oplus E$ is the restriction of $T\mathbb{R}^N$ to $M$, hence trivial; thus, since $I$ is 1 on trivial bundles and $I$ is multiplicative, we obtain

$$I(E \otimes \mathbb{C}) = I(TM \otimes \mathbb{C})^{-1} = (-1)^n\,\mathrm{Td}_{\mathbb{C}}(TM \otimes \mathbb{C}),$$

the last identity being a consequence of (12.11) and the fact that $TM \otimes \mathbb{C}$ is its own conjugate bundle. Applying Proposition 10.58 we get the desired formula.
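For bookkeeping, the sign in the alternative formula below arises as follows (a one-line derivation of my own, combining (12.13) with (12.12) and the module property of $\Phi^{-1}$ from Proposition 12.6):

$$(-1)^n\big(\operatorname{ch}[\sigma(A)]\,\pi^*\hat{A}(M)^2\big)[TM] = (-1)^{n}(-1)^{n(n-1)/2}\big(\Phi^{-1}(\operatorname{ch}[\sigma(A)])\,\hat{A}(M)^2\big)[M],$$

and $n + n(n-1)/2 = n(n+1)/2$, which is exactly the exponent appearing in (12.17).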
Using (12.12) and the fact that $\Phi^{-1}$ is a module homomorphism, we obtain the following alternative index formula.

Theorem 12.11 (Atiyah-Singer Index Theorem III). For an elliptic differential operator $A : \Gamma(E) \to \Gamma(F)$ between complex vector bundles on a compact oriented Riemannian manifold of dimension $n$ we have

$$\operatorname{ind}(A) = (-1)^{n(n+1)/2}\big(\Phi^{-1}(\operatorname{ch}[\sigma(A)])\,\hat{A}(M)^2\big)[M]. \tag{12.17}$$

This powerful theorem has an impressively long list of corollaries. Here we just mention two:

Corollary 12.12. An elliptic differential operator on an odd-dimensional compact manifold has index 0.

Proof. Consider the involution $c : TM \to TM$ given by $v \mapsto -v$. This is an orientation-reversing bundle automorphism: a basis $\{E_1, \ldots, E_n\}$ for one of the fibers $T_p M$, when subjected to $c$, is changed to $\{-E_1, \ldots, -E_n\}$. The transition matrix between these bases is just the diagonal matrix $\mathrm{diag}(-1, \ldots, -1)$, and this has determinant $(-1)^n = -1$, since $n$ is odd. Thus $c_*[TM] = -[TM]$.

How does $c$ influence the symbol? First of all, $c$ induces a natural automorphism $c^* : T^*M \to T^*M$, again given by $\xi \mapsto -\xi$. Let $A$ be of order $m$. Since the symbol $\sigma(A)$ is homogeneous of degree $m$ in $\xi$, we see that

$$c^*\sigma(A) = \sigma(A) \circ c^* = (-1)^m \sigma(A),$$

so $\sigma(A)$ is either unchanged or changed to $-\sigma(A)$. But $\sigma(A)$ and $-\sigma(A)$ can be deformed continuously into each other through a path of elliptic symbols, namely $t \mapsto e^{i\pi t}\sigma(A)$. By construction of the K-classes of complexes, such a homotopy does not change the corresponding K-theory class; thus $c^*[\sigma(A)] = [\sigma(A)]$. Therefore we get

$$\operatorname{ind}(A) = (-1)^n\big(\operatorname{ch}[\sigma(A)]\,\pi^*\hat{A}(M)^2\big)[TM] = (-1)^n\,c^*\big(\operatorname{ch}[\sigma(A)]\,\pi^*\hat{A}(M)^2\big)\,c_*[TM] = (-1)^n\big(\operatorname{ch}(c^*[\sigma(A)])\,c^*\pi^*\hat{A}(M)^2\big)\big(-[TM]\big) = -(-1)^n\big(\operatorname{ch}[\sigma(A)]\,\pi^*\hat{A}(M)^2\big)[TM] = -\operatorname{ind}(A)$$

(the second identity follows from involutivity of $c$, and the fourth from $c^*[\sigma(A)] = [\sigma(A)]$ together with the fact that $\pi \circ c = \pi$). Thus $\operatorname{ind}(A) = 0$.

The second corollary we want to derive is the generalized Gauss-Bonnet Theorem.

Corollary 12.13 (Gauss-Bonnet). Let $M$ be a compact, oriented manifold of even dimension $n$ and let $e(M)$ be the Euler class of the tangent bundle; then

$$\chi(M) = \int_M e(M). \tag{12.18}$$

In the case of an odd-dimensional manifold, $\chi(M) = 0$.

Proof. Let $i : M \to TM$ be the zero section. By (12.5) we have $i^*\Phi(\xi) = \xi\, e(M)$. Replace $\xi$ by $\Phi^{-1}\xi$ to obtain

$$i^*\xi = \Phi^{-1}(\xi)\, e(M) \qquad\text{for all } \xi \in H^\bullet_c(TM;\mathbb{R}).$$
Thus we also have

$$\Phi^{-1}(\operatorname{ch} u)\, e(M) = i^*\operatorname{ch} u = \operatorname{ch}(i^* u) \tag{12.19}$$

for all $u \in K(TM)$. Now consider the complex Hodge-de Rham operator $\hat{D} = d + d^* : \Omega^+_{\mathbb{C}}(M) \to \Omega^-_{\mathbb{C}}(M)$, i.e. the operator originating from the complexified de Rham complex (in order for the Index Theorem to be applicable we need the bundles to be complex). We discussed this operator in the real case in Example 11.40, where we found that the index equals $\chi(M)$. This is still true in the complex case, for then the Euler characteristic is simply the alternating sum of

$$\dim_{\mathbb{C}} H^k(M;\mathbb{C}) = \dim_{\mathbb{C}}\big(H^k(M;\mathbb{R}) \otimes_{\mathbb{R}} \mathbb{C}\big) = \dim_{\mathbb{R}} H^k(M;\mathbb{R}).$$

For now we work on the right-hand side of (12.17). First we note that the symbol class of $\hat{D}$ equals

$$[\sigma(\hat{D})] = \sum_{i=0}^{n} (-1)^i\, [\pi^*(\Lambda^i\, T^*M \otimes \mathbb{C})]$$

where $\pi$ is the projection in the tangent bundle (in secret we have identified the tangent and the cotangent bundle) and $[\Lambda^i\, T^*M \otimes \mathbb{C}]$ is the K-class represented by the $i$th exterior bundle. Since $\pi \circ i = \mathrm{id}_M$ we get

$$i^*[\sigma(\hat{D})] = \sum_{i=0}^{n} (-1)^i\, i^*[\pi^*(\Lambda^i\, T^*M \otimes \mathbb{C})] = \sum_{i=0}^{n} (-1)^i\, [\Lambda^i\, T^*M \otimes \mathbb{C}].$$

The Chern character of this K-class has already been calculated in Example 10.61 to be $c_n(TM \otimes \mathbb{C})\,\mathrm{Td}^{-1}_{\mathbb{C}}(TM \otimes \mathbb{C})$. Plugging this K-class into (12.19) therefore gives

$$\Phi^{-1}(\operatorname{ch}[\sigma(\hat{D})])\, e(M) = c_n(TM \otimes \mathbb{C})\,\mathrm{Td}^{-1}_{\mathbb{C}}(TM \otimes \mathbb{C}).$$

Noting that

$$c_n(TM \otimes \mathbb{C}) = (-1)^{n/2}\, p_{n/2}(M) = (-1)^{n/2}\, e(M)^2$$

(cf. Propositions 10.41 and 10.49), we get

$$\Phi^{-1}(\operatorname{ch}[\sigma(\hat{D})])\, e(M) = (-1)^{n/2}\, e(M)^2\, \mathrm{Td}^{-1}_{\mathbb{C}}(TM \otimes \mathbb{C}).$$

From this we conclude that

$$\Phi^{-1}(\operatorname{ch}[\sigma(\hat{D})]) = (-1)^{n/2}\, e(M)\, \mathrm{Td}^{-1}_{\mathbb{C}}(TM \otimes \mathbb{C}).$$

Inserting this into (12.17) and recalling the identity $\hat{A}_{\mathbb{R}}(TM)^2 = \mathrm{Td}_{\mathbb{C}}(TM \otimes \mathbb{C})$ (cf. Proposition 10.58), we get

$$\operatorname{ind}(\hat{D}) = (-1)^{n(n+1)/2}(-1)^{n/2}\, e(M)[M] = e(M)[M],$$

since $n(n+1)/2 + n/2 = n(n+2)/2$ is even when $n$ is even. This is the desired expression. Finally, for odd-dimensional $M$ the claim $\chi(M) = 0$ follows directly from Corollary 12.12.
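As a concrete illustration (a standard example, not worked out in the text): for the round unit sphere $S^2$, Chern-Weil theory represents the Euler class by the normalized curvature form $\frac{1}{2\pi}K\,dA$, so (12.18) reads

$$\chi(S^2) = \int_{S^2} e(TS^2) = \frac{1}{2\pi}\int_{S^2} K\, dA = \frac{4\pi}{2\pi} = 2,$$

since $K \equiv 1$ and $\operatorname{area}(S^2) = 4\pi$. This matches $\chi(S^2) = b_0 - b_1 + b_2 = 1 - 0 + 1 = 2$ and recovers the classical Gauss-Bonnet theorem for surfaces.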
Appendix A

Table of Clifford Algebras

The following table displays the first of the real Clifford algebras $\mathrm{Cl}_{p,q}$; here $\mathbb{K}(m)$ denotes the algebra of $m \times m$ matrices over $\mathbb{K} \in \{\mathbb{R}, \mathbb{C}, \mathbb{H}\}$.

q\p   0            1              2              3              4
0     R            R ⊕ R          R(2)           C(2)           H(2)
1     C            R(2)           R(2) ⊕ R(2)    R(4)           C(4)
2     H            C(2)           R(4)           R(4) ⊕ R(4)    R(8)
3     H ⊕ H        H(2)           C(4)           R(8)           R(8) ⊕ R(8)
4     H(2)         H(2) ⊕ H(2)    H(4)           C(8)           R(16)
5     C(4)         H(4)           H(4) ⊕ H(4)    H(8)           C(16)
6     R(8)         C(8)           H(8)           H(8) ⊕ H(8)    H(16)
7     R(8) ⊕ R(8)  R(16)          C(16)          H(16)          H(16) ⊕ H(16)
8     R(16)        R(16) ⊕ R(16)  R(32)          C(32)          H(32)

q\p   5                6                7                8
0     H(2) ⊕ H(2)      H(4)             C(8)             R(16)
1     H(4)             H(4) ⊕ H(4)      H(8)             C(16)
2     C(8)             H(8)             H(8) ⊕ H(8)      H(16)
3     R(16)            C(16)            H(16)            H(16) ⊕ H(16)
4     R(16) ⊕ R(16)    R(32)            C(32)            H(32)
5     R(32)            R(32) ⊕ R(32)    R(64)            C(64)
6     C(32)            R(64)            R(64) ⊕ R(64)    R(128)
7     H(32)            C(64)            R(128)           R(128) ⊕ R(128)
8     H(32) ⊕ H(32)    H(64)            C(128)           R(256)
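The table can also be generated mechanically. Below is a small script of my own (function name and encoding are ad hoc), assuming the standard mod-8 classification of the real Clifford algebras from Chapter 7: the type of $\mathrm{Cl}_{p,q}$ depends only on $(p-q) \bmod 8$, and the matrix size is forced by $\dim_{\mathbb{R}} \mathrm{Cl}_{p,q} = 2^{p+q}$.

```python
# KIND[s] = (base algebra K, exponent offset d, double?) for s = (p - q) mod 8;
# the matrix size m = 2^((n - d)/2) makes dim_R of K(m) (or K(m) + K(m)) equal 2^n.
KIND = {0: ('R', 0, False), 1: ('R', 1, True), 2: ('R', 0, False),
        3: ('C', 1, False), 4: ('H', 2, False), 5: ('H', 3, True),
        6: ('H', 2, False), 7: ('C', 1, False)}

def clifford(p, q):
    n, s = p + q, (p - q) % 8
    K, d, double = KIND[s]
    m = 2 ** ((n - d) // 2)
    name = K if m == 1 else f'{K}({m})'
    return f'{name} ⊕ {name}' if double else name

for q in range(9):                        # reproduces the table above
    print(q, [clifford(p, q) for p in range(9)])
```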
Appendix B

Calculation of Fundamental Groups

Just for the sake of completeness I've added this appendix on the calculation of fundamental groups of some of the classical Lie groups. The calculation will involve homotopy theory, so let us first recall the definition of the higher homotopy groups. Let $(X, x_0)$ be a pointed topological space. As a set, the $n$th homotopy group $\pi_n(X, x_0)$ is the set of homotopy classes of continuous maps $(I^n, \partial I^n) \to (X, x_0)$ relative to the boundary. One can equip this set with a composition turning it into a group, which is abelian if $n \geq 2$. Furthermore, one can show that the construction is a functor, i.e. given a continuous map $f : X \to Y$ there are induced homomorphisms $f_* : \pi_n(X, x_0) \to \pi_n(Y, f(x_0))$. The exact details will not concern us here.

Next, recall the notion of a fiber bundle. A fiber bundle (or locally trivial bundle) over some Hausdorff topological space $X$ with fiber $F$ (also a Hausdorff topological space) is a pair $(P, \pi)$ of yet another topological space $P$ and a continuous surjective map $\pi : P \to X$ such that for each point $x \in X$ we can find a neighborhood $U$ around $x$ and a homeomorphism (called a local trivialization) $\Phi : \pi^{-1}(U) \to U \times F$ of the form $\Phi(p) = (\pi(p), \varphi(p))$, where $\varphi : \pi^{-1}(U) \to F$ is some continuous map. For such a bundle we will use the handy notation $F \to P \to X$.

The first result we will need in order to calculate fundamental groups is the following standard result from homotopy theory:

Theorem B.1. Let $F \to P \xrightarrow{\pi} X$ be a fiber bundle. Choose a point $x_0 \in X$, let $F_0 = \pi^{-1}(x_0) \subseteq P$ be the fiber over $x_0$ and select $p_0 \in F_0$. Denote by $\iota$ the inclusion $F_0 \to P$. Then for each $n \geq 2$ there exists a homomorphism $\partial_n : \pi_n(X, x_0) \to \pi_{n-1}(F_0, p_0)$ such that the following long sequence is exact:

$$\cdots \to \pi_n(F_0, p_0) \xrightarrow{\ \iota_*\ } \pi_n(P, p_0) \xrightarrow{\ \pi_*\ } \pi_n(X, x_0) \xrightarrow{\ \partial_n\ } \pi_{n-1}(F_0, p_0) \to \cdots \tag{B.1}$$

The long exact sequence (B.1) is known as the homotopy long exact sequence.

The connection with Lie groups is via homogeneous manifolds. A homogeneous $G$-manifold is a smooth manifold $M$ equipped with a smooth, transitive left action of a Lie group $G$. The prototype of a homogeneous manifold is the following: let $G$ be a Lie group and $H$ a closed subgroup (which is then automatically a Lie subgroup), where $H$ acts on $G$ from the right by translation: $G \times H \ni (g, h) \mapsto gh$. This action is smooth, free and proper, and thus the orbit space, denoted $G/H$, has a smooth structure. Now define a left $G$-action on this
space by $(g, g_0 H) \mapsto (g g_0)H$. This is a smooth, transitive action of $G$ on $G/H$, making $G/H$ a homogeneous manifold. In fact, any homogeneous $G$-manifold $M$ is of this form: for the closed subgroup $H$ simply take the isotropy group $G_p$ of any point $p \in M$, and $M$ will be diffeomorphic to $G/G_p$. These are all classical facts from smooth manifold theory.

Let us give some examples which we will need later. First, the $(n-1)$-sphere is a homogeneous manifold: we let $\mathrm{O}(n)$ act on the sphere $S^{n-1} \subseteq \mathbb{R}^n$ in the obvious way. The action is clearly smooth, and it is transitive, since we can move the north pole $N = (0, \ldots, 0, 1)$ to any point on the sphere by an appropriate rotation (this also shows that the action of $\mathrm{SO}(n)$ on $S^{n-1}$ is smooth and transitive; we return to that in a little while). Now we seek the isotropy group at $N$, that is, the group of orthogonal matrices fixing $N$. Such a matrix must have the form

$$R = \begin{pmatrix} A & 0 \\ 0 & 1 \end{pmatrix}$$

for some $(n-1) \times (n-1)$ matrix $A$. The orthogonality condition $R^T R = I$ forces $A$ to obey $A^T A = I$, thus $A \in \mathrm{O}(n-1)$. We can therefore identify the isotropy group at $N$ with $\mathrm{O}(n-1)$, and hence we get a diffeomorphism

$$\mathrm{O}(n)/\mathrm{O}(n-1) \cong S^{n-1}.$$

As we noted above, $\mathrm{SO}(n)$ also acts smoothly and transitively on $S^{n-1}$, and as before the $\mathrm{SO}(n)$-matrices that fix $N$ have the form $\begin{pmatrix} A & 0 \\ 0 & 1 \end{pmatrix}$ for an orthogonal matrix $A \in \mathrm{O}(n-1)$. But

$$1 = \det\begin{pmatrix} A & 0 \\ 0 & 1 \end{pmatrix} = \det A \cdot \det 1 = \det A,$$

hence $A \in \mathrm{SO}(n-1)$, and so we get a diffeomorphism

$$\mathrm{SO}(n)/\mathrm{SO}(n-1) \cong S^{n-1}.$$

The natural actions of $\mathrm{U}(n)$ and $\mathrm{SU}(n)$ on $S^{2n-1} \subseteq \mathbb{C}^n$ provide us, in the same manner, with diffeomorphisms

$$\mathrm{U}(n)/\mathrm{U}(n-1) \cong S^{2n-1} \qquad\text{and}\qquad \mathrm{SU}(n)/\mathrm{SU}(n-1) \cong S^{2n-1}.$$

These diffeomorphisms will come in quite handy later on. We now provide the link between homogeneous manifolds and Theorem B.1:

Proposition B.2. Let $G/H$ be a homogeneous manifold. Then $G$ has the structure of a fiber bundle over $G/H$ with fiber $H$: in the notation from above, $H \to G \to G/H$ is a fiber bundle.

Proof. As projection map we simply use the natural map $\pi : G \to G/H$ sending $g$ to $gH$. It is clearly continuous and surjective. Now we prove the trivialization part. Let $x_0 \in G/H$ be arbitrary. According to [14] Theorem 3.58 there exists a neighborhood $U \subseteq G/H$ around $x_0$ and a smooth local section of $G$, that is, a smooth map $\sigma : U \to G$ such that $\pi \circ \sigma = \mathrm{id}_U$. The section $\sigma$ can be used to define the trivialization $\Phi : \pi^{-1}(U) \to U \times H$,

$$\Phi(p) = \big(\pi(p),\ \sigma(\pi(p))^{-1} p\big).$$

This map is obviously continuous and has as inverse the map $\Phi^{-1}(x, h) = \sigma(x)h$, which is also continuous. Thus, $\Phi$ is the desired trivialization.

At this point we are really able to do some calculations. But before doing so, we would like to get rid of the base-point dependence by breaking the Lie groups up into components (recall that for Lie groups, or generally for manifolds, the components and path components are the same). Many of the classical Lie groups are actually connected by virtue of the following result:^1

^1 The proof of this result can be found either in [14] (Proposition 3.66) or in [9] (Proposition 9.34).
Lemma B.3. Suppose that $G$ is a Lie group acting smoothly, freely and properly on a manifold $M$. If $G$ and $M/G$ are connected, then so is $M$.

We can now show:

Proposition B.4. For all $n \geq 1$, the Lie groups $\mathrm{SO}(n)$, $\mathrm{U}(n)$ and $\mathrm{SU}(n)$ are connected.

Proof. Let's verify the result for $\mathrm{SO}(n)$ by induction. Firstly, $\mathrm{SO}(1)$ is just a point, hence connected. Assume then that $\mathrm{SO}(n-1)$ is connected. By the diffeomorphism $\mathrm{SO}(n)/\mathrm{SO}(n-1) \cong S^{n-1}$ and Lemma B.3 ($S^{n-1}$ is connected) we get that $\mathrm{SO}(n)$ is connected. The proof for $\mathrm{U}(n)$ and $\mathrm{SU}(n)$ is exactly the same; just use that $\mathrm{U}(1) = S^1$ is connected, and that $\mathrm{SU}(1)$, like $\mathrm{SO}(1)$, is just a point.

The orthogonal group $\mathrm{O}(n)$ is not connected, albeit almost:

Proposition B.5. For every $n \geq 1$ the group $\mathrm{O}(n)$ has two diffeomorphic components, $\mathrm{O}(n)^+ = \det^{-1}(1)$ and $\mathrm{O}(n)^- = \det^{-1}(-1)$.

Proof. It is clear that $\mathrm{O}(n)$ is not connected, since $\det(\mathrm{O}(n)) = \{-1, 1\}$ and $\{-1, 1\}$ is not connected. Obviously, $\mathrm{O}(n) = \mathrm{O}(n)^+ \cup \mathrm{O}(n)^-$ and $\mathrm{O}(n)^+ \cap \mathrm{O}(n)^- = \emptyset$. By definition $\mathrm{O}(n)^+$ is just $\mathrm{SO}(n)$, which is connected by Proposition B.4; thus $\mathrm{O}(n)^+$ is a component. Let $A \in \mathrm{O}(n)$ be an arbitrary matrix with determinant $-1$; then the map $\mathrm{O}(n)^+ \to \mathrm{O}(n)^-$ given by $X \mapsto AX$ is a diffeomorphism. Hence $\mathrm{O}(n)^-$ is also connected.

Finally, let's do what we set out to do: calculate some fundamental groups.

Theorem B.6. For all $n \geq 1$ we have $\pi_1(\mathrm{U}(n)) = \mathbb{Z}$ and $\pi_1(\mathrm{SU}(n)) = 0$, i.e. $\mathrm{SU}(n)$ is simply connected.

Proof. We only do $\mathrm{U}(n)$. Thanks to the diffeomorphism $\mathrm{U}(n)/\mathrm{U}(n-1) \cong S^{2n-1}$ and Proposition B.2, which yields a fiber bundle $\mathrm{U}(n-1) \to \mathrm{U}(n) \to S^{2n-1}$, the long exact sequence of homotopy theory gives us an exact sequence

$$\pi_2(S^{2n-1}) \to \pi_1(\mathrm{U}(n-1)) \to \pi_1(\mathrm{U}(n)) \to \pi_1(S^{2n-1}).$$

For all $n \geq 2$ we have $\pi_2(S^{2n-1}) = \pi_1(S^{2n-1}) = 0$, and hence exactness of the sequence above gives an isomorphism $\pi_1(\mathrm{U}(n)) \cong \pi_1(\mathrm{U}(n-1))$. Hence $\pi_1(\mathrm{U}(n)) \cong \pi_1(\mathrm{U}(1)) = \mathbb{Z}$. The argument for $\mathrm{SU}(n)$ is exactly the same, except that $\pi_1(\mathrm{SU}(1)) = 0$.

In the calculation of the fundamental group of $\mathrm{SO}(n)$ we will need the following very famous relationship between $\mathrm{SO}(3)$ and $\mathrm{SU}(2)$, which is important enough to be stated as a result in its own right. It plays an interesting role in the quantum mechanical theory of spin.^2

Lemma B.7. There exists a Lie group isomorphism $\mathrm{SO}(3) \cong \mathrm{SU}(2)/\{-I, I\}$.

Theorem B.8. For $\mathrm{SO}(n)$ we have the following fundamental groups: $\pi_1(\mathrm{SO}(1)) = 0$, $\pi_1(\mathrm{SO}(2)) = \mathbb{Z}$ and $\pi_1(\mathrm{SO}(n)) = \mathbb{Z}_2$ for $n \geq 3$.

^2 For a proof see [1] Proposition 9.2.
Proof. The first two fundamental groups are obvious, since $\mathrm{SO}(1)$ is just a point and $\mathrm{SO}(2)$ is the circle. For a general $n \geq 3$ we proceed as before. We have an exact sequence

$$\pi_2(S^{n-1}) \to \pi_1(\mathrm{SO}(n-1)) \to \pi_1(\mathrm{SO}(n)) \to \pi_1(S^{n-1}),$$

which for $n \geq 4$ says $\pi_1(\mathrm{SO}(n)) \cong \pi_1(\mathrm{SO}(n-1))$, since $\pi_2(S^{n-1}) = \pi_1(S^{n-1}) = 0$. Inductively $\pi_1(\mathrm{SO}(n)) \cong \pi_1(\mathrm{SO}(3))$, so we only need to find this fundamental group. This is where we need Lemma B.7. Let $\{-I, I\} \subseteq \mathrm{SU}(2)$ act in the natural way and consider the canonical map $\pi : \mathrm{SU}(2) \to \mathrm{SU}(2)/\{-I, I\}$ onto the orbit space which, according to Lemma B.7, is just $\mathrm{SO}(3)$. Since $\{-I, I\}$ is a discrete subgroup, $\pi$ is a double covering map,^3 and since $\mathrm{SU}(2)$ is simply connected by Theorem B.6, $\pi$ is the universal covering map of $\mathrm{SO}(3)$. But then the fundamental group of $\mathrm{SO}(3)$ has the same order as the number of sheets in the covering, i.e. $\pi_1(\mathrm{SO}(3))$ must be isomorphic to $\mathbb{Z}_2$.

Proposition B.5 says that the two components of $\mathrm{O}(n)$ are both diffeomorphic to $\mathrm{SO}(n)$ and so yields the following result:

Corollary B.9. For all $n \geq 1$ we have $\pi_1(\mathrm{O}(n)^+) \cong \pi_1(\mathrm{O}(n)^-) \cong \pi_1(\mathrm{SO}(n))$; in particular, both components have fundamental group $\mathbb{Z}_2$ when $n \geq 3$.

^3 See [9] Proposition 9.26.
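The homotopy long exact sequence of Theorem B.1 has uses well beyond fundamental groups. As a closing illustration (a standard computation of my own choosing, not needed elsewhere in this text), the Hopf bundle $S^1 \to S^3 \to S^2$ gives the exact sequence

$$0 = \pi_2(S^3) \to \pi_2(S^2) \xrightarrow{\ \partial_2\ } \pi_1(S^1) \to \pi_1(S^3) = 0,$$

so $\pi_2(S^2) \cong \pi_1(S^1) = \mathbb{Z}$; and in degree three, $\pi_3(S^2) \cong \pi_3(S^3) = \mathbb{Z}$, since $\pi_3(S^1) = \pi_2(S^1) = 0$.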
Bibliography

[AA] M. F. Atiyah, D. W. Anderson: K-Theory. W. A. Benjamin Inc., 1967.

[BT] Raoul Bott, Loring Tu: Differential Forms in Algebraic Topology. Graduate Texts in Mathematics no. 82, Springer, 1982.

[LLR] N. J. Laustsen, F. Larsen and M. Rørdam: An Introduction to K-Theory for C*-Algebras. London Mathematical Society Student Texts, 2000.

[Lee] John M. Lee: Riemannian Manifolds. An Introduction to Curvature. Graduate Texts in Mathematics, Springer, 1997.

[MT] Ib Madsen, Jørgen Tornehave: From Calculus to Cohomology. Cambridge University Press, 1999.

[Mi] J. W. Milnor, J. Stasheff: Characteristic Classes. Annals of Mathematics Studies, Princeton University Press, 1974.

[Mo] Shigeyuki Morita: Geometry of Differential Forms. Translations of Mathematical Monographs, American Mathematical Society, 2000.
Index

abstract root system, 68 reduced, 68 Â-class, 196 action, 151 acyclic complex, 156 ad-nilpotent, 33 adjoint representation, 30 algebra complexification of, 120 analytic index map, 236 angular momentum, 105, 107 angular momentum operator, 105 antisymmetric operator, 97 arc measure, 16 associated fiber bundle, 169 associated measure, 15 associated vector bundle, 169 connection, 171 Atiyah-Singer Index Theorem, 235, 240, 242 Atiyah-Singer operator, 220 Atkinson's Theorem, 226 base point, 142 based G-space, 153 based space, 142 Bernoulli numbers, 195 Betti number, 229 Bianchi identity, 181 bilinear form complexification of, 120 negative definite, 111 non-degenerate, 111 positive definite, 111 bosonic subalgebra, 116 Bott periodicity, 150, 151, 161 from Thom isomorphism, 161 bundle map equivariant, 152 C∞-vectors, 92 are dense in H, 95 cancellation property, 140, 152 canonical 1-form, 171 canonical anti-automorphism, 115 canonical automorphism, 116 canonical generator, 81 Cartan formula, 205, 209 Cartan subalgebra, 59, 87 existence, 59 real form, 67 Cartan's Criterion, 36 Cartan-Bott Theorem, 119, 121 Cartan-Dieudonné Theorem, 129 Casimir element, 55 Cauchy-Schwarz inequality, 26 center, 31 centralizer, 30 character, 22, 24 irreducible, 22 characteristic class, 182 Chern character, 197 in K-theory, 199 on non-compact spaces, 199 Chern class, 184 sum formula, 184 total, 184 uniqueness of, 191 chiral spinors even, 123 odd, 123 circle group, 14 irreducible representations, 14 class function, 20 Clebsch-Gordan theory, 12 Cl(Φ), 125 Clifford algebra, 111 Cl_{0,1}, 114 Cl_{0,2}, 114 Cl_{p,q}, 114 existence, 111 functoriality of, 113 uniqueness, 111 Z_2-grading, 116 Clifford bundle, 210 Clifford group, 125 Clifford module, 211 compact G-pair, 153 compact pair, 144 completely reducible, 12, 26 complex
elliptic, 228, 229 Euler characteristic, 229 harmonic section, 229 complex n-spinor, 123 complexification, 186 of a Lie algebra, 39 of a representation, 52 of a vector space, 39 cone, 145 conjugate bundle, 185 conjugation, 126 connection, 164 1-form, 165 complexification, 186 direct sum, 185 flat, 176 formal adjoint, 205 is a differential operator, 202 Levi-Civita, 180 metric, 178, 180 pullback, 173 Riemannian, 180 symmetric, 180 trivial, 164 connection Laplacian, 209 symbol, 209 co-root, 63 covariant derivative, 164 covariant exterior derivative, 176 covering action, 131 covering map, 131 covering space, 131 curvature, 176 2-form, 177 cyclic highest weight vector, 76 de Rham cohomology, 182 de Rham complex, 230 is elliptic, 230 defining representation, 10, 88 derived algebra, 31 derived series, 31 differentiability, 91 differential operator, 201 continuous family, 226 elliptic, 209 homotopy, 227 symbol, 206 Dirac bundle, 211 Dirac Laplacian, 217 Dirac operator, 220 formally self-adjoint, 217 is elliptic, 217 of the exterior bundle, 218 symbol, 217 Dirac spinor, 123 field, 212 Dirac spinor field, 212 Dirac type operator, 209 direct sum, 10, 51, 52 of Hilbert spaces, 10 directional derivative, 91 Dixmier-Malliavin Theorem, 104 dominant element, 73, 87 dominated convergence, 93 dual bundle, 185 elementary symmetric polynomials, 192 elliptic complex, 228 elliptic operator, 209 is Fredholm, 226 elliptic regularity, 225 endomorphism algebra, 30 Engel's Theorem, 34 equivalence of representations, 11, 50 equivalent representations, 11, 50 equivariant K-group, 152 equivariant bundle map, 152 equivariant function, 169 equivariant map, 151 equivariant section, 152 essentially skew-adjoint, 97 Euler characteristic, 229 Euler class, 189 and nonzero sections, 190 and the Thom isomorphism, 237 even action, 131 evenly covered, 131 exterior bundle Dirac operator, 218 is a Dirac bundle, 211 exterior derivative, 218 formal adjoint, 205, 218 is a differential operator, 202 symbol, 208 exterior product, 149, 150 fermionic subspace, 116 F-genus, 194 fiber bundle, 247 fiber metric, 178 filtration of the tensor algebra, 44 flat connection, 176 formal adjoint, 204 formal power series, 193 Fourier coefficient, 27 Fourier series, 27 Fourier theory, 27 Fredholm index, 226 Fredholm operator, 226
fundamental class, 194, 239, 240 fundamental group of O(n), 250 of SO(n), 249 of SU(n), 249 of U(n), 249 fundamental representation, 13 fundamental representations, 88 fundamental system, 70, 87 Fundamental Theorem of Riemannian Geometry, 180 G-complex, 155 acyclic, 156 direct sum, 157 homotopy, 157 pullback, 157 support, 155 tensor product, 157 G-homotopic maps, 152 G-map, 151 G-module, 9 G-space, 151 G-vector bundle, 151 g-module, 49 Gauss-Bonnet Theorem, 242 Gelfand's Theorem, 13 generalized Hodge Theorem, 229 generalized Laplacian, 209 graded module, 149 graded tensor product, 116 Gysin homomorphism, 236 Gårding subspace, 94, 95 is dense in H, 95 Gårding vector, 94 Gårding's Theorem, 95 Haar integral, 15 left, 15 right, 15 Haar measure, 15 for T, 16 for R^n, 15 left, 15 on compact group, 16 right, 15 half space, 70 harmonic analysis, 27 harmonic element, 227 harmonic section, 229 highest weight, 76, 80, 88 is a dominant integral element, 79 highest weight module, 80 Highest Weight Theorem, 86 highest weight vector, 76, 80 cyclic, 76 Hirzebruch L-class, 196 Hirzebruch L-sequence, 196 Hodge Laplacian, 208 is elliptic, 209 symbol, 208 Hodge star operator, 205 Hodge-de Rham operator, 208, 220, 231 and the Gauss-Bonnet Theorem, 243 Fredholm index, 231 is elliptic, 209, 231 symbol, 208 homogeneous manifold, 247 is a fiber bundle, 248 homotopy, 157 homotopy equivalence induces isomorphism, 141 homotopy groups, 247 homotopy long exact sequence, 247 ideal, 29 index function, 235 index notation, 12 infinitesimal representation, 50 integral element, 79 integration along fiber, 236 intertwiner, 10, 50, 121 intertwining map, 10 intertwining number, 11 invariant polynomial, 180 invariant subspace, 11, 50, 121 irreducible character, 22 irreducible representation, 12, 24, 26, 50 isomorphism of complexes, 156 Jacobi identity, 29 K-group, 140 of a sphere, 150 of a torus, 150 quaternionic, 141 real, 141 relative, 144 ring structure, 141 total, 145 Killing form, 36, 37, 62, 64 radical of, 37 Koszul complex, 156, 160 K-theory induced map, 141 is a contravariant functor, 141 with compact support, 154
L-class, 196 left Haar integral, 15 left Haar measure, 15 left regular representation, 27 Leibniz rule, 176 level, 70 Levi-Civita connection, 172, 180 lexicographic ordering, 71 Lie algebra, 29 abelian, 31 indecomposable, 31 nilpotent, 32, 34 radical of, 32 reductive, 40 semisimple, 32, 37 simple, 31 solvable, 31 Lie algebra homomorphism, 30 Lie algebra isomorphism, 30 Lie algebra representation, 49, 137 dimension, 49 equivalence, 50 faithful, 49 induced, 50 irreducible, 50 Lie derivative, 202 Lie group representation, 49 Lie subalgebra, 29 Lipschitz group, 125 local operator, 165, 201 local trivialization, 247 locally trivial bundle, 247 long exact sequence in K-theory, 147, 154 matrix coefficient, 18, 24 matrix Lie group, 10 maximal torus, 59 measurable section, 220 metric, 178 modular function, 17, 98 momentum, 102 momentum operator, 104 morphism of complexes, 156 multiplicative sequence, 193 Nash Embedding Theorem, 234 Newton polynomial, 193 recursion formula, 193 Newton relations, 181 norm, 126 normal bundle, 233 normalizer, 31 O(Φ), 113 operator antisymmetric, 97 essentially skew-adjoint, 97 orbital angular momentum, 105 orientation, 187 orientation class, 239, 240 orthogonal decomposition, 116 orthogonal group, 113 orthogonal isomorphism, 113 orthogonal linear map, 113 orthonormal basis, 115 parametrix, 226 Pauli spin matrices, 105 PBW-basis, 48 PBW-Theorem, 44 Peter-Weyl Theorem, 24, 26 Pfaffian, 186 pin group, 128 is a Lie group, 131 Pin^c(Φ), 128 Pin(Φ), 128 is a double covering of O(Φ), 132 Pin(n), 128 is compact, 132 Poincaré duality, 236 Poincaré-Birkhoff-Witt Theorem, 44 polarization identity, 111 Pontrjagin class, 184 of tangent bundle of S^n, 185 sum formula, 184 total, 184 positive integral, 15 positive root, 70, 75 level, 70 positive roots, 87 positive system, 70, 75 positivity, 70 principal symbol, 206 proper neighborhood, 201 pullback bundle, 140, 172 quadratic form, 111 complexification of, 120 negative definite, 111 non-degenerate, 111 positive definite, 111 quantization map, 113, 212 quantum mechanics, 102 quotient algebra, 30 radical, 32 of a bilinear form, 54 of the Killing form, 37 rank
of a Lie algebra, 59 of a root system, 68 real form, 67 realification, 187 is orientable, 188 reduced K-group, 142, 153 reduced K-theory, 142, 153 reduced suspension, 145 reflection through hyperplane, 125 relative K-group, 144 Rellich Lemma, 225 representation, 9, 121 completely reducible, 12 complexification, 121 dimension of, 9 direct sum, 22 equivalence, 11, 121 faithful, 9, 49 irreducible, 12, 24, 50, 121 of a Lie algebra, 49 of a Lie group, 49 of Clifford algebras, 122 of spin groups, 135 tensor product, 22 unitary, 10 retract, 148 Riemannian connection, 172, 180 Riemannian vector bundle, 178 Riesz Representation Theorem, 15 right Haar integral, 15 right Haar measure, 15 root, 61, 68 is an integral element, 79 positive, 70, 75 reduced, 68 simple, 70 root reflection, 68 root space, 61 root space decomposition, 61, 75 root string, 65 root system abstract, 68 basis, 70 irreducible, 69 isomorphism, 69 rank, 68 reduced, 68 reducible, 69 root vector, 61 root vectors, 87 rotation group, 104 rotation operator, 104 Schur Orthogonality, 19, 20 Schur's Lemma, 13, 51 section p-integrable, 221 equivariant, 152 locally p-integrable, 221 measurable, 220 with compact support, 203 sheet of a covering, 131 simple acyclic complex, 156 simple system, 70 six-term exact sequence, 151 SN-decomposition, 35 SO(Φ), 113 Sobolev Embedding Theorem, 103, 225 Sobolev norm, 222, 224 independence of connections, 222 Sobolev space, 223 special orthogonal group, 113 Spectral Theorem, 25 spin bundle connection, 214 spin connection, 214 Spin group Lie algebra, 134 spin group, 128 is a Lie group, 131 Spin(3), 130 spin representation complex, 123 real, 122 spin-Dirac operator, 220 Spin^c(Φ), 128 spin^c-representation, 135, 138 irreducibility, 138 is faithful, 138 Spin^c(3), 130 Spin^c(4), 131 Spin(Φ), 128 is a double covering of SO(Φ), 132 Spin(n), 128 is compact, 132 is simply connected, 133 is universal covering of SO(n), 133 spin(n), 134 spinor, 122 chiral, 123 Dirac, 123 field, 212 Weyl, 123 spinor bundle, 212 connection, 214 Dirac operator, 220 is a Dirac bundle, 212 spinor field, 212 spinor representation, 135
irreducibility, 136, 137 is faithful, 135 Lie algebra representation, 137 spinorial representation, 12 splitting principle, 191, 198 stable equivalence, 143, 153 Stone's Theorem, 102 strong operator topology, 9 super tensor product, 116 suspension, 145 reduced, 145 symbol of a differential operator, 206 symbol class, 233 symbol map, 113, 134 symmetric algebra, 42 symmetric polynomial, 192 elementary, 192 tensor algebra, 42 filtration, 44 tensor product, 10, 51, 52, 81 of Hilbert spaces, 10 Thom class, 237 Thom defect, 237 is multiplicative, 237 is natural, 237 of a trivial bundle, 237 Thom Defect Formula, 237 Thom homomorphism, 160 Thom isomorphism K-theory, 160 and the Euler class, 237 in cohomology, 236 is a module homomorphism, 237 pullbacks, 162, 237 transitivity, 161 Todd class, 196 Todd sequence, 195 topological index, 233, 235 torsion, 171 torsion tensor, 180 torus, 59 maximal, 59 total F-class, 194 total Chern class, 184 sum formula, 184 total Pontrjagin class, 184 sum formula, 184 total Â-class, 196 translation operator, 102 trivial representation, 10 trivialization cover, 139 tubular neighborhood, 233 twisted adjoint representation, 125 kernel of, 126 unimodular group, 17 unitarization, 16 unitary representation, 10, 16 universal covering, 131 universal enveloping algebra, 42 Urysohn's Lemma, 13 vector bundle, 139 conjugate, 185 dual, 185 orientable, 187 orientation, 187 oriented, 187 pullback, 140, 172 realification, 187 vector field formal adjoint, 205 is a differential operator, 202 symbol, 207 vector space complexification of, 120 Verma module, 81 volume element, 123, 137 weak derivative, 103 weak integral, 93 wedge product, 175 weight, 60, 80, 87 highest, 76 is an integral element, 79 weight space, 60, 80 weight space decomposition, 60 weight vector, 60, 80 highest, 76 Weyl chamber, 75 Weyl group, 72 Weyl Lemma, 225 Weyl spinors negative, 123 positive, 123 Weyl's Theorem, 56 Whitney sum formula, 184