
HAHN BANACH THEOREMS

S.H. KULKARNI

Abstract. We give an overview of the Hahn-Banach theorems. We present the statements of these theorems along with some definitions that are required to understand them, and make some comments about their relevance, applications etc. Proofs are not given.

1. Introduction

The Hahn-Banach theorems are among the three most important and fundamental theorems of basic Functional Analysis, the other two being the Uniform Boundedness Principle and the Closed Graph Theorem. Usually the Hahn-Banach theorems are taught before the other two, and most books also present the Hahn-Banach theorems ahead of the Uniform Boundedness Principle and the Closed Graph Theorem. This may be due to several reasons. The statements, proofs and applications of the Hahn-Banach theorems are relatively easy to understand. In particular, the hypotheses do not include completeness of the underlying normed linear spaces, and the proofs do not involve the Baire Category Theorem.

In this article (an expanded version of a talk given at a Regional Workshop in Analysis, Villupuram) we give an overview of the Hahn-Banach theorems. We present the statements of these theorems along with some definitions that are required to understand them, and make some comments about their relevance, applications etc. Detailed proofs are not given, as these can be found in any introductory textbook on Functional Analysis, for example [1, 4, 6].

There are two classes of theorems commonly known as Hahn-Banach theorems, namely the Hahn-Banach theorems in the extension form and the Hahn-Banach theorems in the separation form. All these theorems assert the existence of a linear functional with certain properties. Why is it important to know that such functionals exist? In a large number of applications of practical importance, the objects of study can be viewed as members of a vector space. A study of such objects involves making various measurements or observations; these are functionals on that vector space.

As the names suggest, the Hahn-Banach theorems in the extension form assert that functionals defined on a subspace of a vector space (frequently with some additional structure, usually a norm or a topology) and having some additional properties (such as linearity or continuity) can be extended to the whole space while retaining these additional properties. This is useful in asserting the existence of certain functionals, which in turn can be used in applications involving the approximation of certain functions. These theorems and their proofs are analytic in nature.

On the other hand, the Hahn-Banach theorems in the separation form, and also their proofs, are geometric in nature. These theorems deal with the following question: given two disjoint convex sets in a vector space, when is it possible to find a hyperplane such that the two convex sets lie on opposite sides of that hyperplane? This information is useful in answering several questions about optimization problems, also known as Mathematical Programming.

The Hahn-Banach theorems in these two forms are mathematically equivalent. This means that it is possible to prove a Hahn-Banach theorem in the extension form first and then use it to prove a corresponding theorem in the separation form, or to prove a Hahn-Banach theorem in the separation form first and then use it to prove a corresponding theorem in the extension form. The preference between these two approaches is a matter of personal taste. Each can be found in the textbooks; for example, the first approach can be found in [1] whereas the second can be found in [4].

2. Functionals and their extensions

The Hahn-Banach theorems are essentially theorems about real vector spaces. The basic theorems are first proved for real vector spaces and are then extended to the case of complex vector spaces by means of a technical result (see Lemma 7.1 of [4] and the remarks preceding it). In this article we shall confine ourselves to real vector spaces. Common examples of real vector spaces are Rⁿ, the sequence spaces ℓᵖ, 1 ≤ p ≤ ∞, and the function spaces C([a, b]), with pointwise or coordinatewise operations as the case may be. A functional on such a vector space is a real valued function defined on it. Next we list several properties that such a functional may or may not have.

Definition 2.1. Let V be a real vector space. A functional on V is a function φ : V → R. Such a φ is said to be
(1) additive if φ(a + b) = φ(a) + φ(b) for all a, b ∈ V,
(2) homogeneous if φ(αa) = αφ(a) for all a ∈ V and α ∈ R,
(3) linear if it is additive and homogeneous,
(4) subadditive if φ(a + b) ≤ φ(a) + φ(b) for all a, b ∈ V,
(5) positive homogeneous if φ(αa) = αφ(a) for all a ∈ V and α ∈ R with α ≥ 0,
(6) sublinear if it is subadditive and positive homogeneous,
(7) convex if φ((1 − t)a + tb) ≤ (1 − t)φ(a) + tφ(b) for all a, b ∈ V and t ∈ [0, 1].

Several examples of functionals having some of the above properties and lacking some of the others can be given. These are easy to construct and can be found in most books on Functional Analysis, including [1, 4, 6]. For example, the functional φ defined on R² by φ(x₁, x₂) := |x₁| + |x₂| for (x₁, x₂) ∈ R² is sublinear but not linear. On the other hand, the functional ψ defined on R² by ψ(x₁, x₂) := 2x₁ + x₂ for (x₁, x₂) ∈ R² is linear. It is obvious that a linear functional satisfies all the above properties. It is a good exercise to construct examples of functionals that satisfy some of the above properties but are not linear.
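
The following short numerical check, a Python sketch that is not part of the original article, illustrates the example above: φ(x₁, x₂) = |x₁| + |x₂| is subadditive and positively homogeneous (hence sublinear) but fails homogeneity for negative scalars, so it is not linear, whereas ψ(x₁, x₂) = 2x₁ + x₂ is linear.

```python
# Numerical sanity check (not part of the original article) of the example above:
# phi(x1, x2) = |x1| + |x2| is sublinear (subadditive and positively homogeneous)
# but not linear, while psi(x1, x2) = 2*x1 + x2 is linear.
import numpy as np

rng = np.random.default_rng(0)

def phi(x):
    return abs(x[0]) + abs(x[1])

def psi(x):
    return 2 * x[0] + x[1]

for _ in range(1000):
    a, b = rng.normal(size=2), rng.normal(size=2)
    t = rng.uniform(0.0, 10.0)
    assert phi(a + b) <= phi(a) + phi(b) + 1e-12      # subadditivity
    assert np.isclose(phi(t * a), t * phi(a))          # positive homogeneity (t >= 0)
    assert np.isclose(psi(a + b), psi(a) + psi(b))     # psi is additive

# phi fails homogeneity for negative scalars, so it is not linear:
x = np.array([1.0, 2.0])
print(phi(-1 * x), "differs from", -1 * phi(x))        # 3.0 vs -3.0
```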

We are now in a position to state the first of the Hahn-Banach Extension theorems. It is used in proving all the other extension theorems. It says that a linear functional defined on a subspace of a real vector space V and dominated there by a sublinear functional defined on V has a linear extension to V which is dominated by the same sublinear functional.

Theorem 2.2. Let V be a real vector space and p a sublinear functional defined on V. Suppose W is a subspace of V and φ is a linear functional defined on W such that φ(x) ≤ p(x) for all x ∈ W. Then there exists a linear functional ψ defined on V such that ψ(x) = φ(x) for all x ∈ W and ψ(y) ≤ p(y) for all y ∈ V.

The usual proof of this theorem involves two steps (see [6] for details). Of course, if W = V, there is nothing to prove. When W is a proper subspace, some x₀ ∈ V \ W is chosen and φ is extended to the linear span W₀ of W ∪ {x₀} in such a way that the extension ψ remains dominated by p on W₀. This is the first step; the crucial idea in it is to choose the value of ψ(x₀) in an appropriate manner. The second step makes use of Zorn's Lemma to construct a maximal subspace containing W to which φ can be extended satisfying the required condition. Finally it is shown, using the first step, that this maximal subspace must coincide with V.

Though the above formulation of the Hahn-Banach Extension theorem is the most basic one, the most popular version is stated in the context of normed linear spaces. In order to understand this version, we need to review a few basic definitions.

Definition 2.3. Let X be a real vector space. A norm on X is a function ‖·‖ : X → R satisfying:
(1) ‖x‖ ≥ 0 for all x ∈ X, and ‖x‖ = 0 if and only if x = 0,
(2) ‖αx‖ = |α| ‖x‖ for all x ∈ X and α ∈ R,
(3) ‖x + y‖ ≤ ‖x‖ + ‖y‖ for all x, y ∈ X.
A real normed linear space is a pair (X, ‖·‖) where X is a real vector space and ‖·‖ is a norm on X.

Examples of real normed linear spaces include Rⁿ (with several norms), the sequence spaces ℓᵖ, 1 ≤ p ≤ ∞, the function spaces Lᵖ, 1 ≤ p ≤ ∞, C(X), C([a, b]) and C¹([a, b]). Among the various norms on Rⁿ, the following are the most frequently used. For x = (x₁, . . . , xₙ) ∈ Rⁿ,
(1) ‖x‖₁ := |x₁| + · · · + |xₙ|,
(2) ‖x‖₂ := (|x₁|² + · · · + |xₙ|²)^(1/2),
(3) ‖x‖∞ := max{|x₁|, . . . , |xₙ|}.

Definition 2.4. Let (X, ‖·‖) be a normed linear space. A linear functional φ on X is said to be bounded if sup{|φ(x)| : x ∈ X, ‖x‖ ≤ 1} is finite. When this is the case, the above quantity is called the norm of φ and is denoted by ‖φ‖.

It is easy to show (and is also proved in most textbooks) that a linear functional φ on X is bounded if and only if it is continuous at each x ∈ X, if and only if it is uniformly continuous on X. Also note that if φ is a bounded linear functional on X, then |φ(x)| ≤ ‖φ‖ ‖x‖ for all x ∈ X. See the references at the end for several examples of bounded as well as unbounded linear functionals.
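
As a small illustration, not part of the original article, the Python sketch below computes the three norms listed above for a sample vector and checks the bound |φ(x)| ≤ ‖φ‖ ‖x‖ for the functional φ(x) = c · x on (R³, ‖·‖₂); here ‖φ‖ = ‖c‖₂ by the Cauchy-Schwarz inequality.

```python
# Illustration (not from the article): the three norms on R^n listed above, and the
# bound |phi(x)| <= ||phi|| ||x|| for a bounded linear functional on (R^3, ||.||_2).
import numpy as np

x = np.array([3.0, -4.0, 1.0])
print("||x||_1   =", np.sum(np.abs(x)))       # 8.0
print("||x||_2   =", np.linalg.norm(x))       # sqrt(26)
print("||x||_inf =", np.max(np.abs(x)))       # 4.0

# Every linear functional on R^3 is of the form phi(x) = c . x for some c in R^3.
# With the Euclidean norm, ||phi|| = ||c||_2 by the Cauchy-Schwarz inequality.
c = np.array([2.0, -1.0, 7.0])
phi = lambda v: c @ v
norm_phi = np.linalg.norm(c)

rng = np.random.default_rng(1)
for _ in range(1000):
    v = rng.normal(size=3)
    assert abs(phi(v)) <= norm_phi * np.linalg.norm(v) + 1e-12
print("|phi(x)| <= ||phi|| ||x|| verified on 1000 random vectors")
```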

The popular version of the Hahn-Banach Extension theorem mentioned above says that every bounded linear functional defined on a subspace of a normed linear space has a norm preserving extension to the whole space.

Theorem 2.5. Let (X, ‖·‖) be a normed linear space, Y a subspace of X and φ a bounded linear functional defined on Y. Then there exists a bounded linear functional ψ defined on X such that ψ(y) = φ(y) for all y ∈ Y and ‖ψ‖ = ‖φ‖.

This is proved by observing that the functional p defined on X by p(x) := ‖φ‖ ‖x‖, x ∈ X, is a sublinear functional that dominates φ on Y, and then using Theorem 2.2.

This theorem is frequently used to deal with questions about approximation. Typically we want to know whether an element a ∈ X can be approximated by elements of a subspace Y, that is, whether a belongs to the closure of Y. For example, one may want to know whether a continuous function f defined on the interval [0, 1] can be approximated by polynomials. This amounts to taking X = C([0, 1]) and Y to be the subspace of all polynomials. The following well known corollary of the Hahn-Banach Extension theorem helps in answering this question.

Corollary 2.6. Let (X, ‖·‖) be a normed linear space, Y a subspace of X and a ∈ X. Then a belongs to the closure of Y if and only if there exists no bounded linear functional φ on X such that φ(y) = 0 for all y ∈ Y and φ(a) ≠ 0. Equivalently, a does not belong to the closure of Y if and only if there exists a bounded linear functional φ on X such that φ(y) = 0 for all y ∈ Y and φ(a) ≠ 0.

Given X, Y and a, how does one decide whether such a bounded linear functional exists? In general this is difficult to answer; to do so we need a knowledge of all bounded linear functionals on X. This is the theme of another class of theorems in Functional Analysis known as the Riesz Representation Theorems. These theorems give a complete description of the bounded linear functionals on various concrete normed linear spaces. See the references at the end for more information on these theorems. An interesting application of this idea to the Poisson Integral Formula can be found in [5].
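
In a finite dimensional space every subspace is closed, so Corollary 2.6 can be checked directly. The following Python sketch, not part of the original article and using a made-up subspace Y of R⁴, looks for a functional that vanishes on Y but not at a; finding one certifies that a does not belong to (the closure of) Y.

```python
# Finite dimensional illustration (not from the article) of Corollary 2.6, with a
# made-up subspace Y of R^4: a lies in Y exactly when every linear functional that
# vanishes on Y also vanishes at a.
import numpy as np

Y = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0],
              [0.0, 0.0]])              # columns span a 2-dimensional subspace of R^4
a = np.array([1.0, 2.0, 3.0, 4.0])      # candidate vector

# Functionals vanishing on Y are phi(x) = c . x with Y^T c = 0; a basis of that
# null space is read off from the SVD of Y^T.
_, s, vt = np.linalg.svd(Y.T)
rank = int(np.sum(s > 1e-10))
annihilators = vt[rank:]                # rows c with c . y = 0 for every y in Y

values = annihilators @ a
if np.allclose(values, 0.0):
    print("a belongs to Y: every functional vanishing on Y also vanishes at a")
else:
    c = annihilators[int(np.argmax(np.abs(values)))]
    print("a is not in Y; phi(x) = c . x separates it, with c =", c, "and phi(a) =", c @ a)
```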

3. Convex sets and their separation

In this section we discuss the Hahn-Banach theorems in the separation form. As mentioned in the introduction, these theorems deal with the question of the existence of a hyperplane that separates two given disjoint convex subsets. We first need to review a few definitions and notations. Let V be a real vector space, A, B nonempty subsets of V and α a real number. Then the symbols A + B and αA have their natural meanings:
A + B := {a + b : a ∈ A, b ∈ B},
αA := {αa : a ∈ A}.
Note that with these notations A + B = B + A, but A + A need not be the same as 2A. A special case of the above that occurs very frequently is when B is a singleton, say {x}. Thus A + x, or x + A, is the set {x + a : a ∈ A}. This is called the translate of A by x, for obvious reasons.

A subset C of V is said to be convex if for each t with 0 ≤ t ≤ 1 we have (1 − t)C + tC ⊆ C. In other words, for each x, y ∈ C and 0 ≤ t ≤ 1, (1 − t)x + ty ∈ C. This is usually expressed by saying that if C contains two points, then it also contains the line segment joining those two points. We come across several examples of convex sets. For example, every subspace is a convex set. A translate of a subspace by a nonzero element is not a subspace, but it is a convex set.

Convex sets and functionals are closely related concepts, in the sense that we can associate convex sets with functionals and vice versa. For example, if φ is a convex functional (see Definition 2.1) and α is a real number, then the set {x ∈ V : φ(x) ≤ α} is convex: if φ(x) ≤ α and φ(y) ≤ α, then φ((1 − t)x + ty) ≤ (1 − t)φ(x) + tφ(y) ≤ α for every t ∈ [0, 1]. If φ is a linear functional, then each of the following is a convex set:
(1) {x ∈ V : φ(x) ≤ α},
(2) {x ∈ V : φ(x) ≥ α},
(3) {x ∈ V : φ(x) = α}.

Next recall that a subspace is called proper if it is different from the whole space V. A hyperspace is a maximal proper subspace, that is, a proper subspace not properly contained in any other proper subspace. If V is of finite dimension, say n, then every subspace of dimension n − 1 is a hyperspace. Thus in R², every straight line passing through the origin is a hyperspace. Similarly, the hyperspaces in R³ are the planes passing through the origin. It is well known, and also easy to prove, that if φ is a nonzero linear functional on V, then the null space of φ, defined by N(φ) := {x ∈ V : φ(x) = 0}, is a hyperspace. Conversely, every hyperspace is the null space of some nonzero linear functional. Next, a hyperplane is a translate of a hyperspace.
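
As a concrete finite dimensional illustration, not from the original article and using a made-up functional, the Python sketch below computes the null space of a nonzero linear functional on R³, confirms that it is a plane through the origin (a hyperspace), and checks that the functional is constant on a translate of that null space.

```python
# Illustration (not from the article, using a made-up functional): the null space of
# a nonzero linear functional on R^3 is a plane through the origin, i.e. a hyperspace,
# and on any translate of it the functional is constant.
import numpy as np

c = np.array([1.0, -2.0, 3.0])                 # phi(x) = c . x, a nonzero linear functional

_, s, vt = np.linalg.svd(c.reshape(1, 3))      # null space N(phi) from the SVD of c
basis = vt[1:]                                 # two orthonormal vectors spanning N(phi)
print(basis.shape)                             # (2, 3): N(phi) has dimension 2
print(np.allclose(basis @ c, 0.0))             # True: phi vanishes on the basis vectors

a = np.array([1.0, 0.0, 0.0])                  # phi(a) = 1, so a + N(phi) is a hyperplane
point = a + 2.5 * basis[0] - 1.0 * basis[1]    # an arbitrary point of a + N(phi)
print(np.isclose(c @ point, c @ a))            # True: phi is constant on a + N(phi)
```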

Thus the hyperplanes in R³ are the planes (this may very well be the reason for the name "hyperplane"). Further, if H is a hyperplane, then there exist a nonzero linear functional φ and a vector a ∈ V such that H = a + N(φ). Now suppose φ(a) = α. Let h ∈ H. Then h = a + y for some y ∈ N(φ), hence φ(h) = φ(a) + φ(y) = α. On the other hand, suppose x ∈ V is such that φ(x) = α. Let y = x − a. Then φ(y) = 0, that is, y ∈ N(φ), and x = a + y ∈ a + N(φ) = H. Thus we have proved that with every hyperplane H we can associate a pair (φ, α), where φ is a nonzero linear functional and α is a real number, such that H = {x ∈ V : φ(x) = α}.

For example, consider the plane given by the equation 2x₁ − x₂ + 7x₃ − 5 = 0 in R³. This is a hyperplane and the associated pair is (φ, 5), where the functional φ is given by φ(x) := 2x₁ − x₂ + 7x₃ for x = (x₁, x₂, x₃) ∈ R³.

Every such hyperplane H divides the whole space V into two convex sets,
H_l = {x ∈ V : φ(x) ≤ α} and H_r = {x ∈ V : φ(x) ≥ α},
known as halfspaces. We say that two nonempty subsets A and B of V are separated by H if A and B lie on different sides of H, that is, if A and B are contained in different halfspaces formed by H. We are now ready to present the basic separation theorem.

Theorem 3.1. Let A and B be nonempty disjoint convex subsets of a real vector space V. Then A and B can be separated by a hyperplane, that is, there exist a nonzero linear functional φ and a real number α such that φ(x) ≤ α ≤ φ(y) for all x ∈ A and y ∈ B.

A proof of this can be found in [1]. Also note that we cannot expect to separate nonconvex sets by hyperplanes in this way. For example, suppose A is the circle in R² with centre at the origin and radius 1, and B is the singleton set consisting of the origin. Then no straight line can separate A and B. Recall that straight lines are the hyperplanes in R².
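
The short Python sketch below, not part of the original article, revisits the hyperplane example above: it evaluates the associated functional φ(x) = 2x₁ − x₂ + 7x₃ at a few sample points and reports which of the two halfspaces H_l, H_r each point lies in.

```python
# Numeric illustration (not part of the article) of the pair (phi, alpha) associated
# with the hyperplane 2*x1 - x2 + 7*x3 = 5 in R^3, and of the two halfspaces H_l, H_r.
import numpy as np

c = np.array([2.0, -1.0, 7.0])                 # phi(x) = c . x
alpha = 5.0                                    # H = {x : phi(x) = alpha}

samples = {
    "a point of H":  np.array([1.0, -3.0, 0.0]),   # phi = 2 + 3 + 0 = 5
    "the origin":    np.array([0.0, 0.0, 0.0]),    # phi = 0
    "a point above": np.array([0.0, 0.0, 1.0]),    # phi = 7
}
for label, x in samples.items():
    val = c @ x
    sides = [name for name, ok in (("H_l (phi <= alpha)", val <= alpha),
                                   ("H_r (phi >= alpha)", val >= alpha)) if ok]
    print(f"{label}: phi(x) = {val}, lies in " + " and ".join(sides))
```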

Again, as in the case of the extension theorems, though the above theorem is very basic, the more popular ones are stated (and proved) in the context of a normed linear space (see Definition 2.3). Here we look for separation of convex sets not by arbitrary hyperplanes but by closed hyperplanes. This means that the associated linear functionals must be continuous (see Definition 2.4 and the remarks following it). The following is usually known as the Hahn-Banach Separation Theorem in normed linear spaces. Its proof can be found in [1, 4, 6].

Theorem 3.2. Let A and B be nonempty disjoint convex subsets of a real normed linear space X. Suppose A has an interior point. Then A and B can be separated by a closed hyperplane, that is, there exist a nonzero continuous linear functional φ and a real number α such that φ(x) < α ≤ φ(y) for all x ∈ A and y ∈ B.

The following corollary of this theorem is also very popular and is used in several applications.

Corollary 3.3. Let A and B be nonempty disjoint convex subsets of a real normed linear space X. Suppose A is compact and B is closed. Then A and B can be strictly separated by a closed hyperplane, that is, there exist a nonzero continuous linear functional φ and real numbers α and β such that φ(x) < α < β < φ(y) for all x ∈ A and y ∈ B.

As mentioned in the introduction, these separation theorems have applications to problems of optimization, in particular to the class of problems known as convex programming problems. We mention only a very interesting and famous result, known as Farkas' Lemma, which is a direct consequence of the separation theorem.

Theorem 3.4. Let A be a matrix of order m × n with real entries and b ∈ Rᵐ, regarded as a column matrix of order m × 1. Then exactly one of the following alternatives holds.
(i) The system of equations Ax = b has a nonnegative solution x ∈ Rⁿ. (Here nonnegative means x_j ≥ 0 for each j = 1, . . . , n.)
(ii) The system of inequalities yᵀA ≥ 0 and yᵀb < 0 has a solution y ∈ Rᵐ.

It is easy to see that the two alternatives cannot hold simultaneously, because if they did, there would exist x as in (i) and y as in (ii), and then 0 ≤ (yᵀA)x = yᵀ(Ax) = yᵀb < 0, a contradiction. To prove that one of the alternatives must hold, the convex set C := {Ax : x ∈ Rⁿ, x ≥ 0} ⊆ Rᵐ is considered. If b ∈ C, alternative (i) holds. Otherwise, C and {b} are nonempty disjoint convex sets in Rᵐ and hence can be separated by a hyperplane H. The proof is completed by noting that with every such hyperplane H in Rᵐ we can associate a vector y ∈ Rᵐ and a real number α such that H = {u ∈ Rᵐ : yᵀu = α}, and then showing that in this particular case α can be taken to be 0.
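
As a computational aside, not part of the original article, the Python sketch below (scipy is assumed available) decides which Farkas alternative holds for a small concrete A and b by testing the feasibility of Ax = b, x ≥ 0 with a linear programming solver.

```python
# Computational sketch (not from the article; scipy is assumed available): deciding
# which Farkas alternative holds for a concrete A and b, by testing feasibility of
# Ax = b, x >= 0 with a linear programming solver.
import numpy as np
from scipy.optimize import linprog

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
b = np.array([1.0, 1.0])          # the unique solution of Ax = b is (-1, 1), not >= 0

res = linprog(c=np.zeros(A.shape[1]),               # pure feasibility: minimize 0
              A_eq=A, b_eq=b,
              bounds=[(0, None)] * A.shape[1])

if res.success:
    print("Alternative (i): nonnegative solution x =", res.x)
else:
    # By Farkas' Lemma a certificate y with y^T A >= 0 and y^T b < 0 must exist
    # (here, for instance, y = (-2, 1)); it can be extracted from the dual problem.
    print("Alternative (ii): Ax = b has no nonnegative solution")
```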

Details of this proof, as well as the application of Farkas' Lemma to the duality theory of Linear Programming, can be found in [2]. A highly refreshing and readable account of Farkas' Lemma and its connection with several problems in Mathematics and Economics is given in [3].

4. Limitations

After having said so many things about the importance and relevance of the Hahn-Banach theorems, we finally also point out some limitations of these theorems and of methods based on them. In fact, these limitations are common to many theorems and methods of Functional Analysis. The statements of all these Hahn-Banach theorems are existence statements, that is, they assert the existence of certain linear functionals. However, all the proofs are nonconstructive in nature. This means that the proofs give no clue about how to find the linear functional whose existence is asserted by a theorem, even when the vector space under consideration is finite dimensional. (The situation is very similar to the various proofs of the Fundamental Theorem of Algebra, none of which says anything about how to find a root of a given polynomial, though the Theorem asserts that every such polynomial must have a root.)

For example, suppose we are given two finite sets in R¹⁰. One way of giving these sets would be to give two matrices, each having ten rows. Let A and B denote the convex hulls of these two sets. (Recall that the convex hull of a set is the smallest convex set containing the given set.) Suppose we consider the following problem: determine whether A and B are disjoint, and in case they are disjoint, find a hyperplane in R¹⁰ separating A and B. It may appear at first that the Hahn-Banach Separation Theorem is useful in tackling this problem. It is useful in the sense that if A and B are disjoint, the Theorem says that there exists a hyperplane in R¹⁰ separating A and B. However, it says nothing about how to find such a hyperplane. Completely different methods have to be used to tackle this problem (see [2]).
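
The Python sketch below, which is ours and not taken from [2], illustrates the kind of constructive method alluded to above (scipy is assumed available): since the two convex hulls are compact, they are disjoint exactly when some w ∈ R¹⁰ and γ satisfy w·a ≤ γ − 1 on the first point set and w·b ≥ γ + 1 on the second, and such a pair can be found, when it exists, by solving a linear feasibility problem.

```python
# Sketch (not taken from [2]; scipy is assumed available): decide whether the convex
# hulls of two finite point sets in R^10 are disjoint and, if so, produce a separating
# hyperplane {x : w.x = gamma} by solving a linear feasibility problem.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
P = rng.normal(size=(10, 20))            # first set: 20 points in R^10 (columns)
Q = rng.normal(size=(10, 15)) + 5.0      # second set: 15 points, shifted away from P

n = P.shape[0]
# Unknowns z = (w, gamma); each point contributes one linear inequality.
A_ub = np.vstack([np.hstack([P.T, -np.ones((P.shape[1], 1))]),     #  w.a - gamma <= -1
                  np.hstack([-Q.T, np.ones((Q.shape[1], 1))])])    # -w.b + gamma <= -1
b_ub = -np.ones(P.shape[1] + Q.shape[1])

res = linprog(c=np.zeros(n + 1), A_ub=A_ub, b_ub=b_ub,
              bounds=[(None, None)] * (n + 1))

if res.success:
    w, gamma = res.x[:n], res.x[n]
    print("hulls are disjoint; {x : w.x = gamma} separates them")
    print(np.max(P.T @ w), "< gamma =", gamma, "<", np.min(Q.T @ w))
else:
    print("the convex hulls intersect; no separating hyperplane exists")
```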

References

[1] B. Bollobás, Linear analysis, Cambridge Univ. Press, Cambridge, 1990. MR1087297 (92a:46001)
[2] J. Franklin, Methods of mathematical economics, Springer, New York, 1980. MR0602694 (82e:90002)
[3] J. Franklin, Mathematical methods of economics, Amer. Math. Monthly 90 (1983), no. 4, 229-244. MR0700264 (84e:90001)
[4] B. V. Limaye, Functional analysis, Second edition, New Age, New Delhi, 1996. MR1427262 (97k:46001)
[5] W. Rudin, Real and complex analysis, Third edition, McGraw-Hill, New York, 1987. MR0924157 (88k:00002)
[6] A. E. Taylor and D. C. Lay, Introduction to functional analysis, Second edition, John Wiley & Sons, New York, 1980. MR0564653 (81b:46001)

Department of Mathematics, Indian Institute of Technology - Madras, Chennai 600036
E-mail address: shk@iitm.ac.in