A Globally Convergent Primal-Dual Interior Point Method for Constrained Optimization

Hiroshi Yamashita*

Abstract. This paper proposes a primal-dual interior point method for solving general nonlinearly constrained optimization problems. The method is based on solving the barrier Karush-Kuhn-Tucker conditions for optimality by the Newton method. To globalize the iteration we introduce the barrier penalty function and the optimality conditions for minimizing this function. Our basic iteration is the Newton iteration for solving the optimality conditions with respect to the barrier penalty function, which coincides with the Newton iteration for the barrier Karush-Kuhn-Tucker conditions if the penalty parameter is sufficiently large. It is proved that the method is globally convergent from an arbitrary initial point that strictly satisfies the bounds on the variables. Implementations of the given algorithm are done for small dense nonlinear programs. The method solves all the problems in Hock and Schittkowski's textbook efficiently. Thus it is shown that the method given in this paper possesses a good theoretical convergence property and is efficient in practice.

Key words: interior point method, primal-dual method, constrained optimization, nonlinear programming

* Mathematical Systems, Inc., 2-4-3, Shinjuku, Shinjuku-ku, Tokyo, Japan. hy@msi.co.jp

1 Introduction

In this paper we propose a primal-dual interior point method that solves general nonlinearly constrained optimization problems. To obtain a fast algorithm for nonlinear optimization problems, it is fairly clear from various experiences (for example, the well known success of the SQP method) that we should eventually solve the Karush-Kuhn-Tucker conditions for optimality by a Newton-like method. However, solving the optimality conditions simply as a system of equations does not in general give an algorithm for solving optimization problems, except for convex problems. Therefore it is not appropriate to treat the primal and dual variables equally if we want globally convergent methods for general nonlinear optimization problems. Observing that there is a nice relation between the parameterized optimality conditions (the barrier KKT conditions) and the classical logarithmic barrier function of the primal variables, we can develop a globally convergent method for general nonlinear optimization problems. To prevent the possible divergence of the dual variables, we introduce the barrier penalty function and solve the optimality conditions by a Newton-like method.

For nonlinear programs it has long been believed that interior point methods are not practical because of inevitable numerical difficulties which occur at the final stage of iterations. See Fiacco and McCormick [4], Fletcher [5] and other standard textbooks on nonlinear optimization. Thus the state of the art method for nonlinear programs today is the SQP method (see Powell [7], Fletcher [5]), which can also be interpreted as a Newton-like method for the Karush-Kuhn-Tucker conditions near a solution. It is known that the SQP method is efficient and stable for a wide range of problems. The SQP method requires solving quadratic programming subproblems, which handle the combinatorial aspect of the problem caused by inequality constraints. Solving the quadratic programming subproblem is itself rather expensive, especially for large scale problems. Therefore it is desirable to have an efficient and stable interior point method for nonlinear problems, in light of the success of such methods in the field of large scale linear programs. This paper shows that it is in fact possible to construct such a method that is globally convergent theoretically, and efficient and stable in practice. The method is tested on 115 test problems in Hock and Schittkowski's collection [6]. For these problems our method solves all the problems with 20 to 30 function evaluations per problem and about 20 iterations per problem. Details are described in Section 5. A preliminary report of this paper has appeared in [11].

In this paper, the problems to be solved are restricted to small to medium size ones because we do not exploit the sparsity of the matrices here. However, a recent report by Yamashita, Yabe and Tanabe [13] studies a trust region type method that utilizes the sparsity of the Hessian of the Lagrangian. They report the efficiency of a method that uses the barrier penalty function as in this paper. See also [1] for a trust region type interior point method. It is to be noted that the recent studies by Yamashita and Yabe [12] and Yabe and Yamashita [9] show the superlinear and/or quadratic convergence of a class of primal-dual interior point methods that use the Newton or quasi-Newton iteration for solving the barrier KKT conditions. Other reports on the local behavior of primal-dual interior point methods include [2] and [3].
In Section 2, we describe basic concepts in the primal-dual interior point method. The barrier penalty function, which plays a key role in the method of this paper, is introduced in Section 3 and analyzed there. A line search method that minimizes the barrier penalty function is described in Section 4 and is proved to be globally convergent. Section 5 reports the results of numerical experiments.

Notation. The subscript k denotes an iteration count. Subscripts i and j denote components of vectors and matrices. The superscript t denotes transposes of vectors and matrices. The vector e denotes the vector of all ones and the matrix I the identity matrix. For simplicity of description, ‖·‖ denotes the l_2 norm for vectors. The symbol R^n denotes the n-dimensional real vector space. The set R^n_+ is defined by R^n_+ = {x ∈ R^n | x > 0}.

2 Primal-dual interior point method

In this paper, we consider the following constrained optimization problem:

(1)  minimize f(x), x ∈ R^n,   subject to g(x) = 0, x ≥ 0,

where we assume that the functions f: R^n → R^1 and g: R^n → R^m are twice continuously differentiable. Let the Lagrangian function of the above problem be defined by

(2)  L(w) = f(x) - y^t g(x) - z^t x,

where w = (x, y, z)^t, and y ∈ R^m and z ∈ R^n are the Lagrange multiplier vectors which correspond to the equality and inequality constraints respectively. Then the Karush-Kuhn-Tucker (KKT) conditions for optimality of the above problem are given by

(3)  r_0(w) ≡ ( ∇_x L(w), g(x), XZe )^t = 0

and

(4)  x ≥ 0,  z ≥ 0,

where

∇_x L(w) = ∇f(x) - A(x)^t y - z,
A(x) = ( ∇g_1(x), ..., ∇g_m(x) )^t,
X = diag(x_1, ..., x_n),  Z = diag(z_1, ..., z_n).

Now we approximate problem (1) by introducing the barrier function F_B(·, μ): R^n_+ → R^1,

(5)  minimize F_B(x, μ) = f(x) - μ Σ_{i=1}^n log(x_i), x ∈ R^n_+,   subject to g(x) = 0,

where the barrier parameter μ > 0 is a given constant.
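For concreteness, the residual r_0(w) in (3), which the overall method drives to zero, can be evaluated directly once f, g and their derivatives are available. The following Python sketch is illustrative only (the function names and the toy problem are not part of the paper):

```python
import numpy as np

def kkt_residual(x, y, z, grad_f, g, jac_g):
    """Residual r_0(w) of the KKT conditions (3), w = (x, y, z).

    grad_f(x) returns the gradient of f, g(x) the constraint values and
    jac_g(x) the m-by-n Jacobian A(x).  (Illustrative sketch only.)
    """
    grad_L = grad_f(x) - jac_g(x).T @ y - z      # nabla_x L(w)
    comp = x * z                                 # XZe, componentwise products
    return np.concatenate([grad_L, g(x), comp])

# Toy example of the form (1): minimize x1^2 + x2^2 s.t. x1 + x2 - 2 = 0, x >= 0.
grad_f = lambda x: 2.0 * x
g = lambda x: np.array([x[0] + x[1] - 2.0])
jac_g = lambda x: np.array([[1.0, 1.0]])

x = np.array([1.0, 1.0]); y = np.array([2.0]); z = np.zeros(2)
print(kkt_residual(x, y, z, grad_f, g, jac_g))   # all zeros: (1, 1) is a KKT point
```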

It is well known that, if μ is sufficiently small, problem (5) is a good approximation to the original problem (1) (see Fiacco and McCormick [4]). The optimality conditions for (5) are given by

(6)  ∇f(x) - A(x)^t y - μ X^{-1} e = 0,  g(x) = 0  and  x > 0,

where y ∈ R^m is the Lagrange multiplier for the equality constraints. If we introduce an auxiliary variable z ∈ R^n which is to be equal to μ X^{-1} e, then the above conditions become the conditions

(7)  r(w, μ) ≡ ( ∇_x L(w), g(x), XZe - μe )^t = 0

and

(8)  x > 0,  z > 0.

The introduction of the variable z is essential to the numerical success of the barrier based algorithm in this paper. In this paper we call conditions (7) the barrier KKT conditions, and a point w(μ) = (x(μ), y(μ), z(μ)) that satisfies these conditions is called the barrier KKT point. We will use an interior point method for searching a point that approximately satisfies the above conditions, and finally obtain a point that satisfies the Karush-Kuhn-Tucker conditions by letting μ ↓ 0. This means that we force x and z to be strictly positive during iterations. Therefore we delete the inequality conditions (8) hereafter, and always assume that x and z are strictly positive in what follows. Here we note that

(9)  r(w, μ) = r_0(w) - μ ê,  where  ê = (0, 0, e)^t.

An algorithm of this paper approximately solves the sequence of conditions (7) with a decreasing sequence of the barrier parameter μ that tends to 0, and thus obtains an approximate solution to the KKT conditions. For definiteness, we describe a prototype of such an algorithm as follows.

Algorithm IP

Step 0. (Initialize) Set ε > 0, M_c > 0 and k = 0. Let a positive sequence {μ_k}, μ_k ↓ 0, be given.

Step 1. (Termination) If ‖r_0(w_k)‖ ≤ ε, then stop.

Step 2. (Approximate barrier KKT point) Find a point w_{k+1} that satisfies

(10)  ‖r(w_{k+1}, μ_k)‖ ≤ M_c μ_k.

Step 3. (Update) Set k := k + 1 and go to Step 1. □

The following theorem shows the global convergence property of Algorithm IP.

Theorem 1 Let {w_k} be an infinite sequence generated by Algorithm IP. Suppose that the sequences {x_k} and {y_k} are bounded. Then {z_k} is bounded, and any accumulation point of {w_k} satisfies the KKT conditions (3) and (4).

Proof. Assume that there exists an i such that (z_k)_i → ∞. Equation (10) yields

| (∇f(x_k) - A(x_k)^t y_k)_i / (z_k)_i - 1 | ≤ M_c μ_{k-1} / (z_k)_i,

which is a contradiction because of the boundedness of {x_k} and {y_k}. Thus the sequence {z_k} is bounded. Let ŵ be any accumulation point of {w_k}. Since the sequences {w_k} and {μ_k} satisfy (10) for each k and μ_k approaches zero, r_0(ŵ) = 0 follows from the definition of r(w, μ). Therefore the proof is complete. □

We note that the barrier parameter sequence {μ_k} in Algorithm IP need not be determined beforehand. The value of each μ_k may be set adaptively as the iteration proceeds. An example of an updating method for μ_k is described in Section 5. We call condition (10) the approximate barrier KKT condition, and call a point that satisfies this condition the approximate barrier KKT point.

To find an approximate barrier KKT point for a given μ > 0, we use the Newton-like method in this paper. Let Δw = (Δx, Δy, Δz)^t be defined by a solution of

(11)  J(w) Δw = -r(w, μ),

where

(12)  J(w) = [ G      -A(x)^t   -I
               A(x)      0       0
               Z         0       X ].

Then the basic iteration of the Newton-like method may be described as

(13)  w_{k+1} = w_k + Λ_k Δw_k,

where Λ_k = diag(α_{xk} I_n, α_{yk} I_m, α_{zk} I_n) is composed of step sizes in the x, y and z variables. If G = ∇²_x L(w), then Δw becomes Newton's direction for solving (7). To solve (11), we split the equations into two groups. Thus we solve

(14)  [ G + X^{-1}Z   -A(x)^t ] [ Δx ]   =  [ -∇_x L(w) + μ X^{-1} e - z ]
      [ A(x)              0   ] [ Δy ]      [ -g(x)                     ]

for (Δx, Δy)^t, and then we obtain Δz from the third block of equations in (11).
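As an illustration of the direction computation (11)-(14), the reduced system (14) can be assembled and solved as below, with Δz recovered from the third block of (11). This is only a sketch: the names are hypothetical, and a dense solve stands in for the Bunch-Parlett factorization used in the implementation of Section 5.

```python
import numpy as np

def newton_direction(x, y, z, grad_L, g_val, A, G, mu):
    """Sketch of the primal-dual direction defined by (11)-(14).

    grad_L = nabla_x L(w), g_val = g(x), A = A(x), and G is the Hessian
    of the Lagrangian or a positive definite quasi-Newton approximation.
    """
    n, m = x.size, g_val.size
    Xinv = 1.0 / x
    H = G + np.diag(Xinv * z)                    # G + X^{-1} Z
    # Coefficient matrix and right hand side of (14).
    K = np.block([[H, -A.T],
                  [A, np.zeros((m, m))]])
    rhs = np.concatenate([-grad_L + mu * Xinv - z, -g_val])
    sol = np.linalg.solve(K, rhs)
    dx, dy = sol[:n], sol[n:]
    # Third block of (11):  Z dx + X dz = -(XZe - mu e).
    dz = -(Xinv * z) * dx + mu * Xinv - z
    return dx, dy, dz
```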

If G + X^{-1}Z is positive definite and A(x) is of full rank, then the coefficient matrix in (14) is nonsingular. It will be useful to note that (14) can be written as

(15)  [ G + X^{-1}Z   -A(x)^t ] [ Δx ]   =  [ -∇f(x) + μ X^{-1} e ]
      [ A(x)              0   ] [ ỹ  ]      [ -g(x)               ]

where

(16)  ỹ = y + Δy.

In this paper, we will deal with the case in which the matrix G can be assumed nonnegative definite. Therefore, to solve general nonlinear problems, we will use a positive definite quasi-Newton approximation to the Hessian matrix of the Lagrangian function to obtain the desired property of the matrix G.

3 Barrier penalty function

To globalize the convergence property of an interior point algorithm based on the above iteration, we introduce two auxiliary problems. Firstly we define the following problem:

(17)  minimize F_P(x, ρ) = f(x) + ρ Σ_{i=1}^m |g_i(x)|,  x ∈ R^n,   subject to x ≥ 0,

where the penalty parameter ρ is a given positive constant. The necessary conditions for optimality of this problem are (see Section 14.2 of Fletcher [5])

(18)  ∇_x L(w) = 0,   -y ∈ ∂{ ρ Σ_{i=1}^m |g_i(x)| },   XZe = 0,   x ≥ 0,   z ≥ 0,

where ∂ means the subdifferential of the function in the braces with respect to g. In our case the second condition in (18) is equivalent to

(19)  |y_i| ≤ ρ if g_i(x) = 0;   y_i = -ρ if g_i(x) > 0;   y_i = ρ if g_i(x) < 0,

for each i = 1, ..., m. This condition can be expressed as

ρ |g_i(x)| = -y_i g_i(x),  |y_i| ≤ ρ,  i = 1, ..., m,

or

|g_i(x)| + y_i g_i(x) / ρ = 0,  |y_i| ≤ ρ,  i = 1, ..., m.

Therefore conditions (18) can be written as

(20)  r_0(w) ≡ ( ∇_x L(w), r_E(w), XZe )^t = 0

and

(21)  x ≥ 0,  z ≥ 0,  |y_i| ≤ ρ,  i = 1, ..., m,

where

r_E(w)_i = |g_i(x)| + y_i g_i(x) / ρ,  i = 1, ..., m.

Note that we are using the same symbol r_0(w) to denote the residual vector of the optimality conditions as in Section 2 for simplicity. If ‖y‖_∞ < ρ, conditions (18) are equivalent to conditions (3) and (4). In this sense, problem (17) is equivalent to problem (1).

Next we introduce the barrier penalty function F(·, μ, ρ): R^n_+ → R^1 by

(22)  F(x, μ, ρ) = f(x) - μ Σ_{i=1}^n log x_i + ρ Σ_{i=1}^m |g_i(x)|,

where μ and ρ are given positive constants. This function plays an essential role in the method given in this paper. Let us approximate problem (17) by the following problem:

(23)  minimize F(x, μ, ρ),  x ∈ R^n_+.

The necessary conditions for optimality of a solution to the above problem are

(24)  ∇_x L(w) = 0,   -y ∈ ∂{ ρ Σ_{i=1}^m |g_i(x)| },   XZe = μe,   x > 0,   z > 0,

where we introduce an auxiliary variable z ∈ R^n as in (7). As above, conditions (24) can be written as

(25)  r(w, μ) ≡ ( ∇_x L(w), r_E(w), XZe - μe )^t = 0

and

(26)  x > 0,  z > 0,  |y_i| ≤ ρ,  i = 1, ..., m,

where we use the same symbol r(w, μ) as in Section 2 for simplicity. If ρ > ‖y‖_∞, then conditions (25) and (26) coincide with (7) and (8). We call a point w(μ) = (x(μ), y(μ), z(μ)) ∈ R^n × R^m × R^n that satisfies (24) for a given μ > 0 the barrier KKT point for this μ, as before. We can use Algorithm IP to solve (17).
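Since the barrier penalty function (22) serves as the merit function throughout the rest of the paper, it is worth writing it out concretely. The following sketch is illustrative only (f and g are passed as callables); returning +∞ outside R^n_+ reflects the fact that F is defined only for x > 0.

```python
import numpy as np

def barrier_penalty(x, mu, rho, f, g):
    """Barrier penalty function (22):
    F(x, mu, rho) = f(x) - mu * sum(log x_i) + rho * sum(|g_i(x)|).
    Returns +inf outside the positive orthant.  (Illustrative sketch.)
    """
    x = np.asarray(x, dtype=float)
    if np.any(x <= 0.0):
        return np.inf
    return f(x) - mu * np.sum(np.log(x)) + rho * np.sum(np.abs(g(x)))

# Example with f(x) = x1^2 + x2^2 and g(x) = x1 + x2 - 2:
f = lambda x: float(x @ x)
g = lambda x: np.array([x[0] + x[1] - 2.0])
print(barrier_penalty(np.array([1.0, 1.0]), 0.1, 10.0, f, g))   # 2.0 at this point
```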

The following theorem shows the global convergence property of Algorithm IP for solving (17).

Theorem 2 Let {w_k} be an infinite sequence generated by Algorithm IP for solving (17). Suppose that the sequence {x_k} is bounded. Then {z_k} is bounded, and any accumulation point of {w_k} satisfies the optimality conditions (20) and (21) for problem (17). □

Now we formulate a Newton-like iteration for solving the above conditions (25). Thus we calculate the first order change of (24) with respect to a change in w. This gives

(27)  (G + X^{-1}Z) Δx - A(x)^t Δy = -∇_x L(w) + μ X^{-1} e - z,
      -ỹ ∈ ∂{ ρ Σ_{i=1}^m |g_i(x) + ∇g_i(x)^t Δx| },   ỹ ≡ y + Δy,
      Δz = -X^{-1}Z Δx + μ X^{-1} e - z.

The following lemma gives a basic property of the iteration vector Δw = (Δx, Δy, Δz)^t.

Lemma 1 Suppose that Δw satisfies (27) at an interior point w.
(i) If ρ > ‖ỹ‖_∞, then Δw is identical to the one given by (11).
(ii) If Δw = 0, then the point w is a barrier KKT point that satisfies (24).
(iii) If Δx = 0, then the point (x, y + Δy, z + Δz) is a barrier KKT point that satisfies (24). □

If we consider the subproblem

(28)  minimize (1/2) Δx^t (G + X^{-1}Z) Δx + (∇f(x) - μ X^{-1} e)^t Δx + ρ Σ_{i=1}^m |g_i(x) + ∇g_i(x)^t Δx|,  Δx ∈ R^n,

the solution vector Δx and the corresponding multiplier vector ỹ that satisfy the necessary conditions for optimality also satisfy conditions (27). If G + X^{-1}Z is positive definite, we can solve problem (28) by a straightforward active set method which starts with an active set that contains all the constraints g_i(x) + ∇g_i(x)^t Δx = 0, i = 1, ..., m. An example of the procedure is described in Fletcher [5]. It is important to note that, if the vectors Δx and ỹ obtained by solving (15) satisfy ρ > ‖ỹ‖_∞, then the desired iteration vector that satisfies (27) is also obtained. The procedure described above is devised, instead of the simple Newton iteration given in Section 2, in order to show a way of preventing possible divergence of the dual variable y. However, for practical purposes it seems sufficient to solve only equation (11) once per iteration, as shown in the sections below in which practical experiences obtained by the author on general nonlinear programming problems are described. Therefore this procedure is of only theoretical importance at present.

As shown above, optimality conditions (20) and (21) are identical to optimality conditions (3) and (4) if we set ρ = ∞. Also, the Newton-like iteration (27) coincides with (11) when ρ = ∞. Therefore we will regard the method explained below as including a method for solving (1) when ρ = ∞, for simplicity of exposition. We note that the penalty parameter ρ that appears in the barrier penalty function below should be finite even in this case.

Another interesting point to be noted about the above iteration is that it can give a solution to an infeasible problem. Even if the constraints in problem (1) are incompatible, the method described above will give a solution to problem (17), as shown below. A solution to problem (17) with infeasible constraints may give useful information about problem (1).

Now we proceed to an analysis of the properties of the barrier penalty function. The directional derivative F'(x, μ, ρ; s) of the function F(·, μ, ρ) along an arbitrary given direction s is defined by

F'(x, μ, ρ; s) = lim_{α↓0} [ F(x + αs, μ, ρ) - F(x, μ, ρ) ] / α
              = ∇f(x)^t s - μ e^t X^{-1} s + ρ Σ_+ ∇g_i(x)^t s - ρ Σ_- ∇g_i(x)^t s + ρ Σ_0 |∇g_i(x)^t s|,

where the summations in the above equation are to be understood as

Σ_+ a_i = Σ_{g_i(x) > 0} a_i,   Σ_- a_i = Σ_{g_i(x) < 0} a_i,   Σ_0 a_i = Σ_{g_i(x) = 0} a_i.

We introduce a first order approximation F_l of F(x + s, μ, ρ) by

(29)  F_l(x, μ, ρ; s) ≡ f(x) + ∇f(x)^t s - μ Σ_{i=1}^n ( log(x_i) + s_i / x_i ) + ρ Σ_{i=1}^m |g_i(x) + ∇g_i(x)^t s|,

and an estimate of the first order change ΔF_l of F by

(30)  ΔF_l(x, μ, ρ; s) ≡ F_l(x, μ, ρ; s) - F(x, μ, ρ)
                       = ∇f(x)^t s - μ e^t X^{-1} s + ρ Σ_{i=1}^m |g_i(x) + ∇g_i(x)^t s| - ρ Σ_{i=1}^m |g_i(x)|,

for an arbitrary given direction s.
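Because ΔF_l is the quantity against which the Armijo rule of the next section measures sufficient decrease, a concrete evaluation is useful. The sketch below is illustrative only (grad_f, g and jac_g are user-supplied callables):

```python
import numpy as np

def delta_F_l(x, s, mu, rho, grad_f, g, jac_g):
    """First order change Delta F_l(x, mu, rho; s) of (30):
    grad f(x)^t s - mu e^t X^{-1} s
      + rho * ( sum |g_i(x) + grad g_i(x)^t s| - sum |g_i(x)| ).
    (Illustrative sketch.)
    """
    gx = g(x)
    linearized = gx + jac_g(x) @ s
    return (grad_f(x) @ s
            - mu * np.sum(s / x)
            + rho * (np.sum(np.abs(linearized)) - np.sum(np.abs(gx))))
```

Lemma 4 below shows that this quantity is nonpositive along the direction Δx of (27) whenever G is positive semidefinite and ‖ỹ‖_∞ ≤ ρ.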

We have the following properties of the quantities defined above, which extend the corresponding properties in the case of differentiable functions (see for example Yamashita [10]).

Lemma 2 Let μ > 0, ρ > 0 and s ∈ R^n be given. Then the following assertions hold.
(i) The function F_l(x, μ, ρ; αs) is convex with respect to the variable α.
(ii) There holds the relation

(31)  F(x, μ, ρ) + F'(x, μ, ρ; s) ≤ F_l(x, μ, ρ; s).

(iii) Further, there exists a θ ∈ (0, 1) such that

(32)  F(x + s, μ, ρ) ≤ F(x, μ, ρ) + F'(x + θs, μ, ρ; s),

whenever x + s > 0.

Proof. The first statement of the lemma is obvious. If α > 0 is sufficiently small, we have

F_l(x, μ, ρ; αs) = F(x, μ, ρ) + F'(x, μ, ρ; αs) = F(x, μ, ρ) + α F'(x, μ, ρ; s).

Since F_l(x, μ, ρ; αs) is convex with respect to α and coincides with a linear function of α > 0 when α is sufficiently small, we obtain (31). Now we show (32). If we consider F'(x + αs, μ, ρ; s) as a function of the variable α ∈ [0, 1], the number of discontinuous points of F' is finite. Therefore there exists a θ ∈ (0, 1) such that

F'(x + θs, μ, ρ; s) ≥ ∫_0^1 F'(x + αs, μ, ρ; s) dα = F(x + s, μ, ρ) - F(x, μ, ρ).

This completes the proof. □

Lemma 3 Let ε_0 ∈ (0, 1) be a given constant and s ∈ R^n be given. If ΔF_l(x, μ, ρ; s) < 0, then

(33)  F(x + αs, μ, ρ) - F(x, μ, ρ) ≤ ε_0 α ΔF_l(x, μ, ρ; s)

for sufficiently small α > 0.

Proof. From (32), there exists a θ ∈ (0, 1) such that

(34)  F(x + αs, μ, ρ) - F(x, μ, ρ) ≤ F'(x + θαs, μ, ρ; αs) = α F'(x + θαs, μ, ρ; s).

From (31) we have

(35)  F'(x + θαs, μ, ρ; s) ≤ ΔF_l(x + θαs, μ, ρ; s).

Because ΔF_l(·, μ, ρ; s) is continuous, and ΔF_l(x, μ, ρ; s) < 0 by the assumption, we obtain

(36)  ΔF_l(x + θαs, μ, ρ; s) ≤ ε_0 ΔF_l(x, μ, ρ; s)

for sufficiently small ‖θαs‖. From (34) - (36) we obtain (33) for sufficiently small α > 0. □

Lemma 4 Suppose that Δw satisfies (27) at an interior point w. Then

(37)  ΔF_l(x, μ, ρ; Δx) ≤ -Δx^t (G + X^{-1}Z) Δx - Σ_{i=1}^m ( ρ - |ỹ_i| ) |g_i(x)|.

Further, if G is positive semidefinite and ‖ỹ‖_∞ ≤ ρ, then ΔF_l(x, μ, ρ; Δx) ≤ 0, and ΔF_l(x, μ, ρ; Δx) = 0 yields Δx = 0.

Proof. From (27) and (30) we have

ΔF_l(x, μ, ρ; Δx) = -Δx^t (G + X^{-1}Z) Δx + Δx^t A(x)^t ỹ + ρ Σ_{i=1}^m |g_i(x) + ∇g_i(x)^t Δx| - ρ Σ_{i=1}^m |g_i(x)|.

The i-th components in the summations in the last three terms give

(38)  ỹ_i ∇g_i(x)^t Δx + ρ |g_i(x) + ∇g_i(x)^t Δx| - ρ |g_i(x)|
        = -ỹ_i g_i(x) - ρ |g_i(x)|
        ≤ -( ρ - |ỹ_i| ) |g_i(x)|,

where the equality in the second line follows from the property

|ỹ_i| ≤ ρ, g_i(x) + ∇g_i(x)^t Δx = 0;   ỹ_i = -ρ, g_i(x) + ∇g_i(x)^t Δx > 0;   ỹ_i = ρ, g_i(x) + ∇g_i(x)^t Δx < 0,

which gives ρ |g_i(x) + ∇g_i(x)^t Δx| = -ỹ_i ( g_i(x) + ∇g_i(x)^t Δx ), and the inequality in the third line follows from -ỹ_i g_i(x) ≤ |ỹ_i| |g_i(x)|. The relation (38) gives (37). □

4 Line search algorithm

To obtain an algorithm that converges globally to a barrier KKT point for a fixed μ > 0, it is necessary to modify the basic Newton iteration with the unit step length somehow. Our iterations consist of

(39)  x_{k+1} = x_k + α_{xk} Δx_k,   y_{k+1} = y_k + α_{yk} Δy_k,   z_{k+1} = z_k + α_{zk} Δz_k,

where α_{xk}, α_{yk} and α_{zk} are step sizes determined by the line search procedures described below. The main iteration is intended to decrease the value of the barrier penalty function for fixed μ. Thus the step size of the primal variable x is determined by a sufficient decrease rule for this merit function. The step size of the dual variable z is determined so as to stabilize the iteration. The explicit rules follow in order.

We adopt Armijo's rule as the line search rule for the variable x. At the point x_k, we calculate the maximum allowed step to the boundary of the feasible region by

(40)  ᾱ_{kmax} = min_i { -(x_k)_i / (Δx_k)_i | (Δx_k)_i < 0 },

i.e., the step size ᾱ_{kmax} gives an infinitely large value of the barrier penalty function F if it exists, because of the barrier terms, and a step size α ∈ [0, ᾱ_{kmax}) gives a strictly feasible primal variable. A step to the next iterate is given by

(41)  α_{xk} = ᾱ_k β^{l_k},   ᾱ_k = min{ γ ᾱ_{kmax}, 1 },

where β ∈ (0, 1) and γ ∈ (0, 1) are fixed constants and l_k is the smallest nonnegative integer such that

(42)  F(x_k + ᾱ_k β^{l_k} Δx_k, μ, ρ) - F(x_k, μ, ρ) ≤ ε_0 ᾱ_k β^{l_k} ΔF_l(x_k, μ, ρ; Δx_k),

where ε_0 ∈ (0, 1). Typical values of these parameters are β = 0.5, γ = 0.9995 and ε_0 = 10^{-6}. Therefore we will try the sequence

x_k + 0.9995 ᾱ_{kmax} Δx_k,   x_k + 0.5 × 0.9995 ᾱ_{kmax} Δx_k,   x_k + 0.25 × 0.9995 ᾱ_{kmax} Δx_k,   ...

for example, and will find a step size that satisfies (42). If G is positive semidefinite, then ΔF_l(x_k, μ, ρ; Δx_k) ≤ 0 by Lemma 4, and therefore the existence of such steps is assured by Lemma 3.

For the variable z, we adopt the box constraints rule, i.e., we force x and z to satisfy the condition

(43)  c_{Lki} ≤ ( (x_k)_i + α_{xk} (Δx_k)_i ) ( (z_k)_i + α_{zk} (Δz_k)_i ) ≤ c_{Uki},   i = 1, ..., n,

at the end of each iteration, where the bounds c_{Lk} and c_{Uk} satisfy

(44)  0 < c_{Lki} < μ < c_{Uki},   i = 1, ..., n.

To this end, we let

(45)  c_{Lki} = min{ μ / M_L, ( (x_k)_i + α_{xk} (Δx_k)_i ) (z_k)_i },
      c_{Uki} = max{ μ M_U, ( (x_k)_i + α_{xk} (Δx_k)_i ) (z_k)_i },

where M_L > 1 and M_U > 1 are given constants. The construction of the above bounds shows that the current z satisfies

(46)  c_{Lki} / ( (x_k)_i + α_{xk} (Δx_k)_i ) ≤ (z_k)_i ≤ c_{Uki} / ( (x_k)_i + α_{xk} (Δx_k)_i ),   i = 1, ..., n.

The step size α_{zk} is determined by

(47)  α_{zk} = min{ min_i ᾱ_i, 1 },   ᾱ_i = max{ α ≥ 0 | c_{Lki} ≤ ( (x_k)_i + α_{xk} (Δx_k)_i ) ( (z_k)_i + α (Δz_k)_i ) ≤ c_{Uki} }.

The rule (47) means that the step size α_{zk} is the maximal allowed step that satisfies the box constraints, with the restriction of being not greater than the unit step length.
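A minimal sketch of the primal step size rule (40)-(42) is given below; the merit function F(·, μ, ρ) and the value ΔF_l(x_k, μ, ρ; Δx_k) are supplied from outside (for instance by the sketches shown earlier), and the backtracking limit is a safeguard added here, not part of the rule.

```python
import numpy as np

def primal_step(x, dx, merit, dFl, beta=0.5, gamma=0.9995, eps0=1e-6,
                max_backtracks=50):
    """Armijo step for the primal variable x per (40)-(42).

    merit(x) : F(x, mu, rho) for the current fixed mu and rho
    dFl      : Delta F_l(x, mu, rho; dx) of (30), nonpositive by Lemma 4
    Returns the accepted step size alpha_x.  (Illustrative sketch.)
    """
    neg = dx < 0.0                                          # (40): step to the boundary
    alpha_max = np.min(-x[neg] / dx[neg]) if np.any(neg) else np.inf
    alpha = min(gamma * alpha_max, 1.0)                     # (41): initial trial step
    F0 = merit(x)
    for _ in range(max_backtracks):
        if merit(x + alpha * dx) - F0 <= eps0 * alpha * dFl:   # (42)
            return alpha
        alpha *= beta
    return alpha
```

The default values of beta, gamma and eps0 are the typical values quoted above; Lemma 3 guarantees that a step satisfying (42) is eventually accepted when ΔF_l < 0.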

Lemma 5 Suppose that an infinite sequence {w_k} is generated for fixed μ > 0. If lim inf_{k→∞} (x_k)_i > 0 and lim sup_{k→∞} (x_k)_i < ∞, then lim inf_{k→∞} (c_{Lk})_i > 0 and lim sup_{k→∞} (c_{Uk})_i < ∞ for i = 1, ..., n.

Proof. Suppose that (c_{Lk})_i → 0 for an i and some subsequence K ⊂ {0, 1, 2, ...}. Then, by the definition of (c_{Lk})_i in (45), (z_k)_i → 0, k ∈ K. However, in order for a subsequence of {(z_k)_i} to tend to 0, there must be an iteration k at which the lower bound (c_{Lk})_i / (x_{k+1})_i of (z_k)_i is arbitrarily small and the value of (z_k)_i at that iteration is strictly larger than this bound, i.e., at that iteration the value of (z_k)_i decreases to a strictly smaller value. This means that at iteration k, (c_{Lk})_i = μ / M_L from the definition (45), and therefore the value of (x_{k+1})_i must be arbitrarily large because μ / M_L ≤ (x_{k+1})_i (z_k)_i and (z_k)_i → 0, k ∈ K. This is impossible because of the assumption of the lemma. The proof of the boundedness of (c_{Uk})_i is similar. □

In actual calculation we modify the direction Δz_k to

(48)  (Δz_k)_i := 0 if (z_k)_i = c_{Lki} / (x_{k+1})_i and (Δz_k)_i < 0;
      (Δz_k)_i := 0 if (z_k)_i = c_{Uki} / (x_{k+1})_i and (Δz_k)_i > 0;
      (Δz_k)_i unchanged otherwise.

This modification means that we project the direction along the boundary of the box constraints if the point z_k is on that boundary and the direction Δz_k points outward of the box. This procedure is adopted because it gives better numerical results. The global convergence results shown in the following are equally valid for both the unmodified and the modified directions.

For the variable y, there exist three obvious choices for the step length:

(49)  α_{yk} = 1  or  α_{xk}  or  α_{zk}.

The global convergence property given below holds for each of these choices. We choose α_{yk} = α_{zk} based on numerical experiments.

The following algorithm describes the iteration for fixed μ > 0 and ρ > 0. We note that this algorithm corresponds to Step 2 of Algorithm IP in Section 2.

Algorithm LS

Step 0. (Initialize) Let w_0 ∈ R^n_+ × R^m × R^n_+, and μ > 0, ρ > 0. Set ε' > 0, β ∈ (0, 1), γ ∈ (0, 1), ε_0 ∈ (0, 1), M_L > 1 and M_U > 1. Let k = 0.

Step 1. (Termination) If ‖r(w_k, μ)‖ ≤ ε', then stop.

Step 2. (Compute direction) Calculate the direction Δw_k by (27).

Step 3. (Step sizes) Set

ᾱ_{kmax} = min_i { -(x_k)_i / (Δx_k)_i | (Δx_k)_i < 0 },   ᾱ_k = min{ γ ᾱ_{kmax}, 1 }.

Find the smallest nonnegative integer l_k that satisfies

F(x_k + ᾱ_k β^{l_k} Δx_k, μ, ρ) - F(x_k, μ, ρ) ≤ ε_0 ᾱ_k β^{l_k} ΔF_l(x_k, μ, ρ; Δx_k).

Calculate

α_{xk} = ᾱ_k β^{l_k},

c_{Lki} = min{ μ / M_L, (x_k + α_{xk} Δx_k)_i (z_k)_i },
c_{Uki} = max{ μ M_U, (x_k + α_{xk} Δx_k)_i (z_k)_i },
α_{zk} = min{ min_i ᾱ_i, 1 },   ᾱ_i = max{ α ≥ 0 | c_{Lki} ≤ ( (x_k)_i + α_{xk} (Δx_k)_i ) ( (z_k)_i + α (Δz_k)_i ) ≤ c_{Uki} },
α_{yk} = α_{zk},
Λ_k = diag{ α_{xk} I_n, α_{yk} I_m, α_{zk} I_n }.

Step 4. (Update variables) Set

w_{k+1} = w_k + Λ_k Δw_k.

Step 5. Set k := k + 1 and go to Step 1. □

To prove global convergence of Algorithm LS, we need the following assumptions.

Assumption G

(1) The functions f and g_i, i = 1, ..., m, are twice continuously differentiable.
(2) The level set of the barrier penalty function at an initial point x_0 ∈ R^n_+, which is defined by { x ∈ R^n_+ | F(x, μ, ρ) ≤ F(x_0, μ, ρ) }, is compact for given μ > 0.
(3) The matrix A(x) is of full rank on the level set defined in (2).
(4) The matrix G_k is positive semidefinite and uniformly bounded.
(5) The penalty parameter ρ satisfies ρ ≥ ‖y_k + Δy_k‖_∞ for each k = 0, 1, .... □

We note that if a quasi-Newton approximation is used for computing the matrix G_k, then we need the continuity of only the first order derivatives of the functions in Assumption G-(1). We also note that if ΔF_l(x_k, μ, ρ; Δx_k) = 0 at an iteration k, then the step sizes α_{xk} = α_{yk} = α_{zk} = 1 are adopted, and (x_{k+1}, y_{k+1}, z_{k+1}) gives a barrier KKT point by Lemma 1 and Lemma 4.
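The computation of the dual step size in Step 3, together with the modification (48), can be sketched as follows. The default values of M_L and M_U are arbitrary illustrative choices, not values from the paper.

```python
import numpy as np

def dual_step(x_new, z, dz, mu, M_L=2.0, M_U=10.0):
    """Box constraint rule (43)-(48) for the dual step size alpha_z.

    x_new = x_k + alpha_x * dx_k (strictly positive).
    Returns (alpha_z, modified dz).  (Illustrative sketch.)
    """
    c_L = np.minimum(mu / M_L, x_new * z)            # (45)
    c_U = np.maximum(mu * M_U, x_new * z)
    lo, hi = c_L / x_new, c_U / x_new                # bounds on z_{k+1}, cf. (46)

    dz = dz.copy()                                   # (48): project at the box boundary
    dz[(z <= lo) & (dz < 0.0)] = 0.0
    dz[(z >= hi) & (dz > 0.0)] = 0.0

    with np.errstate(divide="ignore", invalid="ignore"):
        step_up = np.where(dz > 0.0, (hi - z) / dz, np.inf)
        step_down = np.where(dz < 0.0, (lo - z) / dz, np.inf)
    alpha_z = min(float(np.min(np.minimum(step_up, step_down))), 1.0)   # (47)
    return alpha_z, dz
```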

The following theorem gives the convergence of an infinite sequence generated by Algorithm LS.

Theorem 3 Let an infinite sequence {w_k} be generated by Algorithm LS. Then there exists at least one accumulation point of {w_k}, and any accumulation point of the sequence {w_k} is a barrier KKT point.

Proof. First we note that each component of the sequence {x_k} is bounded away from zero and bounded above, by the assumption and the existence of the log barrier term. Therefore the sequence {x_k} has at least one accumulation point. The sequence {z_k} also has these properties by Lemma 5. Thus there exists a positive number M such that

(50)  ‖p‖² / M ≤ p^t (G_k + X_k^{-1} Z_k) p ≤ M ‖p‖²,   ∀ p ∈ R^n,

by the assumption. From (37) and (50), we have

(51)  ΔF_l(x_k, μ, ρ; Δx_k) ≤ -‖Δx_k‖² / M ≤ 0,

and from (42),

(52)  F(x_{k+1}, μ, ρ) - F(x_k, μ, ρ) ≤ ε_0 ᾱ_k β^{l_k} ΔF_l(x_k, μ, ρ; Δx_k) ≤ -ε_0 ᾱ_k β^{l_k} ‖Δx_k‖² / M ≤ 0.

Because the sequence {F(x_k, μ, ρ)} is decreasing and bounded below, the left hand side of (52) converges to 0. Since lim inf_{k→∞} (x_k)_i > 0, i = 1, ..., n, we have lim inf_{k→∞} ᾱ_k > 0. Suppose that there exists a subsequence K ⊂ {0, 1, ...} and a δ > 0 such that

(53)  lim inf_{k→∞, k∈K} ‖Δx_k‖ > δ.

Then we have l_k → ∞, k ∈ K, from (52) because the leftmost expression tends to zero, and therefore we can assume l_k > 0 for sufficiently large k ∈ K without loss of generality. If l_k > 0, then the point x_k + α_{xk} Δx_k / β does not satisfy the condition (42). Thus we have

(54)  F(x_k + α_{xk} Δx_k / β, μ, ρ) - F(x_k, μ, ρ) > ε_0 α_{xk} ΔF_l(x_k, μ, ρ; Δx_k) / β.

By (32) and (31), there exists a θ_k ∈ (0, 1) such that

(55)  F(x_k + α_{xk} Δx_k / β, μ, ρ) - F(x_k, μ, ρ) ≤ α_{xk} F'(x_k + θ_k α_{xk} Δx_k / β, μ, ρ; Δx_k) / β
                                                   ≤ α_{xk} ΔF_l(x_k + θ_k α_{xk} Δx_k / β, μ, ρ; Δx_k) / β,   k ∈ K.

Then, from (54) and (55), we have

ε_0 ΔF_l(x_k, μ, ρ; Δx_k) < ΔF_l(x_k + θ_k α_{xk} Δx_k / β, μ, ρ; Δx_k).

This inequality yields

(56)  ΔF_l(x_k + θ_k α_{xk} Δx_k / β, μ, ρ; Δx_k) - ΔF_l(x_k, μ, ρ; Δx_k) > ( ε_0 - 1 ) ΔF_l(x_k, μ, ρ; Δx_k) > 0.

Because Δx_k is a solution of problem (28) and (50) holds, ‖Δx_k‖ is uniformly bounded above. Then, by the property l_k → ∞, we have ‖θ_k α_{xk} Δx_k / β‖ → 0, k ∈ K.

Thus the left hand side of (56), and therefore ΔF_l(x_k, μ, ρ; Δx_k), converges to zero as k → ∞, k ∈ K. This contradicts the assumption (53), because we then have Δx_k → 0, k ∈ K, from (51). Therefore we have proved

(57)  lim_{k→∞} ‖Δx_k‖ = 0.

Let an arbitrary accumulation point of the sequence {x_k} be x̂ ∈ R^n_+ and let x_k → x̂, k ∈ K, for K ⊂ {0, 1, ...}. Thus

(58)  x_k → x̂,   Δx_k → 0,   x_{k+1} → x̂,   k ∈ K.

Because {X_k^{-1} Z_k} is bounded, we have

lim_{k→∞, k∈K} ‖ z_k + Δz_k - μ X_k^{-1} e ‖ = 0

from (27). If we define ẑ = μ X̂^{-1} e, where X̂ = diag(x̂_1, ..., x̂_n), then we have

z_k + Δz_k → ẑ,   k ∈ K.

Hence from (45) we have

(c_{Lk})_i ≤ μ / M_L < (x_{k+1})_i (z_k + Δz_k)_i < μ M_U ≤ (c_{Uk})_i,   i = 1, ..., n,

for k ∈ K sufficiently large, which shows that the point z_k + Δz_k is always accepted as z_{k+1} for sufficiently large k ∈ K. Since α_{zk} = 1 is accepted for k ∈ K sufficiently large, so is α_{yk} = 1. Therefore we obtain

lim_{k→∞, k∈K} ∇_x L(x̂, y_k + Δy_k, ẑ) = 0,

and -(y_k + Δy_k) approaches the subdifferential ∂{ ρ Σ_{i=1}^m |g_i(x̂)| } as k → ∞, k ∈ K. Because the matrix A(x̂) is of full rank, the sequence {y_k + Δy_k}, k ∈ K, converges to a point ŷ ∈ R^m which satisfies

∇_x L(x̂, ŷ, ẑ) = 0,   -ŷ ∈ ∂{ ρ Σ_{i=1}^m |g_i(x̂)| },   X̂ ẑ = μ e,   x̂ > 0,   ẑ > 0.

This completes the proof, because we have shown that there exists at least one accumulation point of {x_k}, and for an arbitrary accumulation point x̂ of {x_k} there exist unique ŷ and ẑ that satisfy the above. □

5 Numerical Results

In this section, we report numerical results of an implementation of the algorithm given in this paper for nonlinear programming problems. We set ρ = ∞ in this experiment. The software is called NUOPT and the code was written by Takahito Tanabe. In order to obtain an appropriate positive semidefinite matrix G at reasonable cost for nonlinear problems, we resort to a quasi-Newton approximation to the Hessian matrix of the Lagrangian function. We use the updating formula suggested by Powell [7] for the SQP method:

G_{k+1} = G_k - ( G_k s_k s_k^t G_k ) / ( s_k^t G_k s_k ) + ( u_k u_k^t ) / ( s_k^t u_k ),

where u_k is calculated by

s_k = x_{k+1} - x_k,
v_k = ∇_x L(x_{k+1}, y_{k+1}, z_{k+1}) - ∇_x L(x_k, y_{k+1}, z_{k+1}),
u_k = θ_k v_k + (1 - θ_k) G_k s_k,

θ_k = 1 if s_k^t v_k ≥ 0.2 s_k^t G_k s_k;   θ_k = 0.8 s_k^t G_k s_k / ( s_k^t G_k s_k - s_k^t v_k ) otherwise,

to satisfy s_k^t u_k > 0 for the hereditary positive definiteness of the update (a concrete sketch of this update is given below).

The method for updating the barrier parameter μ_k is as follows. Suppose we have an approximate barrier KKT point w_{k+1} that satisfies

‖r(w_{k+1}, μ_k)‖ ≤ M_c μ_k

in Step 2 of Algorithm IP. Then μ_{k+1} is defined by

μ_{k+1} = max( ‖r(w_{k+1}, μ_k)‖ / M, μ_k / M_0 ),

where 0 < M_c < M and M_0 > 1 should be satisfied. In our experiment we set M_c = 3, M = 4 and M_0 = 5. As in the SQP method, we expect fast local convergence of the method when μ is sufficiently small near a solution, because the method is based on the Newton iteration for the optimality conditions with a quasi-Newton approximation to the second derivative matrix. As noted above, this expectation is proved by Yamashita and Yabe [12]. The linear equation (14) is solved by using the Bunch-Parlett factorization.

The test problems are adopted from the book by Hock and Schittkowski [6]. The results are summarized in Table 1 at the end of this paper. The following list explains the notation used in Table 1:

n = number of variables.
m = number of constraints.
obj = final objective function value.
res = norm of the final KKT condition residual.
itr = iteration count.
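For concreteness, the damped update above can be sketched as follows. The guard against a degenerate denominator is an addition of this sketch and is not part of the formula quoted from Powell [7].

```python
import numpy as np

def damped_bfgs_update(G, s, v, tiny=1e-12):
    """Powell's damped BFGS update of G_k described in Section 5.

    s = x_{k+1} - x_k,
    v = grad_x L(x_{k+1}, y_{k+1}, z_{k+1}) - grad_x L(x_k, y_{k+1}, z_{k+1}).
    The damping factor theta keeps s^t u >= 0.2 s^t G s > 0, so positive
    definiteness of G is inherited.  (Illustrative sketch.)
    """
    Gs = G @ s
    sGs = float(s @ Gs)
    sv = float(s @ v)
    if sGs <= tiny:                       # degenerate step: skip the update
        return G
    if sv >= 0.2 * sGs:
        theta = 1.0
    else:
        theta = 0.8 * sGs / (sGs - sv)    # Powell's damping factor
    u = theta * v + (1.0 - theta) * Gs
    return G - np.outer(Gs, Gs) / sGs + np.outer(u, u) / float(s @ u)
```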


4: SINGLE-PERIOD MARKET MODELS 4: SINGLE-PERIOD MARKET MODELS Ben Goldys and Marek Rutkowski School of Mathematics and Statistics University of Sydney Semester 2, 2015 B. Goldys and M. Rutkowski (USydney) Slides 4: Single-Period Market

More information

Invertible elements in associates and semigroups. 1

Invertible elements in associates and semigroups. 1 Quasigroups and Related Systems 5 (1998), 53 68 Invertible elements in associates and semigroups. 1 Fedir Sokhatsky Abstract Some invertibility criteria of an element in associates, in particular in n-ary

More information

Metric Spaces. Chapter 7. 7.1. Metrics

Metric Spaces. Chapter 7. 7.1. Metrics Chapter 7 Metric Spaces A metric space is a set X that has a notion of the distance d(x, y) between every pair of points x, y X. The purpose of this chapter is to introduce metric spaces and give some

More information

FACTORING POLYNOMIALS IN THE RING OF FORMAL POWER SERIES OVER Z

FACTORING POLYNOMIALS IN THE RING OF FORMAL POWER SERIES OVER Z FACTORING POLYNOMIALS IN THE RING OF FORMAL POWER SERIES OVER Z DANIEL BIRMAJER, JUAN B GIL, AND MICHAEL WEINER Abstract We consider polynomials with integer coefficients and discuss their factorization

More information

A New Interpretation of Information Rate

A New Interpretation of Information Rate A New Interpretation of Information Rate reproduced with permission of AT&T By J. L. Kelly, jr. (Manuscript received March 2, 956) If the input symbols to a communication channel represent the outcomes

More information

A Branch and Bound Algorithm for Solving the Binary Bi-level Linear Programming Problem

A Branch and Bound Algorithm for Solving the Binary Bi-level Linear Programming Problem A Branch and Bound Algorithm for Solving the Binary Bi-level Linear Programming Problem John Karlof and Peter Hocking Mathematics and Statistics Department University of North Carolina Wilmington Wilmington,

More information

Chapter 3 Nonlinear Model Predictive Control

Chapter 3 Nonlinear Model Predictive Control Chapter 3 Nonlinear Model Predictive Control In this chapter, we introduce the nonlinear model predictive control algorithm in a rigorous way. We start by defining a basic NMPC algorithm for constant reference

More information

Stationarity Results for Generating Set Search for Linearly Constrained Optimization

Stationarity Results for Generating Set Search for Linearly Constrained Optimization SANDIA REPORT SAND2003-8550 Unlimited Release Printed October 2003 Stationarity Results for Generating Set Search for Linearly Constrained Optimization Tamara G. Kolda, Robert Michael Lewis, and Virginia

More information

The Envelope Theorem 1

The Envelope Theorem 1 John Nachbar Washington University April 2, 2015 1 Introduction. The Envelope Theorem 1 The Envelope theorem is a corollary of the Karush-Kuhn-Tucker theorem (KKT) that characterizes changes in the value

More information

Increasing for all. Convex for all. ( ) Increasing for all (remember that the log function is only defined for ). ( ) Concave for all.

Increasing for all. Convex for all. ( ) Increasing for all (remember that the log function is only defined for ). ( ) Concave for all. 1. Differentiation The first derivative of a function measures by how much changes in reaction to an infinitesimal shift in its argument. The largest the derivative (in absolute value), the faster is evolving.

More information

15 Kuhn -Tucker conditions

15 Kuhn -Tucker conditions 5 Kuhn -Tucker conditions Consider a version of the consumer problem in which quasilinear utility x 2 + 4 x 2 is maximised subject to x +x 2 =. Mechanically applying the Lagrange multiplier/common slopes

More information

Approximation Algorithms

Approximation Algorithms Approximation Algorithms or: How I Learned to Stop Worrying and Deal with NP-Completeness Ong Jit Sheng, Jonathan (A0073924B) March, 2012 Overview Key Results (I) General techniques: Greedy algorithms

More information

4.1 Introduction - Online Learning Model

4.1 Introduction - Online Learning Model Computational Learning Foundations Fall semester, 2010 Lecture 4: November 7, 2010 Lecturer: Yishay Mansour Scribes: Elad Liebman, Yuval Rochman & Allon Wagner 1 4.1 Introduction - Online Learning Model

More information

1. Prove that the empty set is a subset of every set.

1. Prove that the empty set is a subset of every set. 1. Prove that the empty set is a subset of every set. Basic Topology Written by Men-Gen Tsai email: b89902089@ntu.edu.tw Proof: For any element x of the empty set, x is also an element of every set since

More information

In this section, we will consider techniques for solving problems of this type.

In this section, we will consider techniques for solving problems of this type. Constrained optimisation roblems in economics typically involve maximising some quantity, such as utility or profit, subject to a constraint for example income. We shall therefore need techniques for solving

More information

Practice with Proofs

Practice with Proofs Practice with Proofs October 6, 2014 Recall the following Definition 0.1. A function f is increasing if for every x, y in the domain of f, x < y = f(x) < f(y) 1. Prove that h(x) = x 3 is increasing, using

More information