Adaptive Control Using Combined Online and Background Learning Neural Network

Eric N. Johnson and Seung-Min Oh

Abstract— A new adaptive neural network (NN) control concept is proposed with proof of stability properties. The NN learns the plant dynamics through online training and combines this with background learning from previously recorded data, which can improve the convergence characteristics of the NN adaptation. The adaptation characteristics of the new combined online and background learning adaptive NN are demonstrated through simulations.

I. INTRODUCTION

Recently, artificial neural networks, which mimic the biological neuronal mechanisms of the human brain, have been successfully used in various fields, including pattern recognition, signal processing, and adaptive control [1]. A neural network can be thought of as a parameterized class of nonlinear maps. Throughout the 1980s and the early 1990s, numerous researchers showed that multilayer feedforward neural networks are capable of approximating any continuous unknown nonlinear function, or mapping, on a compact set [1], [2], and that neural networks have an online learning capability that does not require preliminary off-line tuning [3]. As a result, this architecture represents a successful framework for use in adaptive nonlinear control systems. Online adaptive neural network controllers have been extensively studied and successfully applied to robot control by Lewis and others [3], who have provided many feasible online learning algorithms accompanied by mathematical stability analyses. Online learning architectures are used to compensate for dynamic inversion model error caused by system uncertainties. Current adaptive neural network online control methods take various forms that draw on techniques from classical adaptive control, including the σ-modification [1], ε-modification [1], [4], dead-zone method [5], and projection method [6].
Most of these NN training laws, however, have NN weight dynamics of low rank, mostly unity [7]. NN training could potentially be full rank, and these additional degrees of freedom could be utilized to improve system performance. In this paper, we propose a new approach to neural network adaptive control that overcomes the rank-1 limitation and exhibits semi-global learning properties. This is accomplished by combining online learning algorithms with a background learning concept.

This work was supported in part by NSF #ECS-0238993. E. N. Johnson is the Lockheed Martin Assistant Professor of Avionics Integration, School of Aerospace Engineering, Georgia Institute of Technology, Atlanta, GA 30332, USA, eric.johnson@ae.gatech.edu. S.-M. Oh is a Graduate Research Assistant at the same school, gtg895i@mail.gatech.edu.

Fig. 1. Neural network adaptive control, including an approximate dynamic inversion. [Block diagram: reference model, P-D compensator, neural network with adaptation law, approximate dynamic inversion, plant, and error calculation.]

II. ADAPTIVE CONTROL ARCHITECTURE

The block diagram in Fig. 1 illustrates the key elements of the baseline controller architecture: the plant; the reference model that provides the desired response; the approximate dynamic inversion block; the linear (proportional/derivative, or P-D) compensator used to track the reference model; and the online learning NN that corrects for errors/uncertainty in the approximate dynamic inversion.

A. Dynamic Inversion-Based Adaptive Control

For simplicity, consider the case of full model inversion, in which the representative n-degree-of-freedom multi-input multi-output (MIMO) plant dynamics are given as

ẍ = f(x, ẋ, δ), (1)

where x, ẋ, δ ∈ Rⁿ. We introduce a pseudo-control input ν, which represents a desired ẍ and is expected to be approximately achieved by the actuating signal δ:

ẍ = ν, where ν = f(x, ẋ, δ).

Ideally, the actual control input δ is obtained by inverting f.
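As a minimal sketch of the inversion idea, consider a hypothetical scalar plant whose damping term is unknown to the controller; the approximate inversion then simply passes the pseudo-control through, and the leftover acceleration is the model error Δ that the NN must later cancel. All numerical values here are illustrative assumptions, not from the paper:

```python
import numpy as np

# Toy scalar plant: xddot = f(x, xdot, delta). The sinusoidal damping term is
# "unknown" to the controller; the approximate model fhat omits it entirely.
def f(x, xdot, delta):
    return delta - 0.5 * np.sin(x) * xdot   # hypothetical true dynamics

def fhat_inverse(x, xdot, nu):
    # Approximate dynamic inversion: pick delta so fhat(x, xdot, delta) = nu,
    # where fhat(x, xdot, delta) = delta (the unmodeled term is ignored).
    return nu

nu = 1.0                       # desired acceleration (pseudo-control)
delta_cmd = fhat_inverse(0.3, 0.2, nu)
xddot = f(0.3, 0.2, delta_cmd)
model_error = xddot - nu       # Delta(x, xdot, delta): what the NN must cancel
```

The residual `model_error` is exactly the unmodeled damping term, which is what motivates augmenting the inversion with an adaptive element.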
Since the exact function f(x, ẋ, δ) is usually unknown or difficult to invert, an approximation is introduced as ν = f̂(x, ẋ, δ), which results in a modeling error in the system dynamics

ẍ = ν + Δ(x, ẋ, δ), (2)

where Δ(x, ẋ, δ) = f(x, ẋ, δ) − f̂(x, ẋ, δ). Based on the approximation ν above, the actuator command is determined by an approximate dynamic inversion of the form
δ_cmd = f̂⁻¹(x, ẋ, ν), (3)

where ν is the pseudo-control and represents a desired ẍ that is expected to be approximately achieved by δ_cmd. The reference model dynamics are given as

ẍ_rm = ν_crm = f_rm(x_rm, ẋ_rm, x_c, ẋ_c), (4)

where x_c, ẋ_c represent external commands.

B. Model Tracking Error Dynamics

The total pseudo-control signal for the system is now constructed from three components:

ν = ν_crm + ν_pd − ν_ad, (5)

where ν_crm is the pseudo-control signal generated by the reference model in (4), ν_pd is the output of the linear compensator, and ν_ad is the NN adaptation signal. The linear compensator term ν_pd can be designed by any standard linear control design technique, and is most often implemented as PD (proportional-derivative) compensation, as long as the linearized closed-loop system is stable. For the second-order system, PD compensation is expressed by ν_pd = [K_p K_d] e, where the reference model tracking error is defined by e = [(x_rm − x)ᵀ (ẋ_rm − ẋ)ᵀ]ᵀ, and the compensator gain matrices K_p > 0 ∈ Rⁿˣⁿ and K_d > 0 ∈ Rⁿˣⁿ are diagonal matrices to be designed. The model tracking error dynamics are found by differentiating e:

ė = Ae + B[ν_ad − Δ(x, ẋ, δ)], (6)

where

A = [0 I; −K_p −K_d], B = [0; I],

and Δ(x, ẋ, δ) = f(x, ẋ, δ) − f̂(x, ẋ, δ) is the model error to be approximated and canceled by ν_ad, the output of the NN. The linear PD compensator gains K_p, K_d are chosen such that A is Hurwitz. These dynamics form the basis of the NN adaptive law.

C. Neural-Network-Based Adaptation

Single hidden layer (SHL) perceptron NNs are universal approximators: they can approximate any smooth nonlinear function to within arbitrary accuracy, given a sufficient number of hidden layer neurons and appropriate input information [2]. Here, the SHL NN is trained online with feedback so that its output cancels the model error.
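The construction of A and B from the PD gains, the Hurwitz check, and the Lyapunov equation AᵀP + PA + Q = 0 used later can be verified numerically. The gains and Q below are illustrative choices, and the Lyapunov equation is solved by vectorization rather than a library routine:

```python
import numpy as np

n = 2  # hypothetical DOF count
Kp, Kd = 4.0 * np.eye(n), 2.0 * np.eye(n)   # example diagonal PD gains

# A, B from the tracking error dynamics edot = A e + B [nu_ad - Delta]
A = np.block([[np.zeros((n, n)), np.eye(n)],
              [-Kp, -Kd]])
B = np.vstack([np.zeros((n, n)), np.eye(n)])

# A must be Hurwitz: all eigenvalues in the open left half-plane
assert np.all(np.linalg.eigvals(A).real < 0)

# Solve A^T P + P A + Q = 0 via vectorization:
# vec(A^T P) = (I kron A^T) vec(P), vec(P A) = (A^T kron I) vec(P)
Q = np.eye(2 * n)
M = np.kron(np.eye(2 * n), A.T) + np.kron(A.T, np.eye(2 * n))
P = np.linalg.solve(M, -Q.reshape(-1, order="F")).reshape(2 * n, 2 * n, order="F")
```

For stable A and Q > 0, the resulting P is symmetric positive definite, as required by the adaptive laws that follow.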
The input-output map of the SHL NN can be expressed as [7]

ν_ad,k = b_w θ_w,k + Σ_{j=1}^{n₂} w_{j,k} σ_j(b_v θ_v,j + Σ_{i=1}^{n₁} v_{i,j} x̄_i), k = 1, ..., n₃,

or in matrix form ν_ad(W, V, x̄) = Wᵀσ(Vᵀx̄) ∈ R^{n₃}, where x̄ = [b_v x₁ x₂ ... x_{n₁}]ᵀ is the NN input vector, σ(z) = [b_w σ₁(z₁) σ₂(z₂) ... σ_{n₂}(z_{n₂})]ᵀ is a sigmoidal activation function vector, V is the input-layer-to-hidden-layer weight matrix, W is the hidden-layer-to-output-layer weight matrix, and ν_ad is the NN output. n₁, n₂, and n₃ are the numbers of variable input nodes, variable hidden layer nodes, and outputs, respectively. The input vector to the hidden layer neurons is given by z = Vᵀx̄ = [z₁ z₂ ... z_{n₂}]ᵀ, and the input-output map in the hidden layer is defined by the sigmoidal activation function σ_j(z_j) = 1/(1 + e^{−a_j z_j}), j = 1, ..., n₂. A matrix containing the derivatives of the sigmoid vector is denoted σ′(z).

The universal approximation property of NNs ensures that for x̄ ∈ D, where D is a bounded domain, there exists an N and an ideal set of weights (W*, V*) such that Δ = W*ᵀσ(V*ᵀx̄) + ε, ‖ε‖ ≤ ε̄, for n₂ ≥ N. We introduce the following assumptions:

Assumption 1. All external command signals are bounded: ‖[x_cᵀ ẋ_cᵀ ẍ_cᵀ]‖ ≤ x̄_c.

Assumption 2. The input vector to the NN is uniformly bounded: ‖x̄‖ ≤ x̄*, x̄* > 0.

Assumption 3. The norm of the ideal NN weights is bounded: ‖Z*‖_F ≤ Z̄.

Define W̃ ≜ W − W*, Ṽ ≜ V − V*, Z̃ ≜ [W̃ 0; 0 Ṽ]. W̄ and V̄ are upper bounds for the ideal NN weights: ‖W*‖_F < W̄, ‖V*‖_F < V̄. Using the Taylor series expansion of σ(z*) around σ(z), we get σ(z*) = σ(z) − σ′(z)z̃ + O(‖z̃‖²), where O(·) represents the higher-order terms, and z = Vᵀx̄, z* = V*ᵀx̄, z̃ = z − z* = Ṽᵀx̄. Expanding the NN/model-error cancellation error [6], we have

ν_ad − Δ = Wᵀσ(Vᵀx̄) − W*ᵀσ(V*ᵀx̄) − ε = W̃ᵀσ + Wᵀσ′Ṽᵀx̄ + w,

where w = −W̃ᵀσ′Ṽᵀx̄ − W*ᵀO(‖Ṽᵀx̄‖²) − ε. The following bounds are useful in proving the stability of the adaptive law: ‖σ(z)‖ ≤ √(b_w² + n₂), sup_{z_j} |z_j σ′_j(z_j)| ≤ δ̄ = 0.224, ‖σ′(z)Vᵀx̄‖ ≤ δ̄√n₂, ‖σ′(z)‖ ≤ (ā/4)√n₂.

D.
Online Learning NN Adaptive Control and the Rank-1 Limitation

An appropriate use of NNs is nonlinear multidimensional curve fitting, which can be applied to approximate the error in a model f̂ of f, as described above. The NN is normally trained offline on some form of training data, or online while controlling the plant. However, the usual online NN weight adaptation laws tap only a small amount of the adaptation potential of SHL perceptron NNs. This limitation is a consequence of how the adaptive law is developed (backstepping), and results in an adaptive law of rank 1, i.e., a rank of at most unity. Consider the nonnegative definite Lyapunov function candidate of the form

L(e, W̃, Ṽ) = ½eᵀPe + ½tr(W̃Γ_w⁻¹W̃ᵀ) + ½tr(ṼᵀΓ_v⁻¹Ṽ),

where Γ_w and Γ_v are positive definite learning rate weighting matrices. One obtains the following online adaptive law from the time derivative of the Lyapunov function candidate [3]:

Ẇ = −σ(Vᵀx̄) r Γ_w, (7)
V̇ = −Γ_v x̄ r Wᵀσ′(Vᵀx̄), (8)

where r = eᵀPB and P ∈ R²ⁿˣ²ⁿ is the positive definite solution to AᵀP + PA + Q = 0. Since this original
backpropagation law is the basis for developing the combined online and background learning weight adaptation law, this fundamental form of the online learning NN law is introduced for comparison purposes.

Fact 1: Every matrix of the form A = uvᵀ has rank at most one, where A is an n × m matrix, u is an n-vector, and v is an m-vector [8].

Since σ is an (n₂ + 1) column vector and rΓ_w is an n₃ row vector, Ẇ is always a matrix of rank at most one. Similarly, V̇ is of rank at most one, because Γ_v x̄ is an (n₁ + 1) column vector and rWᵀσ′ is an n₂ row vector. Even though the online NN weight adaptation laws have matrix forms, the rank of the gradient matrices is always at most one. This implies that the adaptation performance of the NN weights might be improved by taking advantage of the remaining subspace.

III. COMBINED ONLINE AND BACKGROUND LEARNING ADAPTIVE CONTROL ARCHITECTURE

A simultaneous batch (or background) and instantaneous (or online) learning NN adaptation law is proposed. This law trains both on a set of data points taken at different times and on purely current state/control information. This is done by combining a priori information, recorded online data or a "history stack" [9], with the instantaneous current data, as in (7) and (8). Both are performed concurrently with real-time control, and they provide the same guarantees of boundedness as earlier online training approaches. The approach and theory are given in the following subsections.

A. Selection of NN Inputs for Background Learning

One reasonable choice for the background learning adaptive law is to train the neural network on previously stored information, such as a priori stored information or state/control data stored in real time. One potential technique for conducting this training is presented here.
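The SHL forward map and the rank-one character of the update (7) are easy to check numerically. The dimensions, bias terms, and random weights below are placeholders for illustration only:

```python
import numpy as np

rng = np.random.default_rng(0)
n1, n2, n3 = 4, 6, 2            # inputs, hidden neurons, outputs (example sizes)
bv, bw = 1.0, 1.0               # input-layer and hidden-layer bias terms

V = rng.standard_normal((n1 + 1, n2))   # input-to-hidden weights
W = rng.standard_normal((n2 + 1, n3))   # hidden-to-output weights

def sigmoid(z, a=1.0):
    return 1.0 / (1.0 + np.exp(-a * z))

def nn_output(W, V, x):
    xbar = np.concatenate(([bv], x))                    # [b_v, x1..x_n1]
    sig = np.concatenate(([bw], sigmoid(V.T @ xbar)))   # [b_w, sigma_1..sigma_n2]
    return W.T @ sig, xbar, sig

nu_ad, xbar, sig = nn_output(W, V, np.array([0.1, -0.2, 0.3, 0.5]))

# Online update (7): Wdot = -sigma(V^T xbar) r Gamma_w is an outer product,
# so its rank is at most one regardless of the matrix dimensions.
r = rng.standard_normal((1, n3))    # r = e^T P B (placeholder values)
Wdot = -np.outer(sig, r)            # Gamma_w = I for simplicity
assert np.linalg.matrix_rank(Wdot) <= 1
```

Although `Wdot` is a full (n₂+1) × n₃ matrix, the rank assertion confirms that the online law only ever moves the weights within a one-dimensional subspace per time step, which is the limitation the combined law addresses.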
With reference to the online learning NN adaptation law in (7)-(8) and the model tracking error dynamics in (6), we need the input vector (x̄) to the NN and the corresponding model error (Δ) in order to train the NN. It is assumed that, for times sufficiently far in the past, the model error can be observed or otherwise measured, and that the corresponding inputs to the NN, such as parameters, states, and controls, are stored. This storage can be done for a number of data points, i = 1, 2, ..., p, where Δ_i = f_i − f̂(x_i, ẋ_i, δ_i) is the model error that will be used in the background adaptation. Regarding the selection of data points for the background learning NN adaptation, one may raise the following questions:

1) How can we calculate the model error, Δ_i, for the i-th stored data point? We normally know neither the exact model nor the model error. One easily implemented method for estimating the model error Δ_i, which will be saved and used in background learning, is to utilize the residual signal r_i from the online NN adaptation. The model error Δ_i for the i-th stored data point x̄_i (i = 1, 2, ..., p) is estimated by the following equation at the time of each data storage:

Δ̂_i = Wᵀσ(Vᵀx̄_i) − r_iᵀ, (9)

where r_i is the residual signal. The current-time r_i for background learning is obtained through simulation of the tracking error dynamics:

r_i = e_iᵀPB, (10)

where ė_i = Ae_i + B[Wᵀσ(Vᵀx̄_i) − Δ̂_i].

2) Which data points, x̄_i (i = 1, 2, ..., p), should be stored for use in background learning adaptation? One possible choice for selecting new data points to be stored is to save a point whenever

(x̄ − x̄_p)ᵀ(x̄ − x̄_p) / (x̄ᵀx̄) > ε_x̄. (11)

This implies that new points are stored whenever the difference between the current input (states and controls) and the last stored data point is greater than some specified amount.
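The storage test (11) can be sketched as follows; the threshold and the sample inputs are hypothetical values chosen only to show one rejected and one accepted point:

```python
import numpy as np

eps_x = 0.1   # storage threshold epsilon (design parameter, assumed value)

def should_store(xbar, xbar_last):
    # Store a new point when its distance from the last stored point,
    # normalized by the current input's magnitude, exceeds the threshold.
    d = xbar - xbar_last
    return float(d @ d) / float(xbar @ xbar) > eps_x

history = [np.array([1.0, 0.0, 0.0])]        # hypothetical stored NN inputs
for xbar in [np.array([1.01, 0.0, 0.0]),     # nearly identical: skipped
             np.array([1.0, 0.6, 0.0])]:     # sufficiently different: stored
    if should_store(xbar, history[-1]):
        history.append(xbar)

# len(history) == 2: only the sufficiently different point was added
```

The normalization by x̄ᵀx̄ makes the criterion scale-invariant, so the same threshold works across operating points of very different magnitudes.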
Saving only sufficiently different data points maximizes the coverage of the NN input domain, which is spanned by x̄_i (i = 1, 2, ..., p) and in which background adaptation will be performed.

3) When should a data point be removed from storage, and which point should be removed first? As a practical consideration, the total number of stored points could be fixed. When a new point is added, the oldest point, or the one least representative of the overall space, could be dropped.

B. Combined Online and Background Learning NN Adaptation

One approach to utilizing the stored data points in background learning NN adaptation is to apply the same adaptive learning law structure as the online learning weight adaptation law. All stored data points x̄_i (i = 1, 2, ..., p) used in background learning adaptation are weighted equally and summed with the online learning adaptation. The model tracking error for each stored data point is simulated by model tracking error dynamics with the same structure as those of the online learning adaptation. A projection operator [6] is employed to constrain the NN weight estimates inside a known convex bounded set in the weight space that contains the unknown optimal weights.

For the boundedness proof of the combined online and background learning NN adaptation law, we need a boundedness theorem for the state-dependent impulsive dynamical system in state-space form

ẋ(t) = f_c(x(t)), x(0) = x₀, x(t) ∉ Z, (12)
Δx(t) = f_d(x(t)), x(t) ∈ Z, (13)

where x(t) ∈ D ⊆ Rⁿ, D is an open set with 0 ∈ D, Δx(t) ≜ x(t⁺) − x(t), f_c : D → Rⁿ is Lipschitz continuous with f_c(0) = 0, f_d : Z → Rⁿ is continuous, and Z ⊂ D is the resetting set. We refer to the differential equation (12) as the continuous-time dynamics between the
resetting events, and we refer to the difference equation (13) as the resetting law. The stability of the zero solution of the state-dependent impulsive dynamical system is dealt with in detail by Haddad et al. [10]. We slightly extend that result for the proof of boundedness.

Theorem 1: Suppose there exists a piecewise continuously differentiable function L : D → [0, ∞) such that

L(0) = 0, (14)
L(x) > 0, x ∈ D∖{0}, (15)
L̇(x) = L′(x)f_c(x) < 0, x ∉ Z, x ∈ Ω∖{0}, (16)
ΔL ≜ L(x + Δx(t)) − L(x) ≤ 0, x ∈ Z, (17)

where Ω ⊆ D. Then the solution x(t) of (12), (13) is bounded, ultimately residing outside the region Ω.

Proof: Assume that the resetting times τ_k(x₀) are well-defined and distinct for every trajectory of (12), (13) [10]. Before the first resetting time (0 ≤ t ≤ τ₁(x₀)), L(x(t)) can be obtained from the integral equation

L(x(t)) = L(x(0)) + ∫₀ᵗ L′(x(τ))f_c(x(τ))dτ, t ∈ [0, τ₁(x₀)].

Between two consecutive resetting times τ_k(x₀) and τ_{k+1}(x₀), we get

L(x(t)) = L(x(τ_k(x₀)) + Δx(τ_k(x₀))) + ∫_{τ_k(x₀)}^{t} L′(x(τ))f_c(x(τ))dτ
= L(x(τ_k(x₀))) + [L(x(τ_k(x₀)) + Δx(τ_k(x₀))) − L(x(τ_k(x₀)))] + ∫_{τ_k(x₀)}^{t} L′(x(τ))f_c(x(τ))dτ, t ∈ (τ_k(x₀), τ_{k+1}(x₀)].

By recursive substitution, we get

L(x(t)) = L(x(τ₁(x₀))) + Σ_{i=1}^{k} [L(x(τ_i(x₀)) + Δx(τ_i(x₀))) − L(x(τ_i(x₀)))] + ∫_{τ₁(x₀)}^{t} L′(x(τ))f_c(x(τ))dτ, t ∈ (τ_k(x₀), τ_{k+1}(x₀)]. (18)

Since ΔL = L(x(τ_i(x₀)) + Δx(τ_i(x₀))) − L(x(τ_i(x₀))) ≤ 0, combining (18) with the interval before the first resetting gives

L(x(t)) ≤ L(x(0)) + ∫₀ᵗ L′(x(τ))f_c(x(τ))dτ.

Since L̇(x) = L′(x)f_c(x) < 0 for all x ∈ Ω∖{0}, x ∉ Z, it follows that

L(x(t)) ≤ L(x(0)) for all t ≥ 0, (19)

and Lyapunov stability is established. For some time s < t, we obtain an expression similar to (18). Subtracting it from (18) yields

L(x(t)) − L(x(s)) = ∫ₛᵗ L′(x(τ))f_c(x(τ))dτ < 0, t > s, x ∈ Ω, (20)

so L(x(t)) < L(x(s)) for all t > s while x ∈ Ω.

Fig. 2. Projection operator. [Diagram: the convex set g(W_i) ≤ c, the gradient ∇g(W_i), and the projected update Proj(W_i, ξ_i).]
As long as x(t) lies in the region Ω, the trajectory moves so as to reduce L(x(t)) as time increases, until x(t) exits Ω. Hence, x is bounded by the region outside of the set Ω. ∎

We will now derive the combined online and background learning NN adaptation law and prove its boundedness using Theorem 1.

Theorem 2: Consider the system (1) with the inverting controller (3). The following combined online and background learning NN adaptation law guarantees the boundedness of all system signals:

Ẇ = Proj(W, ξ) Γ_w, (21)
V̇ = Γ_v Proj(V, ζ), (22)

where

ė_i = Ae_i + B(Wᵀσ(Vᵀx̄_i) − Δ̂_i), i = 1, ..., p, (23)
ξ = −σr − Σ_{i=1}^{p} σ(Vᵀx̄_i) r_i, (24)
ζ = −x̄rWᵀσ′(Vᵀx̄) − Σ_{i=1}^{p} x̄_i r_i Wᵀσ′(Vᵀx̄_i), (25)
r = eᵀPB, r_i = e_iᵀPB, i = 1, ..., p, (26)
Δ̂_i = Wᵀσ(Vᵀx̄_i) − r_iᵀ, i = 1, ..., p, (27)
AᵀP + PA + Q = 0. (28)

The projection operator is defined on column vectors. For the weight matrices, the following definitions are used: Proj(W, ξ) = [Proj(W₁, ξ₁) ... Proj(W_{n₃}, ξ_{n₃})] ∈ R^{(n₂+1)×n₃} and Proj(V, ζ) = [Proj(V₁, ζ₁) ... Proj(V_{n₂}, ζ_{n₂})] ∈ R^{(n₁+1)×n₂}, where W_i, ξ_i, V_i, ζ_i are the i-th columns of the matrices W, ξ, V, ζ, respectively. The projection operator concept, illustrated in Fig. 2, is defined as

Proj(W_i, ξ_i) = ξ_i − (∇g∇gᵀ/‖∇g‖²) ξ_i g(W_i), if g(W_i) > 0 and ∇gᵀξ_i > 0,
Proj(W_i, ξ_i) = ξ_i, otherwise, i = 1, ..., n₃.

Here, we introduce a convex set having a smooth boundary, defined by Ω_{i,c} = {W_i ∈ R^{n₂+1} : g(W_i) ≤ c}, c ≥ 0, where g : R^{n₂+1} → R is a smooth known function
g(W_i) = (W_iᵀW_i − W̄_i²)/ε̄, i = 1, ..., n₃, where W̄_i is the estimated bound on the weight vector W_i and ε̄ > 0 denotes the projection tolerance. The gradient of the convex function is the column vector ∇g(W_i) = 2W_i/ε̄, i = 1, ..., n₃.

Proof: This theorem can be proved by introducing the following Lyapunov function candidate:

L(e, e_i, W̃, Ṽ) = ½eᵀPe + ½tr(W̃Γ_w⁻¹W̃ᵀ) + ½tr(ṼᵀΓ_v⁻¹Ṽ) + ½Σ_{i=1}^{p} e_iᵀPe_i. (29)

Remark 1: When a data point is added, the discrete change in the Lyapunov function L is zero, since the initial condition for each tracking error dynamics (23) is set to zero (e_i(0) = 0). When a data point is dropped, the discrete change in the Lyapunov function L is negative.

With the Lyapunov function candidate (29), the positive definiteness conditions (14), (15) of Theorem 1 are satisfied. In addition, the nonincreasing condition (17) on the Lyapunov function at the resetting set is guaranteed by Remark 1. Finally, we need to prove the decrease condition (16) between the resetting points and find the region Ω in which the Lyapunov function decreases, outside of which the trajectory ultimately resides.

The boundedness of the NN weights W is shown by considering a function of the form [6] L_{w_i} = g(W_i), i = 1, ..., n₃. Its rate of change along (21), with Γ_w diagonal with entries γ_{w,i}, is

L̇_{w_i} = ∇gᵀẆ_i = γ_{w,i}∇gᵀProj(W_i, ξ_i) = γ_{w,i}∇gᵀξ_i(1 − g(W_i)), if g(W_i) > 0 and ∇gᵀξ_i > 0,
L̇_{w_i} = γ_{w,i}∇gᵀξ_i, otherwise, i = 1, ..., n₃.

Hence, outward motion vanishes on the boundary g(W_i) = 1, and W_i remains bounded in the compact set Ω_{i,1}. Denote the resulting maximum norm of W as W_max ≜ max_{W_i ∈ Ω_{i,1}, i=1,...,n₃} ‖W(t)‖. Similarly, V is bounded and the maximum of its norm is denoted V_max ≜ max_{V_i ∈ Ω_{i,1}, i=1,...,n₂} ‖V(t)‖. Using these bounds on W and V, the disturbances w and w_i can be bounded as

‖w‖ ≤ (b_w + √n₂)W̄ + (ā/4)√n₂ W_max(V_max + V̄)‖x̄‖ + ε̄ ≜ w̄,

and, for the stored points, ‖w_i‖ ≤ (b_w + √n₂)W̄ + (ā/4)√n₂ W_max(V_max + V̄)‖x̄_i‖ + ε̄ ≜ w̄_i, i = 1, ..., p. The tracking error dynamics for the current state can be expressed as

ė = Ae + B(ν_ad − Δ) = Ae + B(W̃ᵀσ(Vᵀx̄) + Wᵀσ′(Vᵀx̄)Ṽᵀx̄ + w).
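The column-wise projection operator can be sketched numerically; the quadratic boundary function g, its tolerance, and the weight bound below follow the assumed form above with illustrative values:

```python
import numpy as np

Wbar, eps_tol = 5.0, 0.5   # weight-norm bound and projection tolerance (examples)

def g(Wi):
    # Convex boundary function (assumed quadratic form)
    return (Wi @ Wi - Wbar**2) / eps_tol

def grad_g(Wi):
    return 2.0 * Wi / eps_tol

def proj(Wi, xi):
    # Column-wise projection: scale back the component of xi along grad g
    # whenever Wi is outside the set g <= 0 and xi points further outward.
    gv, gr = g(Wi), grad_g(Wi)
    if gv > 0 and gr @ xi > 0:
        return xi - np.outer(gr, gr) @ xi / (gr @ gr) * gv
    return xi

Wi = np.array([5.1, 0.0])   # slightly outside the weight bound
xi = np.array([1.0, 0.3])   # raw update pushing further out
xi_p = proj(Wi, xi)
# The outward (radial) component shrinks; the tangential part survives
assert grad_g(Wi) @ xi_p < grad_g(Wi) @ xi
```

Only the component of the update along ∇g is attenuated, so learning continues freely along directions tangent to the weight-norm boundary.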
Similarly, the simulated tracking error dynamics for the i-th stored data point x̄_i, i = 1, ..., p, are

ė_i = Ae_i + B(ν_ad,i − Δ_i) = Ae_i + B(W̃ᵀσ(Vᵀx̄_i) + Wᵀσ′(Vᵀx̄_i)Ṽᵀx̄_i + w_i).

With these tracking error dynamics, the time derivative of the Lyapunov function candidate (29) can be expressed as

L̇ = −½eᵀQe + r(ν_ad − Δ) + tr(W̃Γ_w⁻¹Ẇᵀ) + tr(ṼᵀΓ_v⁻¹V̇) − ½Σ_{i=1}^{p} e_iᵀQe_i + Σ_{i=1}^{p} r_i(ν_ad,i − Δ_i)
= −½eᵀQe − ½Σ_{i=1}^{p} e_iᵀQe_i + rw + Σ_{i=1}^{p} r_i w_i + Σ_{i=1}^{n₃} {(W_i − W_i*)ᵀ[Proj(W_i, ξ_i) − ξ_i]} + Σ_{i=1}^{n₂} {(V_i − V_i*)ᵀ[Proj(V_i, ζ_i) − ζ_i]}.

By the definition of the projection operator,

Proj(W_i, ξ_i) − ξ_i = −(∇g∇gᵀ/‖∇g‖²) ξ_i g(W_i), if g(W_i) > 0 and ∇gᵀξ_i > 0, and 0 otherwise, i = 1, ..., n₃.

Since g is a convex function, ∇g always points outward. Hence, referring to Fig. 2, we get (W_i* − W_i)ᵀ∇g(W_i) ≤ 0. Therefore, the following quantities are always less than or equal to zero:

(W_i − W_i*)ᵀ(Proj(W_i, ξ_i) − ξ_i) ≤ 0. (30)

By a similar argument, since (V_i* − V_i)ᵀ∇h(V_i) ≤ 0, the next inequalities are always true:

(V_i − V_i*)ᵀ(Proj(V_i, ζ_i) − ζ_i) ≤ 0. (31)

Using (30) and (31),

L̇ ≤ −½eᵀQe − ½Σ_{i=1}^{p} e_iᵀQe_i + ‖r‖w̄ + Σ_{i=1}^{p} ‖r_i‖w̄_i
≤ −½λ_min(Q)(‖e‖ − w̄‖PB‖/λ_min(Q))² − ½λ_min(Q)Σ_{i=1}^{p}(‖e_i‖ − w̄_i‖PB‖/λ_min(Q))² + γ,

where γ = (‖PB‖²/(2λ_min(Q)))(w̄² + Σ_{i=1}^{p} w̄_i²). Hence,

L̇(x) < 0, x ∉ Z, x ∈ Ω∖{0}, (32)

where Ω = {x ∈ Rⁿ : ‖e‖ > w̄‖PB‖/λ_min(Q) + √(2γ/λ_min(Q)) or ‖e_i‖ > w̄_i‖PB‖/λ_min(Q) + √(2γ/λ_min(Q)) for some i}.

Using Assumption 3 and the result that the columns of W(t), W_i, are bounded in the compact sets Ω_{i,1}, the error weight matrix between the hidden and output layers, W̃(t), is clearly bounded. A similar argument applies for the boundedness of the error weight matrix between the input and hidden layers, Ṽ(t). Since all the conditions of Theorem 1 are satisfied, the ultimate boundedness of e, e_i, W̃, Ṽ is established. Here, the following definitions are used:

Z = {x̄(t) ∈ D : (x̄(t) − x̄_i)ᵀ(x̄(t) − x̄_i)/(x̄(t)ᵀx̄(t)) > ε}, (33)

where x̄(t) is the input vector to the NN,

f_d = 0 for an added point, f_d = −e_i for a subtracted point, (34)

x = [(vec Ṽ)ᵀ (vec W̃)ᵀ eᵀ e₁ᵀ ... e_pᵀ]ᵀ. (35)

IV.
SIMULATION RESULTS

The overall approach of combining instantaneous online and background learning is conceptually appealing: current data dominates when the tracking error is large, while past data continues to train the NN in any case, so that controller skill, i.e., performance when a problem experienced earlier is re-encountered, improves even when no excitation or tracking error is present. It is also a natural extension to allow the stored data to be developed from
certain types of a priori information regarding the plant, such as data recorded during previous use of the plant.

Fig. 3. Online NN adaptation law: (a) comparison of states, (b) weights V, (c) control input.

To illustrate the method, a local learning problem was induced by using a combination of a relatively low learning rate and a model error that was large and strongly dependent on the plant state variables. A low-dimensional problem is used as an illustration. Note that the greater the dimension of the problem (n₂), the greater the benefit expected from these methods due to the larger subspace; in this sense, this is a worst-case test. The plant is described by

ẍ = δ + sin(x)ẋ + ẋ², (36)

where the last two terms, regarded as unknown, represent a significant model error. The desired dynamics are those of a linear second-order system. First, a square wave external command repeating at a fixed frequency is simulated with online learning NN adaptation. There is only minor improvement over the 10 seconds of the trajectory shown in Fig. 3(a). The weight histories, shown in Fig. 3(b), slowly approach constant values, with a partially periodic response at the same period as the command and state response, indicating that the NN must re-learn much of the model error each time the external command is cycled. The control input history is provided in Fig. 3(c). Simulation results with combined online and background learning NN adaptation are presented in Figs. 4(a)-4(e). From Fig.
4(a), the combined learning NN adaptation provides better global convergence, except during the initial phase. In Fig. 4(b), note that the weights assume smooth and nearly constant values sooner.

Fig. 4. Combined NN adaptation law: (a) comparison of states, (b) weights V, (c) control input, (d) total number of stored points, (e) tracking error history.

V. CONCLUSIONS

A new adaptive neural network (NN) control concept is proposed. The NN retains the advantages of the existing online-trained NN, enabling the system to learn over the complete plant state and control space, and attains the capability of background learning using information from a priori recorded data together with the current state. Proof of boundedness of all system signals is provided. The characteristics of the algorithm were demonstrated using a simplified plant model simulation.

REFERENCES

[1] J. T. Spooner, M. Maggiore, R. Ordonez, and K. M. Passino, Stable Adaptive Control and Estimation for Nonlinear Systems: Neural and Fuzzy Approximator Techniques, John Wiley & Sons, 2002.
[2] K. Hornik, M. Stinchcombe, and H. White, "Multilayer Feedforward Networks are Universal Approximators," Neural Networks, Vol. 2, 1989, pp. 359-366.
[3] F. Lewis, "Nonlinear Network Structures for Feedback Control," Asian Journal of Control, Vol. 1, No. 4, December 1999.
[4] K. Narendra and A. Annaswamy, "A New Adaptive Law for Robust Adaptation Without Persistent Excitation," IEEE Transactions on Automatic Control, Vol. 32, No. 2, Feb. 1987, pp. 134-145.
[5] B. Peterson and K. Narendra, "Bounded Error Adaptive Control," IEEE Transactions on Automatic Control, Vol. 27, No. 6, Dec. 1982, pp. 1161-1168.
[6] N. Kim, Improved Methods in Neural Network-Based Adaptive Output Feedback Control, with Applications to Flight Control, Ph.D. Thesis, Georgia Institute of Technology, 2003.
[7] E. N. Johnson, Limited Authority Adaptive Flight Control, Ph.D. Thesis, Georgia Institute of Technology, 2000.
[8] G. Strang, Linear Algebra and Its Applications, 3rd Ed., Harcourt College Publishers, 1988.
[9] P. M. Mills, A. Y. Zomaya, and M. O.
Tade, Neuro-Adaptive Process Control: A Practical Approach, Wiley, 1996.
[10] W. M. Haddad and V. Chellaboina, Nonlinear Dynamical Systems and Control, preprint.