Adaptive control of discrete-time nonlinear systems using recurrent neural networks
L. Jin, P.N. Nikiforuk and M.M. Gupta
Indexing terms: Nonlinear systems, Adaptive control
Abstract: A learning and adaptive control scheme for a general class of unknown MIMO discrete-time nonlinear systems using multilayered recurrent neural networks (MRNNs) is presented. A novel MRNN structure is proposed to approximate the unknown nonlinear input-output relationship, using a dynamic backpropagation (DBP) learning algorithm. Based on the dynamic neural model, an extension of the concept of input-output linearisation of discrete-time nonlinear systems is used to synthesise a control technique for model reference control purposes. A dynamic learning control architecture is developed with simultaneous online identification and control. The potential of the proposed methods is demonstrated by simulation studies.

1 Introduction
The objective of neural-network-based adaptive control systems for unknown nonlinear plants is to develop algorithms for identification and control using neural networks through a learning process. To avoid the modelling difficulties associated with complex physical systems, the neural approach to learning and control provides a natural framework for the design of tracking controllers for unknown nonlinear systems, which can be viewed as nonlinear dynamic mappings of control inputs into observation outputs. A number of multilayered feedforward neural-network-based controllers have recently been proposed [9-21]. In such adaptive learning control systems, feedforward neural networks [1-5] are used to approximate the unknown nonlinear static functions contained in the nonlinear systems, so that adaptive control laws can then be designed on the basis of the neural approximation model. References 9 and 14 have proposed adaptive control approaches using neural networks for discrete-time nonlinear systems with all states observable. They divided the adaptive control problem for an unknown nonlinear system into an identification, or system modelling, stage and a nonlinear control stage. Meanwhile, References 15 and 16 have proposed and evaluated a Gaussian-network-based adaptive controller for a class of
© IEE, 1994. Paper 9976D (CS), first received 17th March and in revised form 23rd November 1993. The authors are at the Intelligent Systems Research Laboratory, College of Engineering, University of Saskatchewan, Saskatoon, Saskatchewan, Canada S7N 0W0. IEE Proc.-Control Theory Appl., Vol. 141, No. 3, May 1994
continuous-time nonlinear dynamic systems for which an explicit linear parameterisation of the uncertainty in the dynamics is either unknown or impossible to evaluate, and References 20 and 21 have provided a novel algorithm for adaptive tracking of SISO nonlinear systems using layered feedforward neural networks. They have also analysed the convergence of the weight learning and the stability of the closed-loop system. Because of the complexities of nonlinear dynamic systems, the main results of neural-network-based nonlinear adaptive control to date are focused on SISO nonlinear systems with all states observable. An important current research topic in this field is therefore the problem of neural learning and control for general multi-input and multi-output (MIMO) nonlinear systems, described by MIMO nonlinear state equations. More recently, several studies have noted that an appropriate dynamic mapping may be realised by a dynamic recurrent neural network which is trained through a series-parallel or a parallel learning model, similar to the case of feedforward networks, so that a desired response can be obtained [7, 8]. There are also potential advantages in using dynamic feedback neural models, such as better prediction capabilities than the static feedforward neural model [17]. A recurrent neural network consists of both feedforward and feedback connections between the layers and neurons, forming a complicated dynamic system. The ability of a recurrent neural network to approximate a continuous or discrete nonlinear dynamic system, by neural dynamics defined by a system of nonlinear differential or difference equations, thus has clear potential for application to adaptive control systems.

When dynamic recurrent neural networks are used to approximate and control an unknown nonlinear system through online learning processes, they may be treated as subsystems of the adaptive control system, where the weights of the networks are updated by a dynamic learning algorithm during the control process.

2 Multilayered recurrent neural networks (MRNNs)
An artificial neural network consists of many interconnected identical simple processing units called neurons or nodes. An individual neuron sums its weighted inputs and produces an output through a nonlinear activation function with a threshold. A novel multilayered recurrent neural network (MRNN) architecture is proposed in this section. The MRNN is a hybrid feedforward and feedback neural network, with the feedback represented by recurrent connections and crosstalk, and is appropriate for approximating a nonlinear dynamic system. The MRNN is composed of an input layer, a series of hidden layers, and an output layer. It allows feedforward and feedback connections among the neurons of neighbouring layers, and crosstalk and recurrency within the hidden layers. A basic structure of the MRNN with feedforward and feedback connections is shown in Fig. 1, where the network has l inputs and m outputs.

Fig. 1  Multilayered recurrent neural network (MRNN)

Let M be the total number of hidden layers of the MRNN, let the ith neuron in the sth hidden layer be denoted by neuron (s, i), let N_s be the total number of neurons in the sth hidden layer, u_i be the ith input of the MRNN, x_{s,i}(k) be the state of neuron (s, i), and y_i be the ith output of the MRNN. Further, let ω^s_{i,j} be the intralayer link-weight coefficient from neuron (s, j) to neuron (s, i), w^{s-1}_{i,j} be the feedforward link-weight coefficient from neuron (s - 1, j) to neuron (s, i), v^{s+1}_{i,j} be the feedback link-weight coefficient from neuron (s + 1, j) to neuron (s, i), and θ^s_i be the threshold of neuron (s, i). Mathematically, the operation of neuron (s, i) is defined by the following dynamic equations. For the first hidden layer:

x_{1,i}(k + 1) = σ[Σ_{j=1}^{N_1} ω^1_{i,j} x_{1,j}(k) + Σ_{j=1}^{N_2} v^2_{i,j} x_{2,j}(k) + Σ_{j=1}^{l} w^1_{i,j} u_j(k) + θ^1_i],  i = 1, 2, ..., N_1   (1)

For the sth hidden layer:

x_{s,i}(k + 1) = σ[Σ_{j=1}^{N_s} ω^s_{i,j} x_{s,j}(k) + Σ_{j=1}^{N_{s-1}} w^{s-1}_{i,j} x_{s-1,j}(k) + Σ_{j=1}^{N_{s+1}} v^{s+1}_{i,j} x_{s+1,j}(k) + θ^s_i],  s = 2, 3, ..., M;  i = 1, 2, ..., N_s   (2)

Note that there are no feedback actions from the output layer into the Mth hidden layer; that is, v^{M+1}_{i,j} = 0. The terms on the right-hand side of eqn. 2 represent the intralayer connections, the feedforward from the lower layer, and the feedback from the upper layer, respectively. The output equations of the MRNN are

y_i(k) = Σ_{j=1}^{N_M} w^{M+1}_{i,j} x_{M,j}(k),  i = 1, 2, ..., m   (3)

The neural activation function σ(·) may be chosen as a continuous and differentiable nonlinear sigmoidal function satisfying the following conditions: (i) σ(x) → ±1 as x → ±∞; (ii) σ(x) is bounded, with upper bound 1 and lower bound -1; (iii) σ(x) = 0 at a unique point x = 0; (iv) σ'(x) > 0 and σ'(x) → 0 as x → ±∞; (v) σ'(x) has a global maximum value c ≤ 1. Typical examples of such a function σ(·) are

σ(x) = (e^x - e^{-x})/(e^x + e^{-x}) = tanh(x),  σ(x) = (1 - e^{-x})/(1 + e^{-x}),  σ(x) = (2/π) tan^{-1}((π/2)x),  σ(x) = x²/(1 + x²) sign(x)

where sign(·) is the sign function; all the above activation functions are bounded, monotonic, nondecreasing functions. If the activation function σ(·) is a symmetric ramp function, the MRNN becomes a special type of brain-state-in-a-box (BSB) model with a nonsymmetric weight matrix. Since -1 < σ(·) < 1, the state vector x(k) of the system (eqns. 1-3) remains in the 'box' H^n = [-1, 1]^n, a closed n-dimensional hypercube, and the output y(k) is uniformly bounded for bounded input u(k).

For adaptive learning control purposes, the number of inputs u_i of the MRNN is assumed to equal the number of outputs y_i. Furthermore, in order to obtain the input-output relationship of the MRNN shown in Fig. 1, let x_i = [x_{i,1}, x_{i,2}, ..., x_{i,N_i}]^T be the state of the neurons in the ith hidden layer. The dynamic neural system (eqns. 1-3) can then be represented by the vector difference equations

x_1(k + 1) = σ[W^1 x_1(k) + V^2 x_2(k) + W^0 u(k) + θ^1] ≡ σ_1[x_1(k), x_2(k), u(k)]
x_i(k + 1) = σ[W^{i-1} x_{i-1}(k) + W^i x_i(k) + V^{i+1} x_{i+1}(k) + θ^i] ≡ σ_i[x_{i-1}(k), x_i(k), x_{i+1}(k)],  i = 2, 3, ..., M - 1
x_M(k + 1) = σ[W^{M-1} x_{M-1}(k) + W^M x_M(k) + θ^M] ≡ σ_M[x_{M-1}(k), x_M(k)]
y(k) = W^{M+1} x_M(k)   (4)

Then, the relationship between the state x(k), the input u(k) and the output y(k) may be derived as follows:

y(k + 1) = W^{M+1} σ_M[x_{M-1}(k), x_M(k)] ≡ T^1[x_{M-1}(k), x_M(k)]
y(k + 2) = T^1[x_{M-1}(k + 1), x_M(k + 1)] ≡ T^2[x_{M-2}(k), x_{M-1}(k), x_M(k)]
...
y(k + M) = T^{M-1}[x_1(k + 1), ..., x_M(k + 1)] ≡ T^M[x_1(k), ..., x_M(k), u(k)]   (5)

Therefore, for the MRNN with m inputs, m outputs and M hidden layers, the relative degree [22-24] of the dynamic system (eqns. 1-3) at a point (x^0, u^0) is r_j = M, j = 1, 2, ..., m.
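The layered dynamics of eqns. 1-4 can be sketched directly in code. The following Python fragment is a minimal illustration only, for M = 2 hidden layers; the parameter names (O1, V2, W1, O2, W2, Wout, th1, th2) and all numerical values are invented for the sketch and are not taken from the paper.

```python
import math

def tanh_vec(v):
    # apply the sigmoidal activation sigma(x) = tanh(x) elementwise
    return [math.tanh(x) for x in v]

def matvec(W, v):
    # plain matrix-vector product on nested lists
    return [sum(wij * vj for wij, vj in zip(row, v)) for row in W]

def vadd(*vs):
    # elementwise sum of equally sized vectors
    return [sum(t) for t in zip(*vs)]

def mrnn_step(x1, x2, u, p):
    """One time step of a two-hidden-layer MRNN (M = 2) in the spirit of
    eqns. 1-4: layer 1 receives its own state (intralayer weights O1),
    feedback from layer 2 (V2) and the input (W1); layer 2 receives its
    own state (O2) and feedforward from layer 1 (W2); no feedback enters
    the last hidden layer, and the output layer is linear (eqn. 3)."""
    x1_next = tanh_vec(vadd(matvec(p["O1"], x1), matvec(p["V2"], x2),
                            matvec(p["W1"], u), p["th1"]))
    x2_next = tanh_vec(vadd(matvec(p["O2"], x2), matvec(p["W2"], x1),
                            p["th2"]))
    y = matvec(p["Wout"], x2_next)   # linear output layer
    return x1_next, x2_next, y
```

Because tanh takes values in (-1, 1), the hidden states produced by this step stay inside the box H^n = [-1, 1]^n for any bounded input, as noted above.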
For convenience, the state equation and the input-output equation of the MRNN are represented in the following vector form:

x(k + 1) = f[x(k), u(k), w]   (6)
and
y(k + M) = h[x(k), u(k), w]   (7)

where x is the state vector of the dynamic hidden neurons of the MRNN, and w is the weight vector, consisting of the input-layer weights, the hidden-layer weights, and the thresholds of the MRNN. Next, a new input variable, the so-called equivalent control variable v ∈ R^m, is defined as
v(k) = h[x(k), u(k), w]   (8)
or y(k + M ) = ~ ( k ) (9) Eqn. 9 obviously shows that the modified system is inputoutput linearised. Furthermore, if it is assumed that the m x m matrix ah(x, U, w)/& is nonsingular at (xo, U"), then, by the Implicit Function Theorem there exists a unique local solution of the nonlinear algebraic eqn. 8 as follows: u(k) = aCx(k), 4 k ) l (10) where du/& # 0 at (x0, U"), U' = g(xo, U"). The nonlinear algebraic eqn. 8 may be solved using the numerical methods at each instant. In order to avoid the numerical complexity for solving the nonlinear algebraic eqn. 8, the righthand side term of eqn. 8 is expended to first order term of the incremental term Au(k) = u(k) u(k I); that is, ~
In order to avoid the numerical complexity of solving the nonlinear algebraic eqn. 8 exactly at each instant, the right-hand side of eqn. 8 may be expanded to the first-order term in the increment Δu(k) = u(k) - u(k - 1); that is,

v(k) = h[x(k), u(k - 1), w] + (∂h/∂u)[x(k), u(k - 1), w] Δu(k)   (11)

Based on the nonsingularity of the matrix ∂h/∂u, an iterative algorithm is given as

u(k) = u(k - 1) + Δu(k)   (12)

where

Δu(k) = {(∂h/∂u)[x(k), u(k - 1), w]}^{-1} {v(k) - h[x(k), u(k - 1), w]}   (13)

3 New dynamic learning algorithm

In the supervised learning of the MRNN, the purpose of the weight learning is to estimate the weights such that the output vector y(k) of the MRNN tracks the desired output vector y^d(k) with an error which converges to zero as k → ∞. Hence, if the weights of an MRNN are regarded as the unknown parameters of a nonlinear input-output system, the weight-learning problem of the MRNN can be phrased as a parameter-identification problem for a dynamic nonlinear system. Indeed, a simple and natural extension of the backpropagation (BP) algorithm for the multilayered feedforward network is the dynamic backpropagation (DBP) algorithm for the MRNN; this learning approach was first studied by Williams and Zipser [6], and Narendra and Parthasarathy [9]. A new dynamic learning algorithm is now derived based on the relative degree concept. A dynamic learning process may be formulated as

w(k + 1) = w(k) - η ∂E(k)/∂w(k)   (14)

where w(k) is an estimate of the weight vector at time k, and η is a step-size parameter which affects the rate of convergence of the weights during learning. The output of the network at the current instant k may be obtained using only the state and input of the network at the past time k - M. The error index E(k) is then defined as

E(k) = (1/2) Σ_{i=1}^{m} e_i²(k)   (15)

where e_i(k) = y_i^d(k) - y_i(k) is the learning error between the desired and network outputs at time k. The partial derivatives of the error index E(k) with respect to the weights of the network are obtained using the dynamic neural model as follows:

∂E(k)/∂w_l = -Σ_{i=1}^{m} e_i(k) {Σ_{j=1}^{M-1} [∂y_i(k)/∂x(k - M - 1 + j)] [∂x(k - M - 1 + j)/∂w_l] + ∂y_i(k)/∂w_l}   (16)

Let

z_{l,i}(k - M - 1 + j) = ∂x_i(k - M - 1 + j)/∂w_l,  l = 1, 2, ..., n̄   (17)

where n̄ is the total number of weights which need to be adapted. Then, eqn. 16 may be rewritten as

∂E(k)/∂w_l = -Σ_{i=1}^{m} e_i(k) {Σ_{j=1}^{M-1} [∂y_i(k)/∂x(k - M - 1 + j)] Z_l(k - M - 1 + j) + ∂y_i(k)/∂w_l}   (18)

where Z_l(k - M - 1 + j) = [z_{l,1}(k - M - 1 + j), ..., z_{l,n}(k - M - 1 + j)]^T, l = 1, 2, ..., n̄, is determined by the following time-varying linear system:

Z_l(k - M - 1 + j) = (∂f/∂x)[x(k - M - 2 + j), u(k - M - 2 + j), w(k)] Z_l(k - M - 2 + j) + (∂f/∂w_l)[x(k - M - 2 + j), u(k - M - 2 + j), w(k)]   (19)

The partial derivatives in eqns. 18 and 19 may be derived from eqns. 1-3. It is to be noted that the first term in eqn. 18 results from the dynamic behaviour of the MRNN, while the second term is a static partial derivative similar to that in the conventional static backpropagation algorithm. In this paper, a parallel learning structure, as shown in Fig. 2, is used in order to avoid the problem of full-state feedback.
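The sensitivity recursion of eqn. 19 can be checked numerically against a finite difference. The sketch below propagates z(k) = ∂x(k)/∂w_l for a hypothetical one-neuron recurrent model x(k+1) = tanh(w_0 x(k) + w_1 u(k)); the model and all values are assumptions for illustration, not the MRNN itself.

```python
import math

def dbp_sensitivity(x0, us, w, l):
    """Propagate the sensitivity z(k) = dx(k)/dw_l along a trajectory by
    the time-varying linear recursion of eqn. 19,
        z(k+1) = (df/dx) z(k) + df/dw_l,
    for the toy one-neuron model x(k+1) = tanh(w[0] x + w[1] u)."""
    x, z = x0, 0.0
    for u in us:
        s = w[0] * x + w[1] * u
        d = 1.0 - math.tanh(s) ** 2          # sigma'(s) for sigma = tanh
        dfdx = d * w[0]                      # df/dx at the current point
        dfdw = d * (x if l == 0 else u)      # df/dw_l at the current point
        x, z = math.tanh(s), dfdx * z + dfdw
    return x, z
```

The returned z matches the derivative of the final state with respect to the chosen weight, which is exactly the quantity the DBP gradient of eqn. 18 consumes.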
4 Learning control scheme using MRNN
4.1 Dynamic learning control of unknown nonlinear plants
Recently, several independent studies have found that recurrent neural networks using the dynamic backpropagation algorithm can approximate a wide range of input-output relationships of nonlinear systems to any desired degree of accuracy [6-8]. In this section, the input-output linearised control technique, combined with multilayered recurrent neural networks (MRNNs), is used to develop nonlinear adaptive control systems for unknown MIMO discrete-time nonlinear systems with online identification and control abilities.
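The simultaneous identification-and-control idea of this section can be sketched in miniature. The fragment below substitutes a scalar linear plant and a two-parameter gradient-trained model for the MRNN; the plant, the model, and all numbers are invented for the sketch, standing in for the scheme of Figs. 2 and 3 rather than reproducing it.

```python
def run_loop(steps=300, eta=0.1):
    """Toy simultaneous identification and control: the 'plant'
    y(k+1) = 0.5 y(k) + 0.4 u(k) is unknown to the controller; a
    two-parameter model y(k+1) ~ a y(k) + b u(k) is identified online by
    gradient descent on the identification error e*(k), and the control
    u(k) is computed from the current model so that the model output
    matches the reference (certainty equivalence)."""
    a, b = 0.3, 0.6                  # initial model estimates (arbitrary)
    y = 0.0
    errors = []
    for k in range(steps):
        y_ref = 0.5 if (k // 20) % 2 == 0 else -0.5   # square-wave reference
        u = (y_ref - a * y) / b      # model-based control law
        y_next = 0.5 * y + 0.4 * u   # response of the unknown plant
        e_model = y_next - (a * y + b * u)            # identification error e*(k)
        a += eta * e_model * y       # online weight update
        b += eta * e_model * u
        errors.append(abs(y_next - y_ref))            # tracking error e(k)
        y = y_next
    return errors
```

As the prediction error is driven to zero by the online updates, the tracking error shrinks with it, which is the mechanism formalised for the MRNN case in eqns. 23-25 below.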
Fig. 2  Dynamic neural modelling of unknown plant
Consider a general class of unknown multi-input and multi-output (MIMO) discrete-time nonlinear systems of the form

x_p(k + 1) = f_p[x_p(k), u(k)]
y_p(k) = h_p[x_p(k)]   (20)

where x_p ∈ R^n is an n-dimensional state vector, u ∈ R^m is an m-dimensional control vector, and y_p ∈ R^m is an m-dimensional output vector. The mapping f_p and the function h_p are assumed to be unknown and analytic. The problem of producing an output that, irrespective of the initial state of the unknown nonlinear system, converges asymptotically to a given reference output y^m(k) will now be investigated. The reference output is not just a fixed function of time, but the output of a reference model, which in turn is subject to some input r(k) and described by equations of the form

x^m(k + 1) = A x^m(k) + B r(k)
y^m(k) = C x^m(k)   (21)

where x^m ∈ R^{n_m}, r ∈ R^m, and y^m ∈ R^m are the state, input, and output of the reference model, respectively, A is an n_m × n_m Hurwitz matrix, and B and C^T are n_m × m matrices. For the unknown nonlinear system (eqn. 20), the design procedure of the learning control system is divided into two steps. First, let the MRNN with weight vector w be used to approximate the nonlinear plant (eqn. 20); eqn. 20 may then be modelled by the neural network as follows:

x(k + 1) = σ[W_H(k) x(k) + W_I(k) u(k) + w_T(k)]
y(k) = W_O(k) x(k)   (22)

where x is the state of the MRNN, u is the input of the MRNN, and y is the output of the MRNN. The matrices W_H(k), W_I(k) and W_O(k) are, respectively, the estimates of the weight matrices of the hidden, input and output layers, and w_T(k) is the estimate of the threshold vector; their expressions follow directly from eqns. 1-3. Next, assume that a nonlinear control law u(k) is designed, based on the equivalent control concept and the dynamic neural system (eqn. 22), such that the output y(k) of the dynamic neural system (eqn. 22) tracks asymptotically the output y^m(k) of the reference model; that is,

lim_{k→∞} [y^m(k) - y(k)] = 0   (23)

On the other hand, for the learning control scheme shown in Fig. 3, the error used to train the MRNN is defined as

e*(k) = y_p(k) - y(k)   (24)

where y(k) and y_p(k) are the outputs of the neural network and the plant, respectively. As shown in Fig. 3, note that the error between the outputs of the model and the plant satisfies

e(k) = |y_p(k) - y^m(k)| ≤ |y(k) - y^m(k)| + |y(k) - y_p(k)|   (25)

The output y(k) of the MRNN tracks asymptotically the output y^m(k) of the model by means of the learning control law u(k). Hence, if the output of the MRNN is trained to approximate the output of the plant with lim_{k→∞} e*(k) = 0, the output of the plant is adaptively controlled to track asymptotically the output y^m(k) of the model; that is, lim_{k→∞} e(k) = 0. In fact, the MRNN is used to identify the nonlinear plant online, while the control law is constructed from the identification results of the neural network.

Fig. 3  Dynamic learning control of unknown plant using MRNN

4.2 Design of the equivalent control law

The purpose of designing the equivalent control is to find a new feedback control v(k) such that the output y(k) of the MRNN asymptotically converges to the corresponding output y^m(k) produced by the model under the effect of r(k); the actual feedback control law u(k) is then obtained from the nonlinear algebraic eqn. 8, where the weight w is replaced by its estimate w(k). Note that

y_j^m(k + i) = c_j A^i x^m(k) + c_j A^{i-1} B r(k) + ... + c_j A B r(k + i - 2) + c_j B r(k + i - 1),  i = 1, 2, ..., r_j - 1   (26)

where C^T = [c_1^T, ..., c_m^T]. Suppose that the reference model (eqn. 21) has relative degree {r_1, ..., r_m}, and that every r_j is equal to or larger than the corresponding relative degree M of the neural system (eqn. 22). Then

c_j B = c_j A B = ... = c_j A^{M-2} B = 0   (27)

and the relationship between the input and output of the reference model (eqn. 21) may then be represented as

y_j^m(k + i) = c_j A^i x^m(k),  i = 0, 1, ..., M - 1   (28)

and

y_j^m(k + r_j) = c_j A^{r_j} x^m(k) + c_j A^{r_j-1} B r(k),  j = 1, 2, ..., m   (29)

Let the equivalent input in eqn. 8 be designed as

v_j(k) = y_j^m(k + M) + Σ_{h=0}^{M-1} β_{j,h}[y_j^m(k + h) - y_j(k + h)]
       = c_j A^M x^m(k) + c_j A^{M-1} B r(k) + Σ_{h=0}^{M-1} β_{j,h}[c_j A^h x^m(k) - T_j^h(x)],  j = 1, 2, ..., m   (30)

The output tracking error equation is then derived by substituting eqn. 30 into eqn. 9 as follows:

e_j(k + r_j) + β_{j,r_j-1} e_j(k + r_j - 1) + ... + β_{j,0} e_j(k) = 0,  j = 1, 2, ..., m   (31)

where e_j(k) = y_j^m(k) - y_j(k) is the tracking error of the jth output. If the coefficients β_{j,0}, β_{j,1}, ..., β_{j,r_j-1}, j = 1, 2, ..., m, are chosen such that the z-polynomial

z^{r_j} + β_{j,r_j-1} z^{r_j-1} + ... + β_{j,0} = 0,  j = 1, 2, ..., m   (32)

has all its zeros inside the unit circle in the complex z-plane, the output y(k) of the system will track asymptotically the desired output y^m(k); that is,

lim_{k→∞} [y(k) - y^m(k)] = 0   (33)

Since the equivalent input v(k) depends explicitly, at each time k, on the state x(k) of the system, on the input r(k) of the model, and on the state x^m(k) of the model, which in turn obeys the reference model (eqn. 21), v(k) can be regarded as the 'output' of a dynamic system of the form

x^m(k + 1) = A x^m(k) + B r(k)
v(k) = q[x^m(k), x(k), r(k)]   (34)

where the internal state x^m(k) is driven by the 'inputs' r(k) and x(k). Two simulation examples are provided in the following section.

5 Simulation studies

Example 1 (SISO nonlinear system): In this example, the plant is a single-input and single-output system with unknown dynamics, described by the nonlinear state difference equations

x_1(k + 1) = x_2(k)
x_2(k + 1) = x_3(k)
x_3(k + 1) = g[x_1(k), x_2(k), x_3(k), x_4(k), u(k)]
x_4(k + 1) = u(k)
y_p(k) = x_3(k)   (35)

It is easy to show that the relative degree is 1 if ∂g/∂u ≠ 0. For numerical simulation purposes, let the unknown nonlinear function g(·) in eqn. 35 have the form

g[x_1, x_2, x_3, x_4, u] = [x_1 x_2 x_3 x_4 (x_3 - 1) + u]/(1 + x_1² + x_2² + x_3²)   (36)

The reference model was taken as the stable linear system

x_{m,1}(k + 1) = x_{m,2}(k)
x_{m,2}(k + 1) = x_{m,3}(k)
x_{m,3}(k + 1) = 0.12 x_{m,3}(k) + 0.22 x_{m,2}(k) - 0.17 x_{m,1}(k) + 0.33 r(k)
y_m(k) = x_{m,3}(k)   (37)

where r(k) is the uniformly bounded reference input. Since 1 + x_1² + x_2² + x_3² ≠ 0 for any x_1 ∈ R, x_2 ∈ R and x_3 ∈ R, the model reference control problem may be solved by the approaches obtained in the previous section. A three-layered recurrent neural network (MRNN) with a single hidden layer, a single input u(k), and a single output y(k) was used to approximate the unknown nonlinear system (eqn. 35) using the online dynamic learning algorithm. For this purpose, the nonlinear activation function was chosen as σ(x) = tanh(x), and the number of neurons in the hidden layer was set to 10. The nonlinear algebraic equation (eqn. 8) was solved using the Newton method at each instant k. The reference input r(k) was a square wave with a period of 100 steps and an amplitude of ±0.5. The initial values of the weights were chosen randomly in the interval [-1, 1], and the learning rate η was selected as 0.005. Fig. 4 shows the histories of the outputs of the reference model and the controlled plant, the output tracking error e(k), and the control input u(k), respectively. The simulation results in Fig. 4 show that model reference control of the unknown system (eqn. 35) was achieved using the DBP learning algorithm. Although the response of the controlled plant oscillated around the reference output y_m(k) for the first few steps, the tracking error converged asymptotically to zero after the learning period. In fact, a suitable choice of the initial values of the MRNN weights is important, even though they can be determined randomly within a fixed range; as mentioned above, the unit interval was used in this simulation study.
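The stability argument behind eqns. 31-33 is easy to exercise numerically: choose the β coefficients so that the z-polynomial of eqn. 32 has its zeros inside the unit circle, and the tracking-error recursion decays to zero. A short sketch (the roots 0.4 and 0.3 are an arbitrary choice for illustration, not values from the paper):

```python
def simulate_error(beta, e_init, steps):
    """Iterate the tracking-error recursion of eqn. 31,
        e(k+r) = -(beta[r-1] e(k+r-1) + ... + beta[0] e(k)),
    whose decay requires the zeros of the z-polynomial of eqn. 32,
    z^r + beta[r-1] z^(r-1) + ... + beta[0], to lie inside the unit circle."""
    e = list(e_init)
    for _ in range(steps):
        e.append(-sum(b * ei for b, ei in zip(beta, e[-len(beta):])))
    return e

# beta chosen so eqn. 32 reads (z - 0.4)(z - 0.3) = z^2 - 0.7 z + 0.12
errs = simulate_error([0.12, -0.7], [1.0, 0.8], 60)
```

With both roots at modulus well below one, the error sequence decays geometrically like 0.4^k, regardless of the initial errors.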
Example 2 (MIMO nonlinear system): The unknown nonlinear system in this case was a nonlinear multivariable plant with two inputs and two outputs. The relative degree of the system is {1, 1} if ∂g_1/∂u ≠ 0 and ∂g_2/∂u ≠ 0 are satisfied. The specific plant used in the simulation study was of the form of eqn. 38, where Δg_1[x(k)] was a perturbation term of the plant, included to verify the robustness of the neural-network-based control system in the presence of structural or parametric variations of the plant. The varying term was set to

Δg_1[x(k)] = 0.02 x_1(k) x_2(k)/(1 + x_3²(k)),  k > 100   (41)
The stable reference model was given by the following MIMO linear system

x^m(k + 1) = A x^m(k) + B r(k)
y^m(k) = C x^m(k)   (42)

where x^m(k) = [x_1^m(k), x_2^m(k)]^T, r(k) = [r_1(k), r_2(k)]^T and y^m(k) = [y_1^m(k), y_2^m(k)]^T are the state, input and output vectors of the reference model, and A, B and C are constant matrices. Note that ∂g_1/∂u ≠ 0 and ∂g_2/∂u ≠ 0, since 1 + x_2² + x_3² ≠ 0 for any x_2 ∈ R and x_3 ∈ R. The learning controller for the unknown system (eqn. 38) may therefore be designed using the techniques proposed in the previous section. A three-layered network with two inputs and two outputs was used to identify the unknown nonlinear system (eqn. 38) online by the DBP learning algorithm. The parameters of the network, the activation function, and the initial values of the learning control process were chosen to be similar to those in Example 1. Fig. 5 shows the outputs of both the reference model and the plant, and the tracking errors, under the learning control law for the reference inputs r_1(k) = sin(2πk/50) and r_2(k) = cos(2πk/50). The simulation results indicate that the outputs of the unknown MIMO plant tracked the outputs of the reference model closely under the proposed learning control scheme, even though the structure and parameters of the plant varied discontinuously during the control process. The oscillation around k = 100 was due to the variation of the plant structure described by eqn. 41. These simulation studies demonstrate that the MRNN-based learning control system has good robustness for time-varying unknown plants.
Fig. 4  Simulation results of the asymptotic model-matching control problem: a Reference output y_m(k) and controlled plant output y(k); b Output tracking error e(k); c Nonlinear learning control u(k)

Fig. 5  Simulation results for the MIMO nonlinear system with Δg_1(·) = 0.02 x_1 x_2/(1 + x_3²), k > 100: a Reference output y_1^m(k) and controlled plant output y_1(k); b Reference output y_2^m(k) and controlled plant output y_2(k); c Output tracking error e_1(k) = y_1(k) - y_1^m(k); d Output tracking error e_2(k) = y_2(k) - y_2^m(k)

6 Conclusions

A multilayered recurrent neural network (MRNN) based adaptive control scheme for unknown MIMO discrete-time nonlinear systems has been proposed in this paper. The approach uses the input-output linearisation concept for nonlinear systems together with a dynamic weight-learning process. As in all adaptive control techniques, the MRNN-based learning control scheme combines identification and control, performed by an online adaptive weight-updating process. The ability of the MRNN with the dynamic learning algorithm to model arbitrary dynamic nonlinear systems was used to approximate the unknown input-output relationship of a nonlinear system, and the control strategy was constructed from the approximation model. Because the unknown nonlinear systems were modelled online and controlled by dynamic neural networks, the control mechanisms were less sensitive to variations of the system parameters and structure, as demonstrated by the simulation results. A comparison of the dynamic neural-network-based controllers proposed in this paper with the feedforward-network-based controllers [9-16] shows that the former need less a priori knowledge about the unknown plant, and their structure is also simpler. Another advantage of the proposed control algorithm is that only the output signal of the plant is fed back to the controller through the neural networks at each instant, because the parallel learning architecture of the MRNN is utilised during the modelling process. In other words, the dynamic neural-network-based control system is a type of output-feedback adaptive control scheme. Hence, the difficulties of implementing conventional full-state feedback systems are avoided in such neural control systems.

7 References
1 RUMELHART, D.E., and McCLELLAND, J.L.: 'Learning internal representations by error propagation', in 'Parallel distributed processing: explorations in the microstructure of cognition', Vol. 1, Foundations (MIT Press, 1986)
2 HECHT-NIELSEN, R.: 'Theory of the backpropagation neural network', Proc. Internat. Joint Conf. on Neural Networks, June 1989, pp. 1593-1605
3 SIMPSON, P.K.: 'Artificial neural systems' (Pergamon Press, 1990)
4 HORNIK, K., STINCHCOMBE, M., and WHITE, H.: 'Multilayer feedforward networks are universal approximators', Neural Networks, 1989, 2, pp. 359-366
5 WIDROW, B., and LEHR, M.A.: '30 years of adaptive neural networks: perceptron, madaline, and backpropagation', Proc. IEEE, 1990, 78, (9), pp. 1415-1441
6 WILLIAMS, R., and ZIPSER, D.: 'A learning algorithm for continually running fully recurrent neural networks', Neural Computation, 1989, 1, pp. 270-280
7 SUDHARSANAN, S.I., and SUNDARESHAN, M.K.: 'Training of a three-layer dynamical recurrent neural network for nonlinear input-output mapping', Proc. IJCNN, 1991, II, pp. 111-115
8 PARLOS, A., ATIYA, A., and CHONG, K.: 'Recurrent multilayer perceptron for nonlinear system identification', Proc. IJCNN, 1991, II, pp. 537-540
9 NARENDRA, K.S., and PARTHASARATHY, K.: 'Identification and control of dynamic systems using neural networks', IEEE Trans. Neural Networks, 1990, 1, (1), pp. 4-27
10 CHEN, S., COWAN, C.F.N., BILLINGS, S.A., and GRANT, P.M.: 'Parallel recursive prediction error algorithm for training layered neural networks', Int. J. Control, 1990, 51, (6), pp. 1215-1228
11 BILLINGS, S.A., JAMALUDDIN, H.B., and CHEN, S.: 'Properties of neural networks with applications to modelling nonlinear dynamical systems', Int. J. Control, 1992, 55, (1), pp. 193-224
12 PSALTIS, D., SIDERIS, A., and YAMAMURA, A.A.: 'A multilayered neural network controller', IEEE Control Systems Magazine, 1988, pp. 17-21
13 CHEN, F.C., and KHALIL, H.K.: 'Adaptive control of nonlinear systems using neural networks: a deadzone approach', Proc. 1991 American Control Conf., 1991, pp. 667-672
14 HUNT, K.J., and SBARBARO, D.: 'Neural networks for nonlinear internal model control', IEE Proc. D, 1991, 138, (5), pp. 431-438
15 SANNER, R.M., and SLOTINE, J.J.E.: 'Gaussian networks for direct adaptive control', IEEE Trans. Neural Networks, 1992, 3, (6), pp. 837-863
16 SANNER, R.M., and SLOTINE, J.J.E.: 'Stable adaptive control and recursive identification using radial Gaussian networks', Proc. of the 30th CDC, 1991, 2, pp. 2116-2123
17 SU, M., and McAVOY, D.: 'Identification of chemical processes using recurrent networks', Proc. of the 1991 American Control Conf., 1991, 3, pp. 2314-2319
18 TZIRKEL-HANCOCK, E., and FALLSIDE, F.: 'Stable control of nonlinear systems using neural networks', Internat. J. Robust and Nonlinear Control, 1992, 2, (1), pp. 63-86
19 ZBIKOWSKI, R.: 'State-space approach to continuous recurrent neural networks', Proc. 7th IEEE Symposium on Intelligent Control, 1992, pp. 152-157
20 JIN, L., NIKIFORUK, P.N., and GUPTA, M.M.: 'Adaptive tracking of SISO nonlinear systems using multilayered neural networks', Proc. 1992 American Control Conference, Chicago, 1992, pp. 56-60
21 JIN, L., NIKIFORUK, P.N., and GUPTA, M.M.: 'Direct adaptive output tracking control using multilayered neural networks', IEE Proc. D, 1993, 140, (6), pp. 393-398
22 SIRA-RAMIREZ, H.: 'Nonlinear discrete variable structure systems in quasi-sliding mode', Int. J. Control, 1991, 54, (5), pp. 1171-1187
23 ISIDORI, A.: 'Nonlinear control systems' (Springer-Verlag, New York, 1989)
24 NIJMEIJER, H., and VAN DER SCHAFT, A.J.: 'Nonlinear dynamical control systems' (Springer-Verlag, New York, 1990)