Soft Sensor for the Adaptive Catalyst Monitoring of a Multi-tube Reactor
Final Report for NiSIS Competition 2006

Martin Macas
Czech Technical University in Prague, Faculty of Electrical Engineering, Department of Cybernetics
Technicka 2, Prague 6, 166 27, Czech Republic
Phone: +420 224 357 666, Fax: +420 224 923 677, email: lhotska@fel.cvut.cz

ABSTRACT: This report describes the solution of the NiSIS Competition 2006 that was awarded the Best Nature-Inspired Concept Award.

KEYWORDS: Particle swarm optimization, Elman neural network, prediction.

INTRODUCTION
The objective of the competition is to create an adaptive mathematical model describing the relationship between 14 input variables and one output variable, all of them varying with time. Such a model probably has to be adapted to process state changes resulting from non-measurable influences. After adaptation to the current working point of the process, the model should be able to predict the output variable over a certain time horizon, provided the future behavior of the input variables is known. For a detailed description see [1].

INPUT DATA PREPROCESSING
The original input data consist of 14 time series of measured features. The third feature, representing the measured concentration of the combustible component in the combustible gas feed (in mass fraction), was recognized as carrying no information and was removed from the data set; thus only thirteen features were used for further processing. First, the outlying values were replaced using the following method. The 85th and the 25th percentile were computed for each feature, defining the upper and the lower bound, respectively. Next, all values higher than the 85th percentile were replaced by that percentile value. The same operation was done for the extremely low values using the 25th percentile. In the next step of preprocessing, all features were normalized to zero mean and unit standard deviation. Further, principal component analysis was applied in order to obtain less correlated features. Only the first five principal components were used for subsequent processing.
The input dimension was thus reduced from 14 to 5. The next phase of preprocessing consisted of the application of smoothing filters: a moving-average filter over 100 points was applied to each feature time series. Finally, for each time point t in each i-th time series X_i(t), the following two special features were extracted:

F_{1,i}(t) = \frac{1}{100} \sum_{\tau=t-100}^{t} X_i(\tau),    (1)

F_{2,i}(t) = \frac{1}{100} \sum_{\tau=t-100}^{t} \left( X_i(\tau) - F_{1,i}(t) \right)^2.    (2)

In this way, a data set with 10 features was obtained, which was further normalized into the range (-1; 1) and used for model design and training.
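The preprocessing chain described above can be sketched in NumPy as follows. This is a minimal illustration, not the original implementation: the function name, parameter names, and the edge handling at the start of the 100-sample window are assumptions.

```python
import numpy as np

def preprocess(X, upper_pct=85, lower_pct=25, n_components=5, win=100):
    """Illustrative pipeline: percentile clipping, z-score normalization,
    PCA to 5 components, moving-average smoothing, and extraction of the
    window mean (F1) and window variance (F2) features."""
    X = X.astype(float)                              # X: (n_samples, n_features)
    # 1) replace outlying values by the 85th / 25th percentile bounds
    hi = np.percentile(X, upper_pct, axis=0)
    lo = np.percentile(X, lower_pct, axis=0)
    X = np.clip(X, lo, hi)
    # 2) normalize to zero mean and unit standard deviation
    X = (X - X.mean(axis=0)) / X.std(axis=0)
    # 3) PCA: keep only the first five principal components
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    P = X @ Vt[:n_components].T
    # 4) moving-average smoothing over 100 points
    kernel = np.ones(win) / win
    P = np.column_stack([np.convolve(p, kernel, mode="same") for p in P.T])
    # 5) window features: mean and variance over the last 100 samples
    F1 = np.empty_like(P)
    F2 = np.empty_like(P)
    for t in range(len(P)):
        w = P[max(0, t - win):t + 1]                 # shorter window at the start
        F1[t] = w.mean(axis=0)
        F2[t] = ((w - F1[t]) ** 2).mean(axis=0)
    F = np.hstack([F1, F2])                          # 2 x 5 = 10 features
    # 6) rescale each feature into the range (-1, 1)
    return 2 * (F - F.min(axis=0)) / (F.max(axis=0) - F.min(axis=0)) - 1
```

Applied to the thirteen retained raw features, this yields the 10-feature data set used for model training.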
POSTPROCESSING
The output data consisted of one time series representing the catalyst activity. The only preprocessing applied here was a very slight smoothing (moving average over 10 points) and normalization into the range (-1; 1).

MODEL STRUCTURE
The model used for prediction is based on a neural network. The system to be modeled has many unknown delays between inputs and outputs; therefore, the model has to be able to handle the time context of the data. That is why a recurrent neural network was considered to be the suitable one. A special case of recurrent network was used, the Elman network. It has two layers with feedback from the first-layer output to the first-layer input. This feedback enables the network to add time-context information into the modeling process. The schema of the Elman network is depicted in Figure 2. The model consists of a hidden layer with sigmoidal neurons, an output layer with linear neurons (just one was used here), and a context layer, which is represented by a recurrent connection with a delay. The delay in this connection stores values from the previous time step, which can be used in the current time step. The connections between the output of the hidden layer and the context units are fixed (the weights are set to 1). All the remaining connections are trainable. For a detailed description of the Elman network see [2].

Figure 2: The schema of the Elman recurrent network (input, hidden, context, and linear output layers; the target output and the error drive the training algorithm). The trainable connections are represented by dashed lines.

TRAINING METHOD
The Elman network could be trained by any optimization algorithm. The most common methods are gradient-based back-propagation algorithms. These have two main disadvantages. First, gradient algorithms often get stuck in local optima. The second problem is the computation of the gradient.
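The forward pass of the Elman architecture described above can be sketched as follows. This is a minimal sketch under stated assumptions: the class name, the initialization range, and the weight shapes are illustrative; only the fixed weight-1 context copy and the sigmoidal-hidden/linear-output structure come from the report.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class ElmanNet:
    """Minimal Elman network: one sigmoidal hidden layer fed by the input
    and by a context layer (a delayed copy of the hidden state), plus one
    linear output neuron. The context copy is fixed (weights equal to 1);
    all other weights and biases are the trainable parameters."""

    def __init__(self, n_in, n_hidden, n_out=1, rng=None):
        rng = rng or np.random.default_rng()
        self.Wx = rng.uniform(-0.5, 0.5, (n_hidden, n_in))      # input -> hidden
        self.Wc = rng.uniform(-0.5, 0.5, (n_hidden, n_hidden))  # context -> hidden
        self.bh = rng.uniform(-0.5, 0.5, n_hidden)
        self.Wo = rng.uniform(-0.5, 0.5, (n_out, n_hidden))     # hidden -> output
        self.bo = rng.uniform(-0.5, 0.5, n_out)

    def run(self, X):
        """Process a sequence X of shape (T, n_in); returns (T, n_out)."""
        context = np.zeros(self.Wc.shape[0])   # delayed hidden state
        out = []
        for x in X:
            hidden = sigmoid(self.Wx @ x + self.Wc @ context + self.bh)
            out.append(self.Wo @ hidden + self.bo)
            context = hidden                   # fixed weight-1 copy with delay
        return np.array(out)
```

Because the delayed connection makes the output depend on the whole input history, the exact error gradient through time is what gradient-based training can only approximate, which motivates the gradient-free training described next.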
The contributions of the weights and biases to the errors via the delayed recurrent connections cannot be computed exactly; therefore these contributions are ignored and only an approximation of the error gradient is used. Both problems are addressed in the solution described here by using a global-search, non-gradient-based method: the Particle Swarm Optimization (PSO) method was used for training the Elman network.
The PSO method is an optimization method developed for finding global optima of a nonlinear function [3]. It has been inspired by the social behavior of birds and fish, and it applies the approach of problem solving in groups. Each solution consists of a set of parameters and represents a point in a multidimensional space. A solution is called a "particle" and the group of particles (the population) is called a "swarm". Each particle i is represented by a D-dimensional position vector x_i(t) and has a corresponding instantaneous velocity vector v_i(t). Furthermore, it remembers its individual best value of the fitness function and the position p_i which resulted in that value. During each iteration t, the velocity update rule (3) is applied to each particle in the swarm. Here p_g is the best position found by the entire swarm and represents the social knowledge:

v_i(t) = w \, v_i(t-1) + \varphi_1 R_1 \left( p_i - x_i(t-1) \right) + \varphi_2 R_2 \left( p_g - x_i(t-1) \right),    (3)

The parameter w is called the inertia weight and decreases linearly from w_start to w_end over the iterations. The symbols R_1 and R_2 represent diagonal matrices with random diagonal elements drawn from a uniform distribution between 0 and 1. The parameters φ_1 and φ_2 are scalar constants that weight the influence of the particle's own experience and of the social knowledge. Next, the position update rule (4) is applied:

x_i(t) = x_i(t-1) + v_i(t).    (4)

If any component of v_i is less than -V_max or greater than +V_max, the corresponding value is replaced by -V_max or +V_max, respectively; V_max is the maximum-velocity parameter. The update formulas (3), (4) are applied during each iteration, and the values of p_i and p_g are updated simultaneously. The algorithm stops when the maximum number of iterations is reached.

Each particle corresponds to one particular set of network weights and biases. The topology of the network (the numbers of neurons in the layers) was fixed and set experimentally; therefore, each particle corresponds to one network. The evaluation of each particle (its fitness) was done using the network training error.
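The update rules (3) and (4), together with the velocity clamping and the linearly decreasing inertia weight, can be sketched as follows. The function name and the personal-best update order are illustrative assumptions; the parameter defaults match the values reported in the next section.

```python
import numpy as np

def pso(fitness, dim, n_particles=30, iters=2000,
        phi1=2.0, phi2=2.0, w_start=0.9, w_end=0.4,
        v_max=0.5, init_range=(-0.5, 0.5), rng=None):
    """Minimize `fitness` over R^dim with the PSO variant of eqs. (3)-(4):
    linearly decreasing inertia weight and velocity clamping at +/- v_max."""
    rng = rng or np.random.default_rng()
    x = rng.uniform(*init_range, (n_particles, dim))     # positions x_i
    v = np.zeros((n_particles, dim))                     # velocities v_i
    p = x.copy()                                         # personal bests p_i
    p_fit = np.array([fitness(xi) for xi in x])
    g = p[p_fit.argmin()].copy()                         # global best p_g
    for t in range(iters):
        w = w_start + (w_end - w_start) * t / (iters - 1)  # inertia schedule
        r1 = rng.uniform(0, 1, (n_particles, dim))       # diagonal of R_1
        r2 = rng.uniform(0, 1, (n_particles, dim))       # diagonal of R_2
        v = w * v + phi1 * r1 * (p - x) + phi2 * r2 * (g - x)   # eq. (3)
        v = np.clip(v, -v_max, v_max)                    # velocity clamping
        x = x + v                                        # eq. (4)
        fit = np.array([fitness(xi) for xi in x])
        better = fit < p_fit                             # update personal bests
        p[better], p_fit[better] = x[better], fit[better]
        g = p[p_fit.argmin()].copy()                     # update global best
    return g, p_fit.min()
```

For network training, each position vector would be unpacked into the weights and biases of one Elman network, and `fitness` would return that network's training error.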
The training of the model (network) for the prediction of STEP 1 was performed using random initialization of the swarm. The parameters of PSO were set as follows:

Parameter              Value
φ_1 and φ_2            2
V_max                  0.5
w_start, w_end         0.9, 0.4
Initialization range   (-0.5; 0.5) in each dimension
Swarm size             30 particles
Iterations             2000
Hidden units           10
Output units           1
Context units          10
Input units            10

The mean squared error between the actual (target) catalyst activity and the network response was used as the fitness. An important property of the predicted system is its non-stationarity. Therefore, the training data did not consist of the whole time series; only a window (here referred to as the training window) preceding the prediction point was considered. The length of the training window was set experimentally to 3300 samples (138 hours), as depicted in Figure 3.

ADAPTATION OF THE TRAINED MODEL
The prediction model has to be adapted to process state changes resulting from non-measurable influences. One possibility is to shift the training window in time to where the prediction is required and to train a new network. However, it is probable that the modeled system is not completely different at the new prediction point; its dynamics are very similar, and it is enough to only slightly adapt the trained model to the new working point. This problem is solved in terms of swarm initialization. In contrast to the basic PSO algorithm described above, during the adaptive process the swarm is not initialized totally randomly. It is split into two parts: the first part is reinitialized randomly and the second part is preserved. This method preserves some information from the old model and simultaneously increases the
diversity of the population and the possibility of finding new optima. The partly reinitialized swarm is then used in the common PSO described above. Figure 4 shows the time ranges of the training data. The first swarm was randomly initialized; the swarms of steps 2-4 were initialized partly. Figure 5 shows how the information is preserved during all the adaptation processes.

Figure 3: Only a part of the whole data set (the training data for STEP 1; target output vs. time in hours) was used for training due to the non-stationarity of the modeled system. The input data were split in the same manner.

Figure 4: Just a part of the whole data set was used for training due to the non-stationarity of the modeled system. The input data were split in the same manner.
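The partial swarm reinitialization can be sketched as below. The report only states that the swarm is split into two parts; which particles are preserved is an assumption here (this sketch keeps the better-scoring half), and the function and parameter names are illustrative.

```python
import numpy as np

def reinit_half(positions, fitnesses, init_range=(-0.5, 0.5), rng=None):
    """Partial reinitialization for model adaptation: preserve half of the
    swarm from the previous step (assumed here to be the better-scoring
    half) and redraw the other half randomly, keeping old-model knowledge
    while restoring diversity."""
    rng = rng or np.random.default_rng()
    n, dim = positions.shape
    keep = np.argsort(fitnesses)[: n // 2]               # preserved particles
    fresh = rng.uniform(*init_range, (n - n // 2, dim))  # random replacements
    return np.vstack([positions[keep], fresh])
```

The resulting swarm would then seed an ordinary PSO run on the shifted training window, so each adaptation step starts partly from the previous model rather than from scratch.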
Figure 5: The diagram of the solution. In STEP 1 the complete swarm is initialized randomly and the network is trained; in STEPS 2-4 half of the swarm is reinitialized and the model is adapted.

RESULTS AND CONCLUSIONS
The described approach was used for the prediction of the catalyst activity in 4 steps. The result of the first validation experiment, where the whole training data set of STEP 1 was split into a training and a testing set, is depicted in Figure 6. The green curve is the validation part of the data set and the blue curve is the model output. The final result of all steps except STEP 4 is shown in Figure 7. It can be seen that the prediction differs significantly at several points. These differences are probably caused by errors in the input data (measurements) or by the fixed training window used for adaptation.

Figure 6: The first validation of the model.
Figure 7: The final result of STEPS 1-3 (comparison of measurement and prediction; catalyst activity vs. time in hours).

The approach is highly inspired by nature from two points of view. First, a neural network model was used, inspired by the neural nets of living organisms. Moreover, the optimization algorithm used for training and adaptation is inspired by the movement of groups of creatures (fish, birds, insects) and even has a socio-cognitive metaphor in human decision making (individual and social knowledge).

REFERENCES
[1] www.nisis.de
[2] Elman, J. L., "Finding structure in time," Cognitive Science, vol. 14, pp. 179-211, 1990.
[3] Kennedy, J. and Eberhart, R. C., "Particle swarm optimization," in Proc. IEEE International Conference on Neural Networks, 1995, pp. 1942-1948.