Applying Ensemble Learning Techniques to ANFIS for Air Pollution Index Prediction in Macau

Kin Seng Lei and Feng Wan

Department of Electrical and Computer Engineering, Faculty of Science and Technology, University of Macau, Macau SAR, China
{ma76560,fwan}@umac.mo

Abstract. Awareness of environmental protection is rising, and air pollution has become one of the critical environmental issues as a result of the rapid growth of the economy and population. Reliable forecasting of the air pollution index (API) is therefore important, since it can act as an alarm that raises public awareness of air pollution. In this research, an architecture for ensembles of ANFIS (Adaptive Neuro-Fuzzy Inference System) is proposed for forecasting the Macau API. The performance of the proposed method is compared with that of the conventional ANFIS, and the results are verified by the performance indexes Root Mean Square Error (RMSE) and Average Percentage Error (APE), showing that a promising result can be achieved.

Keywords: API, ANFIS, Ensemble Learning, RMSE.

1 Introduction

The Macau Region, including the Macau Peninsula, Taipa Island and Coloane Island, is located south of Guangdong Province on the western bank of the Pearl River Estuary. It neighbors Gongbei of Zhuhai City and lies close to the South China Sea in the south. It is separated by a river from Wanchai of Zhuhai City in the west and faces Hong Kong across the sea in the east, at a distance of 42 nautical miles. Its total area covers 23.5 square kilometers. The population of Macau rose from 431,867 to 543,656 during the last decade, while the Gross Domestic Product (GDP) increased from 6.1 billion MOP to 21.7 billion MOP. In other words, the population grew by about 26% and the GDP by about 256% (to roughly 3.5 times its initial value). As a result of this dramatic economic growth, air quality has become a critical concern, since poor air quality has both chronic and serious effects on human health.
The Macau Meteorological and Geophysical Bureau (SMG) was established in 1953 and has monitored and reported the air quality of the preceding 24 hours to the public since March 1999. In order to give the general public an easy understanding of air quality, the SMG uses an Air Quality Index (AQI) system which classifies the air quality into six levels. The definition of the AQI system used in Macau is generally equivalent to the concept of the international API system.

J. Wang, G.G. Yen, and M.M. Polycarpou (Eds.): ISNN 2012, Part I, LNCS 7367, pp. 509-516, 2012. © Springer-Verlag Berlin Heidelberg 2012

The diffusion mechanism of air pollutants is very complicated and involves several pollutants, such as ozone (O3), nitrogen dioxide (NO2), suspended particulates (PM10) and sulfur dioxide (SO2). It is also strongly affected by weather conditions (e.g. temperature, humidity, wind speed and direction) and by the presence of primary pollutants that react with each other. Therefore, it is hard to predict the API with traditional mathematical techniques because of its ill-defined and complicated structure. Many researchers have consequently introduced other approaches to forecasting the API. The most commonly used is the Artificial Neural Network (ANN), a computational model based on biological neural networks. An ANN is generally trained from data and, owing to its generalization properties, has been widely used for modeling and forecasting. In particular, it has been successfully applied to air quality prediction in the past decade [1], [2].

From a different viewpoint, Takagi and Sugeno explored a systematic approach to fuzzy inference [3]. It can apply human knowledge and reasoning processes without employing precise quantitative analyses; however, there is still no standard method for transforming human knowledge or experience into the rule base of a fuzzy inference system. In addition, an effective method is needed for fine-tuning the membership functions so that the output error measure is minimized or a performance index is maximized.

In order to incorporate the concept of fuzzy logic into the neural network, Jang proposed another approach, the Adaptive Neuro-Fuzzy Inference System (ANFIS) [4], [5]. Generally speaking, ANFIS can be regarded as a basis for constructing a set of fuzzy if-then rules with appropriate membership functions, based on knowledge learned from the input/output data sets.
Therefore, ANFIS combines the advantages of neural networks and fuzzy logic: neural networks offer learning ability, parallel processing, adaptation, fault tolerance and distributed knowledge representation, while fuzzy logic techniques can deal with reasoning at a higher level. However, sample selection is a key concern: a given selection of training data may not reflect the real distribution underlying the prediction problem, in which case the effectiveness of the prediction algorithm cannot be assured. Choosing a proper training data set is therefore very important for time series prediction. In this paper, an ensemble structure is proposed that comprises several Sub-ANFIS units trained with different input selections, so that the conclusion is drawn by integrating the results of each ANFIS and the final result reflects a global point of view. The proposed model is adopted for forecasting the Macau API, and the simulation results are compared with those of the single ANFIS model by evaluating the performance index, root mean square error (RMSE), against nine years of measured data in Macau.

1.1 Paper Organization

In the next section, the basic theory of ANFIS and ensemble learning is addressed. Section 3 introduces the performance index for verifying the results obtained in this work. Section 4 describes the input selection for the API problem. Section 5 discusses the results and the performance of the proposed model and, finally, Section 6 draws the conclusions of this paper.

2 Methodology Review

Previous research revealed that it is difficult to predict the air pollution index using traditional mathematical meteorological and dispersion models, since these can only describe the relationship between pollutant emission, transmission and ambient air concentration of the pollutant as a function of space and time, while the air quality is also influenced by the conditions of neighboring regions and numerous weather factors. Roughly speaking, all the related factors should be addressed in the prediction model, which unfortunately becomes a complicated nonlinear function. As a result, many researchers have suggested forecasting with artificial intelligence techniques such as the Artificial Neural Network (ANN), the Fuzzy Inference System (FIS) and the Adaptive Neuro-Fuzzy Inference System (ANFIS), because these methods have been shown to be universal approximators. Among them, ANFIS combines the advantages of ANN and FIS; this research therefore focuses on the ANFIS model, whose concept is discussed next.

2.1 Adaptive Neuro-fuzzy Inference System (ANFIS)

ANFIS can be regarded as a class of adaptive neural networks that are functionally equivalent to fuzzy inference systems. The basic structure of ANFIS can be expressed as a feedforward neural network with 5 layers:

Layer 1: Every node in this layer is an adaptive node with an appropriate membership function corresponding to the input of the node.

$$O_{1,i} = \mu_{A_i}(x) \tag{1}$$

where x is the input to node i and A_i is a linguistic label associated with this node. O_{1,i} is the membership grade, which specifies the degree to which the given input satisfies the quantifier A_i. All the parameters in this layer are referred to as antecedent (premise) parameters.
Layer 2: Every node in this layer is a fixed node whose output is the firing strength of a rule. For instance:

$$O_{2,i} = w_i = \mu_{A_i}(x)\,\mu_{B_i}(y) \tag{2}$$
Layer 3: Every node in this layer is a fixed node whose output is called the normalized firing strength, which represents the ratio of the i-th rule's firing strength to the sum of all rules' firing strengths.

$$O_{3,i} = \bar{w}_i = \frac{w_i}{w_1 + w_2} \tag{3}$$

Layer 4: Every node in this layer is an adaptive node with node function

$$O_{4,i} = \bar{w}_i f_i = \bar{w}_i\,(p_i x + q_i y + r_i) \tag{4}$$

where \bar{w}_i is the output of the third layer and (p_i, q_i, r_i) is the parameter set of this node. All the parameters in this layer are referred to as consequent parameters.

Layer 5: The single node in this layer is a fixed node which computes the overall output as the summation of all incoming signals.

$$O_{5,1} = \sum_i \bar{w}_i f_i = \frac{\sum_i w_i f_i}{\sum_i w_i} \tag{5}$$

Figure 1 illustrates a typical structure of the adaptive neuro-fuzzy inference system.

Fig. 1. General Structure for ANFIS

From the above ANFIS structure, it can be observed that the output can be expressed as a linear combination of the consequent parameters if the values of the premise parameters are fixed. For each rule,

$$\bar{w}_i f_i = \bar{w}_i\,(p_i x + q_i y + r_i) = (\bar{w}_i x)\,p_i + (\bar{w}_i y)\,q_i + (\bar{w}_i)\,r_i \tag{6}$$
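The five layers and the linear form of Eq. (6) can be sketched in a few lines of Python. This is a minimal illustration only, assuming two inputs, two rules and Gaussian membership functions; the function names (`gauss_mf`, `anfis_forward`, `regressor_row`) and all parameter values are hypothetical and are not taken from the experiments in this paper.

```python
import math


def gauss_mf(x, c, sigma):
    """Gaussian membership grade of x for a fuzzy set centred at c (Layer 1)."""
    return math.exp(-0.5 * ((x - c) / sigma) ** 2)


def anfis_forward(x, y, premise, consequent):
    """Forward pass of a two-input first-order Sugeno ANFIS.

    premise:    one (c_A, sigma_A, c_B, sigma_B) tuple per rule (assumed layout)
    consequent: one (p, q, r) tuple per rule
    Returns the overall output and the normalized firing strengths.
    """
    # Layer 1: membership grades of each input, Eq. (1)
    mu_a = [gauss_mf(x, c_a, s_a) for (c_a, s_a, _, _) in premise]
    mu_b = [gauss_mf(y, c_b, s_b) for (_, _, c_b, s_b) in premise]
    # Layer 2: firing strength of each rule, Eq. (2)
    w = [a * b for a, b in zip(mu_a, mu_b)]
    # Layer 3: normalized firing strengths, Eq. (3)
    total = sum(w)
    w_bar = [w_i / total for w_i in w]
    # Layer 4: weighted first-order rule outputs, Eq. (4)
    o4 = [wb * (p * x + q * y + r) for wb, (p, q, r) in zip(w_bar, consequent)]
    # Layer 5: overall output as the sum of incoming signals, Eq. (5)
    return sum(o4), w_bar


def regressor_row(x, y, w_bar):
    """Coefficients of (p_1, q_1, r_1, p_2, q_2, r_2, ...) in Eq. (6)."""
    row = []
    for wb in w_bar:
        row.extend([wb * x, wb * y, wb])
    return row


# Hypothetical parameter values, chosen only for illustration.
premise = [(0.0, 1.0, 0.0, 1.0), (2.0, 1.0, 2.0, 1.0)]
consequent = [(1.0, 2.0, 0.5), (-1.0, 0.5, 3.0)]

output, w_bar = anfis_forward(1.0, 1.0, premise, consequent)

# With the premise parameters fixed, the same output is a plain dot product
# between the regressor row and the stacked consequent parameters.
theta = [value for rule in consequent for value in rule]
linear_output = sum(a * t for a, t in zip(regressor_row(1.0, 1.0, w_bar), theta))
```

Here `output` and `linear_output` agree, which is the property that allows the consequent parameters to be estimated by a linear least-squares fit once the premise parameters are held fixed.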
In [4], Jang proposed a hybrid learning method which combines gradient descent and least squares estimation. More specifically, the undetermined linear parameters (p_i, q_i, r_i) are identified by the least squares method in the forward pass, whereas in the backward pass the premise parameters are updated by gradient descent.

2.2 Ensemble learning

The general concept of ensemble learning, as described by Zhou et al., is that multiple component learners are trained to perform the same task. It has been widely used and successfully applied in different fields, including decision making, classification and medical diagnosis, owing to its global characteristics. There are many methods to realize ensemble learning. In this paper, we use bootstrap sampling with replacement and random sampling without replacement to construct the subsystems of the proposed ensemble system [6].

In Fig. 2, EN-ANFIS is constructed from five layers: input layer, sample layer, training layer, testing layer and output layer. In the sample layer, each ANFIS(i) is trained using randomly selected training data. Output(i) is the trained ANFIS(i). The testing data are input to every Output(i) at the same time, and the final output of EN-ANFIS is obtained by uniformly weighting the outputs of all Sub-ANFIS units:

$$\text{EN-ANFIS} = \frac{1}{n}\sum_{i=1}^{n} \text{ANFIS}(i) \tag{7}$$

Fig. 2. The ensemble ANFIS structure

3 Performance Index

The root mean square error (RMSE) is employed as the performance index to check the predictive results of the proposed model.
$$\text{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N} (a_i - p_i)^2} \tag{8}$$

where a_i and p_i are the actual and predicted values of the API on day i, and N is the number of testing days.

4 Input Selection

The design inputs include the previous day's concentrations of particulate matter (PM10), sulphur dioxide (SO2), nitrogen dioxide (NO2), carbon monoxide (CO) and ozone (O3), all of which affect the API, together with several meteorological factors: temperature, relative humidity, wind speed, solar radiation and pressure. These daily records are provided by the Macau Meteorological and Geophysical Bureau (SMG) as 8-h average values for the period from April 1994 to September 2003.

5 Results and Discussion

From April 1994 to September 2003, we collected around 3400 data pairs. For the conventional ANFIS, the first 3170 data sets are used for training while the others are used for testing. For EN-ANFIS, we apply only 30% of the training data, that is 951 data sets, to each ANFIS unit. The training sets drawn by random sampling contain no repeated records, whereas those drawn by bootstrap sampling contain some repeated records. To ensure the same criteria for comparison, EN-ANFIS consists of 8 ANFIS subunits, all trained by the hybrid learning technique with a desired error of 0.001 and using the Gaussian membership function (gaussmf), chosen in view of the statistical nature of the prediction model.

Table 1 relates the data accumulated over the past years for training, and for testing the API of the following year, to the performances of EN-ANFIS, allANFIS (the conventional ANFIS) and the individual ANFIS units.

Table 1. Use of yearly progressive training sets and related performances

Sampling            Model                  RMSE      Training time (s)   Number of training data sets
Bootstrap sampling  ANFISmin               12.5271   11.54               951
Bootstrap sampling  ANFISmax               13.7214   13.02               951
Bootstrap sampling  ANFISmean              12.8312   12.21               951
Random sampling     ANFISmin               12.3897   12.08               951
Random sampling     ANFISmax               14.2168   12.97               951
Random sampling     ANFISmean              13.2011   12.55               951
                    EN-ANFIS (Bootstrap)   12.2351   12.78               951
                    EN-ANFIS (Random)      12.2072   12.79               951
                    allANFIS               12.0315   38.91               3400
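As a rough illustration of how the two sampling schemes, the uniform averaging of Eq. (7) and the RMSE of Eq. (8) fit together, consider the following sketch. It is not the implementation used in our experiments: the base learner `fit_line` is a deliberately simple least-squares line standing in for a trained ANFIS unit, the data are synthetic, and all function names are hypothetical.

```python
import math
import random


def bootstrap_sample(data, k, rng):
    """Draw k records with replacement, so repeated records may occur."""
    return rng.choices(data, k=k)


def random_sample(data, k, rng):
    """Draw k records without replacement, so every record is distinct."""
    return rng.sample(data, k)


def fit_line(pairs):
    """Stand-in base learner (NOT ANFIS): least-squares line y = a*x + b."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    a = sxy / sxx if sxx else 0.0
    b = mean_y - a * mean_x
    return lambda x: a * x + b


def build_ensemble(train, n_units, fraction, sampler, rng):
    """Train n_units base learners, each on its own sample of the training set."""
    k = int(round(fraction * len(train)))
    return [fit_line(sampler(train, k, rng)) for _ in range(n_units)]


def ensemble_predict(models, x):
    """Uniformly weighted average of the unit outputs, as in Eq. (7)."""
    return sum(model(x) for model in models) / len(models)


def rmse(actual, predicted):
    """Root mean square error over the testing records, as in Eq. (8)."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


# Synthetic one-input data, used only to exercise the functions above.
rng = random.Random(0)
data = [(float(x), 2.0 * x + 1.0 + rng.gauss(0.0, 1.0)) for x in range(200)]
train, test = data[:150], data[150:]

# Eight units, each trained on 30% of the training set, for both schemes.
for sampler in (bootstrap_sample, random_sample):
    models = build_ensemble(train, 8, 0.3, sampler, rng)
    predictions = [ensemble_predict(models, x) for x, _ in test]
    error = rmse([y for _, y in test], predictions)
```

With 3170 training records and a fraction of 0.3, `build_ensemble` would pass 951 records to each unit, matching the setting described above; swapping `fit_line` for an ANFIS trainer leaves the sampling and averaging logic unchanged.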
Referring to Table 1, we can see that the prediction results of EN-ANFIS are better than those of every individual ANFIS unit, whichever sampling technique is used: the RMSE of EN-ANFIS is 12.2351 (bootstrap) and 12.2072 (random), against minimum unit RMSEs of 12.5271 and 12.3897, respectively. On the other hand, the prediction accuracy of EN-ANFIS is close to that of allANFIS (RMSE 12.0315). At the same time, there is a significant improvement in the training time and in the number of training data used: EN-ANFIS takes about 12.8 s and 951 training data pairs, compared with 38.91 s for allANFIS. From the above discussion and analysis, we find that EN-ANFIS outperforms every individual ANFIS unit and that the ensemble of ANFIS units can achieve a performance similar to that of allANFIS. To reinforce this conclusion, the predicted and actual API values are given in Fig. 3.

[Figure: API (vertical axis) versus days 0 to 270 (horizontal axis) for EN-ANFIS (Bootstrap), EN-ANFIS (Random), allANFIS and the actual values]

Fig. 3. The predicted and actual values of API during the testing stage

6 Conclusion

Ensemble learning incorporated with ANFIS is introduced in this paper for forecasting the API in Macau, using the daily meteorological data sets measured from 1994.4 to 2002.12. The experimental results show that the proposed EN-ANFIS structure performs better than any individual ANFIS unit and obtains a performance comparable to that of the conventional ANFIS, while using fewer training data sets and less training time. These results indicate that the proposed hybrid approach is capable of handling nonlinear problems and complex phenomena.
References

1. Boznar, M., Lesjak, M., Mlakar, P.: A neural network based method for short-term predictions of ambient SO2 concentrations in highly polluted industrial areas of complex terrain. Atmospheric Environment 27B(2), 221-230 (1993)
2. Mok, K.M., Tam, S.C., Yan, P., Lam, L.H.: A neural network forecasting system for daily air quality index in Macau. In: Air Pollution VII, C.A (2000)
3. Takagi, T., Sugeno, M.: Fuzzy identification of systems and its applications to modeling and control. IEEE Trans. Syst., Man, Cybern. 15, 116-132 (1985)
4. Jang, J.S.: ANFIS: Adaptive-Network-Based Fuzzy Inference System. IEEE Trans. Syst., Man, Cybern. 23, 665-683 (1993)
5. Jang, J.S.R.: Neuro-fuzzy and soft computing: a computational approach to learning and machine intelligence, pp. 335-422. Prentice Hall, Upper Saddle River (1997)
6. Zhou, Z.H., Wu, J., Tang, W.: Ensembling neural networks: Many could be better than all. Artificial Intelligence 137(1-2), 239-263 (2002)
7. Wang, C., Zhang, J.P.: Time series prediction based on ensemble ANFIS. In: Proceedings of the Fourth International Conference on Machine Learning and Cybernetics, Guangzhou, August 18-21 (2005)
8. Talebizadeh, M., Moridnejad, A.: Uncertainty analysis for the forecast of lake level fluctuations using ensembles of ANN and ANFIS models. Expert Systems with Applications 38 (2011)