Applying Ensemble Learning Techniques to ANFIS for Air Pollution Index Prediction in Macau




Kin Seng Lei and Feng Wan
Department of Electrical and Computer Engineering, Faculty of Science and Technology, University of Macau, Macau SAR, China
{ma76560,fwan}@umac.mo

Abstract. Awareness of environmental protection is rising steadily, and air pollution, driven by the rapid growth of the economy and the population, is one of the most critical environmental issues. Reliable forecasting of the air pollution index (API) is therefore important, since it can act as an alarm that raises awareness of air pollution. In this research, an architecture for ensembles of ANFIS (Adaptive Neuro-Fuzzy Inference System) is proposed for forecasting the Macau API. The performance of the proposed method is compared with that of the conventional ANFIS, and the results are verified by the performance indexes Root Mean Square Error (RMSE) and Average Percentage Error (APE), showing that a promising result can be achieved.

Keywords: API, ANFIS, Ensemble Learning, RMSE.

1 Introduction

The Macau Region, including the Macau Peninsula, Taipa Island and Coloane Island, is located south of Guangdong Province on the western bank of the Pearl River Estuary. It neighbors Gongbei of Zhuhai City and lies close to the South China Sea to the south. It is separated by a river from Wanchai of Zhuhai City to the west and faces Hong Kong across the sea to the east, at a distance of 42 nautical miles. Its total area is 23.5 square kilometers. Over the last decade, the population of Macau rose from 431,867 to 543,656, while the Gross Domestic Product (GDP) increased from 6.1 billion MOP to 21.7 billion MOP. In other words, the population grew by roughly 26% and the GDP rose to roughly 3.5 times its initial value. As a result of this dramatic economic growth, air quality has become a critical concern, since poor air quality has both chronic and serious effects on human health.
The Macau Meteorological and Geophysical Bureau (SMG) was established in 1953 and has monitored and reported the air quality of the preceding 24 hours to the public since March 1999. In order to make air quality easy for the general public to understand, the SMG uses an Air Quality Index (AQI) system which classifies the air quality into six levels. The definition of the AQI system used in Macau is generally equivalent to the concept of the international API system.

J. Wang, G.G. Yen, and M.M. Polycarpou (Eds.): ISNN 2012, Part I, LNCS 7367, pp. 509–516, 2012. Springer-Verlag Berlin Heidelberg 2012

The diffusion mechanism of air pollutants is very complicated and depends on several parameters, such as ozone (O3), nitrogen dioxide (NO2), suspended particulates (PM10) and sulfur dioxide (SO2). It is also strongly affected by weather conditions (e.g. temperature, humidity, wind speed and direction) and by the presence of primary pollutants that react with each other. It is therefore hard to predict the API with traditional mathematical techniques, because the problem has an ill-defined and complicated structure. Many researchers have consequently introduced other approaches to forecasting the API. The most commonly used is the Artificial Neural Network (ANN), a computational model based on biological neural networks. An ANN is trained on training data and, owing to its generalization properties, has been widely used for modeling and forecasting. In particular, it has been successfully applied to air quality prediction in the past decade [1], [2].

From a different viewpoint, Takagi and Sugeno explored a systematic approach to fuzzy inference [3]. It can apply human knowledge and reasoning processes without employing precise quantitative analyses; however, there is still no standard method for transforming human knowledge or experience into the rule base of a fuzzy inference system. In addition, an effective method is needed for fine-tuning the membership functions so that the output error measure is minimized or a performance index is maximized. In order to incorporate the concept of fuzzy logic into the neural network, Jang proposed another approach, the Adaptive Neuro-Fuzzy Inference System (ANFIS) [4], [5]. Generally speaking, ANFIS can be regarded as a basis for constructing a set of fuzzy if-then rules with appropriate membership functions, based on knowledge learned from the input/output data sets.
Therefore, ANFIS combines the advantages of neural networks and fuzzy logic: neural networks offer good learning ability, parallel processing, adaptation, fault tolerance and distributed knowledge representation, while fuzzy logic techniques can handle reasoning at a higher level. However, sample selection is a key concern: a particular selection of training data may not reflect the real distribution of the process being modeled, in which case the effectiveness of the prediction algorithm cannot be assured. Choosing a proper training data set is therefore very important for time series prediction. In this paper, an ensemble structure is proposed that comprises several Sub-ANFIS units with different selections of input data, so that the conclusion is drawn by integrating the results of each ANFIS and the final result reflects a global point of view. The proposed model is adopted for forecasting the Macau API, and the simulation results are compared with those of the single ANFIS model by evaluating the performance index, root mean square error (RMSE), against nine years of measured data for Macau.

1.1 Paper Organization

In the next section, the basic theory of ANFIS and ensemble learning is addressed. Section 3 introduces the performance index for verifying the results obtained in this work. Section 4 performs the input selection for the API problem. Section 5 discusses the results and the performance of the proposed model and, finally, Section 6 draws the conclusions of this paper.

2 Methodology Review

Previous research revealed that traditional mathematical meteorological and dispersion models are inflexible for predicting the air pollution index, since they only describe the relationship between pollutant emission, transmission and the ambient air concentration of a pollutant as a function of space and time, whereas air quality is also influenced by the conditions of neighboring regions and by numerous weather factors. Roughly speaking, all the related factors should be considered in the prediction model, which unfortunately results in a complicated non-linear function. As a result, many researchers have suggested forecasting with artificial intelligence techniques such as the Artificial Neural Network (ANN), the Fuzzy Inference System (FIS) and the Adaptive Neuro-Fuzzy Inference System (ANFIS), because these methods have been shown to be universal approximators. Among them, ANFIS combines the advantages of ANN and FIS; this research therefore focuses on the ANFIS model, whose concept is discussed next.

2.1 Adaptive Neuro-fuzzy Inference System (ANFIS)

ANFIS can be regarded as a class of adaptive neural networks that are functionally equivalent to fuzzy inference systems. The basic structure of ANFIS can be expressed as a feedforward neural network with 5 layers:

Layer 1: Every node $i$ in this layer is an adaptive node with an appropriate membership function corresponding to the input to node $i$.

$$O_{1,i} = \mu_{A_i}(x) \tag{1}$$

where $x$ is the input to node $i$ and $A_i$ is a linguistic label associated with this node. $O_{1,i}$ is the membership grade, which specifies the degree to which the given input $x$ satisfies the quantifier $A_i$. All the parameters in this layer are referred to as antecedent parameters (also called premise parameters).
Layer 2: Every node in this layer is a fixed node whose output is the firing strength of a rule. For instance:

$$O_{2,i} = w_i = \mu_{A_i}(x)\,\mu_{B_i}(y) \tag{2}$$

Layer 3: Every node in this layer is a fixed node whose output is called the normalized firing strength, which represents the ratio of the $i$-th rule's firing strength to the sum of all rules' firing strengths.

$$O_{3,i} = \bar{w}_i = \frac{w_i}{w_1 + w_2} \tag{3}$$

Layer 4: Every node in this layer is an adaptive node with the node function

$$O_{4,i} = \bar{w}_i f_i = \bar{w}_i\,(p_i x + q_i y + r_i) \tag{4}$$

where $\bar{w}_i$ is the output of the 3rd layer and $(p_i, q_i, r_i)$ is the parameter set of this node. All the parameters in this layer are referred to as consequent parameters.

Layer 5: The single node in this layer is a fixed node which computes the overall output as the summation of all incoming signals.

$$O_{5,1} = \sum_i \bar{w}_i f_i = \frac{\sum_i w_i f_i}{\sum_i w_i} \tag{5}$$

Figure 1 illustrates a typical structure of the adaptive neuro-fuzzy inference system.

Fig. 1. General Structure for ANFIS

From the above ANFIS structure, it can be observed that the overall output is a linear combination of the consequent parameters if the values of the premise parameters are fixed. For example:

$$f = \sum_i \bar{w}_i\,(p_i x + q_i y + r_i) = \sum_i \left[ (\bar{w}_i x)\,p_i + (\bar{w}_i y)\,q_i + (\bar{w}_i)\,r_i \right] \tag{6}$$
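The five layers and the linear form of Eq. (6) can be illustrated with a minimal Python sketch. It assumes a two-input, two-rule first-order Sugeno system with Gaussian membership functions. The function names and the values in `PREMISE` and `CONSEQUENT` are hypothetical and chosen only for illustration; they are not parameters from this work.

```python
import math

def gauss_mf(v, c, sigma):
    """Gaussian membership grade of v for a fuzzy set with centre c and width sigma."""
    return math.exp(-0.5 * ((v - c) / sigma) ** 2)

def normalized_firing_strengths(x, y, premise):
    """Layers 1-3. premise holds one (c_A, sigma_A, c_B, sigma_B) tuple per rule."""
    # Layers 1 and 2: membership grades, Eq. (1), multiplied into w_i, Eq. (2).
    w = [gauss_mf(x, ca, sa) * gauss_mf(y, cb, sb) for ca, sa, cb, sb in premise]
    total = sum(w)  # assumed non-zero, i.e. at least one rule fires
    # Layer 3: normalised firing strengths, Eq. (3).
    return [wi / total for wi in w]

def anfis_output(x, y, premise, consequent):
    """Layers 4-5. consequent holds one (p_i, q_i, r_i) tuple per rule, Eqs. (4)-(5)."""
    w_bar = normalized_firing_strengths(x, y, premise)
    return sum(wb * (p * x + q * y + r) for wb, (p, q, r) in zip(w_bar, consequent))

def regressor_row(x, y, premise):
    """Coefficients multiplying (p_1, q_1, r_1, p_2, q_2, r_2, ...) in Eq. (6)."""
    row = []
    for wb in normalized_firing_strengths(x, y, premise):
        row.extend([wb * x, wb * y, wb])
    return row

# Hypothetical parameters for two rules (illustration only).
PREMISE = [(0.0, 1.0, 0.0, 1.0), (2.0, 1.0, 2.0, 1.0)]
CONSEQUENT = [(1.0, 0.5, 0.0), (-1.0, 2.0, 3.0)]

if __name__ == "__main__":
    print(anfis_output(1.0, 1.0, PREMISE, CONSEQUENT))
    print(regressor_row(1.0, 1.0, PREMISE))
```

With the premise parameters held fixed, stacking one `regressor_row` per training sample gives a linear system in the consequent parameters, so any standard least-squares solver can estimate them.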

In [4], Jang proposed a hybrid learning method which combines gradient descent and least squares estimation. More specifically, in the forward pass the linear consequent parameters $(p_i, q_i, r_i)$ are identified by the least squares method, whereas in the backward pass the premise parameters are updated by gradient descent.

2.2 Ensemble learning

The general concept of ensemble learning, as described by Zhou et al. [6], is that multiple component learners are trained for the same task. It has been widely used and successfully applied in different fields, including decision making, classification and medical diagnosis, owing to its global characteristics. There are many methods to realize ensemble learning. In this paper, we use bootstrap sampling with replacement and random sampling without replacement to construct the subsystems of the proposed ensemble system.

In Fig. 2, EN-ANFIS is constructed from five layers: input layer, sample layer, training layer, testing layer and output layer. In the sample layer, the training data for each ANFIS($i$) are selected at random. Output($i$) is the trained ANFIS($i$). The testing data are fed to every Output($i$) at the same time, and the final output of EN-ANFIS is obtained by uniformly weighting the outputs of all $n$ Sub-ANFIS units:

$$\text{EN-ANFIS} = \frac{1}{n} \sum_{i=1}^{n} \text{ANFIS}_i \tag{7}$$

Fig. 2. The ensemble ANFIS structure

3 Performance Index

The root mean square error (RMSE) is employed as the performance index to check the predictive results of the proposed model.

$$\text{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (a_i - p_i)^2} \tag{8}$$

where $a_i$ and $p_i$ are the actual and predicted values of the API on day $i$, and $N$ is the number of testing days.

4 Input Selection

The design inputs include the previous day's concentrations of particulate matter (PM10), sulphur dioxide (SO2), nitrogen dioxide (NO2), carbon monoxide (CO) and ozone (O3), which affect the API, together with several meteorological factors: temperature, relative humidity, wind speed, solar radiation and pressure. These daily records are provided by the Macau Meteorological and Geophysical Bureau (SMG) as 8-h average values for the period from 1994.4 to 2003.9.

5 Results and Discussion

From 1994.4 to 2003.9, we collected around 3400 data pairs. For the conventional ANFIS, the first 3170 data pairs are used for training and the rest are used for testing. For EN-ANFIS, only 30% of the training data, that is 951 data pairs, is applied to each ANFIS unit. The training data drawn by random sampling are all different, whereas those drawn by bootstrap sampling contain some repeated data. To ensure the same criteria for comparison, EN-ANFIS consists of 8 ANFIS subunits, all trained by the hybrid learning technique with a desired error of 0.001 and using gaussmf as the membership function, in consideration of the statistical aspect of the prediction model. Table 1 shows the mapping between the data accumulated over the past years for training and testing the API of the following year and the performances of EN-ANFIS, allANFIS (the conventional ANFIS trained on the whole training set) and the individual ANFIS units.

Table 1. Use of yearly progressively training sets and related performances

Sampling method      Model                  RMSE      Training time (s)   Number of training data sets
Bootstrap sampling   ANFISmin               12.5271   11.54               951
Bootstrap sampling   ANFISmax               13.7214   13.02               951
Bootstrap sampling   ANFISmean              12.8312   12.21               951
Random sampling      ANFISmin               12.3897   12.08               951
Random sampling      ANFISmax               14.2168   12.97               951
Random sampling      ANFISmean              13.2011   12.55               951
                     EN-ANFIS (Bootstrap)   12.2351   12.78               951
                     EN-ANFIS (Random)      12.2072   12.79               951
                     allANFIS               12.0315   38.91               3400
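The two sampling schemes, the uniform averaging of Eq. (7) and the RMSE of Eq. (8) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `fit_mean_model` is a hypothetical stand-in for training one ANFIS unit (it just predicts the mean target of its sample), and the toy data and random seed are invented for the example. Only the subset fraction (30%), the training set size (3170) and the number of subunits (8) come from the text above.

```python
import math
import random

def bootstrap_sample(data, k, rng):
    """Draw k items with replacement, so repeated items are possible."""
    return [rng.choice(data) for _ in range(k)]

def random_subsample(data, k, rng):
    """Draw k distinct items without replacement."""
    return rng.sample(data, k)

def ensemble_predict(models, x):
    """Uniformly weighted average of the sub-model outputs, Eq. (7)."""
    return sum(m(x) for m in models) / len(models)

def rmse(actual, predicted):
    """Root mean square error over the testing days, Eq. (8)."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)

def fit_mean_model(sample):
    """Hypothetical stand-in for one trained ANFIS unit (NOT an ANFIS):
    it always predicts the mean target value of its training sample."""
    mean_target = sum(t for _, t in sample) / len(sample)
    return lambda x: mean_target

if __name__ == "__main__":
    rng = random.Random(0)                                # arbitrary seed
    train = [(i, float(i % 10)) for i in range(3170)]     # toy (input, target) pairs
    test = [(i, float(i % 10)) for i in range(3170, 3400)]
    k = round(0.3 * len(train))                           # 30% of the training data
    models = [fit_mean_model(bootstrap_sample(train, k, rng)) for _ in range(8)]
    predictions = [ensemble_predict(models, x) for x, _ in test]
    print(k, rmse([t for _, t in test], predictions))
```

Replacing `bootstrap_sample` with `random_subsample` in the list comprehension gives the random-sampling variant of the ensemble.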

Referring to Table 1, the prediction results of EN-ANFIS are better than those of any individual ANFIS unit, whichever sampling technique is used. In addition, the prediction accuracy of EN-ANFIS is close to that of allANFIS. At the same time, there is a significant improvement in training time and in the amount of training data: EN-ANFIS consumes much less time (about 12.8 s versus 38.9 s) and uses fewer training data pairs (951 versus 3400). From the above discussion and analysis, we find that EN-ANFIS outperforms every individual ANFIS unit, and that the ensemble of ANFIS units achieves a performance similar to that of allANFIS. To reinforce this conclusion, the predicted and actual API values are given in Fig. 3.

Fig. 3. The predicted and actual values of API during the testing stage (x-axis: Days; y-axis: API; series: Actual, EN-ANFIS (Bootstrap), EN-ANFIS (Random), allANFIS)

6 Conclusion

Ensemble learning incorporated with ANFIS is introduced in this paper for forecasting the API in Macau, using the daily meteorological data sets measured from 1994.4 to 2002.12. The experimental results show that the proposed EN-ANFIS structure not only performs better than any individual ANFIS unit but also obtains a performance comparable to that of the conventional ANFIS, while using fewer training data sets and consuming less training time. These results suggest that the proposed hybrid approach is well suited to handling nonlinear problems and complex phenomena.

References

1. Boznar, M., Lesjack, M., Mlakar, P.: A neural network based method for short-term predictions of ambient SO2 concentrations in highly polluted industrial areas of complex terrain. Atmospheric Environment 27B(2), 221–230 (1993)
2. Mok, K.M., Tam, S.C., Yan, P., Lam, L.H.: A neural network forecasting system for daily air quality index in Macau. In: Air Pollution VII, C.A (2000)
3. Takagi, T., Sugeno, M.: Fuzzy identification of systems and its applications to modeling and control. IEEE Trans. Syst., Man, Cybern. 15, 116–132 (1985)
4. Jang, J.S.: ANFIS: Adaptive-Network-Based Fuzzy Inference System. IEEE Trans. Syst., Man, Cybern. 23, 665–683 (1993)
5. Jang, J.S.R.: Neuro-fuzzy and soft computing: a computational approach to learning and machine intelligence, pp. 335–422. Prentice Hall, Upper Saddle River (1997)
6. Zhou, Z.H., Wu, J., Tang, W.: Ensembling neural networks: Many could be better than all. Artificial Intelligence 137(1-2), 239–263 (2002)
7. Wang, C., Zhang, J.P.: Time series prediction based on ensemble ANFIS. In: Proceedings of the Fourth International Conference on Machine Learning and Cybernetics, Guangzhou, August 18-21 (2005)
8. Talebizadeh, M., Moridnejad, A.: Uncertainty analysis for the forecast of lake level fluctuations using ensembles of ANN and ANFIS models. Expert Systems with Applications 38 (2011)