A New Method for Traffic Forecasting Based on the Data Mining Technology with Artificial Intelligent Algorithms




Research Journal of Applied Sciences, Engineering and Technology 5(12): 3417-3422, 2013
ISSN: 24-7459; e-ISSN: 24-7467
Maxwell Scientific Organization, 2013
Submitted: October 17, 2012 Accepted: November 23, 2012 Published: April 1, 2013

1 Wei He, 2 Tao Lu and 3 Enjun Wang
1 Transportation Engineering Institute of Minjiang University, Fujian, 3518, China
2 Hubei Province Key Laboratory of Intelligent Robot, College of Computer Science and Engineering, Wuhan Institute of Technology, Wuhan, 437, China
3 Transportation Research Center, Wuhan Institute of Technology, 4373, Wuhan, China

Abstract: This study investigates traffic information forecasting based on data mining technology. It is well known that useful knowledge in a traffic management system is often hidden in a large amount of traffic data. Generally, prior data pattern labels have been used to train the Artificial Neural Network (ANN) to identify the traffic conditions in traffic information forecasting, so the performance of such ANN models depends on the prior knowledge of experts. To reduce this dependence, a new ANN model based on data mining technology is proposed in this study. The Self-Organized Feature Map (SOFM) is first employed to cluster the traffic data through unsupervised learning and to provide labels for these data. The labeled data are then used to train a GA-Chaos optimized RBF neural network, in which the GA-Chaos algorithm is used to train the RBF parameters. Experimental tests using practical data sets from Intelligent Transportation Systems (ITS) were carried out to validate the performance of the proposed ANN model. The analysis results demonstrate that the proposed method can extract the potential patterns hidden in the traffic data and can accurately predict the future traffic state.
The prediction accuracy exceeds 95%. Hence, the new data mining model can provide practical application for traffic information forecasting in the ITS.

Keywords: Artificial neural network, data mining, optimization, traffic forecasting

INTRODUCTION

In recent years, computer science and sensor technologies have developed rapidly. As a result, far more data is stored in databases than ever before (Zahra et al., 2010). Data collection and storage in Intelligent Transportation Systems (ITS) are therefore updated very quickly, and the large amount of traffic data acquired by various sensors greatly increases the computation cost of analyzing traffic information. Useful information is hidden in this mass of data. Data mining technology can find potential patterns of traffic activity and management, which reduces the computation cost and enhances traffic forecasting and control. It is therefore crucial to implement efficient data mining to discover important traffic rules and information, and to construct a real-time and accurate traffic information system that supports traffic status prediction and decision making.

In traffic forecasting and control, wireless sensor networks, cameras and high speed computers have been employed in current ITS (Nejad et al., 2009). Traffic volume, speed and occupancy data are regarded as important features in traffic control and information management systems. Based on these traffic features, it is possible to develop models to predict and extrapolate the forthcoming traffic conditions (Wen and Lee, 2005). In general, the number of samples has a great influence on decision making. However, real-world traffic data is extremely complex, and the high dimensionality of the data makes classical statistical methods inefficient at providing good decisions for traffic forecasting and control.
To overcome this problem, new algorithms are needed to analyze mass data and mine useful information. This procedure is the so-called data mining technology.

A lot of work has been done on traffic forecasting using data mining technology. Hauser and Scherer (2001) were the first to adopt a clustering approach to manage urban traffic, and obtained a reasonable management scheme. After that, Park et al. (2003) employed the Genetic Algorithm (GA) to solve the problem of unclean clusters and enhance the precision of traffic forecasting. Subsequently, decision trees (Xu and Lin, 2009), Artificial Intelligent (AI) algorithms (Jia et al., 2006), etc., were introduced into the field of traffic forecasting management. However, most of this research is limited to accident alarms, and very little work has been done to connect the traffic features to the traffic conditions. Investigating the deep correlation among the various traffic parameters is nevertheless necessary for traffic forecasting management, and a comprehensive understanding of the underlying traffic principles is important for correct traffic management decision-making. Although neural network models (Raahemi et al., 2008) were developed for mining association rules from the ITS database, the data was labeled in advance and the knowledge was learned in a supervised way. This is not realistic in practice because the classes of the data are difficult to determine before the data mining procedure (Li et al., 2010, 2011a, b, 2012a, b, c). More practical tools for finding the hidden knowledge in mass data are therefore urgently needed.

In order to mine useful information hidden in mass ITS data for traffic information forecasting, a new hybrid intelligent data mining model is proposed in this study based on the Self-Organizing Feature Map (SOFM) and a GA-Chaos optimized RBF neural network. The SOFM is first used to label potential clusters hidden in the ITS database in an unsupervised manner. The labeled clusters are then treated as feature patterns to train the RBF neural network for traffic forecasting. The GA-Chaos algorithm is used to optimize the RBF parameters. An empirical study on ITS data has proved that the new method is a useful tool for traffic forecasting and control.

Corresponding Author: Wei He, Transportation Engineering Institute of Minjiang University, Fujian, 3518, China

DESCRIPTION OF THE PROPOSED PREDICTION MODEL

Data mining is one of the most active topics in the fields of databases and statistics. It aims to analyze and mine knowledge from mass data sets (Nejad et al., 2009). Through data mining, useful features associated with the traffic flow trend can be revealed from the ITS data warehouse.
Thus, the traffic features can be transformed into readable information to enhance traffic information forecasting and traffic control. Figure 1 shows a typical Intelligent Transportation System (ITS). It includes an ITS data source module, a data warehouse module, a data mining module and a Decision Support System (DSS) module. Data mining is one of the key techniques in this traffic information system. It is the basis of the DSS module, which is responsible for correct traffic information forecasting and traffic control. Hence, it is crucial to establish an efficient data mining method for the ITS. For this reason, the SOFM and RBF neural networks are applied to intelligent data mining for the ITS in this study.

Fig. 1: Typical control and management framework of ITS (blocks: sensor data, data storage, ITS database, data mining, index parameter based analysis sub-system, AI model based analysis sub-system, knowledge based analysis sub-system, decision support system, traffic prediction, traffic control, traffic management)

Self-Organizing Feature Map (SOFM): The SOFM was proposed by Kohonen (1990). It is a powerful tool for pattern recognition using unsupervised learning. Because the hidden patterns in the ITS data are unknown, the SOFM is well suited to this case. The SOFM can find useful information contained in the ITS database to identify potential clusters automatically (Jiang et al., 2010). The labeled clusters can then be used instead of man-made labels, which avoids the shortcomings of relying on expert experience. The theory of the SOFM is fully discussed in Kohonen (1990). The SOFM is a two-layer neural network. The first layer is the input layer. The second layer is the output layer, which contains neurons arranged in a rectangular pattern. Through Kohonen learning (Kohonen, 1990), the SOFM can automatically find clusters in the input data if they exist. Figure 2 shows the structure of the SOFM, where p_i (i = 1, 2, ..., a) are the input variables and y_i (i = 1, 2, ..., b) are the output neurons.

Fig. 2: The structure of SOFM neural network (input layer p_1, p_2, ..., p_a; output layer y_1, ..., y_b)

GA-Chaos optimized RBF neural network: The RBF neural network has good nonlinear mapping capability and hence is suitable for traffic information forecasting. The performance of the RBF network is influenced by the number of hidden nodes and by the central values and widths of the basis functions. The Genetic Algorithm (GA) is used to optimize these parameters in this study. GA has three operators: selection, crossover and mutation. The goal of these operators is to select new individuals with strong fitness. However, the best fitness is not always easy to obtain. Sometimes GA may fall into a local extremum, i.e., the premature convergence problem. In terms of mechanism, premature convergence is mainly caused by a lack of effective genes in the offspring. To help the GA avoid premature convergence, this study adopts chaos optimization.

Chaos optimization search is able to help GA avoid local extrema during the search process. A commonly used chaos optimization principle is the Logistic sequence (Krishna, 2012). It first maps the chaotic variables into the solution space through the Logistic map. Second, it searches by exploiting the ergodicity, randomness and regularity of the chaotic variables. The initial population can be generated by chaotic sequences, which to a certain extent can improve the searching efficiency of the genetic algorithm. The Logistic map is:

    x_n = \mu x_{n-1} (1 - x_{n-1})    (1)

where \mu is the control parameter; the system is in a chaotic state when \mu = 4. The chaos optimization process is as follows. First, choose an arbitrary initial value and generate the N chaotic variables {X_1, X_2, ..., X_n} with different trajectories. Second, the i-th chaotic variable is mapped into the solution space by the first carrier:

    y_{in} = c_i + d_i x_{in}    (2)

where c_i and d_i are constants. The current best point y^* and the optimal value f^* are then recorded: whenever a better solution is found, set y^* = y and f^* = f. If f^* remains unchanged after N iterations, the second carrier is applied:

    y_{im} = x_i^* + \alpha_i (x_{im} - 0.5)    (3)

where:
m = The iterative step
\alpha_i = A constant
x_{im} = Smaller chaotic variables in the traversal area
y_{im} = The searching result

The search continues until y_{im} satisfies the termination criterion, which gives the optimal solution y^* and the optimal value f^*. Through this process, the chaos algorithm can effectively find reasonable genetic operation parameters and help the genetic algorithm jump out of local extrema. Thus, the optimization process of chaos-genetic-RBF can be expressed as follows:

i. Encode the GA chromosomes with the number of hidden nodes and the central values and widths of the basis functions of the RBF network
ii. Initialize the chromosomes and set the genetic operation parameters
iii. Calculate the corresponding fitness
iv. Perform crossover and mutation
v. Decode the newborn progeny populations to obtain the corresponding fitness
vi. Optimize the best individuals in the population with the chaotic algorithm. If the searching result is better than the original fitness, substitute the individual
vii. If the results satisfy the termination conditions, stop; otherwise return to (iii) for another iteration
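The chaotic search of Eq. (1) to (3) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the GA stage is omitted, the RBF model is reduced to a single Gaussian node whose centre is fixed, and only its width is tuned. The toy data, the function names (`logistic_sequence`, `rbf_output`, `chaotic_search`, `width_fitness`), the step counts, the value of `alpha` and the clamping to the search bounds are all assumptions made for illustration.

```python
import math


def logistic_sequence(x0, n, mu=4.0):
    """Iterate Eq. (1), x_k = mu * x_{k-1} * (1 - x_{k-1}), n times."""
    xs, x = [], x0
    for _ in range(n):
        x = mu * x * (1.0 - x)
        xs.append(x)
    return xs


def rbf_output(x, center, width):
    """Response of a single Gaussian basis function."""
    return math.exp(-((x - center) ** 2) / (2.0 * width ** 2))


def chaotic_search(fitness, lower, upper, x0=0.3, coarse=200, fine=200, alpha=0.2):
    """Maximise `fitness` on [lower, upper] with a two-carrier chaotic search."""
    chaos = logistic_sequence(x0, coarse + fine)
    c, d = lower, upper - lower
    best_y, best_f = lower, fitness(lower)
    # First carrier, Eq. (2): map each chaotic value onto the whole interval.
    for x in chaos[:coarse]:
        y = c + d * x
        f = fitness(y)
        if f > best_f:
            best_y, best_f = y, f
    # Second carrier, Eq. (3): small chaotic perturbation around the best point.
    # Clamping to the bounds is an assumption, not part of Eq. (3).
    for x in chaos[coarse:]:
        y = min(max(best_y + alpha * (x - 0.5), lower), upper)
        f = fitness(y)
        if f > best_f:
            best_y, best_f = y, f
    return best_y, best_f


# Hypothetical toy data: targets produced by a Gaussian of width 1.0 centred at 2.0.
INPUTS = [0.0, 1.0, 2.0, 3.0, 4.0]
TARGETS = [rbf_output(x, 2.0, 1.0) for x in INPUTS]


def width_fitness(width):
    """Negative squared error of a one-node RBF model with the given width."""
    return -sum((rbf_output(x, 2.0, width) - t) ** 2 for x, t in zip(INPUTS, TARGETS))


best_width, best_fitness = chaotic_search(width_fitness, 0.2, 3.0)
print(best_width, best_fitness)
```

In the full scheme described above, a search of this kind would be applied in step (vi) to the best individuals produced by the GA, and an individual would be replaced only when the chaotic search finds a higher fitness.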
The principle of the proposed data mining method: Figure 3 shows the framework of the proposed data mining method for the ITS.
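The first stage of this framework, unsupervised labelling by the SOFM, can be sketched in Python as follows. This is a simplified illustration under stated assumptions rather than the authors' code: the map is one-dimensional with three output neurons, the inputs are scalar flow values instead of traffic flow series, the weights start evenly spaced over the data range, and the learning-rate and neighbourhood schedules are arbitrary choices. The sample values and the function names are hypothetical.

```python
def best_matching_unit(sample, weights):
    """Index of the output neuron whose weight is closest to the sample."""
    return min(range(len(weights)), key=lambda k: abs(sample - weights[k]))


def train_sofm_1d(samples, n_neurons=3, epochs=50, lr0=0.5):
    """Train a one-dimensional Kohonen map on scalar samples; return the weights."""
    lo, hi = min(samples), max(samples)
    # Assumption: weights start evenly spaced over the data range (no random init).
    weights = [lo + (hi - lo) * k / (n_neurons - 1) for k in range(n_neurons)]
    for epoch in range(epochs):
        lr = lr0 * (1.0 - epoch / epochs)          # linearly decaying learning rate
        radius = 1 if epoch < epochs // 2 else 0   # neighbourhood shrinks halfway
        for s in samples:
            winner = best_matching_unit(s, weights)
            for k in range(n_neurons):
                if abs(k - winner) <= radius:
                    # The winner moves fully, its neighbours only a quarter as much.
                    influence = 1.0 if k == winner else 0.25
                    weights[k] += lr * influence * (s - weights[k])
    return weights


def label_and_group(samples, weights):
    """Label every sample with its winning neuron and group samples by label."""
    groups = {k: [] for k in range(len(weights))}
    for s in samples:
        groups[best_matching_unit(s, weights)].append(s)
    return groups


# Hypothetical flow readings with small, middle and large traffic levels.
flows = [8.0, 10.0, 12.0, 48.0, 50.0, 52.0, 88.0, 90.0, 92.0]
weights = train_sofm_1d(flows)
groups = label_and_group(flows, weights)
for label, members in groups.items():
    print(label, members)
```

Each group would then serve as the training set of its own RBF model in the second stage; that stage is not shown here.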

Fig. 3: Data mining method for traffic information prediction

Fig. 4: Data mining of the ITS data using SOFM (axes: Value 1, Value 2)

Fig. 5: Prediction results of small traffic flow (traffic flow against time in minutes; curves: actual results, RBF, GA-RBF, GA-chaos-RBF)

Fig. 6: Prediction results of middle traffic flow (traffic flow against time in minutes; curves: actual results, RBF, GA-RBF, GA-chaos-RBF)

Fig. 7: Prediction results of large traffic flow (traffic flow against time in minutes; curves: actual results, RBF, GA-RBF, GA-chaos-RBF)

EXPERIMENTAL ANALYSIS

A set of ITS data has been used to validate the new method in this study. Here, 15 data sets with unknown patterns were prepared for the traffic forecasting. The SOFM was first employed to cluster the ITS data. The input variables of the SOFM were traffic flow series and the output layer used 3 neurons. The data mining results are shown in Fig. 4. It can be seen in the figure that the ITS data can be clustered into 3 groups. We then analyzed the ITS data and found that these three clusters represent small, middle and large traffic flow, respectively. This clustering result agrees well with the physical truth of the testing ITS data. The classification result indicates that the hidden patterns can be identified efficiently by the SOFM and hence a reliable ANN model can be constructed with the labeled groups.

In order to forecast the traffic flow, the three labeled clusters are used to construct three RBF models that predict small, middle and large traffic flow, respectively. The prediction results are shown in Fig. 5 to 7. The RBF, GA-RBF and GA-chaos-RBF models have been compared in the traffic flow forecasting. The prediction rate of the RBF model is 83.5%, the prediction rate of the GA-RBF model is 90.5%, while that of the GA-chaos-RBF model is 95%. Hence, the new GA-chaos-RBF model is the best of these approaches in the traffic flow forecasting. With the proposed SOFM-GA-chaos-RBF data mining model, traffic flow can be forecasted accurately and optimized traffic management decisions can be provided.

CONCLUSION

Intelligent Transportation Systems (ITS) process a large amount of traffic information every hour. It is necessary to employ advanced data mining approaches to excavate the hidden knowledge in the ITS database. This study presents a new hybrid intelligent data mining model for traffic information forecasting. The new method combines the advantages of the unsupervised learning of the SOFM and the supervised learning of the RBF network to mine distinct and potential patterns in the traffic data. Moreover, the GA-chaos algorithm is adopted to optimize the RBF parameters. The experimental test results show that the presented data mining approach is feasible and efficient for extracting potential knowledge from ITS data. The prediction rate of the proposed SOFM-GA-chaos-RBF model is 95%, which is much better than that of the model without the optimization algorithm. The forecasting system proposed in this work may provide practical utility for ITS data mining. Further research can extend the proposed method to other complex information mining systems.

ACKNOWLEDGMENT

This study is sponsored by the National Natural Science Foundation of China (No. 5128394) and the National Science Foundation of Hubei Province of China (No. 212FFA99).

REFERENCES

Hauser, T. and W. Scherer, 2001. Data mining tools for real time traffic signal decision support and maintenance. Proceeding of the IEEE International Conference on Systems, Man and Cybernetics, 3: 1471-1477.

Jia, L., L. Yang, Q. Kong and S. Lin, 2006. Study of artificial immune clustering algorithm and its applications to urban traffic control. Int. J. Inform. Technol., 12: 1-9.
Jiang, Y., Z. Li and Y. Geng, 2010. Research on AR modeling method with SOFM-based classifier applied to gear multi-faults diagnosis. Proceeding of International Asia Conference on Informatics in Control, Automation and Robotics, 2: 488-491.
Kohonen, T., 1990. Derivation of a class of training algorithms. IEEE T. Neural Networks, 1: 229-232.
Krishna, B., 2012. Binary phase coded sequence generation using fractional order logistic equation. Circ. Syst. Signal Process, 31(1): 41-411.
Li, Z., X. Yan, C. Yuan, J. Zhao and Z. Peng, 2010. The fault diagnosis approach for gears using multidimensional features and intelligent classifier. Imeche. Sem. Worldwide, 41: 76-86.
Li, Z., X. Yan, C. Yuan, J. Zhao and Z. Peng, 2011a. Fault detection and diagnosis of the gearbox in marine propulsion system based on bispectrum analysis and artificial neural networks. J. Mar. Sci. Appl., 1: 17-24.
Li, Z., X. Yan, C. Yuan, Z. Peng and L. Li, 2011b. Virtual prototype and experimental research on gear multi-fault diagnosis using wavelet-autoregressive model and principal component analysis method. Mech. Syst. Signal Pr., 25: 2589-267.
Li, Z., X. Yan, Y. Jiang, L. Qin and J. Wu, 2012a. A new data mining approach for gear crack level identification based on manifold learning. Mechanika, 18: 29-34.
Li, Z., X. Yan, Z. Guo, P. Liu, C. Yuan and Z. Peng, 2012b. A new intelligent fusion method of multidimensional sensors and its application to tribosystem fault diagnosis of marine diesel engines. Tribol. Lett., 47: 1-15.
Li, Z., X. Yan, C. Yuan and Z. Peng, 2012c. Intelligent fault diagnosis method for marine diesel engines using instantaneous angular speed. J. Mech. Sci. Technol., 26(8): 2413-2423.
Nejad, S., F. Seifi, H. Ahmadi and N. Seifi, 2009. Applying data mining in prediction and classification of urban traffic. Proceeding of the WRI World Congress on Computer Science and Information Engineering, 3: 674-678.
Park, B., D. Lee and H. Yun, 2003. Enhancement of time of day based traffic signal control. Proceeding of the IEEE International Conference on Systems, Man and Cybernetics, 4: 3619-3624.
Raahemi, B., A. Kouznetsov, A. Hayajneh and P. Rabinovitch, 2008. Classification of peer-to-peer traffic using incremental neural networks (fuzzy ARTMAP). Proceeding of IEEE Canadian Conference on Electrical and Computer Engineering, pp: 719-724.
Wen, Y. and T. Lee, 2005. Fuzzy data mining and grey recurrent neural network forecasting for traffic information systems. Proceeding of the IEEE International Conference on Information Reuse and Integration, pp: 356-361.
Xu, P. and S. Lin, 2009. Internet traffic classification using C4.5 decision tree. J. Softw., 2(1): 2692-274.
Zahra, Z., P. Mahmoud and S. Hossein, 2010. Application of data mining in traffic management: Case of city of Isfahan. Proceeding of the International Conference on Electronic Computer Technology, pp: 12-16.