Metaheuristics in Big Data: An Approach to Railway Engineering




Silvia Galván Núñez (1,2) and Prof. Nii Attoh-Okine (1,3)
(1) Department of Civil and Environmental Engineering, University of Delaware, Newark, DE, USA
(2) silgalnu@udel.edu, (3) okine@udel.edu
IEEE Workshop on Large Data Analytics in Transportation Engineering, October 27, 2014

Outline
1 Introduction
2 Big Data and Optimization
3 Big Data in Railway Engineering
4 Conclusion and Future Work

Section 1 Introduction

Introduction
A huge amount of data is generated on the internet every minute:
- Facebook users share over 2.5 million pieces of content.
- Email users send over 200 million messages.
- Google receives 4 million queries.
Source: http://www.domo.com/blog/2014/04/data-never-sleeps-2-0/

Introduction: Railway Engineering
- Bearing temperature detectors for a Class I US railroad generate 3 terabytes of data in a year [Li et al., 2014].
- Data collected from different sensors need to be integrated to predict track failures [Xie and Liu, 2010].

Introduction
Figure 1: Five V's of Big Data ([Costa, 2013], [Lusher et al., 2013], [Cheng et al., 2013], [Pandey and Nepal, 2013], [Markowetz et al., 2014], [Chen, 2014])

Big Data Analytics Techniques - Some Examples
Figure 2: Example of a framework for Big Data Analytics ([Li et al., 2013], [Tannahill and Jamshidi, 2014])


Section 2 Big Data and Optimization

Optimization Techniques
Figure 3: Classification of Optimization Techniques


Big Data Analytics using Optimization Techniques

Optimization technique: Authors
Evolutionary optimization: [Fan et al., 2000], [Chen, 2008], [Lee et al., 2012], [Chen et al., 2013], [Cambria et al., 2013], [Thomas and Jin, 2013], [Zhu et al., 2014], [Balicki et al., 2014], [Lee et al., 2014], [Liu, 2014], [Tannahill and Jamshidi, 2014]
Ant colony optimization: [Yang and Chen, 2006], [Sun et al., 2013], [Wu et al., 2013], [Zhang et al., 2014]
Particle swarm optimization: [Chaari et al., 2012], [Chang et al., 2013], [Fong et al., 2013], [Govindarajan et al., 2013], [Cheng et al., 2013], [Jian and Wang, 2014]
Greedy algorithm: [Chung et al., 2013], [Mestre and Pires, 2013], [Tan et al., 2013], [Lin et al., 2013], [Wang et al., 2014]
Hierarchical scheduling: [Liu et al., 2013]
Neighbor embedding: [Yang and Fong, 2013]
l1-regularized optimization: [Saha et al., 2013], [Tran et al., 2013]
Horizontal data partitioning: [Bellatreche et al., 2013]
Method of multipliers: [Liu et al., 2013], [Anbari et al., 2014]
Simulated annealing: [Rahimian et al., 2014]
Gradient descent: [Qin and Rusu, 2013], [Ahmadi et al., 2014], [Mittal et al., 2014]
Artificial immune systems: [Cabanas-Abascal et al., 2013]

Metaheuristics in Big Data
Advantages
- Adequate for solving NP-hard problems.
- Mainly used for feature extraction and dimension reduction.
- Potential to address multi-objective problems.
Challenges
- The fitness function or the processed data is noisy.
- The fitness function suffers from approximation errors.
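To make the feature-selection use case and the noisy-fitness challenge concrete, here is a minimal sketch of a population-based metaheuristic (a simple genetic algorithm) that searches over binary feature masks. It is an illustration only, not the method from the slides: the fitness function, the assumption that the first three features are the informative ones, and all parameter values are hypothetical. Averaging repeated evaluations is one common way to dampen fitness noise.

```python
import random

def noisy_fitness(mask, rng, noise=0.1):
    # Hypothetical fitness: features 0-2 are assumed informative (+1 each),
    # every other selected feature costs 0.2, and Gaussian noise mimics
    # a noisy evaluation on real data.
    informative = sum(mask[:3])
    penalty = 0.2 * sum(mask[3:])
    return informative - penalty + rng.gauss(0.0, noise)

def averaged_fitness(mask, rng, repeats=5):
    # Average several noisy evaluations to reduce the effect of the noise.
    return sum(noisy_fitness(mask, rng) for _ in range(repeats)) / repeats

def genetic_feature_selection(n_features=10, pop_size=20, generations=40, seed=0):
    rng = random.Random(seed)
    # Start from random binary masks (1 = feature selected).
    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    for _ in range(generations):
        # Truncation selection: keep the better half as parents.
        ranked = sorted(pop, key=lambda m: averaged_fitness(m, rng), reverse=True)
        parents = ranked[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)   # one-point crossover
            child = a[:cut] + b[cut:]
            i = rng.randrange(n_features)
            if rng.random() < 0.2:               # occasional bit-flip mutation
                child[i] = 1 - child[i]
            children.append(child)
        pop = parents + children
    # Use more repeats for the final choice so noise is less likely to mislead it.
    return max(pop, key=lambda m: averaged_fitness(m, rng, repeats=20))

best_mask = genetic_feature_selection()
print(best_mask)
```

In a real setting the fitness would be something like the validation score of a model trained on the selected features, which is exactly where the noise and approximation errors listed above come from.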

Section 3 Big Data in Railway Engineering

Applications
Learning to Predict Train Wheel Failures [Yang and Létourneau, 2005]
- Goal: Optimize maintenance and operation of trains.
- Approach: Decision Trees and Naïve Bayes.
A Simple and Efficient Parallel Approach to Large-Scale Railway Freight Data Analysis [Xie and Li, 2010]
- Goal: Integrate nationally distributed railway freight data sets.
- Approach: Parallel optimization techniques.
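As a rough illustration of the Naïve Bayes approach named above, the following sketch fits a Gaussian Naïve Bayes classifier to a handful of sensor readings. It is not the implementation from [Yang and Létourneau, 2005]: the two features (bearing temperature and vibration level), the labels, and every data value are synthetic assumptions chosen only to make the example runnable.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(rows, labels):
    # Group the training rows by class label.
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    model = {}
    for label, members in by_class.items():
        n = len(members)
        columns = list(zip(*members))
        means = [sum(col) / n for col in columns]
        # Small constant keeps the variance strictly positive.
        variances = [sum((x - m) ** 2 for x in col) / n + 1e-6
                     for col, m in zip(columns, means)]
        model[label] = (n / len(rows), means, variances)
    return model

def predict(model, row):
    def log_posterior(label):
        prior, means, variances = model[label]
        score = math.log(prior)
        # Naive assumption: features are independent given the class.
        for x, m, v in zip(row, means, variances):
            score += -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        return score
    return max(model, key=log_posterior)

# Synthetic readings: (bearing temperature in deg C, vibration level).
rows = [(40, 0.2), (42, 0.3), (45, 0.25), (80, 0.9), (85, 1.1), (78, 0.8)]
labels = ["ok", "ok", "ok", "failure", "failure", "failure"]
model = fit_gaussian_nb(rows, labels)
print(predict(model, (43, 0.28)), predict(model, (82, 1.0)))
```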

Applications
Improving rail network velocity: A machine learning approach to predictive maintenance [Li et al., 2013]
- Data from disparate sources: historical detector data, failure data, maintenance action data, inspection schedule data, train type data, and weather data.
- Approach: Principal Component Analysis, Support Vector Machine.
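Combining disparate sources usually starts with joining records on shared keys before any modeling step. The sketch below shows one way this could look for three of the source types listed above. The record layout, field names (`car_id`, `date`, `bearing_temp_c`, `ambient_temp_c`), and values are all hypothetical and do not come from [Li et al., 2013].

```python
def integrate(detector, failures, weather):
    # Index the secondary sources by their join keys.
    failed = {(f["car_id"], f["date"]) for f in failures}
    weather_by_date = {w["date"]: w for w in weather}
    merged = []
    for reading in detector:
        day = weather_by_date.get(reading["date"], {})
        merged.append({
            "car_id": reading["car_id"],
            "date": reading["date"],
            "bearing_temp_c": reading["bearing_temp_c"],
            # None marks a day with no weather record.
            "ambient_temp_c": day.get("ambient_temp_c"),
            "failed": (reading["car_id"], reading["date"]) in failed,
        })
    return merged

# Synthetic example records, for illustration only.
detector = [
    {"car_id": "A1", "date": "2014-01-05", "bearing_temp_c": 82.0},
    {"car_id": "B7", "date": "2014-01-05", "bearing_temp_c": 41.5},
    {"car_id": "A1", "date": "2014-01-06", "bearing_temp_c": 44.0},
]
failures = [{"car_id": "A1", "date": "2014-01-05"}]
weather = [{"date": "2014-01-05", "ambient_temp_c": -3.0}]

merged = integrate(detector, failures, weather)
for record in merged:
    print(record)
```

Each merged record can then serve as one row of the feature table that a dimension-reduction or classification step consumes.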

Findings
- Emerging area.
- Need to integrate data from different sources.

Section 4 Conclusion and Future Work

Conclusion
- Potential to use population-based metaheuristics for Big Data Analytics applied in railway engineering.
- Opportunity to improve reliability and safety in railway engineering.
Future work
- Identification of the railway data sets.
- Metaheuristic selection and definition of its use in the Big Data Analytics framework.
