

ISSN: X CODEN: IJPTFI
Available Online through www.ijptonline.com
Research Article

EFFICIENT TECHNIQUES TO DEAL WITH BIG DATA CLASSIFICATION PROBLEMS

G.Somasekhar 1 *, Dr. K.Karthikeyan 2
1 Research Scholar/SCSE, VIT University, Vellore, Tamil Nadu, India.
2 Associate Professor/SAS, VIT University, Vellore, Tamil Nadu, India.
gidd.somasekhar2014@vit.ac.in, k.karthikeyan@vit.ac.in
Received on Accepted on

Abstract

Big data analytics is the process of examining large data sets containing a variety of data types, i.e., big data, to uncover hidden patterns, unknown correlations, market trends, customer preferences and other useful business information. It is an extension of data mining. Many traditional data mining techniques can be made suitable for big data, either through small modifications or by combining more than one data mining technique. MapReduce is a technique that can be used in many big data applications. The main focus of this paper is big data classification. The techniques that can be used for big data classification are discussed. These classification techniques can be implemented in all big data scenarios.

Keywords: Classification, big data, analytics, mapreduce, ensemble learning, fuzzy set approach, incremental algorithm, semi-supervised learning.

Introduction

Big data management is becoming crucial nowadays because of the growth of social networking websites, powerful mobile devices, sensors, and cloud computing. According to IDC analysis, the global data volume is going to grow 44 times between 2009 and [...]. It may even grow beyond what we can control. The existing technology and infrastructure may not be able to maintain such large volumes of data. Big data has become a buzzword, leading to revolutionary changes in data processing, data storage and data analytics. As information is a basic need for any kind of development, society needs more big data techniques to extract useful information from big data. Big data tools like Hadoop have evolved as primary tools for any developing organization to deal with big data.

IJPT Sep-2015 Vol. 7 Issue No Page 8942

Characteristics of Big Data and application of big data analytics

Big data has 4 V's, which are termed the fundamental characteristics of big data.
i) Volume: Real-time big data scenarios demand huge volumes of data, ranging from petabytes to exabytes and possibly beyond.
ii) Variety: Collection of different types of data from different data sources.
iii) Velocity: Speed of data generation and data updating.
iv) Value: Valuable knowledge extraction and decision making.

Some of the challenges of big data include limited main memory, data security, data recovery, data processing, and maintaining the balance between ethical values and big data management.

Big data analytics evolved as the subject of extracting interesting patterns from big data to support the decision making process. It is applied in many fields such as finance, medicine, bio-informatics, science, space and the retail industry. Some applications and big data algorithms are mentioned in Table-1.

Table-1: Applications of big data, the algorithms and computing methods.

Big Data Classification: Problems

Classification is a data mining technique that extracts categorical labels or classes from a large data set. When this data set has the characteristics of big data, the task is termed big data classification. As big data poses several challenges (mentioned in Table-1, section 2), traditional classification techniques may not be suitable for the big data classification problem. Modifying conventional classification algorithms and applying a big data technique is unavoidable to meet big data processing needs. In addition, big data classification algorithms need to be scalable and incremental. They have to solve the problems of big stream data such as concept drift, infinite length, concept evolution, and feature evolution. The following section briefly explains efficient big data classification techniques.

G.Somasekhar* et al. International Journal Of Pharmacy & Technology

Techniques for Big Data Classification

a) Application of MapReduce to traditional data classification algorithms [6]:
Both lazy learners and eager learners can potentially be subjected to MapReduce to obtain accurate classification results in less time. Lazy learners include the K-Nearest Neighbor classifier and Case Based Reasoning (CBR). Eager learners include classification algorithms such as Bayesian classification, decision tree induction, rule based classification, classification by back propagation and so on. The MapReduce strategy and its application to data classification algorithms are depicted in fig 1 and fig 2 below.

Fig 1: Sample application of MapReduce for shape counter.

Fig 2: Application of MapReduce on a traditional data classification technique.

b) Semi supervised learning:
Semi-supervised learning [5] is a mixture of data classification (supervised learning) and data clustering (unsupervised learning), as depicted in fig 3 below. Building classifiers becomes difficult, labour intensive, costly and time consuming in real-time big data scenarios. It is often the case that only a small number of labeled samples is available to train a few classifiers, while a large number of unlabeled samples is available to build clusters from big data. In such cases this classification technique can be chosen.
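The idea of combining a few labeled samples with many unlabeled ones can be illustrated with a minimal cluster-then-label sketch. This is not the mixed classifier and cluster ensemble of [5]; it is a simpler stand-in technique, and the toy points, the labels "A" and "B", and all function names are assumptions made for illustration. Clusters are built from the unlabeled samples with plain k-means, each cluster takes the majority label of the labeled samples that fall into it, and a new point receives the label of its nearest cluster.

```python
from collections import Counter
from math import dist


def nearest(point, centroids):
    """Index of the centroid closest to `point` (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))


def kmeans(points, centroids, iterations=10):
    """Plain k-means starting from the given centroids; returns final centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Recompute each centroid as the mean of its members (keep it if empty).
        centroids = [
            tuple(sum(axis) / len(members) for axis in zip(*members)) if members else old
            for members, old in zip(clusters, centroids)
        ]
    return centroids


def label_clusters(centroids, labeled):
    """Give each cluster the majority label of the labeled samples it attracts."""
    votes = [Counter() for _ in centroids]
    for point, label in labeled:
        votes[nearest(point, centroids)][label] += 1
    return [v.most_common(1)[0][0] if v else None for v in votes]


def classify(point, centroids, cluster_labels):
    """Label a new point with the label of its nearest cluster."""
    return cluster_labels[nearest(point, centroids)]


# Hypothetical toy data: many unlabeled points, only two labeled ones.
unlabeled = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labeled = [((1.1, 1.0), "A"), ((5.1, 5.0), "B")]

seeds = [point for point, _ in labeled]   # assumption: seed k-means with labeled points
centroids = kmeans(unlabeled, seeds)
cluster_labels = label_clusters(centroids, labeled)
print(classify((0.9, 1.2), centroids, cluster_labels))
print(classify((4.7, 5.3), centroids, cluster_labels))
```

Seeding k-means with the labeled points keeps the sketch deterministic; in a real stream setting the clusters would be rebuilt per data chunk.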

Fig 3: Generation of mixed ensemble by semi-supervised learning.

c) Fuzzy set approach:
In fig. 4, the membership values of x in the fuzzy sets do not have to total 1. Each x may be a member of two or more fuzzy sets. (Here x is the value of income.)

Fig 4: Graph representation of fuzzy membership values for fuzzy sets low_income, medium_income and high_income in a sample employee data set.

For example, m_medium_income($49k) = 0.15 and m_high_income($49k) = 0.96, so m_medium_income($49k) + m_high_income($49k) = 1.11 ≠ 1 (where m( ) is the membership function).

The above approach is called the fuzzy set approach [7], which is based on fuzzy set theory. Fuzzy set theory is also known as possibility theory, which is very useful in dealing with vague or inexact facts in big data applications. Fuzzy rules and fuzzy models can be derived from fuzzy sets. Fuzzy logic systems can be used in numerous areas for big data classification, including market research, finance, health care, and environmental engineering.
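The membership idea can be made concrete with a minimal sketch. The shape of the curves in fig. 4 is not recoverable here, so the piecewise-linear functions and every breakpoint below (10k, 25k, 32k, 50k, 52k) are hypothetical, chosen only so that an income of $49k reproduces the two values quoted above. The function names are likewise assumptions.

```python
def ramp_up(x, lo, hi):
    """Membership rising linearly from 0 at `lo` to 1 at `hi`."""
    return min(1.0, max(0.0, (x - lo) / (hi - lo)))


def ramp_down(x, lo, hi):
    """Membership falling linearly from 1 at `lo` to 0 at `hi`."""
    return min(1.0, max(0.0, (hi - x) / (hi - lo)))


def income_memberships(income):
    """Degree of membership of `income` in each fuzzy set (hypothetical breakpoints)."""
    return {
        "low_income": ramp_down(income, 10_000, 25_000),
        "medium_income": min(ramp_up(income, 10_000, 25_000),
                             ramp_down(income, 32_000, 52_000)),
        "high_income": ramp_up(income, 25_000, 50_000),
    }


def classify_income(income):
    """Pick the fuzzy set in which the income has the highest membership."""
    memberships = income_memberships(income)
    return max(memberships, key=memberships.get)


m = income_memberships(49_000)
print(m)                        # medium_income is about 0.15, high_income about 0.96
print(sum(m.values()))          # about 1.11, so the memberships need not total 1
print(classify_income(49_000))
```

Taking the set with the largest membership is only one simple way to turn memberships into a class label; fuzzy rule systems usually combine several attributes before deciding.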

d) Incremental learning:
Properties of incremental classification [8]:
i) Updates a classifier dynamically using the test data.
ii) No need to store all training data in main memory.
iii) Flexibility to modify a model based on newly trained records.
iv) The classifier can adapt to the gradual concept drift problem of big stream data.

Example: Very Fast Decision Tree (VFDT) and Concept-adapting Very Fast Decision Tree (CVFDT) algorithms.
The traditional Hoeffding tree algorithm is modified in three ways to obtain the VFDT algorithm. The three modifications are:
i) Aggressive breaking of near ties during attribute selection.
ii) Deactivating the least promising leaves.
iii) Dropping poor splitting attributes.
To handle the concept drift problem efficiently, VFDT is in turn extended into CVFDT. Both VFDT and CVFDT can deal with big stream data classification problems while improving speed, memory utilization and scalability. Other examples include the incremental decision tree, incremental Bayesian classification and so on.

e) Ensemble learning:
Incremental learning cannot remove old records from a classifier. This limitation of incremental learning led to a new learning method called ensemble learning [8]. It divides a large data stream into small data chunks. For each chunk, an independent classifier is built. Finally, the n top-ranked classifiers are selected using heuristic methods, where n is the size of the ensemble, and majority voting is applied to obtain the label for a test tuple, as depicted in fig 5.

Fig 5: Ensemble learning.
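The chunk-based ensemble described above can be sketched in a few lines. The paper only says that the top classifiers are chosen "based on heuristic methods", so ranking by accuracy on the most recent chunk is an assumption here, as are the nearest-class-mean base classifier, the one-dimensional toy stream and all names.

```python
from collections import Counter, defaultdict


def train_centroid_classifier(chunk):
    """Train a nearest-class-mean classifier on one chunk of (x, label) pairs."""
    sums, counts = defaultdict(float), Counter()
    for x, label in chunk:
        sums[label] += x
        counts[label] += 1
    means = {label: sums[label] / counts[label] for label in counts}
    return lambda x: min(means, key=lambda label: abs(x - means[label]))


def accuracy(classifier, chunk):
    """Fraction of tuples in `chunk` that the classifier labels correctly."""
    return sum(classifier(x) == label for x, label in chunk) / len(chunk)


def build_ensemble(stream, chunk_size, n):
    """Split the stream into chunks, train one classifier per chunk, keep the best n."""
    chunks = [stream[i:i + chunk_size] for i in range(0, len(stream), chunk_size)]
    classifiers = [train_centroid_classifier(chunk) for chunk in chunks]
    latest = chunks[-1]  # assumed heuristic: rank by accuracy on the newest chunk
    ranked = sorted(classifiers, key=lambda clf: accuracy(clf, latest), reverse=True)
    return ranked[:n]


def predict(ensemble, x):
    """Majority vote over the ensemble members."""
    votes = Counter(clf(x) for clf in ensemble)
    return votes.most_common(1)[0][0]


# Hypothetical stream of (feature, label) tuples, processed in chunks of four.
stream = [
    (1.0, "low"), (9.0, "high"), (2.0, "low"), (8.0, "high"),
    (1.5, "low"), (8.5, "high"), (2.5, "low"), (7.5, "high"),
    (3.0, "low"), (7.0, "high"), (2.0, "low"), (9.0, "high"),
]
ensemble = build_ensemble(stream, chunk_size=4, n=3)
print(predict(ensemble, 2.2))
print(predict(ensemble, 6.8))
```

Only the trained classifiers are retained, not the chunks themselves, and classifiers that score poorly on recent data simply drop out of the top n.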

Advantages of ensemble learning:
i) As each data chunk is small compared to the entire data, the classifier construction cost per chunk is low.
ii) As the classifier of a chunk is stored instead of the chunk itself, memory is saved.
iii) It can also adapt to the severe concept drift problem of big stream data.

f) Genetic Algorithms:
Genetic algorithms (GAs) are a particular class of evolutionary algorithms involving inheritance, mutation, selection and cross-over techniques inspired by biology. GAs use binary strings to encode the features of an individual. The main advantage of GAs is that they are easily parallelizable and therefore well suited to big data classification. The MapReduce strategy is well applicable here.

Basic Genetic Algorithm
Step 1: Randomly select initial population.
Step 2: Repeat the following steps until terminated.
Step 3: Evaluate each individual's fitness.
Step 4: Prune population.
Step 5: Select pairs to mate from best-ranked individuals.
Step 6: Replenish population using selected pairs (apply cross-over, mutation etc.).
Step 7: Add or replace generated member to population.
Step 8: Check for termination criteria.
Step 9: End repeat.

g) Rough set approach:
Rough set theory [10] is a powerful mathematical tool developed by Z. Pawlak in the early 1980s. It can be applied widely to extract knowledge from databases. It discovers hidden patterns in data by identifying partial and total dependencies in the data. It also works with null or missing values. Rough set methods work very well in dealing with uncertainties. Rough sets can be used together with other methods such as fuzzy sets, statistical methods and genetic algorithms to obtain mixed benefits, or they can be combined with MapReduce to gain scalability in real-time big data scenarios. A rough set depends on upper and lower approximations, which are explained below based on fig. 6.
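As a small illustration of the two approximations, the sketch below groups objects that are indistinguishable on the chosen attributes into equivalence classes. A class goes into the lower approximation when all its members belong to C, and into the upper approximation when at least one member does. The toy records, attribute names and function name are assumptions made for illustration.

```python
from collections import defaultdict


def approximations(objects, attributes, target):
    """Return (lower, upper, boundary) approximations of the set `target`.

    objects:    dict mapping object id -> dict of attribute values
    attributes: attribute names used to decide indistinguishability
    target:     set of object ids that belong to class C
    """
    # Objects with identical values on `attributes` form one equivalence class.
    classes = defaultdict(set)
    for obj_id, values in objects.items():
        classes[tuple(values[a] for a in attributes)].add(obj_id)

    lower, upper = set(), set()
    for members in classes.values():
        if members <= target:      # every member is in C: certainly in C
            lower |= members
        if members & target:       # some member is in C: possibly in C
            upper |= members
    return lower, upper, upper - lower


# Hypothetical toy data: objects 3 and 4 look identical but only 3 is in C.
objects = {
    1: {"income": "high", "credit": "good"},
    2: {"income": "high", "credit": "good"},
    3: {"income": "low", "credit": "good"},
    4: {"income": "low", "credit": "good"},
    5: {"income": "low", "credit": "bad"},
}
class_c = {1, 2, 3}

lower, upper, boundary = approximations(objects, ["income", "credit"], class_c)
print(sorted(lower), sorted(upper), sorted(boundary))
```

Because objects 3 and 4 cannot be told apart on these attributes, neither can be placed in C with certainty, so both end up in the boundary region and the set is rough.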

1. Lower approximation: The lower approximation consists of all the data tuples that, based on the knowledge of the attributes, belong to class C without any ambiguity.
2. Upper approximation: The upper approximation consists of all the tuples that possibly belong to class C, i.e., those that cannot be described as not belonging to class C based on the knowledge of the attributes.
3. Boundary region: The difference between the upper and lower approximations defines the boundary region of the rough set. The set is crisp if the boundary region is empty, and rough if the boundary region is nonempty.
Rough sets deal with the vagueness and uncertainty emphasized in decision making. The equivalence classes are represented by rectangular regions in fig. 6.

Fig 6: Rough set approximation of the tuples belonging to class C, using upper and lower approximation sets.

h) Swarm intelligence:
Swarm intelligence is inspired by the swarm behavior [9] of insects and bird flocks. A swarm is generally a group of several agents helping each other to achieve a common goal. The agents follow local rules to execute their actions, and with the help of the entire group they achieve their objective. Particle Swarm Optimization (PSO) and Ant Colony Optimization (ACO) are the most popular examples of swarm intelligence. This technique can be implemented in real time where big data is distributed over a huge network.

i) Artificial neural networks:
Artificial neural networks [9] are computational models that consist of a number of processing units that communicate with one another over a large network by sending signals. They are inspired by the human brain. In biological terms, a neuron collects signals from other neurons through dendrites. The most important feature of this technique is that it learns from examples, so explicit programming can be avoided. A sample neural network approach is depicted in fig 7 below.

Fig 7: A sample neural network approach.

j) Co-evolutionary programming [9]:
It is based on the fact that the individuals of two populations evolve either by competing against each other or by cooperating with each other. A fitness function involving the relationship with other individuals is used in this technique. In the competitive approach, the fitness of an individual in one population is based entirely on the fitness of an individual in the other population, whereas in the cooperative approach, the fitness of an individual depends purely on the degree of cooperation with an individual in the other population. This technique can be applied in big data classification algorithms.

Conclusion:
Big data management is becoming crucial nowadays. Traditional data mining techniques need to be reviewed and modified to deal with big data problems. This paper focused on big data classification. Efficient big data classification strategies and techniques are discussed. Big data research should be encouraged to obtain new ideas for handling big data. In future, we will focus on big data clustering.

References
1. G. Noseworthy, Infographic: Managing the Big Flood of Big Data in Digital Marketing, hic-big-flood-of-big-data-in-digitalmarketing/.
2. H. Moed, The Evolution of Big Data as a Research and Scientific Topic: Overview of the Literature, 2012, ResearchTrends,
3. MIKE 2.0, Big Data Definition, Definition.

4. P. Zikopoulos, T. Deutsch, D. Deroos, Harness the Power of Big Data, 2012, s-power-big-data-book-excerpt.
5. Peng Zhang, Xingquan Zhu, Jianlong Tan and Li Guo, Classifier and Cluster Ensembles for Mining Concept Drifting Data Streams, 10th IEEE International Conference on Data Mining, pp ,
6. Ahsanul Haque, Brandon Parker and Latifur Khan, Labeling Instances in Evolving Data Streams with MapReduce, Big Data Congress, IEEE, pp ,
7. Mukkamala, R.R., Hussain, A., Vatrapu, R., Fuzzy-Set Based Sentiment Analysis of Big Social Data, In Proc. of IEEE 18th International Enterprise Distributed Object Computing Conference, pp. 71-80,
8. Wenyu Zang, Peng Zhang, Chuan Zhou and Li Guo, Comparative study between incremental and ensemble learning on data streams: Case study, Journal of Big Data 2014, 1:5, Springer,
9. Neha Khan, Mohd Shahid Husain, Mohd Rizwan Beg, Big Data Classification using Evolutionary Techniques: A Survey, In Proc. of IEEE International Conference on Engineering and Technology (ICETECH), pp ,
10. Prachi Patil, Data Mining with Rough Set Using Map-Reduce, International Journal of Innovative Research in Computer and Communication Engineering, Vol. 2, Issue 11, pp ,

Corresponding Author: G.Somasekhar*, gidd.somasekhar2014@vit.ac.in


More information

Improving Decision Making and Managing Knowledge

Improving Decision Making and Managing Knowledge Improving Decision Making and Managing Knowledge Decision Making and Information Systems Information Requirements of Key Decision-Making Groups in a Firm Senior managers, middle managers, operational managers,

More information

A hybrid Approach of Genetic Algorithm and Particle Swarm Technique to Software Test Case Generation

A hybrid Approach of Genetic Algorithm and Particle Swarm Technique to Software Test Case Generation A hybrid Approach of Genetic Algorithm and Particle Swarm Technique to Software Test Case Generation Abhishek Singh Department of Information Technology Amity School of Engineering and Technology Amity

More information

GA as a Data Optimization Tool for Predictive Analytics

GA as a Data Optimization Tool for Predictive Analytics GA as a Data Optimization Tool for Predictive Analytics Chandra.J 1, Dr.Nachamai.M 2,Dr.Anitha.S.Pillai 3 1Assistant Professor, Department of computer Science, Christ University, Bangalore,India, chandra.j@christunivesity.in

More information

Big Data: Study in Structured and Unstructured Data

Big Data: Study in Structured and Unstructured Data Big Data: Study in Structured and Unstructured Data Motashim Rasool 1, Wasim Khan 2 mail2motashim@gmail.com, khanwasim051@gmail.com Abstract With the overlay of digital world, Information is available

More information

Chapter 6. The stacking ensemble approach

Chapter 6. The stacking ensemble approach 82 This chapter proposes the stacking ensemble approach for combining different data mining classifiers to get better performance. Other combination techniques like voting, bagging etc are also described

More information

Open Access Research on Application of Neural Network in Computer Network Security Evaluation. Shujuan Jin *

Open Access Research on Application of Neural Network in Computer Network Security Evaluation. Shujuan Jin * Send Orders for Reprints to reprints@benthamscience.ae 766 The Open Electrical & Electronic Engineering Journal, 2014, 8, 766-771 Open Access Research on Application of Neural Network in Computer Network

More information

A Binary Model on the Basis of Imperialist Competitive Algorithm in Order to Solve the Problem of Knapsack 1-0

A Binary Model on the Basis of Imperialist Competitive Algorithm in Order to Solve the Problem of Knapsack 1-0 212 International Conference on System Engineering and Modeling (ICSEM 212) IPCSIT vol. 34 (212) (212) IACSIT Press, Singapore A Binary Model on the Basis of Imperialist Competitive Algorithm in Order

More information

Big Data Analytics. An Introduction. Oliver Fuchsberger University of Paderborn 2014

Big Data Analytics. An Introduction. Oliver Fuchsberger University of Paderborn 2014 Big Data Analytics An Introduction Oliver Fuchsberger University of Paderborn 2014 Table of Contents I. Introduction & Motivation What is Big Data Analytics? Why is it so important? II. Techniques & Solutions

More information

A Survey on Parallel Method for Rough Set using MapReduce Technique for Data Mining

A Survey on Parallel Method for Rough Set using MapReduce Technique for Data Mining www.ijecs.in International Journal Of Engineering And Computer Science ISSN: 2319-7242 Volume 4 Issue 9 Sep 2015, Page No. 14160-14163 A Survey on Parallel Method for Rough Set using MapReduce Technique

More information

A Survey of Classification Techniques in the Area of Big Data.

A Survey of Classification Techniques in the Area of Big Data. A Survey of Classification Techniques in the Area of Big Data. 1PrafulKoturwar, 2 SheetalGirase, 3 Debajyoti Mukhopadhyay 1Reseach Scholar, Department of Information Technology 2Assistance Professor,Department

More information

A Survey on Intrusion Detection System with Data Mining Techniques

A Survey on Intrusion Detection System with Data Mining Techniques A Survey on Intrusion Detection System with Data Mining Techniques Ms. Ruth D 1, Mrs. Lovelin Ponn Felciah M 2 1 M.Phil Scholar, Department of Computer Science, Bishop Heber College (Autonomous), Trichirappalli,

More information

SOFT COMPUTING AND ITS USE IN RISK MANAGEMENT

SOFT COMPUTING AND ITS USE IN RISK MANAGEMENT SOFT COMPUTING AND ITS USE IN RISK MANAGEMENT doc. Ing. Petr Dostál, CSc. Brno University of Technology, Kolejní 4, 612 00 Brno, Czech Republic, Institute of Informatics, Faculty of Business and Management,

More information

Effective Data Mining Using Neural Networks

Effective Data Mining Using Neural Networks IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL. 8, NO. 6, DECEMBER 1996 957 Effective Data Mining Using Neural Networks Hongjun Lu, Member, IEEE Computer Society, Rudy Setiono, and Huan Liu,

More information

An Introduction to Data Mining

An Introduction to Data Mining An Introduction to Intel Beijing wei.heng@intel.com January 17, 2014 Outline 1 DW Overview What is Notable Application of Conference, Software and Applications Major Process in 2 Major Tasks in Detail

More information

Big Data Classification: Problems and Challenges in Network Intrusion Prediction with Machine Learning

Big Data Classification: Problems and Challenges in Network Intrusion Prediction with Machine Learning Big Data Classification: Problems and Challenges in Network Intrusion Prediction with Machine Learning By: Shan Suthaharan Suthaharan, S. (2014). Big data classification: Problems and challenges in network

More information

Classification and Prediction

Classification and Prediction Classification and Prediction Slides for Data Mining: Concepts and Techniques Chapter 7 Jiawei Han and Micheline Kamber Intelligent Database Systems Research Lab School of Computing Science Simon Fraser

More information

Keywords Big Data; OODBMS; RDBMS; hadoop; EDM; learning analytics, data abundance.

Keywords Big Data; OODBMS; RDBMS; hadoop; EDM; learning analytics, data abundance. Volume 4, Issue 11, November 2014 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Analytics

More information

Volume 3, Issue 6, June 2015 International Journal of Advance Research in Computer Science and Management Studies

Volume 3, Issue 6, June 2015 International Journal of Advance Research in Computer Science and Management Studies Volume 3, Issue 6, June 2015 International Journal of Advance Research in Computer Science and Management Studies Research Article / Survey Paper / Case Study Available online at: www.ijarcsms.com Image

More information

COMP3420: Advanced Databases and Data Mining. Classification and prediction: Introduction and Decision Tree Induction

COMP3420: Advanced Databases and Data Mining. Classification and prediction: Introduction and Decision Tree Induction COMP3420: Advanced Databases and Data Mining Classification and prediction: Introduction and Decision Tree Induction Lecture outline Classification versus prediction Classification A two step process Supervised

More information

Constrained Classification of Large Imbalanced Data by Logistic Regression and Genetic Algorithm

Constrained Classification of Large Imbalanced Data by Logistic Regression and Genetic Algorithm Constrained Classification of Large Imbalanced Data by Logistic Regression and Genetic Algorithm Martin Hlosta, Rostislav Stríž, Jan Kupčík, Jaroslav Zendulka, and Tomáš Hruška A. Imbalanced Data Classification

More information

Associate Professor, Department of CSE, Shri Vishnu Engineering College for Women, Andhra Pradesh, India 2

Associate Professor, Department of CSE, Shri Vishnu Engineering College for Women, Andhra Pradesh, India 2 Volume 6, Issue 3, March 2016 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Special Issue

More information

Clustering Big Data. Anil K. Jain. (with Radha Chitta and Rong Jin) Department of Computer Science Michigan State University November 29, 2012

Clustering Big Data. Anil K. Jain. (with Radha Chitta and Rong Jin) Department of Computer Science Michigan State University November 29, 2012 Clustering Big Data Anil K. Jain (with Radha Chitta and Rong Jin) Department of Computer Science Michigan State University November 29, 2012 Outline Big Data How to extract information? Data clustering

More information

Processing of Big Data. Nelson L. S. da Fonseca IEEE ComSoc Summer Scool Trento, July 9 th, 2015

Processing of Big Data. Nelson L. S. da Fonseca IEEE ComSoc Summer Scool Trento, July 9 th, 2015 Processing of Big Data Nelson L. S. da Fonseca IEEE ComSoc Summer Scool Trento, July 9 th, 2015 Acknowledgement Some slides in this set of slides were provided by EMC Corporation and Sandra Avila, University

More information

Review on Financial Forecasting using Neural Network and Data Mining Technique

Review on Financial Forecasting using Neural Network and Data Mining Technique ORIENTAL JOURNAL OF COMPUTER SCIENCE & TECHNOLOGY An International Open Free Access, Peer Reviewed Research Journal Published By: Oriental Scientific Publishing Co., India. www.computerscijournal.org ISSN:

More information

Detection. Perspective. Network Anomaly. Bhattacharyya. Jugal. A Machine Learning »C) Dhruba Kumar. Kumar KaKta. CRC Press J Taylor & Francis Croup

Detection. Perspective. Network Anomaly. Bhattacharyya. Jugal. A Machine Learning »C) Dhruba Kumar. Kumar KaKta. CRC Press J Taylor & Francis Croup Network Anomaly Detection A Machine Learning Perspective Dhruba Kumar Bhattacharyya Jugal Kumar KaKta»C) CRC Press J Taylor & Francis Croup Boca Raton London New York CRC Press is an imprint of the Taylor

More information

How To Classify Data Stream Mining

How To Classify Data Stream Mining JOURNAL OF COMPUTERS, VOL. 8, NO. 11, NOVEMBER 2013 2873 A Semi-supervised Ensemble Approach for Mining Data Streams Jing Liu 1,2, Guo-sheng Xu 1,2, Da Xiao 1,2, Li-ze Gu 1,2, Xin-xin Niu 1,2 1.Information

More information

Nine Common Types of Data Mining Techniques Used in Predictive Analytics

Nine Common Types of Data Mining Techniques Used in Predictive Analytics 1 Nine Common Types of Data Mining Techniques Used in Predictive Analytics By Laura Patterson, President, VisionEdge Marketing Predictive analytics enable you to develop mathematical models to help better

More information

Data Mining Solutions for the Business Environment

Data Mining Solutions for the Business Environment Database Systems Journal vol. IV, no. 4/2013 21 Data Mining Solutions for the Business Environment Ruxandra PETRE University of Economic Studies, Bucharest, Romania ruxandra_stefania.petre@yahoo.com Over

More information

International Journal of Advanced Engineering Research and Applications (IJAERA) ISSN: 2454-2377 Vol. 1, Issue 6, October 2015. Big Data and Hadoop

International Journal of Advanced Engineering Research and Applications (IJAERA) ISSN: 2454-2377 Vol. 1, Issue 6, October 2015. Big Data and Hadoop ISSN: 2454-2377, October 2015 Big Data and Hadoop Simmi Bagga 1 Satinder Kaur 2 1 Assistant Professor, Sant Hira Dass Kanya MahaVidyalaya, Kala Sanghian, Distt Kpt. INDIA E-mail: simmibagga12@gmail.com

More information

Memory Allocation Technique for Segregated Free List Based on Genetic Algorithm

Memory Allocation Technique for Segregated Free List Based on Genetic Algorithm Journal of Al-Nahrain University Vol.15 (2), June, 2012, pp.161-168 Science Memory Allocation Technique for Segregated Free List Based on Genetic Algorithm Manal F. Younis Computer Department, College

More information

A Survey of Evolutionary Algorithms for Data Mining and Knowledge Discovery

A Survey of Evolutionary Algorithms for Data Mining and Knowledge Discovery A Survey of Evolutionary Algorithms for Data Mining and Knowledge Discovery Alex A. Freitas Postgraduate Program in Computer Science, Pontificia Universidade Catolica do Parana Rua Imaculada Conceicao,

More information

Projects - Neural and Evolutionary Computing

Projects - Neural and Evolutionary Computing Projects - Neural and Evolutionary Computing 2014-2015 I. Application oriented topics 1. Task scheduling in distributed systems. The aim is to assign a set of (independent or correlated) tasks to some

More information

Static Data Mining Algorithm with Progressive Approach for Mining Knowledge

Static Data Mining Algorithm with Progressive Approach for Mining Knowledge Global Journal of Business Management and Information Technology. Volume 1, Number 2 (2011), pp. 85-93 Research India Publications http://www.ripublication.com Static Data Mining Algorithm with Progressive

More information

Using News Articles to Predict Stock Price Movements

Using News Articles to Predict Stock Price Movements Using News Articles to Predict Stock Price Movements Győző Gidófalvi Department of Computer Science and Engineering University of California, San Diego La Jolla, CA 9237 gyozo@cs.ucsd.edu 21, June 15,

More information

Extraction of Satellite Image using Particle Swarm Optimization

Extraction of Satellite Image using Particle Swarm Optimization Extraction of Satellite Image using Particle Swarm Optimization Er.Harish Kundra Assistant Professor & Head Rayat Institute of Engineering & IT, Railmajra, Punjab,India. Dr. V.K.Panchal Director, DTRL,DRDO,

More information

Study on Cloud Computing Resource Scheduling Strategy Based on the Ant Colony Optimization Algorithm

Study on Cloud Computing Resource Scheduling Strategy Based on the Ant Colony Optimization Algorithm www.ijcsi.org 54 Study on Cloud Computing Resource Scheduling Strategy Based on the Ant Colony Optimization Algorithm Linan Zhu 1, Qingshui Li 2, and Lingna He 3 1 College of Mechanical Engineering, Zhejiang

More information

Flexible Neural Trees Ensemble for Stock Index Modeling

Flexible Neural Trees Ensemble for Stock Index Modeling Flexible Neural Trees Ensemble for Stock Index Modeling Yuehui Chen 1, Ju Yang 1, Bo Yang 1 and Ajith Abraham 2 1 School of Information Science and Engineering Jinan University, Jinan 250022, P.R.China

More information

Decision Trees for Mining Data Streams Based on the Gaussian Approximation

Decision Trees for Mining Data Streams Based on the Gaussian Approximation International Journal of Computer Sciences and Engineering Open Access Review Paper Volume-4, Issue-3 E-ISSN: 2347-2693 Decision Trees for Mining Data Streams Based on the Gaussian Approximation S.Babu

More information

Aggregation Methodology on Map Reduce for Big Data Applications by using Traffic-Aware Partition Algorithm

Aggregation Methodology on Map Reduce for Big Data Applications by using Traffic-Aware Partition Algorithm Aggregation Methodology on Map Reduce for Big Data Applications by using Traffic-Aware Partition Algorithm R. Dhanalakshmi 1, S.Mohamed Jakkariya 2, S. Mangaiarkarasi 3 PG Scholar, Dept. of CSE, Shanmugnathan

More information

Knowledge Acquisition Approach Based on Rough Set in Online Aided Decision System for Food Processing Quality and Safety

Knowledge Acquisition Approach Based on Rough Set in Online Aided Decision System for Food Processing Quality and Safety , pp. 381-388 http://dx.doi.org/10.14257/ijunesst.2014.7.6.33 Knowledge Acquisition Approach Based on Rough Set in Online Aided ecision System for Food Processing Quality and Safety Liu Peng, Liu Wen,

More information

DECISION TREE INDUCTION FOR FINANCIAL FRAUD DETECTION USING ENSEMBLE LEARNING TECHNIQUES

DECISION TREE INDUCTION FOR FINANCIAL FRAUD DETECTION USING ENSEMBLE LEARNING TECHNIQUES DECISION TREE INDUCTION FOR FINANCIAL FRAUD DETECTION USING ENSEMBLE LEARNING TECHNIQUES Vijayalakshmi Mahanra Rao 1, Yashwant Prasad Singh 2 Multimedia University, Cyberjaya, MALAYSIA 1 lakshmi.mahanra@gmail.com

More information

INCREMENTAL AGGREGATION MODEL FOR DATA STREAM CLASSIFICATION

INCREMENTAL AGGREGATION MODEL FOR DATA STREAM CLASSIFICATION INCREMENTAL AGGREGATION MODEL FOR DATA STREAM CLASSIFICATION S. Jayanthi 1 and B. Karthikeyan 2 1 Department of Computer Science and Engineering, Karpagam University, Coimbatore, India 2 Dhanalakshmi Srinivsan

More information

Data Mining. 1 Introduction 2 Data Mining methods. Alfred Holl Data Mining 1

Data Mining. 1 Introduction 2 Data Mining methods. Alfred Holl Data Mining 1 Data Mining 1 Introduction 2 Data Mining methods Alfred Holl Data Mining 1 1 Introduction 1.1 Motivation 1.2 Goals and problems 1.3 Definitions 1.4 Roots 1.5 Data Mining process 1.6 Epistemological constraints

More information

UNSUPERVISED MACHINE LEARNING TECHNIQUES IN GENOMICS

UNSUPERVISED MACHINE LEARNING TECHNIQUES IN GENOMICS UNSUPERVISED MACHINE LEARNING TECHNIQUES IN GENOMICS Dwijesh C. Mishra I.A.S.R.I., Library Avenue, New Delhi-110 012 dcmishra@iasri.res.in What is Learning? "Learning denotes changes in a system that enable

More information

Performance Analysis of Data Mining Techniques for Improving the Accuracy of Wind Power Forecast Combination

Performance Analysis of Data Mining Techniques for Improving the Accuracy of Wind Power Forecast Combination Performance Analysis of Data Mining Techniques for Improving the Accuracy of Wind Power Forecast Combination Ceyda Er Koksoy 1, Mehmet Baris Ozkan 1, Dilek Küçük 1 Abdullah Bestil 1, Sena Sonmez 1, Serkan

More information

Proposal of Credit Card Fraudulent Use Detection by Online-type Decision Tree Construction and Verification of Generality

Proposal of Credit Card Fraudulent Use Detection by Online-type Decision Tree Construction and Verification of Generality Proposal of Credit Card Fraudulent Use Detection by Online-type Decision Tree Construction and Verification of Generality Tatsuya Minegishi 1, Ayahiko Niimi 2 Graduate chool of ystems Information cience,

More information

Machine Learning using MapReduce

Machine Learning using MapReduce Machine Learning using MapReduce What is Machine Learning Machine learning is a subfield of artificial intelligence concerned with techniques that allow computers to improve their outputs based on previous

More information

Introduction to Data Mining Techniques

Introduction to Data Mining Techniques Introduction to Data Mining Techniques Dr. Rajni Jain 1 Introduction The last decade has experienced a revolution in information availability and exchange via the internet. In the same spirit, more and

More information

processed parallely over the cluster nodes. Mapreduce thus provides a distributed approach to solve complex and lengthy problems

processed parallely over the cluster nodes. Mapreduce thus provides a distributed approach to solve complex and lengthy problems Big Data Clustering Using Genetic Algorithm On Hadoop Mapreduce Nivranshu Hans, Sana Mahajan, SN Omkar Abstract: Cluster analysis is used to classify similar objects under same group. It is one of the

More information