Data Mining Approach for Predictive Modeling of Agricultural Yield Data

Branko Marinković, Jovan Crnobarac, Sanja Brdar, Borislav Antić, Goran Jaćimović, Vladimir Crnojević

Faculty of Agriculture, University of Novi Sad, Serbia {branko, jovanc,
Faculty of Technical Sciences, University of Novi Sad, Serbia {tk_boris,

Abstract - Prediction of agricultural yields is a challenging task that demands a fusion of knowledge from different areas such as data mining, statistics and agriculture. This paper shows that data mining techniques can be successfully applied to agricultural data analysis. The results we present were obtained on a data set that contains monthly measurements of different environmental parameters and annual yields for maize, soybean and sugar beet. The results agree with previous results on plant production modeling that are at the core of agricultural science.

Keywords - Data mining, precision agriculture, yield prediction, genetic algorithms.

I. INTRODUCTION

Precision agriculture is a new paradigm that arose mostly from developments in the field of wireless sensor networks. Those networks can collect huge amounts of data on environmental conditions that strongly affect agricultural production. One of the interesting aspects of precision agriculture is the prediction of yields. Data mining offers possibilities to turn raw data into valuable information that can be used for making better decisions. Timely data collection directly from field-deployed sensors aims to improve agricultural measures and increase their profitability through the use of appropriate data analysis algorithms. Pioneering applications of data mining in agriculture have been reported in [1] [2], and the concept has also been successfully applied in other environmental fields, such as forestry and biodegradability analysis [3] [4].

Numerous factors have an impact on the yield of cultivated plants.
They significantly determine its level, whether separately or through a very complex set of interactions. Climate conditions have the predominant role among all of the factors [5].

In the past few decades, plant production modeling has introduced novel elements into modern agriculture. It is motivated by the everlasting human wish to anticipate the progress of cultivated plants, and it came about as a result of the joint work of many teams of biologists, agronomists, meteorologists and programmers [6]. In the course of their development and use, plant production models have served researchers as a valuable instrument for the organization and retrieval of data collected through field experiments. In developed countries, these models have become an irreplaceable source of information for counseling services, agricultural stations and other sites that use them for making important decisions regarding plant production.

The first techniques for plant production modeling were based on regression analysis, the simplest technique that interprets experimental data using a mathematical model or function that describes a certain phenomenon or process. These techniques have been substantially improved lately, and today there are very complex computer programs forecasting vegetation dynamics, yield components or yields of cultivated plants. By knowing the soil characteristics, the requirements of the cultivar or hybrid, the history of applied agro-technical measures and the weather conditions, it is possible to predict the moments of phenological phase changes, biomass development and plant yields.

The rest of the paper is structured as follows. Section II describes the data set and Section III describes the data mining methodology used to build the predictive models. In Section IV, we present and discuss the results. Section V presents the conclusions and plans for future work.

II. DATA

Data on the yields of the main field crops (maize, soybean and sugar beet) in the Serbian province of Vojvodina, collected during the period from 1999 to 2008, were taken from the internal database of the Department of Field and Vegetable Crops at the Faculty of Agriculture in Novi Sad. Basic meteorological parameters in the vegetation period (maximal, minimal and average monthly air temperature, as well as precipitation level) were used in the analysis. These parameters were calculated by averaging daily measurements made by seven hydro-meteorological stations distributed all over the province of Vojvodina.

For the analysis of water balance in the vegetation periods of some crops (ETP - potential evapotranspiration, ETR - real evapotranspiration, shortage or excess of water with respect to the plant's needs), the bioclimate method based on hydrophytothermic indices (HFTC) was used [7]. In the semi-arid conditions of Vojvodina, this is the most widely used method for defining a plant's water deficit or surplus. The hydrophytothermic index HFTC gives the quantity of water (in millimeters) used by a plant in the ETP process for every degree of average daily temperature. The monthly ETP value is calculated using the following formula:

$$\mathrm{ETP} = \mathrm{HFTC} \cdot \sum t \qquad (1)$$

where ETP represents the potential evapotranspiration (measured in mm) for a one-month period, HFTC is the hydrophytothermic index and Σt represents the sum of all average daily temperatures (°C) recorded during that month.

III. DATA MINING ALGORITHMS

Data mining algorithms were applied using the WEKA software. It includes a wide variety of learning algorithms and preprocessing tools [8] [9]. Among the algorithms implemented in WEKA, the M5P model tree was the most suitable for our data set and for the yield prediction problem that we want to solve. The M5P model tree is a combination of data classification and regression [10]. It follows the idea of decision tree methods, but its leaves hold linear regression functions instead of class labels. The M5P model tree is constructed in a top-down way. At each step a decision is made whether to partition the training set (i.e. to create a split node) or to introduce a regression function as a leaf node. The decision is based on the standard deviation of the target variable and on the expected reduction of error calculated using equation (2). If we denote by T the set of training instances, by T_i the subsets of instances created by splitting the set T, and by std(T) and std(T_i) the standard deviations of the target variable in T and T_i, the reduction term is given by

$$\Delta = \mathrm{std}(T) - \sum_i \frac{|T_i|}{|T|}\,\mathrm{std}(T_i) \qquad (2)$$

An important parameter that indicates the performance of a model tree is the correlation coefficient r. It measures the statistical correlation between the prediction p and the target variable a using equation (3), where cov(p,a) is the covariance between predicted and actual values, while std(p) and std(a) are their standard deviations. Error measures are expressed by the root mean squared error (RMSE) and the mean absolute error (MAE), computed over n test instances.

$$r = \frac{\mathrm{cov}(p,a)}{\mathrm{std}(p)\,\mathrm{std}(a)} \qquad (3)$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(p_i - a_i)^2} \qquad (4)$$

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|p_i - a_i\right| \qquad (5)$$
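The reduction term of equation (2) and the evaluation measures described above can be sketched in a few lines of Python. This is only an illustration, not WEKA's implementation: it assumes the usual M5 form of the standard deviation reduction and population (divide-by-n) standard deviations, and the function names are hypothetical.

```python
import math


def std(values):
    """Population standard deviation of a list of numbers."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))


def sdr(parent, subsets):
    """Reduction term: std(T) minus the size-weighted std of the subsets T_i."""
    n = len(parent)
    return std(parent) - sum(len(s) / n * std(s) for s in subsets)


def correlation(pred, actual):
    """Correlation coefficient r = cov(p, a) / (std(p) * std(a))."""
    mp = sum(pred) / len(pred)
    ma = sum(actual) / len(actual)
    cov = sum((p - mp) * (a - ma) for p, a in zip(pred, actual)) / len(pred)
    return cov / (std(pred) * std(actual))


def mae(pred, actual):
    """Mean absolute error between predictions and actual values."""
    return sum(abs(p - a) for p, a in zip(pred, actual)) / len(pred)


def rmse(pred, actual):
    """Root mean squared error between predictions and actual values."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(pred, actual)) / len(pred))


# Made-up yield values, used only to exercise the functions.
yields = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]
print(sdr(yields, [yields[:2], yields[2:]]))
print(correlation(predicted, yields), mae(predicted, yields), rmse(predicted, yields))
```

A candidate split with a larger reduction term separates the target values more cleanly, which is why a tree builder of this kind would prefer it over splits with a smaller value.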
We have also performed experiments with attribute selection filters in order to extract the most relevant attributes that have an impact on agricultural yields. In that way we managed to reduce the attribute list and to increase model tree performance. Attributes are grouped into feature subsets, and selection is done by evaluating an objective function for each feature subset. Subsets of features that are highly correlated with the target variable while having low inter-correlation are preferred. The feature subset search methods that improved our results are best-first search and the genetic algorithm.

The best-first search method searches the space of feature subsets by a greedy hill-climbing technique. It starts with a random solution and iteratively makes changes to the solution in order to improve it. The algorithm terminates when it cannot produce any further improvement. In WEKA this heuristic search method is implemented with a backtracking facility, which means that the algorithm keeps previous states and can therefore return to one of them if the current state is found unpromising.

The genetic algorithm (GA) is a search method that incorporates principles of natural selection. The GA evolves a population of individuals, where each individual is a possible solution to the optimization problem. In our agricultural yield prediction problem, each individual is a candidate subset of attributes that strongly influence the yield. Every individual is quantitatively evaluated by a fitness function. Promising candidates are selected and copied to the next generation. These candidates are also randomly altered by the genetic operations of crossover and mutation. In the crossover operation two individuals swap segments of their code and in that way produce offspring. Mutation changes a few bits of an individual's code. These operations are intended to simulate the analogous processes of recombination and mutation of chromosomes in living beings.
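The crossover and mutation operations described above can be illustrated with a minimal Python sketch of GA-based attribute subset search. It is not the WEKA implementation used in the paper: the bit-string encoding, the tournament selection, the parameter values and the correlation-based merit used as fitness (which rewards subsets correlated with the target and penalizes inter-correlated attributes) are assumptions made for illustration, and all function names are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy) if sxx > 0 and syy > 0 else 0.0


def merit(subset, columns, target):
    """Assumed fitness: correlation-based merit k*r_cf / sqrt(k + k*(k-1)*r_ff)."""
    k = len(subset)
    if k == 0:
        return 0.0
    r_cf = sum(abs(pearson(columns[j], target)) for j in subset) / k
    pairs = [(i, j) for i in subset for j in subset if i < j]
    r_ff = 0.0
    if pairs:
        r_ff = sum(abs(pearson(columns[i], columns[j])) for i, j in pairs) / len(pairs)
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


def crossover(a, b, rng):
    """One-point crossover: the parents swap the segments after a random cut."""
    point = rng.randrange(1, len(a))  # needs at least two attributes
    return a[:point] + b[point:], b[:point] + a[point:]


def mutate(bits, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - b if rng.random() < rate else b for b in bits]


def ga_select(columns, target, pop_size=20, generations=30, rate=0.05, seed=0):
    """Return the attribute indices of the best individual seen in any generation."""
    rng = random.Random(seed)
    n = len(columns)

    def fitness(bits):
        return merit([j for j, b in enumerate(bits) if b], columns, target)

    def pick(population):
        # Tournament of size two: the fitter of two random individuals wins.
        a, b = rng.sample(population, 2)
        return a if fitness(a) >= fitness(b) else b

    population = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        offspring = []
        while len(offspring) < pop_size:
            c1, c2 = crossover(pick(population), pick(population), rng)
            offspring += [mutate(c1, rate, rng), mutate(c2, rate, rng)]
        population = offspring[:pop_size]
        candidate = max(population, key=fitness)
        if fitness(candidate) > fitness(best):
            best = candidate
    return [j for j, b in enumerate(best) if b]


# Made-up data: column 0 tracks the target, column 1 is noise, column 2 duplicates column 0.
target = [1, 2, 3, 4, 5, 6, 7, 8]
columns = [target[:], [3, 1, 4, 1, 5, 9, 2, 6], [2 * v for v in target]]
print(ga_select(columns, target))
```

Each individual is a list of 0/1 flags, one per attribute, so a 41-attribute data set would correspond to individuals of length 41. Recomputing the fitness on every comparison keeps the sketch short; a practical version would cache it.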
When a new generation is created, fitness evaluation is performed again. The overall process is repeated several times, and the solution that the GA produces is the best individual across all generations. The GA searches the solution space in multiple directions at once, so its strength lies in its effectiveness when searching large spaces.

IV. RESULTS

This section presents the results of our work. We estimated the performance of the applied data mining algorithms by 10-fold cross-validation. The data are randomly partitioned into 10 blocks; one block is held out for testing and the model is built on the remaining nine blocks. This procedure is then repeated for each of the other blocks. Finally, a measure of performance is calculated by averaging over the ten runs. Part A describes the results obtained on the full attribute set, while part B describes the improved results obtained with attribute selection.

A. Full attribute set

Table I presents the correlation coefficients, mean absolute errors and root mean squared errors for the maize, soybean and sugar beet data sets.

TABLE I. MODEL TREE PERFORMANCE
[Columns: correlation coefficient, mean absolute error, root mean squared error. Rows: maize, soybean, sugar beet. Numeric values not recoverable.]

The best correlation coefficient is obtained for the maize data set. After pruning, the model trees for all three data sets are reduced to only one regression function. Figures 1, 2 and 3 present these regression expressions. All attribute values were normalized in order to better understand and compare their influence on the yield.

[Regression expression over three Tmax, three Tmin, two Tsr and two Pmm monthly attributes; coefficients and month indices not recoverable.]
Figure 1. Regression rule for maize data set

According to Figure 1, the maize yield is mostly affected by the temperatures in June, while the precipitation variables are most significant in May and September. May and June can be deemed critical for the growth and development of maize, since during these months there is an intensive increase of the vegetation mass and the generative organs start to form.

[Regression expression over two Tmax, three Tsr, three Pmm and one ETP monthly attributes; coefficients and month indices not recoverable.]
Figure 2. Regression rule for soybean data set

Soybean yield, very much like the maize yield, depends on the temperatures during summer. It should be taken into account that the temperatures during the hottest months (July and August) had a negative impact on the yield, while the temperatures in June had a positive impact. Precipitation in May and June affects the formation of vegetation mass, which also has a positive impact on the yield. The results agree fairly well with agricultural practice, since the literature states that soybean is particularly sensitive to drought during blooming and grain formation. Soybean's water needs grow from sowing, peak during summer (June, July and August) and then decline until the end of the vegetation period. This is related to the growth, development and ripening of soybean as well as to meteorological changes during the vegetation period.

According to Figure 3, high average temperatures in August and maximum temperatures in September had the greatest negative impact on the yield of sugar beet, while high temperatures in August had a positive impact on the yield.
Also, real evapotranspiration values in May and June had a positive effect on the yield, while the potential evapotranspiration values in July negatively affected the yield.

[Regression expression over two Tmax, two Tmin, two Tsr, one ETP and two ETR monthly attributes; coefficients and month indices not recoverable.]
Figure 3. Regression rule for sugar beet data set

B. Reduced sets

For the soybean data set, the highest improvement is gained by the genetic algorithm search method for attribute selection, as shown in Table II. Starting from 41 attributes, the GA produces a reduced set of the seven most informative attributes: Tsr_04 (average temperature in April), Tsr_08 (average temperature in August), ETP_06 (potential evapotranspiration in June), ETP_07 (potential evapotranspiration in July), ETP_08 (potential evapotranspiration in August), ETR_04 (real evapotranspiration in April) and ETR_06 (real evapotranspiration in June).

TABLE II. IMPROVED MODEL TREE PERFORMANCE FOR SOYBEAN
[Columns: correlation coefficient, mean absolute error, root mean squared error. Rows: without attribute selection, best-first search, genetic algorithm search. Numeric values not recoverable.]

Attribute selection by the GA method produces a data set with a higher level of relevance to the soybean yield. The effect of temperatures and precipitation in October is eliminated, since they are known not to have any significant impact on soybean yield. According to the improved regression rule for the soybean data set displayed in Figure 4, April emerges as a significant period for yield formation, when an increase in real evapotranspiration causes a corresponding increase in yield (its impact is measured through temperature and precipitation values).

[Regression expression over two Tsr, one ETP and two ETR monthly attributes; coefficients and month indices not recoverable.]
Figure 4. Improved regression rule for soybean data set

The highest improvement for the sugar beet data set is gained by the best-first search method for attribute selection (Table III). Starting from the set of 41 attributes, the algorithm finds a subset of only eight attributes: Tmin_05 (minimum temperature in May), Tmin_08 (minimum temperature in August), Tsr_06 (average temperature in June), Tsr_07 (average temperature in July), Tsr_08 (average temperature in August), ETP_05 (potential evapotranspiration in May), ETP_08 (potential evapotranspiration in August) and ETR_05 (real evapotranspiration in May). The model tree built on these eight attributes is shown in Figure 5. It complies well with previously published results on the prediction of sugar beet yield based on general environmental factors.

TABLE III. IMPROVED MODEL TREE PERFORMANCE FOR SUGAR BEET
[Columns: correlation coefficient, mean absolute error, root mean squared error. Rows: without attribute selection, genetic algorithm search, best-first search. Numeric values not recoverable.]

V. CONCLUSION

In this paper we present new research possibilities for the application of data mining methodology to the problem of yield prediction for maize, soybean, sugar beet and other field crops. Data mining algorithms were applied using the WEKA software. Basic meteorological parameters in the vegetation period (maximal, minimal and average monthly air temperature, as well as precipitation level) were used in the analysis. For the analysis of water balance in the vegetation periods of some crops, the bioclimate method based on hydrophytothermic indices (HFTC) was used. The M5P model tree applied to the full set of 41 attributes produced meaningful regression rules that are in accordance with plant production models proposed by agricultural scientists.
Feature subset selection methods, such as best-first search and the genetic algorithm, improved the accuracy and produced simpler models that are easier for agronomists to interpret.

VI. ACKNOWLEDGEMENT

This work was supported by the Ministry of Science and Technological Development of the Republic of Serbia, under the technology development project "Wireless Sensor Networks and Remote Sensing - Foundations of Modern Agricultural Infrastructure" (grant TR-11022). Sanja Brdar was supported through the student scholarship program of the Ministry of Science and Technological Development of the Republic of Serbia.

Tsr_08 <= : LM1
Tsr_08 > :
|   Tsr_08 <= : LM2
|   Tsr_08 > :
|   |   ETR_05 <= 42.65 : LM3
|   |   ETR_05 > 42.65 : LM4

[Linear models: LM1 over two Tmin, one Tsr, two ETP and one ETR attributes; LM2, LM3 and LM4 over one ETP and one ETR attribute each. Coefficients, month indices and the Tsr_08 split thresholds not recoverable.]
Figure 5. Improved model tree for sugar beet data set

REFERENCES

[1] G. Ruß, R. Kruse, M. Schneider, P. Wagner, "Data Mining with Neural Networks for Wheat Yield Prediction," in Advances in Data Mining. Medical Applications, E-Commerce, Marketing and Theoretical Aspects, Lecture Notes in Computer Science, Vol. 5077, Springer, 2008.
[2] D. Pokrajac, T. Fiez, D. Obradovic, S. Kwek, Z. Obradovic, "Distribution comparison for site-specific regression modeling in agriculture," in Proc. 12th International Joint Conference on Neural Networks (IJCNN).
[3] S. Džeroski, A. Kobler, V. Gjorgjioski, P. Panov, "Using decision trees to predict forest stand height and canopy cover from LANDSAT and LIDAR data," in Proc. 20th International Conference on Informatics for Environmental Protection.
[4] H. Blockeel, S. Džeroski, B. Kompare, S. Kramer, B. Pfahringer, W. V. Laer, "Experiments in predicting biodegradability," in Proc. 9th International Workshop on Inductive Logic Programming, Springer.
[5] B. Marinković, J. Crnobarac, D. Marinković, G. Jaćimović, D. V. Mircov, "Weather conditions in the function of optimal corn yield in Serbia and the Vojvodina province," in Proc. of the 1st Scientific Agronomic Days.
[6] B. Lalic, L. Pankovic, D. T. Mihailovic, M. Malesevic, I. Arsenic, "Crop models and its use in vegetation dynamic forecasting," in Proc. of Institute of Field and Vegetable Crops, Vol. 44.
[7] N. Vučić, "Bioclimate coefficients and plant water regime - theory and practical applications," Vodoprivreda, Vol. 6, Num. 8.
[8] I. H. Witten, E. Frank, Data Mining: Practical Machine Learning Tools and Techniques, Second edition, Morgan Kaufmann, 2005.
[9] P.-N. Tan, M. Steinbach, V. Kumar, Introduction to Data Mining, Addison Wesley.
[10] J. R. Quinlan, "Learning with Continuous Classes," in Proc. 5th Australian Joint Conference on Artificial Intelligence, 1992.


More information

Numerical Research on Distributed Genetic Algorithm with Redundant

Numerical Research on Distributed Genetic Algorithm with Redundant Numerical Research on Distributed Genetic Algorithm with Redundant Binary Number 1 Sayori Seto, 2 Akinori Kanasugi 1,2 Graduate School of Engineering, Tokyo Denki University, Japan 10kme41@ms.dendai.ac.jp,

More information

Extension of Decision Tree Algorithm for Stream Data Mining Using Real Data

Extension of Decision Tree Algorithm for Stream Data Mining Using Real Data Fifth International Workshop on Computational Intelligence & Applications IEEE SMC Hiroshima Chapter, Hiroshima University, Japan, November 10, 11 & 12, 2009 Extension of Decision Tree Algorithm for Stream

More information

Data Quality Mining: Employing Classifiers for Assuring consistent Datasets

Data Quality Mining: Employing Classifiers for Assuring consistent Datasets Data Quality Mining: Employing Classifiers for Assuring consistent Datasets Fabian Grüning Carl von Ossietzky Universität Oldenburg, Germany, fabian.gruening@informatik.uni-oldenburg.de Abstract: Independent

More information

ENSEMBLE DECISION TREE CLASSIFIER FOR BREAST CANCER DATA

ENSEMBLE DECISION TREE CLASSIFIER FOR BREAST CANCER DATA ENSEMBLE DECISION TREE CLASSIFIER FOR BREAST CANCER DATA D.Lavanya 1 and Dr.K.Usha Rani 2 1 Research Scholar, Department of Computer Science, Sree Padmavathi Mahila Visvavidyalayam, Tirupati, Andhra Pradesh,

More information

Comparison of K-means and Backpropagation Data Mining Algorithms

Comparison of K-means and Backpropagation Data Mining Algorithms Comparison of K-means and Backpropagation Data Mining Algorithms Nitu Mathuriya, Dr. Ashish Bansal Abstract Data mining has got more and more mature as a field of basic research in computer science and

More information

Feature Selection using Integer and Binary coded Genetic Algorithm to improve the performance of SVM Classifier

Feature Selection using Integer and Binary coded Genetic Algorithm to improve the performance of SVM Classifier Feature Selection using Integer and Binary coded Genetic Algorithm to improve the performance of SVM Classifier D.Nithya a, *, V.Suganya b,1, R.Saranya Irudaya Mary c,1 Abstract - This paper presents,

More information

Cellular Automaton: The Roulette Wheel and the Landscape Effect

Cellular Automaton: The Roulette Wheel and the Landscape Effect Cellular Automaton: The Roulette Wheel and the Landscape Effect Ioan Hălălae Faculty of Engineering, Eftimie Murgu University, Traian Vuia Square 1-4, 385 Reşiţa, Romania Phone: +40 255 210227, Fax: +40

More information

Nine Common Types of Data Mining Techniques Used in Predictive Analytics

Nine Common Types of Data Mining Techniques Used in Predictive Analytics 1 Nine Common Types of Data Mining Techniques Used in Predictive Analytics By Laura Patterson, President, VisionEdge Marketing Predictive analytics enable you to develop mathematical models to help better

More information

Data Mining for Fun and Profit

Data Mining for Fun and Profit Data Mining for Fun and Profit Data mining is the extraction of implicit, previously unknown, and potentially useful information from data. - Ian H. Witten, Data Mining: Practical Machine Learning Tools

More information

ASSOCIATION RULE MINING ON WEB LOGS FOR EXTRACTING INTERESTING PATTERNS THROUGH WEKA TOOL

ASSOCIATION RULE MINING ON WEB LOGS FOR EXTRACTING INTERESTING PATTERNS THROUGH WEKA TOOL International Journal Of Advanced Technology In Engineering And Science Www.Ijates.Com Volume No 03, Special Issue No. 01, February 2015 ISSN (Online): 2348 7550 ASSOCIATION RULE MINING ON WEB LOGS FOR

More information

Artificial Neural Network and Non-Linear Regression: A Comparative Study

Artificial Neural Network and Non-Linear Regression: A Comparative Study International Journal of Scientific and Research Publications, Volume 2, Issue 12, December 2012 1 Artificial Neural Network and Non-Linear Regression: A Comparative Study Shraddha Srivastava 1, *, K.C.

More information

Evaluation of Different Task Scheduling Policies in Multi-Core Systems with Reconfigurable Hardware

Evaluation of Different Task Scheduling Policies in Multi-Core Systems with Reconfigurable Hardware Evaluation of Different Task Scheduling Policies in Multi-Core Systems with Reconfigurable Hardware Mahyar Shahsavari, Zaid Al-Ars, Koen Bertels,1, Computer Engineering Group, Software & Computer Technology

More information

Web Document Clustering

Web Document Clustering Web Document Clustering Lab Project based on the MDL clustering suite http://www.cs.ccsu.edu/~markov/mdlclustering/ Zdravko Markov Computer Science Department Central Connecticut State University New Britain,

More information

Arturo Sanchez-Azofeifa, PhD, PEng Cassidy Rankine, Gilberto Zonta-Pastorello Centre for Earth Observation Sciences (CEOS) Earth and Atmospheric

Arturo Sanchez-Azofeifa, PhD, PEng Cassidy Rankine, Gilberto Zonta-Pastorello Centre for Earth Observation Sciences (CEOS) Earth and Atmospheric Arturo Sanchez-Azofeifa, PhD, PEng Cassidy Rankine, Gilberto Zonta-Pastorello Centre for Earth Observation Sciences (CEOS) Earth and Atmospheric Sciences Department University of Alberta Microsoft WSN

More information

Predicting Student Performance by Using Data Mining Methods for Classification

Predicting Student Performance by Using Data Mining Methods for Classification BULGARIAN ACADEMY OF SCIENCES CYBERNETICS AND INFORMATION TECHNOLOGIES Volume 13, No 1 Sofia 2013 Print ISSN: 1311-9702; Online ISSN: 1314-4081 DOI: 10.2478/cait-2013-0006 Predicting Student Performance

More information

A Parallel Processor for Distributed Genetic Algorithm with Redundant Binary Number

A Parallel Processor for Distributed Genetic Algorithm with Redundant Binary Number A Parallel Processor for Distributed Genetic Algorithm with Redundant Binary Number 1 Tomohiro KAMIMURA, 2 Akinori KANASUGI 1 Department of Electronics, Tokyo Denki University, 07ee055@ms.dendai.ac.jp

More information

A STUDY ON DATA MINING INVESTIGATING ITS METHODS, APPROACHES AND APPLICATIONS

A STUDY ON DATA MINING INVESTIGATING ITS METHODS, APPROACHES AND APPLICATIONS A STUDY ON DATA MINING INVESTIGATING ITS METHODS, APPROACHES AND APPLICATIONS Mrs. Jyoti Nawade 1, Dr. Balaji D 2, Mr. Pravin Nawade 3 1 Lecturer, JSPM S Bhivrabai Sawant Polytechnic, Pune (India) 2 Assistant

More information

DATA MINING TECHNOLOGY. Keywords: data mining, data warehouse, knowledge discovery, OLAP, OLAM.

DATA MINING TECHNOLOGY. Keywords: data mining, data warehouse, knowledge discovery, OLAP, OLAM. DATA MINING TECHNOLOGY Georgiana Marin 1 Abstract In terms of data processing, classical statistical models are restrictive; it requires hypotheses, the knowledge and experience of specialists, equations,

More information

not possible or was possible at a high cost for collecting the data.

not possible or was possible at a high cost for collecting the data. Data Mining and Knowledge Discovery Generating knowledge from data Knowledge Discovery Data Mining White Paper Organizations collect a vast amount of data in the process of carrying out their day-to-day

More information

AnalysisofData MiningClassificationwithDecisiontreeTechnique

AnalysisofData MiningClassificationwithDecisiontreeTechnique Global Journal of omputer Science and Technology Software & Data Engineering Volume 13 Issue 13 Version 1.0 Year 2013 Type: Double Blind Peer Reviewed International Research Journal Publisher: Global Journals

More information

Management Science Letters

Management Science Letters Management Science Letters 4 (2014) 905 912 Contents lists available at GrowingScience Management Science Letters homepage: www.growingscience.com/msl Measuring customer loyalty using an extended RFM and

More information

Indian Agriculture Land through Decision Tree in Data Mining

Indian Agriculture Land through Decision Tree in Data Mining Indian Agriculture Land through Decision Tree in Data Mining Kamlesh Kumar Joshi, M.Tech(Pursuing 4 th Sem) Laxmi Narain College of Technology, Indore (M.P) India k3g.kamlesh@gmail.com 9926523514 Pawan

More information

EFFICIENCY OF DECISION TREES IN PREDICTING STUDENT S ACADEMIC PERFORMANCE

EFFICIENCY OF DECISION TREES IN PREDICTING STUDENT S ACADEMIC PERFORMANCE EFFICIENCY OF DECISION TREES IN PREDICTING STUDENT S ACADEMIC PERFORMANCE S. Anupama Kumar 1 and Dr. Vijayalakshmi M.N 2 1 Research Scholar, PRIST University, 1 Assistant Professor, Dept of M.C.A. 2 Associate

More information

Feature vs. Classifier Fusion for Predictive Data Mining a Case Study in Pesticide Classification

Feature vs. Classifier Fusion for Predictive Data Mining a Case Study in Pesticide Classification Feature vs. Classifier Fusion for Predictive Data Mining a Case Study in Pesticide Classification Henrik Boström School of Humanities and Informatics University of Skövde P.O. Box 408, SE-541 28 Skövde

More information

AUTOMATED SOIL WATER TENSION-BASED DRIP IRRIGATION FOR PRECISE IRRIGATION SCHEDULING

AUTOMATED SOIL WATER TENSION-BASED DRIP IRRIGATION FOR PRECISE IRRIGATION SCHEDULING AUTOMATED SOIL WATER TENSION-BASED DRIP IRRIGATION FOR PRECISE IRRIGATION SCHEDULING Sabine Seidel sabine.seidel@tu-dresden.de Institute of Hydrology and Meteorology, Faculty of Environmental Sciences,

More information

Mining the Software Change Repository of a Legacy Telephony System

Mining the Software Change Repository of a Legacy Telephony System Mining the Software Change Repository of a Legacy Telephony System Jelber Sayyad Shirabad, Timothy C. Lethbridge, Stan Matwin School of Information Technology and Engineering University of Ottawa, Ottawa,

More information

Intelligent Modeling of Sugar-cane Maturation

Intelligent Modeling of Sugar-cane Maturation Intelligent Modeling of Sugar-cane Maturation State University of Pernambuco Recife (Brazil) Fernando Buarque de Lima Neto, PhD Salomão Madeiro Flávio Rosendo da Silva Oliveira Frederico Bruno Alexandre

More information

MONITORING OF DROUGHT ON THE CHMI WEBSITE

MONITORING OF DROUGHT ON THE CHMI WEBSITE MONITORING OF DROUGHT ON THE CHMI WEBSITE Richterová D. 1, 2, Kohut M. 3 1 Department of Applied and Land scape Ecology, Faculty of Agronomy, Mendel University in Brno, Zemedelska 1, 613 00 Brno, Czech

More information

FOREX TRADING PREDICTION USING LINEAR REGRESSION LINE, ARTIFICIAL NEURAL NETWORK AND DYNAMIC TIME WARPING ALGORITHMS

FOREX TRADING PREDICTION USING LINEAR REGRESSION LINE, ARTIFICIAL NEURAL NETWORK AND DYNAMIC TIME WARPING ALGORITHMS FOREX TRADING PREDICTION USING LINEAR REGRESSION LINE, ARTIFICIAL NEURAL NETWORK AND DYNAMIC TIME WARPING ALGORITHMS Leslie C.O. Tiong 1, David C.L. Ngo 2, and Yunli Lee 3 1 Sunway University, Malaysia,

More information

D-optimal plans in observational studies

D-optimal plans in observational studies D-optimal plans in observational studies Constanze Pumplün Stefan Rüping Katharina Morik Claus Weihs October 11, 2005 Abstract This paper investigates the use of Design of Experiments in observational

More information

ON INTEGRATING UNSUPERVISED AND SUPERVISED CLASSIFICATION FOR CREDIT RISK EVALUATION

ON INTEGRATING UNSUPERVISED AND SUPERVISED CLASSIFICATION FOR CREDIT RISK EVALUATION ISSN 9 X INFORMATION TECHNOLOGY AND CONTROL, 00, Vol., No.A ON INTEGRATING UNSUPERVISED AND SUPERVISED CLASSIFICATION FOR CREDIT RISK EVALUATION Danuta Zakrzewska Institute of Computer Science, Technical

More information

Flexible Neural Trees Ensemble for Stock Index Modeling

Flexible Neural Trees Ensemble for Stock Index Modeling Flexible Neural Trees Ensemble for Stock Index Modeling Yuehui Chen 1, Ju Yang 1, Bo Yang 1 and Ajith Abraham 2 1 School of Information Science and Engineering Jinan University, Jinan 250022, P.R.China

More information

Decision-Tree Learning

Decision-Tree Learning Decision-Tree Learning Introduction ID3 Attribute selection Entropy, Information, Information Gain Gain Ratio C4.5 Decision Trees TDIDT: Top-Down Induction of Decision Trees Numeric Values Missing Values

More information

Data Mining Classification: Decision Trees

Data Mining Classification: Decision Trees Data Mining Classification: Decision Trees Classification Decision Trees: what they are and how they work Hunt s (TDIDT) algorithm How to select the best split How to handle Inconsistent data Continuous

More information

Keywords Data mining, Classification Algorithm, Decision tree, J48, Random forest, Random tree, LMT, WEKA 3.7. Fig.1. Data mining techniques.

Keywords Data mining, Classification Algorithm, Decision tree, J48, Random forest, Random tree, LMT, WEKA 3.7. Fig.1. Data mining techniques. International Journal of Emerging Research in Management &Technology Research Article October 2015 Comparative Study of Various Decision Tree Classification Algorithm Using WEKA Purva Sewaiwar, Kamal Kant

More information

Chapter 12 Discovering New Knowledge Data Mining

Chapter 12 Discovering New Knowledge Data Mining Chapter 12 Discovering New Knowledge Data Mining Becerra-Fernandez, et al. -- Knowledge Management 1/e -- 2004 Prentice Hall Additional material 2007 Dekai Wu Chapter Objectives Introduce the student to

More information

Alpha Cut based Novel Selection for Genetic Algorithm

Alpha Cut based Novel Selection for Genetic Algorithm Alpha Cut based Novel for Genetic Algorithm Rakesh Kumar Professor Girdhar Gopal Research Scholar Rajesh Kumar Assistant Professor ABSTRACT Genetic algorithm (GA) has several genetic operators that can

More information

Gerard Mc Nulty Systems Optimisation Ltd gmcnulty@iol.ie/0876697867 BA.,B.A.I.,C.Eng.,F.I.E.I

Gerard Mc Nulty Systems Optimisation Ltd gmcnulty@iol.ie/0876697867 BA.,B.A.I.,C.Eng.,F.I.E.I Gerard Mc Nulty Systems Optimisation Ltd gmcnulty@iol.ie/0876697867 BA.,B.A.I.,C.Eng.,F.I.E.I Data is Important because it: Helps in Corporate Aims Basis of Business Decisions Engineering Decisions Energy

More information

Using Data Mining for Mobile Communication Clustering and Characterization

Using Data Mining for Mobile Communication Clustering and Characterization Using Data Mining for Mobile Communication Clustering and Characterization A. Bascacov *, C. Cernazanu ** and M. Marcu ** * Lasting Software, Timisoara, Romania ** Politehnica University of Timisoara/Computer

More information

Analysis of WEKA Data Mining Algorithm REPTree, Simple Cart and RandomTree for Classification of Indian News

Analysis of WEKA Data Mining Algorithm REPTree, Simple Cart and RandomTree for Classification of Indian News Analysis of WEKA Data Mining Algorithm REPTree, Simple Cart and RandomTree for Classification of Indian News Sushilkumar Kalmegh Associate Professor, Department of Computer Science, Sant Gadge Baba Amravati

More information

Predicting Critical Problems from Execution Logs of a Large-Scale Software System

Predicting Critical Problems from Execution Logs of a Large-Scale Software System Predicting Critical Problems from Execution Logs of a Large-Scale Software System Árpád Beszédes, Lajos Jenő Fülöp and Tibor Gyimóthy Department of Software Engineering, University of Szeged Árpád tér

More information

Selective Naive Bayes Regressor with Variable Construction for Predictive Web Analytics

Selective Naive Bayes Regressor with Variable Construction for Predictive Web Analytics Selective Naive Bayes Regressor with Variable Construction for Predictive Web Analytics Boullé Orange Labs avenue Pierre Marzin 3 Lannion, France marc.boulle@orange.com ABSTRACT We describe our submission

More information

Learning bagged models of dynamic systems. 1 Introduction

Learning bagged models of dynamic systems. 1 Introduction Learning bagged models of dynamic systems Nikola Simidjievski 1,2, Ljupco Todorovski 3, Sašo Džeroski 1,2 1 Department of Knowledge Technologies, Jožef Stefan Institute, Ljubljana, Slovenia 2 Jožef Stefan

More information

Association rules for improving website effectiveness: case analysis

Association rules for improving website effectiveness: case analysis Association rules for improving website effectiveness: case analysis Maja Dimitrijević, The Higher Technical School of Professional Studies, Novi Sad, Serbia, dimitrijevic@vtsns.edu.rs Tanja Krunić, The

More information

Decision Tree Learning on Very Large Data Sets

Decision Tree Learning on Very Large Data Sets Decision Tree Learning on Very Large Data Sets Lawrence O. Hall Nitesh Chawla and Kevin W. Bowyer Department of Computer Science and Engineering ENB 8 University of South Florida 4202 E. Fowler Ave. Tampa

More information

A Comparative Study of the Pickup Method and its Variations Using a Simulated Hotel Reservation Data

A Comparative Study of the Pickup Method and its Variations Using a Simulated Hotel Reservation Data A Comparative Study of the Pickup Method and its Variations Using a Simulated Hotel Reservation Data Athanasius Zakhary, Neamat El Gayar Faculty of Computers and Information Cairo University, Giza, Egypt

More information

Geospatial intelligence and data fusion techniques for sustainable development problems

Geospatial intelligence and data fusion techniques for sustainable development problems Geospatial intelligence and data fusion techniques for sustainable development problems Nataliia Kussul 1,2, Andrii Shelestov 1,2,4, Ruslan Basarab 1,4, Sergii Skakun 1, Olga Kussul 2 and Mykola Lavreniuk

More information

Index Contents Page No. Introduction . Data Mining & Knowledge Discovery

Index Contents Page No. Introduction . Data Mining & Knowledge Discovery Index Contents Page No. 1. Introduction 1 1.1 Related Research 2 1.2 Objective of Research Work 3 1.3 Why Data Mining is Important 3 1.4 Research Methodology 4 1.5 Research Hypothesis 4 1.6 Scope 5 2.

More information

International Journal of Computer Trends and Technology (IJCTT) volume 4 Issue 8 August 2013

International Journal of Computer Trends and Technology (IJCTT) volume 4 Issue 8 August 2013 A Short-Term Traffic Prediction On A Distributed Network Using Multiple Regression Equation Ms.Sharmi.S 1 Research Scholar, MS University,Thirunelvelli Dr.M.Punithavalli Director, SREC,Coimbatore. Abstract:

More information

Introduction to Data Mining Techniques

Introduction to Data Mining Techniques Introduction to Data Mining Techniques Dr. Rajni Jain 1 Introduction The last decade has experienced a revolution in information availability and exchange via the internet. In the same spirit, more and

More information

KNOWLEDGE BASE DATA MINING FOR BUSINESS INTELLIGENCE

KNOWLEDGE BASE DATA MINING FOR BUSINESS INTELLIGENCE KNOWLEDGE BASE DATA MINING FOR BUSINESS INTELLIGENCE Dr. Ruchira Bhargava 1 and Yogesh Kumar Jakhar 2 1 Associate Professor, Department of Computer Science, Shri JagdishPrasad Jhabarmal Tibrewala University,

More information

Lecture 10: Regression Trees

Lecture 10: Regression Trees Lecture 10: Regression Trees 36-350: Data Mining October 11, 2006 Reading: Textbook, sections 5.2 and 10.5. The next three lectures are going to be about a particular kind of nonlinear predictive model,

More information

The Scientific Data Mining Process

The Scientific Data Mining Process Chapter 4 The Scientific Data Mining Process When I use a word, Humpty Dumpty said, in rather a scornful tone, it means just what I choose it to mean neither more nor less. Lewis Carroll [87, p. 214] In

More information

Using Adaptive Random Trees (ART) for optimal scorecard segmentation

Using Adaptive Random Trees (ART) for optimal scorecard segmentation A FAIR ISAAC WHITE PAPER Using Adaptive Random Trees (ART) for optimal scorecard segmentation By Chris Ralph Analytic Science Director April 2006 Summary Segmented systems of models are widely recognized

More information

Predictive Analytics using Genetic Algorithm for Efficient Supply Chain Inventory Optimization

Predictive Analytics using Genetic Algorithm for Efficient Supply Chain Inventory Optimization 182 IJCSNS International Journal of Computer Science and Network Security, VOL.10 No.3, March 2010 Predictive Analytics using Genetic Algorithm for Efficient Supply Chain Inventory Optimization P.Radhakrishnan

More information

DATA MINING APPROACH FOR PREDICTING STUDENT PERFORMANCE

DATA MINING APPROACH FOR PREDICTING STUDENT PERFORMANCE . Economic Review Journal of Economics and Business, Vol. X, Issue 1, May 2012 /// DATA MINING APPROACH FOR PREDICTING STUDENT PERFORMANCE Edin Osmanbegović *, Mirza Suljić ** ABSTRACT Although data mining

More information