# Web Site Visit Forecasting Using Data Mining Techniques

Save this PDF as:

Size: px
Start display at page:

Download "Web Site Visit Forecasting Using Data Mining Techniques"

## Transcription

2 III. PREPROCESSING Data per-processing is an impotent step in data mining. It helps to remove garbage information effect from initial data set. It intends to reduce some noises, incomplete and inconsistent data. The results from pre-processing step can be later processed by data mining algorithms. In a data set of web site visits, marketing or related activities can create a sudden increase in number of visitors. But it won't retain for a longer time period. In this type of a situation that sudden increased visitor count data is considered as outliers. A. Handling Outliers The best way to remove outliers is organizing similar values in to group/ cluster. Outliers can be identified through human or computer inspection. But used human inspection to identify outliers patterns. For example consider a situation where mean page view count is 121, but some days has page view amount as 518. In this case, for removing the outliers fist calculate mean value and standard deviation (is the square root of variance). In a normal distribution curve, from μ σ to μ+σ contains about 68% of the measurements. There for if the forecasting doesn't interested about outliers, the values which are out of the μ σ to μ+σ range will be removed and assigned with mean value. IV. FORECASTING TECHNIQUES The data present in the environment of web site usage predication is time dependent series of data points. Modeling and explaining such data using statistical techniques is 'Time series analysis. The process of using a model to forecast future events based on known past events is 'Time series forecasting'. Time series data such as web site access usage data has a natural temporal ordering. Typically in data mining applications, each data point is an independent example of the concept to be learned and the ordering of data points within the set doesn't matter. But for time series data it is not the case. So one approach of handling time series data is removing its temporal ordering so that standard propositional learning algorithms can process them. Here when removing the temporal ordering, the time dependent data should be encoded via additional input fields. These fields are known as 'lagged' variables. After data has been transformed, regression algorithms can be applied to learn a model. One approach is to apply multiple linear regressions. Also any method which is capable of predicting target can be applied. Lagged variables are the main mechanism by which the relationship between past and current values of a series can be captured by propositional learning algorithms[11]. In this research we used minimum lag value as 1 and maximum lag value as 14. A. Weka Weka (Waikato Environment for Knowledge Analysis) is a popular suite of machine learning software written in Java, developed at the University of Waikato, New Zealand [6]. Weka (>= 3.7.3) has an environment which is dedicated for time series analysis. It allows creating forecasting models, evaluating them and visualizing them. The approach of time series analysis in Weka is transforming the data into form that standard propositional learning algorithms can process. As above mentioned Weka does this by removing the temporal ordering of individual input examples by encoding the time dependent via additional input fields. Various other fields are also computed automatically to allow the algorithms to model trends and seasonality. After the data has been transformed, any of Weka's regression algorithms can be applied to learn a model. One approach is to apply multiple linear regressions. Also Weka allows any method capable of predicting a continuous target. For example support vector machine (SVM) for regression trees and model trees can be applied. Model trees are decision trees with linear regression at the leaves. Weka is consist with multiple classifier functions and in this research four of those classifier functions will be used. They are namely Process, Multilayer Perceptron, and. 1. Process This classifier function implements process for regression without hyper-parameter tuning. Here the missing values are replaced with global mean/mode values [7]. 2. Multilayer Perceptron This classifier function uses backpropagation to classify instances. This network can be built by hand, created by an algorithm or both. The network can also be monitored and modified during training time [8]. 3. This function uses linear regression for predication. Weighted instances are possible to be used with this approach [9]. 4. Support vector classifier will be trained using polynomial or RBF kernels where John C. Platt's sequential minimal optimization algorithm is implemented here [10]. V. WEB SITE VISIT FORECASTING This paper discusses about the Web site visitors forecasting for a web site owned by one of a Software Company. For this purpose we have used daily page views for the web site from the Google Analytics from 01 April 2011 to 22 July 2012 time range. Here a number of different prediction approaches have been applied such as processes, linear regression, multilayer perceptron and regression which are available on Weka time series analysis environment. A. Date Set We have consecutive 476 days of page view information which are downloaded from Google Analytic. The same page view data set cannot be used in prediction modelling and model validation both scenarios. Therefore information on last 14 days has been removed from the input data set and that data set is used to validate output of the prediction. The input data set is with only one attribute which is daily page view. Further the input data set will be used in forecasting in two different ways. Data set with outliers and data set without outliers are used in forecasting and variation of the forecast results will be evaluated along with the data set. B. Forecasting Execution Environment In this research, Waikato Environment for Knowledge Analysis-version (Weka 3.7.6) tool and a package called Time series forecasting environment are being used for performing the predication modelling and predicting[11]. The input data set prepossessing and Attribute-Relation File Format (ARFF) generation steps are done manually and Time 171

3 series forecasting environment is used for forecasting. Figure 1 shows structure of ARFF 'Analytics Visitors Visits Day date 99,4/1/ ,4/2/ ,4/3/ ,4/4/ ,4/5/2011 Figure 1: View of Attribute-Relation File Format C. Forecast Results The result of the forecasted information is shown in the Table 1 and Table 2. Those results include values returned by applying each of the predication methods and actual page visit count. Hence result page visit information can be compared between each prediction methods;, regression, Multilayer Perceptron regression and regression. As above mentioned this forecast has used page view information of 472 days. TABLE I FORECASTING RESULT WITH ACTUAL VALUES WITHOUT OUTLIERS Gaus sian Regre ssion Reg Multil ayer Perce ptron Actu al Visit s 06/30/ /01/ /02/ /03/ /04/ /05/ /06/ TABLE II FORECASTING RESULT WITH ACTUAL VALUES WITH OUTLIERS Gauss ian Regre ssion Reg Multila yer Perce ptron Actu al Visit s 06/30/ /01/ /02/ /03/ /04/ /05/ /06/ /07/ /08/ /09/ /10/ /11/ /12/ /13/ D. Results Evaluation TABLE III PERFORMANCE COMPARISON OF FORECASTING RESULT WITHOUT OUTLIERS MAE MSE RMSE RBE RSE 07/07/ /08/ /09/ /10/ /11/ Processes /12/ /13/ Multilayer Perceptron

4 TABLE IV PERFORMANCE COMPARISON OF FORECASTING RESULT WITH OUTLIERS Processes Multilayer Perceptron MAE MSE RMSE RBE RSE The accuracy of the above forecast results has been analysed and evaluated based on Mean Squared Error (MSE), Mean Absolute Error (MAE), Rooted Mean Squared Error (RMSE), Relative Mean Squared Error (RMSE) and Relative Absolute Error (RAE). Table 3 and Table 4 are shown with results of accuracy for each of the four forecasting methods used. Actual visit and forecasted graph representation shown in Fig Actual Fig. 1 A line graph representation of forecast result wit actual visit The mean squared error and the mean absolute error shows how forecast result and actual output are closer. The relative mean squared error and relative absolute error are used as a good measure of accuracy. By comparing the results which are shown in Table 3 and Table 4, and are the best suited methods for predicting or forecasting Web site visit related information. When using, performing outlier s analysis is a very important pre-processing step which produces more accurate results. By comparing the results which are shown in Table 3 and Table 4, and are the best suited methods for predicting or forecasting Web site visit related information. When using, performing outlier s analysis is a very important pre-processing step which produces more accurate results. VI. CONCLUSIONS This paper had discussed about the Web site visitors forecasting for a web site owned by one of a Software Company. The data present in this environment is time dependent series of data points. Modeling and forecasting in this environment is Time Series analysis and Time Series Forecasting. So the paper had discussed about applicability of Weka tool which provides a knowledge analysis environment. Basically the research went along four regression algorithms which are applied against the data set with outliers and without outliers. So how the data pre-processing, particular algorithm will assist in achieving accurate results were evaluated end of the paper based on the derived results. Based on the evaluation results we can conclude that regression and algorithms are better suited for forecasting web site related information and also using on pre-processed data gives more accurate results. Further when forecasting is done on web site related information, we can reduce various web hosting and server maintenance cost through prior acknowledging about future events. This research can be extended further with applicability of forecasting and data smoothing technology like Exponential smoothing. ACKNOWLEDGMENT We wish to acknowledge Mark Hall for developing Weka Time series forecasting environment package and given me advice we conduct this research. REFERENCES [1] D. Ciobanu, C. E. Dinuca, Predicting the next page that will be visited by a web surfer using Page Rank algorithm, in International Journal of Computers and Communications, 2012, pp [2] Z. Markov, D. T. Larose, Data Mining The Web Uncovering Patterns in Web Content, Structure and Usage. USA: John Wiley & Sons, [3] T. Wang and Y. Ren, Research on personalized recommendation based on web usage mining using collaborative filtering technique, WSEAS Trans. Info. Sci. and App., vol. 6, no. 1, pp , Jan [4] X. Wang, A. Abraham, and K. A. Smith, Intelligent web traffic mining and analysis, J. Netw. Comput. Appl., vol. 28, no. 2, pp , Apr [5] W. Tong and H. Pi-lian, Web Log Mining by an Improved AprioriAll Algorithm, ;in Proc. WEC (2), 2005, pp

5 [6] I. H. Witten, E. Frank, and M. A. Hall, Data Mining: Practical Machine Learning Tools and Techniques, Third Edition, 3rd ed. Morgan Kaufmann, [7] K. Driessens, Processes. [Online]. ka/classifiers/functions/processes.html. [Accessed: 6-Aug-2012]. [8] M. Ware, MultilayerPerceptron. [Online]. nctions/multilayerperceptron.html. [Accessed: 8- Aug-2012]. [9] E. Frank and L. Trigg,. [Online]. nctions/.html. [Accessed: 10- Aug-2012]. [10] E. Frank, L. Shane, and S. Inglis,. [Online]. nctions/.html. [Accessed: 7-Aug-2012]. [11] M. Hall, Time Series Analysis and Forecasting with Weka - Pentaho Data Mining. [Online]. +Series+Analysis+and+Forecasting+with+Weka. [Accessed: 10-Aug-2012]. 174

### A Regression Approach for Forecasting Vendor Revenue in Telecommunication Industries

A Regression Approach for Forecasting Vendor Revenue in Telecommunication Industries Aida Mustapha *1, Farhana M. Fadzil #2 * Faculty of Computer Science and Information Technology, Universiti Tun Hussein

### Web Document Clustering

Web Document Clustering Lab Project based on the MDL clustering suite http://www.cs.ccsu.edu/~markov/mdlclustering/ Zdravko Markov Computer Science Department Central Connecticut State University New Britain,

### Machine Learning algorithms and methods in Weka

Machine Learning algorithms and methods in Weka Presented by: William Elazmeh PhD. candidate at the Ottawa-Carleton Institute for Computer Science, University of Ottawa, Canada Abstract: this workshop

### An Overview of Knowledge Discovery Database and Data mining Techniques

An Overview of Knowledge Discovery Database and Data mining Techniques Priyadharsini.C 1, Dr. Antony Selvadoss Thanamani 2 M.Phil, Department of Computer Science, NGM College, Pollachi, Coimbatore, Tamilnadu,

### An Introduction to Data Mining

An Introduction to Intel Beijing wei.heng@intel.com January 17, 2014 Outline 1 DW Overview What is Notable Application of Conference, Software and Applications Major Process in 2 Major Tasks in Detail

### Index Contents Page No. Introduction . Data Mining & Knowledge Discovery

Index Contents Page No. 1. Introduction 1 1.1 Related Research 2 1.2 Objective of Research Work 3 1.3 Why Data Mining is Important 3 1.4 Research Methodology 4 1.5 Research Hypothesis 4 1.6 Scope 5 2.

### An Introduction to WEKA. As presented by PACE

An Introduction to WEKA As presented by PACE Download and Install WEKA Website: http://www.cs.waikato.ac.nz/~ml/weka/index.html 2 Content Intro and background Exploring WEKA Data Preparation Creating Models/

### Data Mining with Weka

Data Mining with Weka Class 1 Lesson 1 Introduction Ian H. Witten Department of Computer Science University of Waikato New Zealand weka.waikato.ac.nz Data Mining with Weka a practical course on how to

### A Comparative Study of the Pickup Method and its Variations Using a Simulated Hotel Reservation Data

A Comparative Study of the Pickup Method and its Variations Using a Simulated Hotel Reservation Data Athanasius Zakhary, Neamat El Gayar Faculty of Computers and Information Cairo University, Giza, Egypt

### Introduction Predictive Analytics Tools: Weka

Introduction Predictive Analytics Tools: Weka Predictive Analytics Center of Excellence San Diego Supercomputer Center University of California, San Diego Tools Landscape Considerations Scale User Interface

### Classification of Women Health Disease (Fibroid) Using Decision Tree algorithm

International Journal of Computer Applications in Engineering Sciences [VOL II, ISSUE III, SEPTEMBER 2012] [ISSN: 2231-4946] Classification of Women Health Disease (Fibroid) Using Decision Tree algorithm

### ASSOCIATION RULE MINING ON WEB LOGS FOR EXTRACTING INTERESTING PATTERNS THROUGH WEKA TOOL

International Journal Of Advanced Technology In Engineering And Science Www.Ijates.Com Volume No 03, Special Issue No. 01, February 2015 ISSN (Online): 2348 7550 ASSOCIATION RULE MINING ON WEB LOGS FOR

### Relational Learning for Football-Related Predictions

Relational Learning for Football-Related Predictions Jan Van Haaren and Guy Van den Broeck jan.vanhaaren@student.kuleuven.be, guy.vandenbroeck@cs.kuleuven.be Department of Computer Science Katholieke Universiteit

### Data quality in Accounting Information Systems

Data quality in Accounting Information Systems Comparing Several Data Mining Techniques Erjon Zoto Department of Statistics and Applied Informatics Faculty of Economy, University of Tirana Tirana, Albania

### Azure Machine Learning, SQL Data Mining and R

Azure Machine Learning, SQL Data Mining and R Day-by-day Agenda Prerequisites No formal prerequisites. Basic knowledge of SQL Server Data Tools, Excel and any analytical experience helps. Best of all:

### BOOSTING - A METHOD FOR IMPROVING THE ACCURACY OF PREDICTIVE MODEL

The Fifth International Conference on e-learning (elearning-2014), 22-23 September 2014, Belgrade, Serbia BOOSTING - A METHOD FOR IMPROVING THE ACCURACY OF PREDICTIVE MODEL SNJEŽANA MILINKOVIĆ University

### The Combination Forecasting Model of Auto Sales Based on Seasonal Index and RBF Neural Network

, pp.67-76 http://dx.doi.org/10.14257/ijdta.2016.9.1.06 The Combination Forecasting Model of Auto Sales Based on Seasonal Index and RBF Neural Network Lihua Yang and Baolin Li* School of Economics and

### Financial Trading System using Combination of Textual and Numerical Data

Financial Trading System using Combination of Textual and Numerical Data Shital N. Dange Computer Science Department, Walchand Institute of Rajesh V. Argiddi Assistant Prof. Computer Science Department,

### DATA MINING TECHNIQUES AND APPLICATIONS

DATA MINING TECHNIQUES AND APPLICATIONS Mrs. Bharati M. Ramageri, Lecturer Modern Institute of Information Technology and Research, Department of Computer Application, Yamunanagar, Nigdi Pune, Maharashtra,

### FOREX TRADING PREDICTION USING LINEAR REGRESSION LINE, ARTIFICIAL NEURAL NETWORK AND DYNAMIC TIME WARPING ALGORITHMS

FOREX TRADING PREDICTION USING LINEAR REGRESSION LINE, ARTIFICIAL NEURAL NETWORK AND DYNAMIC TIME WARPING ALGORITHMS Leslie C.O. Tiong 1, David C.L. Ngo 2, and Yunli Lee 3 1 Sunway University, Malaysia,

### More Data Mining with Weka

More Data Mining with Weka Class 1 Lesson 1 Introduction Ian H. Witten Department of Computer Science University of Waikato New Zealand weka.waikato.ac.nz More Data Mining with Weka a practical course

### Pentaho Data Mining Last Modified on January 22, 2007

Pentaho Data Mining Copyright 2007 Pentaho Corporation. Redistribution permitted. All trademarks are the property of their respective owners. For the latest information, please visit our web site at www.pentaho.org

### Study and Analysis of Data Mining Concepts

Study and Analysis of Data Mining Concepts M.Parvathi Head/Department of Computer Applications Senthamarai college of Arts and Science,Madurai,TamilNadu,India/ Dr. S.Thabasu Kannan Principal Pannai College

### An Introduction to Data Mining. Big Data World. Related Fields and Disciplines. What is Data Mining? 2/12/2015

An Introduction to Data Mining for Wind Power Management Spring 2015 Big Data World Every minute: Google receives over 4 million search queries Facebook users share almost 2.5 million pieces of content

### International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014

RESEARCH ARTICLE OPEN ACCESS A Survey of Data Mining: Concepts with Applications and its Future Scope Dr. Zubair Khan 1, Ashish Kumar 2, Sunny Kumar 3 M.Tech Research Scholar 2. Department of Computer

### Data Mining in Education: Data Classification and Decision Tree Approach

Data Mining in Education: Data Classification and Decision Tree Approach Sonali Agarwal, G. N. Pandey, and M. D. Tiwari Abstract Educational organizations are one of the important parts of our society

### Practical Data Science with Azure Machine Learning, SQL Data Mining, and R

Practical Data Science with Azure Machine Learning, SQL Data Mining, and R Overview This 4-day class is the first of the two data science courses taught by Rafal Lukawiecki. Some of the topics will be

### Data Mining Solutions for the Business Environment

Database Systems Journal vol. IV, no. 4/2013 21 Data Mining Solutions for the Business Environment Ruxandra PETRE University of Economic Studies, Bucharest, Romania ruxandra_stefania.petre@yahoo.com Over

### Dr. Antony Selvadoss Thanamani, Head & Associate Professor, Department of Computer Science, NGM College, Pollachi, India.

Enhanced Approach on Web Page Classification Using Machine Learning Technique S.Gowri Shanthi Research Scholar, Department of Computer Science, NGM College, Pollachi, India. Dr. Antony Selvadoss Thanamani,

### Introduction. A. Bellaachia Page: 1

Introduction 1. Objectives... 3 2. What is Data Mining?... 4 3. Knowledge Discovery Process... 5 4. KD Process Example... 7 5. Typical Data Mining Architecture... 8 6. Database vs. Data Mining... 9 7.

### Customer Classification And Prediction Based On Data Mining Technique

Customer Classification And Prediction Based On Data Mining Technique Ms. Neethu Baby 1, Mrs. Priyanka L.T 2 1 M.E CSE, Sri Shakthi Institute of Engineering and Technology, Coimbatore 2 Assistant Professor

### Forecasting stock markets with Twitter

Forecasting stock markets with Twitter Argimiro Arratia argimiro@lsi.upc.edu Joint work with Marta Arias and Ramón Xuriguera To appear in: ACM Transactions on Intelligent Systems and Technology, 2013,

### Advanced Ensemble Strategies for Polynomial Models

Advanced Ensemble Strategies for Polynomial Models Pavel Kordík 1, Jan Černý 2 1 Dept. of Computer Science, Faculty of Information Technology, Czech Technical University in Prague, 2 Dept. of Computer

### Keywords Data mining, Classification Algorithm, Decision tree, J48, Random forest, Random tree, LMT, WEKA 3.7. Fig.1. Data mining techniques.

International Journal of Emerging Research in Management &Technology Research Article October 2015 Comparative Study of Various Decision Tree Classification Algorithm Using WEKA Purva Sewaiwar, Kamal Kant

### A Survey on Web Mining From Web Server Log

A Survey on Web Mining From Web Server Log Ripal Patel 1, Mr. Krunal Panchal 2, Mr. Dushyantsinh Rathod 3 1 M.E., 2,3 Assistant Professor, 1,2,3 computer Engineering Department, 1,2 L J Institute of Engineering

### Rule based Classification of BSE Stock Data with Data Mining

International Journal of Information Sciences and Application. ISSN 0974-2255 Volume 4, Number 1 (2012), pp. 1-9 International Research Publication House http://www.irphouse.com Rule based Classification

### International Journal of Computer Trends and Technology (IJCTT) volume 4 Issue 8 August 2013

A Short-Term Traffic Prediction On A Distributed Network Using Multiple Regression Equation Ms.Sharmi.S 1 Research Scholar, MS University,Thirunelvelli Dr.M.Punithavalli Director, SREC,Coimbatore. Abstract:

### Predicting Students Final GPA Using Decision Trees: A Case Study

Predicting Students Final GPA Using Decision Trees: A Case Study Mashael A. Al-Barrak and Muna Al-Razgan Abstract Educational data mining is the process of applying data mining tools and techniques to

### Smart Grid Data Analytics for Decision Support

1 Smart Grid Data Analytics for Decision Support Prakash Ranganathan, Department of Electrical Engineering, University of North Dakota, Grand Forks, ND, USA Prakash.Ranganathan@engr.und.edu, 701-777-4431

### Classification of Titanic Passenger Data and Chances of Surviving the Disaster Data Mining with Weka and Kaggle Competition Data

Proceedings of Student-Faculty Research Day, CSIS, Pace University, May 2 nd, 2014 Classification of Titanic Passenger Data and Chances of Surviving the Disaster Data Mining with Weka and Kaggle Competition

### Comparative Analysis of Classification Algorithms on Different Datasets using WEKA

Volume 54 No13, September 2012 Comparative Analysis of Classification Algorithms on Different Datasets using WEKA Rohit Arora MTech CSE Deptt Hindu College of Engineering Sonepat, Haryana, India Suman

### A Case of Study on Hadoop Benchmark Behavior Modeling Using ALOJA-ML

www.bsc.es A Case of Study on Hadoop Benchmark Behavior Modeling Using ALOJA-ML Josep Ll. Berral, Nicolas Poggi, David Carrera Workshop on Big Data Benchmarks Toronto, Canada 2015 1 Context ALOJA: framework

### Adaptive Demand-Forecasting Approach based on Principal Components Time-series an application of data-mining technique to detection of market movement

Adaptive Demand-Forecasting Approach based on Principal Components Time-series an application of data-mining technique to detection of market movement Toshio Sugihara Abstract In this study, an adaptive

### Maschinelles Lernen mit MATLAB

Maschinelles Lernen mit MATLAB Jérémy Huard Applikationsingenieur The MathWorks GmbH 2015 The MathWorks, Inc. 1 Machine Learning is Everywhere Image Recognition Speech Recognition Stock Prediction Medical

### International Journal of Electronics and Computer Science Engineering 1449

International Journal of Electronics and Computer Science Engineering 1449 Available Online at www.ijecse.org ISSN- 2277-1956 Neural Networks in Data Mining Priyanka Gaur Department of Information and

### Automated Content Analysis of Discussion Transcripts

Automated Content Analysis of Discussion Transcripts Vitomir Kovanović v.kovanovic@ed.ac.uk Dragan Gašević dgasevic@acm.org School of Informatics, University of Edinburgh Edinburgh, United Kingdom v.kovanovic@ed.ac.uk

### Comparison of K-means and Backpropagation Data Mining Algorithms

Comparison of K-means and Backpropagation Data Mining Algorithms Nitu Mathuriya, Dr. Ashish Bansal Abstract Data mining has got more and more mature as a field of basic research in computer science and

### Data Quality Mining: Employing Classifiers for Assuring consistent Datasets

Data Quality Mining: Employing Classifiers for Assuring consistent Datasets Fabian Grüning Carl von Ossietzky Universität Oldenburg, Germany, fabian.gruening@informatik.uni-oldenburg.de Abstract: Independent

### ANALYSIS OF WEBSITE USAGE WITH USER DETAILS USING DATA MINING PATTERN RECOGNITION

ANALYSIS OF WEBSITE USAGE WITH USER DETAILS USING DATA MINING PATTERN RECOGNITION K.Vinodkumar 1, Kathiresan.V 2, Divya.K 3 1 MPhil scholar, RVS College of Arts and Science, Coimbatore, India. 2 HOD, Dr.SNS

### A Time Efficient Algorithm for Web Log Analysis

A Time Efficient Algorithm for Web Log Analysis Santosh Shakya Anju Singh Divakar Singh Student [M.Tech.6 th sem (CSE)] Asst.Proff, Dept. of CSE BU HOD (CSE), BUIT, BUIT,BU Bhopal Barkatullah University,

### Classification of Learners Using Linear Regression

Proceedings of the Federated Conference on Computer Science and Information Systems pp. 717 721 ISBN 978-83-60810-22-4 Classification of Learners Using Linear Regression Marian Cristian Mihăescu Software

### A STUDY ON DATA MINING INVESTIGATING ITS METHODS, APPROACHES AND APPLICATIONS

A STUDY ON DATA MINING INVESTIGATING ITS METHODS, APPROACHES AND APPLICATIONS Mrs. Jyoti Nawade 1, Dr. Balaji D 2, Mr. Pravin Nawade 3 1 Lecturer, JSPM S Bhivrabai Sawant Polytechnic, Pune (India) 2 Assistant

### SOFTWARE EFFORT ESTIMATION USING RADIAL BASIS FUNCTION NEURAL NETWORKS Ana Maria Bautista, Angel Castellanos, Tomas San Feliu

International Journal Information Theories and Applications, Vol. 21, Number 4, 2014 319 SOFTWARE EFFORT ESTIMATION USING RADIAL BASIS FUNCTION NEURAL NETWORKS Ana Maria Bautista, Angel Castellanos, Tomas

### Using Data Mining for Mobile Communication Clustering and Characterization

Using Data Mining for Mobile Communication Clustering and Characterization A. Bascacov *, C. Cernazanu ** and M. Marcu ** * Lasting Software, Timisoara, Romania ** Politehnica University of Timisoara/Computer

### INCREASING FORECASTING ACCURACY OF TREND DEMAND BY NON-LINEAR OPTIMIZATION OF THE SMOOTHING CONSTANT

58 INCREASING FORECASTING ACCURACY OF TREND DEMAND BY NON-LINEAR OPTIMIZATION OF THE SMOOTHING CONSTANT Sudipa Sarker 1 * and Mahbub Hossain 2 1 Department of Industrial and Production Engineering Bangladesh

### Facilitating Business Process Discovery using Email Analysis

Facilitating Business Process Discovery using Email Analysis Matin Mavaddat Matin.Mavaddat@live.uwe.ac.uk Stewart Green Stewart.Green Ian Beeson Ian.Beeson Jin Sa Jin.Sa Abstract Extracting business process

### Students Behavioural Analysis in an Online Learning Environment Using Data Mining

Students Behavioural Analysis in an Online Learning Environment Using Data Mining I. P. Ratnapala Computing Centre Faculty of Engineering, University of Peradeniya Peradeniya, Sri Lanka R. G. Ragel, S.

### Modeling Suspicious Email Detection Using Enhanced Feature Selection

Modeling Suspicious Email Detection Using Enhanced Feature Selection Sarwat Nizamani, Nasrullah Memon, Uffe Kock Wiil, and Panagiotis Karampelas Abstract The paper presents a suspicious email detection

### Prof. Pietro Ducange Students Tutor and Practical Classes Course of Business Intelligence 2014 http://www.iet.unipi.it/p.ducange/esercitazionibi/

Prof. Pietro Ducange Students Tutor and Practical Classes Course of Business Intelligence 2014 http://www.iet.unipi.it/p.ducange/esercitazionibi/ Email: p.ducange@iet.unipi.it Office: Dipartimento di Ingegneria

### Time Series Data Mining in Rainfall Forecasting Using Artificial Neural Network

Time Series Data Mining in Rainfall Forecasting Using Artificial Neural Network Prince Gupta 1, Satanand Mishra 2, S.K.Pandey 3 1,3 VNS Group, RGPV, Bhopal, 2 CSIR-AMPRI, BHOPAL prince2010.gupta@gmail.com

### Prerequisites. Course Outline

MS-55040: Data Mining, Predictive Analytics with Microsoft Analysis Services and Excel PowerPivot Description This three-day instructor-led course will introduce the students to the concepts of data mining,

### Keywords Sales Forecasting, ES, MA, Adaptive Neuro Fuzzy Inference System, ANN, Linear Regression.

A Business Intelligence Technique for Forecasting the Automobile Sales using Adaptive Intelligent Systems (ANFIS and ANN) Alekh Dwivedi Maheshwari Niranjan Kalicharan Sahu Department of Information Technology

### A Comparative Study of clustering algorithms Using weka tools

A Comparative Study of clustering algorithms Using weka tools Bharat Chaudhari 1, Manan Parikh 2 1,2 MECSE, KITRC KALOL ABSTRACT Data clustering is a process of putting similar data into groups. A clustering

### Improving students learning process by analyzing patterns produced with data mining methods

Improving students learning process by analyzing patterns produced with data mining methods Lule Ahmedi, Eliot Bytyçi, Blerim Rexha, and Valon Raça Abstract Employing data mining algorithms on previous

### Impact of Boolean factorization as preprocessing methods for classification of Boolean data

Impact of Boolean factorization as preprocessing methods for classification of Boolean data Radim Belohlavek, Jan Outrata, Martin Trnecka Data Analysis and Modeling Lab (DAMOL) Dept. Computer Science,

### Prediction Model for Crude Oil Price Using Artificial Neural Networks

Applied Mathematical Sciences, Vol. 8, 2014, no. 80, 3953-3965 HIKARI Ltd, www.m-hikari.com http://dx.doi.org/10.12988/ams.2014.43193 Prediction Model for Crude Oil Price Using Artificial Neural Networks

### KEITH LEHNERT AND ERIC FRIEDRICH

MACHINE LEARNING CLASSIFICATION OF MALICIOUS NETWORK TRAFFIC KEITH LEHNERT AND ERIC FRIEDRICH 1. Introduction 1.1. Intrusion Detection Systems. In our society, information systems are everywhere. They

### Extension of Decision Tree Algorithm for Stream Data Mining Using Real Data

Fifth International Workshop on Computational Intelligence & Applications IEEE SMC Hiroshima Chapter, Hiroshima University, Japan, November 10, 11 & 12, 2009 Extension of Decision Tree Algorithm for Stream

### COURSE RECOMMENDER SYSTEM IN E-LEARNING

International Journal of Computer Science and Communication Vol. 3, No. 1, January-June 2012, pp. 159-164 COURSE RECOMMENDER SYSTEM IN E-LEARNING Sunita B Aher 1, Lobo L.M.R.J. 2 1 M.E. (CSE)-II, Walchand

### Drug Store Sales Prediction

Drug Store Sales Prediction Chenghao Wang, Yang Li Abstract - In this paper we tried to apply machine learning algorithm into a real world problem drug store sales forecasting. Given store information,

### Knowledge Discovery from patents using KMX Text Analytics

Knowledge Discovery from patents using KMX Text Analytics Dr. Anton Heijs anton.heijs@treparel.com Treparel Abstract In this white paper we discuss how the KMX technology of Treparel can help searchers

### 131-1. Adding New Level in KDD to Make the Web Usage Mining More Efficient. Abstract. 1. Introduction [1]. 1/10

1/10 131-1 Adding New Level in KDD to Make the Web Usage Mining More Efficient Mohammad Ala a AL_Hamami PHD Student, Lecturer m_ah_1@yahoocom Soukaena Hassan Hashem PHD Student, Lecturer soukaena_hassan@yahoocom

### Analysis of WEKA Data Mining Algorithm REPTree, Simple Cart and RandomTree for Classification of Indian News

Analysis of WEKA Data Mining Algorithm REPTree, Simple Cart and RandomTree for Classification of Indian News Sushilkumar Kalmegh Associate Professor, Department of Computer Science, Sant Gadge Baba Amravati

### DATA MINING TOOL FOR INTEGRATED COMPLAINT MANAGEMENT SYSTEM WEKA 3.6.7

DATA MINING TOOL FOR INTEGRATED COMPLAINT MANAGEMENT SYSTEM WEKA 3.6.7 UNDER THE GUIDANCE Dr. N.P. DHAVALE, DGM, INFINET Department SUBMITTED TO INSTITUTE FOR DEVELOPMENT AND RESEARCH IN BANKING TECHNOLOGY

### Data Mining Analytics for Business Intelligence and Decision Support

Data Mining Analytics for Business Intelligence and Decision Support Chid Apte, T.J. Watson Research Center, IBM Research Division Knowledge Discovery and Data Mining (KDD) techniques are used for analyzing

### NTC Project: S01-PH10 (formerly I01-P10) 1 Forecasting Women s Apparel Sales Using Mathematical Modeling

1 Forecasting Women s Apparel Sales Using Mathematical Modeling Celia Frank* 1, Balaji Vemulapalli 1, Les M. Sztandera 2, Amar Raheja 3 1 School of Textiles and Materials Technology 2 Computer Information

### Neural Networks in Data Mining

IOSR Journal of Engineering (IOSRJEN) ISSN (e): 2250-3021, ISSN (p): 2278-8719 Vol. 04, Issue 03 (March. 2014), V6 PP 01-06 www.iosrjen.org Neural Networks in Data Mining Ripundeep Singh Gill, Ashima Department

### C19 Machine Learning

C9 Machine Learning 8 Lectures Hilary Term 25 2 Tutorial Sheets A. Zisserman Overview: Supervised classification perceptron, support vector machine, loss functions, kernels, random forests, neural networks

### MLeXAI Workshop. Ingrid Russell, Zdravko Markov. Sanibel Island, FL, May 18, 2009

MLeXAI Workshop Ingrid Russell, Zdravko Markov Sanibel Island, FL, May 18, 2009 Web Document Classification Originally developed in summer 2005 by Ingrid Russell, Zdravko Markov, and Todd Neller Revised

### K-means Clustering Technique on Search Engine Dataset using Data Mining Tool

International Journal of Information and Computation Technology. ISSN 0974-2239 Volume 3, Number 6 (2013), pp. 505-510 International Research Publications House http://www. irphouse.com /ijict.htm K-means

### STOCK MARKET TRENDS USING CLUSTER ANALYSIS AND ARIMA MODEL

Stock Asian-African Market Trends Journal using of Economics Cluster Analysis and Econometrics, and ARIMA Model Vol. 13, No. 2, 2013: 303-308 303 STOCK MARKET TRENDS USING CLUSTER ANALYSIS AND ARIMA MODEL

### SPATIAL DATA CLASSIFICATION AND DATA MINING

, pp.-40-44. Available online at http://www. bioinfo. in/contents. php?id=42 SPATIAL DATA CLASSIFICATION AND DATA MINING RATHI J.B. * AND PATIL A.D. Department of Computer Science & Engineering, Jawaharlal

### DATA MINING TECHNIQUES SUPPORT TO KNOWLEGDE OF BUSINESS INTELLIGENT SYSTEM

INTERNATIONAL JOURNAL OF RESEARCH IN COMPUTER APPLICATIONS AND ROBOTICS ISSN 2320-7345 DATA MINING TECHNIQUES SUPPORT TO KNOWLEGDE OF BUSINESS INTELLIGENT SYSTEM M. Mayilvaganan 1, S. Aparna 2 1 Associate

### Contents WEKA Microsoft SQL Database

WEKA User Manual Contents WEKA Introduction 3 Background information. 3 Installation. 3 Where to get WEKA... 3 Downloading Information... 3 Opening the program.. 4 Chooser Menu. 4-6 Preprocessing... 6-7

### Open-Source Machine Learning: R Meets Weka

Open-Source Machine Learning: R Meets Weka Kurt Hornik Christian Buchta Achim Zeileis Weka? Weka is not only a flightless endemic bird of New Zealand (Gallirallus australis, picture from Wekapedia) but

### Forecasting Trade Direction and Size of Future Contracts Using Deep Belief Network

Forecasting Trade Direction and Size of Future Contracts Using Deep Belief Network Anthony Lai (aslai), MK Li (lilemon), Foon Wang Pong (ppong) Abstract Algorithmic trading, high frequency trading (HFT)

### COURSE SYLLABUS COURSE TITLE:

1 COURSE SYLLABUS COURSE TITLE: FORMAT: CERTIFICATION EXAMS: 55040 Data Mining: Predictive Analytics with Microsoft SQL Server Analysis Services and Excel Using PowerPivot and the Data Mining Add-Ins Instructor-Led

### DBTech Pro Workshop. Knowledge Discovery from Databases (KDD) Including Data Warehousing and Data Mining. Georgios Evangelidis

DBTechNet DBTech Pro Workshop Knowledge Discovery from Databases (KDD) Including Data Warehousing and Data Mining Dimitris A. Dervos dad@it.teithe.gr http://aetos.it.teithe.gr/~dad Georgios Evangelidis

### Predicting the Risk of Heart Attacks using Neural Network and Decision Tree

Predicting the Risk of Heart Attacks using Neural Network and Decision Tree S.Florence 1, N.G.Bhuvaneswari Amma 2, G.Annapoorani 3, K.Malathi 4 PG Scholar, Indian Institute of Information Technology, Srirangam,

### The Operational Value of Social Media Information. Social Media and Customer Interaction

The Operational Value of Social Media Information Dennis J. Zhang (Kellogg School of Management) Ruomeng Cui (Kelley School of Business) Santiago Gallino (Tuck School of Business) Antonio Moreno-Garcia

### Review of Computer Engineering Research WEB PAGES CATEGORIZATION BASED ON CLASSIFICATION & OUTLIER ANALYSIS THROUGH FSVM. Geeta R.B.* Shobha R.B.

Review of Computer Engineering Research journal homepage: http://www.pakinsight.com/?ic=journal&journal=76 WEB PAGES CATEGORIZATION BASED ON CLASSIFICATION & OUTLIER ANALYSIS THROUGH FSVM Geeta R.B.* Department

### Critical Regression Analysis of Real Time Industrial Web Data Set using Data Mining Tool

Critical Regression Analysis of Real Time Industrial Web Data Set using Data Mining Tool Shruti Kohli Assistant Professor, Department of Computer Science and Engineering Birla Institute of Technology,

### Server Load Prediction

Server Load Prediction Suthee Chaidaroon (unsuthee@stanford.edu) Joon Yeong Kim (kim64@stanford.edu) Jonghan Seo (jonghan@stanford.edu) Abstract Estimating server load average is one of the methods that

### College of Health and Human Services. Fall 2013. Syllabus

College of Health and Human Services Fall 2013 Syllabus information placement Instructor description objectives HAP 780 : Data Mining in Health Care Time: Mondays, 7.20pm 10pm (except for 3 rd lecture

### Prediction Models for a Smart Home based Health Care System

Prediction Models for a Smart Home based Health Care System Vikramaditya R. Jakkula 1, Diane J. Cook 2, Gaurav Jain 3. Washington State University, School of Electrical Engineering and Computer Science,

### What is Data Mining? Data Mining (Knowledge discovery in database) Data mining: Basic steps. Mining tasks. Classification: YES, NO

What is Data Mining? Data Mining (Knowledge discovery in database) Data Mining: "The non trivial extraction of implicit, previously unknown, and potentially useful information from data" William J Frawley,

### An application for clickstream analysis

An application for clickstream analysis C. E. Dinucă Abstract In the Internet age there are stored enormous amounts of data daily. Nowadays, using data mining techniques to extract knowledge from web log

### Clustering Marketing Datasets with Data Mining Techniques

Clustering Marketing Datasets with Data Mining Techniques Özgür Örnek International Burch University, Sarajevo oornek@ibu.edu.ba Abdülhamit Subaşı International Burch University, Sarajevo asubasi@ibu.edu.ba