International Journal of Computer Trends and Technology (IJCTT) volume 4 Issue 8 August 2013

A Short-Term Traffic Prediction On A Distributed Network Using Multiple Regression Equation

Ms. Sharmi S., Research Scholar, MS University, Thirunelvelli
Dr. M. Punithavalli, Director, SREC, Coimbatore

Abstract: A new approach is proposed to predict the fractal behavior of distributed network traffic, in which a random scaling fractal model is used to simulate the self-affine characteristics of the traffic. The network traffic is studied by sniffing a portion of it with Wireshark, an open-source network analyzer. The sniffed traffic is inspected and dissected per protocol using the filter option, and its fractal behavior is examined. The sniffed packet records are then exported to NeuroSolutions builder and SPSS and examined there. Finally, the exported and dissected traffic data is fed as input to train a neural network to predict the resulting fractal behavior of the distributed network traffic, and an equation is derived in SPSS that gives a close prediction of the network traffic.

Keywords: fractal behavior, sniffing, predict, SPSS, NeuroSolution builder, NeuroXL predictor.

I INTRODUCTION

For the examination of local problems in a small network, monitoring at a single observation point is sufficient to train the network builder. In such cases a network analyzer may be used, for example a machine running Wireshark that is directly connected to a network segment or to the monitoring port of a switch or router. In larger networks, it is often necessary to monitor multiple observation points simultaneously in order to train the constructed neural network more efficiently. In this research a neural network (multilayer perceptron) is proposed to predict the dependent variable values over different distributions of independent variable values, using two specific modeling tools, namely SPSS and NeuroSolutions.
One objective of this work is to find how the distribution of dependent variable values in the dataset affects the neural network's prediction performance under different modeling tools. A second objective is to compare the performance of the two modeling tools in predicting the dependent variable values.

Analyzing packet records with Wireshark

Wireshark [1], formerly known as Ethereal, is probably the most popular open-source network analyzer. For the experiments, we configured Wireshark on our machine to capture network packets. The collected data is exported in comma-separated values (.csv) format. Wireshark can be divided into four main modules: Capture Core, WireTap, Protocol Interpreter and Dissector. Capture Core uses the common library WinPcap to capture data from different networks (Ethernet, Ring, etc.). Once the data is obtained, WireTap saves it as a binary file. Because the data is binary, the user cannot understand it without the Protocol Interpreter and Dissector. A dissector can be built in or supplied as a plug-in. The proposed approach profits from Wireshark's extensive packet inspection and protocol dissection capabilities for distributed network analysis.

NeuroSolutions

The NeuralBuilder helps to construct the neural network by selecting parameters. The four currently available problem types in the NeuralExpert are Classification, Prediction, Function Approximation, and Clustering. A parameter list is then selected to train the neural network, and the desired traffic is set as the output used to train the network.

ISSN: Page 2452
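The CSV export described under "Analyzing packet records with Wireshark" has to be reduced to per-protocol traffic per time slice before it can serve as input to a modeling tool. The following is a minimal sketch of that step, assuming the export uses Wireshark's default packet-list columns (Time, Protocol, Length). The function name `bytes_per_slice`, the one-second slice width and the sample rows are illustrative assumptions and are not taken from the paper.

```python
import csv
import io
from collections import defaultdict


def bytes_per_slice(csv_text, slice_seconds=1.0):
    """Sum packet lengths per (time slice, protocol).

    Assumes columns named "Time" (seconds since capture start),
    "Protocol" and "Length", as in a default Wireshark CSV export.
    """
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        slice_index = int(float(row["Time"]) // slice_seconds)
        totals[(slice_index, row["Protocol"])] += int(row["Length"])
    return dict(totals)


# Hypothetical sample rows in the assumed export layout.
SAMPLE = '''"No.","Time","Source","Destination","Protocol","Length","Info"
"1","0.000000","10.0.0.1","10.0.0.2","TCP","60","SYN"
"2","0.400000","10.0.0.2","10.0.0.1","TCP","60","SYN, ACK"
"3","0.900000","10.0.0.1","10.0.0.3","DNS","74","Standard query"
"4","1.200000","10.0.0.1","10.0.0.2","TCP","1514","Data"
'''

traffic = bytes_per_slice(SAMPLE)
for (slice_index, protocol), total in sorted(traffic.items()):
    print(slice_index, protocol, total)
```

Each (slice, protocol) total can then be used as one independent variable value for that time slice, with the total traffic of the slice as the dependent variable.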

Figure 1. Flow diagram to deploy traffic prediction using ANN.

An ANN is a computational method motivated by biological models. ANNs attempt to mimic the fundamental operation of the human brain and can be used to solve a broad variety of problems [10]. One of the most important features of ANNs is that they can discover hidden patterns in data sets [11] and solve complex problems, especially when a mathematical model does not exist (or when the model is not suitable for the case at hand). Furthermore, ANNs are commonly robust to noise and irregularities present in the data [12, 13]. ANN learning is typically based on two data sets: the training set and the validation set. The training set, as its name indicates, is used to train a new artificial neural network. The validation set is used after the neural network has been trained, to assess its performance. In most cases the validation set is similar to the training set but not the same [14, 15].

Data mapping

In artificial intelligence, a desired output is commonly known as the target. For the specific case of ANNs, the target is used for network training [9]. ANNs can map a given input to a desired output. An ANN used for this purpose is typically called a mapping ANN. The network is trained by applying the desired input to the ANN and then monitoring the actual ANN output. The difference between the actual ANN output and the desired output is normally used to drive the learning process. During training, the learning algorithm attempts to reduce the error measured between the actual network output and the target in the training set [9, 11]. The training process may be time consuming, but when it has been successfully completed, an ANN can quickly calculate its output once the input data has been applied to the network input.

Figure 2: Flow diagram of ANN using NeuroSolutions. Flowchart boxes: Start; dissect the network traffic dataset and translate the network traffic data parameters; train the NN's architecture for N epochs; evaluate the performance; criteria satisfied? (Y/N); extract a new dissected traffic dataset; perform prediction of the original expected traffic; Stop.

Data classification

Data classification, or just classification, is the process of identifying an object from a set of possible outcomes [9, 12]. An ANN can be trained to identify and classify any kind of object. These objects can be numbers, images, sounds, signals, etc. An ANN used for this purpose is also known as a classifier.

Figure 3. Training fractal-dataset graph
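The training loop described under "Data mapping" (apply an input, compare the actual output with the target, and adjust the network to reduce the error) can be illustrated with a very small multilayer perceptron. This is a sketch under stated assumptions, not the NeuroSolutions implementation: the network size, learning rate, epoch count and the toy target function x squared are all illustrative choices, and the names `train_mlp`, `predict` and `mse` are hypothetical.

```python
import math
import random


def train_mlp(train_set, hidden=4, epochs=2000, lr=0.05, seed=0):
    """Train a 1-input, 1-output MLP (one tanh hidden layer, linear output)
    by per-sample gradient descent on the squared error."""
    rng = random.Random(seed)
    w1 = [rng.uniform(-1.0, 1.0) for _ in range(hidden)]  # input -> hidden
    b1 = [0.0] * hidden
    w2 = [rng.uniform(-1.0, 1.0) for _ in range(hidden)]  # hidden -> output
    b2 = 0.0
    for _ in range(epochs):
        for x, target in train_set:
            h = [math.tanh(w1[j] * x + b1[j]) for j in range(hidden)]
            out = sum(w2[j] * h[j] for j in range(hidden)) + b2
            err = out - target  # actual output minus desired output
            for j in range(hidden):
                grad_hidden = err * w2[j] * (1.0 - h[j] ** 2)
                w2[j] -= lr * err * h[j]
                w1[j] -= lr * grad_hidden * x
                b1[j] -= lr * grad_hidden
            b2 -= lr * err
    return w1, b1, w2, b2


def predict(model, x):
    w1, b1, w2, b2 = model
    return sum(w2[j] * math.tanh(w1[j] * x + b1[j]) for j in range(len(w1))) + b2


def mse(model, data):
    """Mean squared error between network output and target."""
    return sum((predict(model, x) - t) ** 2 for x, t in data) / len(data)


# Toy data: the validation inputs are similar to, but not the same as,
# the training inputs.
train_set = [(i / 10, (i / 10) ** 2) for i in range(-10, 11, 2)]
valid_set = [(i / 10, (i / 10) ** 2) for i in range(-9, 10, 2)]

untrained = train_mlp(train_set, epochs=0)  # same initial weights, no training
trained = train_mlp(train_set)

print("training error:  ", mse(untrained, train_set), "->", mse(trained, train_set))
print("validation error:", mse(untrained, valid_set), "->", mse(trained, valid_set))
```

The validation error is computed only after training and on inputs the network has not seen, which mirrors the role of the validation set described above.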

The network is trained initially with a traffic dataset downloaded from the Wireshark sample captures as a pcap file, and the data is exported to the network builder for prediction. The predicted fractal behavior of the traffic dataset is shown in Table 1.

II INVESTIGATION OF CORRELATION COEFFICIENT VALUE

This section investigates the effect of the dependent variable values and their distribution on the prediction accuracy rate. The results of the analyses show the effect of the distribution of dependent variable values on prediction accuracy, and this leads to an equation that predicts the expected traffic from the distribution of independent variable values, using the modeling tool SPSS.

The correlation coefficient, R, is a measure of the strength of the association between the independent (explanatory) variables and the dependent (prediction) variable. R is never negative, because it is defined as the positive square root [2, 3]. In its standard form for two independent variables, X1 and X2, the formula for R is:

R = \sqrt{\frac{r_{YX_1}^2 + r_{YX_2}^2 - 2\, r_{YX_1} r_{YX_2} r_{X_1X_2}}{1 - r_{X_1X_2}^2}}

where each r is the pairwise correlation between the variables in its subscript and Y is the dependent variable.

The coefficient of multiple correlation estimates the combined influence of two or more variables on the observed (dependent) variable. To analyse the traffic data using multiple regression, the following assumptions must first be verified [8]:

- The dependent variable is measured on a continuous scale.
- There are two or more independent variables, which are continuous or categorical.
- Observations should be recorded.
- A linear relationship exists between the dependent variable and each of the independent variables.
- The traffic data shows homoscedasticity, that is, the variances along the line of best fit remain similar as one moves along the line.
- The data does not show multicollinearity, which occurs when two or more independent variables are highly correlated.
- There are no significant outliers, high leverage points or highly influential points.
- Residuals (errors) are approximately normally distributed.

None of the assumptions listed above is violated, so the multiple correlation coefficient, R, is computed to measure the strength of the association between the independent (explanatory) variables and a single dependent (prediction) variable.

Multiple Regression-booster prediction phases

In MR-Booster, explicitly using each feature of the association between the actual traffic and the dissected traffic helps to generate the prediction equation, and probing the standard error factor further refines the regression equation that predicts the network traffic. The correlation structure of the traffic is thus generated in a much easier way.

Phase 1:
a. Plot the sniffed traffic data as a scatter plot to see whether a linear relationship is plausible.
b. Calculate and interpret the linear correlation coefficient using the data sets.

Phase 2:
c. Determine all possible regression equations for the data, refining them further by adjusting the constant standard error.
d. Select and apply the best generated regression equation and forecast.

Phase 3:
e. Identify outliers and note the observations.
f. Process and interpret the performance of the R-booster prediction.

Table 1. Descriptive Statistics (SPSS)
                  Mean    Std. Deviation    N
Actual-Traffic
Traffic-n1
Traffic-n2
Traffic-n3

Table 2. Correlation Coefficients (a: dependent variable is the actual traffic graph)
                  Unstandardized Coefficients    Standardized Coefficients
Model             R          Std. Error          Beta                         T    Sig.
1 (Constant)
Network1 (n1)
Network2 (n2)
Network3 (n3)

The following equation was generated to predict the actual traffic from the dissected protocol traffic:

Predicted traffic (w.r.t. time slice) = n1 * (R(n1) standard Error-n1) + n2 * (R(n2) standard Error) + n3 * (R(n3) standard Error) + (R-constant standard Error)

Predicted-traffic = Traffic-n1 * 0.873 + Traffic-n2 * 1.015 + Traffic-n3 *

The R values show that traffic from n1 and n2 has a strong association with the actual traffic, whereas traffic from n3 has a weak association, as interpreted in Table 3.

Table 3. R value strength
R value    Interpretation
0.9        strong association
0.5        moderate association
0.25       weak association
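Coefficients of the kind reported in Table 2 come from an ordinary least-squares fit of the actual traffic on the per-network traffic, and R is the positive square root of the resulting R squared. The sketch below shows that generic fit in plain Python, assuming three predictors as in the paper. It does not reproduce the paper's standard-error adjustment, and the traffic values, as well as the names `solve`, `fit`, `predict` and `multiple_r`, are hypothetical.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit(rows, y):
    """Least-squares coefficients [constant, b1, ..., bp] via the normal equations."""
    design = [[1.0] + list(r) for r in rows]
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(p)]
    return solve(xtx, xty)


def predict(coef, row):
    """Predicted traffic = constant + b1 * n1 + b2 * n2 + b3 * n3."""
    return coef[0] + sum(b * v for b, v in zip(coef[1:], row))


def multiple_r(coef, rows, y):
    """Multiple correlation coefficient R: the positive square root of R squared."""
    mean_y = sum(y) / len(y)
    ss_res = sum((t - predict(coef, r)) ** 2 for r, t in zip(rows, y))
    ss_tot = sum((t - mean_y) ** 2 for t in y)
    return math.sqrt(max(0.0, 1.0 - ss_res / ss_tot))


# Hypothetical traffic per time slice from networks n1, n2, n3,
# and the hypothetical actual traffic observed in the same slices.
rows = [(120, 80, 5), (150, 60, 9), (90, 110, 4),
        (200, 95, 12), (170, 130, 7), (110, 70, 15)]
actual = [200.5, 204.9, 202.4, 284.2, 294.7, 179.5]

coef = fit(rows, actual)
r_value = multiple_r(coef, rows, actual)
print("constant and coefficients:", coef)
print("multiple R:", r_value)
print("predicted for slice 1:", predict(coef, rows[0]))
```

Solving the normal equations directly keeps the example dependency-free. For larger or nearly collinear data a QR-based solver from a numerical library would be the more robust choice.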

Figure 4. Actual traffic vs. predicted traffic (NeuroSolutions) and computed traffic (SPSS)

Figure 4 shows that the traffic computed using the generated equation is very close to the actual target traffic.

III PERFORMANCE EVALUATION

The overall performance of the analyzed prediction methods is stated here to estimate the prediction accuracy. The coefficient of efficiency (E) is one such measure of performance. E can take values in the domain (−∞, 1]. If E = 1, we have a perfect fit between the observed and the forecasted data. A value of E = 0 occurs when the prediction is only as good as estimating the mean of the actual values. An efficiency less than zero, i.e. −∞ < E < 0, indicates that the average of the actual values is a better predictor than the analyzed forecasting method. The closer E is to 1, the more accurate the prediction. For the forecasted traffic, the coefficient of efficiency stays at 0.9.

IV CONCLUSION

The experimental results demonstrate that 1) the regression model is more effective for traffic prediction; and 2) both the proposed prediction equation and the standard-error-based R (correlation coefficient) update scheme are effective in predicting the traffic in an easier way. The goal of the experiments was to evaluate and compare the performance of the ANN prediction approaches presented earlier in this paper. Hence, the linear regression model is a powerful tool for analyzing the association between one or more independent variables and a single dependent variable. Some novice researchers wish to move quickly beyond this model to more sophisticated ones, because they are discouraged by its limitations and believe that other regression models are more appropriate for their analysis needs.

References

[1] Wireshark Homepage, www.wireshark.org
[2] ClearSight Networks, Inc. Homepage
[3] https://statistics.laerd.com/spss-tutorials/multiple-regression-using-spss-statistics.php
[4] http://en.wikipedia.org/wiki/Multiple_correlation
[5] http://www.yeatts.us/6200-Multivariate%20Stats/Lectures-tests/Test%202/Week-12-assumptions.pdf
[6] WildPackets, Inc. Homepage, http://www.wildpackets.com
[7] S. Waldbusser, Remote Network Monitoring Management Information Base, RFC 2819 (Standard), May
[8] T. Masters, Practical Neural Network Recipes in C++. Preparing Input Data (C-16), Academic Press, Inc., pp , (1993).
[9] S. J. Russell and P. Norvig, Artificial Intelligence: A Modern Approach. Prentice-Hall of India, Second Edition. Statistical Learning Methods (C-20), pp , (2006).
[10] T. Masters, Neural, Novel & Hybrid Algorithms for Time Series Prediction. Neural Network Tools (C-10), John Wiley & Sons Inc., pp , (1995).
[11] T. Masters, Signal and Image Processing With Neural Networks. Data Preparation for Neural Networks (C-3), John Wiley & Sons Inc., pp , (1994).
[12] T. Masters, Advanced Algorithms for Neural Networks. Assessing Generalization Ability (C-9), John Wiley & Sons Inc., pp , (1995).
[13] R. D. Reed and R. J. Marks II, Neural Smithing: Supervised Learning in Feedforward Artificial Neural Networks. Factors Influencing Generalization (C-14), The MIT Press, pp , (1999).
[14] http://en.wikipedia.org/wiki/Neural_Lab
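The coefficient of efficiency used in Section III can be computed directly from the actual and predicted traffic. The paper does not print its formula, so the sketch below assumes the usual definition, E = 1 − (sum of squared prediction errors) / (sum of squared deviations of the actual values from their mean), which matches the properties stated there (E = 1 for a perfect fit, E = 0 for a mean-only predictor, E < 0 for a worse one). The function name `efficiency` and the sample values are hypothetical.

```python
def efficiency(actual, predicted):
    """Coefficient of efficiency E = 1 - SSE / SST (assumed definition)."""
    mean_actual = sum(actual) / len(actual)
    ss_err = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_err / ss_tot


# Hypothetical actual traffic and three hypothetical predictors.
actual = [10.0, 20.0, 30.0, 40.0]
perfect = list(actual)                 # reproduces every actual value
mean_only = [25.0, 25.0, 25.0, 25.0]   # always predicts the mean
reversed_order = [40.0, 30.0, 20.0, 10.0]

print(efficiency(actual, perfect))
print(efficiency(actual, mean_only))
print(efficiency(actual, reversed_order))
```

A predictor scores below zero whenever its squared error exceeds the spread of the actual values around their own mean, which is why the mean serves as the baseline.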


More information

Web Site Visit Forecasting Using Data Mining Techniques

Web Site Visit Forecasting Using Data Mining Techniques Web Site Visit Forecasting Using Data Mining Techniques Chandana Napagoda Abstract: Data mining is a technique which is used for identifying relationships between various large amounts of data in many

More information

HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION

HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION HOD 2990 10 November 2010 Lecture Background This is a lightning speed summary of introductory statistical methods for senior undergraduate

More information

System Specification. Author: CMU Team

System Specification. Author: CMU Team System Specification Author: CMU Team Date: 09/23/2005 Table of Contents: 1. Introduction...2 1.1. Enhancement of vulnerability scanning tools reports 2 1.2. Intelligent monitoring of traffic to detect

More information

PASS Sample Size Software. Linear Regression

PASS Sample Size Software. Linear Regression Chapter 855 Introduction Linear regression is a commonly used procedure in statistical analysis. One of the main objectives in linear regression analysis is to test hypotheses about the slope (sometimes

More information
