Empirical Study of Decision Tree and Artificial Neural Network Algorithm for Mining Educational Database


A.O. Osofisan 1, O.O. Adeyemo 2 & S.T. Oluwasusi 3
Department of Computer Science, University of Ibadan, Ibadan, Oyo State, Nigeria.
nikeosofisan@gmail.com, wumiglory@yahoo.com
1 Corresponding author:

ABSTRACT

The ability to predict students' performance is very important in educational environments because it plays an important role in producing high-quality graduates and postgraduates who will become the leaders of tomorrow and a source of manpower for the country. The performance of students in universities is therefore of utmost concern. One way to address it is to discover knowledge for prediction, for example regarding the enrolment of students in a particular course or students' performance. This knowledge is hidden in educational data sets and can be extracted through data mining techniques.

Over the years, many students who enrolled in the University of Ibadan M.Sc. programme were unable to complete it because there were no supporting tools to help them make the best decision before enrolment. Some also finished with poor grades, because enrolment decisions were based only on personal experience, which many students lack. This wastes resources from both the student's and the department's point of view: the students spend time on a course they have neither the ability nor the interest to complete, and the department spends resources that could have been applied elsewhere or used for students who deserved admission but were not admitted.

The aim of this research work is to use data mining techniques to study students' performance in order to discover appropriate knowledge and extract useful patterns from existing stored student data. The knowledge and patterns extracted would be used for decision making. The specific objectives are to discover knowledge for prediction regarding enrolment of students in a particular course and so enhance decision making, to improve students' performance and overcome the problem of low grades among graduate students, and to identify an efficient algorithm for mining data in the educational sector.

The work investigates the educational domain of data mining using a case study of M.Sc. student data from the Computer Science department, University of Ibadan. The data comprised four hundred and eleven (411) student records. The classification task is used to evaluate students' performance and, of the many approaches available for data classification, the neural network and decision tree methods were used. The results of the two methods are compared to determine which gives the better classification results and prediction capability in EDM. For the modelling stage, the open source software WEKA was used. The data set was divided into a training set and a testing set: seventy percent (70%) was used for training and thirty percent (30%) for testing.

In the experiment, the neural network gave better results as the number of hidden layers increased. The results clearly demonstrated a superior performance of the neural network over the decision tree, not only in the number of correctly classified instances but also in terms of RMSE, MAE and RAE. The neural network performed well in classification as well as in prediction but was slow. The decision tree was fast but classified poorly, although the rules it generates make it clearer and easier to understand. The neural network gives the best classification results as well as prediction capability in EDM.

Keywords: Data Mining (DM), Knowledge Discovery in Databases (KDD), Educational Data Mining (EDM), Classification, Prediction, Decision Trees, Neural Network.

Reference Format: A.O. Osofisan 1, O.O. Adeyemo 2 & S.T. Oluwasusi 3 (2014). Empirical Study of Decision Tree and Artificial Neural Network Algorithm for Mining Educational Database. Afr J. of Comp & ICTs. Vol 7, No. 2. Pp

1. INTRODUCTION

Students are the major assets of a university. The ability to evaluate and predict students' performance is very important in educational environments because it plays an important role in producing high-quality graduates and postgraduates who will become the leaders of tomorrow and a source of manpower for the country. The performance of students in universities is therefore of utmost concern. Knowledge for prediction regarding the enrolment of students in a particular course, the detection of abnormal values in students' result sheets, and students' performance is hidden within educational data sets. This hidden information can be extracted through data mining techniques.

Data Mining (DM) focuses on methodologies for extracting useful knowledge from large amounts of data. Several useful DM tools exist for extracting knowledge, and such knowledge, if found in student databases, may be used to increase the quality of education. The evolution of information technology has made the collection, processing, transfer and storage of huge amounts of data easier and cheaper, to meet the increasing demand for information. As huge amounts of data are collected and stored in various formats (records, files, documents, images, sound, video, scientific data), traditional statistical techniques and database management tools are no longer adequate for analyzing them, hence the need for proper and efficient knowledge extraction tools such as data mining [1].

1.1 Data Mining

Data mining techniques operate on large volumes of data to discover hidden patterns and relationships that are helpful in decision making. While data mining and Knowledge Discovery in Databases (KDD) are frequently treated as synonyms, data mining is actually one step in the KDD process.

The aim of this research work is to use data mining techniques to study students' performance in order to discover appropriate knowledge and extract useful patterns from existing stored student data. The knowledge and patterns extracted would be used for decision making. The main attribute of DM is that it identifies valid, novel, potentially useful and understandable patterns in data repositories, thereby contributing to the prediction of outcome trends by profiling performance attributes that support effective decision making [2].

DM has been used successfully in different areas, including the educational environment. The application of DM in educational systems is referred to as Educational Data Mining (EDM). EDM uses many techniques, such as Decision Trees, Neural Networks, Naive Bayes, K-Nearest Neighbor, k-means, Support Vector Machines and Expectation Maximization, but the methods used in this work are Decision Trees and Neural Networks.

1.2 The specific Objectives are:

To discover knowledge for prediction regarding enrolment of students in a particular course and enhance decision making.
To improve students' performance and overcome the problem of low grades among graduate students.
To identify an efficient algorithm for mining data in the educational sector.

The Knowledge Discovery in Databases process comprises a few steps leading from raw data collections to some form of new knowledge. The iterative process consists of the following steps:

Data cleaning: also known as data cleansing, a phase in which noise and irrelevant data are removed from the collection.
Data integration: at this stage, multiple data sources, often heterogeneous, may be combined into a common source.
Data selection: at this step, the data relevant to the analysis is decided on and retrieved from the data collection.
Data transformation: also known as data consolidation, a phase in which the selected data is transformed into forms appropriate for the mining procedure.
Data mining: the crucial step in which clever techniques are applied to extract potentially useful patterns.
Pattern evaluation: in this step, strictly interesting patterns representing knowledge are identified based on given measures.
Knowledge representation: the final phase, in which the discovered knowledge is visually represented to the user. This essential step uses visualization techniques to help users understand and interpret the data mining results.

1.3 Decision Tree

A decision tree is a flow-chart-like tree structure in which each internal node is denoted by a rectangle and each leaf node by an oval. All internal nodes have two or more child nodes and contain splits, which test the value of an expression of the attributes. Arcs from an internal node to its children are labelled with the distinct outcomes of the test. Each leaf node has a class label associated with it.

Decision trees are powerful and popular for both classification and prediction. The attractiveness of tree-based methods is due largely to the fact that decision trees represent rules, and rules can readily be expressed in English so that humans can understand them. Decision trees are produced by algorithms that identify various ways of splitting a dataset into branch-like segments. These segments form an inverted decision tree that originates with a root node at the top of the tree. The object of analysis is reflected in this root node as a simple, one-dimensional display in the decision tree interface. The name of the field of data that is the object of analysis is usually displayed, along with the spread or distribution of the values contained in that field.

1.4 Artificial Neural Network

An artificial neural network, simply called a neural network, is a mathematical model inspired by biological neural networks. A neural network consists of an interconnected group of artificial neurons, and it processes information using a connectionist approach to computation. In most cases a neural network is an adaptive system that changes its structure during a learning phase. Neural networks are used for modelling complex relationships between inputs and outputs. With their remarkable ability to derive meaning from complicated data, they can be used to extract patterns and detect trends that are too complex to be noticed by humans or other computer techniques. A trained neural network can be thought of as an expert in the category of information it has been given to analyze.

A neural network is usually structured into an input layer of neurons, one or more hidden layers and one output layer. Neurons belonging to adjacent layers are usually fully connected, and the various types and architectures are identified both by the topologies adopted for the connections and by the choice of activation function. The values associated with the connections are called weights. For a neural network to yield appropriate outputs for given inputs, the weights must be set to suitable values. The way this is achieved allows a further distinction among modes of operation.

Figure 1: Neural Network

2. RELATED WORKS

[3] gave a case study of mining students' data to analyze learning behaviour, using data on 151 students collected from a database management system course held at the Islamic University of Gaza in the first semester of 2007/2008, including their usage of the Moodle e-learning facility. Four data mining tasks, namely association rules, classification, clustering and outlier detection, were applied to the data, and it was found that the knowledge discovered by each of them can be used to improve students' performance.

[4] investigated the relationship between academic background and the performance of students in a computer science programme in a Nigerian university. The results indicate that the grade obtained in mathematics in the senior secondary certificate examination (SSCE) is the strongest determinant used by the C4.5 learning algorithm in building the model of students' performance. Another finding is that even if a student does not finish the programme in the normal four academic sessions, for whatever reason, the student would still graduate with at least a second class lower if he or she took further mathematics at SSCE. Students who spend more than four academic sessions in the programme and did not take further mathematics at SSCE are more likely to graduate below second class lower.

[5] conducted a comparative study on predicting students' performance by sampling 48 students of the MCA (Master of Computer Application) course in the computer applications department of VBS Purvanchal University, Jaunpur (Uttar Pradesh), India, starting from the 2008 session. Three decision tree algorithms (ID3, C4.5 and CART) were used in order to compare their accuracy. The results indicate that CART is the best of the three for classifying the data.

[6] carried out research on mining education data to predict student retention.
In the study, machine learning algorithms (ID3, C4.5 and ADT) were applied to analyze and extract information from existing student data in order to establish predictive models. The study shows that a machine learning algorithm such as the Alternating Decision Tree (ADT) can learn predictive models from student retention data accumulated in previous years.

[7] applied data classification and decision tree methods in order to improve student performance. The data set was obtained from the 2009 to 2012 batch of the M.Sc. IT programme of a department of Information Technology. Extracurricular activities were also included. The information generated by the data mining techniques helps teachers identify students who are likely to perform poorly and give them special attention.

[8] conducted a study on the use of data mining technology to evaluate students' academic achievement across multiple enrolment channels, such as joint recruitment enrolment, athletic enrolment and application enrolment. The decision tree method was used, and it showed differences in the academic results of students from different enrolment channels. Joint recruitment students performed much better than students admitted through other channels, and the long-term performance of students from athletic enrolment showed a declining trend. This indicates that the enrolment method influences students' academic achievement.

[9] applied data mining techniques, particularly classification, association, clustering and outlier detection rules, to improve students' performance. They extracted useful knowledge from graduate student data collected from the College of Science and Technology, Khanyounis, covering a fifteen-year period. Each of the tasks can be used to improve the performance of graduate students.

[10] applied the Bayesian classification method to a student database in order to predict performance improvement. The data was gathered from different degree colleges and institutions affiliated with Dr. R.M.L. Awadh University, Faizabad, India. The study aims to identify students who need special attention, so as to reduce the failure ratio and take appropriate action at the right time.

[11] conducted an empirical study of applications of data mining techniques for predicting student performance in higher education. Data on B.Tech second year students (CS & IT branch) was collected from a database management system course held at the United College of Engineering and Research, Naini, Allahabad (affiliated to GBTU) in the fourth semester of 2011/2012, and a questionnaire was used to collect data describing the relationship between students' learning behaviour and their academic performance. Data mining techniques were applied to discover association rules and classification rules, and k-means was used to cluster the students into groups. The study showed how data mining can be used in higher education, specifically to improve engineering students' performance.

3. METHODOLOGY

Before using data mining technology to carry out analysis, it is important to follow some procedures that increase the accuracy of the analysis (Han and Kamber, 2001). This research therefore adopted the following steps before proceeding to analysis.

3.1 Data Collection

The data used for this research was postgraduate student data from sessions 2000 to 2011, collected from the Computer Science department, University of Ibadan.

3.2 Experimental Design

a. Data Cleaning
b. Data Integration
c. Data Selection
d. Data Transformation
e. Data Mining

Data Cleaning

This is the phase in which irrelevant data are removed from the collection, such as data errors (whether from a data entry clerk or from a faulty data collection device), irrelevant fields and non-variant fields. In the original dataset, some fields, such as the serial number, Accumulated Course Units Passed (ACUP) and Overall Weighted Total (OWT), were not selected for the mining process because they do not provide any knowledge for the data set processing. Duplicate data were also removed. From a total of 511 instances in the source data, the data cleaning process left 411 instances ready to be mined.

Data Integration

Data integration is the phase in which multiple data sources are combined into one data source. A number of separate tables can also be joined into one.

Data Selection

At this stage, the data relevant to the analysis is decided on and retrieved from the dataset. This step in the KDD process selects the data to be analyzed from the set of all available data. It is unnecessary to attempt to analyze all the data if meaningful patterns are to be obtained.
The selected data is chosen based on an evaluation of its potential to yield knowledge, and these sets of data may represent a number of different aspects of the domain that are not directly related.

Data Transformation

This is the stage in which the selected data is transformed into forms acceptable to the data mining software. In this phase, a number of separate tables can be joined into one and vice versa. If the data is represented as text but the intended data mining technique requires numerical data, the data must be transformed accordingly. The data file was saved in Comma Separated Value (CSV) format and later converted to an Attribute-Relation File Format (ARFF) file inside WEKA.
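The cleaning, transformation and 70/30 split steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure the authors ran inside WEKA: the function names, the field name SerialNo, and the record layout are hypothetical, and only the dropped fields (serial number, ACUP, OWT), the duplicate removal and the 70/30 proportion come from the text.

```python
import random

def clean(records, drop_fields=("SerialNo", "ACUP", "OWT")):
    """Drop fields that carry no knowledge, then remove exact duplicates."""
    seen = set()
    cleaned = []
    for rec in records:
        kept = {k: v for k, v in rec.items() if k not in drop_fields}
        key = tuple(sorted(kept.items()))  # keys are unique, so values are never compared
        if key not in seen:
            seen.add(key)
            cleaned.append(kept)
    return cleaned

def split_train_test(records, train_fraction=0.7, seed=1):
    """Shuffle reproducibly and split into training and testing sets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]

def to_arff(relation, records):
    """Render records as ARFF text: numeric fields stay numeric, others become nominal."""
    fields = list(records[0].keys())
    lines = [f"@relation {relation}", ""]
    for field in fields:
        values = [rec[field] for rec in records]
        if all(isinstance(v, (int, float)) for v in values):
            lines.append(f"@attribute {field} numeric")
        else:
            nominal = ",".join(sorted({str(v) for v in values}))
            lines.append(f"@attribute {field} {{{nominal}}}")
    lines += ["", "@data"]
    for rec in records:
        lines.append(",".join(str(rec[field]) for field in fields))
    return "\n".join(lines)
```

With 411 cleaned records, `split_train_test` yields 288 training and 123 testing records, which matches the instance counts reported in the result tables. The ARFF writer assumes attribute names and nominal values without spaces; names such as "CSC 741" would need quoting.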

4. RESULTS AND DISCUSSION

In this analysis, the data set used was postgraduate student data from sessions 2000 to 2011, collected from the Computer Science department, University of Ibadan, Nigeria.

Table 1: Results of Modelling Student data on MLP
Rows: Time taken (2.7 seconds), Correctly Classified Instances (%), Incorrectly Classified Instances (%), Kappa Statistic, Mean Absolute Error, Root Mean Squared Error, Relative Absolute Error (%), Root Relative Squared Error (%), Total Number of Instances (288). [Remaining values missing]

Table 2: The Performance measures
Columns: TP Rate, FP Rate, Precision, Recall, F-Measure, MCC, ROC Area, Class. Rows: M.Phil/Ph.D, M.Phil, Ph.D, Withdraw, Fail, Terminal, Weighted Average. [Values missing]
Confusion matrix (rows actual, columns predicted): a = M.Phil/Ph.D, b = M.Phil, c = Ph.D, d = Withdraw, e = Fail, f = Terminal. [Values missing]

4.1 Confusion Matrix

The confusion matrix is also commonly called a contingency table. The number of correctly classified instances is the sum of the diagonal entries of the matrix; all other entries are incorrectly classified instances.

Table 3: Results on test set (MLP)
Rows: Time taken (5.93 seconds), Correctly Classified Instances (%), Incorrectly Classified Instances (%), Kappa Statistic, Mean Absolute Error, Root Mean Squared Error, Relative Absolute Error (%), Root Relative Squared Error (%), Total Number of Instances. [Remaining values missing]

Table 4: Performance measures on test set
Columns: TP Rate, FP Rate, Precision, Recall, F-Measure, MCC, ROC Area, Class. Rows: M.Phil/Ph.D, M.Phil, Ph.D, Withdraw, Fail, Terminal, Weighted Average. [Values missing]
Confusion matrix (rows actual, columns predicted): a = M.Phil/Ph.D, b = M.Phil, c = Ph.D, d = Withdraw, e = Fail, f = Terminal. [Values missing]

Table 5: Results of Modelling Student data in J48 decision tree
Rows: Time taken (0.25 seconds), Correctly Classified Instances (%), Incorrectly Classified Instances (%), Kappa Statistic, Mean Absolute Error, Root Mean Squared Error, Relative Absolute Error (%), Root Relative Squared Error (%), Total Number of Instances (288). [Remaining values missing]

Table 6: The Performance measures
Columns: TP Rate, FP Rate, Precision, Recall, F-Measure, MCC, ROC Area, Class. Rows: M.Phil/Ph.D, M.Phil, Ph.D, Withdraw, Fail, Terminal, Weighted Average. [Values missing]
Confusion matrix (rows actual, columns predicted): a = M.Phil/Ph.D, b = M.Phil, c = Ph.D, d = Withdraw, e = Fail, f = Terminal. [Values missing]

Table 7: Results on Test Set
Rows: Time taken (0.04 seconds), Correctly Classified Instances (%), Incorrectly Classified Instances (%), Kappa Statistic, Mean Absolute Error, Root Mean Squared Error, Relative Absolute Error (%), Root Relative Squared Error (%), Total Number of Instances (123). [Remaining values missing]

Table 8: The Performance measure
Columns: TP Rate, FP Rate, Precision, Recall, F-Measure, MCC, ROC Area, Class. Rows: M.Phil/Ph.D, M.Phil, Ph.D, Withdraw, Fail, Terminal, Weighted Average. [Values missing]
Confusion matrix (rows actual, columns predicted): a = M.Phil/Ph.D, b = M.Phil, c = Ph.D, d = Withdraw, e = Fail, f = Terminal. [Values missing]
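As a small illustration of how the confusion matrix relates to the "Correctly Classified Instances" and "Kappa Statistic" rows, the sketch below computes accuracy as the diagonal sum over the total, and Cohen's kappa from the row and column totals. The function names are hypothetical and the 3-class matrix in the usage note is made up for the example; it is not taken from the paper's results.

```python
def accuracy(matrix):
    """Fraction of instances on the diagonal (rows: actual, columns: predicted)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total

def cohen_kappa(matrix):
    """Agreement between actual and predicted classes, corrected for chance."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(n)) / total
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    # Chance agreement: product of marginal proportions, summed over classes.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (observed - expected) / (1 - expected)

# Made-up 3-class example with 123 instances in total.
example = [
    [50, 5, 0],
    [4, 30, 6],
    [1, 2, 25],
]
```

For this invented matrix, 105 of the 123 instances lie on the diagonal, so `accuracy(example)` is 105/123, roughly 85.4%.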

Figure 2: Decision tree rules

Above is the decision tree constructed by the J48 classifier. It indicates how the classifier uses the attributes to make a decision. Each branch indicates the outcome of a test, each leaf (terminal) node holds a class label, and the topmost node is the root node (Eligibility).

The decision tree yields 26 rules, which can be expressed in English so that humans can understand them:

1. IF Eligibility = NG & YGSD > 2004 & CSC 755 > 60 THEN Class = Withdraw
2. IF Eligibility = NG & YGSD > 2004 & CSC 755 <= 60 THEN Class = Fail
3. IF Eligibility = NG & YGSD <= 2004 THEN Class = Withdraw
4. IF Eligibility = P & CSC 765 > 57 & CSC 742 > 49 & CSC 746 > 43 & CSC 799 > 68 THEN Class = PhD
5. IF Eligibility = P & CSC 765 > 57 & CSC 742 > 49 & CSC 746 > 43 & CSC 799 <= 68 & CSC 765 > 62 THEN Class = PhD
6. IF Eligibility = P & CSC 765 > 57 & CSC 742 > 49 & CSC 746 > 43 & CSC 799 <= 68 & CSC 765 <= 62 & CSC 766 > 62 THEN Class = PhD
7. IF Eligibility = P & CSC 765 > 57 & CSC 742 > 49 & CSC 746 > 43 & CSC 799 <= 68 & CSC 765 <= 62 & CSC 766 <= 62 & CSC 746 > 57 & CSC 751 > 53 THEN Class = PhD
8. IF Eligibility = P & CSC 765 > 57 & CSC 742 > 49 & CSC 746 > 43 & CSC 799 <= 68 & CSC 765 <= 62 & CSC 766 <= 62 & CSC 746 > 57 & CSC 751 <= 53 & CSC 746 > 68 THEN Class = PhD
9. IF Eligibility = P & CSC 765 > 57 & CSC 742 > 49 & CSC 746 > 43 & CSC 799 <= 68 & CSC 765 <= 62 & CSC 766 <= 62 & CSC 746 > 57 & CSC 751 <= 53 & CSC 746 <= 68 THEN Class = MPhil/PhD
10. IF Eligibility = P & CSC 765 > 57 & CSC 742 > 49 & CSC 746 <= 43 THEN Class = MPhil/PhD
11. IF Eligibility = P & CSC 765 > 57 & CSC 742 <= 49 & CSC 751 > 52 & CSC 747 > 46 & CSC 775 > 22 THEN Class = PhD
12. IF Eligibility = P & CSC 765 > 57 & CSC 742 <= 49 & CSC 751 <= 52 & CSC 746 > 61 THEN Class = PhD
13. IF Eligibility = P & CSC 765 > 57 & CSC 742 <= 49 & CSC 751 <= 52 & CSC 746 <= 61 & Modeofentry = PT THEN Class = MPhil
14. IF Eligibility = P & CSC 765 > 57 & CSC 742 <= 49 & CSC 751 <= 52 & CSC 746 <= 61 & Modeofentry = FT & CSC 776 > 54 THEN Class = MPhil/PhD
15. IF Eligibility = P & CSC 765 > 57 & CSC 742 <= 49 & CSC 751 <= 52 & CSC 746 <= 61 & Modeofentry = FT & CSC 776 <= 54 THEN Class = MPhil
16. IF Eligibility = P & CSC 765 <= 57 & CSC 799 > 54 & CSC 755 > 48 & CSC 741 > & CSC 747 > 61 THEN Class = PhD
17. IF Eligibility = P & CSC 765 <= 57 & CSC 799 > 54 & CSC 755 > 48 & CSC 741 > & CSC 747 <= 61 THEN Class = MPhil/PhD
18. IF Eligibility = P & CSC 765 <= 57 & CSC 799 > 54 & CSC 755 > 48 & CSC 741 <= & CSC 745 > 53 THEN Class = MPhil/PhD
19. IF Eligibility = P & CSC 765 <= 57 & CSC 799 > 54 & CSC 755 > 48 & CSC 741 <= & CSC 745 <= 53 & CSC 753 > THEN Class = MPhil/PhD
20. IF Eligibility = P & CSC 765 <= 57 & CSC 799 > 54 & CSC 755 > 48 & CSC 741 <= & CSC 745 <= 53 & CSC 753 <= THEN Class = MPhil
21. IF Eligibility = P & CSC 765 <= 57 & CSC 799 > 54 & CSC 755 <= 48 & CSC 741 > 44 & CSC 741 > 56 THEN Class = MPhil/PhD
22. IF Eligibility = P & CSC 765 <= 57 & CSC 799 > 54 & CSC 755 <= 48 & CSC 741 > 44 & CSC 741 <= 56 THEN Class = MPhil
23. IF Eligibility = P & CSC 765 <= 57 & CSC 799 > 54 & CSC 755 <= 48 & CSC 741 <= 44 THEN Class = Terminal
24. IF Eligibility = P & CSC 765 <= 57 & CSC 799 <= 54 & CSC 757 > 30 & CSC 741 > 62 THEN Class = MPhil/PhD
25. IF Eligibility = P & CSC 765 <= 57 & CSC 799 <= 54 & CSC 757 > 30 & CSC 741 <= 62 THEN Class = MPhil
26. IF Eligibility = P & CSC 765 <= 57 & CSC 799 <= 54 & CSC 757 <= 30 THEN Class = Terminal

4.2 Discussion of ANN and Decision Tree Models for Student Datasets

The artificial neural network modelling results in Tables 4.1a and 4.1b show that the MLP ANN is better suited to the student data than the decision tree, given its higher accuracy.
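Because the J48 rules listed in Section 4 are plain IF/THEN statements, they translate directly into nested conditionals. The sketch below encodes only rules 1 to 3 and 24 to 26, whose thresholds are all present in the list; the function name and the dictionary layout of a student record are assumptions made for illustration, and any record that falls under another rule simply returns None here.

```python
def classify(student):
    """Apply rules 1-3 and 24-26 of the J48 rule list to one student record.

    `student` is a dict keyed by attribute name, e.g. "Eligibility", "YGSD",
    "CSC 755". Returns the class label, or None when the record falls under
    a rule that this partial sketch does not encode.
    """
    if student["Eligibility"] == "NG":
        if student["YGSD"] > 2004:
            # Rule 1 (CSC 755 > 60) and rule 2 (CSC 755 <= 60)
            return "Withdraw" if student["CSC 755"] > 60 else "Fail"
        return "Withdraw"  # Rule 3: YGSD <= 2004

    if (student["Eligibility"] == "P"
            and student["CSC 765"] <= 57
            and student["CSC 799"] <= 54):
        if student["CSC 757"] > 30:
            # Rule 24 (CSC 741 > 62) and rule 25 (CSC 741 <= 62)
            return "MPhil/PhD" if student["CSC 741"] > 62 else "MPhil"
        return "Terminal"  # Rule 26: CSC 757 <= 30

    return None  # Covered by rules 4-23, which are not encoded in this sketch
```

For example, a record with Eligibility = NG, YGSD = 2006 and CSC 755 = 55 follows rule 2 and is labelled Fail. Writing the rules this way also makes the tree structure visible: the root test on Eligibility comes first, and each nested condition corresponds to one level of the tree.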
The decision tree modelling results in Tables 4.3a and 4.3b and Figure 4.6 also show that the decision tree is well suited to deriving rules from the dataset and takes less time to build than the MLP ANN.

Table 9: Comparative analysis on training set
Columns: Metrics, Value (MLP), Value (DT). Time taken: 2.7 seconds (MLP), 0.25 seconds (DT). Other rows: Correctly Classified Instances (%), Incorrectly Classified Instances (%), Kappa Statistic, Mean Absolute Error, Root Mean Squared Error, Relative Absolute Error (%), Root Relative Squared Error (%), Total Number of Instances. [Remaining values missing]

Table 10: Comparative analysis on test set
Columns: Metrics, Value (MLP), Value (DT). Time taken: 5.93 seconds (MLP), 0.04 seconds (DT). Other rows: Correctly Classified Instances (%), Incorrectly Classified Instances (%), Kappa Statistic, Mean Absolute Error, Root Mean Squared Error, Relative Absolute Error (%), Root Relative Squared Error (%), Total Number of Instances. [Remaining values missing]
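The error measures compared in Tables 9 and 10 follow standard definitions, sketched below for readers unfamiliar with them. This is a generic illustration on plain numeric targets with hypothetical function names; the text does not describe how WEKA derives these values for nominal classes, so the sketch should not be read as reproducing its computation. The relative measures compare the model's error with that of a baseline that always predicts the mean of the actual values.

```python
import math

def mean_absolute_error(actual, predicted):
    """MAE: average absolute difference between predictions and actual values."""
    return sum(abs(p - a) for a, p in zip(actual, predicted)) / len(actual)

def root_mean_squared_error(actual, predicted):
    """RMSE: square root of the average squared difference."""
    return math.sqrt(sum((p - a) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def relative_absolute_error(actual, predicted):
    """RAE (%): total absolute error relative to always predicting the mean."""
    mean = sum(actual) / len(actual)
    model = sum(abs(p - a) for a, p in zip(actual, predicted))
    baseline = sum(abs(a - mean) for a in actual)
    return 100 * model / baseline

def root_relative_squared_error(actual, predicted):
    """RRSE (%): root of total squared error relative to the mean predictor."""
    mean = sum(actual) / len(actual)
    model = sum((p - a) ** 2 for a, p in zip(actual, predicted))
    baseline = sum((a - mean) ** 2 for a in actual)
    return 100 * math.sqrt(model / baseline)

# Made-up values for illustration only.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]
```

For these invented values the MAE is 0.25 and the RAE is 25%. Lower is better for all four measures, and a relative error of 100% means the model does no better than the mean predictor.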

The results obtained from the analysis clearly demonstrated a superior performance of the neural network over the decision tree, not only in the number of correctly classified instances but also in terms of RMSE, MAE and RAE. The neural network performed well in classification as well as in prediction but was slow. The decision tree was fast but classified poorly, although the rules it generates make it clearer and easier to understand.

5. CONCLUSION

The data to be analyzed by data mining techniques may be incomplete, noisy and inconsistent, so the data must first be preprocessed. This preprocessing includes data cleaning, data selection and data transformation, and the data used in this application was preprocessed accordingly. We applied data mining techniques to discover knowledge; in particular, we discovered classification rules using a decision tree. These rules can help students make the right decision about which courses to enrol in. With this information, students will have a supporting tool that helps them make the best decisions before enrolment.

REFERENCES

[1] Kumar, V. and Chadha, A. (2011) An Empirical Study of the Applications of Data Mining Techniques in Higher Education. IJACSA - International Journal of Advanced Computer Science and Applications, 2(3), Retrieved from
[2] Ogor, E. N. (2007) Student Academic Performance Monitoring and Evaluation Using Data Mining Techniques. Fourth Congress of Electronics, Robotics and Automotive Mechanics. IEEE Computer Society. pp
[3] Alaa El-Halees (2009) Mining Students Data to Analyze e-Learning Behavior: A Case Study. Department of Computer Science, Islamic University of Gaza, P.O. Box 108, Gaza, Palestine.
[4] Osofisan A.O. and Olamiti A.O. (2009) Academic Background of Students and Performance in a Computer Science Programme in a Nigerian University. European Journal of Social Sciences. 9(4):
[5] Surjeet Kumar Yadav et al., Data Mining Applications: A Comparative Study for Predicting Student's Performance. International Journal of Innovative Technology & Creative Engineering (ISSN: ) Vol. 1 No. 12 December
[6] Surjeet Kumar Yadav et al., Mining Education Data to Predict Student's Retention: A Comparative Study. (IJCSIS) International Journal of Computer Science and Information Security, Vol. 10, No. 2,
[7] Shanmuga Priya K. and Senthil Kumar A.V., Improving the Student's Performance Using Educational Data Mining. Int. J. Advanced Networking and Applications. Volume: 04 Issue: 04 Pages: (2013) ISSN :
[8] Chin Chia Hsu and Tao Huang, The Use of Data Mining Technology to Evaluate Student's Academic Achievement via Multiple Channels of Enrolment: An Empirical Analysis of St. John's University of Technology. The IABPAD Conference Proceedings, Orlando, Florida, January 3-6, 2006.
[9] Mohammed M. Abu Tair and Alaa M. El-Halees, Mining Educational Data to Improve Students' Performance: A Case Study. International Journal of Information and Communication Technology Research, Volume 2 No. 2, February ISSN
[10] Brijesh Kumar Bhardwaj and Saurabh Pal, Data Mining: A Prediction for Performance Improvement Using Classification. (IJCSIS) International Journal of Computer Science and Information Security, Vol. 9, No. 4, April
[11] Mahendra Tiwari, Randhir Singh and Neeraj Vimal, An Empirical Study of Applications of Data Mining Techniques for Predicting Student Performance in Higher Education. International Journal of Computer Science and Mobile Computing, IJCSMC, Vol. 2, Issue 2, February 2013, pg
[12] WEKA.


More information

BOOSTING - A METHOD FOR IMPROVING THE ACCURACY OF PREDICTIVE MODEL

BOOSTING - A METHOD FOR IMPROVING THE ACCURACY OF PREDICTIVE MODEL The Fifth International Conference on e-learning (elearning-2014), 22-23 September 2014, Belgrade, Serbia BOOSTING - A METHOD FOR IMPROVING THE ACCURACY OF PREDICTIVE MODEL SNJEŽANA MILINKOVIĆ University

More information

Predicting the Risk of Heart Attacks using Neural Network and Decision Tree

Predicting the Risk of Heart Attacks using Neural Network and Decision Tree Predicting the Risk of Heart Attacks using Neural Network and Decision Tree S.Florence 1, N.G.Bhuvaneswari Amma 2, G.Annapoorani 3, K.Malathi 4 PG Scholar, Indian Institute of Information Technology, Srirangam,

More information

Customer Classification And Prediction Based On Data Mining Technique

Customer Classification And Prediction Based On Data Mining Technique Customer Classification And Prediction Based On Data Mining Technique Ms. Neethu Baby 1, Mrs. Priyanka L.T 2 1 M.E CSE, Sri Shakthi Institute of Engineering and Technology, Coimbatore 2 Assistant Professor

More information

How To Understand The Impact Of A Computer On Organization

How To Understand The Impact Of A Computer On Organization International Journal of Research in Engineering & Technology (IJRET) Vol. 1, Issue 1, June 2013, 1-6 Impact Journals IMPACT OF COMPUTER ON ORGANIZATION A. D. BHOSALE 1 & MARATHE DAGADU MITHARAM 2 1 Department

More information

Performance Analysis of Naive Bayes and J48 Classification Algorithm for Data Classification

Performance Analysis of Naive Bayes and J48 Classification Algorithm for Data Classification Performance Analysis of Naive Bayes and J48 Classification Algorithm for Data Classification Tina R. Patil, Mrs. S. S. Sherekar Sant Gadgebaba Amravati University, Amravati tnpatil2@gmail.com, ss_sherekar@rediffmail.com

More information

Artificial Neural Network, Decision Tree and Statistical Techniques Applied for Designing and Developing E-mail Classifier

Artificial Neural Network, Decision Tree and Statistical Techniques Applied for Designing and Developing E-mail Classifier International Journal of Recent Technology and Engineering (IJRTE) ISSN: 2277-3878, Volume-1, Issue-6, January 2013 Artificial Neural Network, Decision Tree and Statistical Techniques Applied for Designing

More information

EFFICIENT DATA PRE-PROCESSING FOR DATA MINING

EFFICIENT DATA PRE-PROCESSING FOR DATA MINING EFFICIENT DATA PRE-PROCESSING FOR DATA MINING USING NEURAL NETWORKS JothiKumar.R 1, Sivabalan.R.V 2 1 Research scholar, Noorul Islam University, Nagercoil, India Assistant Professor, Adhiparasakthi College

More information

A Content based Spam Filtering Using Optical Back Propagation Technique

A Content based Spam Filtering Using Optical Back Propagation Technique A Content based Spam Filtering Using Optical Back Propagation Technique Sarab M. Hameed 1, Noor Alhuda J. Mohammed 2 Department of Computer Science, College of Science, University of Baghdad - Iraq ABSTRACT

More information

Impelling Heart Attack Prediction System using Data Mining and Artificial Neural Network

Impelling Heart Attack Prediction System using Data Mining and Artificial Neural Network General Article International Journal of Current Engineering and Technology E-ISSN 2277 4106, P-ISSN 2347-5161 2014 INPRESSCO, All Rights Reserved Available at http://inpressco.com/category/ijcet Impelling

More information

Data Mining Application in Advertisement Management of Higher Educational Institutes

Data Mining Application in Advertisement Management of Higher Educational Institutes Data Mining Application in Advertisement Management of Higher Educational Institutes Priyanka Saini M.Tech(CS) Student, Banasthali University Rajasthan Sweta Rai M.Tech(CS) Student, Banasthali University

More information

Keywords Data mining, Classification Algorithm, Decision tree, J48, Random forest, Random tree, LMT, WEKA 3.7. Fig.1. Data mining techniques.

Keywords Data mining, Classification Algorithm, Decision tree, J48, Random forest, Random tree, LMT, WEKA 3.7. Fig.1. Data mining techniques. International Journal of Emerging Research in Management &Technology Research Article October 2015 Comparative Study of Various Decision Tree Classification Algorithm Using WEKA Purva Sewaiwar, Kamal Kant

More information

Chapter 6. The stacking ensemble approach

Chapter 6. The stacking ensemble approach 82 This chapter proposes the stacking ensemble approach for combining different data mining classifiers to get better performance. Other combination techniques like voting, bagging etc are also described

More information

Predicting Student Performance by Using Data Mining Methods for Classification

Predicting Student Performance by Using Data Mining Methods for Classification BULGARIAN ACADEMY OF SCIENCES CYBERNETICS AND INFORMATION TECHNOLOGIES Volume 13, No 1 Sofia 2013 Print ISSN: 1311-9702; Online ISSN: 1314-4081 DOI: 10.2478/cait-2013-0006 Predicting Student Performance

More information

Neural Networks in Data Mining

Neural Networks in Data Mining IOSR Journal of Engineering (IOSRJEN) ISSN (e): 2250-3021, ISSN (p): 2278-8719 Vol. 04, Issue 03 (March. 2014), V6 PP 01-06 www.iosrjen.org Neural Networks in Data Mining Ripundeep Singh Gill, Ashima Department

More information

SPATIAL DATA CLASSIFICATION AND DATA MINING

SPATIAL DATA CLASSIFICATION AND DATA MINING , pp.-40-44. Available online at http://www. bioinfo. in/contents. php?id=42 SPATIAL DATA CLASSIFICATION AND DATA MINING RATHI J.B. * AND PATIL A.D. Department of Computer Science & Engineering, Jawaharlal

More information

Rule based Classification of BSE Stock Data with Data Mining

Rule based Classification of BSE Stock Data with Data Mining International Journal of Information Sciences and Application. ISSN 0974-2255 Volume 4, Number 1 (2012), pp. 1-9 International Research Publication House http://www.irphouse.com Rule based Classification

More information

EMPIRICAL STUDY ON SELECTION OF TEAM MEMBERS FOR SOFTWARE PROJECTS DATA MINING APPROACH

EMPIRICAL STUDY ON SELECTION OF TEAM MEMBERS FOR SOFTWARE PROJECTS DATA MINING APPROACH EMPIRICAL STUDY ON SELECTION OF TEAM MEMBERS FOR SOFTWARE PROJECTS DATA MINING APPROACH SANGITA GUPTA 1, SUMA. V. 2 1 Jain University, Bangalore 2 Dayanada Sagar Institute, Bangalore, India Abstract- One

More information

Edifice an Educational Framework using Educational Data Mining and Visual Analytics

Edifice an Educational Framework using Educational Data Mining and Visual Analytics I.J. Education and Management Engineering, 2016, 2, 24-30 Published Online March 2016 in MECS (http://www.mecs-press.net) DOI: 10.5815/ijeme.2016.02.03 Available online at http://www.mecs-press.net/ijeme

More information

Comparison of Data Mining Techniques used for Financial Data Analysis

Comparison of Data Mining Techniques used for Financial Data Analysis Comparison of Data Mining Techniques used for Financial Data Analysis Abhijit A. Sawant 1, P. M. Chawan 2 1 Student, 2 Associate Professor, Department of Computer Technology, VJTI, Mumbai, INDIA Abstract

More information

How To Use Neural Networks In Data Mining

How To Use Neural Networks In Data Mining International Journal of Electronics and Computer Science Engineering 1449 Available Online at www.ijecse.org ISSN- 2277-1956 Neural Networks in Data Mining Priyanka Gaur Department of Information and

More information

Keywords data mining, prediction techniques, decision making.

Keywords data mining, prediction techniques, decision making. Volume 5, Issue 4, April 2015 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Analysis of Datamining

More information

Social Media Mining. Data Mining Essentials

Social Media Mining. Data Mining Essentials Introduction Data production rate has been increased dramatically (Big Data) and we are able store much more data than before E.g., purchase data, social media data, mobile phone data Businesses and customers

More information

A Framework for Dynamic Faculty Support System to Analyze Student Course Data

A Framework for Dynamic Faculty Support System to Analyze Student Course Data A Framework for Dynamic Faculty Support System to Analyze Student Course Data J. Shana 1, T. Venkatachalam 2 1 Department of MCA, Coimbatore Institute of Technology, Affiliated to Anna University of Chennai,

More information

Mobile Phone APP Software Browsing Behavior using Clustering Analysis

Mobile Phone APP Software Browsing Behavior using Clustering Analysis Proceedings of the 2014 International Conference on Industrial Engineering and Operations Management Bali, Indonesia, January 7 9, 2014 Mobile Phone APP Software Browsing Behavior using Clustering Analysis

More information

A Regression Approach for Forecasting Vendor Revenue in Telecommunication Industries

A Regression Approach for Forecasting Vendor Revenue in Telecommunication Industries A Regression Approach for Forecasting Vendor Revenue in Telecommunication Industries Aida Mustapha *1, Farhana M. Fadzil #2 * Faculty of Computer Science and Information Technology, Universiti Tun Hussein

More information

Data Mining Framework for Direct Marketing: A Case Study of Bank Marketing

Data Mining Framework for Direct Marketing: A Case Study of Bank Marketing www.ijcsi.org 198 Data Mining Framework for Direct Marketing: A Case Study of Bank Marketing Lilian Sing oei 1 and Jiayang Wang 2 1 School of Information Science and Engineering, Central South University

More information

An Overview of Knowledge Discovery Database and Data mining Techniques

An Overview of Knowledge Discovery Database and Data mining Techniques An Overview of Knowledge Discovery Database and Data mining Techniques Priyadharsini.C 1, Dr. Antony Selvadoss Thanamani 2 M.Phil, Department of Computer Science, NGM College, Pollachi, Coimbatore, Tamilnadu,

More information

A Comparative Analysis of Classification Techniques on Categorical Data in Data Mining

A Comparative Analysis of Classification Techniques on Categorical Data in Data Mining A Comparative Analysis of Classification Techniques on Categorical Data in Data Mining Sakshi Department Of Computer Science And Engineering United College of Engineering & Research Naini Allahabad sakshikashyap09@gmail.com

More information

Data Mining as a tool to Predict the Churn Behaviour among Indian bank customers

Data Mining as a tool to Predict the Churn Behaviour among Indian bank customers Data Mining as a tool to Predict the Churn Behaviour among Indian bank customers Manjit Kaur Department of Computer Science Punjabi University Patiala, India manjit8718@gmail.com Dr. Kawaljeet Singh University

More information

An Introduction to Data Mining. Big Data World. Related Fields and Disciplines. What is Data Mining? 2/12/2015

An Introduction to Data Mining. Big Data World. Related Fields and Disciplines. What is Data Mining? 2/12/2015 An Introduction to Data Mining for Wind Power Management Spring 2015 Big Data World Every minute: Google receives over 4 million search queries Facebook users share almost 2.5 million pieces of content

More information

Keywords Data Mining, Knowledge Discovery, Direct Marketing, Classification Techniques, Customer Relationship Management

Keywords Data Mining, Knowledge Discovery, Direct Marketing, Classification Techniques, Customer Relationship Management Volume 4, Issue 6, June 2014 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com A Simplified Data

More information

International Journal of Computer Science Trends and Technology (IJCST) Volume 3 Issue 3, May-June 2015

International Journal of Computer Science Trends and Technology (IJCST) Volume 3 Issue 3, May-June 2015 RESEARCH ARTICLE OPEN ACCESS Data Mining Technology for Efficient Network Security Management Ankit Naik [1], S.W. Ahmad [2] Student [1], Assistant Professor [2] Department of Computer Science and Engineering

More information

Data Mining : A prediction of performer or underperformer using classification

Data Mining : A prediction of performer or underperformer using classification Data Mining : A prediction of performer or underperformer using classification Umesh Kumar Pandey S. Pal VBS Purvanchal University, Jaunpur Abstract Now a day s students have a large set of data having

More information

Data Mining - Evaluation of Classifiers

Data Mining - Evaluation of Classifiers Data Mining - Evaluation of Classifiers Lecturer: JERZY STEFANOWSKI Institute of Computing Sciences Poznan University of Technology Poznan, Poland Lecture 4 SE Master Course 2008/2009 revised for 2010

More information

PREDICTING STUDENTS PERFORMANCE USING ID3 AND C4.5 CLASSIFICATION ALGORITHMS

PREDICTING STUDENTS PERFORMANCE USING ID3 AND C4.5 CLASSIFICATION ALGORITHMS PREDICTING STUDENTS PERFORMANCE USING ID3 AND C4.5 CLASSIFICATION ALGORITHMS Kalpesh Adhatrao, Aditya Gaykar, Amiraj Dhawan, Rohit Jha and Vipul Honrao ABSTRACT Department of Computer Engineering, Fr.

More information

Data Mining Algorithms Part 1. Dejan Sarka

Data Mining Algorithms Part 1. Dejan Sarka Data Mining Algorithms Part 1 Dejan Sarka Join the conversation on Twitter: @DevWeek #DW2015 Instructor Bio Dejan Sarka (dsarka@solidq.com) 30 years of experience SQL Server MVP, MCT, 13 books 7+ courses

More information

Prediction of Heart Disease Using Naïve Bayes Algorithm

Prediction of Heart Disease Using Naïve Bayes Algorithm Prediction of Heart Disease Using Naïve Bayes Algorithm R.Karthiyayini 1, S.Chithaara 2 Assistant Professor, Department of computer Applications, Anna University, BIT campus, Tiruchirapalli, Tamilnadu,

More information

Random forest algorithm in big data environment

Random forest algorithm in big data environment Random forest algorithm in big data environment Yingchun Liu * School of Economics and Management, Beihang University, Beijing 100191, China Received 1 September 2014, www.cmnt.lv Abstract Random forest

More information

1. Classification problems

1. Classification problems Neural and Evolutionary Computing. Lab 1: Classification problems Machine Learning test data repository Weka data mining platform Introduction Scilab 1. Classification problems The main aim of a classification

More information

Classification of Titanic Passenger Data and Chances of Surviving the Disaster Data Mining with Weka and Kaggle Competition Data

Classification of Titanic Passenger Data and Chances of Surviving the Disaster Data Mining with Weka and Kaggle Competition Data Proceedings of Student-Faculty Research Day, CSIS, Pace University, May 2 nd, 2014 Classification of Titanic Passenger Data and Chances of Surviving the Disaster Data Mining with Weka and Kaggle Competition

More information

Comparison of Supervised and Unsupervised Learning Classifiers for Travel Recommendations

Comparison of Supervised and Unsupervised Learning Classifiers for Travel Recommendations Volume 3, No. 8, August 2012 Journal of Global Research in Computer Science REVIEW ARTICLE Available Online at www.jgrcs.info Comparison of Supervised and Unsupervised Learning Classifiers for Travel Recommendations

More information

An Analysis on Performance of Decision Tree Algorithms using Student s Qualitative Data

An Analysis on Performance of Decision Tree Algorithms using Student s Qualitative Data I.J.Modern Education and Computer Science, 2013, 5, 18-27 Published Online June 2013 in MECS (http://www.mecs-press.org/) DOI: 10.5815/ijmecs.2013.05.03 An Analysis on Performance of Decision Tree Algorithms

More information

Classification algorithm in Data mining: An Overview

Classification algorithm in Data mining: An Overview Classification algorithm in Data mining: An Overview S.Neelamegam #1, Dr.E.Ramaraj *2 #1 M.phil Scholar, Department of Computer Science and Engineering, Alagappa University, Karaikudi. *2 Professor, Department

More information

A Secured Approach to Credit Card Fraud Detection Using Hidden Markov Model

A Secured Approach to Credit Card Fraud Detection Using Hidden Markov Model A Secured Approach to Credit Card Fraud Detection Using Hidden Markov Model Twinkle Patel, Ms. Ompriya Kale Abstract: - As the usage of credit card has increased the credit card fraud has also increased

More information

Introduction to Data Mining

Introduction to Data Mining Introduction to Data Mining Jay Urbain Credits: Nazli Goharian & David Grossman @ IIT Outline Introduction Data Pre-processing Data Mining Algorithms Naïve Bayes Decision Tree Neural Network Association

More information

DECISION TREE INDUCTION FOR FINANCIAL FRAUD DETECTION USING ENSEMBLE LEARNING TECHNIQUES

DECISION TREE INDUCTION FOR FINANCIAL FRAUD DETECTION USING ENSEMBLE LEARNING TECHNIQUES DECISION TREE INDUCTION FOR FINANCIAL FRAUD DETECTION USING ENSEMBLE LEARNING TECHNIQUES Vijayalakshmi Mahanra Rao 1, Yashwant Prasad Singh 2 Multimedia University, Cyberjaya, MALAYSIA 1 lakshmi.mahanra@gmail.com

More information

A New Approach for Evaluation of Data Mining Techniques

A New Approach for Evaluation of Data Mining Techniques 181 A New Approach for Evaluation of Data Mining s Moawia Elfaki Yahia 1, Murtada El-mukashfi El-taher 2 1 College of Computer Science and IT King Faisal University Saudi Arabia, Alhasa 31982 2 Faculty

More information

FRAUD DETECTION IN ELECTRIC POWER DISTRIBUTION NETWORKS USING AN ANN-BASED KNOWLEDGE-DISCOVERY PROCESS

FRAUD DETECTION IN ELECTRIC POWER DISTRIBUTION NETWORKS USING AN ANN-BASED KNOWLEDGE-DISCOVERY PROCESS FRAUD DETECTION IN ELECTRIC POWER DISTRIBUTION NETWORKS USING AN ANN-BASED KNOWLEDGE-DISCOVERY PROCESS Breno C. Costa, Bruno. L. A. Alberto, André M. Portela, W. Maduro, Esdras O. Eler PDITec, Belo Horizonte,

More information

COURSE RECOMMENDER SYSTEM IN E-LEARNING

COURSE RECOMMENDER SYSTEM IN E-LEARNING International Journal of Computer Science and Communication Vol. 3, No. 1, January-June 2012, pp. 159-164 COURSE RECOMMENDER SYSTEM IN E-LEARNING Sunita B Aher 1, Lobo L.M.R.J. 2 1 M.E. (CSE)-II, Walchand

More information

Use Data Mining Techniques to Assist Institutions in Achieving Enrollment Goals: A Case Study

Use Data Mining Techniques to Assist Institutions in Achieving Enrollment Goals: A Case Study Use Data Mining Techniques to Assist Institutions in Achieving Enrollment Goals: A Case Study Tongshan Chang The University of California Office of the President CAIR Conference in Pasadena 11/13/2008

More information

Web Usage Mining: Identification of Trends Followed by the user through Neural Network

Web Usage Mining: Identification of Trends Followed by the user through Neural Network International Journal of Information and Computation Technology. ISSN 0974-2239 Volume 3, Number 7 (2013), pp. 617-624 International Research Publications House http://www. irphouse.com /ijict.htm Web

More information

Robust Outlier Detection Technique in Data Mining: A Univariate Approach

Robust Outlier Detection Technique in Data Mining: A Univariate Approach Robust Outlier Detection Technique in Data Mining: A Univariate Approach Singh Vijendra and Pathak Shivani Faculty of Engineering and Technology Mody Institute of Technology and Science Lakshmangarh, Sikar,

More information

DATA MINING TECHNOLOGY. Keywords: data mining, data warehouse, knowledge discovery, OLAP, OLAM.

DATA MINING TECHNOLOGY. Keywords: data mining, data warehouse, knowledge discovery, OLAP, OLAM. DATA MINING TECHNOLOGY Georgiana Marin 1 Abstract In terms of data processing, classical statistical models are restrictive; it requires hypotheses, the knowledge and experience of specialists, equations,

More information

DATA MINING APPROACH FOR PREDICTING STUDENT PERFORMANCE

DATA MINING APPROACH FOR PREDICTING STUDENT PERFORMANCE . Economic Review Journal of Economics and Business, Vol. X, Issue 1, May 2012 /// DATA MINING APPROACH FOR PREDICTING STUDENT PERFORMANCE Edin Osmanbegović *, Mirza Suljić ** ABSTRACT Although data mining

More information

A Hybrid Decision Tree Approach for Semiconductor. Manufacturing Data Mining and An Empirical Study

A Hybrid Decision Tree Approach for Semiconductor. Manufacturing Data Mining and An Empirical Study A Hybrid Decision Tree Approach for Semiconductor Manufacturing Data Mining and An Empirical Study 1 C. -F. Chien J. -C. Cheng Y. -S. Lin 1 Department of Industrial Engineering, National Tsing Hua University

More information

IDENTIFIC ATION OF SOFTWARE EROSION USING LOGISTIC REGRESSION

IDENTIFIC ATION OF SOFTWARE EROSION USING LOGISTIC REGRESSION http:// IDENTIFIC ATION OF SOFTWARE EROSION USING LOGISTIC REGRESSION Harinder Kaur 1, Raveen Bajwa 2 1 PG Student., CSE., Baba Banda Singh Bahadur Engg. College, Fatehgarh Sahib, (India) 2 Asstt. Prof.,

More information

Predicting Student Academic Performance at Degree Level: A Case Study

Predicting Student Academic Performance at Degree Level: A Case Study I.J. Intelligent Systems and Applications, 2015, 01, 49-61 Published Online December 2014 in MECS (http://www.mecs-press.org/) DOI: 10.5815/ijisa.2015.01.05 Predicting Student Academic Performance at Degree

More information

Enhancing Quality of Data using Data Mining Method

Enhancing Quality of Data using Data Mining Method JOURNAL OF COMPUTING, VOLUME 2, ISSUE 9, SEPTEMBER 2, ISSN 25-967 WWW.JOURNALOFCOMPUTING.ORG 9 Enhancing Quality of Data using Data Mining Method Fatemeh Ghorbanpour A., Mir M. Pedram, Kambiz Badie, Mohammad

More information

DATA MINING TECHNIQUES AND APPLICATIONS

DATA MINING TECHNIQUES AND APPLICATIONS DATA MINING TECHNIQUES AND APPLICATIONS Mrs. Bharati M. Ramageri, Lecturer Modern Institute of Information Technology and Research, Department of Computer Application, Yamunanagar, Nigdi Pune, Maharashtra,

More information

Use of Data Mining Techniques to Improve the Effectiveness of Sales and Marketing

Use of Data Mining Techniques to Improve the Effectiveness of Sales and Marketing Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 4, Issue. 4, April 2015,

More information

131-1. Adding New Level in KDD to Make the Web Usage Mining More Efficient. Abstract. 1. Introduction [1]. 1/10

131-1. Adding New Level in KDD to Make the Web Usage Mining More Efficient. Abstract. 1. Introduction [1]. 1/10 1/10 131-1 Adding New Level in KDD to Make the Web Usage Mining More Efficient Mohammad Ala a AL_Hamami PHD Student, Lecturer m_ah_1@yahoocom Soukaena Hassan Hashem PHD Student, Lecturer soukaena_hassan@yahoocom

More information

Inner Classification of Clusters for Online News

Inner Classification of Clusters for Online News Inner Classification of Clusters for Online News Harmandeep Kaur 1, Sheenam Malhotra 2 1 (Computer Science and Engineering Department, Shri Guru Granth Sahib World University Fatehgarh Sahib) 2 (Assistant

More information

INTERNATIONAL JOURNAL FOR ENGINEERING APPLICATIONS AND TECHNOLOGY DATA MINING IN HEALTHCARE SECTOR. ankitanandurkar2394@gmail.com

INTERNATIONAL JOURNAL FOR ENGINEERING APPLICATIONS AND TECHNOLOGY DATA MINING IN HEALTHCARE SECTOR. ankitanandurkar2394@gmail.com IJFEAT INTERNATIONAL JOURNAL FOR ENGINEERING APPLICATIONS AND TECHNOLOGY DATA MINING IN HEALTHCARE SECTOR Bharti S. Takey 1, Ankita N. Nandurkar 2,Ashwini A. Khobragade 3,Pooja G. Jaiswal 4,Swapnil R.

More information

ON INTEGRATING UNSUPERVISED AND SUPERVISED CLASSIFICATION FOR CREDIT RISK EVALUATION

ON INTEGRATING UNSUPERVISED AND SUPERVISED CLASSIFICATION FOR CREDIT RISK EVALUATION ISSN 9 X INFORMATION TECHNOLOGY AND CONTROL, 00, Vol., No.A ON INTEGRATING UNSUPERVISED AND SUPERVISED CLASSIFICATION FOR CREDIT RISK EVALUATION Danuta Zakrzewska Institute of Computer Science, Technical

More information

ANALYSIS OF FEATURE SELECTION WITH CLASSFICATION: BREAST CANCER DATASETS

ANALYSIS OF FEATURE SELECTION WITH CLASSFICATION: BREAST CANCER DATASETS ANALYSIS OF FEATURE SELECTION WITH CLASSFICATION: BREAST CANCER DATASETS Abstract D.Lavanya * Department of Computer Science, Sri Padmavathi Mahila University Tirupati, Andhra Pradesh, 517501, India lav_dlr@yahoo.com

More information

T-61.3050 : Email Classification as Spam or Ham using Naive Bayes Classifier. Santosh Tirunagari : 245577

T-61.3050 : Email Classification as Spam or Ham using Naive Bayes Classifier. Santosh Tirunagari : 245577 T-61.3050 : Email Classification as Spam or Ham using Naive Bayes Classifier Santosh Tirunagari : 245577 January 20, 2011 Abstract This term project gives a solution how to classify an email as spam or

More information

How To Predict Web Site Visits

How To Predict Web Site Visits Web Site Visit Forecasting Using Data Mining Techniques Chandana Napagoda Abstract: Data mining is a technique which is used for identifying relationships between various large amounts of data in many

More information

Data Mining Application in Direct Marketing: Identifying Hot Prospects for Banking Product

Data Mining Application in Direct Marketing: Identifying Hot Prospects for Banking Product Data Mining Application in Direct Marketing: Identifying Hot Prospects for Banking Product Sagarika Prusty Web Data Mining (ECT 584),Spring 2013 DePaul University,Chicago sagarikaprusty@gmail.com Keywords:

More information

ENSEMBLE DECISION TREE CLASSIFIER FOR BREAST CANCER DATA

ENSEMBLE DECISION TREE CLASSIFIER FOR BREAST CANCER DATA ENSEMBLE DECISION TREE CLASSIFIER FOR BREAST CANCER DATA D.Lavanya 1 and Dr.K.Usha Rani 2 1 Research Scholar, Department of Computer Science, Sree Padmavathi Mahila Visvavidyalayam, Tirupati, Andhra Pradesh,

More information

International Journal of Advance Research in Computer Science and Management Studies

International Journal of Advance Research in Computer Science and Management Studies Volume 2, Issue 12, December 2014 ISSN: 2321 7782 (Online) International Journal of Advance Research in Computer Science and Management Studies Research Article / Survey Paper / Case Study Available online

More information

Database Marketing, Business Intelligence and Knowledge Discovery

Database Marketing, Business Intelligence and Knowledge Discovery Database Marketing, Business Intelligence and Knowledge Discovery Note: Using material from Tan / Steinbach / Kumar (2005) Introduction to Data Mining,, Addison Wesley; and Cios / Pedrycz / Swiniarski

More information

ISSN: 2321-7782 (Online) Volume 2, Issue 10, October 2014 International Journal of Advance Research in Computer Science and Management Studies

ISSN: 2321-7782 (Online) Volume 2, Issue 10, October 2014 International Journal of Advance Research in Computer Science and Management Studies ISSN: 2321-7782 (Online) Volume 2, Issue 10, October 2014 International Journal of Advance Research in Computer Science and Management Studies Research Article / Survey Paper / Case Study Available online

More information

A Data Mining view on Class Room Teaching Language

A Data Mining view on Class Room Teaching Language 277 A Data Mining view on Class Room Teaching Language Umesh Kumar Pandey 1, S. Pal 2 1 Research Scholar, Singhania University, Jhunjhunu, Rajasthan, India 2 Dept. of MCA, VBS Purvanchal University Jaunpur

More information

Towards applying Data Mining Techniques for Talent Mangement

Towards applying Data Mining Techniques for Talent Mangement 2009 International Conference on Computer Engineering and Applications IPCSIT vol.2 (2011) (2011) IACSIT Press, Singapore Towards applying Data Mining Techniques for Talent Mangement Hamidah Jantan 1,

More information

Comparison of Classification Techniques for Heart Health Analysis System
