Mining Wiki Usage Data for Predicting Final Grades of Students


Gökhan Akçapınar, Erdal Coşgun, Arif Altun
Hacettepe University

Abstract

This study aims to predict students' final grades (A, B, C, D and F) based on their wiki usage data. With default settings, the wiki database stores usage data only in a limited way. An extension was therefore developed to log students' login and navigation data, and a tool was developed to extract information from these data and preprocess it. The dataset includes the server-side wiki usage log of 81 students over 3 months. The classification performance of the Random Forest, Support Vector Machine, Naive Bayes and Boosted Classification Tree algorithms is compared. Tenfold cross-validation is used to evaluate the models. According to our findings, SVM achieves the best classification performance.

Keywords: wiki, classification, educational data mining, predicting final grade.

Main Conference Topic: New Trends and Experiences, Educational Data Mining

Introduction

The use of wikis in online learning environments has increased in recent years, especially with the growing demand for collaborative learning. Wikis can be used in the following ways to support learning: in-class collaboration, group projects outside of class, a collaborative environment for learning from peers, peer and teacher feedback and review, and assessment and management of group performance [1]. Although wikis have great potential for online learning environments, assessing individual contributions is difficult and time-consuming with traditional methods, because many students can contribute to the same content. On the other hand, as in other online learning environments (e.g. forums, LMS, VLE), a large amount of student-system interaction data is stored in the wiki database.
By analyzing these data with statistical and data mining (DM) techniques, much useful information can be extracted for tutoring, assessment, or understanding learning and learner behavior [2, 3]. Educational data mining is a research area that has emerged in recent years. It is defined as the application of data mining techniques to datasets that come from educational settings in order to address important educational questions [4]. According to Romero and Ventura [2], these questions include analysis and visualization of data, providing feedback for supporting instructors, recommendations for students, predicting student performance, student modeling, grouping students, etc. To answer these questions, educational data mining research uses different DM methods such as prediction, clustering, relationship mining, discovery with models, and distillation of data for human judgment [5].

One of the key applications of educational data mining is predicting student performance. It is one of the oldest and most popular applications of DM in education, and different techniques and models have been applied so far [2]. In a recent study, Lopez et al. [6] demonstrated the potential of the classification-via-clustering approach to predict students' final marks (passed or failed) on the basis of their participation in forums. Their results showed that student participation in the course forum was a good predictor of the final marks for the course. Fausett and Elwasif [7] found that neural networks can be trained to predict students' grades in Calculus I based on their placement test responses. They used each student's test response pattern as input and the grade in Calculus I as the target response. Martinez [8] suggested that student pre-college assessment data can be used to predict academic success (a grade of A, B, or C) in community college courses with discriminant function analysis. Minaei-Bidgoli and Punch [9] presented an approach that uses genetic algorithms to classify students and predict their final grade based on logged data in an online learning environment. Superby et al. [10] used discriminant analysis, neural networks, random forests and decision trees to determine the factors influencing the academic success of first-year university students. Kotsiantis et al. [11] compared six different machine learning algorithms for predicting students' marks (pass or fail) in Hellenic Open University data. They also compared six regression algorithms for predicting students' marks on similar data [12]. Delgado et al. [13] applied a neural network to Moodle access logs and trained it to predict whether students would pass a course. Their model showed that it is possible to identify students who are likely to have problems passing a course. Two recent studies compared different data mining methods and techniques for classifying students based on their Moodle interaction data in order to predict the final marks obtained in the course [14, 15].

In this study we examine the extent to which students' course grades (A, B, C, D and F) can be predicted on the basis of their wiki usage. MediaWiki was used as the wiki engine. MediaWiki is a free, open-source and easy-to-use wiki engine for creating wiki-based web sites.
We developed an extension to log students' login and navigation data, which are not tracked in the default configuration.

Background

We applied four of the most commonly used classification algorithms to predict students' final grades and compared their prediction performance. The following paragraphs describe these methods briefly.

Random Forest: A random forest is a decision tree ensemble classifier in which each tree is grown using some type of randomization. Random forests are built on Classification and Regression Trees (CART) and can process large amounts of data with high training speed [16]. CART is a simple statistical tool that applies recursive binary partitioning of the feature space. CART is well known for its efficiency in coping with large data sets. However, as the data become noisier and less information is contained in each variable, the predictive ability of CART diminishes. RF overcomes this problem by introducing random elements into the model: subsets of variables are chosen at random, and bootstrap samples are selected with replacement for tree growing [17]. For each classification tree, a bootstrap sample is drawn from the original samples [18]. At each non-leaf node of a classification tree, the best split feature is selected from a small random subset of the original features. When the forest receives an input vector, each classification tree casts a single vote, and the final prediction is determined by the majority vote of all the trees in the random forest. Because each bootstrap sample is drawn with replacement, some of the original samples are left out of it; these are called out-of-bag (OOB) data [18, 19].

Boosted Classification Tree: The algorithm for boosting trees evolved from the application of boosting methods to regression trees. The general idea is to compute a sequence of simple CARTs, where each successive tree is built for the prediction residuals of the preceding tree.
This method builds binary trees, i.e., it partitions the data into two samples at each split node. Suppose that the user limits the complexity of each tree to 3 nodes only: a root node and two child nodes, i.e., a single split. Then, at each step of the boosting trees algorithm, a simple (best) partitioning of the data is determined, and the deviations of the observed values from the respective means (the residuals for each partition) are computed. The next 3-node tree is then fitted to those residuals to find another partition that further reduces the residual (error) variance for the data, given the preceding sequence of trees. It can be shown that such "additive weighted expansions" of trees can eventually produce an excellent fit of the predicted values to the observed values, even if the relationship between the predictor variables and the dependent variable of interest is very complex (nonlinear in nature). Hence, the method of gradient boosting, which fits a weighted additive expansion of simple trees, represents a very general and powerful machine learning algorithm [20].

Support Vector Machines: SVMs are a relatively new family of computational learning methods based on the statistical learning theory presented by Vapnik [21]. In SVMs, the original input space is mapped into a high-dimensional dot product space called a feature space, and in the feature space the optimal hyperplane is determined so as to maximize the generalization ability of the classifier. This hyperplane is found by exploiting optimization theory and respecting insights provided by statistical learning theory [22].

Naive Bayes: Bayesian classifiers are statistical classifiers. They can predict class membership probabilities, such as the probability that a given sample belongs to a particular class. Bayesian classifiers are based on Bayes' theorem. Naive Bayesian classifiers assume that the effect of an attribute value on a given class is independent of the values of the other attributes. This assumption is called class conditional independence.
It is made to simplify the computation involved and, in this sense, is considered naive [23]. Let X = (x1, x2, ..., xn) be a sample whose components represent values measured on a set of n attributes. In Bayesian terms, X is considered evidence. Let H be some hypothesis, such as that the data X belongs to a specific class C. For classification problems, our goal is to determine P(H|X), the probability that the hypothesis H holds given the evidence (i.e. the observed data sample X). In other words, we are looking for the probability that sample X belongs to class C, given that we know the attribute description of X. According to Bayes' theorem, the probability P(H|X) that we want to compute can be expressed in terms of the probabilities P(H), P(X|H) and P(X) as P(H|X) = [P(X|H) * P(H)] / P(X) [23].

Description of the Data Used

The dataset used in this study was gathered from a wiki used by university students during a third-year course. Students used the wiki to write reflections about concepts that they learned in the Computer Network and Communication course. The variables selected for this experiment were extracted from two different tables. One was the wiki's revision table, which stores all changes made by students. The other was a table in which the extension stored students' login and navigation data. The revision table included more than 1900 records, and we used WikLog, a tool developed by Akçapınar and Aşkar [24], to extract information automatically from this table. The usage data included the server-side wiki usage log of 81 students with a total of 1800 sessions and page requests throughout 3 months. The tool was developed for extracting information from these data and preprocessing it. The variables extracted from these two tables are shown in Table 1. Table 2 shows summary statistics for the extracted variables.
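The per-student aggregation described above can be illustrated with a minimal sketch. The paper does not give the extension's log schema or the exact variable definitions, so everything here is an assumption made for illustration: the row layout `(student_id, session_id, minute_offset, page_title)`, the sample rows in `LOG`, the function name `extract_features`, and in particular the definition of `n_revisits` as the number of requests for a page the student had already requested.

```python
from collections import defaultdict

# Hypothetical log rows: (student_id, session_id, minute_offset, page_title).
# The real extension's log schema is not given in the paper.
LOG = [
    ("s1", "a", 0, "Main_Page"),
    ("s1", "a", 5, "OSI_Model"),
    ("s1", "a", 12, "Main_Page"),
    ("s1", "b", 100, "TCP"),
    ("s1", "b", 110, "OSI_Model"),
    ("s2", "c", 3, "Main_Page"),
    ("s2", "c", 4, "TCP"),
]


def extract_features(log):
    """Aggregate raw page requests into per-student usage variables."""
    sessions = defaultdict(lambda: defaultdict(list))  # student -> session -> request times
    pages = defaultdict(list)                          # student -> requested pages
    for student, session, minute, page in log:
        sessions[student][session].append(minute)
        pages[student].append(page)

    features = {}
    for student in sessions:
        # Session duration is taken as last request time minus first request time.
        durations = [max(t) - min(t) for t in sessions[student].values()]
        unique_pages = set(pages[student])
        features[student] = {
            "n_session": len(durations),
            "a_time": sum(durations) / len(durations),
            "n_uniquepage": len(unique_pages),
            # Assumed definition: requests for a page the student had already requested.
            "n_revisits": len(pages[student]) - len(unique_pages),
        }
    return features


FEATURES = extract_features(LOG)
```

A dictionary keyed by student keeps one feature row per student, which is the shape a classifier needs once the final grade is attached as the class label.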

Table 1. Variables of a student in a wiki

Name                Domain          Description
n_session           Usage log       Total session count
a_time              Usage log       Average time in one session
n_mainpagereturn    Usage log       Main page return rate
n_uniquepage        Usage log       Unique page visits
n_revisits          Usage log       Total number of revisited web pages
n_edit              MediaWiki db    Total number of edits
n_word              MediaWiki db    Total word count
f_grade             Class           Final grade of the student

Table 2. Descriptive statistics for variables

Variable            mean      sd        median    min     max
n_session           22,69     20,77     18,00     1,00    143,00
a_time              17,81     7,57      17,38     1,47    46,49
n_mainpagereturn    25,19     13,99     22,00     6,00    80,00
n_uniquepage        143,25    77,83     146,00    2,00    265,00
n_revisits          56,15     18,91     60,00     0,00    87,00
n_edit              21,90     29,24     9,00      0,00    130,00
n_word              161,98    251,66    60,00     0,      ,00

Results

Naive Bayes, Support Vector Machines, Boosted Classification Tree (BCT) and Random Forest (RF) were implemented in R. We used the gbm package for BCT, the randomForest package for RF, and the e1071 package for SVM and Naive Bayes. The models were validated with 10-fold cross-validation (CV). The true classification rates (accuracy) of the four data mining techniques for classifying students are compared. Table 3 shows the classification accuracy of these techniques. According to these results, the best method for our data is SVM.

Table 3. Classification accuracy of the classification algorithms

Algorithm                          Classification Accuracy (%)
Random Forest (1)                  63,3
Support Vector Machine (2)         67,1
Naive Bayes (3)                    59,6
Boosted Classification Tree (4)    61,4

(1) RF: 1000 trees, mtry = 5.
(2) SVM: radial basis kernel.
(3) Naive Bayes: Threshold: 0.100, Sub-Sample Rate: 0,30.
(4) Boosted Classification Tree: 1000 trees, Number of Additive Terms: 200, Learning Rate:
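To make the evaluation procedure concrete, the following is a minimal plain-Python sketch of a Naive Bayes classifier scored with 10-fold cross-validation. It is a stand-in for illustration, not the authors' R code: the Gaussian likelihood for each attribute, the function names (`fit_gaussian_nb`, `predict`, `cross_val_accuracy`), the fold assignment, and the synthetic three-class data are all assumptions, and the resulting accuracy says nothing about the figures in Table 3.

```python
import math
import random
from collections import defaultdict


def fit_gaussian_nb(X, y):
    """Estimate the class prior P(H) and a per-attribute mean/variance for each class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        prior = len(rows) / len(X)
        stats = []
        for column in zip(*rows):
            mean = sum(column) / len(column)
            # Small constant avoids a zero variance (assumption of this sketch).
            var = sum((v - mean) ** 2 for v in column) / len(column) + 1e-9
            stats.append((mean, var))
        model[label] = (prior, stats)
    return model


def predict(model, row):
    """Return the class maximizing log P(H) + sum_i log P(x_i | H).

    P(X) is the same for every class, so it can be dropped from the comparison.
    """
    def log_posterior(label):
        prior, stats = model[label]
        total = math.log(prior)
        for value, (mean, var) in zip(row, stats):
            total += -0.5 * math.log(2 * math.pi * var) - (value - mean) ** 2 / (2 * var)
        return total
    return max(model, key=log_posterior)


def cross_val_accuracy(X, y, k=10, seed=0):
    """Shuffle once, split into k folds, hold each fold out in turn, return overall accuracy."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    correct = 0
    for held_out in folds:
        test_set = set(held_out)
        train = [i for i in indices if i not in test_set]
        model = fit_gaussian_nb([X[i] for i in train], [y[i] for i in train])
        correct += sum(predict(model, X[i]) == y[i] for i in held_out)
    return correct / len(X)


# Synthetic stand-in data (NOT the paper's dataset): two usage-like attributes per student.
rng = random.Random(42)
X, y = [], []
for grade, center in [("A", 40.0), ("C", 20.0), ("F", 5.0)]:
    for _ in range(27):
        X.append([rng.gauss(center, 4.0), rng.gauss(center * 2, 8.0)])
        y.append(grade)

ACCURACY = cross_val_accuracy(X, y, k=10)
```

Every sample is used for testing exactly once, so `ACCURACY` is the share of all samples classified correctly by a model that never saw them during training, which is what a true classification rate under 10-fold CV measures.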

Conclusions

Although mining educational data to predict students' performance is not a new phenomenon, no published paper has so far used data mining techniques to predict student performance based on wiki usage data. This paper reports a comparison of Random Forest, Support Vector Machines, Naive Bayes, and Boosted Classification Tree for classifying students in order to predict the final grades obtained in an undergraduate course on the basis of their wiki usage data. In recent years, these methods have become popular and robust choices for prediction problems. We compared different classification algorithms because no single algorithm obtains the best classification accuracy in all cases and on all datasets [15, 25]. According to our findings, SVM outperforms the other methods. A possible reason for this result is that our classification problem is nonlinear. The tree-based methods also performed adequately. These findings show that data mining methods can help researchers assess students' individual contributions to a wiki if the necessary information is stored in a database or in log files. The study also showed that students' navigation logs and wiki usage data are good predictors of their course performance. In future work, instructors can use the extracted knowledge for decision making and for classifying new students [15]. Feedback is an important variable in changing behavior, and studies suggest that many students will respond appropriately to feedback that they understand [26]. The extracted knowledge can also be used as feedback to help students who are potentially at risk, making it possible to intervene early enough for them to change their behavior.

References

1. Ben-Zvi, D., Using Wiki to Promote Collaborative Learning in Statistics Education. Technology Innovations in Statistics Education, (1).
2. Romero, C. and S. Ventura, Educational Data Mining: A Review of the State of the Art. Systems, Man, and Cybernetics, Part C: Applications and Reviews, IEEE Transactions on, (6): p.
3. Rudas, I.J. and P. Tóth. Web Mining Usage in Course Development. in The SEFI Annual Conference. Lisbon, Portugal.
4. Romero, C. and S. Ventura, Data mining in education. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, (1): p.
5. Baker, R.S.J.d., Data Mining for Education, in International Encyclopedia of Education, 3rd Ed., B. McGaw, P. Peterson, and E. Baker, Editors. 2011, Oxford, UK: Elsevier.
6. Lopez, M.I., et al. Classification via clustering for predicting final marks based on student participation in forums. in 5th International Conference on Educational Data Mining, EDM. Chania, Greece.
7. Fausett, L.V. and W. Elwasif. Predicting performance from test scores using backpropagation and counterpropagation. in Neural Networks, IEEE World Congress on Computational Intelligence., 1994 IEEE International Conference on.
8. Martinez, D. Predicting Student Outcomes Using Discriminant Function Analysis.
9. Minaei-Bidgoli, B. and W. Punch, Using Genetic Algorithms for Data Mining Optimization in an Educational Web-Based System, in Genetic and Evolutionary Computation GECCO 2003, E. Cantú-Paz, et al., Editors. 2003, Springer Berlin / Heidelberg. p.
10. Superby, J.F., J.P. Vandamme, and N. Meskens. Determination of Factors Influencing the Achievement of the First-year University Students using Data Mining Methods. in Workshop on Educational Data Mining.
11. Kotsiantis, S., C. Pierrakeas, and P. Pintelas, Predicting Students' Performance in Distance Learning Using Machine Learning Techniques. Applied Artificial Intelligence, (5): p.
12. Kotsiantis, S.B. and P.E. Pintelas. Predicting students marks in Hellenic Open University. in Advanced Learning Technologies, ICALT. Fifth IEEE International Conference on.
13. Delgado, M., et al. Predicting Students Marks from Moodle Logs using Neural Network Models. in Current Developments in Technology-Assisted Education. Badajoz.
14. Romero, C., et al. Data mining algorithms to classify students. in Proc. Int. Conf. Educ. Data Mining. Montreal, Canada.
15. Romero, C., et al., Web usage mining for predicting final marks of students that use Moodle courses. Computer Applications in Engineering Education, 2010: p. n/a-n/a.
16. Ko, B., S. Kim, and J.-Y. Nam, X-ray Image Classification Using Random Forests with Local Wavelet-Based CS-Local Binary Patterns. Journal of Digital Imaging, (6): p.
17. Chen, C.C.M., et al., Methods for Identifying SNP Interactions: A Review on Variations of Logic Regression, Random Forest and Bayesian Logistic Regression. Computational Biology and Bioinformatics, IEEE/ACM Transactions on, (6): p.
18. Breiman, L., Random Forests. Machine Learning, (1): p.
19. Lin, X., et al., A method for handling metabonomics data from liquid chromatography/mass spectrometry: combinational use of support vector machine recursive feature elimination, genetic algorithm and random forest for feature selection. Metabolomics, (4): p.
20. StatSoft, I., Electronic Statistics Textbook. 2011, StatSoft: Tulsa.
21. Vapnik, V., Statistical learning theory. 1998: Wiley.
22. Widodo, A., B.-S. Yang, and T. Han, Combination of independent component analysis and support vector machines for intelligent faults diagnosis of induction motors. Expert Systems with Applications, (2): p.
23. Han, J., M. Kamber, and J. Pei, Data Mining: Concepts and Techniques, Second Edition (The Morgan Kaufmann Series in Data Management Systems). 2006: Morgan Kaufmann.
24. Akçapınar, G. and P. Aşkar. Measuring Author Contributions to the Mediawiki. in IADIS International Conference WWW/Internet. Rome, Italy.
25. Osmanbegović, E. and M. Suljić, Data Mining Approach for Predicting Student Performance. Economic Review, (1).
26. Bienkowski, M., M. Feng, and B. Means, Enhancing Teaching and Learning Through Educational Data Mining and Learning Analytics: An Issue Brief. 2012: Washington, D.C.

Data Mining. Nonlinear Classification

Data Mining. Nonlinear Classification Data Mining Unit # 6 Sajjad Haider Fall 2014 1 Nonlinear Classification Classes may not be separable by a linear boundary Suppose we randomly generate a data set as follows: X has range between 0 to 15

More information

Data Mining - Evaluation of Classifiers

Data Mining - Evaluation of Classifiers Data Mining - Evaluation of Classifiers Lecturer: JERZY STEFANOWSKI Institute of Computing Sciences Poznan University of Technology Poznan, Poland Lecture 4 SE Master Course 2008/2009 revised for 2010

More information

DATA MINING APPROACH FOR PREDICTING STUDENT PERFORMANCE

DATA MINING APPROACH FOR PREDICTING STUDENT PERFORMANCE . Economic Review Journal of Economics and Business, Vol. X, Issue 1, May 2012 /// DATA MINING APPROACH FOR PREDICTING STUDENT PERFORMANCE Edin Osmanbegović *, Mirza Suljić ** ABSTRACT Although data mining

More information

BOOSTING - A METHOD FOR IMPROVING THE ACCURACY OF PREDICTIVE MODEL

BOOSTING - A METHOD FOR IMPROVING THE ACCURACY OF PREDICTIVE MODEL The Fifth International Conference on e-learning (elearning-2014), 22-23 September 2014, Belgrade, Serbia BOOSTING - A METHOD FOR IMPROVING THE ACCURACY OF PREDICTIVE MODEL SNJEŽANA MILINKOVIĆ University

More information

E-Learning Using Data Mining. Shimaa Abd Elkader Abd Elaal

E-Learning Using Data Mining. Shimaa Abd Elkader Abd Elaal E-Learning Using Data Mining Shimaa Abd Elkader Abd Elaal -10- E-learning using data mining Shimaa Abd Elkader Abd Elaal Abstract Educational Data Mining (EDM) is the process of converting raw data from

More information

Scholars Journal of Arts, Humanities and Social Sciences

Scholars Journal of Arts, Humanities and Social Sciences Scholars Journal of Arts, Humanities and Social Sciences Sch. J. Arts Humanit. Soc. Sci. 2014; 2(3B):440-444 Scholars Academic and Scientific Publishers (SAS Publishers) (An International Publisher for

More information

ENSEMBLE DECISION TREE CLASSIFIER FOR BREAST CANCER DATA

ENSEMBLE DECISION TREE CLASSIFIER FOR BREAST CANCER DATA ENSEMBLE DECISION TREE CLASSIFIER FOR BREAST CANCER DATA D.Lavanya 1 and Dr.K.Usha Rani 2 1 Research Scholar, Department of Computer Science, Sree Padmavathi Mahila Visvavidyalayam, Tirupati, Andhra Pradesh,

More information

Knowledge Discovery and Data Mining

Knowledge Discovery and Data Mining Knowledge Discovery and Data Mining Unit # 11 Sajjad Haider Fall 2013 1 Supervised Learning Process Data Collection/Preparation Data Cleaning Discretization Supervised/Unuspervised Identification of right

More information

Identifying At-Risk Students Using Machine Learning Techniques: A Case Study with IS 100

Identifying At-Risk Students Using Machine Learning Techniques: A Case Study with IS 100 Identifying At-Risk Students Using Machine Learning Techniques: A Case Study with IS 100 Erkan Er Abstract In this paper, a model for predicting students performance levels is proposed which employs three

More information

Comparison of Data Mining Techniques used for Financial Data Analysis

Comparison of Data Mining Techniques used for Financial Data Analysis Comparison of Data Mining Techniques used for Financial Data Analysis Abhijit A. Sawant 1, P. M. Chawan 2 1 Student, 2 Associate Professor, Department of Computer Technology, VJTI, Mumbai, INDIA Abstract

More information

Feature vs. Classifier Fusion for Predictive Data Mining a Case Study in Pesticide Classification

Feature vs. Classifier Fusion for Predictive Data Mining a Case Study in Pesticide Classification Feature vs. Classifier Fusion for Predictive Data Mining a Case Study in Pesticide Classification Henrik Boström School of Humanities and Informatics University of Skövde P.O. Box 408, SE-541 28 Skövde

More information

Data Mining Algorithms to Classify Students

Data Mining Algorithms to Classify Students Data Mining Algorithms to Classify Students Cristóbal Romero, Sebastián Ventura, Pedro G. Espejo and César Hervás {cromero, sventura, pgonzalez, chervas}@uco.es Computer Science Department, Córdoba University,

More information

Data Mining Part 5. Prediction

Data Mining Part 5. Prediction Data Mining Part 5. Prediction 5.1 Spring 2010 Instructor: Dr. Masoud Yaghini Outline Classification vs. Numeric Prediction Prediction Process Data Preparation Comparing Prediction Methods References Classification

More information

The Predictive Data Mining Revolution in Scorecards:

The Predictive Data Mining Revolution in Scorecards: January 13, 2013 StatSoft White Paper The Predictive Data Mining Revolution in Scorecards: Accurate Risk Scoring via Ensemble Models Summary Predictive modeling methods, based on machine learning algorithms

More information

Introducing diversity among the models of multi-label classification ensemble

Introducing diversity among the models of multi-label classification ensemble Introducing diversity among the models of multi-label classification ensemble Lena Chekina, Lior Rokach and Bracha Shapira Ben-Gurion University of the Negev Dept. of Information Systems Engineering and

More information

How To Solve The Kd Cup 2010 Challenge

How To Solve The Kd Cup 2010 Challenge A Lightweight Solution to the Educational Data Mining Challenge Kun Liu Yan Xing Faculty of Automation Guangdong University of Technology Guangzhou, 510090, China [email protected] [email protected]

More information

DATA MINING TECHNIQUES AND APPLICATIONS

DATA MINING TECHNIQUES AND APPLICATIONS DATA MINING TECHNIQUES AND APPLICATIONS Mrs. Bharati M. Ramageri, Lecturer Modern Institute of Information Technology and Research, Department of Computer Application, Yamunanagar, Nigdi Pune, Maharashtra,

More information

Data Mining Solutions for the Business Environment

Data Mining Solutions for the Business Environment Database Systems Journal vol. IV, no. 4/2013 21 Data Mining Solutions for the Business Environment Ruxandra PETRE University of Economic Studies, Bucharest, Romania [email protected] Over

More information

Random forest algorithm in big data environment

Random forest algorithm in big data environment Random forest algorithm in big data environment Yingchun Liu * School of Economics and Management, Beihang University, Beijing 100191, China Received 1 September 2014, www.cmnt.lv Abstract Random forest

More information

Classification of Bad Accounts in Credit Card Industry

Classification of Bad Accounts in Credit Card Industry Classification of Bad Accounts in Credit Card Industry Chengwei Yuan December 12, 2014 Introduction Risk management is critical for a credit card company to survive in such competing industry. In addition

More information

Selecting Data Mining Model for Web Advertising in Virtual Communities

Selecting Data Mining Model for Web Advertising in Virtual Communities Selecting Data Mining for Web Advertising in Virtual Communities Jerzy Surma Faculty of Business Administration Warsaw School of Economics Warsaw, Poland e-mail: [email protected] Mariusz Łapczyński

More information

Customer Classification And Prediction Based On Data Mining Technique

Customer Classification And Prediction Based On Data Mining Technique Customer Classification And Prediction Based On Data Mining Technique Ms. Neethu Baby 1, Mrs. Priyanka L.T 2 1 M.E CSE, Sri Shakthi Institute of Engineering and Technology, Coimbatore 2 Assistant Professor

More information

AUTO CLAIM FRAUD DETECTION USING MULTI CLASSIFIER SYSTEM

AUTO CLAIM FRAUD DETECTION USING MULTI CLASSIFIER SYSTEM AUTO CLAIM FRAUD DETECTION USING MULTI CLASSIFIER SYSTEM ABSTRACT Luis Alexandre Rodrigues and Nizam Omar Department of Electrical Engineering, Mackenzie Presbiterian University, Brazil, São Paulo [email protected],[email protected]

More information

Fine Particulate Matter Concentration Level Prediction by using Tree-based Ensemble Classification Algorithms

Fine Particulate Matter Concentration Level Prediction by using Tree-based Ensemble Classification Algorithms Fine Particulate Matter Concentration Level Prediction by using Tree-based Ensemble Classification Algorithms Yin Zhao School of Mathematical Sciences Universiti Sains Malaysia (USM) Penang, Malaysia Yahya

More information

Knowledge Discovery and Data Mining. Bootstrap review. Bagging Important Concepts. Notes. Lecture 19 - Bagging. Tom Kelsey. Notes

Knowledge Discovery and Data Mining. Bootstrap review. Bagging Important Concepts. Notes. Lecture 19 - Bagging. Tom Kelsey. Notes Knowledge Discovery and Data Mining Lecture 19 - Bagging Tom Kelsey School of Computer Science University of St Andrews http://tom.host.cs.st-andrews.ac.uk [email protected] Tom Kelsey ID5059-19-B &

More information

Chapter 12 Bagging and Random Forests

Chapter 12 Bagging and Random Forests Chapter 12 Bagging and Random Forests Xiaogang Su Department of Statistics and Actuarial Science University of Central Florida - 1 - Outline A brief introduction to the bootstrap Bagging: basic concepts

More information

Comparing the Results of Support Vector Machines with Traditional Data Mining Algorithms

Comparing the Results of Support Vector Machines with Traditional Data Mining Algorithms Comparing the Results of Support Vector Machines with Traditional Data Mining Algorithms Scott Pion and Lutz Hamel Abstract This paper presents the results of a series of analyses performed on direct mail

More information

International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014

International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014 RESEARCH ARTICLE OPEN ACCESS A Survey of Data Mining: Concepts with Applications and its Future Scope Dr. Zubair Khan 1, Ashish Kumar 2, Sunny Kumar 3 M.Tech Research Scholar 2. Department of Computer

More information

EFFICIENT DATA PRE-PROCESSING FOR DATA MINING

EFFICIENT DATA PRE-PROCESSING FOR DATA MINING EFFICIENT DATA PRE-PROCESSING FOR DATA MINING USING NEURAL NETWORKS JothiKumar.R 1, Sivabalan.R.V 2 1 Research scholar, Noorul Islam University, Nagercoil, India Assistant Professor, Adhiparasakthi College

More information

Data Quality Mining: Employing Classifiers for Assuring consistent Datasets

Data Quality Mining: Employing Classifiers for Assuring consistent Datasets Data Quality Mining: Employing Classifiers for Assuring consistent Datasets Fabian Grüning Carl von Ossietzky Universität Oldenburg, Germany, [email protected] Abstract: Independent

More information

SVM Ensemble Model for Investment Prediction

SVM Ensemble Model for Investment Prediction 19 SVM Ensemble Model for Investment Prediction Chandra J, Assistant Professor, Department of Computer Science, Christ University, Bangalore Siji T. Mathew, Research Scholar, Christ University, Dept of

More information

MS1b Statistical Data Mining

MS1b Statistical Data Mining MS1b Statistical Data Mining Yee Whye Teh Department of Statistics Oxford http://www.stats.ox.ac.uk/~teh/datamining.html Outline Administrivia and Introduction Course Structure Syllabus Introduction to

More information

Advanced Ensemble Strategies for Polynomial Models

Advanced Ensemble Strategies for Polynomial Models Advanced Ensemble Strategies for Polynomial Models Pavel Kordík 1, Jan Černý 2 1 Dept. of Computer Science, Faculty of Information Technology, Czech Technical University in Prague, 2 Dept. of Computer

More information

How To Make A Credit Risk Model For A Bank Account

How To Make A Credit Risk Model For A Bank Account TRANSACTIONAL DATA MINING AT LLOYDS BANKING GROUP Csaba Főző [email protected] 15 October 2015 CONTENTS Introduction 04 Random Forest Methodology 06 Transactional Data Mining Project 17 Conclusions

More information

International Journal of Computer Science Trends and Technology (IJCST) Volume 3 Issue 3, May-June 2015

International Journal of Computer Science Trends and Technology (IJCST) Volume 3 Issue 3, May-June 2015 RESEARCH ARTICLE OPEN ACCESS Data Mining Technology for Efficient Network Security Management Ankit Naik [1], S.W. Ahmad [2] Student [1], Assistant Professor [2] Department of Computer Science and Engineering

More information

E-commerce Transaction Anomaly Classification

E-commerce Transaction Anomaly Classification E-commerce Transaction Anomaly Classification Minyong Lee [email protected] Seunghee Ham [email protected] Qiyi Jiang [email protected] I. INTRODUCTION Due to the increasing popularity of e-commerce

More information

Better credit models benefit us all

Better credit models benefit us all Better credit models benefit us all Agenda Credit Scoring - Overview Random Forest - Overview Random Forest outperform logistic regression for credit scoring out of the box Interaction term hypothesis

More information

The Data Mining Process

Sequence for Determining Necessary Data. Wrong: Catalog everything you have, and decide what data is important. Right: Work backward from the solution, define the problem explicitly, and map out the data

Class #6: Non-linear classification. ML4Bio 2012 February 17 th, 2012 Quaid Morris

1 Module #: Title of Module 2 Review Overview Linear separability Non-linear classification Linear Support Vector Machines

FRAUD DETECTION IN ELECTRIC POWER DISTRIBUTION NETWORKS USING AN ANN-BASED KNOWLEDGE-DISCOVERY PROCESS

Breno C. Costa, Bruno. L. A. Alberto, André M. Portela, W. Maduro, Esdras O. Eler PDITec, Belo Horizonte,

Classification and Regression by randomForest

Vol. 2/3, December 2002 Classification and Regression by randomForest Andy Liaw and Matthew Wiener Introduction Recently there has been a lot of interest in ensemble learning methods that generate many

Predicting the Risk of Heart Attacks using Neural Network and Decision Tree

S.Florence 1, N.G.Bhuvaneswari Amma 2, G.Annapoorani 3, K.Malathi 4 PG Scholar, Indian Institute of Information Technology, Srirangam,

Generalizing Random Forests Principles to other Methods: Random MultiNomial Logit, Random Naive Bayes, Anita Prinzie & Dirk Van den Poel

Copyright 2008 All rights reserved. Random Forests Forest of decision

Car insurance risk assessment with data mining for an Iranian leading insurance company

International Journal of Business and Economics Research 2014; 3(3): 128-134 Published online May 30, 2014 (http://www.sciencepublishinggroup.com/j/ijber) doi: 10.11648/j.ijber.20140303.12 Car insurance

Classification algorithm in Data mining: An Overview

S.Neelamegam #1, Dr.E.Ramaraj *2 #1 M.phil Scholar, Department of Computer Science and Engineering, Alagappa University, Karaikudi. *2 Professor, Department

Leveraging Ensemble Models in SAS Enterprise Miner

ABSTRACT Paper SAS133-2014 Leveraging Ensemble Models in SAS Enterprise Miner Miguel Maldonado, Jared Dean, Wendy Czika, and Susan Haller SAS Institute Inc. Ensemble models combine two or more models to

Knowledge Discovery and Data Mining

Unit # 10 Sajjad Haider Fall 2012 1 Supervised Learning Process Data Collection/Preparation Data Cleaning Discretization Supervised/Unsupervised Identification of right

Gerry Hobbs, Department of Statistics, West Virginia University

Decision Trees as a Predictive Modeling Method Gerry Hobbs, Department of Statistics, West Virginia University Abstract Predictive modeling has become an important area of interest in tasks such as credit

How To Identify A Churner

2012 45th Hawaii International Conference on System Sciences A New Ensemble Model for Efficient Churn Prediction in Mobile Telecommunication Namhyoung Kim, Jaewook Lee Department of Industrial and Management

DECISION TREE INDUCTION FOR FINANCIAL FRAUD DETECTION USING ENSEMBLE LEARNING TECHNIQUES

Vijayalakshmi Mahanra Rao 1, Yashwant Prasad Singh 2 Multimedia University, Cyberjaya, MALAYSIA 1 [email protected]

Data Mining Algorithms Part 1. Dejan Sarka

Join the conversation on Twitter: @DevWeek #DW2015 Instructor Bio Dejan Sarka ([email protected]) 30 years of experience SQL Server MVP, MCT, 13 books 7+ courses

An Introduction to Data Mining

An Introduction to Intel Beijing [email protected] January 17, 2014 Outline 1 DW Overview What is Notable Application of Conference, Software and Applications Major Process in 2 Major Tasks in Detail

HYBRID PROBABILITY BASED ENSEMBLES FOR BANKRUPTCY PREDICTION

Chihli Hung 1, Jing Hong Chen 2, Stefan Wermter 3, 1,2 Department of Management Information Systems, Chung Yuan Christian University, Taiwan

Predicting Student Performance by Using Data Mining Methods for Classification

BULGARIAN ACADEMY OF SCIENCES CYBERNETICS AND INFORMATION TECHNOLOGIES Volume 13, No 1 Sofia 2013 Print ISSN: 1311-9702; Online ISSN: 1314-4081 DOI: 10.2478/cait-2013-0006 Predicting Student Performance

Support Vector Machines with Clustering for Training with Very Large Datasets

Theodoros Evgeniou Technology Management INSEAD Bd de Constance, Fontainebleau 77300, France [email protected] Massimiliano

Applied Data Mining Analysis: A Step-by-Step Introduction Using Real-World Data Sets

http://info.salford-systems.com/jsm-2015-ctw August 2015 Salford Systems Course Outline Demonstration of two classification

An Overview of Knowledge Discovery Database and Data mining Techniques

Priyadharsini.C 1, Dr. Antony Selvadoss Thanamani 2 M.Phil, Department of Computer Science, NGM College, Pollachi, Coimbatore, Tamilnadu,

Enhanced Boosted Trees Technique for Customer Churn Prediction Model

IOSR Journal of Engineering (IOSRJEN) ISSN (e): 2250-3021, ISSN (p): 2278-8719 Vol. 04, Issue 03 (March. 2014), V5 PP 41-45 www.iosrjen.org Enhanced Boosted Trees Technique for Customer Churn Prediction

Knowledge Discovery from patents using KMX Text Analytics

Dr. Anton Heijs [email protected] Treparel Abstract In this white paper we discuss how the KMX technology of Treparel can help searchers

ENHANCED CONFIDENCE INTERPRETATIONS OF GP BASED ENSEMBLE MODELING RESULTS

Michael Affenzeller (a), Stephan M. Winkler (b), Stefan Forstenlechner (c), Gabriel Kronberger (d), Michael Kommenda (e), Stefan

131-1. Adding New Level in KDD to Make the Web Usage Mining More Efficient. Abstract. 1. Introduction [1]. 1/10

131-1 Adding New Level in KDD to Make the Web Usage Mining More Efficient Mohammad Ala a AL_Hamami PHD Student, Lecturer m_ah_1@yahoocom Soukaena Hassan Hashem PHD Student, Lecturer soukaena_hassan@yahoocom

Comparison of K-means and Backpropagation Data Mining Algorithms

Nitu Mathuriya, Dr. Ashish Bansal Abstract Data mining has got more and more mature as a field of basic research in computer science and

Distributed forests for MapReduce-based machine learning

Ryoji Wakayama, Ryuei Murata, Akisato Kimura, Takayoshi Yamashita, Yuji Yamauchi, Hironobu Fujiyoshi Chubu University, Japan. NTT Communication

Artificial Neural Network, Decision Tree and Statistical Techniques Applied for Designing and Developing E-mail Classifier

International Journal of Recent Technology and Engineering (IJRTE) ISSN: 2277-3878, Volume-1, Issue-6, January 2013 Artificial Neural Network, Decision Tree and Statistical Techniques Applied for Designing

Information Management course

Università degli Studi di Milano Master Degree in Computer Science Information Management course Teacher: Alberto Ceselli Lecture 01: 06/10/2015 Practical information: Teacher: Alberto Ceselli ([email protected])

COMP3420: Advanced Databases and Data Mining. Classification and prediction: Introduction and Decision Tree Induction

Lecture outline Classification versus prediction Classification A two step process Supervised

Comparison of machine learning methods for intelligent tutoring systems

Wilhelmiina Hämäläinen 1 and Mikko Vinni 1 Department of Computer Science, University of Joensuu, P.O. Box 111, FI-80101 Joensuu

Data Mining Practical Machine Learning Tools and Techniques

Ensemble learning Data Mining Practical Machine Learning Tools and Techniques Slides for Chapter 8 of Data Mining by I. H. Witten, E. Frank and M. A. Hall Combining multiple models Bagging The basic idea

Predicting required bandwidth for educational institutes using prediction techniques in data mining (Case Study: Qom Payame Noor University)

IJCSNS International Journal of Computer Science and Network Security, VOL.11 No.6, June 2011 Predicting required bandwidth for educational institutes using prediction techniques in data mining (Case

An Analysis of Missing Data Treatment Methods and Their Application to Health Care Dataset

Peng Liu 1, Elia El-Darzi 2, Lei Lei 1, Christos Vasilakis 2, Panagiotis Chountas 2, and Wei Huang

Principles of Data Mining by Hand&Mannila&Smyth

Slides for Textbook Ari Visa,, Institute of Signal Processing Tampere University of Technology October 4, 2010 Data Mining: Concepts and Techniques 1 Differences

Scalable Developments for Big Data Analytics in Remote Sensing

Federated Systems and Data Division Research Group High Productivity Data Processing Dr.-Ing. Morris Riedel et al. Research Group Leader,

EFFICIENCY OF DECISION TREES IN PREDICTING STUDENT'S ACADEMIC PERFORMANCE

S. Anupama Kumar 1 and Dr. Vijayalakshmi M.N 2 1 Research Scholar, PRIST University, 1 Assistant Professor, Dept of M.C.A. 2 Associate

Predictive Data modeling for health care: Comparative performance study of different prediction models

Shivanand Hiremath [email protected] National Institute of Industrial Engineering (NITIE) Vihar

Azure Machine Learning, SQL Data Mining and R

Day-by-day Agenda Prerequisites No formal prerequisites. Basic knowledge of SQL Server Data Tools, Excel and any analytical experience helps. Best of all:

Data Mining for Knowledge Management. Classification

Themis Palpanas University of Trento http://disi.unitn.eu/~themis Thanks for slides to: Jiawei Han Eamonn Keogh

Predicting borrowers chance of defaulting on credit loans

Junjie Liang ([email protected]) Abstract Credit score prediction is of great interest to banks as the outcome of the prediction algorithm

Prediction of Heart Disease Using Naïve Bayes Algorithm

R.Karthiyayini 1, S.Chithaara 2 Assistant Professor, Department of computer Applications, Anna University, BIT campus, Tiruchirapalli, Tamilnadu,

Decision Trees from large Databases: SLIQ

C4.5 often iterates over the training set. How often? If the training set does not fit into main memory, swapping makes C4.5 impractical! SLIQ: Sort the values

Classifying Large Data Sets Using SVMs with Hierarchical Clusters. Presented by :Limou Wang

Overview SVM Overview Motivation Hierarchical micro-clustering algorithm Clustering-Based SVM (CB-SVM) Experimental

Knowledge Discovery and Data Mining

Unit # 6 Sajjad Haider Fall 2014 1 Evaluating the Accuracy of a Classifier Holdout, random subsampling, cross-validation, and the bootstrap are common techniques for

Practical Data Science with Azure Machine Learning, SQL Data Mining, and R

Overview This 4-day class is the first of the two data science courses taught by Rafal Lukawiecki. Some of the topics will be

Predict Influencers in the Social Network

Ruishan Liu, Yang Zhao and Liuyu Zhou Email: rliu2, yzhao2, [email protected] Department of Electrical Engineering, Stanford University Abstract Given two persons

The Operational Value of Social Media Information. Social Media and Customer Interaction

Dennis J. Zhang (Kellogg School of Management) Ruomeng Cui (Kelley School of Business) Santiago Gallino (Tuck School of Business) Antonio Moreno-Garcia

Chapter 6. The stacking ensemble approach

This chapter proposes the stacking ensemble approach for combining different data mining classifiers to get better performance. Other combination techniques like voting, bagging etc are also described

An Introduction to Data Mining. Big Data World. Related Fields and Disciplines. What is Data Mining? 2/12/2015

An Introduction to Data Mining for Wind Power Management Spring 2015 Big Data World Every minute: Google receives over 4 million search queries Facebook users share almost 2.5 million pieces of content

Application of Event Based Decision Tree and Ensemble of Data Driven Methods for Maintenance Action Recommendation

James K. Kimotho, Christoph Sondermann-Woelke, Tobias Meyer, and Walter Sextro Department

USE DATA MINING TO IMPROVE STUDENT RETENTION IN HIGHER EDUCATION A CASE STUDY

Ying Zhang, Samia Oussena Thames Valley University, London,UK [email protected], [email protected] Tony Clark, Hyeonsook

College of Health and Human Services. Fall 2013. Syllabus

College of Health and Human Services Fall 2013 Syllabus information placement Instructor description objectives HAP 780 : Data Mining in Health Care Time: Mondays, 7.20pm 10pm (except for 3 rd lecture

Predicting Students Final GPA Using Decision Trees: A Case Study

Mashael A. Al-Barrak and Muna Al-Razgan Abstract Educational data mining is the process of applying data mining tools and techniques to

A Methodology for Predictive Failure Detection in Semiconductor Fabrication

Peter Scheibelhofer (TU Graz) Dietmar Gleispach, Günter Hayderer (austriamicrosystems AG) 09-09-2011 Peter Scheibelhofer (TU

STATISTICA. Financial Institutions. Case Study: Credit Scoring. and

Financial Institutions and STATISTICA Case Study: Credit Scoring STATISTICA Solutions for Business Intelligence, Data Mining, Quality Control, and Web-based Analytics Table of Contents INTRODUCTION: WHAT

Ensemble Data Mining Methods

Nikunj C. Oza, Ph.D., NASA Ames Research Center, USA INTRODUCTION Ensemble Data Mining Methods, also known as Committee Methods or Model Combiners, are machine learning methods

Model Combination. 24 November 2009

Datamining 1 2009-2010 Plan 1 Principles of model combination 2 Resampling methods Bagging Random Forests Boosting 3 Hybrid methods Stacking Generic algorithm for multistrategy

REVIEW OF ENSEMBLE CLASSIFICATION

Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology ISSN 2320 088X IJCSMC, Vol. 2, Issue.

EMPIRICAL STUDY ON SELECTION OF TEAM MEMBERS FOR SOFTWARE PROJECTS DATA MINING APPROACH

SANGITA GUPTA 1, SUMA. V. 2 1 Jain University, Bangalore 2 Dayanada Sagar Institute, Bangalore, India Abstract- One

COURSE RECOMMENDER SYSTEM IN E-LEARNING

International Journal of Computer Science and Communication Vol. 3, No. 1, January-June 2012, pp. 159-164 COURSE RECOMMENDER SYSTEM IN E-LEARNING Sunita B Aher 1, Lobo L.M.R.J. 2 1 M.E. (CSE)-II, Walchand

Research Phases of University Data Mining Project Development

Dorina Kabakchieva 1, Kamelia Stefanova 2, Valentin Kissimov 3, and Roumen Nikolov 4 1 Sofia University St. Kl. Ohridski, 125 Tzarigradsko