Chapter 7: Association Rule Mining in Learning Management Systems

Size: px
Start display at page:

Download "Chapter 7: Association Rule Mining in Learning Management Systems"

Transcription

1 1 Chapter 7: Association Rule Mining in Learning Management Systems Enrique García 1, Cristóbal Romero 1, Sebastián Ventura 1, Carlos de Castro 1, Toon Calders 2 1 Department of Computer Science and Numerical Analysis, University of Cordoba, Spain 2 Eindhoven University of Technology (TU/e), PO Box 513, Eindhoven, The Netherlands Introduction Learning Management Systems (LMSs) can offer a great variety of channels and workspaces to facilitate information sharing and communication among participants in a course. They let educators distribute information to students, produce content material, prepare assignments and tests, engage in discussions, manage distance classes and enable collaborative learning with forums, chats, file storage areas, news services, etc. Some examples of commercial systems are Blackboard [1], WebCT [2] and Top-Class [3] while some examples of free systems are Moodle [4], Ilias [5] and Claroline [6]. One of the most commonly used is Moodle (modular object oriented developmental learning environment), a free learning management system enabling the creation of powerful, flexible and engaging online courses and experiences (Rice, 2006). These e-learning systems accumulate a vast amount of information which is very valuable for analyzing students behaviour and could create a gold mine of educational data [7]. They can record any student activities involved, such as reading, writing, taking tests, performing various tasks, and even communicating with peers. They normally also provide a database that stores all the system s information: personal information about the users (profile), academic results and users interaction data. However, due to the vast quantities of data these systems can generate daily, it is very difficult to manage data analysis manually. Instructors and

2 2 course authors demand tools to assist them in this task, preferably on a continual basis. Although some platforms offer some reporting tools, it becomes hard for a tutor to extract useful information when there are a great number of students [8]. The current LMSs do not provide specific tools allowing educators to thoroughly track and assess all learners activities while evaluating the structure and contents of the course and its effectiveness for the learning process [9]. A very promising area for attaining this objective is the use of data mining. Data mining or knowledge discovery in databases (KDD) is the automatic extraction of implicit and interesting patterns from large data collections. Next to statistics and data visualization, there are many data mining techniques for analyzing the data. Some of the most useful data mining tasks and methods are clustering, classification and association rule mining. These methods uncover new, interesting and useful knowledge based on users usage data. In the last few years, researchers have begun to apply data mining methods to help instructors and administrators to improve e- learning systems [10]. Association rule mining has been applied to e-learning systems for traditionally association analysis (finding correlations between items in a dataset), including, e.g., the following tasks: building recommender agents for on-line learning activities or shortcuts [11], automatically guiding the learner s activities and intelligently generate and recommend learning materials [12], identifying attributes characterizing patterns of performance disparity between various groups of students [13], discovering interesting relationships from a student s usage information in order to provide feedback to the course author [14], finding out the relationships between each pattern of a learner s behavior [15], finding student mistakes often occurring together [16], guiding the search for the best fitting transfer model of student learning [17], optimizing the content of an e-learning portal by determining the content of most interest to the

3 3 user [18], extracting useful patterns to help educators and web masters evaluating and interpreting on-line course activities [11], and personalizing e-learning based on aggregate usage profiles and a domain ontology [19]. Association rules mining is one of the most well studied data mining tasks. It discovers relationships among attributes in databases, producing if-then statements concerning attributevalues [20]. Association rule mining has been applied to web-based education systems from two points of view: 1) help professors to obtain detailed feedback of the e-learning process: e.g., finding out how the students learn on the web, to evaluate the students based on their navigation patterns, to classify the students into groups, to restructure the contents of the web site to personalize the courses; and 2) help students in their interaction with the e-learning system: e.g., adaptation of the course according to the apprentice's progress, e.g., by recommending to them personalized learning paths based on the previous experiences other similar students. This paper is organized in the following way: First, we describe the background of association rule mining in general and more specifically its application to e-learning. Then, we describe the main drawbacks of and some solutions for applying association rule algorithms in LMSs. Next, we show a practical tutorial of using an association rule mining algorithm over data generated from a Moodle system. Finally, the conclusions and further research are outlined. Background IF-THEN rules are one of the most popular ways of knowledge representation, due to their simplicity and comprehensibility [21]. There are different types of rules according to the data mining technique used, for example: classification, association, sequential pattern analysis, prediction, causality induction, optimization, etc. In the area of knowledge discovery in databases the most studied ones are association rules, classifiers and predictors. An example of a

4 4 generic rule format IF-THEN in Extend Backus Naur Form (EBNF) notation it is shown in the Table 1. Table 1. IF-THEN rule format <rule> ::= IF <antecedent> THEN <consequent> <antecedent> ::= <condition> + <consequent> ::= <condition> + <condition> ::= <attribute> <operator> <value> <attribute> ::= Each one of the possible attributes set <value> ::= Each one of the possible values of the attributes set <operator> ::= = > < Before beginning the study in depth of the main association rule mining algorithms, we first formally define what is an association rule [22]. Let I = {i 1, i 2,..., i m } be a set of literals, called items. Let D be a set of transactions, where each transaction T is a set of items such that T I. Each transaction is associated with a unique identifier, called its TID. We say that a transaction T contains X, a set of some items in I, if X I. An association rule is an implication of the form X Y, where X I, Y I, and X Y=. The rule X Y holds in the transaction set D with confidence c if c% of transactions in D that contain X also contain Y. The rule X Y has support s in the transaction set D if s% of transactions in D contain X Y. An example association rule is: "IF diapers in a transaction, THEN beer is in the transaction as well in 30% of the cases. Diapers and beer are bought together in 11 % of the rows in the database". In this example, support and confidence of the rule are: s = P(X Y)=11%

5 5 c = P(Y/X) = P(X Y) / P(X) = s (X Y) / s (X) = 30% Given a set of transactions D, and user-specified thresholds minimum support (called minsup) and minimum confidence (called minconf ), the problem of mining association rules is to generate all association rules that have support and confidence greater than minsup and minconf respectively. Our discussion is neutral with respect to the representation of D. For example, D could be a data file, a relational table, or the result of a relational expression. The previous problem is solved by the Apriori algorithm [20] that is the first and simplest association rule mining algorithm. For an overview of different association rule mining algorithms and implementations, see the Frequent Itemset Mining Implementations (FIMI) repository ( Figure 1 shows the Apriori algorithm. The first pass of the algorithm simply counts item occurrences to determine the frequent itemsets of size (cardinality) 1. A frequent itemset of size k is in literature also called a large k-itemset. All subsequent passes, consists of two phases. First, the large itemsets L k found in the (k)th pass are used to generate the candidate itemsets C k+1 as it is shown in Figure 2. Next, the database is scanned and the support of candidates in C k+1 is counted. C k : Candidate itemsets of size k; L k : Frequent itemsets of size k L 1 = {frequent items}; for (k = 1; L k!= ; k++) do begin C k+1 = {candidates generated from L k }; for each transaction t in the database do increment the count of all candidates in C k+1 that are contained in t; L k+1 = {candidates in C k+1 with minsup} end return k L k ; Figure 1. Pseudo-code of Apriori Algorithm

6 6 Let us consider a set L 3 ={abc, abd, acd, ace, bcd} The join step: - C k+1 is generated by joining L k with itself Self-joining: L 3 *L 3 (Items must be in lexicographic order.) - abcd from abc and abd - acde from acd and ace The prune step: - Any k-itemset that is not frequent cannot be a subset of a frequent (k+1)-itemset - Pruning C 4 : acde is removed because ade is not in L 3 The final candidate set: C 4 ={abcd} Figure 2. Apriori candidate generation There are a lot of different association rule algorithms. A comparative study between the main algorithms that are currently used to discover association rules can be found in [23]: Apriori [22], FP-Growth [24], MagnumOpus [25], Closet [26]. Most of these algorithms require the user to set two thresholds, the minimal support and the minimal confidence, and find all the rules that exceed the thresholds specified by the user. Therefore, the user must possess a certain amount of expertise in order to find the right settings for support and confidence to obtain the best rules. Drawbacks of applying association rule in e-learning In the association rule mining area, most of the research efforts went in the first place to improving the algorithmic performance [27], and in the second place into reducing the output set by allowing the possibility to express constraints on the desired results. Over the past decade a variety of algorithms that address these issues through the refinement of search strategies, pruning techniques and data structures have been developed. While most algorithms focus on the explicit discovery of all rules that satisfy minimal support and confidence constraints for a given dataset, increasing consideration is being given to specialized algorithms that attempt to improve processing time or facilitate user interpretation by reducing the result set size and by

7 7 incorporating domain knowledge [28]. Most of the current data mining tools are too complex for educators to use and their features go well beyond the scope of what an educator might require. As a result, the course administrator is more likely to apply data mining techniques in order to produce reports for instructors who then use these reports to make decisions about how to improve the student s learning and the online courses. Nowadays, normally, data mining tools are designed more for power and flexibility than for simplicity. There are also other specific problems related to the application of association rule mining to e-learning data. Next, we are going to describe some of the main drawbacks of association rule algorithms in e-learning. Finding the appropriate parameter settings of the mining algorithm Association rule mining algorithms need to be configured before they are executed. So, the user has to give appropriate values for the parameters in advance (often leading to too many or too few rules) in order to obtain a good number of rules. Most of these algorithms require the user to set two thresholds, the minimal support and the minimal confidence, and then find all the rules that exceed the thresholds specified by the user. Therefore, the user must possess a certain amount of expertise in order to find the right settings for support and confidence to obtain the best rules. One possible solution to this problem can be to use a parameter-free algorithm. For example, the Weka [29] package implements an Apriori-type algorithm that solves this problem partially. This algorithm reduces iteratively the minimum support, by a factor delta support (Δs) introduced by the user, until a minimum support is reached or a required number of rules has been generated.

8 8 Another improved version of the Apriori algorithm is the Predictive Apriori algorithm [30], which automatically resolves the problem of balance between these two parameters, maximizing the probability of making an accurate prediction for the data set. In order to achieve this, a parameter called the exact expected predictive accuracy is defined and calculated using the Bayesian method, which provides information about the accuracy of the rule found. In [31] experimental tests were performed on a Moodle course by comparing the two previous algorithms. The final results demonstrated better performance for Predictive Apriori than the Apriori-type algorithm using the Δs factor. Discovering too many rules Educational data sets are normally very small if we compare them with databases used in other data mining fields typical sizes are the size of one class, which are often only exemplars. In very few cases, we get data from students. Therefore, the application of traditional association algorithms will be simple and efficient. However, association rule mining algorithms normally discover a huge quantity of rules and do not guarantee that all the rules found are relevant. Event though support and confidence already allow for pruning many associations, often it is desirable to apply other constraints as well; for example, on the attributes that must or cannot be present in the antecedent or consequent of the discovered rules. Another solution is to evaluate, and post-prune the obtained rules in order to find the most interesting rules for a specific problem. Traditionally, the use of objective interestingness measures has been suggested [32], such as support and confidence, mentioned previously, as well as purely statistical measures such as the chi-square statistic and the correlation coefficient in order to measure the dependency inference between data variables. Subjective measures are

9 9 becoming increasingly important [33], in other words measures that are based on subjective factors controlled by the user. Most of the subjective approaches involve user participation in order to express, in accordance with his or her previous knowledge, which rules are of interest. Some suggested subjective measures [34] are: unexpectedness (rules are interesting if they are unknown to the user or contradict the user s knowledge) and actionability (rules are interesting if users can do something with them to their advantage). Liu et al. [34] proposed an Interestingness Analysis System (IAS) which compares rules discovered with the user's knowledge about the area of interest. Let U be the set of a user s specification representing his/her knowledge space, A be the set of discovered association rules, this algorithm implements a pruning technique for removing redundant or insignificant rules by ranking and classifying them into four categories: conforming rules, unexpected consequent rules, unexpected condition rules and both-side unexpected rules. The degrees of membership into each of these four categories are used for ranking the rules. Using their own specification language, they indicate their knowledge about the matter in question, through relationships among the fields or items in the database. Finally, another approximation is that we can see the knowledge database as a rule repository on the basis of which subjective analysis of the rules discovered can be performed [35]. Before running the association rule mining algorithm, the teacher could download the relevant knowledge database, in accordance with his/her profile. The personalization of the rules returned is based on filtering parameters, associated with the type of the course to be analyzed such as: the area of knowledge; the level of education; the difficulty of the course, etc. The rules repository is created on the server in a collaborative way where the experts can vote for each rule

10 10 in the repository, based on the educational considerations and their experience gained in other similar e-learning courses. Discovery of poorly understandable rules A factor that is of major importance in determining the quality of the extracted rules is their comprehensibility. Although the main motivation for rule extraction is to obtain a comprehensible description of the underlying model's hypothesis, this aspect of rule quality is often overlooked due to the subjective nature of comprehensibility, which can not be measured independently of the person using the system [36]. Prior experience and domain knowledge of this person play an important role in assessing the comprehensibility. This contrasts with accuracy that can be considered as a property of the rules and which can be evaluated independently of the users. There are some traditional techniques that have been used in order to improve the comprehensibility of discovered rules, such as, constraining the number of items in the antecedent or consequent of the rule, or performing a discretization of numerical values. Discretization [37] divides the numerical data into categorical classes that are easier to understand for the instructor (categorical values are more user-friendly for the instructor than precise magnitudes and ranges). Another way to improve the comprehensibility of the rules is to incorporate domain knowledge and semantics, and to use a common and well-know vocabulary for the teacher. In the context of web-based educational systems, we can identify some common attributes to a variety of e-learning systems such as LMS and adaptive hypermedia courses, such as visited, time, score, difficulty level, knowledge level, attempts, number of messages, etc. In this context the use of standard metadata for e-learning [38] allows the creation and maintenance of a

11 11 common knowledge base with a common vocabulary susceptible of sharing among different communities of instructors or creators of e-learning courses. The SCORM [39] standard describes a content aggregation model and a tracking model for reusable learning objects to support adaptive instruction based on the learner s objectives, preferences, performance and other factors like instructional techniques. One of the most important features of SCORM is that allows the instructional content designer to specify sequencing rules and navigation behavior while maintaining the possibility of reusing learning resources within multiple and different aggregation contexts. While SCORM provide a framework for the representation and processing of the metadata, it falls short in including the needed support for more specific pedagogical tracking such as the use of collaborative resources. We thus argue that the use of a more specific pedagogical ontology provides a higher level of decision support analysis and mining, based in qualitative issues like: the collaborative degree of activities. Lastly, we consider very important to mention another aspect that can facilitate the comprehensibility of discovered rules, the visualization. The goal of visualization is to help analysts in inspecting the data after to applying the mining task [40] by means of some visual representation of the corpus of rules extracted. A range of visual representations is used, such as tables, two-dimensional matrices, graphs, bar charts, grids, mosaic plots, parallel coordinates, etc. Association rule mining can be integrated with visualization techniques [41] in order to allow users to drive the association rule finding process, giving them control and visual cues to ease understanding of both the process and its results. However, the visualization methods are still difficult to understand for a non-expert in data mining, such as a teacher. Therefore, we consider this question need to be addressed in a near future, and the challenge it will be to apply

12 12 these techniques in a more intuitive way identifying within these data structures the rules that are relevant and meaningful in the context of the e-learning analysis, and representing them in a simple way such as icons, colors, etc, using interactive bi-dimensional and three-dimensional representations. Statistical significance of discovered rules One aspect that is often overlooked when applying data mining techniques on small datasets is that of overfitting. When there is only limited data available, and the hypothesis space (i.e., the total number of possible rules) is huge, it is inevitable in a statistical sense, to get false discoveries because every rule has a small chance of being true in the given data, and the number of rules is huge. The smaller the dataset, the larger the danger for these so-called type-i errors: false rules that accidentally hold in the small dataset. To address this problem it is more than advisable to restrict the hypothesis space by inserting as much as possible background knowledge; i.e.: restricting the body and heads of the rules as much as possible, and maybe, in worst case, even remove numerous attributes from the dataset. When evaluating the outcome of an association rule mining operation, one also has to take very well into account that without further controlled experiments the found associations are, at least, open to debate. An introduction to association rule mining with Weka in a Moodle LMS We are going to use Weka over Moodle data in order to show a practical example of applying association rule mining in an educational environment. Moodle [42] is a well-known, freeware learning management system and open-source software package designed using sound pedagogical principles, to help educators create effective online learning communities. Moodle has a flexible array of module activities and resources to create five kinds of static course material (a text page, a web page, a link to anything on the Web, a

13 13 view into one of the course s directories and a label that displays any text or image), six types of interactive course material (assignments, choice, journal, lesson, quiz and survey) and five kinds of activities where students interact with each other (chat, forum, glossary, wiki and workshop). On the other hand, Weka [29] is an open-source software platform that provides a collection of machine learning and data mining algorithms for data pre-processing, classification, regression, clustering, association rules, and visualization. The process of applying association rule mining over the Moodle data consists of the same four steps as the general data mining process (see Figure 3): Collect data. The LMS system is used by students and the usage and interaction information is stored in the database. We are going to use the students usage data of the Moodle system. Preprocess the data. The data are cleaned and transformed into a mineable format. In order to preprocess the Moodle data we used the MySQL System Tray Monitor and Administrator tools [43] and the Open DB Preprocess task in the Weka Explorer [29]. Apply association rule mining. The data mining algorithms are applied to discover and summarize knowledge of interest to the teacher. Interpret, evaluate and deploy the results. The obtained results or model are interpreted and used by the teacher for further actions. The teacher can use the discovered information for making decision about the students and the Moodle activities of the course in order to improve the students learning.

14 14 Figure 3. Mining Moodle data. Our objective is to use the students usage data of the Moodle system. Moodle has a lot of detailed information about content, users, usage, etc. that it is stored in a relational database. The data is stored in a single database: MySQL and PostgreSQL are best supported, but it can also be used with Oracle, Access, Interbase, any database supporting ODBC connections and others. We have used MySQL because is the world's most popular open source database. Moodle has more than 150 tables, some of the most important are: mdl_log, mdl_assignement, mdl_chat, mdl_forum, mdl_message, mdl_quiz. Data preprocessing allows for transforming the original data into a suitable shape to be used by a particular data mining algorithm or framework. So, before applying a data mining algorithm, a number of general data preprocessing tasks has to be addressed (data cleaning, user identification, session identification, path completion, transaction identification, data transformation and enrichment, data integration, data reduction). Data preprocessing of LMSgenerated data has some specific issues: Moodle and most of the LMS use user authentication (password protection) in which logs have entries identified by users since the users have to log-in, and sessions are already

15 15 identified since users may also have to log-out. So, we can remove the typical user and session identification tasks of preprocessing data of web-based systems. Moodle and most of the LMSs record the students usage information not only in log files but also directly in relational databases. The tracking module can store user interactions at a higher level than simple page access. Databases are more powerful, flexible and less bug-prone that typical log text files for gather detailed access and usage information of the services available (forums, chats, etc.) by the LMS. Therefore, the data gathered by a LMS may require less cleaning and preprocessing than data collected in other systems based on log files. In our case, we are going to do the following preprocessing tasks: Selecting data. The first task is to choose in what specific Moodle courses we are interested to use for mining. Creating summary tables. Starting from the selected course we create summarization tables that aggregate the information at the required level (e.g. student). Student and interaction data are spread over several tables. We have created a new summary table which integrates the most important information for our objective. This table has a summary by row about all the activities done by each student in the course and the final obtained mark by the student in the same course. In order to create this table it is necessary to do several queries to the database in order to obtain the information of the desired students (userid value from mdl_user_students table) and courses (id value of the mdl_course table). Table 2 shows the main attributes used to create summarization tables.

16 16 Table 2. Attributes used for each student. Name Description course Identification number of the course n_assigment Number of realized assignments. n_quiz Number of realized quizzes. n_quiz_a Number of passed quizzes. n_quiz_s Number of failed quizzes. n_messages Number of send messages to the chat. n_messages_ap Number of send messages to the teacher. n_posts Number of send messages to the forum. n_read Number or read messages of the forum. total_time_assignment Total time used in assignment. total_time_quiz Total time used in quiz. total_time_forum Total time used in forum. mark Final mark the student obtained in the course. Discretizing data. Next, we perform a discretization of numerical values in order to increase the interpretation and comprehensibility. Discretization divides the data in categorical classes that are easier to understand for the teacher (categorical values are more familiar to the teacher than precise magnitudes and ranges). In our experiments we discretized all the numerical values of the summarization table (except for course identification value). There are several unsupervised global methods [37] for transforming continuous attributes into discrete Attributes, such as the equal-width method, equal-frequency method or the manual method (in which you have to specify the cut-off points). In this case, we have use the manual method for the mark attribute and the equal-width method for the others attributes. The labels that we have used in the mark attribute are: FAIL, PASS, GOOD and EXCELLENT, and in all the other attributes: LOW, MEDIUM and HIGH. Transforming the data. Finally, we transform the data to the required format used by the data mining algorithm or framework in order to be mineable. In this case, we have

17 17 exported the summary table to a text file with ARFF format used by Weka. An ARFF (Attribute-Relation File Format) file is an ASCII text file that describes a list of instances sharing a set of attributes [44]. Once the data is prepared, we can apply the association rule mining algorithm. In order to do so, it is necessary: 1) to choose the specific association rule mining algorithm; 2) to configure the parameters of the algorithm; 3) to identify which table will be used for the mining; 4) and to specify some other restrictions, such as the maximum number of items and what specific attributes can be present in the antecedent or consequent of the discovered rules. In this case, we have used the Apriori algorithm [20] for finding association rules over the discretized summarization table of the course 110 (Projects), executing this algorithm with a minimum support of 0.3 and a minimum confidence of 0.9 as parameters. Projects is a 3º course subjects of computer science studies oriented to design, develop, document and maintain a full computer science project. Students have to use Moodle to read/download documents/notes, do activities, do quizzes and send/read message to forums in order to complement traditional (face-to-face) classrooms. The instructor can use all the collected data by Moodle for doing data mining; in our case association rule mining. Association rules can discover for example interesting relationships between the Moodle usage and the student's final marks. It can show what type of students actions predict good final marks (if activities done then marks good) or bad final marks (if not message send/read to forum then marks bad) and can be used to detect on time student with problem in order to help them and motivate them.

18 18 Figure 4. Results of Apriori algorithm Finally, we do a data post-processing step in which the obtained results and models are interpreted, evaluated and used by the teacher for further actions. The teacher can use the discovered information for making decisions about the students and the LMS activities of the course in order to improve the students learning. Next, we are going to explain the meaning of some obtained rules. Figure 4 shows how a huge number of association rules can be discovered. There are also a lot of uninteresting rules, like a great number of redundant rules (rules with a generalization of relationships of several rules, like rule 3 with rules 1 and 2, and rule 6 with rule 7 and 8). There are some similar rules (rules with the same element in antecedent and consequent but interchanged, such as rules 15, 16, 17, and rules 18, 19, 20). And there are some random relationships (rules with random relations between variables, such as rules 1 and 5). But there are also rules that show relevant information for educational purposes, like those that show expected

19 19 or conforming relationships (if a students does not send messages, it is logical that he/she does not read them either, such as rule 2, and, similarly, rules 10, 11 and 13). There are also rules that show unexpected relationships (such as rules 4, 12, 14 and 9) which can be very useful for the instructor in decision making about the activities and detecting students with learning problems, like rules 4, 12 and 9 that show that if the number of messages read and messages sent in the forum is very low, and the total time and number of passed quizzes is very low, then the final mark obtained is a failing grade. Starting from this information, the instructor can pay more attention to these students because they are prone to failure. As a result, the instructor can motivate them in time to pass the course. Conclusions and future trends It is still the early days for the total integration of association rule mining in e-learning systems and not many real and fully operative implementations are available. Nevertheless, a good deal of academic research in this area has been published over the last few years. In this paper, we have outlined some of the main drawbacks of the application of association rule mining to data generated by learning management systems. We have proposed some possible solutions for each problem. Furthermore, we have given general outlines, and explained how to carry out a particular KDD process for association rule mining in a LMS system. We believe that some future research lines will focus on: developing an association rule mining tool that can more easily be used by educators; proposing new specific measures of interest with the inclusion of domain knowledge and semantic information; integrating the mining tool into the LMS systems in order to give direct feedback; developing iterative and interactive or guided mining to help educators to apply KDD processes, or even developing an automatic mining system that can perform the mining automatically in an unattended way, such

20 20 that the teacher only has to use the proposed recommendations in order to improve the students learning. Acknowledgments The authors gratefully acknowledge the financial support provided by the Spanish department of Research under TIN C06-03 and P08-TIC-3720 Projects. References [1] BlackBoard, available at (accessed March 16), [2] WebCT, available at (accessed March 16), [3] TopClass, available at (accessed March 16), [4] Moodle, available at (accessed March 16), [5] Ilias, available at (accessed March 16), [6] Claroline, available at (accessed March 16), [7] Mostow, J., and Beck, J., Some useful tactics to modify, map and mine data from intelligent tutors. Natural Language Engineering. 12(2), , [8] Dringus, L. and Ellis, T., Using data mining as a strategy for assessing asynchronous discussion forums. Computer & Education Journal. 45, , [9] Zorrilla, M. E., Menasalvas, E., Marin, D., Mora, E. and Segovia, J., Web usage mining project for improving web-based learning sites. In Web mining workshop, Cataluña, 1 22, [10] Romero, C. and Ventura, S., Data mining in e-learning. Southampton, UK: Wit Press [11] Zaïane, O., Building a Recommender Agent for e-learning Systems. Proceedings of the International Conference in Education (ICCE), Auckland, Nueva Zelanda, 55-59, 2002.

21 21 [12] Lu, J., Personalized e-learning material recommender system. International conference on information technology for application , [13] Minaei-Bidgoli, B., Tan, P. and Punch, W., Mining interesting contrast rules for a webbased educational system. International Conference on Machine Learning Applications [14] Romero, C., Ventura, S., and De Bra, P., Knowledge discovery with genetic programming for providing feedback to courseware author. User Modeling and User- Adapted Interaction: The Journal of Personalization Research. 14(5), , [15] Yu, P., Own, C. and Lin, L., On learning behavior analysis of web based interactive environment. In: Proc. ICCEE, Oslo, Norway, [16] Merceron, A., and Yacef, K, Mining student data captured from a web-based tutoring tool: Initial exploration and results. Journal of Interactive Learning Research. 15:4, , [17] Freyberger, J., Heffernan, N., Ruiz, C., Using association rules to guide a search for best fitting transfer models of student learning. Workshop on Analyzing Student-Tutor Interactions Logs to Improve Educational Outcomes at ITS Conference, [18] Ramli, A.A., Web usage mining using apriori algorithm: UUM learning care portal case. International Conference on Knowledge Management. Malaysia. 1-19, [19] Markellou, P., Mousourouli, I., Spiros, S. and Tsakalidis, A., Using Semantic Web Mining Technologies for Personalized e-learning Experiences. The Fourth IASTED Conference on Web-based Education, WBE Grindelwald, Switzerland, [20] Agrawal, R., Imielinski, T. and Swami, A.N., Mining Association Rules between Sets of Items in Large Databases. In Proceedings of SIGMOD, , [21] Klosgen, W., & Zytkow, J., Handbook of data mining and knowledge discovery. New York: Oxford University Press, [22] Agrawal, R. and Srikant, R., 1994, Fast algorithms for mining association rules. Proceedings of Conf. Very Large Data Bases, Santiago de Chile, [23] Zheng, Z., R. Kohavi and Mason, L., Real world performance of association rules. Sixth ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 2(2), 86-98, 2001.

22 22 [24] Han, J., Pei, J., and Yin, Y., Mining frequent patterns without candidate generation. In Proceedings of ACM-SIGMOD International Conference on Management of Data [25] Webb, G.I. OPUS: An efficient admissible algorithm for unordered search. Journal of Artificial Intelligence Research, , [26] Pei, J., Han, J., and Mao, R. CLOSET: An efficient algorithm for mining frequent closed itemsets. In Proceedings of ACM_SIGMOD International DMKD'00, Dallas, TX, [27] Ceglar, A. and Roddick, J.F., Association mining. ACM Computing Surveys, 38(2), 1-42, [28] Calders, T. and Goethals, B. Non-derivable Itemset Mining. Data Mining and Knowledge Discovery. 14, , [29] Weka, available at (accessed March 16), [30] Scheffer, T., Finding Association Rules That Trade Support Optimally against Confidence. Lecture Notes in Computer Science. 2168, 424+, [31] García, E., Romero, C., Ventura, S. and De Castro, C., Using Rules Discovery for the Continuous Improvement of e-learning Courses. Proc. of the 7th International Conference on Intelligent Data Engineering and Automated Learning- IDEAL LNCS 4224, , [32] Tan, P. and Kumar, V., Interesting Measures for Association Patterns: A Perspectiva, Technical Report TR Department of Computer Science. University of Minnnesota, [33] Silberschatz, A. and Tuzhilin, A., What makes pattterns interesting in Knoledge discovery systems. IEEE Trans. on Knowledge and Data Engineering. 8(6), , [34] Liu B., Wynne H., Shu C. and Yiming M., Analyzing the Subjective Interestingness of Association Rules. IEEE Intelligent Systems and their Applications. 15:5, 47-55, [35] García, E., Romero, C., Ventura, S. and De Castro C., An architecture for making recommendations to courseware authors through association rule mining and collaborative filtering. UMUAI: User Modelling and User Adapted Interaction. 19:99, , 2009.

23 23 [36] Huysmans J., Baesens B. and Vanthienen J., Using Rule Extraction to Improve the Comprehensibility of Predictive Models. (accessed January 20, 2008), [37] Dougherty, J. Kohavi, M. and Sahami, M., Supervised and unsupervised discretization of continuous features. Int. Conf. Machine Learning, Tahoe City, CA, , [38] Brase, J. and Nejdl, W., Ontologies and Metadata for elearning. Springer Verlag, , [39] SCORM, Advanced Distributed Learning. Shareable content object reference model: The SCORM overview. Available at (accessed March 16, 2009), [40] De Oliveira, M. C. F. and Levkowitz, H., From visual data exploration to visual data mining: A survey. IEEE Trans. on Visualization and Computer Graphics. 9(3), , [41] Yamamoto C. H. and De Oliveira, M. C. F., Visualization to assist users in association rule mining tasks. PhD diss. Arizona State Mathematics and Computer Sciences Institute. Sao Paulo University. (accessed March 16, 2009), [42] Rice, W. H., Moodle e-learning course development. A complete guide to successful learning using Moodle. Packt Publishing [43] Ullman, L., Guía de aprendizaje MySQL. Pearson Prentice Hall [44] Witten, I. H. and Frank, E., Data mining: Practical machine learning tools and techniques. Morgan Kaufman

COURSE RECOMMENDER SYSTEM IN E-LEARNING

COURSE RECOMMENDER SYSTEM IN E-LEARNING International Journal of Computer Science and Communication Vol. 3, No. 1, January-June 2012, pp. 159-164 COURSE RECOMMENDER SYSTEM IN E-LEARNING Sunita B Aher 1, Lobo L.M.R.J. 2 1 M.E. (CSE)-II, Walchand

More information

Towards Virtual Course Evaluation Using Web Intelligence

Towards Virtual Course Evaluation Using Web Intelligence Towards Virtual Course Evaluation Using Web Intelligence M.E. Zorrilla 1, D. Marín 1, and E. Álvarez 2 1 Department of Mathematics, Statistics and Computation, University of Cantabria. Avda. de los Castros

More information

E-Learning Using Data Mining. Shimaa Abd Elkader Abd Elaal

E-Learning Using Data Mining. Shimaa Abd Elkader Abd Elaal E-Learning Using Data Mining Shimaa Abd Elkader Abd Elaal -10- E-learning using data mining Shimaa Abd Elkader Abd Elaal Abstract Educational Data Mining (EDM) is the process of converting raw data from

More information

Business Intelligence in E-Learning

Business Intelligence in E-Learning Business Intelligence in E-Learning (Case Study of Iran University of Science and Technology) Mohammad Hassan Falakmasir 1, Jafar Habibi 2, Shahrouz Moaven 1, Hassan Abolhassani 2 Department of Computer

More information

ASSOCIATION RULE MINING ON WEB LOGS FOR EXTRACTING INTERESTING PATTERNS THROUGH WEKA TOOL

ASSOCIATION RULE MINING ON WEB LOGS FOR EXTRACTING INTERESTING PATTERNS THROUGH WEKA TOOL International Journal Of Advanced Technology In Engineering And Science Www.Ijates.Com Volume No 03, Special Issue No. 01, February 2015 ISSN (Online): 2348 7550 ASSOCIATION RULE MINING ON WEB LOGS FOR

More information

Building A Smart Academic Advising System Using Association Rule Mining

Building A Smart Academic Advising System Using Association Rule Mining Building A Smart Academic Advising System Using Association Rule Mining Raed Shatnawi +962795285056 raedamin@just.edu.jo Qutaibah Althebyan +962796536277 qaalthebyan@just.edu.jo Baraq Ghalib & Mohammed

More information

131-1. Adding New Level in KDD to Make the Web Usage Mining More Efficient. Abstract. 1. Introduction [1]. 1/10

131-1. Adding New Level in KDD to Make the Web Usage Mining More Efficient. Abstract. 1. Introduction [1]. 1/10 1/10 131-1 Adding New Level in KDD to Make the Web Usage Mining More Efficient Mohammad Ala a AL_Hamami PHD Student, Lecturer m_ah_1@yahoocom Soukaena Hassan Hashem PHD Student, Lecturer soukaena_hassan@yahoocom

More information

Association Rule Mining

Association Rule Mining Association Rule Mining Association Rules and Frequent Patterns Frequent Pattern Mining Algorithms Apriori FP-growth Correlation Analysis Constraint-based Mining Using Frequent Patterns for Classification

More information

SPMF: a Java Open-Source Pattern Mining Library

SPMF: a Java Open-Source Pattern Mining Library Journal of Machine Learning Research 1 (2014) 1-5 Submitted 4/12; Published 10/14 SPMF: a Java Open-Source Pattern Mining Library Philippe Fournier-Viger philippe.fournier-viger@umoncton.ca Department

More information

Mining an Online Auctions Data Warehouse

Mining an Online Auctions Data Warehouse Proceedings of MASPLAS'02 The Mid-Atlantic Student Workshop on Programming Languages and Systems Pace University, April 19, 2002 Mining an Online Auctions Data Warehouse David Ulmer Under the guidance

More information

Mining Association Rules: A Database Perspective

Mining Association Rules: A Database Perspective IJCSNS International Journal of Computer Science and Network Security, VOL.8 No.12, December 2008 69 Mining Association Rules: A Database Perspective Dr. Abdallah Alashqur Faculty of Information Technology

More information

USING SEMANTIC WEB MINING TECHNOLOGIES FOR PERSONALIZED E-LEARNING EXPERIENCES

USING SEMANTIC WEB MINING TECHNOLOGIES FOR PERSONALIZED E-LEARNING EXPERIENCES USING SEMANTIC WEB MINING TECHNOLOGIES FOR PERSONALIZED E-LEARNING EXPERIENCES Penelope Markellou 1,2, Ioanna Mousourouli 2, Sirmakessis Spiros 1,2,3, Athanasios Tsakalidis 1,2 1 Research Academic Computer

More information

DEVELOPMENT OF HASH TABLE BASED WEB-READY DATA MINING ENGINE

DEVELOPMENT OF HASH TABLE BASED WEB-READY DATA MINING ENGINE DEVELOPMENT OF HASH TABLE BASED WEB-READY DATA MINING ENGINE SK MD OBAIDULLAH Department of Computer Science & Engineering, Aliah University, Saltlake, Sector-V, Kol-900091, West Bengal, India sk.obaidullah@gmail.com

More information

Intinno: A Web Integrated Digital Library and Learning Content Management System

Intinno: A Web Integrated Digital Library and Learning Content Management System Intinno: A Web Integrated Digital Library and Learning Content Management System Synopsis of the Thesis to be submitted in Partial Fulfillment of the Requirements for the Award of the Degree of Master

More information

Selection of Optimal Discount of Retail Assortments with Data Mining Approach

Selection of Optimal Discount of Retail Assortments with Data Mining Approach Available online at www.interscience.in Selection of Optimal Discount of Retail Assortments with Data Mining Approach Padmalatha Eddla, Ravinder Reddy, Mamatha Computer Science Department,CBIT, Gandipet,Hyderabad,A.P,India.

More information

Static Data Mining Algorithm with Progressive Approach for Mining Knowledge

Static Data Mining Algorithm with Progressive Approach for Mining Knowledge Global Journal of Business Management and Information Technology. Volume 1, Number 2 (2011), pp. 85-93 Research India Publications http://www.ripublication.com Static Data Mining Algorithm with Progressive

More information

Association rules for improving website effectiveness: case analysis

Association rules for improving website effectiveness: case analysis Association rules for improving website effectiveness: case analysis Maja Dimitrijević, The Higher Technical School of Professional Studies, Novi Sad, Serbia, dimitrijevic@vtsns.edu.rs Tanja Krunić, The

More information

BOOSTING - A METHOD FOR IMPROVING THE ACCURACY OF PREDICTIVE MODEL

BOOSTING - A METHOD FOR IMPROVING THE ACCURACY OF PREDICTIVE MODEL The Fifth International Conference on e-learning (elearning-2014), 22-23 September 2014, Belgrade, Serbia BOOSTING - A METHOD FOR IMPROVING THE ACCURACY OF PREDICTIVE MODEL SNJEŽANA MILINKOVIĆ University

More information

Introduction. A. Bellaachia Page: 1

Introduction. A. Bellaachia Page: 1 Introduction 1. Objectives... 3 2. What is Data Mining?... 4 3. Knowledge Discovery Process... 5 4. KD Process Example... 7 5. Typical Data Mining Architecture... 8 6. Database vs. Data Mining... 9 7.

More information

Mining Online GIS for Crime Rate and Models based on Frequent Pattern Analysis

Mining Online GIS for Crime Rate and Models based on Frequent Pattern Analysis , 23-25 October, 2013, San Francisco, USA Mining Online GIS for Crime Rate and Models based on Frequent Pattern Analysis John David Elijah Sandig, Ruby Mae Somoba, Ma. Beth Concepcion and Bobby D. Gerardo,

More information

SPATIAL DATA CLASSIFICATION AND DATA MINING

SPATIAL DATA CLASSIFICATION AND DATA MINING , pp.-40-44. Available online at http://www. bioinfo. in/contents. php?id=42 SPATIAL DATA CLASSIFICATION AND DATA MINING RATHI J.B. * AND PATIL A.D. Department of Computer Science & Engineering, Jawaharlal

More information

DATA MINING TECHNOLOGY. Keywords: data mining, data warehouse, knowledge discovery, OLAP, OLAM.

DATA MINING TECHNOLOGY. Keywords: data mining, data warehouse, knowledge discovery, OLAP, OLAM. DATA MINING TECHNOLOGY Georgiana Marin 1 Abstract In terms of data processing, classical statistical models are restrictive; it requires hypotheses, the knowledge and experience of specialists, equations,

More information

International Journal of World Research, Vol: I Issue XIII, December 2008, Print ISSN: 2347-937X DATA MINING TECHNIQUES AND STOCK MARKET

International Journal of World Research, Vol: I Issue XIII, December 2008, Print ISSN: 2347-937X DATA MINING TECHNIQUES AND STOCK MARKET DATA MINING TECHNIQUES AND STOCK MARKET Mr. Rahul Thakkar, Lecturer and HOD, Naran Lala College of Professional & Applied Sciences, Navsari ABSTRACT Without trading in a stock market we can t understand

More information

An Overview of Knowledge Discovery Database and Data mining Techniques

An Overview of Knowledge Discovery Database and Data mining Techniques An Overview of Knowledge Discovery Database and Data mining Techniques Priyadharsini.C 1, Dr. Antony Selvadoss Thanamani 2 M.Phil, Department of Computer Science, NGM College, Pollachi, Coimbatore, Tamilnadu,

More information

Understanding Web personalization with Web Usage Mining and its Application: Recommender System

Understanding Web personalization with Web Usage Mining and its Application: Recommender System Understanding Web personalization with Web Usage Mining and its Application: Recommender System Manoj Swami 1, Prof. Manasi Kulkarni 2 1 M.Tech (Computer-NIMS), VJTI, Mumbai. 2 Department of Computer Technology,

More information

Database Marketing, Business Intelligence and Knowledge Discovery

Database Marketing, Business Intelligence and Knowledge Discovery Database Marketing, Business Intelligence and Knowledge Discovery Note: Using material from Tan / Steinbach / Kumar (2005) Introduction to Data Mining,, Addison Wesley; and Cios / Pedrycz / Swiniarski

More information

Classification of Learners Using Linear Regression

Classification of Learners Using Linear Regression Proceedings of the Federated Conference on Computer Science and Information Systems pp. 717 721 ISBN 978-83-60810-22-4 Classification of Learners Using Linear Regression Marian Cristian Mihăescu Software

More information

MINING THE DATA FROM DISTRIBUTED DATABASE USING AN IMPROVED MINING ALGORITHM

MINING THE DATA FROM DISTRIBUTED DATABASE USING AN IMPROVED MINING ALGORITHM MINING THE DATA FROM DISTRIBUTED DATABASE USING AN IMPROVED MINING ALGORITHM J. Arokia Renjit Asst. Professor/ CSE Department, Jeppiaar Engineering College, Chennai, TamilNadu,India 600119. Dr.K.L.Shunmuganathan

More information

Educational Social Network Group Profiling: An Analysis of Differentiation-Based Methods

Educational Social Network Group Profiling: An Analysis of Differentiation-Based Methods Educational Social Network Group Profiling: An Analysis of Differentiation-Based Methods João Emanoel Ambrósio Gomes 1, Ricardo Bastos Cavalcante Prudêncio 1 1 Centro de Informática Universidade Federal

More information

Finding Frequent Patterns Based On Quantitative Binary Attributes Using FP-Growth Algorithm

Finding Frequent Patterns Based On Quantitative Binary Attributes Using FP-Growth Algorithm R. Sridevi et al Int. Journal of Engineering Research and Applications RESEARCH ARTICLE OPEN ACCESS Finding Frequent Patterns Based On Quantitative Binary Attributes Using FP-Growth Algorithm R. Sridevi,*

More information

Building a Recommender Agent for e-learning Systems

Building a Recommender Agent for e-learning Systems Building a Recommender Agent for e-learning Systems Osmar R. Zaïane University of Alberta, Edmonton, Alberta, Canada zaiane@cs.ualberta.ca Abstract A recommender system in an e-learning context is a software

More information

Communication Software Laboratory Academic Year 2007-2008. E-learning Platforms. Moodle and Dokeos.

Communication Software Laboratory Academic Year 2007-2008. E-learning Platforms. Moodle and Dokeos. Communication Software Laboratory Academic Year 2007-2008 E-learning Platforms. Moodle and Dokeos. Group 95 Homero Canales Guenaneche 100031592 Fernando García Radigales 100039032 Index 1. Introduction...

More information

Experiments in Web Page Classification for Semantic Web

Experiments in Web Page Classification for Semantic Web Experiments in Web Page Classification for Semantic Web Asad Satti, Nick Cercone, Vlado Kešelj Faculty of Computer Science, Dalhousie University E-mail: {rashid,nick,vlado}@cs.dal.ca Abstract We address

More information

Data Mining Solutions for the Business Environment

Data Mining Solutions for the Business Environment Database Systems Journal vol. IV, no. 4/2013 21 Data Mining Solutions for the Business Environment Ruxandra PETRE University of Economic Studies, Bucharest, Romania ruxandra_stefania.petre@yahoo.com Over

More information

Data Mining to Recognize Fail Parts in Manufacturing Process

Data Mining to Recognize Fail Parts in Manufacturing Process 122 ECTI TRANSACTIONS ON ELECTRICAL ENG., ELECTRONICS, AND COMMUNICATIONS VOL.7, NO.2 August 2009 Data Mining to Recognize Fail Parts in Manufacturing Process Wanida Kanarkard 1, Danaipong Chetchotsak

More information

How To Write A Summary Of A Review

How To Write A Summary Of A Review PRODUCT REVIEW RANKING SUMMARIZATION N.P.Vadivukkarasi, Research Scholar, Department of Computer Science, Kongu Arts and Science College, Erode. Dr. B. Jayanthi M.C.A., M.Phil., Ph.D., Associate Professor,

More information

Edifice an Educational Framework using Educational Data Mining and Visual Analytics

Edifice an Educational Framework using Educational Data Mining and Visual Analytics I.J. Education and Management Engineering, 2016, 2, 24-30 Published Online March 2016 in MECS (http://www.mecs-press.net) DOI: 10.5815/ijeme.2016.02.03 Available online at http://www.mecs-press.net/ijeme

More information

Predicting Students Final GPA Using Decision Trees: A Case Study

Predicting Students Final GPA Using Decision Trees: A Case Study Predicting Students Final GPA Using Decision Trees: A Case Study Mashael A. Al-Barrak and Muna Al-Razgan Abstract Educational data mining is the process of applying data mining tools and techniques to

More information

International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014

International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014 RESEARCH ARTICLE OPEN ACCESS A Survey of Data Mining: Concepts with Applications and its Future Scope Dr. Zubair Khan 1, Ashish Kumar 2, Sunny Kumar 3 M.Tech Research Scholar 2. Department of Computer

More information

TOWARDS SIMPLE, EASY TO UNDERSTAND, AN INTERACTIVE DECISION TREE ALGORITHM

TOWARDS SIMPLE, EASY TO UNDERSTAND, AN INTERACTIVE DECISION TREE ALGORITHM TOWARDS SIMPLE, EASY TO UNDERSTAND, AN INTERACTIVE DECISION TREE ALGORITHM Thanh-Nghi Do College of Information Technology, Cantho University 1 Ly Tu Trong Street, Ninh Kieu District Cantho City, Vietnam

More information

Postprocessing in Machine Learning and Data Mining

Postprocessing in Machine Learning and Data Mining Postprocessing in Machine Learning and Data Mining Ivan Bruha A. (Fazel) Famili Dept. Computing & Software Institute for Information Technology McMaster University National Research Council of Canada Hamilton,

More information

ONTOLOGY IN ASSOCIATION RULES PRE-PROCESSING AND POST-PROCESSING

ONTOLOGY IN ASSOCIATION RULES PRE-PROCESSING AND POST-PROCESSING ONTOLOGY IN ASSOCIATION RULES PRE-PROCESSING AND POST-PROCESSING Inhauma Neves Ferraz, Ana Cristina Bicharra Garcia Computer Science Department Universidade Federal Fluminense Rua Passo da Pátria 156 Bloco

More information

Users Interest Correlation through Web Log Mining

Users Interest Correlation through Web Log Mining Users Interest Correlation through Web Log Mining F. Tao, P. Contreras, B. Pauer, T. Taskaya and F. Murtagh School of Computer Science, the Queen s University of Belfast; DIW-Berlin Abstract When more

More information

Analysing the Behaviour of Students in Learning Management Systems with Respect to Learning Styles

Analysing the Behaviour of Students in Learning Management Systems with Respect to Learning Styles Analysing the Behaviour of Students in Learning Management Systems with Respect to Learning Styles Sabine Graf and Kinshuk 1 Vienna University of Technology, Women's Postgraduate College for Internet Technologies,

More information

The basic data mining algorithms introduced may be enhanced in a number of ways.

The basic data mining algorithms introduced may be enhanced in a number of ways. DATA MINING TECHNOLOGIES AND IMPLEMENTATIONS The basic data mining algorithms introduced may be enhanced in a number of ways. Data mining algorithms have traditionally assumed data is memory resident,

More information

A Framework of User-Driven Data Analytics in the Cloud for Course Management

A Framework of User-Driven Data Analytics in the Cloud for Course Management A Framework of User-Driven Data Analytics in the Cloud for Course Management Jie ZHANG 1, William Chandra TJHI 2, Bu Sung LEE 1, Kee Khoon LEE 2, Julita VASSILEVA 3 & Chee Kit LOOI 4 1 School of Computer

More information

In this presentation, you will be introduced to data mining and the relationship with meaningful use.

In this presentation, you will be introduced to data mining and the relationship with meaningful use. In this presentation, you will be introduced to data mining and the relationship with meaningful use. Data mining refers to the art and science of intelligent data analysis. It is the application of machine

More information

Data Analysis in E-Learning System of Gunadarma University by Using Knime

Data Analysis in E-Learning System of Gunadarma University by Using Knime Data Analysis in E-Learning System of Gunadarma University by Using Knime Dian Kusuma Ningtyas tyaz tyaz tyaz@student.gunadarma.ac.id Prasetiyo prasetiyo@student.gunadarma.ac.id Farah Virnawati virtha

More information

Moodle: Discover Open Source Course Management Software for Medical Education

Moodle: Discover Open Source Course Management Software for Medical Education Moodle: Discover Open Source Course Management Software for Medical Education Serkan Toy, Ph.D Children s Mercy Hospital University of Missouri, Kansas City, Missouri Kadriye O. Lewis, Ed.D Cincinnati

More information

A Survey on Association Rule Mining in Market Basket Analysis

A Survey on Association Rule Mining in Market Basket Analysis International Journal of Information and Computation Technology. ISSN 0974-2239 Volume 4, Number 4 (2014), pp. 409-414 International Research Publications House http://www. irphouse.com /ijict.htm A Survey

More information

International Journal of Scientific & Engineering Research, Volume 5, Issue 4, April-2014 442 ISSN 2229-5518

International Journal of Scientific & Engineering Research, Volume 5, Issue 4, April-2014 442 ISSN 2229-5518 International Journal of Scientific & Engineering Research, Volume 5, Issue 4, April-2014 442 Over viewing issues of data mining with highlights of data warehousing Rushabh H. Baldaniya, Prof H.J.Baldaniya,

More information

Data Mining: Partially from: Introduction to Data Mining by Tan, Steinbach, Kumar

Data Mining: Partially from: Introduction to Data Mining by Tan, Steinbach, Kumar Data Mining: Association Analysis Partially from: Introduction to Data Mining by Tan, Steinbach, Kumar Association Rule Mining Given a set of transactions, find rules that will predict the occurrence of

More information

known as the Sharable Content Object Reference Model (SCORM). It became the standard for all LMSs. INTRODUCTION

known as the Sharable Content Object Reference Model (SCORM). It became the standard for all LMSs. INTRODUCTION Investigating the Need for a Learning Content Management System Jay Crook Crook Consulting (2011) 9514 Snowfinch Cir., Corpus Christi, TX 78418 E-mail: jay@jaycrook.com INTRODUCTION Many businesses and

More information

Clustering Marketing Datasets with Data Mining Techniques

Clustering Marketing Datasets with Data Mining Techniques Clustering Marketing Datasets with Data Mining Techniques Özgür Örnek International Burch University, Sarajevo oornek@ibu.edu.ba Abdülhamit Subaşı International Burch University, Sarajevo asubasi@ibu.edu.ba

More information

Using Data Mining for Mobile Communication Clustering and Characterization

Using Data Mining for Mobile Communication Clustering and Characterization Using Data Mining for Mobile Communication Clustering and Characterization A. Bascacov *, C. Cernazanu ** and M. Marcu ** * Lasting Software, Timisoara, Romania ** Politehnica University of Timisoara/Computer

More information

Data Mining Applications in Higher Education

Data Mining Applications in Higher Education Executive report Data Mining Applications in Higher Education Jing Luan, PhD Chief Planning and Research Officer, Cabrillo College Founder, Knowledge Discovery Laboratories Table of contents Introduction..............................................................2

More information

MAXIMAL FREQUENT ITEMSET GENERATION USING SEGMENTATION APPROACH

MAXIMAL FREQUENT ITEMSET GENERATION USING SEGMENTATION APPROACH MAXIMAL FREQUENT ITEMSET GENERATION USING SEGMENTATION APPROACH M.Rajalakshmi 1, Dr.T.Purusothaman 2, Dr.R.Nedunchezhian 3 1 Assistant Professor (SG), Coimbatore Institute of Technology, India, rajalakshmi@cit.edu.in

More information

How To Use Neural Networks In Data Mining

How To Use Neural Networks In Data Mining International Journal of Electronics and Computer Science Engineering 1449 Available Online at www.ijecse.org ISSN- 2277-1956 Neural Networks in Data Mining Priyanka Gaur Department of Information and

More information

Binary Coded Web Access Pattern Tree in Education Domain

Binary Coded Web Access Pattern Tree in Education Domain Binary Coded Web Access Pattern Tree in Education Domain C. Gomathi P.G. Department of Computer Science Kongu Arts and Science College Erode-638-107, Tamil Nadu, India E-mail: kc.gomathi@gmail.com M. Moorthi

More information

Data Mining and Knowledge Discovery in Databases (KDD) State of the Art. Prof. Dr. T. Nouri Computer Science Department FHNW Switzerland

Data Mining and Knowledge Discovery in Databases (KDD) State of the Art. Prof. Dr. T. Nouri Computer Science Department FHNW Switzerland Data Mining and Knowledge Discovery in Databases (KDD) State of the Art Prof. Dr. T. Nouri Computer Science Department FHNW Switzerland 1 Conference overview 1. Overview of KDD and data mining 2. Data

More information

Laboratory Module 8 Mining Frequent Itemsets Apriori Algorithm

Laboratory Module 8 Mining Frequent Itemsets Apriori Algorithm Laboratory Module 8 Mining Frequent Itemsets Apriori Algorithm Purpose: key concepts in mining frequent itemsets understand the Apriori algorithm run Apriori in Weka GUI and in programatic way 1 Theoretical

More information

Index Contents Page No. Introduction . Data Mining & Knowledge Discovery

Index Contents Page No. Introduction . Data Mining & Knowledge Discovery Index Contents Page No. 1. Introduction 1 1.1 Related Research 2 1.2 Objective of Research Work 3 1.3 Why Data Mining is Important 3 1.4 Research Methodology 4 1.5 Research Hypothesis 4 1.6 Scope 5 2.

More information

Prediction of Heart Disease Using Naïve Bayes Algorithm

Prediction of Heart Disease Using Naïve Bayes Algorithm Prediction of Heart Disease Using Naïve Bayes Algorithm R.Karthiyayini 1, S.Chithaara 2 Assistant Professor, Department of computer Applications, Anna University, BIT campus, Tiruchirapalli, Tamilnadu,

More information

Web Document Clustering

Web Document Clustering Web Document Clustering Lab Project based on the MDL clustering suite http://www.cs.ccsu.edu/~markov/mdlclustering/ Zdravko Markov Computer Science Department Central Connecticut State University New Britain,

More information

Association Rule Mining: Exercises and Answers

Association Rule Mining: Exercises and Answers Association Rule Mining: Exercises and Answers Contains both theoretical and practical exercises to be done using Weka. The exercises are part of the DBTech Virtual Workshop on KDD and BI. Exercise 1.

More information

A Serial Partitioning Approach to Scaling Graph-Based Knowledge Discovery

A Serial Partitioning Approach to Scaling Graph-Based Knowledge Discovery A Serial Partitioning Approach to Scaling Graph-Based Knowledge Discovery Runu Rathi, Diane J. Cook, Lawrence B. Holder Department of Computer Science and Engineering The University of Texas at Arlington

More information

Learning paths in open source e-learning environments

Learning paths in open source e-learning environments Learning paths in open source e-learning environments D.Tuparova *,1, G.Tuparov 1,2 1 Dept. of Informatics, South West University, 66 Ivan Michailov Str., 2700 Blagoevgrad, Bulgaria 2 Dept. of Software

More information

A COGNITIVE APPROACH IN PATTERN ANALYSIS TOOLS AND TECHNIQUES USING WEB USAGE MINING

A COGNITIVE APPROACH IN PATTERN ANALYSIS TOOLS AND TECHNIQUES USING WEB USAGE MINING A COGNITIVE APPROACH IN PATTERN ANALYSIS TOOLS AND TECHNIQUES USING WEB USAGE MINING M.Gnanavel 1 & Dr.E.R.Naganathan 2 1. Research Scholar, SCSVMV University, Kanchipuram,Tamil Nadu,India. 2. Professor

More information

Web Mining as a Tool for Understanding Online Learning

Web Mining as a Tool for Understanding Online Learning Web Mining as a Tool for Understanding Online Learning Jiye Ai University of Missouri Columbia Columbia, MO USA jadb3@mizzou.edu James Laffey University of Missouri Columbia Columbia, MO USA LaffeyJ@missouri.edu

More information

Web Data Mining: A Case Study. Abstract. Introduction

Web Data Mining: A Case Study. Abstract. Introduction Web Data Mining: A Case Study Samia Jones Galveston College, Galveston, TX 77550 Omprakash K. Gupta Prairie View A&M, Prairie View, TX 77446 okgupta@pvamu.edu Abstract With an enormous amount of data stored

More information

A Way to Understand Various Patterns of Data Mining Techniques for Selected Domains

A Way to Understand Various Patterns of Data Mining Techniques for Selected Domains A Way to Understand Various Patterns of Data Mining Techniques for Selected Domains Dr. Kanak Saxena Professor & Head, Computer Application SATI, Vidisha, kanak.saxena@gmail.com D.S. Rajpoot Registrar,

More information

Customer Classification And Prediction Based On Data Mining Technique

Customer Classification And Prediction Based On Data Mining Technique Customer Classification And Prediction Based On Data Mining Technique Ms. Neethu Baby 1, Mrs. Priyanka L.T 2 1 M.E CSE, Sri Shakthi Institute of Engineering and Technology, Coimbatore 2 Assistant Professor

More information

Methodology for Creating Adaptive Online Courses Using Business Intelligence

Methodology for Creating Adaptive Online Courses Using Business Intelligence Methodology for Creating Adaptive Online Courses Using Business Intelligence Despotović, S., Marijana; Bogdanović, M., Zorica; and Barać, M., Dušan Abstract This paper describes methodology for creating

More information

A Time Efficient Algorithm for Web Log Analysis

A Time Efficient Algorithm for Web Log Analysis A Time Efficient Algorithm for Web Log Analysis Santosh Shakya Anju Singh Divakar Singh Student [M.Tech.6 th sem (CSE)] Asst.Proff, Dept. of CSE BU HOD (CSE), BUIT, BUIT,BU Bhopal Barkatullah University,

More information

Practical Data Science with Azure Machine Learning, SQL Data Mining, and R

Practical Data Science with Azure Machine Learning, SQL Data Mining, and R Practical Data Science with Azure Machine Learning, SQL Data Mining, and R Overview This 4-day class is the first of the two data science courses taught by Rafal Lukawiecki. Some of the topics will be

More information

Machine Learning and Data Mining. Fundamentals, robotics, recognition

Machine Learning and Data Mining. Fundamentals, robotics, recognition Machine Learning and Data Mining Fundamentals, robotics, recognition Machine Learning, Data Mining, Knowledge Discovery in Data Bases Their mutual relations Data Mining, Knowledge Discovery in Databases,

More information

An Introduction to Data Mining. Big Data World. Related Fields and Disciplines. What is Data Mining? 2/12/2015

An Introduction to Data Mining. Big Data World. Related Fields and Disciplines. What is Data Mining? 2/12/2015 An Introduction to Data Mining for Wind Power Management Spring 2015 Big Data World Every minute: Google receives over 4 million search queries Facebook users share almost 2.5 million pieces of content

More information

A Data Model to Ease Analysis and Mining of Educational Data 1

A Data Model to Ease Analysis and Mining of Educational Data 1 A Data Model to Ease Analysis and Mining of Educational Data 1 André Krüger 1,2, Agathe Merceron 2 and Benjamin Wolf 2 {akrueger, merceron, bwolf}@beuth-hochschule.de 1 Aroline AG, Berlin, Germany 2 Beuth

More information

EFL LEARNERS PERCEPTIONS OF USING LMS

EFL LEARNERS PERCEPTIONS OF USING LMS EFL LEARNERS PERCEPTIONS OF USING LMS Assist. Prof. Napaporn Srichanyachon Language Institute, Bangkok University gaynapaporn@hotmail.com ABSTRACT The purpose of this study is to present the views, attitudes,

More information

Visualization methods for patent data

Visualization methods for patent data Visualization methods for patent data Treparel 2013 Dr. Anton Heijs (CTO & Founder) Delft, The Netherlands Introduction Treparel can provide advanced visualizations for patent data. This document describes

More information

DATA MINING TOOL FOR INTEGRATED COMPLAINT MANAGEMENT SYSTEM WEKA 3.6.7

DATA MINING TOOL FOR INTEGRATED COMPLAINT MANAGEMENT SYSTEM WEKA 3.6.7 DATA MINING TOOL FOR INTEGRATED COMPLAINT MANAGEMENT SYSTEM WEKA 3.6.7 UNDER THE GUIDANCE Dr. N.P. DHAVALE, DGM, INFINET Department SUBMITTED TO INSTITUTE FOR DEVELOPMENT AND RESEARCH IN BANKING TECHNOLOGY

More information

DMDSS: Data Mining Based Decision Support System to Integrate Data Mining and Decision Support

DMDSS: Data Mining Based Decision Support System to Integrate Data Mining and Decision Support DMDSS: Data Mining Based Decision Support System to Integrate Data Mining and Decision Support Rok Rupnik, Matjaž Kukar, Marko Bajec, Marjan Krisper University of Ljubljana, Faculty of Computer and Information

More information

Scholars Journal of Arts, Humanities and Social Sciences

Scholars Journal of Arts, Humanities and Social Sciences Scholars Journal of Arts, Humanities and Social Sciences Sch. J. Arts Humanit. Soc. Sci. 2014; 2(3B):440-444 Scholars Academic and Scientific Publishers (SAS Publishers) (An International Publisher for

More information

(Refer Slide Time: 01:52)

(Refer Slide Time: 01:52) Software Engineering Prof. N. L. Sarda Computer Science & Engineering Indian Institute of Technology, Bombay Lecture - 2 Introduction to Software Engineering Challenges, Process Models etc (Part 2) This

More information

Introduction to Data Mining

Introduction to Data Mining Introduction to Data Mining Jay Urbain Credits: Nazli Goharian & David Grossman @ IIT Outline Introduction Data Pre-processing Data Mining Algorithms Naïve Bayes Decision Tree Neural Network Association

More information

Predicting the Risk of Heart Attacks using Neural Network and Decision Tree

Predicting the Risk of Heart Attacks using Neural Network and Decision Tree Predicting the Risk of Heart Attacks using Neural Network and Decision Tree S.Florence 1, N.G.Bhuvaneswari Amma 2, G.Annapoorani 3, K.Malathi 4 PG Scholar, Indian Institute of Information Technology, Srirangam,

More information

Behavior Usage Model to Manage the Best Practice of e-learning

Behavior Usage Model to Manage the Best Practice of e-learning Behavior Usage Model to Manage the Best Practice of e-learning Songsakda Chayanukro 1, Massudi Mahmuddin 2, Husniza binti Husni 3 1 Universiti Utara Malaysia, Malaysia, songsakda@gmail.com 2 Universiti

More information

{ Mining, Sets, of, Patterns }

{ Mining, Sets, of, Patterns } { Mining, Sets, of, Patterns } A tutorial at ECMLPKDD2010 September 20, 2010, Barcelona, Spain by B. Bringmann, S. Nijssen, N. Tatti, J. Vreeken, A. Zimmermann 1 Overview Tutorial 00:00 00:45 Introduction

More information

Using Data Mining Methods to Predict Personally Identifiable Information in Emails

Using Data Mining Methods to Predict Personally Identifiable Information in Emails Using Data Mining Methods to Predict Personally Identifiable Information in Emails Liqiang Geng 1, Larry Korba 1, Xin Wang, Yunli Wang 1, Hongyu Liu 1, Yonghua You 1 1 Institute of Information Technology,

More information

Educational Data Mining: a Case Study

Educational Data Mining: a Case Study Educational Data Mining: a Case Study Agathe MERCERON * and Kalina YACEF + * ESILV - Pôle Universitaire Léonard de Vinci, France + School of Information Technologies - University of Sydney, Australia,

More information

PREDICTIVE MODELING OF INTER-TRANSACTION ASSOCIATION RULES A BUSINESS PERSPECTIVE

PREDICTIVE MODELING OF INTER-TRANSACTION ASSOCIATION RULES A BUSINESS PERSPECTIVE International Journal of Computer Science and Applications, Vol. 5, No. 4, pp 57-69, 2008 Technomathematics Research Foundation PREDICTIVE MODELING OF INTER-TRANSACTION ASSOCIATION RULES A BUSINESS PERSPECTIVE

More information

Evaluating an Integrated Time-Series Data Mining Environment - A Case Study on a Chronic Hepatitis Data Mining -

Evaluating an Integrated Time-Series Data Mining Environment - A Case Study on a Chronic Hepatitis Data Mining - Evaluating an Integrated Time-Series Data Mining Environment - A Case Study on a Chronic Hepatitis Data Mining - Hidenao Abe, Miho Ohsaki, Hideto Yokoi, and Takahira Yamaguchi Department of Medical Informatics,

More information

EFFICIENT DATA PRE-PROCESSING FOR DATA MINING

EFFICIENT DATA PRE-PROCESSING FOR DATA MINING EFFICIENT DATA PRE-PROCESSING FOR DATA MINING USING NEURAL NETWORKS JothiKumar.R 1, Sivabalan.R.V 2 1 Research scholar, Noorul Islam University, Nagercoil, India Assistant Professor, Adhiparasakthi College

More information

Mining the Most Interesting Web Access Associations

Mining the Most Interesting Web Access Associations Mining the Most Interesting Web Access Associations Li Shen, Ling Cheng, James Ford, Fillia Makedon, Vasileios Megalooikonomou, Tilmann Steinberg The Dartmouth Experimental Visualization Laboratory (DEVLAB)

More information

Analyzing lifelong learning student behavior in a progressive degree

Analyzing lifelong learning student behavior in a progressive degree Analyzing lifelong learning student behavior in a progressive degree Ana-Elena Guerrero-Roldán, Enric Mor, Julià Minguillón Universitat Oberta de Catalunya Barcelona, Spain {aguerreror, emor, jminguillona}@uoc.edu

More information

Students Behavioural Analysis in an Online Learning Environment Using Data Mining

Students Behavioural Analysis in an Online Learning Environment Using Data Mining Students Behavioural Analysis in an Online Learning Environment Using Data Mining I. P. Ratnapala Computing Centre Faculty of Engineering, University of Peradeniya Peradeniya, Sri Lanka R. G. Ragel, S.

More information

Data Mining: Overview. What is Data Mining?

Data Mining: Overview. What is Data Mining? Data Mining: Overview What is Data Mining? Recently * coined term for confluence of ideas from statistics and computer science (machine learning and database methods) applied to large databases in science,

More information

New Matrix Approach to Improve Apriori Algorithm

New Matrix Approach to Improve Apriori Algorithm New Matrix Approach to Improve Apriori Algorithm A. Rehab H. Alwa, B. Anasuya V Patil Associate Prof., IT Faculty, Majan College-University College Muscat, Oman, rehab.alwan@majancolleg.edu.om Associate

More information

Association Rule Mining: A Survey

Association Rule Mining: A Survey Association Rule Mining: A Survey Qiankun Zhao Nanyang Technological University, Singapore and Sourav S. Bhowmick Nanyang Technological University, Singapore 1. DATA MINING OVERVIEW Data mining [Chen et

More information

not possible or was possible at a high cost for collecting the data.

not possible or was possible at a high cost for collecting the data. Data Mining and Knowledge Discovery Generating knowledge from data Knowledge Discovery Data Mining White Paper Organizations collect a vast amount of data in the process of carrying out their day-to-day

More information