MASTER'S THESIS. Mining Changes in Customer Purchasing Behavior


MASTER'S THESIS 2009:097
Mining Changes in Customer Purchasing Behavior - a Data Mining Approach
Samira Madani
Luleå University of Technology
Master Thesis, Continuation Courses
Marketing and e-commerce
Department of Business Administration and Social Sciences
Division of Industrial marketing and e-commerce
2009:097 - ISSN: ISRN: LTU-PB-EX--09/097--SE

Abstract:

The world around us is changing all the time. For businesses, knowing what is changing and how it has changed is crucial. One of the most important aspects of surviving in a dynamic market is knowing and adapting to changes in customer behavior. For Fast Moving Consumer Goods (FMCG) distribution companies, this issue is even more important. Because of the variety of FMCG products, distribution companies, and their different strategies, the purchasing behavior of customers may change many times during a period, and competition becomes tougher. The purpose of this study is to help Kalleh Company, a manufacturer and distributor of food products in the Iranian market, to mine the changes happening in its customers' behavior. Mining changes involves several steps: data collection, data preprocessing, customer segmentation, mining customer behavior patterns, and change mining. For customer segmentation, we use the Customer Value Matrix. For mining behavior patterns, we use the Apriori algorithm and maximal frequent itemsets. Based on the literature, there are different kinds of changes: added/perished rules, emerging patterns, and unexpected changes. There are also two measures, similarity and unexpectedness, for quantifying change. In this study, we first calculate changes based on these measures from the literature. We then modify the measures to calculate the difference between ordinal attributes, so that their information enters the calculation of changes. Our contribution is this modification of the change measures, which brings more information and higher accuracy to change mining. The results are presented in Chapter 4. Marketing managers can use the detected changes to respond accurately and in a timely manner to changes in the market. In addition, they can use them to evaluate different marketing campaigns, build stronger relationships with their customers, and know the market better.
There are many implications of mining changes for the macro and micro aspects of businesses, and also for marketing campaigns and manufacturing.

Abstract
Chapter 1: Introduction
    Background of the study
    Problem definition
    Purpose of this study
    Research question
    Research motivation
    Demarcation
    Research outline
Chapter 2: Literature Review
    Mining Customer Behavior
    Review of Data Mining
    Data mining: in brief
    Data mining Functions
    Classification in brief
    Clustering in brief
    Association Rules in Brief
    Association Rule Mining Review
    Association Rule mining problem
    Apriori Algorithm
    Association Rule Mining Approaches
    Apriori Approach
    Mining Changes Literature Review
    Customer segmentation review
    Clustering Analysis
    Customer Segmentation Model
    RFM Model
    RFM Scoring
    Customer Value Matrix Model
Chapter 3: Research Methodology
    Research Methodology
    Research Design
    Research Purpose
    Research Approaches
    Research Strategy
    Research process
    Data Collection and Description
    Data Pre-Processing
    Customer Segmentation
    Customer Value Matrix An effective analytical tool
    Customer Value Matrix Methodology
    Mining Customer Behavior
    Association Rule Mining
    Apriori algorithm
    Change Mining
    Change Mining
Chapter 4: Results & Analysis
    Data preprocessing result
    Data Cleaning
    Data Transformation result
    Customer segmentation (in sql server
    Customer Value Matrix Result
    Customer Behavior Mining
    Discretization Result
    Association Rule Mining Results
    Change Mining
    4.4.1 Some examples of change pattern
    Association rules and changes based (Chen et al, 2005)
    Rules with discrete variables in RHS
    Change mining with Manhattan distance
Chapter 5: Conclusion, further research
    Conclusion
    Our contribution
    Limitation
    Managerial Implication
    Future works
References

List of tables

Table 2.1: Factors for classification of ARM
Table 2.2: Mining in a changing environment timetable
Table 3.1: Data collected from Kalleh Company
Table 3.2: calculating variables for customer value matrix
Table 4.1: RFM table fields
Table 4.2: calculating variables for customer value matrix
Table 4.3: calculating variables for customer value matrix
Table 4.4: segment information in for period
Table 4.5: segment information in for period
Table 4.6: R quantile
Table 4.7: M quantile
Table 4.8: F quantile
Table 4.9: Area quantile
Table 4.10: Generated rule summary
Table 4.11: Generated Rules for period 1 Cluster
Table 4.12: Generated Rules for period 2 Cluster 1
Table 4.13: Generated Rules for period 1 Cluster 2
Table 4.14: Generated Rules for period 2 Cluster 2
Table 4.15: Generated Rules for period 1 Cluster 3
Table 4.16: Generated Rules for period 2 Cluster 3
Table 4.17: Generated Rules for period 1 Cluster 4
Table 4.18: Generated Rules for period 2 Cluster 4
Table 4.19: Cat1 quantile
Table 4.20: Cat2 quantile
Table 4.21: Cat3 quantile
Table 4.22: Cat5 quantile
Table 4.23: Cat11 quantile
Table 4.24: Cat13 quantile
Table 4.25: Generated Rules for period 1 Cluster 1, Change mining by (Chen et al, 2005) measures & Manhattan distance
Table 4.26: Generated Rules for period 2 Cluster 1, Change mining by (Chen et al, 2005) measures & Manhattan distance
Table 4.27: Generated Rules for period 1 Cluster 2, Change mining by (Chen et al, 2005) measures & Manhattan distance
Table 4.28: Generated Rules for period 2 Cluster 2, Change mining by (Chen et al, 2005) measures & Manhattan distance
Table 4.29: Generated Rules for period 1 Cluster 3, Change mining by (Chen et al, 2005) measures & Manhattan distance
Table 4.30: Generated Rules for period 2 Cluster 3, Change mining by (Chen et al, 2005) measures & Manhattan distance
Table 4.31: Generated Rules for period 1 Cluster 4, Change mining by (Chen et al, 2005) measures & Manhattan distance
Table 4.32: Generated Rules for period 2 Cluster 4, Change mining by (Chen et al, 2005) measures & Manhattan distance

List of figures:

Figure 2.1: Knowledge Discovery in Database Processes
Figure 2.2: The major steps in data mining process
Figure 2.3: Classification of DM techniques
Figure 2.4: Classic Problem of association rule mining
Figure 2.5: Mining in a changing environment review
Figure 2.6: Customer Value Matrix
Figure 3.1: Research design of this study
Figure 3.2: Change mining process perspective
Figure 3.3: Change mining process
Figure 3.4: Change mining process in detail
Figure 3.5: Product categories of Kalleh company
Figure 3.6: customer value matrix
Figure 4.1: generalized product category
Figure 4.2: The Customer Value Matrix
Figure 4.3: R histogram
Figure 4.4: M histogram
Figure 4.5: F histogram
Figure 4.6: Area histogram
Figure 4.7: Cat1 histogram
Figure 4.8: Cat2 histogram
Figure 4.9: Cat3 histogram
Figure 4.10: Cat5 histogram
Figure 4.11: Cat11 histogram
Figure 4.12: Cat13 histogram

Chapter 1: Introduction

Background of the study
Problem definition
Purpose of this study
Research question
Research motivation
Research demarcation
Research outline

1.1 Background of the study:

The world around us changes continuously, and knowing and adapting to change is an important aspect of our lives. For businesses, knowing what is changing and how it has changed is also essential (Liu et al, 2000). One of the most important aspects of surviving in a dynamic market is knowing and adapting to changes in customer behavior. Moreover, in recent years there has been explosive growth in the amount of information (Min, S., H., Han, I., 2005). In general, fast moving consumer goods (FMCG) distribution companies collect huge amounts of data about their customers and their purchasing transactions. In this data we can find interesting hidden information about customers and their behavior. The traditional approach to marketing decisions about promotions, campaigns, and market research in FMCG distribution companies relies mainly on the opinions of internal experts. These experts include the marketing managers and the sales managers, who are in constant touch with the salespeople and merchandisers who bring them market information. However, this kind of decision making ignores the customer data and the behavior it records. Furthermore, in today's highly competitive market, where the number of products is overwhelming, customers face various products and various providers with different marketing strategies (Hossein Javaheri, S., 2008). In such a dynamic market, customer behavior changes all the time (Chen et al, 2005). When a marketing manager learns from the sales team that something has changed in the market, he or she has no idea how and where to start understanding these changes and their causes. The result is a broad, time-consuming, and costly market research project whose findings may not reach the marketing department in time to react to the changes.
Also, in such a market there are so many promotion campaigns, run by the company itself and by its competitors, that it is difficult to analyze their effectiveness. So, in a competitive environment, there is a need to mine customer data and transactions to find changes in customer purchasing behavior, which is an effective and efficient way to respond to customers' needs in a timely and accurate manner. As a result, many FMCG distribution companies in Iran are trying to move away from the traditional way of planning their marketing campaigns, promotions, and market research, and toward understanding the changes happening in their customers' purchasing behavior. Change mining helps managers to devise better marketing strategies.

1.2 Problem definition:

Kalleh Company is a private manufacturer and distributor of food products in Iran. It produces different categories of food products, from dairy products to ice cream, meats, and sauces. It has more than 10 different categories and about 800 products. The company now faces the challenge of increasing competition, for several reasons. First, because of its wide range of products, it has to compete in different food markets such as dairy, ice cream, and meat. As a result, it competes with many competitors that have different product categories and different marketing strategies. There are also some powerful governmental companies that make competition tougher for Kalleh. In such a market, customer behavior may change in response to companies' strategies in the market and also because customers' own needs change. To respond to changes in customer purchasing behavior in time, and not to fall behind customer needs and the competition, Kalleh Company needs to mine changes in customer purchasing behavior. The goal of Kalleh Company is to mine changes in the purchasing behavior of customers in different segments, so that it can respond to these changes in a timely and accurate manner and increase its return on investment (ROI).

1.3 Purpose of this study:

The purpose of this study is to mine changes in customer purchasing behavior. To reach this goal, we need to build customer purchasing patterns based on the customer, product, and transaction data collected in databases. Data mining techniques can help us reach this goal. According to (Song et al, 2001), data mining is the process of exploration and analysis of large quantities of data in order to discover meaningful patterns and rules.
Many data mining studies have focused on developing techniques to build precise models that predict customer behavior, and to set up marketing strategies and customization. According to (Nemati & Barko, 2001; cited by Nemati, H.R., Barko, C. D., 2003), most data mining applications (72%) are centered on predicting customer behavior. Comparatively little attention has been paid to discovering changes in databases collected over time (Liu et al., 2000). The literature review makes it clear that too much time is spent worrying about absolute numbers, such as Lifetime Value, when what should really be observed is how relative numbers change over time. From a marketing viewpoint, the customers with the highest potential ROI are those who are in the process of changing their behavior, either accelerating their relationship with a company or ending it (Novo, J., 2008). In many applications, mining changes can be more crucial than producing precise prediction models, which are the focus of existing data mining research. However accurate a model is, it is passive by itself, because it can only predict based on patterns mined from old data. Actions taken on the basis of the model should not change the environment, because otherwise the model will cease to be correct (Liu et al., 2000). Building prediction models is more appropriate in areas where the environment is comparatively stable. However, in many business situations, constant human intervention in the environment is a fact. Businesses simply cannot let nature take its course. They constantly need to act in order to provide better services and products, by finding the interesting changes and the stable patterns in customer behavior. Even in a comparatively stable environment, changes are unavoidable due to internal and external factors (Liu et al., 2000). From these viewpoints, the question "Which patterns exist?", as answered by state-of-the-art data mining technology, is replaced by the question "How do patterns change?" (Böttcher, M., et al, 2006).
Indeed, the discovery of interesting and previously unknown changes in customer, product, and transaction data not only lets the user monitor the influence of past business decisions but also prepares today's business for tomorrow's needs (Böttcher, M., et al, 2006). Major changes often need immediate attention and action to modify existing practices and/or to change the domain situation (Liu et al, 2000). By using a change mining methodology, Kalleh Company can detect the different kinds of changes happening in customer purchasing behavior and build stronger relationships with its customers. Understanding changes in customer behavior can also help managers to set up effective and efficient promotion campaigns. (Liu et al, 2000) mention two main goals for mining changes in a business environment:

"To follow the trends": The main feature of this kind of application is the word "follow". Companies want to know where the trend is going so that they are not left behind. They need to investigate customers' changing behavior so as to provide products and services that suit the changing needs of customers.

"To stop or to delay undesirable changes": In this kind of application, the keyword is "stop". Companies want to learn of undesirable changes as soon as possible and to plan corrective measures that stop or delay the pace of such changes.

The overall procedure consists of several steps. In the literature, there are several methods for change mining in dynamic situations. According to (Song et al, 2001), the majority of data mining techniques, such as association rules and neural networks, cannot be used alone because they do not handle dynamic situations well. (Song et al, 2001) and (Chen et al, 2005) developed a methodology for mining changes. They used association rules, introduced by (Agrawal et al., 1993), to detect interesting association relationships among a large set of data items. The methodology detects all kinds of changes. According to (Chen et al, 2005), change mining has several steps: data preprocessing, customer segmentation, association rule mining, and change mining. First, customers are segmented based on their behavioral variables: recency, frequency, and monetary value (RFM). Then, by building association rules from the behavioral variables (RFM), customer data, and transaction data, we describe customer purchasing behavior at two different time snapshots. Finally, we compare the rules generated for each segment to mine changes in customer purchasing behavior. Mining changes requires various algorithms and techniques, and implementing them requires extensive programming. We therefore combined all of the algorithms into a change mining package.
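The last step of the procedure described above, comparing the rules generated for one segment at two time snapshots, can be sketched as follows. This is only a minimal illustration and not the change mining package built in this thesis: it matches rules by exact equality, whereas the methodology of (Chen et al, 2005) relies on similarity and unexpectedness measures. The function name compare_rule_sets, the helper rule, the item labels, and the support values are all assumptions made up for the example.

```python
from typing import Dict, FrozenSet, Tuple

# A rule is (left-hand side items, right-hand side items); the value is its support.
Rule = Tuple[FrozenSet[str], FrozenSet[str]]
RuleSet = Dict[Rule, float]


def compare_rule_sets(period1: RuleSet, period2: RuleSet):
    """Split rules into: only in period 1, only in period 2, and in both.

    Simplifying assumption: two rules match only if they are identical.
    """
    only_p1 = {r: s for r, s in period1.items() if r not in period2}
    only_p2 = {r: s for r, s in period2.items() if r not in period1}
    # For rules found in both snapshots, record how far the support moved.
    support_shift = {r: period2[r] - period1[r] for r in period1 if r in period2}
    return only_p1, only_p2, support_shift


def rule(lhs, rhs) -> Rule:
    """Hypothetical helper that builds a hashable rule from two item collections."""
    return (frozenset(lhs), frozenset(rhs))


# Made-up rules for a single customer segment (illustrative values only).
p1 = {
    rule({"R=high", "F=high"}, {"dairy"}): 0.30,
    rule({"M=low"}, {"ice cream"}): 0.10,
}
p2 = {
    rule({"R=high", "F=high"}, {"dairy"}): 0.45,
    rule({"M=high"}, {"sauces"}): 0.12,
}

only_p1, only_p2, shift = compare_rule_sets(p1, p2)
print("only in period 1:", only_p1)
print("only in period 2:", only_p2)
print("support shift:", shift)
```

Rules that appear in only one snapshot, and rules whose support moves noticeably between snapshots, are the natural candidates for closer inspection in such a comparison.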
1.4 Research question:

Based on the problem discussion above, the purpose of this study is to mine changes in customer purchasing behavior. To reach this purpose, the research questions are as follows: How can businesses be responsive to changes in customer behavior in a dynamic market? In addition, how can businesses detect and access the changes that have happened in customer behavior patterns, so that they can respond accurately and in time?

1.5 Research motivation:

Recently, we have witnessed an explosion of data produced and collected by individuals and organizations. This fast growth in data and databases has created the problem of data overload (Li, X. B., 2005). More recently, increased computing power has led to greater flexibility in the models one can use and in the amount of data that can be stored and processed (Bolton, R. J., 2004), and as a result data mining techniques have emerged and flourished in the past several years to meet this demand (Li, X. B., 2005). Organizations are starting to understand the importance of data mining for their marketing strategies. At the same time, businesses currently face the challenge of a constantly evolving market where customer needs are changing all the time (Chen et al, 2005). In such an environment, knowing the changes and responding to them rapidly and correctly is highly important. If businesses cannot meet customers' needs as they change over time, they will lose their customers, who are the source of their ROI. Some work has been done on change mining in retailing. One business that change mining can help to improve is FMCG distribution, which faces a dynamic market with a huge variety of products and competitors. The purpose of change mining is to follow the trends in customer purchasing patterns, detect the changes, and respond to them in time, in order to satisfy customers better and meet their needs.

1.6 Demarcation:

This study focuses on mining changes in customer purchasing behavior based on the customer and purchasing transaction data stored in a database. Change mining was carried out on data gathered from the database of an FMCG distribution company in Iran. Most of the literature reviewed is about mining changes in customer purchasing behavior. Our work focuses on building customer behavior patterns by association rule mining and on comparing the resulting rules.
These patterns are based only on customers' previous transactions.

1.7 Research outline:

This thesis consists of five chapters. Chapter 1 is the introduction, which gives a brief background of the subject followed by the research question, objectives, and motivation. Chapter 2 is a literature review covering data mining, association rules, change mining, and customer segmentation. Chapter 3 describes our research methodology, including data preprocessing, market segmentation, mining customer behavior, and change mining. Chapter 4 presents the results and analysis. Chapter 5 is the last chapter and contains the conclusion, limitations, and further research.

Chapter 2: Literature Review

Review of Mining Customer Behavior
Review of Data Mining
Review of Association Rule Mining
Review of Change Mining
Review of Customer Segmentation

2.1 Mining Customer Behavior:

Different methods for describing customer behavior exist in the literature. Among them are various types of conjunctive rules for building customer behavior patterns, including association rules and classification rules (Agrawal R. et al, 1996 & Breiman L., et al., 1984 cited in Adomavicius, G., Tuzhilin, A., 2001). Using rules to describe customer behavior has certain advantages. Besides being a descriptive way to portray behavior, a conjunctive rule is a well-studied concept that is widely used in data mining, expert systems, and many other areas. In addition, researchers have proposed many rule discovery algorithms in the literature, especially for association rules (Adomavicius, G., Tuzhilin, A., 2001). To discover rules that describe the behavior of customers, we can use various data mining algorithms, such as Apriori for association rule mining. Association rules were initially applied in market basket analysis to find the relationships between product items purchased by customers at retail stores (Agrawal, Imielinski, & Swami, 1993; Srikant, Vu, & Agrawal, 1997 cited by Chen et al, 2005). In research on customer behavior, we can apply association rules to find the correlations between customer demographic variables, purchased products, and product databases (Song et al, 2001). In this chapter, we first review data mining and then association rules. The next topic is change mining of customer behavior in the literature. Finally, we briefly review customer segmentation.

2.2 Review of Data Mining

Data mining: in brief

Today, databases can be very large, and within their data you can find hidden strategic information. But with a huge amount of data, drawing meaningful conclusions is not easy. The newer answer is data mining, which is used both to increase revenues and to reduce costs.
Many people use data mining as a synonym for another popular term, Knowledge Discovery in Databases (KDD). Others define data mining as the core process of KDD. The KDD processes are shown in Figure 2.1 (Han, J., & Kamber, M., 2006). KDD usually has three processes. The first is preprocessing, which is executed before the data mining techniques are applied, so that they run on the right data. Preprocessing includes data cleaning, integration, selection, and transformation. The main process of KDD is data mining itself, in which different algorithms are applied to produce hidden knowledge. The last process is post-processing, which evaluates the mining result according to the users' requirements and domain knowledge. If the evaluation shows that the result is satisfactory, the knowledge can be presented; otherwise, we have to run some or all of these processes again until we get a satisfactory result (Han, J., & Kamber, M., 2006).

Figure 2.1: Knowledge Discovery in Database Processes

(Song et al, 2001) define data mining as a process of exploration and analysis of large quantities of data to discover meaningful patterns and rules. (Feelders et al, 2000) define the process of data mining as follows:

Figure 2.2: The major steps in data mining process. Source: (Feelders et al, 2000)

The potential return of data mining is immense. Innovative organizations worldwide are already using data mining to attract higher-value customers, to configure their product offerings differently to increase sales, and to minimize losses due to mistakes or fraud.

Data mining Functions:

(Dunham, 2002) divides data mining into two categories, descriptive and predictive (Figure 2.3).

Figure 2.3: Classification of DM techniques. Source: (Dunham, 2002)

The first and simplest analytical step in data mining is to describe the data: summarize its statistical attributes (such as means), review it visually using charts and graphs, and look at correlations among variables. The most important steps are selecting the right data, gathering it, and exploring it. Sometimes data description alone cannot provide an action plan. You must build a predictive model based on patterns determined from known results, and then test that model on a new sample of data. A good model is never identical to reality, but it can be a useful guide to understanding your business. Finally, the model should be verified empirically (Twocrows.com, 2005). In the next sections, we briefly explain three important data mining techniques.

Classification in brief:

Based on (Han and Kamber, 2006), classification is the automatic building of a model that classifies a class of objects so as to predict the class or a missing attribute value of future objects whose class may not be known. The process has two steps. In the first step, a model is built to describe the characteristics of a set of data classes or concepts, based on a collection of training data. Because the data classes or concepts are predefined, this step is also known as supervised learning. In the second step, the model is used to predict the classes of future data or objects. There are several techniques for classification (Han and Kamber, 2006). Much research has been done on classification by decision trees and plenty of algorithms have been designed; Murthy produced an extensive survey of decision tree induction (Murthy, 1998; cited by Han, J., & Kamber, M., 2006). Bayesian classification is another technique, which can be found in (Duda and Hart, 1973 cited by Han, J., & Kamber, M., 2006). Nearest neighbor methods are also discussed in many statistical texts on classification, such as (Duda and Hart, 1973, cited by Han, J., & Kamber, M., 2006) and (James, 1985, cited by Han, J., & Kamber, M., 2006).
Besides these, many other machine learning and neural network techniques are used to help build classification models.

Clustering in brief:

As mentioned before, classification can be regarded as a supervised learning process. Clustering is another mining technique similar to classification, but it is an unsupervised learning process. "Clustering is the process of grouping a set of physical or abstract objects into classes of similar objects" (Han, J., & Kamber, M., 2006), so that objects within the same cluster are similar to some extent and dissimilar to objects in other clusters. In classification each record belongs to a predefined class, while in clustering there is no predefined class. In clustering, objects are grouped together based on their similarities (Han, J., & Kamber, M., 2006). Similarities between objects are expressed by similarity functions; usually they are defined quantitatively, as distances or other measures, by the corresponding domain experts (Han, J., & Kamber, M., 2006). Most clustering applications are in market segmentation. By clustering their customers into different groups, business organizations can provide different personalized services to different market groups (Han, J., & Kamber, M., 2006). An extensive survey of current clustering techniques and algorithms is available in (Berkhin, 2002; cited by Han, J., & Kamber, M., 2006).

Association Rules in Brief:

Association rule mining is one of the most important techniques in data mining. It was first introduced by (Agrawal et al, 1993). The goal of the technique is to extract interesting correlations, frequent patterns, and associations among sets of items in transaction databases or other data repositories (Agrawal et al, 1993). Association rules are used extensively in various areas. In this study we use association rules to mine customer behavior patterns and find behavioral changes. The next section reviews association rule mining.

2.3 Association Rule Mining Review:

Association Rule mining problem:

In this section, we introduce the association rule mining problem in detail. A typical association rule is an implication of the form A → B, where A is an itemset and B is an itemset that contains only a single atomic condition (Berry & Linoff, 2004). Two measures are used to evaluate each association rule. The support of an association rule is the percentage of records containing both A and B, and the confidence of a rule is the percentage of records containing itemset A that also contain itemset B. The support shows the usefulness of a discovered rule, and the confidence shows the certainty of the association found (Berry & Linoff, 2004). We can calculate another measure called lift.
It compares the confidence of a rule with the expected value of that confidence. (Berry & Linoff, 2004) define lift (also called improvement) as a measure of how much better a rule is at predicting the result than simply assuming the result in the first place. Lift is the ratio of the density of the target after application of the left-hand side to the density of the target in the population (Berry & Linoff, 2004). Put another way, lift is the ratio of the number of records that support the entire rule to the number that would be expected if there were no relationship between the products (the exact formula is given later in the chapter) (Berry & Linoff, 2004).
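The three measures described above can be illustrated with a short Python sketch. It assumes the common definitions: support as the fraction of records containing both sides of the rule, confidence as support(A and B) divided by support(A), and lift as confidence divided by support(B). The function names and the four example baskets are made up for illustration and are not taken from the Kalleh data.

```python
from typing import FrozenSet, List, Set


def support(transactions: List[Set[str]], items: FrozenSet[str]) -> float:
    """Fraction of transactions that contain every item in `items`."""
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def confidence(transactions: List[Set[str]], lhs: FrozenSet[str], rhs: FrozenSet[str]) -> float:
    """Fraction of the transactions containing `lhs` that also contain `rhs`."""
    return support(transactions, lhs | rhs) / support(transactions, lhs)


def lift(transactions: List[Set[str]], lhs: FrozenSet[str], rhs: FrozenSet[str]) -> float:
    """Confidence of lhs -> rhs divided by the overall frequency of rhs."""
    return confidence(transactions, lhs, rhs) / support(transactions, rhs)


# Illustrative baskets (assumed data).
baskets = [
    {"milk", "yogurt"},
    {"milk", "yogurt", "cheese"},
    {"milk", "cheese"},
    {"ice cream"},
]
A = frozenset({"milk"})
B = frozenset({"yogurt"})

print(support(baskets, A | B))    # 2 of 4 baskets contain both: 0.5
print(confidence(baskets, A, B))  # 2 of the 3 milk baskets contain yogurt: about 0.667
print(lift(baskets, A, B))        # 0.667 / 0.5: about 1.333
```

A lift above 1, as in this example, means that the right-hand side is more frequent among records matching the left-hand side than in the population as a whole.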

2.3.2 Apriori Algorithm

Association rule mining is the discovery of association rules that satisfy a predefined minimum support and confidence in a database (Agrawal, R., & Srikant, R., 1994). According to (Agrawal, R., & Srikant, R., 1994), this problem is usually decomposed into two sub-problems. The first is to find those itemsets whose occurrences exceed a predefined threshold in the database; these itemsets are called frequent or large itemsets. This sub-problem can be further divided into two: candidate large itemset generation and frequent itemset generation. Large or frequent itemsets are those whose support exceeds the support threshold, and candidate itemsets are those that are expected, or hoped, to be large or frequent. The second sub-problem is to produce association rules from those large itemsets under the constraint of minimal confidence. The whole process of the standard problem of mining association rules is shown in Figure 2.4.

Figure 2.4: Classic Problem of association rule mining. Source: (Agrawal et al, 1993)

The overall performance of mining association rules is determined mainly by the first step (Agrawal, R., & Srikant, R.). After the large itemsets are found, the corresponding association rules can be derived in a straightforward manner. The focus of most mining algorithms is therefore the efficient counting of large itemsets, and many efficient solutions have been designed for this purpose (Kantardzic.M, 2003).
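The two sub-problems described above can be sketched in a few lines of Python. This is a simplified, unoptimized illustration of the Apriori idea and not the implementation used in this thesis: the function names apriori and derive_rules, the thresholds, and the example baskets are assumptions, and rules are restricted to a single item on the right-hand side.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Set, Tuple


def apriori(transactions: List[Set[str]], min_support: float) -> Dict[FrozenSet[str], float]:
    """Sub-problem 1: find every itemset with support >= min_support."""
    n = len(transactions)

    def support(itemset: FrozenSet[str]) -> float:
        return sum(1 for t in transactions if itemset <= t) / n

    # Level 1: frequent single items.
    current: Dict[FrozenSet[str], float] = {}
    for item in {i for t in transactions for i in t}:
        s = support(frozenset([item]))
        if s >= min_support:
            current[frozenset([item])] = s
    frequent = dict(current)

    k = 2
    while current:
        prev = list(current)
        # Candidate generation: join frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Pruning: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        # Frequent itemset generation: count support and keep what passes.
        current = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                current[c] = s
        frequent.update(current)
        k += 1
    return frequent


def derive_rules(
    frequent: Dict[FrozenSet[str], float], min_confidence: float
) -> List[Tuple[FrozenSet[str], FrozenSet[str], float, float]]:
    """Sub-problem 2: derive rules lhs -> rhs (single-item rhs) from frequent itemsets."""
    rules = []
    for itemset, s in frequent.items():
        if len(itemset) < 2:
            continue
        for item in itemset:
            lhs = itemset - {item}
            conf = s / frequent[lhs]  # lhs is frequent because itemset is
            if conf >= min_confidence:
                rules.append((lhs, frozenset([item]), s, conf))
    return rules


# Illustrative baskets (assumed data).
baskets = [
    {"milk", "yogurt"},
    {"milk", "yogurt", "cheese"},
    {"milk", "cheese"},
    {"ice cream"},
]
frequent = apriori(baskets, min_support=0.5)
strong_rules = derive_rules(frequent, min_confidence=0.8)
for lhs, rhs, s, conf in strong_rules:
    print(sorted(lhs), "->", sorted(rhs), "support", s, "confidence", conf)
```

The pruning step is what makes the first sub-problem tractable: an itemset can only be frequent if all of its subsets are frequent, so most candidates are discarded before their support is ever counted.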

Different kinds of produced AR:

One attraction of association rules is the clarity and utility of the results, which take the form of rules about groups of products. An association rule has an intuitive appeal because it shows how tangible products and services group together (Berry & Linoff, 2004). While association rules are easy to understand, they are not always useful (Berry & Linoff, 2004). There are three types of generated association rules: actionable rules, trivial rules, and inexplicable rules.

Actionable rules are useful rules that contain high-quality, actionable information. Once the pattern is found, it is often not hard to justify, and thinking about the rule in its real environment can lead to insights and actions. Because the rule is easily understood, it suggests plausible causes and possible interventions (Berry & Linoff, 2004).

Trivial rules are the second type. Their results are already known by many people in the business. Although a trivial rule is valid and well supported by the data, it is still not practical. A simple example is that customers who purchase hamburgers also buy hamburger buns. A subtler problem falls into the same category: an apparently interesting result may be the outcome of past marketing programs and product bundles. Other data mining techniques have this problem too, but market basket analysis is particularly vulnerable to reproducing the success of prior marketing campaigns because it depends on un-summarized point-of-sale data, exactly the same data that defines the success of the campaign. Trivial rules do have one advantage: when a rule should hold 100 percent of the time, the few cases where it does not hold supply a lot of information about data quality. The exceptions to trivial rules indicate areas where business operations, data collection, and processing may need to be further refined (Berry & Linoff, 2004).

Inexplicable rules seem to have no explanation and do not suggest a course of action.
A caution applies when using market basket analysis: many of the results are often either trivial or inexplicable. Trivial rules reproduce common knowledge about the business, which wastes the effort spent on applying complex analysis techniques, and inexplicable rules are flukes in the data and are not actionable (Berry & Linoff, 2004).

ARM Approaches Classification:

Association rule mining is a well-studied research area; in this section, we review only some basic and classic approaches. As mentioned before, the second sub-problem of ARM is straightforward, so most approaches focus on the first sub-problem, which can be further divided into candidate large itemset generation and frequent itemset generation. Most of the surveyed algorithms for mining association rules are quite similar; they differ in the extent to which specific improvements have been made. According to (Zhao, Q., Bhowmick, S.S., 2003), there are three milestones in the classic ARM problem: the Apriori approach, tree structure approaches and special issues in ARM. Besides these, (Zaki et al, 1999) propose another approach, class-based algorithms.

The literature offers several basic features that can be used to classify ARM algorithms and that try to best differentiate them. The features we have found are described below and summarized in Table 2.1.

Target: Basic association rule algorithms find all rules that meet the support and confidence thresholds. More efficient algorithms are possible, however; one approach is to add constraints on the rules that are produced. Algorithms can be categorized as complete (all association rules satisfying the support and confidence thresholds are found), constrained (some subset of all the rules is found, based on a technique for limiting them), and qualitative (a subset of the rules is produced based on additional measures, beyond support and confidence, that must be satisfied) (Dunham M.H., et al, 2001).

Type: This feature indicates the type of association rules produced, for example regular (Boolean), spatial, temporal, generalized, qualitative, etc. (Dunham M.H., et al, 2001).

Data type: Besides data stored in a database, the type of data is also important. Association rules found in plain text may carry very important information. For example, the words "data", "mining", and "decision" may be highly dependent in a paper on knowledge discovery (Dunham M.H., et al, 2001).

Data source: In addition to market basket data, association rules about data absent from the database might play an important role in a company's decision-making (Dunham M.H., et al, 2001).

Technique: All approaches to date are based on first finding the large itemsets. There could, of course, be other techniques that do not require large itemsets to be found first. Although to date we are not aware of any technique that does not generate large itemsets, this possibility certainly exists, with the potential for improved performance. However, (Agrawal et al, 1998), cited in (Dunham M.H., et al, 2001), proposed strongly collective itemsets as a way to evaluate and find itemsets. In this approach, the notions of support and confidence are completely different from those of the large itemset approach. An itemset I is said to be strongly collective at level K if the collective strength C(I) of I, as well as that of any subset of I, is at least K (Dunham M.H., et al, 2001).

Itemset Strategy: Different algorithms generate itemsets differently. This feature shows how the algorithm considers transactions and when the itemsets are produced. One technique, Complete, produces and counts all potential itemsets. The most common approach is the one introduced by Apriori: a set of itemsets to count is produced prior to scanning the transactions, and this set remains constant during the scan. A dynamic strategy produces the itemsets during the scan of the database itself. A hybrid technique generates some itemsets prior to the database scan but also adds new itemsets to the counting set during the scan (Dunham M.H., et al, 2001).

Transaction Strategy: Different algorithms treat the set of transactions differently. This feature shows how the algorithm scans the set of transactions. The complete strategy checks all transactions in the database. With the sample approach, some subset of the database (a sample) is checked prior to processing the complete database. The partition techniques divide the database into partitions, and scanning the database requires that the partitions be checked individually and in order (Dunham M.H., et al, 2001).
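The strongly collective condition given under Technique above can be written more formally. The first line below only restates the definition from the text. The second line is not taken from this thesis: it is the form in which collective strength is usually stated in the literature on strongly collective itemsets, and its symbols are assumptions introduced here, with v(I) denoting the violation rate (the fraction of transactions that contain some but not all items of I) and E[v(I)] its expected value if the items occurred independently.

```latex
\[
I \text{ is strongly collective at level } K
\iff C(J) \ge K \ \text{ for every subset } J \subseteq I \text{ (including } I \text{ itself)}
\]
\[
C(I) = \frac{1 - v(I)}{1 - E[v(I)]} \cdot \frac{E[v(I)]}{v(I)}
\]
```

Under this reading, a value of C(I) near 1 suggests that the items of I occur together about as often as independence would predict, while larger values suggest that they tend to appear together or not at all.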
Itemset Data Structure: As itemsets are produced, different data structures can be used to keep track of them. The most common approach seems to be a hash tree. Alternatively, a trie or lattice may be used. At least one technique suggests a virtual trie structure in which only a portion of the complete trie is actually materialized (Dunham M.H., et al, 2001).

Transaction Data Structure: "Each algorithm assumes that the transactions are stored in some basic structure, usually a flat file or a TID list" (Dunham M.H., et al, 2001).

Optimization: Many algorithms improve on earlier algorithms by applying an optimization strategy. Various strategies optimize based on the available main memory, on whether or not the data is skewed, and on pruning of the itemsets to be counted (Dunham M.H., et al, 2001).

Architecture: Some algorithms are designed to run as sequential functions on a centralized single-processor architecture. Others are designed to work in parallel on a multiprocessor or distributed architecture (Dunham M.H., et al, 2001).

Parallelism Strategy: Parallel algorithms can be further described as using task or data parallelism (Dunham M.H., et al, 2001).

The literature offers some further features by which association rule mining methods can be categorized; they are considered below.

Counting Strategy: This refers to the method used to count the occurrences of candidate itemsets. The two main approaches are horizontal counting and vertical intersection. Horizontal counting determines the support of a candidate itemset by scanning the transactions one by one and incrementing the itemset's counter whenever it is a subset of the transaction. This approach works well for a rarely occurring candidate because only the transactions containing that itemset need to be checked. The candidate look-up operation, however, is very expensive for large candidates (Su, J. H., Lin, W. Y., 2004). Vertical intersection, on the other hand, is applied when the database is in a vertical format, in which every record is associated with an item and stores the identifiers of the transactions containing that item, called a tidlist. Although the vertical intersection scheme avoids the I/O cost of database scans, it has the following shortcoming: when a candidate itemset has a support count far smaller than the number of transactions, a large number of unnecessary intersections are performed (Su, J. H., Lin, W. Y., 2004).

Search direction: According to (Su, J. H., Lin, W. Y., 2004), there are two main search directions, bottom-up traversal and top-down traversal. Today, most Apriori-like approaches apply bottom-up traversal of the search space, which starts from the frequent 1-itemsets and moves upward to the longest frequent itemsets. The most important advantage of this model is that it can effectively prune the search space by exploiting the downward closure property: once an itemset is recognized as infrequent, all of its supersets are also infrequent. However, this benefit fades when most of the maximal frequent itemsets are located near the largest itemset of the search lattice, owing to a comparatively small support threshold. In this situation, very few itemsets can be pruned (Su, J. H., Lin, W. Y., 2004). The other method is top-down traversal, which proceeds in the opposite direction, i.e. starting from the longest itemsets and moving downward to the frequent 1-itemsets (Su, J. H., Lin, W. Y., 2004). This strategy is traditionally applied to discover maximal frequent itemsets (Tseng, M.C. & Lin, W.Y., 2001; cited by Su, J. H., Lin, W. Y., 2004). Although all of the frequent itemsets can be derived from the maximal ones, additional counting is needed to obtain their exact supports for computing the confidences of association rules. Moreover, if there is a huge number of items and/or the support threshold is very low, many infrequent itemsets have to be visited before the maximal frequent itemsets are identified. This is why most work on frequent itemset mining adopts the bottom-up paradigm instead (Su, J. H., Lin, W. Y., 2004).

Search strategy: While the search direction determines the way the search space is explored, the search strategy determines the order in which itemsets are visited (Su, J. H., Lin, W. Y., 2004). One strategy is breadth-first search (BFS). Most Apriori-like algorithms apply BFS because it facilitates the pruning of candidates with downward closure. This strategy, however, needs more memory to keep the frequent subsets of the pruned candidates (Su, J. H., Lin, W. Y., 2004). The other strategy is depth-first search (DFS), which recursively visits the descendants of an itemset. In the literature, this strategy is usually combined with the vertical intersection counting strategy because it is enough to keep in memory the tidlists corresponding to the itemsets on the path from the root down to the itemset currently being inspected (Su, J. H., Lin, W. Y., 2004).

Table 2.1: Factors for classification of ARM

DIMENSION | VALUES
Target | Complete, Constrained, Qualitative
Type | Regular (Boolean), Generalized, Quantitative, etc.
Data type | Database Data, Text
Data source | Market Basket, Beyond Basket
Technique | Large Itemset, Strongly Collective Itemset
Itemset Strategy | Complete, Apriori, Dynamic, Hybrid
Transaction Strategy | Complete, Sample, Partitioned
Itemset Data Structure | Hash Tree, Trie, Virtual Trie, Lattice
Transaction Data Structure | Flat File, TID
Optimization | Memory, Skewed, Pruning
Architecture | Sequential, Parallel
Parallelism Strategy | None, Data, Task
Pattern Kind | Sequential Pattern, Frequent Itemset, Structured Pattern
Rule Kind | Association Rule, Strong gradient relationship, Correlation (Han book)
Counting Strategy | Horizontal, Vertical
Search Direction | Bottom-Up Traversal, Top-Down Traversal, Hybrid
Search Strategy | BFS, DFS
Candidate generation | Complete, Heuristic

Association Rule Mining Approaches: Apriori Approach

AIS Algorithm: The AIS (Agrawal, Imielinski, Swami) algorithm, proposed in (Agrawal et al, 1993), was the first algorithm suggested for mining association rules. It concentrates on improving the quality of databases together with the functionality needed to process decision support queries. According to (Zhao, Q., Bhowmick, S.S., 2003), this algorithm produces only association rules with a one-item consequent; for example, it produces rules like ABC → D but not rules like AB → CD.

Disadvantage: The main disadvantage of the AIS algorithm is that it produces too many candidate itemsets that finally turn out to be small, which requires more space and wastes much effort. In addition, the algorithm needs too many passes over the whole database (Zhao, Q., Bhowmick, S.S., 2003).

Apriori Algorithm: Apriori is a major improvement in the history of association rule mining; it was first introduced in (Agrawal, R., & Srikant, R., 1994). AIS is a straightforward approach that needs many passes over the database, produces many candidate itemsets, and stores a counter for each candidate even though most of them turn out not to be frequent. Apriori is more efficient during candidate generation for two reasons: it applies a different candidate generation method and a new pruning technique (Zhao, Q., Bhowmick, S.S., 2003).

a) Problems and limitations of Apriori: Apriori has two bottlenecks. One is the complex candidate generation process, which consumes most of the time, space, and memory. The other is the repeated scanning of the database. Many new algorithms have been designed as modifications or improvements of Apriori. Commonly, there are two approaches. The first tries to reduce the number of passes over the whole database, or to replace the whole database with only part of it based on the current frequent itemsets. The second explores different pruning techniques to make the number of candidate itemsets much smaller. AprioriTID and AprioriHybrid (Agrawal, R., & Srikant, R., 1994), DHP (Park et al, 1995; cited by Zhao, Q., Bhowmick, S.S., 2003) and SON (Savasere et al, 1995) are modifications of the Apriori algorithm (Zhao, Q., Bhowmick, S.S., 2003).

Optimized Apriori algorithms: In response to the problems of Apriori mentioned in the previous section, several new approaches have been introduced.
The following sections present them: transaction and item pruning, and reduction of the number of database passes.

a) Transaction and Item Pruning: This is one of the main optimizations of the Apriori algorithm. There is no need to inspect the whole database each time the occurrences of candidate itemsets are counted. This optimization drastically reduces the time needed to count the support of the candidate sets and so improves performance. Transaction pruning appears in three algorithms: AprioriTID, AprioriHybrid and DHP.

AprioriTID, AprioriHybrid: AprioriTID was introduced in the same paper as Apriori. Although the paper does not state it explicitly, AprioriTID uses transaction pruning to improve on Apriori's performance. The main difference is that it does not use the whole database to count the support of candidate sets; it counts against an alternative set that represents the database (Ayad, A. M., 2000). The main disadvantage of this algorithm is that the alternative set may exceed the size of the actual database in the early stages, so the algorithm loses its edge over Apriori. Because of this disadvantage, another algorithm, AprioriHybrid, was introduced. It uses Apriori in the first stages and then switches to AprioriTID when transaction pruning becomes more effective (Ayad, A. M., 2000).

DHP: DHP (Direct Hashing and Pruning) is another algorithm, introduced by (Park et al, 1995). It uses probabilistic counting to reduce the number of candidate itemsets counted during each round of Apriori execution. This reduction is achieved by subjecting each candidate k-itemset to a hash-based filtering step in addition to the pruning step (Ayad, A. M., 2000). During candidate counting in round k-1, the algorithm builds a hash table. Each entry in the hash table is a counter that holds the sum of the supports of the k-itemsets that hash to that particular entry. The algorithm uses this information in round k to prune the set of candidate k-itemsets. After subset pruning as in Apriori, the algorithm can remove a candidate itemset if the count in its hash table entry is smaller than the minimum support threshold.
DHP has two advantages. First, the algorithm is also based on the monotone Apriori property: a hash table is built to reduce the candidate space by pre-computing the approximate support of the (k+1)-itemsets while the k-itemsets are being counted. Second, DHP applies transaction trimming, which removes the transactions that do not contain any frequent items. However, this trimming and the pruning properties caused some problems that made it impractical in many cases (Ayad, A. M., 2000).
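The hash-based filtering step of DHP can be sketched for the transition from round 1 to round 2. The sketch below is a simplified illustration, not the algorithm as published: it assumes items are strings, uses a hypothetical deterministic hash function (`bucket_of`) and a hypothetical bucket count of 7, and omits transaction trimming entirely. The key point it shows is that a bucket counter is an upper bound on the support of every pair that hashes to it, so the filter can discard candidates but never a truly frequent pair.

```python
from itertools import combinations

def bucket_of(pair, n_buckets):
    """Hypothetical deterministic hash for a pair of string items."""
    return sum(ord(ch) for item in pair for ch in item) % n_buckets

def dhp_pair_candidates(transactions, min_count, n_buckets=7):
    """Return (plain Apriori 2-candidates, 2-candidates left after the hash filter)."""
    item_counts = {}
    buckets = [0] * n_buckets
    # Round 1: count single items and, in the same pass, hash every
    # 2-itemset of every transaction into a bucket counter.
    for t in transactions:
        for item in t:
            item_counts[item] = item_counts.get(item, 0) + 1
        for pair in combinations(sorted(t), 2):
            buckets[bucket_of(pair, n_buckets)] += 1
    frequent_items = sorted(i for i, c in item_counts.items() if c >= min_count)
    # Subset pruning as in Apriori: only pairs made of frequent items.
    apriori_candidates = list(combinations(frequent_items, 2))
    # Hash filter: drop a pair if its bucket total is below the threshold.
    dhp_candidates = [p for p in apriori_candidates
                      if buckets[bucket_of(p, n_buckets)] >= min_count]
    return apriori_candidates, dhp_candidates

# Hypothetical toy transactions, for illustration only.
transactions = [{"a", "b"}, {"a", "b"}, {"c", "d"}, {"c", "d"}, {"a", "c"}, {"b", "d"}]
plain, filtered = dhp_pair_candidates(transactions, min_count=2)
```

With these toy transactions all four items are frequent, so plain Apriori would count six candidate pairs in round 2, whereas the hash filter keeps only the two pairs whose buckets reach the threshold. How much the filter saves in practice depends on the number of buckets and on how pairs collide.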

b) Reducing the number of database passes: As mentioned before, the main disadvantage of classical Apriori is the number of passes it has to make over the database, which is equal to the length of the longest frequent itemset (pattern) present in the database (Zhao, Q., Bhowmick, S.S., 2003). Many optimization efforts have focused on reducing the number of database passes. They differ, however, in how the reduction is achieved, which is the focus of this section.

Database Partitioning: (Savasere et al, 1995) developed Partition, an algorithm that requires only two scans of the transaction database. The database is divided into disjoint partitions, each small enough to fit in memory. In the first scan, the algorithm reads each partition and computes the locally frequent itemsets of that partition using Apriori. In the second scan, the algorithm counts the support of all locally frequent itemsets against the complete database. If an itemset is frequent with respect to the complete database, it must be frequent in at least one partition; therefore, the second scan counts a superset of all potentially frequent itemsets. The main achievement of Partition is the reduction of database activity. It was shown that this reduction was not obtained at the expense of more CPU utilization. It was also shown, however, that the number of partitions greatly affects the performance of the algorithm, because it affects the number of locally frequent itemsets that turn out to be globally infrequent. The algorithm was also shown to be vulnerable to data skew (Ayad, A. M., 2000).

Dynamic itemset counting: (Brin et al, 1997) proposed the Dynamic Itemset Counting (DIC) algorithm. DIC partitions the database into several blocks marked by start points and repeatedly scans the database. In contrast to Apriori, DIC can add new candidate itemsets at any start point, instead of only at the beginning of a new database scan.
At each start point, DIC estimates the support of all itemsets that are currently being counted and adds a new itemset to the set of candidate itemsets if all of its subsets are estimated to be frequent (Brin et al, 1997). If DIC adds all frequent itemsets and their negative border to the set of candidate itemsets during the first scan, it will have counted each itemset's exact support at some point during the second scan; thus DIC will complete in two scans (Ayad, A. M., 2000).

The Dynamic Itemset Counting (DIC)
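Returning to the database partitioning approach of (Savasere et al, 1995) described above, its two-scan structure can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: transactions are a list of sets, the support threshold is a fraction (`min_fraction`), and the locally frequent itemsets are found by brute-force subset enumeration rather than by Apriori, which is only practical for very short transactions. All function names and the toy data are hypothetical.

```python
from itertools import combinations

def local_frequent(partition, min_fraction):
    """Itemsets frequent within one partition (brute force, stands in for Apriori)."""
    threshold = min_fraction * len(partition)
    counts = {}
    for t in partition:
        for k in range(1, len(t) + 1):
            for combo in combinations(sorted(t), k):
                key = frozenset(combo)
                counts[key] = counts.get(key, 0) + 1
    return {s for s, c in counts.items() if c >= threshold}

def partition_mine(transactions, min_fraction, n_partitions):
    """Two scans: collect locally frequent itemsets, then count them globally."""
    size = -(-len(transactions) // n_partitions)  # ceiling division
    parts = [transactions[i:i + size] for i in range(0, len(transactions), size)]
    # Scan 1: the union of the locally frequent itemsets is the candidate set.
    candidates = set()
    for part in parts:
        candidates |= local_frequent(part, min_fraction)
    # Scan 2: count every candidate against the complete database.
    counts = dict.fromkeys(candidates, 0)
    for t in transactions:
        for c in candidates:
            if c <= t:
                counts[c] += 1
    threshold = min_fraction * len(transactions)
    globally_frequent = {s: c for s, c in counts.items() if c >= threshold}
    return globally_frequent, candidates

# Hypothetical toy database, split into two partitions of three transactions.
database = [{"a", "b"}, {"a", "b"}, {"a", "c"}, {"c", "d"}, {"c", "d"}, {"b", "d"}]
result, candidates = partition_mine(database, min_fraction=0.5, n_partitions=2)
```

In this toy database the pairs {a, b} and {c, d} are each frequent in one partition but not in the database as a whole, so they are counted in the second scan and then rejected. This is the effect noted in the text: the more locally frequent itemsets turn out to be globally infrequent, the more work the second scan wastes.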


More information

TOWARDS SIMPLE, EASY TO UNDERSTAND, AN INTERACTIVE DECISION TREE ALGORITHM

TOWARDS SIMPLE, EASY TO UNDERSTAND, AN INTERACTIVE DECISION TREE ALGORITHM TOWARDS SIMPLE, EASY TO UNDERSTAND, AN INTERACTIVE DECISION TREE ALGORITHM Thanh-Nghi Do College of Information Technology, Cantho University 1 Ly Tu Trong Street, Ninh Kieu District Cantho City, Vietnam

More information

CHURN PREDICTION IN MOBILE TELECOM SYSTEM USING DATA MINING TECHNIQUES

CHURN PREDICTION IN MOBILE TELECOM SYSTEM USING DATA MINING TECHNIQUES International Journal of Scientific and Research Publications, Volume 4, Issue 4, April 2014 1 CHURN PREDICTION IN MOBILE TELECOM SYSTEM USING DATA MINING TECHNIQUES DR. M.BALASUBRAMANIAN *, M.SELVARANI

More information

Chapter 20: Data Analysis

Chapter 20: Data Analysis Chapter 20: Data Analysis Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 20: Data Analysis Decision Support Systems Data Warehousing Data Mining Classification

More information

A STUDY ON DATA MINING INVESTIGATING ITS METHODS, APPROACHES AND APPLICATIONS

A STUDY ON DATA MINING INVESTIGATING ITS METHODS, APPROACHES AND APPLICATIONS A STUDY ON DATA MINING INVESTIGATING ITS METHODS, APPROACHES AND APPLICATIONS Mrs. Jyoti Nawade 1, Dr. Balaji D 2, Mr. Pravin Nawade 3 1 Lecturer, JSPM S Bhivrabai Sawant Polytechnic, Pune (India) 2 Assistant

More information

Distributed Apriori in Hadoop MapReduce Framework

Distributed Apriori in Hadoop MapReduce Framework Distributed Apriori in Hadoop MapReduce Framework By Shulei Zhao (sz2352) and Rongxin Du (rd2537) Individual Contribution: Shulei Zhao: Implements centralized Apriori algorithm and input preprocessing

More information

Azure Machine Learning, SQL Data Mining and R

Azure Machine Learning, SQL Data Mining and R Azure Machine Learning, SQL Data Mining and R Day-by-day Agenda Prerequisites No formal prerequisites. Basic knowledge of SQL Server Data Tools, Excel and any analytical experience helps. Best of all:

More information

A Data Mining Tutorial

A Data Mining Tutorial A Data Mining Tutorial Presented at the Second IASTED International Conference on Parallel and Distributed Computing and Networks (PDCN 98) 14 December 1998 Graham Williams, Markus Hegland and Stephen

More information

The Scientific Data Mining Process

The Scientific Data Mining Process Chapter 4 The Scientific Data Mining Process When I use a word, Humpty Dumpty said, in rather a scornful tone, it means just what I choose it to mean neither more nor less. Lewis Carroll [87, p. 214] In

More information

A THREE-TIERED WEB BASED EXPLORATION AND REPORTING TOOL FOR DATA MINING

A THREE-TIERED WEB BASED EXPLORATION AND REPORTING TOOL FOR DATA MINING A THREE-TIERED WEB BASED EXPLORATION AND REPORTING TOOL FOR DATA MINING Ahmet Selman BOZKIR Hacettepe University Computer Engineering Department, Ankara, Turkey selman@cs.hacettepe.edu.tr Ebru Akcapinar

More information

Finding Frequent Patterns Based On Quantitative Binary Attributes Using FP-Growth Algorithm

Finding Frequent Patterns Based On Quantitative Binary Attributes Using FP-Growth Algorithm R. Sridevi et al Int. Journal of Engineering Research and Applications RESEARCH ARTICLE OPEN ACCESS Finding Frequent Patterns Based On Quantitative Binary Attributes Using FP-Growth Algorithm R. Sridevi,*

More information

DATA MINING TECHNIQUES AND APPLICATIONS

DATA MINING TECHNIQUES AND APPLICATIONS DATA MINING TECHNIQUES AND APPLICATIONS Mrs. Bharati M. Ramageri, Lecturer Modern Institute of Information Technology and Research, Department of Computer Application, Yamunanagar, Nigdi Pune, Maharashtra,

More information

Overview. Background. Data Mining Analytics for Business Intelligence and Decision Support

Overview. Background. Data Mining Analytics for Business Intelligence and Decision Support Mining Analytics for Business Intelligence and Decision Support Chid Apte, PhD Manager, Abstraction Research Group IBM TJ Watson Research Center apte@us.ibm.com http://www.research.ibm.com/dar Overview

More information

Data Mining Applications in Manufacturing

Data Mining Applications in Manufacturing Data Mining Applications in Manufacturing Dr Jenny Harding Senior Lecturer Wolfson School of Mechanical & Manufacturing Engineering, Loughborough University Identification of Knowledge - Context Intelligent

More information

Practical Data Science with Azure Machine Learning, SQL Data Mining, and R

Practical Data Science with Azure Machine Learning, SQL Data Mining, and R Practical Data Science with Azure Machine Learning, SQL Data Mining, and R Overview This 4-day class is the first of the two data science courses taught by Rafal Lukawiecki. Some of the topics will be

More information

EFFICIENT DATA PRE-PROCESSING FOR DATA MINING

EFFICIENT DATA PRE-PROCESSING FOR DATA MINING EFFICIENT DATA PRE-PROCESSING FOR DATA MINING USING NEURAL NETWORKS JothiKumar.R 1, Sivabalan.R.V 2 1 Research scholar, Noorul Islam University, Nagercoil, India Assistant Professor, Adhiparasakthi College

More information

Healthcare Measurement Analysis Using Data mining Techniques

Healthcare Measurement Analysis Using Data mining Techniques www.ijecs.in International Journal Of Engineering And Computer Science ISSN:2319-7242 Volume 03 Issue 07 July, 2014 Page No. 7058-7064 Healthcare Measurement Analysis Using Data mining Techniques 1 Dr.A.Shaik

More information

Chapter 6: Episode discovery process

Chapter 6: Episode discovery process Chapter 6: Episode discovery process Algorithmic Methods of Data Mining, Fall 2005, Chapter 6: Episode discovery process 1 6. Episode discovery process The knowledge discovery process KDD process of analyzing

More information

MINING THE DATA FROM DISTRIBUTED DATABASE USING AN IMPROVED MINING ALGORITHM

MINING THE DATA FROM DISTRIBUTED DATABASE USING AN IMPROVED MINING ALGORITHM MINING THE DATA FROM DISTRIBUTED DATABASE USING AN IMPROVED MINING ALGORITHM J. Arokia Renjit Asst. Professor/ CSE Department, Jeppiaar Engineering College, Chennai, TamilNadu,India 600119. Dr.K.L.Shunmuganathan

More information

PREDICTIVE DATA MINING ON WEB-BASED E-COMMERCE STORE

PREDICTIVE DATA MINING ON WEB-BASED E-COMMERCE STORE PREDICTIVE DATA MINING ON WEB-BASED E-COMMERCE STORE Jidi Zhao, Tianjin University of Commerce, zhaojidi@263.net Huizhang Shen, Tianjin University of Commerce, hzshen@public.tpt.edu.cn Duo Liu, Tianjin

More information

Introduction to Data Mining

Introduction to Data Mining Introduction to Data Mining 1 Why Data Mining? Explosive Growth of Data Data collection and data availability Automated data collection tools, Internet, smartphones, Major sources of abundant data Business:

More information

MBA 8473 - Data Mining & Knowledge Discovery

MBA 8473 - Data Mining & Knowledge Discovery MBA 8473 - Data Mining & Knowledge Discovery MBA 8473 1 Learning Objectives 55. Explain what is data mining? 56. Explain two basic types of applications of data mining. 55.1. Compare and contrast various

More information

Data Warehouse design

Data Warehouse design Data Warehouse design Design of Enterprise Systems University of Pavia 21/11/2013-1- Data Warehouse design DATA PRESENTATION - 2- BI Reporting Success Factors BI platform success factors include: Performance

More information

Mining Sequence Data. JERZY STEFANOWSKI Inst. Informatyki PP Wersja dla TPD 2009 Zaawansowana eksploracja danych

Mining Sequence Data. JERZY STEFANOWSKI Inst. Informatyki PP Wersja dla TPD 2009 Zaawansowana eksploracja danych Mining Sequence Data JERZY STEFANOWSKI Inst. Informatyki PP Wersja dla TPD 2009 Zaawansowana eksploracja danych Outline of the presentation 1. Realtionships to mining frequent items 2. Motivations for

More information

Nine Common Types of Data Mining Techniques Used in Predictive Analytics

Nine Common Types of Data Mining Techniques Used in Predictive Analytics 1 Nine Common Types of Data Mining Techniques Used in Predictive Analytics By Laura Patterson, President, VisionEdge Marketing Predictive analytics enable you to develop mathematical models to help better

More information

Efficient Integration of Data Mining Techniques in Database Management Systems

Efficient Integration of Data Mining Techniques in Database Management Systems Efficient Integration of Data Mining Techniques in Database Management Systems Fadila Bentayeb Jérôme Darmont Cédric Udréa ERIC, University of Lyon 2 5 avenue Pierre Mendès-France 69676 Bron Cedex France

More information

SPATIAL DATA CLASSIFICATION AND DATA MINING

SPATIAL DATA CLASSIFICATION AND DATA MINING , pp.-40-44. Available online at http://www. bioinfo. in/contents. php?id=42 SPATIAL DATA CLASSIFICATION AND DATA MINING RATHI J.B. * AND PATIL A.D. Department of Computer Science & Engineering, Jawaharlal

More information

Numerical Algorithms Group

Numerical Algorithms Group Title: Summary: Using the Component Approach to Craft Customized Data Mining Solutions One definition of data mining is the non-trivial extraction of implicit, previously unknown and potentially useful

More information

Data Mining for Retail Website Design and Enhanced Marketing

Data Mining for Retail Website Design and Enhanced Marketing Data Mining for Retail Website Design and Enhanced Marketing Inaugural-Dissertation zur Erlangung des Doktorgrades der Mathematisch-Naturwissenschaftlichen Fakultät der Heinrich-Heine-Universität Düsseldorf

More information

Information Management course

Information Management course Università degli Studi di Milano Master Degree in Computer Science Information Management course Teacher: Alberto Ceselli Lecture 01 : 06/10/2015 Practical informations: Teacher: Alberto Ceselli (alberto.ceselli@unimi.it)

More information

Database Marketing, Business Intelligence and Knowledge Discovery

Database Marketing, Business Intelligence and Knowledge Discovery Database Marketing, Business Intelligence and Knowledge Discovery Note: Using material from Tan / Steinbach / Kumar (2005) Introduction to Data Mining,, Addison Wesley; and Cios / Pedrycz / Swiniarski

More information

Easily Identify Your Best Customers

Easily Identify Your Best Customers IBM SPSS Statistics Easily Identify Your Best Customers Use IBM SPSS predictive analytics software to gain insight from your customer database Contents: 1 Introduction 2 Exploring customer data Where do

More information

IT and CRM A basic CRM model Data source & gathering system Database system Data warehouse Information delivery system Information users

IT and CRM A basic CRM model Data source & gathering system Database system Data warehouse Information delivery system Information users 1 IT and CRM A basic CRM model Data source & gathering Database Data warehouse Information delivery Information users 2 IT and CRM Markets have always recognized the importance of gathering detailed data

More information

Association Technique on Prediction of Chronic Diseases Using Apriori Algorithm

Association Technique on Prediction of Chronic Diseases Using Apriori Algorithm Association Technique on Prediction of Chronic Diseases Using Apriori Algorithm R.Karthiyayini 1, J.Jayaprakash 2 Assistant Professor, Department of Computer Applications, Anna University (BIT Campus),

More information

How To Solve The Kd Cup 2010 Challenge

How To Solve The Kd Cup 2010 Challenge A Lightweight Solution to the Educational Data Mining Challenge Kun Liu Yan Xing Faculty of Automation Guangdong University of Technology Guangzhou, 510090, China catch0327@yahoo.com yanxing@gdut.edu.cn

More information

Introduction. A. Bellaachia Page: 1

Introduction. A. Bellaachia Page: 1 Introduction 1. Objectives... 3 2. What is Data Mining?... 4 3. Knowledge Discovery Process... 5 4. KD Process Example... 7 5. Typical Data Mining Architecture... 8 6. Database vs. Data Mining... 9 7.

More information

Continuous Fastest Path Planning in Road Networks by Mining Real-Time Traffic Event Information

Continuous Fastest Path Planning in Road Networks by Mining Real-Time Traffic Event Information Continuous Fastest Path Planning in Road Networks by Mining Real-Time Traffic Event Information Eric Hsueh-Chan Lu Chi-Wei Huang Vincent S. Tseng Institute of Computer Science and Information Engineering

More information

Data Mining. 1 Introduction 2 Data Mining methods. Alfred Holl Data Mining 1

Data Mining. 1 Introduction 2 Data Mining methods. Alfred Holl Data Mining 1 Data Mining 1 Introduction 2 Data Mining methods Alfred Holl Data Mining 1 1 Introduction 1.1 Motivation 1.2 Goals and problems 1.3 Definitions 1.4 Roots 1.5 Data Mining process 1.6 Epistemological constraints

More information

A Time Efficient Algorithm for Web Log Analysis

A Time Efficient Algorithm for Web Log Analysis A Time Efficient Algorithm for Web Log Analysis Santosh Shakya Anju Singh Divakar Singh Student [M.Tech.6 th sem (CSE)] Asst.Proff, Dept. of CSE BU HOD (CSE), BUIT, BUIT,BU Bhopal Barkatullah University,

More information

Explanation-Oriented Association Mining Using a Combination of Unsupervised and Supervised Learning Algorithms

Explanation-Oriented Association Mining Using a Combination of Unsupervised and Supervised Learning Algorithms Explanation-Oriented Association Mining Using a Combination of Unsupervised and Supervised Learning Algorithms Y.Y. Yao, Y. Zhao, R.B. Maguire Department of Computer Science, University of Regina Regina,

More information

Data Mining: Foundation, Techniques and Applications

Data Mining: Foundation, Techniques and Applications Data Mining: Foundation, Techniques and Applications Lesson 1b :A Quick Overview of Data Mining Li Cuiping( 李 翠 平 ) School of Information Renmin University of China Anthony Tung( 鄧 锦 浩 ) School of Computing

More information

2.1. Data Mining for Biomedical and DNA data analysis

2.1. Data Mining for Biomedical and DNA data analysis Applications of Data Mining Simmi Bagga Assistant Professor Sant Hira Dass Kanya Maha Vidyalaya, Kala Sanghian, Distt Kpt, India (Email: simmibagga12@gmail.com) Dr. G.N. Singh Department of Physics and

More information

Comparison of Data Mining Techniques for Money Laundering Detection System

Comparison of Data Mining Techniques for Money Laundering Detection System Comparison of Data Mining Techniques for Money Laundering Detection System Rafał Dreżewski, Grzegorz Dziuban, Łukasz Hernik, Michał Pączek AGH University of Science and Technology, Department of Computer

More information

CHAPTER-24 Mining Spatial Databases

CHAPTER-24 Mining Spatial Databases CHAPTER-24 Mining Spatial Databases 24.1 Introduction 24.2 Spatial Data Cube Construction and Spatial OLAP 24.3 Spatial Association Analysis 24.4 Spatial Clustering Methods 24.5 Spatial Classification

More information

Index Contents Page No. Introduction . Data Mining & Knowledge Discovery

Index Contents Page No. Introduction . Data Mining & Knowledge Discovery Index Contents Page No. 1. Introduction 1 1.1 Related Research 2 1.2 Objective of Research Work 3 1.3 Why Data Mining is Important 3 1.4 Research Methodology 4 1.5 Research Hypothesis 4 1.6 Scope 5 2.

More information

ISSN: 2321-7782 (Online) Volume 3, Issue 7, July 2015 International Journal of Advance Research in Computer Science and Management Studies

ISSN: 2321-7782 (Online) Volume 3, Issue 7, July 2015 International Journal of Advance Research in Computer Science and Management Studies ISSN: 2321-7782 (Online) Volume 3, Issue 7, July 2015 International Journal of Advance Research in Computer Science and Management Studies Research Article / Survey Paper / Case Study Available online

More information

Fuzzy Association Rules

Fuzzy Association Rules Vienna University of Economics and Business Administration Fuzzy Association Rules An Implementation in R Master Thesis Vienna University of Economics and Business Administration Author Bakk. Lukas Helm

More information

Data Mining and Neural Networks in Stata

Data Mining and Neural Networks in Stata Data Mining and Neural Networks in Stata 2 nd Italian Stata Users Group Meeting Milano, 10 October 2005 Mario Lucchini e Maurizo Pisati Università di Milano-Bicocca mario.lucchini@unimib.it maurizio.pisati@unimib.it

More information

Gerard Mc Nulty Systems Optimisation Ltd gmcnulty@iol.ie/0876697867 BA.,B.A.I.,C.Eng.,F.I.E.I

Gerard Mc Nulty Systems Optimisation Ltd gmcnulty@iol.ie/0876697867 BA.,B.A.I.,C.Eng.,F.I.E.I Gerard Mc Nulty Systems Optimisation Ltd gmcnulty@iol.ie/0876697867 BA.,B.A.I.,C.Eng.,F.I.E.I Data is Important because it: Helps in Corporate Aims Basis of Business Decisions Engineering Decisions Energy

More information

Data Mining: Overview. What is Data Mining?

Data Mining: Overview. What is Data Mining? Data Mining: Overview What is Data Mining? Recently * coined term for confluence of ideas from statistics and computer science (machine learning and database methods) applied to large databases in science,

More information

Binary Coded Web Access Pattern Tree in Education Domain

Binary Coded Web Access Pattern Tree in Education Domain Binary Coded Web Access Pattern Tree in Education Domain C. Gomathi P.G. Department of Computer Science Kongu Arts and Science College Erode-638-107, Tamil Nadu, India E-mail: kc.gomathi@gmail.com M. Moorthi

More information

OLAP and Data Mining. Data Warehousing and End-User Access Tools. Introducing OLAP. Introducing OLAP

OLAP and Data Mining. Data Warehousing and End-User Access Tools. Introducing OLAP. Introducing OLAP Data Warehousing and End-User Access Tools OLAP and Data Mining Accompanying growth in data warehouses is increasing demands for more powerful access tools providing advanced analytical capabilities. Key

More information

How To Use Data Mining For Knowledge Management In Technology Enhanced Learning

How To Use Data Mining For Knowledge Management In Technology Enhanced Learning Proceedings of the 6th WSEAS International Conference on Applications of Electrical Engineering, Istanbul, Turkey, May 27-29, 2007 115 Data Mining for Knowledge Management in Technology Enhanced Learning

More information

ON INTEGRATING UNSUPERVISED AND SUPERVISED CLASSIFICATION FOR CREDIT RISK EVALUATION

ON INTEGRATING UNSUPERVISED AND SUPERVISED CLASSIFICATION FOR CREDIT RISK EVALUATION ISSN 9 X INFORMATION TECHNOLOGY AND CONTROL, 00, Vol., No.A ON INTEGRATING UNSUPERVISED AND SUPERVISED CLASSIFICATION FOR CREDIT RISK EVALUATION Danuta Zakrzewska Institute of Computer Science, Technical

More information

Keywords: Mobility Prediction, Location Prediction, Data Mining etc

Keywords: Mobility Prediction, Location Prediction, Data Mining etc Volume 4, Issue 4, April 2014 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Data Mining Approach

More information

AN IMPROVED PRIVACY PRESERVING ALGORITHM USING ASSOCIATION RULE MINING(27-32) AN IMPROVED PRIVACY PRESERVING ALGORITHM USING ASSOCIATION RULE MINING

AN IMPROVED PRIVACY PRESERVING ALGORITHM USING ASSOCIATION RULE MINING(27-32) AN IMPROVED PRIVACY PRESERVING ALGORITHM USING ASSOCIATION RULE MINING AN IMPROVED PRIVACY PRESERVING ALGORITHM USING ASSOCIATION RULE MINING Ravindra Kumar Tiwari Ph.D Scholar, Computer Sc. AISECT University, Bhopal Abstract-The recent advancement in data mining technology

More information

COURSE RECOMMENDER SYSTEM IN E-LEARNING

COURSE RECOMMENDER SYSTEM IN E-LEARNING International Journal of Computer Science and Communication Vol. 3, No. 1, January-June 2012, pp. 159-164 COURSE RECOMMENDER SYSTEM IN E-LEARNING Sunita B Aher 1, Lobo L.M.R.J. 2 1 M.E. (CSE)-II, Walchand

More information