Market Basket Analysis and Mining Association Rules
|
|
|
- Alaina Newman
- 10 years ago
- Views:
Transcription
1 Market Basket Analysis and Mining Association Rules 1
2 Mining Association Rules Market Basket Analysis What is Association rule mining Apriori Algorithm Measures of rule interestingness 2
3 Market Basket Analysis One basket tells you about what one customer purchased at one time. A loyalty card makes it possible to tie together purchases by a single customer (or household) over time. 3
4 Market Basket Analysis Retail each customer purchases different set of products, different quantities, different times MBA uses this information to: Identify who customers are (not by name) Understand why they make certain purchases Gain insight about its merchandise (products): Fast and slow movers Products which are purchased together Products which might benefit from promotion Take action: Store layouts Which products to put on specials, promote, coupons Combining all of this with a customer loyalty card it becomes even more valuable
5 more than just the contents of shopping carts It is also about what customers do not purchase, and why. If customers purchase baking powder, but no flour, what are they baking? If customers purchase a mobile phone, but no case, are you missing an opportunity? It is also about key drivers of purchases; for example, the gourmet mustard that seems to lie on a shelf collecting dust until a customer buys that particular brand of special gourmet mustard in a shopping excursion that includes hundreds of dollars' worth of other products. Would eliminating the mustard (to replace it with a better selling item) threaten the entire customer relationship? 5
6 Market Basket Analysis association rules can be applied on other types of baskets. Items purchased on a credit card, such as rental cars and hotel rooms, provide insight into the next product that customers are likely to purchase, Optional services purchased by telecommunications customers (call waiting, call forwarding, DSL, speed call, and so on) help determine how to bundle these services together to maximize revenue. Banking products used by retail customers (money market accounts, certificate of deposit, investment services, car loans, and so on) identify customers likely to want other products. Unusual combinations of insurance claims can be a sign of fraud and can spark further investigation. Medical patient histories can give indications of likely complications based on certain combinations of treatments. 6
7 Market Basket Analysis The order is the fundamental data structure for market basket data. An order represents a single purchase event by a customer. The customer entity is optional and should be available when a customer can be identified over time. Tracking customers over time makes it possible to determine, for instance, which grocery shoppers bake from scratch interesting to the makers of flour as well as prepackaged cake mixes and the makers of aprons and kitchen appliances. The store entity provides information about where the products are purchased. 7
8 Market Basket Analysis: Basic Measures Ex. of questions to ask before starting with fancy techniques: Average number of orders per customer Change in average number of orders per customer over time Average number of unique items per order Change in average number of unique items per order over time Proportion of customers and average purchase size of customers who purchase the most popular products Proportion of customers and average purchase size of customers who purchase the least popular products Average order size Changes in average order size over time Average order size by important dimensions, such as geography, method of payment, time of year, and so on 8
9 In some cases, few customers have multiple purchases, so the proportion of orders per customer is close to one, suggesting a business opportunity to increase the number of sales per customer. The Figure represents the number of unique items ever purchased by the depth of the relationship (the number of orders) for customers who purchased more than one item from a small specialty retailer. 9
10 Tracking Marketing Interventions 10
11 Mining Association Rules Market Basket Analysis What is Association rule mining Apriori Algorithm Measures of rule interestingness 11
12 What Is Association Rule Mining? Association rule mining Finding frequent patterns, associations, correlations, or causal structures among sets of items in transaction databases Understand customer buying habits by finding associations and correlations between the different items that customers place in their shopping basket Applications Basket data analysis, cross marketing, catalog design, loss leader analysis, web log analysis, fraud detection (supervisor >examiner) 12
13 How can Association Rules be used? Probably mom was calling dad at work to buy diapers on way home and he decided to buy a sixpack as well. The retailer could move diapers and beers to separate places and position high profit items of interest to young fathers along the path. 13
14 Rule form What Is Association Rule Mining? Antecedent Consequent [support, confidence] (support and confidence are user defined measures of interestingness) Examples buys(x, computer ) buys(x, financial management software ) [0.5%, 60%] age(x, ) ^ income(x, K ) buys(x, car ) [1%,75%] 14
15 How can Association Rules be used? Let the rule discovered be {Bagels,...} {Potato Chips} Potato chips as consequent => Can be used to determine what should be done to boost its sales Bagels in the antecedent => Can be used to see which products would be affected if the store discontinues selling bagels Bagels in antecedent and Potato chips in the consequent => Can be used to see what products should be sold with Bagels to promote sale of Potato Chips 15
16 Orange juice and soda are more likely to be purchased together than any other two items. Detergent is never purchased with window cleaner or milk. Milk is never purchased with soda or detergent. 16
17 Actionable Rules Association rule types contain high quality, actionable information Trivial Rules information already well known by those familiar with the business Inexplicable Rules no explanation and do not suggest action Trivial and Inexplicable Rules occur most often 17
18 Rules Wal Mart customers who purchase Barbie dolls have a 60% likelihood of also purchasing one of three types of candy bars [Forbes, Sept 8, 1997] Customers who purchase maintenance agreements are very likely to purchase large appliances (Linoff and Berry experience) When a new hardware store opens, one of the most commonly sold items is toilet bowl cleaners (Linoff and Berry experience) 18
19 Basic Concepts Given: (1) database of transactions, (2) each transaction is a list of items purchased by a customer in a visit Find: all rules that correlate the presence of one set of items (itemset) with that of another set of items E.g., 35% of people who buys salmon also buys cheese 19
20 Rule Basic Measures A B [ s, c ] Support: denotes the frequency of the rule within transactions. A high value means that the rule involves a great part of database. support(a B [ s, c ]) = p(a B) Confidence: denotes the percentage of transactions containing A which also contain B. It is an estimation of conditioned probability. confidence(a B [ s, c ]) = p(b A) = sup(a,b)/sup(a). 20
21 Tr1 Tr2 Tr3 Tr4 Shoes, Socks, Tie, Belt Shoes, Socks, Tie, Belt, Shirt, Hat Shoes, Tie Shoes, Socks, Belt Transaction Shoes Socks Tie Belt Shirt Scarf Hat Socks Tie Support is 50% (2/4) Confidence is 66.67% (2/3) 21
22 transactions Rule C => D support confidence / 22
23 Example Trans. Id Purchased Items 1 A,D 2 A,C 3 A,B,C 4 B,E,F Definitions: Itemset: A,B or B,E,F Support of an itemset: Sup(A,B)=1 Sup(A,C)=2 Frequent pattern: Given min. sup=2, {A,C} is a frequent pattern For minimum support = 50% and minimum confidence = 50%, we have the following rules A => C with 50% support and 66% confidence C => A with 50% support and 100% confidence 23
24 Baskets = documents Other Applications items = words in those documents Lets us find words that appear together unusually frequently, i.e., linked concepts. Word 1 Word 2 Word 3 Word 4 Doc Doc Doc Word 4 => Word 3 When word 4 occurs in a document there a big probability of word 3 occurring 24
25 Baskets = sentences Other Applications items = documents containing those sentences Items that appear together too often could represent plagiarism. Doc 1 Doc 2 Doc 3 Doc 4 Sent Sent Sent Doc 4 => Doc 3 When a sentence occurs in document 4 there is a big probability of occurring in document 3 25
26 Baskets = Web pages; items = linked pages. Other Applications Pairs of pages with many common references may be about the same topic. Baskets = Web pages p i ; items = pages that link to p i Pages with many of the same links may be mirrors or about the same topic. wp a wp b wp c wp d wp1 wp2 26
27 Randall Matignon 2007, Data Mining Using SAS Enterprise Miner, Wiley (book). 27
28 The support probability of each rule is identified by the color of the symbols and the confidence probability of each rule is identified by the shape of the symbols. The items that have the highest confidence are Heineken beer, crackers, chicken, and peppers. Randall Matignon 2007, Data Mining Using SAS Enterprise Miner, Wiley (book). 28
29 29
30 Mining Association Rules Market Basket Analysis What is Association rule mining Apriori Algorithm Measures of rule interestingness 30
31 Boolean association rules Each transaction is converted to a Boolean vector 31
32 Finding Association Rules Approach: 1) Find all itemsets that have high support These are known as frequent itemsets 2) Generate association rules from frequent itemsets 32
33 Itemset Lattice for 5 products null A B C D E AB AC AD AE BC BD BE CD CE DE ABC ABD ABE ACD ACE ADE BCD BCE BDE CDE ABCD ABCE ABDE ACDE BCDE ABCDE 33
34 Apriori principle Any subset of a frequent itemset must be frequent A transaction containing {beer, diaper, nuts} also contains {beer, diaper} {beer, diaper, nuts} is frequent {beer, diaper} must also be frequent 34
35 Apriori principle No superset of any infrequent itemset should be generated or tested Many item combinations can be pruned 35
36 Apriori principle for pruning candidates If an itemset is infrequent, then all of its supersets must also be infrequent null A B C D E AB AC AD AE BC BD BE CD CE DE Found to be Infrequent ABC ABD ABE ACD ACE ADE BCD BCE BDE CDE Pruned supersets ABCD ABCE ABDE ACDE BCDE ABCDE 36
37 Mining Frequent Itemsets (the Key Step) Find the frequent itemsets: the sets of items that have (at least) minimum support A subset of a frequent itemset must also be a frequent itemset Generate length (k+1) candidate itemsets from length k frequent itemsets, and Test the candidates against DB to determine which are in fact frequent 37
38 How to Generate Candidates? step 1 The items in L k 1 are listed in an order Step 1: self joining L k 1 insert into C k select p.item 1, p.item 2,, p.item k 1, q.item k 1 from L k 1 p, L k 1 q where p.item 1 =q.item 1,, p.item k 2 =q.item k 2, p.item k 1 < q.item k 1 A D E = = < A D F 38
39 L3 set of itemsets of size 3 that are frequent (itemsets need to be sorted) itemset 1 A B C itemset 2 A D E itemset 3 A D F itemset 4 A E F A,D,E,F Two itemsets with the same k 1 = 3 1 elements are merged and inserted in C4 (the set of candidate itemsets of size 4). Those in C4 that have support above the threshold are inserted in L4. 39
40 The Apriori Algorithm Example Database D TID Items Scan D itemset sup. {1} 2 {2} 3 {3} 3 {4} 1 {5} 3 C 1 L 1 itemset sup {1 2} 1 {1 3} 2 {1 5} 1 {2 3} 2 {2 5} 3 {3 5} 2 itemset sup L 2 C 2 C 2 {1 3} 2 {2 3} 2 Scan D {2 5} 3 {3 5} 2 C 3 itemset Scan D L 3 {2 3 5} itemset sup {2 3 5} 2 Min. support: 2 transactions itemset sup. {1} 2 {2} 3 {3} 3 {5} 3 itemset {1 2} {1 3} {1 5} {2 3} {2 5} {3 5} 40
41 How to Generate Candidates? step 2 Step 2: pruning for all itemsets c in C k do for all (k 1) subsets s of c do if (s is not in L k 1 ) then delete c from C k A D E F Note: In step 1 we may generate candidates that include itemsets of size k 1 that are not frequent. The pruning step may be able to eliminate some candidates without counting their support. Counting support is a computationally demanding task. 41
42 Example of Generating Candidates step 1 L 3 = {abc, abd, acd, ace, bcd} Self joining: L 3 *L 3 joining abc and abd gives abcd joining acd and ace gives acde 42
43 Example of Generating Candidates step 2 L 3 = {abc, abd, acd, ace, bcd} Pruning (before counting its support): abcd: check if abc, abd, acd, bcd, are in L3 acde: check if acd, ace, ade, cde are in L3 acde is removed because ade is not in L 3 Thus C 4 ={abcd} 43
44 The Apriori Algorithm C k : Candidate itemset of size k L k : frequent itemset of size k Join Step: C k is generated by joining L k 1 with itself Prune Step: Any (k 1) itemset that is not frequent cannot be a subset of a frequent k itemset Algorithm: L 1 = {frequent items}; for (k = 1; L k!= ; k++) do begin C k+1 = candidates generated from L k ; for each transaction t in database do increment the count of all candidates in C k+1 that are contained in t L k+1 = candidates in C k+1 with min_support end return L = k L k ; 44
45 Generating AR from frequent intemsets Confidence (A B) = P(B A) = support_count({a,b}) support_count({a}) For every frequent itemset x, generate all non empty subsets of x For every non empty subset s of x, output the rule s (x s) if support_count({x}) support_count({s}) min_conf 45
46 Rule Generation How to efficiently generate rules from frequent itemsets? In general, confidence does not have an anti monotone property But confidence of rules generated from the same itemset has an anti monotone property L = {A,B,C,D}: c(abc D) c(ab CD) c(a BCD) Confidence is non increasing as number of items in rule consequent increases 46
47 From Frequent Itemsets to Association Rules Q: Given frequent set {A,B,E}, what are possible association rules? A => B, E A, B => E A, E => B B => A, E B, E => A E => A, B => A,B,E (empty rule), or true => A,B,E 47
48 Generating Rules: example Items Min_support: 60% 1 ACD Min_confidence: 75% 2 BCE 3 ABCE 4 BE 5 ABCE Frequent Itemset Support Trans-ID Rule Conf. {BC} =>{E} 100% {BE} =>{C} 75% {CE} =>{B} 100% {B} =>{CE} 75% {C} =>{BE} 75% {E} =>{BC} 75% {BCE},{AC} 60% {BC},{CE},{A} 60% {BE},{B},{C},{E} 80% 48
49 Exercice TID Items 1 Bread, Milk, Chips, Mustard 2 Beer, Diaper, Bread, Eggs 3 Beer, Coke, Diaper, Milk 4 Beer, Bread, Diaper, Milk, Chips 5 Coke, Bread, Diaper, Milk 6 Beer, Bread, Diaper, Milk, Mustard 7 Coke, Bread, Diaper, Milk Converta os dados para o formato booleano e para um suporte de 40%, aplique o algoritmo apriori. Bread Milk Chips Mustard Beer Diaper Eggs Coke
50 0.4*7= 2.8 C1 L1 C2 L2 Bread 6 Bread 6 Bread,Milk 5 Bread,Milk 5 Milk 6 Milk 6 Bread,Beer 3 Bread,Beer 3 Chips 2 Beer 4 Bread,Diaper 5 Bread,Diaper 5 Mustard 2 Diaper 6 Bread,Coke 2 Milk,Beer 3 Beer 4 Coke 3 Milk,Beer 3 Milk,Diaper 5 Diaper 6 Milk,Diaper 5 Milk,Coke 3 Eggs 1 Milk,Coke 3 Beer,Diaper 4 Coke 3 Beer,Diaper 4 Diaper,Coke 3 Beer,Coke 1 Diaper,Coke 3 C3 L3 Bread,Milk,Beer 2 Bread,Milk,Diaper 4 Bread,Milk,Diaper 4 Bread,Beer,Diaper 3 Bread,Beer,Diaper 3 Milk,Beer,Diaper 3 Milk,Beer,Diaper 3 Milk,Diaper,Coke 3 Milk,Beer,Coke Milk,Diaper,Coke 3 8 C C
51 Mining Association Rules Market Basket Analysis What is Association rule mining Apriori Algorithm Measures of rule interestingness 51
52 Interestingness Measurements How good is the association Rule? Are all of the strong association rules discovered interesting enough to present to the user? How can we measure the interestingness of a rule? Subjective measures A rule (pattern) is interesting if it is unexpected (surprising to the user); and/or actionable (the user can do something with it) (only the user can judge the interestingness of a rule) 52
53 Objective measures of rule interest Support Confidence or strength Lift or Interest or Correlation Conviction Leverage or Piatetsky Shapiro 53
54 Criticism to Support and Confidence Example 1: (Aggarwal & Yu, PODS98) Among 5000 students 3000 play basketball 3750 eat cereal 2000 both play basket ball and eat cereal basketball not basketball sum(row) cereal % not cereal % sum(col.) % 40% play basketball eat cereal [40%, 66.7%] misleading because the overall percentage of students eating cereal is 75% which is higher than 66.7%. play basketball not eat cereal [20%, 33.3%] is more accurate, although with lower support and confidence 54
55 Lift of a Rule LIFT( A sup( A, B) p( B A) B) sup( A) sup( B) p( B) play basketball eat cereal [40%, 66.7%] play basketball not eat cereal [20%, 33.3%] LIFT LIFT basketball not basketball sum(row) cereal not cereal sum(col.)
56 Problems With Lift Rules that hold 100% of the time may not have the highest possible lift. For example, if 5% of people are Vietnam veterans and 90% of the people are more than 5 years old, we get a lift of 0.05/(0.05*0.9)=1.11 which is only slightly above 1 for the rule Vietnam veterans > more than 5 years old. And, lift is symmetric: not eat cereal play basketball [20%, 80%] LIFT
57 Conviction of a Rule sup( A) sup( B) P( A) PB ( ) P( A)( 1 PB ( )) Conv ( A B) sup( A, B) P( A, B) P( A) P( A, B) Conviction is a measure of the implication and has value 1 if items are unrelated. play basketball eat cereal [40%, 66.7%] eat cereal play basketball conv:0.85 Conv play basketball not eat cereal [20%, 33.3%] not eat cereal play basketball conv:1.43 Conv
58 Conviction conviction of X=>Y can be interpreted as the ratio of the expected frequency that X occurs without Y (that is to say, the frequency that the rule makes an incorrect prediction) if X and Y were independent divided by the observed frequency of incorrect predictions. A conviction value of 1.2 shows that the rule would be incorrect 20% more often (1.2 times as often) if the association between X and Y was purely random chance. 58
59 Leverage of a Rule Leverage or Piatetsky Shapiro P SA ( B) sup( AB, ) sup( A) sup( B) PS (or Leverage): is the proportion of additional elements covered by both the premise and consequence above the expected if independent. 59
60 Comments Traditional methods such as database queries: support hypothesis verification about a relationship such as the co occurrence of diapers & beer. Data Mining methods automatically discover significant associations rules from data. Find whatever patterns exist in the database, without the user having to specify in advance what to look for (data driven). Therefore allow finding unexpected correlations 60
61 Choosing the Right Set of Items 61
62 Product Hierarchies Help to Generalize Items Are large fries and small fries the same product? Is the brand of ice cream more relevant than its flavor? Which is more important: the size, style, pattern, or designer of clothing? Is the energy saving option on a large appliance indicative of customer behavior? Market basket analysis produces the best results when the items occur in roughly the same number of transactions in the data. This helps prevent rules from being dominated by the most common items. Product hierarchies can help here. Roll up rare items to higher levels in the hierarchy, so they become more frequent. More common items may not have to be rolled up at all. 62
63 63
64 Virtual Items Go Beyond the Product Hierarchy The purpose of virtual items is to enable the analysis to take advantage of information that goes beyond the product hierarchy. Examples of virtual items might be designer labels, such as Calvin Klein, that appear in both apparel departments and perfumes low fat and no fat products in a grocery store energy saving options on appliances whether the purchase was made with cash, a credit card, or check the day of the week or the time of day the transaction occurred virtual items can also be used to specify which group, such as an existing location or a new location, generates the transaction. 64
65 Application Difficulties Wal Mart knows that customers who buy Barbie dolls (it sells one every 20 seconds) have a 60% likelihood of buying one of three types of candy bars. What does Wal Mart do with information like that? 'I don't have a clue,' says Wal Mart's chief of merchandising, Lee Scott. See KDnuggets 98:01 for many ideas 65
66 Some Suggestions By increasing the price of Barbie doll and giving the type of candy bar free, wal mart can reinforce the buying habits of that particular types of buyer Highest margin candy to be placed near dolls. Special promotions for Barbie dolls with candy at a slightly higher margin. Take a poorly selling product X and incorporate an offer on this which is based on buying Barbie and Candy. If the customer is likely to buy these two products anyway then why not try to increase sales on X? Probably they can not only bundle candy of type A with Barbie dolls, but can also introduce new candy of Type N in this bundle while offering discount on whole bundle. As bundle is going to sell because of Barbie dolls & candy of type A, candy of type N can get free ride to customers houses. And with the fact that you like something, if you see it often, Candy of type N can become popular. 66
67 References Jiawei Han and Micheline Kamber, Data Mining: Concepts and Techniques, 2 edition (4 Jun 2006), Gordon S. Linoff and Michael J. Berry, Data Mining Techniques: For Marketing, Sales, and Customer Relationship Management, 3rd Edition edition (1 April 2011) Vipin Kumar and Mahesh Joshi, Tutorial on High Performance Data Mining, 1999 Rakesh Agrawal, Ramakrishnan Srikan, Fast Algorithms for Mining Association Rules, Proc VLDB, 1994 ( 67
Mining Association Rules. Mining Association Rules. What Is Association Rule Mining? What Is Association Rule Mining? What is Association rule mining
Mining Association Rules What is Association rule mining Mining Association Rules Apriori Algorithm Additional Measures of rule interestingness Advanced Techniques 1 2 What Is Association Rule Mining?
Data Mining: Partially from: Introduction to Data Mining by Tan, Steinbach, Kumar
Data Mining: Association Analysis Partially from: Introduction to Data Mining by Tan, Steinbach, Kumar Association Rule Mining Given a set of transactions, find rules that will predict the occurrence of
Data Mining Apriori Algorithm
10 Data Mining Apriori Algorithm Apriori principle Frequent itemsets generation Association rules generation Section 6 of course book TNM033: Introduction to Data Mining 1 Association Rule Mining (ARM)
Data Mining Association Analysis: Basic Concepts and Algorithms. Lecture Notes for Chapter 6. Introduction to Data Mining
Data Mining Association Analysis: Basic Concepts and Algorithms Lecture Notes for Chapter 6 Introduction to Data Mining by Tan, Steinbach, Kumar Tan,Steinbach, Kumar Introduction to Data Mining 4/8/24
Association Rule Mining
Association Rule Mining Association Rules and Frequent Patterns Frequent Pattern Mining Algorithms Apriori FP-growth Correlation Analysis Constraint-based Mining Using Frequent Patterns for Classification
Association Analysis: Basic Concepts and Algorithms
6 Association Analysis: Basic Concepts and Algorithms Many business enterprises accumulate large quantities of data from their dayto-day operations. For example, huge amounts of customer purchase data
Laboratory Module 8 Mining Frequent Itemsets Apriori Algorithm
Laboratory Module 8 Mining Frequent Itemsets Apriori Algorithm Purpose: key concepts in mining frequent itemsets understand the Apriori algorithm run Apriori in Weka GUI and in programatic way 1 Theoretical
Unique column combinations
Unique column combinations Arvid Heise Guest lecture in Data Profiling and Data Cleansing Prof. Dr. Felix Naumann Agenda 2 Introduction and problem statement Unique column combinations Exponential search
Data Mining Techniques Chapter 9: Market Basket Analysis and Association Rules
Data Mining Techniques Chapter 9: Market Basket Analysis and Association Rules Market basket analysis.................................................... 2 Market basket data I.....................................................
Project Report. 1. Application Scenario
Project Report In this report, we briefly introduce the application scenario of association rule mining, give details of apriori algorithm implementation and comment on the mined rules. Also some instructions
MAXIMAL FREQUENT ITEMSET GENERATION USING SEGMENTATION APPROACH
MAXIMAL FREQUENT ITEMSET GENERATION USING SEGMENTATION APPROACH M.Rajalakshmi 1, Dr.T.Purusothaman 2, Dr.R.Nedunchezhian 3 1 Assistant Professor (SG), Coimbatore Institute of Technology, India, [email protected]
International Journal of World Research, Vol: I Issue XIII, December 2008, Print ISSN: 2347-937X DATA MINING TECHNIQUES AND STOCK MARKET
DATA MINING TECHNIQUES AND STOCK MARKET Mr. Rahul Thakkar, Lecturer and HOD, Naran Lala College of Professional & Applied Sciences, Navsari ABSTRACT Without trading in a stock market we can t understand
A Survey on Association Rule Mining in Market Basket Analysis
International Journal of Information and Computation Technology. ISSN 0974-2239 Volume 4, Number 4 (2014), pp. 409-414 International Research Publications House http://www. irphouse.com /ijict.htm A Survey
Selection of Optimal Discount of Retail Assortments with Data Mining Approach
Available online at www.interscience.in Selection of Optimal Discount of Retail Assortments with Data Mining Approach Padmalatha Eddla, Ravinder Reddy, Mamatha Computer Science Department,CBIT, Gandipet,Hyderabad,A.P,India.
Data Mining: Introduction. Lecture Notes for Chapter 1. Slides by Tan, Steinbach, Kumar adapted by Michael Hahsler
Data Mining: Introduction Lecture Notes for Chapter 1 Slides by Tan, Steinbach, Kumar adapted by Michael Hahsler Why Mine Data? Commercial Viewpoint Lots of data is being collected and warehoused - Web
DEVELOPMENT OF HASH TABLE BASED WEB-READY DATA MINING ENGINE
DEVELOPMENT OF HASH TABLE BASED WEB-READY DATA MINING ENGINE SK MD OBAIDULLAH Department of Computer Science & Engineering, Aliah University, Saltlake, Sector-V, Kol-900091, West Bengal, India [email protected]
Product recommendations and promotions (couponing and discounts) Cross-sell and Upsell strategies
WHITEPAPER Today, leading companies are looking to improve business performance via faster, better decision making by applying advanced predictive modeling to their vast and growing volumes of data. Business
MBA 8473 - Data Mining & Knowledge Discovery
MBA 8473 - Data Mining & Knowledge Discovery MBA 8473 1 Learning Objectives 55. Explain what is data mining? 56. Explain two basic types of applications of data mining. 55.1. Compare and contrast various
Grocery Shopping Within a Budget
Grocery Shopping Within a Budget Lesson Plan Grade Level 10-12 Take Charge of Your Finances National Content Standards Family and Consumer Science Standards: 1.1.6, 2.1.1, 2.1.2, 2.1.3, 2.5.1, 2.6.1, 2.6.2,
Data Mining: An Overview. David Madigan http://www.stat.columbia.edu/~madigan
Data Mining: An Overview David Madigan http://www.stat.columbia.edu/~madigan Overview Brief Introduction to Data Mining Data Mining Algorithms Specific Eamples Algorithms: Disease Clusters Algorithms:
Mining an Online Auctions Data Warehouse
Proceedings of MASPLAS'02 The Mid-Atlantic Student Workshop on Programming Languages and Systems Pace University, April 19, 2002 Mining an Online Auctions Data Warehouse David Ulmer Under the guidance
HOW TO USE MINITAB: DESIGN OF EXPERIMENTS. Noelle M. Richard 08/27/14
HOW TO USE MINITAB: DESIGN OF EXPERIMENTS 1 Noelle M. Richard 08/27/14 CONTENTS 1. Terminology 2. Factorial Designs When to Use? (preliminary experiments) Full Factorial Design General Full Factorial Design
Data Mining Solutions for the Business Environment
Database Systems Journal vol. IV, no. 4/2013 21 Data Mining Solutions for the Business Environment Ruxandra PETRE University of Economic Studies, Bucharest, Romania [email protected] Over
Grocery Shopping Within a Budget Grade Level 10-12
1.8.2 Grocery Shopping Within a Budget Grade Level 10-12 Take Charge of Your Finances Materials provided by: Heide Mankin, Billings Senior High School, Billings, Montana Janice Denson, Twin Bridges High
Leading Practices in Market Basket Analysis
Leading Practices in Market Basket Analysis How Top Retailers are Using Market Basket Analysis to Win Margin and Market Share Larry Gordon, Partner The FactPoint Group 349 First Street Los Altos, CA 94022
ECLT 5810 E-Commerce Data Mining Techniques - Introduction. Prof. Wai Lam
ECLT 5810 E-Commerce Data Mining Techniques - Introduction Prof. Wai Lam Data Opportunities Business infrastructure have improved the ability to collect data Virtually every aspect of business is now open
Gerry Hobbs, Department of Statistics, West Virginia University
Decision Trees as a Predictive Modeling Method Gerry Hobbs, Department of Statistics, West Virginia University Abstract Predictive modeling has become an important area of interest in tasks such as credit
Introduction to Market Basket Analysis Bill Qualls, First Analytics, Raleigh, NC
Paper AA07-2013 Introduction to Market Basket Analysis Bill Qualls, First Analytics, Raleigh, NC ABSTRACT Market Basket Analysis (MBA) is a data mining technique which is widely used in the consumer package
MINING THE DATA FROM DISTRIBUTED DATABASE USING AN IMPROVED MINING ALGORITHM
MINING THE DATA FROM DISTRIBUTED DATABASE USING AN IMPROVED MINING ALGORITHM J. Arokia Renjit Asst. Professor/ CSE Department, Jeppiaar Engineering College, Chennai, TamilNadu,India 600119. Dr.K.L.Shunmuganathan
Quick Introduction of Data Mining Techniques
Quick Introduction of Data Mining Techniques *Sources partially from Introduction to Data Mining, by P.-N. Tan, M. Steinbach, V. Kumar, Addison-Wesley, 2005. Main Data Mining Techniques Link Analysis Associations
Market Basket Analysis for a Supermarket based on Frequent Itemset Mining
www.ijcsi.org 257 Market Basket Analysis for a Supermarket based on Frequent Itemset Mining Loraine Charlet Annie M.C. 1 and Ashok Kumar D 2 1 Department of Computer Science, Government Arts College Tchy,
Analytics on Big Data
Analytics on Big Data Riccardo Torlone Università Roma Tre Credits: Mohamed Eltabakh (WPI) Analytics The discovery and communication of meaningful patterns in data (Wikipedia) It relies on data analysis
Merchandise Accounts. Chapter 7 - Unit 14
Merchandise Accounts Chapter 7 - Unit 14 Merchandising... Merchandising... There are many types of companies out there Merchandising... There are many types of companies out there Service company - sells
Boolean Algebra (cont d) UNIT 3 BOOLEAN ALGEBRA (CONT D) Guidelines for Multiplying Out and Factoring. Objectives. Iris Hui-Ru Jiang Spring 2010
Boolean Algebra (cont d) 2 Contents Multiplying out and factoring expressions Exclusive-OR and Exclusive-NOR operations The consensus theorem Summary of algebraic simplification Proving validity of an
Benchmark Databases for Testing Big-Data Analytics In Cloud Environments
North Carolina State University Graduate Program in Operations Research Benchmark Databases for Testing Big-Data Analytics In Cloud Environments Rong Huang Rada Chirkova Yahya Fathi ICA CON 2012 April
Data Mining & Knowledge Discovery: Personalization and Profiling Technologies
Data Mining & Knowledge Discovery: Personalization and Profiling Technologies 1 Predictive Modeling and Knowledge Discovery via Data Mining v A black box that makes predictions about the future based on
Data Mining2 Advanced Aspects and Applications
Data Mining2 Advanced Aspects and Applications Fosca Giannotti and Mirco Nanni Pisa KDD Lab, ISTI-CNR & Univ. Pisa http://www-kdd.isti.cnr.it/ DIPARTIMENTO DI INFORMATICA - Università di Pisa anno accademico
Databases - Data Mining. (GF Royle, N Spadaccini 2006-2010) Databases - Data Mining 1 / 25
Databases - Data Mining (GF Royle, N Spadaccini 2006-2010) Databases - Data Mining 1 / 25 This lecture This lecture introduces data-mining through market-basket analysis. (GF Royle, N Spadaccini 2006-2010)
Chapter 6: Episode discovery process
Chapter 6: Episode discovery process Algorithmic Methods of Data Mining, Fall 2005, Chapter 6: Episode discovery process 1 6. Episode discovery process The knowledge discovery process KDD process of analyzing
Data Mining Algorithms Part 1. Dejan Sarka
Data Mining Algorithms Part 1 Dejan Sarka Join the conversation on Twitter: @DevWeek #DW2015 Instructor Bio Dejan Sarka ([email protected]) 30 years of experience SQL Server MVP, MCT, 13 books 7+ courses
Mine Your Business A Novel Application of Association Rules for Insurance Claims Analytics
Mine Your Business A Novel Application of Association Rules for Insurance Claims Analytics Lucas Lau and Arun Tripathi, Ph.D. Abstract: This paper describes how a data mining technique known as Association
RDB-MINER: A SQL-Based Algorithm for Mining True Relational Databases
998 JOURNAL OF SOFTWARE, VOL. 5, NO. 9, SEPTEMBER 2010 RDB-MINER: A SQL-Based Algorithm for Mining True Relational Databases Abdallah Alashqur Faculty of Information Technology Applied Science University
Building A Smart Academic Advising System Using Association Rule Mining
Building A Smart Academic Advising System Using Association Rule Mining Raed Shatnawi +962795285056 [email protected] Qutaibah Althebyan +962796536277 [email protected] Baraq Ghalib & Mohammed
<Insert Picture Here> Oracle Retail Data Model Overview
Oracle Retail Data Model Overview The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into
lesson four shopping wisely teacher s guide
lesson four shopping wisely teacher s guide shopping wisely web sites web sites for shopping wisely The Internet is probably the most extensive and dynamic source of information in our society. The following
Knowledge Discovery and Data Mining
Knowledge Discovery and Data Mining Data Mining in a nutshell Dr. Osmar R. Zaïane Fall 2007 Extract interesting knowledge (rules, regularities, patterns, constraints) from data in large collections. Knowledge
Frequent item set mining
Frequent item set mining Christian Borgelt Frequent item set mining is one of the best known and most popular data mining methods. Originally developed for market basket analysis, it is used nowadays for
Chapter 4 Data Mining A Short Introduction. 2006/7, Karl Aberer, EPFL-IC, Laboratoire de systèmes d'informations répartis Data Mining - 1
Chapter 4 Data Mining A Short Introduction 2006/7, Karl Aberer, EPFL-IC, Laboratoire de systèmes d'informations répartis Data Mining - 1 1 Today's Question 1. Data Mining Overview 2. Association Rule Mining
Figure 4-1 Price Quantity Quantity Per Pair Demanded Supplied $ 2 18 3 $ 4 14 4 $ 6 10 5 $ 8 6 6 $10 2 8
Econ 101 Summer 2005 In-class Assignment 2 & HW3 MULTIPLE CHOICE 1. A government-imposed price ceiling set below the market's equilibrium price for a good will produce an excess supply of the good. a.
SAP Predictive Analysis: Strategy, Value Proposition
September 10-13, 2012 Orlando, Florida SAP Predictive Analysis: Strategy, Value Proposition Thomas B Kuruvilla, Solution Management, SAP Business Intelligence Scott Leaver, Solution Management, SAP Business
Apriori-Map/Reduce Algorithm
Apriori-Map/Reduce Algorithm Jongwook Woo Computer Information Systems Department California State University Los Angeles, CA Abstract Map/Reduce algorithm has received highlights as cloud computing services
Die Welt Multimedia-Reichweite
Die Welt Multimedia-Reichweite 1) Background The quantification of Die Welt s average daily audience (known as Multimedia-Reichweite, MMR) has been developed by Die Welt management, including the research
Lecture Notes on Database Normalization
Lecture Notes on Database Normalization Chengkai Li Department of Computer Science and Engineering The University of Texas at Arlington April 15, 2012 I decided to write this document, because many students
Mining Association Rules: A Database Perspective
IJCSNS International Journal of Computer Science and Network Security, VOL.8 No.12, December 2008 69 Mining Association Rules: A Database Perspective Dr. Abdallah Alashqur Faculty of Information Technology
OLAP & DATA MINING CS561-SPRING 2012 WPI, MOHAMED ELTABAKH
OLAP & DATA MINING CS561-SPRING 2012 WPI, MOHAMED ELTABAKH 1 Online Analytic Processing OLAP 2 OLAP OLAP: Online Analytic Processing OLAP queries are complex queries that Touch large amounts of data Discover
GROCERY SHOPPING (BEING A SMART CONSUMER) &
GROCERY SHOPPING (BEING A SMART CONSUMER) & Mrs. Anthony CONVENIENCE FOODS VS. SCRATCH DO YOU EAT WITH YOUR FAMILY Once a month? Once a week? 2-3 times a week? Every night? FAMILY DINNER STATISTICS The
Canada s Organic Market National Highlights, 2013
Canada s Organic Market National Highlights, 2013 IN LATE 2012, THE CANADA ORGANIC TRADE ASSOCIATION LAUNCHED THE ORGANIC MARKET RESEARCH PROGRAM, THE MOST COMPREHENSIVE STUDY OF CANADA S ORGANIC MARKETPLACE
Introduction. A. Bellaachia Page: 1
Introduction 1. Objectives... 3 2. What is Data Mining?... 4 3. Knowledge Discovery Process... 5 4. KD Process Example... 7 5. Typical Data Mining Architecture... 8 6. Database vs. Data Mining... 9 7.
Unit 3 Boolean Algebra (Continued)
Unit 3 Boolean Algebra (Continued) 1. Exclusive-OR Operation 2. Consensus Theorem Department of Communication Engineering, NCTU 1 3.1 Multiplying Out and Factoring Expressions Department of Communication
Selection and Preparation of Foods Management of the Food Budget*
Selection and Preparation of Foods Management of the Food Budget* Healthy meals on a limited budget! How can you serve healthy meals on a limited budget? It takes some time and planning, but you and your
Finding Frequent Patterns Based On Quantitative Binary Attributes Using FP-Growth Algorithm
R. Sridevi et al Int. Journal of Engineering Research and Applications RESEARCH ARTICLE OPEN ACCESS Finding Frequent Patterns Based On Quantitative Binary Attributes Using FP-Growth Algorithm R. Sridevi,*
#10655 ADVENTURES IN THE GROCERY STORE WITH CHEF ANDREW
C a p t i o n e d M e d i a P r o g r a m VOICE (800) 237-6213 TTY (800) 237-6819 FAX (800) 538-5636 E-MAIL [email protected] WEB www.cfv.org #10655 ADVENTURES IN THE GROCERY STORE WITH CHEF ANDREW LEARNING
Static Data Mining Algorithm with Progressive Approach for Mining Knowledge
Global Journal of Business Management and Information Technology. Volume 1, Number 2 (2011), pp. 85-93 Research India Publications http://www.ripublication.com Static Data Mining Algorithm with Progressive
EXECUTIVE SUMMARY Loyalty/Rewards is Seen as Part of the Marketing Mix
CONSUMER ATTITUDES & BEHAVIOR RETAIL LOYALTY/REWARDS PROGRAMS LOYALOGY.COM - OCTOBER 2014 EXECUTIVE SUMMARY Loyalty/Rewards is Seen as Part of the Marketing Mix Retail loyalty or rewards programs are becoming
Big Data Analysis Technology
Big Data Analysis Technology Tobias Hardes (6687549) Email: [email protected] Seminar: Cloud Computing and Big Data Analysis, L.079.08013 Summer semester 2013 University of Paderborn Abstract
Data Mining Techniques
15.564 Information Technology I Business Intelligence Outline Operational vs. Decision Support Systems What is Data Mining? Overview of Data Mining Techniques Overview of Data Mining Process Data Warehouses
A/B SPLIT TESTING GUIDE How to Best Test for Conversion Success
A/B SPLIT TESTING GUIDE How to Best Test for Conversion Success CONVERSION OPTIMIZATION WHITE PAPER BY STEELHOUSE Q2 2012 TABLE OF CONTENTS Executive Summary...3 Importance of Conversion...4 What is A/B
Learning ZoneXpress P.O. Box 1022, Owatonna, MN 55060 888-455-7003 www.learningzonexpress.com
Name Hour 1. What is a good thing to have when you go shopping? 2. Name three of the many departments in a grocery store: a. b. c. 3. "Staples" are products you use everyday. Name three staples: a. b.
Association Rule Mining: Exercises and Answers
Association Rule Mining: Exercises and Answers Contains both theoretical and practical exercises to be done using Weka. The exercises are part of the DBTech Virtual Workshop on KDD and BI. Exercise 1.
EFFECTIVE USE OF THE KDD PROCESS AND DATA MINING FOR COMPUTER PERFORMANCE PROFESSIONALS
EFFECTIVE USE OF THE KDD PROCESS AND DATA MINING FOR COMPUTER PERFORMANCE PROFESSIONALS Susan P. Imberman Ph.D. College of Staten Island, City University of New York [email protected] Abstract
MARRYING BIG DATA WITH FASHION COMMERCE
MARRYING BIG DATA WITH FASHION COMMERCE By Tuoc Luong INTRODUCTION Businesses are leveraging Big Data to grow consumer traffic and improve monetization. Like never before, businesses are collecting, processing
How To Use Data Mining For Loyalty Based Management
Data Mining for Loyalty Based Management Petra Hunziker, Andreas Maier, Alex Nippe, Markus Tresch, Douglas Weers, Peter Zemp Credit Suisse P.O. Box 100, CH - 8070 Zurich, Switzerland [email protected],
Save Time and Money at the Grocery Store
Save Time and Money at the Grocery Store Plan a Grocery List Making a list helps you recall items you need and also saves you time. Organize your list according to the layout of the grocery store. For
The ASSOC Procedure Overview Procedure Syntax Details Example References
The ASSOC Procedure Overview Procedure Syntax PROC ASSOC Statement CUSTOMER Statement TARGET Statement Details Example References Overview Association discovery is the identification of items that occur
Unit 5 Tips for Saving Money
Unit 5 Tips for Saving Money Personal Reflection In Lesson 1 you learned the keys for money management. The keys are: Plan. Spend Less. Save More. Adjust. This lesson is about spending less. In this lesson,
Association Rule Mining: A Survey
Association Rule Mining: A Survey Qiankun Zhao Nanyang Technological University, Singapore and Sourav S. Bhowmick Nanyang Technological University, Singapore 1. DATA MINING OVERVIEW Data mining [Chen et
Data Mining with SAS. Mathias Lanner [email protected]. Copyright 2010 SAS Institute Inc. All rights reserved.
Data Mining with SAS Mathias Lanner [email protected] Copyright 2010 SAS Institute Inc. All rights reserved. Agenda Data mining Introduction Data mining applications Data mining techniques SEMMA
You can eat healthy on any budget
You can eat healthy on any budget Is eating healthy food going to cost me more money? Eating healthy meals and snacks does not have to cost you more money. In fact, eating healthy can even save you money.
Implementation of Data Mining Techniques to Perform Market Analysis
Implementation of Data Mining Techniques to Perform Market Analysis B.Sabitha 1, N.G.Bhuvaneswari Amma 2, G.Annapoorani 3, P.Balasubramanian 4 PG Scholar, Indian Institute of Information Technology, Srirangam,
Figure 1, A Monopolistically Competitive Firm
The Digital Economist Lecture 9 Pricing Power and Price Discrimination Many firms have the ability to charge prices for their products consistent with their best interests even thought they may not be
WHAT IS THE CONSUMER PRICE INDEX?
INTRODUCTION The Consumer Price Index, commonly referred to as the CPI, is one of the most used of the statistical series produced by the Statistical Institute of Jamaica (STATIN). In its many applications,
New Matrix Approach to Improve Apriori Algorithm
New Matrix Approach to Improve Apriori Algorithm A. Rehab H. Alwa, B. Anasuya V Patil Associate Prof., IT Faculty, Majan College-University College Muscat, Oman, [email protected] Associate
Using Data Mining and Machine Learning in Retail
Using Data Mining and Machine Learning in Retail Omeid Seide Senior Manager, Big Data Solutions Sears Holdings Bharat Prasad Big Data Solution Architect Sears Holdings Over a Century of Innovation A Fortune
Data Mining Applications in Manufacturing
Data Mining Applications in Manufacturing Dr Jenny Harding Senior Lecturer Wolfson School of Mechanical & Manufacturing Engineering, Loughborough University Identification of Knowledge - Context Intelligent
WEBLOG RECOMMENDATION USING ASSOCIATION RULES
IADIS International Conference on Web Based Communities 2006 WEBLOG RECOMMENDATION USING ASSOCIATION RULES Juan Julián Merelo Guervós, Pedro A. Castillo, Beatriz Prieto Campos Departamento de Arquitectura
IT and CRM A basic CRM model Data source & gathering system Database system Data warehouse Information delivery system Information users
1 IT and CRM A basic CRM model Data source & gathering Database Data warehouse Information delivery Information users 2 IT and CRM Markets have always recognized the importance of gathering detailed data
Healthy Grocery Shopping on a Budget. Tips for smart spending at the grocery store
Healthy Grocery Shopping on a Budget Tips for smart spending at the grocery store Re c ipe e! In d si Grocery Store Science Eating well does not need to cost a lot of money. Here are some ways to choose
Finding Frequent Itemsets using Apriori Algorihm to Detect Intrusions in Large Dataset
Finding Frequent Itemsets using Apriori Algorihm to Detect Intrusions in Large Dataset Kamini Nalavade1, B.B. Meshram2, Abstract With the growth of hacking and exploiting tools and invention of new ways
Downtown St. Paul Banquet Menu Catered By:
Downtown St. Paul Banquet Menu Catered By: 175 W. 7th Street, St. Paul, MN 55102 (651) 556-1420 www.theliffey.com Catering Coordinator: Whitney Gale (612) 252-1639 [email protected] Menu Created
