A Decision Theoretic Approach to Targeted Advertising

Save this PDF as:

Size: px
Start display at page:

Transcription

3 84 UNCERTAINTY IN ARTIFICIAL INTELLIGENCE PROCEEDINGS 2000 Let M be the binary variable that denotes whether or not a person was sent the mailing, with values m0 (not mailed) and m1 (mailed). Let S be the binary variable that denotes whether or not a person subscribes to MSN, with values s0 (did not subscribe) and s1 (did subscribe). Using these variables, we can re-write the (identifiable) fractions involved in the expected lift as (MLE) probabilities: (NAlw + NPers) N (NAlw + NAnti) N p(s = s t im = mo) Note that MAP estimates for these probabilities can be obtained instead from the given fractions and prior knowledge. Plugging into the definition of ELP (left side of Equation 1) we have: the values for the variables in X to define the subpopulations that may have different values for ELP. The statistical model we build is one for the probability distribution p(sim, X). There are many model classes that can be used to represent this distribution, including generalized linear models, support vector machines, and Bayesian networks. In this paper, we concentrate on decision trees which are described by (e.g.) Breiman, Friedman, Olshen and Stone (1984). The probability distribution p(sim, X) can be used to calculate the ELP for anyone in the population. In particular, if we know the values { x1,..., Xn} for the person, we have: ELP = rs p(s = StiM = mt,xt = X],...,Xn = Xn) -ru. p(s = StiM = mo,xt = Xj,...,Xn = Xn) -c ELP= (2) rs p(s = StiM = mt) -ru p(s = StiM = mo) - c In the next section, we describe methods for automatically identifying sub-populations that yield large expected lifts in profit as a result of the mailing. As an example, the expected profit from mailing to the entire population (i.e. using the mail-to-all strategy) may be negative, but our methods might discover that there is lots of money to be earned by mailing to the sub-population of females. 3 IDENTIFYING PROFITABLE TARGETS In this section, we describe how to use the data collected from the randomized experiment to build a statistical model that can calculate the ELP for anyone in the population. In particular, we introduce a new decision-tree learning algorithm that can be used to divide the population of potential customers into groups for the purpose of maximizing profit in an advertising campaign. The experimental data consists of, for each person, a set of values for all distinctions in the domain of interest. The distinctions in the domain necessarily include the two binary variables M (whether or not we mailed to the person) and S (whether or not the person subscribed to MSN) that were introduced in the previous section. We use X= {X1,..., Xn} to denote the other distinctions that are in our data. These distinctions are precisely those that we collected in the Windows 95 registration process. The statistical model uses A decision tree T can be used to represent the distribution of interest. The structure of a decision tree is a tree, where each internal node I stores a mapping from the values of a predictor variable Xj (or M) to the children of I in the tree. Each leaf node L in the tree stores a probability distribution for the target variable S. The probability of the target variable S, given a set of values {M = m, X1 = x1,..., Xn = Xn} for the predictor variables, is obtained by starting at the root of T and using the internal-node mappings to traverse down the tree to a leaf node. We call the mappings in the internal nodes splits. When an internal node I maps values of variable Xj (or M) to its children, we say that Xj is the split variable of node I, and that I is a split on X1. For example, the decision tree shown in Figure 1 stores a probability distribution p(sim, X1, X2). In the example, X1 has two values {1, 2}, and X2 has three values {1, 2, 3}. In the figure, the internal nodes are drawn with circles, and the leaf nodes are drawn with boxes. As we traverse down the tree, the splits at each internal node are described by the label of the node and by the labels of the out-going edges. In particular, if the current internal node of the traversal is labeled with X;, we move next to a child of that node by following the edge that is labeled with the given value x;. Given values {X1 = 1, X2 = 2, M = m0} for the predictors, we obtain p(sixt = 1, X2 = 2, M = m0} by traversing the tree in Figure 1 as follows (the traversal for this prediction is emphasized in the figure by dark edges). We start at the root node of the tree, and see that the root node is a split on X2 Because x2 = 2, we traverse down the right-most child of the root. This next internal node is a split on X1, which

4 UNCERTAINTY IN ARTIFICIAL INTELLIGENCE PROCEEDINGS (1996) and Chickering, Beckerman and Meek (1997) both grow decision trees to represent the conditional distributions in Bayesian networks. Figure 1: Example decision tree for the distribution p(sim,x) is has value 1, so we move next to the left child of this node. Finally, because M = m0, we move to the right child, which is a leaf node. We extract the conditional distribution directly from the leaf, and conclude that p(s = s1ix1 = 1, Xz = 2, M = mo) = 0.2 and p(s = so!x1 = 1, X2 = 2, M = mo) = 0.8. As an example of how we would use this tree for targeting, suppose we have a potential customer with values xl = 1, x2 = 2, and we would like to decide if we should mail to him. Further suppose that ru = 10, rs = 8, and c = 0.5. We plug these constants and the probabilities extracted from the tree into Equation 2 to get: ELP = 8 X X = 0. 7 Because the expected lift in profit is positive, we would send the mailing to the person. Note that under the given cost/benefit scenario we should not send the mailing to a person for which x2 = 1 or x2 = 3, even though mailing to such a person increases the chances that he will subscribe to MSN. There are several types of splits that can exist in a decision tree. A complete split is a split where each value of the split variable maps to a separate child. Examples of complete splits in the figure are splits on the binary variables. Another type is a binary split, where the node maps one of the values of the split variable to one child, and all other values of the split variable to another. The root node is a binary split in the figure. Decision trees are typically constructed from data using a greedy search algorithm in conjunction with a scoring criterion that evaluates how good the tree is given the data. See Breiman et al. (1984) for examples of such algorithms. Buntine (1993) applies Bayesian scoring to grow decision trees; in our experiments, we use a particular Bayesian scoring function to be described in Section 4. Friedman and Goldszmidt The objective of these traditional decision-tree learning algorithms is to identify the tree that best models the distribution of interest, namely p(sjm, X). That is, the scoring criterion evaluates the predictive accuracy of the tree. In our application, however, the primary objective is to maximize profit, and although the objectives are related, the tree that best models the conditional distribution may not be the most useful when making decisions about who to mail in our campaign. We now consider a modification that can be made to a standard decision-tree learning algorithm that more closely approximates our objective. Recall that the expected lift in profit is the difference between two probabilities: one probability where M = m1 and the other probability where M = m0. Consequently, it might be desirable for the decisiontree algorithm (or any statistical model learning algorithm) to do its best to model the difference between these two probabilities rather than to directly model the conditional distribution. In the case of decision trees, one heuristic that can facilitate this goal is to insist that there be a split on M along any path from the root node to a leaf node in the tree. One approach to ensure this property, which is the approach we took in our experiments, is to insist that the last split on any path is on M. Whereas most tree learning algorithms grow trees by replacing leaf nodes with splits, algorithms using this approach need to be modified to replace the last split (on M) in the tree with a subtree that contains a split on some variable X; E X, followed by a (last) split on M for each child of the node that splits X;. 1 An example of such a replacement is shown in Figure 2. In Figure 2a, we show a decision tree where the last split on every path is on M. In Figure 2b we show a replacement of one of these splits that might be considered by a typical learning algorithm. Note that because leaves used to compute the ELP for any person are necessarily siblings in these trees, it is easy to describe an advertising decision in terms of the other variables. In particular, any path from the root node to an M split describes a unique partition of the population, and the sibling leaf nodes determine the mailing decision for all of the members of that population. As an example, a path in the tree to a split on M might correspond to males who have lots 11n fact, our implementation of this algorithm does not explicitly apply the last split in the tree. Instead, our scoring criterion is modified to evaluate the tree as if there was a last split on M for every leaf node.

6 UNCERTAINTY IN ARTIFICIAL INTELLIGENCE PROCEEDINGS next. Otherwise2, we checked whether or not the corresponding person subscribed to MSN; if he did, we added r 8 - c to our total revenue, and if he did not, we added -c. Finally, we divided our total revenue by the number of records for which our recommendation matched the random assignment in the experiment to get an expected revenue per person. For comparison purposes, we also calculated the perperson expected revenue for the simple strategy of mailing to everyone. Then, for both of the algorithms, we measured the improvement in the corresponding per-person revenue over the per-person revenue from the mail-to-all strategy. We found that comparing the algorithms using these improvements was very useful for analyzing multiple cost/benefit scenarios; the improvement from using a tree strategy over using the mail-to-all strategy converges to a small number as the benefit from the advertisement grows large, whereas the per-person revenue from a tree strategy will continue to increase with the benefit. Figure 3 shows our results using a single c = 42 cents and varying r8 = ru from 1 to 15 dollars. For both algorithms, the improvement in the per-person revenue over the mail-to-all per-person revenue is plotted for each value of r s. The new algorithm FORCE slightly outperforms the simple algorithm NORMAL for benefits less than ten dollars, but for larger benefits the trees yield identical decisions. Although the improvements over the mail-to-all strategy decrease with increasing subscription revenue, they will never be zero, and the strategy resulting from either tree will be preferred to the mailto-all strategy in this domain. The reason is that both models have identified populations for which the mailing is either independent of the subscription rate, or for which the mailing is actually detrimental F !!! 0 e. tn - c Cll E e c. 0.1.E *-FORCE - -o- - NORMAL 5 DISCUSSION In this paper we have discussed how to use machinelearning techniques to help in a targeted advertising campaign. We presented a new decision-tree learning algorithm that attempts to identify trees that will be particularly useful for maximizing revenue in such a campaign. Because experimental data of the type used in Section 4 is difficult to obtain, we were only able to evaluate our algorithm in a single domain. Benefit (Dollars) Figure 3: Expected per-person improvement over the mail-to-all strategy for both algorithms, using c = 42 cents and varying rs = ru from 1 to 15 An interesting question is why the new approach only provided marginal improvement over the simpler algorithm. It turns out that for the MSN domain, all 2 In our experiments, the number of times that our recommendation matched the experiment ranged from a low of roughly 5,000 times to a high of roughly 25,000 times

7 88 UNCERTAINTY IN ARTIFICIAL INTELLIGENCE PROCEEDINGS 2000 of the trees learned using the off-the-shelf algorithm had splits on M for the majority of the paths from the root to the leaves. That is, the condition that motivated our new algorithm in the first place is almost satisfied using the simple algorithm. We expect that in general this will not be the case, and that our algorithm will prove to lead searches to better trees for targeting. [Ling and Li, 1998) Ling, C. and Li, C. (1998). Data mining for direct marketing: Problems and solutions. In Proceedings 4th International Conference on Knowledge Discovery in Databases (KDD-98). New York. An obvious alternative approach that meets our spliton-m criterion is to make the first split in the tree on M. Although our heuristic criterion is met, the approach clearly does not encourage a greedy algorithm to identify trees that predict ELP = p(s = s1[m = ml ) - p(s = S1[M = m0). In fact, this approach is equivalent to independently learning two decision trees: one for the data where M = m0 and another for the data where M = m1. In experiments not presented in this paper, we have found that the approach results in significantly inferior trees. An interesting extension to this work is to consider campaigns where the advertisement offers a special price. In fact, if the experiment consisted of mailing advertisements of various discounts, we could use our techniques to simultaneously identify the best discount and corresponding mailing strategy. References [Breiman et al., 1984) Breiman, L., Friedman, J., 01- shen, R., and Stone, C. (1984). Classification and Regression Trees. Wadsworth & Brooks, Monterey, CA. [Buntine, 1993) Buntine, W. (1993). Learning classification trees. In Artificial Intelligence Frontiers in Statistics: AI and statistics III. Chapman and Hall, New York. [Chickering et al., 1997) Chickering, D., Heckerman, D., and Meek, C. (1997). A Bayesian approach to learning Bayesian networks with local structure. In Proceedings of Thirteenth Conference on Uncertainty in Artificial Intelligence, Providence, RI. Morgan Kaufmann. [Friedman and Goldszmidt, 1996) Friedman, N. and Goldszmidt, M. (1996). Learning Bayesian netwotks with local structure. In Proceedings of Twelfth Conference on Uncertainty in Artificial Intelligence, Portland, OR, pages Morgan Kaufmann. [Hughes, 1996) Hughes, A. (1996). The Complete Database Marketer: Second-Generation Strategies and Techniques for Tapping the Power of Your Customer Database. Irwin Professional, Chicago, Ill.

Gerry Hobbs, Department of Statistics, West Virginia University

Decision Trees as a Predictive Modeling Method Gerry Hobbs, Department of Statistics, West Virginia University Abstract Predictive modeling has become an important area of interest in tasks such as credit

Ira J. Haimowitz Henry Schwarz

From: AAAI Technical Report WS-97-07. Compilation copyright 1997, AAAI (www.aaai.org). All rights reserved. Clustering and Prediction for Credit Line Optimization Ira J. Haimowitz Henry Schwarz General

Social Media Mining. Data Mining Essentials

Introduction Data production rate has been increased dramatically (Big Data) and we are able store much more data than before E.g., purchase data, social media data, mobile phone data Businesses and customers

Lecture 10: Regression Trees

Lecture 10: Regression Trees 36-350: Data Mining October 11, 2006 Reading: Textbook, sections 5.2 and 10.5. The next three lectures are going to be about a particular kind of nonlinear predictive model,

D-optimal plans in observational studies

D-optimal plans in observational studies Constanze Pumplün Stefan Rüping Katharina Morik Claus Weihs October 11, 2005 Abstract This paper investigates the use of Design of Experiments in observational

THE HYBRID CART-LOGIT MODEL IN CLASSIFICATION AND DATA MINING. Dan Steinberg and N. Scott Cardell

THE HYBID CAT-LOGIT MODEL IN CLASSIFICATION AND DATA MINING Introduction Dan Steinberg and N. Scott Cardell Most data-mining projects involve classification problems assigning objects to classes whether

Applied Data Mining Analysis: A Step-by-Step Introduction Using Real-World Data Sets

Applied Data Mining Analysis: A Step-by-Step Introduction Using Real-World Data Sets http://info.salford-systems.com/jsm-2015-ctw August 2015 Salford Systems Course Outline Demonstration of two classification

Data Mining for Direct Marketing: Problems and

Data Mining for Direct Marketing: Problems and Solutions Charles X. Ling and Chenghui Li Department of Computer Science The University of Western Ontario London, Ontario, Canada N6A 5B7 Tel: 519-661-3341;

DECISION TREE INDUCTION FOR FINANCIAL FRAUD DETECTION USING ENSEMBLE LEARNING TECHNIQUES

DECISION TREE INDUCTION FOR FINANCIAL FRAUD DETECTION USING ENSEMBLE LEARNING TECHNIQUES Vijayalakshmi Mahanra Rao 1, Yashwant Prasad Singh 2 Multimedia University, Cyberjaya, MALAYSIA 1 lakshmi.mahanra@gmail.com

Decision Tree Learning on Very Large Data Sets

Decision Tree Learning on Very Large Data Sets Lawrence O. Hall Nitesh Chawla and Kevin W. Bowyer Department of Computer Science and Engineering ENB 8 University of South Florida 4202 E. Fowler Ave. Tampa

Comparing the Results of Support Vector Machines with Traditional Data Mining Algorithms

Comparing the Results of Support Vector Machines with Traditional Data Mining Algorithms Scott Pion and Lutz Hamel Abstract This paper presents the results of a series of analyses performed on direct mail

Performance Analysis of Naive Bayes and J48 Classification Algorithm for Data Classification

Performance Analysis of Naive Bayes and J48 Classification Algorithm for Data Classification Tina R. Patil, Mrs. S. S. Sherekar Sant Gadgebaba Amravati University, Amravati tnpatil2@gmail.com, ss_sherekar@rediffmail.com

Data Mining - Evaluation of Classifiers

Data Mining - Evaluation of Classifiers Lecturer: JERZY STEFANOWSKI Institute of Computing Sciences Poznan University of Technology Poznan, Poland Lecture 4 SE Master Course 2008/2009 revised for 2010

Unit 4 DECISION ANALYSIS. Lesson 37. Decision Theory and Decision Trees. Learning objectives:

Unit 4 DECISION ANALYSIS Lesson 37 Learning objectives: To learn how to use decision trees. To structure complex decision making problems. To analyze the above problems. To find out limitations & advantages

Data quality in Accounting Information Systems

Data quality in Accounting Information Systems Comparing Several Data Mining Techniques Erjon Zoto Department of Statistics and Applied Informatics Faculty of Economy, University of Tirana Tirana, Albania

Prediction of Stock Performance Using Analytical Techniques

136 JOURNAL OF EMERGING TECHNOLOGIES IN WEB INTELLIGENCE, VOL. 5, NO. 2, MAY 2013 Prediction of Stock Performance Using Analytical Techniques Carol Hargreaves Institute of Systems Science National University

Classification/Decision Trees (II)

Classification/Decision Trees (II) Department of Statistics The Pennsylvania State University Email: jiali@stat.psu.edu Right Sized Trees Let the expected misclassification rate of a tree T be R (T ).

Data Mining Framework for Direct Marketing: A Case Study of Bank Marketing

www.ijcsi.org 198 Data Mining Framework for Direct Marketing: A Case Study of Bank Marketing Lilian Sing oei 1 and Jiayang Wang 2 1 School of Information Science and Engineering, Central South University

!"!!"#\$\$%&'()*+\$(,%!"#\$%\$&'()*""%(+,'-*&./#-\$&'(-&(0*".\$#-\$1"(2&."3\$'45"

!"!!"#\$\$%&'()*+\$(,%!"#\$%\$&'()*""%(+,'-*&./#-\$&'(-&(0*".\$#-\$1"(2&."3\$'45"!"#"\$%&#'()*+',\$\$-.&#',/"-0%.12'32./4'5,5'6/%&)\$).2&'7./&)8'5,5'9/2%.%3%&8':")08';:

ANALYSIS OF FEATURE SELECTION WITH CLASSFICATION: BREAST CANCER DATASETS

ANALYSIS OF FEATURE SELECTION WITH CLASSFICATION: BREAST CANCER DATASETS Abstract D.Lavanya * Department of Computer Science, Sri Padmavathi Mahila University Tirupati, Andhra Pradesh, 517501, India lav_dlr@yahoo.com

TOWARDS SIMPLE, EASY TO UNDERSTAND, AN INTERACTIVE DECISION TREE ALGORITHM

TOWARDS SIMPLE, EASY TO UNDERSTAND, AN INTERACTIVE DECISION TREE ALGORITHM Thanh-Nghi Do College of Information Technology, Cantho University 1 Ly Tu Trong Street, Ninh Kieu District Cantho City, Vietnam

Experiments in Web Page Classification for Semantic Web

Experiments in Web Page Classification for Semantic Web Asad Satti, Nick Cercone, Vlado Kešelj Faculty of Computer Science, Dalhousie University E-mail: {rashid,nick,vlado}@cs.dal.ca Abstract We address

Numerical Algorithms Group

Title: Summary: Using the Component Approach to Craft Customized Data Mining Solutions One definition of data mining is the non-trivial extraction of implicit, previously unknown and potentially useful

Multiple Linear Regression in Data Mining

Multiple Linear Regression in Data Mining Contents 2.1. A Review of Multiple Linear Regression 2.2. Illustration of the Regression Process 2.3. Subset Selection in Linear Regression 1 2 Chap. 2 Multiple

ENSEMBLE DECISION TREE CLASSIFIER FOR BREAST CANCER DATA

ENSEMBLE DECISION TREE CLASSIFIER FOR BREAST CANCER DATA D.Lavanya 1 and Dr.K.Usha Rani 2 1 Research Scholar, Department of Computer Science, Sree Padmavathi Mahila Visvavidyalayam, Tirupati, Andhra Pradesh,

Chapter 12 Discovering New Knowledge Data Mining

Chapter 12 Discovering New Knowledge Data Mining Becerra-Fernandez, et al. -- Knowledge Management 1/e -- 2004 Prentice Hall Additional material 2007 Dekai Wu Chapter Objectives Introduce the student to

ASSIGNMENT 4 PREDICTIVE MODELING AND GAINS CHARTS

DATABASE MARKETING Fall 2015, max 24 credits Dead line 15.10. ASSIGNMENT 4 PREDICTIVE MODELING AND GAINS CHARTS PART A Gains chart with excel Prepare a gains chart from the data in \\work\courses\e\27\e20100\ass4b.xls.

Uplift Modeling in Direct Marketing

Paper Uplift Modeling in Direct Marketing Piotr Rzepakowski and Szymon Jaroszewicz National Institute of Telecommunications, Warsaw, Poland Abstract Marketing campaigns directed to randomly selected customers

Data Mining Algorithms Part 1. Dejan Sarka

Data Mining Algorithms Part 1 Dejan Sarka Join the conversation on Twitter: @DevWeek #DW2015 Instructor Bio Dejan Sarka (dsarka@solidq.com) 30 years of experience SQL Server MVP, MCT, 13 books 7+ courses

Mining Direct Marketing Data by Ensembles of Weak Learners and Rough Set Methods

Mining Direct Marketing Data by Ensembles of Weak Learners and Rough Set Methods Jerzy B laszczyński 1, Krzysztof Dembczyński 1, Wojciech Kot lowski 1, and Mariusz Paw lowski 2 1 Institute of Computing

Identifying SPAM with Predictive Models

Identifying SPAM with Predictive Models Dan Steinberg and Mikhaylo Golovnya Salford Systems 1 Introduction The ECML-PKDD 2006 Discovery Challenge posed a topical problem for predictive modelers: how to

Question 2 Naïve Bayes (16 points)

Question 2 Naïve Bayes (16 points) About 2/3 of your email is spam so you downloaded an open source spam filter based on word occurrences that uses the Naive Bayes classifier. Assume you collected the

EMPIRICAL STUDY ON SELECTION OF TEAM MEMBERS FOR SOFTWARE PROJECTS DATA MINING APPROACH

EMPIRICAL STUDY ON SELECTION OF TEAM MEMBERS FOR SOFTWARE PROJECTS DATA MINING APPROACH SANGITA GUPTA 1, SUMA. V. 2 1 Jain University, Bangalore 2 Dayanada Sagar Institute, Bangalore, India Abstract- One

The Optimality of Naive Bayes

The Optimality of Naive Bayes Harry Zhang Faculty of Computer Science University of New Brunswick Fredericton, New Brunswick, Canada email: hzhang@unbca E3B 5A3 Abstract Naive Bayes is one of the most

Classification algorithm in Data mining: An Overview

Classification algorithm in Data mining: An Overview S.Neelamegam #1, Dr.E.Ramaraj *2 #1 M.phil Scholar, Department of Computer Science and Engineering, Alagappa University, Karaikudi. *2 Professor, Department

Decision Trees. Andrew W. Moore Professor School of Computer Science Carnegie Mellon University. www.cs.cmu.edu/~awm awm@cs.cmu.

Decision Trees Andrew W. Moore Professor School of Computer Science Carnegie Mellon University www.cs.cmu.edu/~awm awm@cs.cmu.edu 42-268-7599 Copyright Andrew W. Moore Slide Decision Trees Decision trees

SYSM 6304: Risk and Decision Analysis Lecture 5: Methods of Risk Analysis

SYSM 6304: Risk and Decision Analysis Lecture 5: Methods of Risk Analysis M. Vidyasagar Cecil & Ida Green Chair The University of Texas at Dallas Email: M.Vidyasagar@utdallas.edu October 17, 2015 Outline

Predicting earning potential on Adult Dataset

MSc in Computing, Business Intelligence and Data Mining stream. Business Intelligence and Data Mining Applications Project Report. Predicting earning potential on Adult Dataset Submitted by: xxxxxxx Supervisor:

DATA MINING METHODS WITH TREES

DATA MINING METHODS WITH TREES Marta Žambochová 1. Introduction The contemporary world is characterized by the explosion of an enormous volume of data deposited into databases. Sharp competition contributes

Insurance Analytics - analýza dat a prediktivní modelování v pojišťovnictví. Pavel Kříž. Seminář z aktuárských věd MFF 4.

Insurance Analytics - analýza dat a prediktivní modelování v pojišťovnictví Pavel Kříž Seminář z aktuárských věd MFF 4. dubna 2014 Summary 1. Application areas of Insurance Analytics 2. Insurance Analytics

Getting Even More Out of Ensemble Selection

Getting Even More Out of Ensemble Selection Quan Sun Department of Computer Science The University of Waikato Hamilton, New Zealand qs12@cs.waikato.ac.nz ABSTRACT Ensemble Selection uses forward stepwise

6.042/18.062J Mathematics for Computer Science. Expected Value I

6.42/8.62J Mathematics for Computer Science Srini Devadas and Eric Lehman May 3, 25 Lecture otes Expected Value I The expectation or expected value of a random variable is a single number that tells you

131-1. Adding New Level in KDD to Make the Web Usage Mining More Efficient. Abstract. 1. Introduction [1]. 1/10

1/10 131-1 Adding New Level in KDD to Make the Web Usage Mining More Efficient Mohammad Ala a AL_Hamami PHD Student, Lecturer m_ah_1@yahoocom Soukaena Hassan Hashem PHD Student, Lecturer soukaena_hassan@yahoocom

Tree Ensembles: The Power of Post- Processing. December 2012 Dan Steinberg Mikhail Golovnya Salford Systems

Tree Ensembles: The Power of Post- Processing December 2012 Dan Steinberg Mikhail Golovnya Salford Systems Course Outline Salford Systems quick overview Treenet an ensemble of boosted trees GPS modern

Feature vs. Classifier Fusion for Predictive Data Mining a Case Study in Pesticide Classification

Feature vs. Classifier Fusion for Predictive Data Mining a Case Study in Pesticide Classification Henrik Boström School of Humanities and Informatics University of Skövde P.O. Box 408, SE-541 28 Skövde

Invited Applications Paper

Invited Applications Paper - - Thore Graepel Joaquin Quiñonero Candela Thomas Borchert Ralf Herbrich Microsoft Research Ltd., 7 J J Thomson Avenue, Cambridge CB3 0FB, UK THOREG@MICROSOFT.COM JOAQUINC@MICROSOFT.COM

Recognizing Informed Option Trading

Recognizing Informed Option Trading Alex Bain, Prabal Tiwaree, Kari Okamoto 1 Abstract While equity (stock) markets are generally efficient in discounting public information into stock prices, we believe

11.3 BREAK-EVEN ANALYSIS. Fixed and Variable Costs

385 356 PART FOUR Capital Budgeting a large number of NPV estimates that we summarize by calculating the average value and some measure of how spread out the different possibilities are. For example, it

Random forest algorithm in big data environment

Random forest algorithm in big data environment Yingchun Liu * School of Economics and Management, Beihang University, Beijing 100191, China Received 1 September 2014, www.cmnt.lv Abstract Random forest

The Data Mining Process

Sequence for Determining Necessary Data. Wrong: Catalog everything you have, and decide what data is important. Right: Work backward from the solution, define the problem explicitly, and map out the data

Data Mining Practical Machine Learning Tools and Techniques

Ensemble learning Data Mining Practical Machine Learning Tools and Techniques Slides for Chapter 8 of Data Mining by I. H. Witten, E. Frank and M. A. Hall Combining multiple models Bagging The basic idea

STATISTICA. Financial Institutions. Case Study: Credit Scoring. and

Financial Institutions and STATISTICA Case Study: Credit Scoring STATISTICA Solutions for Business Intelligence, Data Mining, Quality Control, and Web-based Analytics Table of Contents INTRODUCTION: WHAT

So today we shall continue our discussion on the search engines and web crawlers. (Refer Slide Time: 01:02)

Internet Technology Prof. Indranil Sengupta Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur Lecture No #39 Search Engines and Web Crawler :: Part 2 So today we

A Non-Linear Schema Theorem for Genetic Algorithms

A Non-Linear Schema Theorem for Genetic Algorithms William A Greene Computer Science Department University of New Orleans New Orleans, LA 70148 bill@csunoedu 504-280-6755 Abstract We generalize Holland

A Content based Spam Filtering Using Optical Back Propagation Technique

A Content based Spam Filtering Using Optical Back Propagation Technique Sarab M. Hameed 1, Noor Alhuda J. Mohammed 2 Department of Computer Science, College of Science, University of Baghdad - Iraq ABSTRACT

Dependency Networks for Collaborative Filtering and Data Visualization

264 UNCERTAINTY IN ARTIFICIAL INTELLIGENCE PROCEEDINGS 2000 Dependency Networks for Collaborative Filtering and Data Visualization David Heckerman, David Maxwell Chickering, Christopher Meek, Robert Rounthwaite,

Efficiency in Software Development Projects

Efficiency in Software Development Projects Aneesh Chinubhai Dharmsinh Desai University aneeshchinubhai@gmail.com Abstract A number of different factors are thought to influence the efficiency of the software

Data Mining Application in Direct Marketing: Identifying Hot Prospects for Banking Product

Data Mining Application in Direct Marketing: Identifying Hot Prospects for Banking Product Sagarika Prusty Web Data Mining (ECT 584),Spring 2013 DePaul University,Chicago sagarikaprusty@gmail.com Keywords:

Artificial Neural Network, Decision Tree and Statistical Techniques Applied for Designing and Developing E-mail Classifier

International Journal of Recent Technology and Engineering (IJRTE) ISSN: 2277-3878, Volume-1, Issue-6, January 2013 Artificial Neural Network, Decision Tree and Statistical Techniques Applied for Designing

Using Adaptive Random Trees (ART) for optimal scorecard segmentation

A FAIR ISAAC WHITE PAPER Using Adaptive Random Trees (ART) for optimal scorecard segmentation By Chris Ralph Analytic Science Director April 2006 Summary Segmented systems of models are widely recognized

Knowledge Discovery and Data Mining

Knowledge Discovery and Data Mining Lecture 15 - ROC, AUC & Lift Tom Kelsey School of Computer Science University of St Andrews http://tom.home.cs.st-andrews.ac.uk twk@st-andrews.ac.uk Tom Kelsey ID5059-17-AUC

Predicting required bandwidth for educational institutes using prediction techniques in data mining (Case Study: Qom Payame Noor University)

260 IJCSNS International Journal of Computer Science and Network Security, VOL.11 No.6, June 2011 Predicting required bandwidth for educational institutes using prediction techniques in data mining (Case

An Introduction to Data Mining. Big Data World. Related Fields and Disciplines. What is Data Mining? 2/12/2015

An Introduction to Data Mining for Wind Power Management Spring 2015 Big Data World Every minute: Google receives over 4 million search queries Facebook users share almost 2.5 million pieces of content

Data Mining in CRM & Direct Marketing. Jun Du The University of Western Ontario jdu43@uwo.ca

Data Mining in CRM & Direct Marketing Jun Du The University of Western Ontario jdu43@uwo.ca Outline Why CRM & Marketing Goals in CRM & Marketing Models and Methodologies Case Study: Response Model Case

Machine Learning and Data Mining. Fundamentals, robotics, recognition

Machine Learning and Data Mining Fundamentals, robotics, recognition Machine Learning, Data Mining, Knowledge Discovery in Data Bases Their mutual relations Data Mining, Knowledge Discovery in Databases,

Using Control Groups to Target on Predicted Lift:

Using Control Groups to Target on Predicted Lift: Building and Assessing Uplift Models Nicholas J. Radcliffe Portrait Software The Smith Centre The Fairmile Henley-on-Thames Oxfordshire RG9 6AB UK Department

The Basics of Graphical Models

The Basics of Graphical Models David M. Blei Columbia University October 3, 2015 Introduction These notes follow Chapter 2 of An Introduction to Probabilistic Graphical Models by Michael Jordan. Many figures

Decision Trees from large Databases: SLIQ

Decision Trees from large Databases: SLIQ C4.5 often iterates over the training set How often? If the training set does not fit into main memory, swapping makes C4.5 unpractical! SLIQ: Sort the values

A Learning Algorithm For Neural Network Ensembles

A Learning Algorithm For Neural Network Ensembles H. D. Navone, P. M. Granitto, P. F. Verdes and H. A. Ceccatto Instituto de Física Rosario (CONICET-UNR) Blvd. 27 de Febrero 210 Bis, 2000 Rosario. República

A Basic Guide to Modeling Techniques for All Direct Marketing Challenges

A Basic Guide to Modeling Techniques for All Direct Marketing Challenges Allison Cornia Database Marketing Manager Microsoft Corporation C. Olivia Rud Executive Vice President Data Square, LLC Overview

Visualizing class probability estimators

Visualizing class probability estimators Eibe Frank and Mark Hall Department of Computer Science University of Waikato Hamilton, New Zealand {eibe, mhall}@cs.waikato.ac.nz Abstract. Inducing classifiers

Nine Common Types of Data Mining Techniques Used in Predictive Analytics

1 Nine Common Types of Data Mining Techniques Used in Predictive Analytics By Laura Patterson, President, VisionEdge Marketing Predictive analytics enable you to develop mathematical models to help better

Smart Grid Data Analytics for Decision Support

1 Smart Grid Data Analytics for Decision Support Prakash Ranganathan, Department of Electrical Engineering, University of North Dakota, Grand Forks, ND, USA Prakash.Ranganathan@engr.und.edu, 701-777-4431

BIDM Project. Predicting the contract type for IT/ITES outsourcing contracts

BIDM Project Predicting the contract type for IT/ITES outsourcing contracts N a n d i n i G o v i n d a r a j a n ( 6 1 2 1 0 5 5 6 ) The authors believe that data modelling can be used to predict if an

Converting a Number from Decimal to Binary

Converting a Number from Decimal to Binary Convert nonnegative integer in decimal format (base 10) into equivalent binary number (base 2) Rightmost bit of x Remainder of x after division by two Recursive

Chapter 20: Data Analysis

Chapter 20: Data Analysis Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 20: Data Analysis Decision Support Systems Data Warehousing Data Mining Classification

not possible or was possible at a high cost for collecting the data.

Data Mining and Knowledge Discovery Generating knowledge from data Knowledge Discovery Data Mining White Paper Organizations collect a vast amount of data in the process of carrying out their day-to-day

Philosophies and Advances in Scaling Mining Algorithms to Large Databases

Philosophies and Advances in Scaling Mining Algorithms to Large Databases Paul Bradley Apollo Data Technologies paul@apollodatatech.com Raghu Ramakrishnan UW-Madison raghu@cs.wisc.edu Johannes Gehrke Cornell

Large-Sample Learning of Bayesian Networks is NP-Hard

Journal of Machine Learning Research 5 (2004) 1287 1330 Submitted 3/04; Published 10/04 Large-Sample Learning of Bayesian Networks is NP-Hard David Maxwell Chickering David Heckerman Christopher Meek Microsoft

Data Mining Applications in Higher Education

Executive report Data Mining Applications in Higher Education Jing Luan, PhD Chief Planning and Research Officer, Cabrillo College Founder, Knowledge Discovery Laboratories Table of contents Introduction..............................................................2

CHAPTER 2 Estimating Probabilities

CHAPTER 2 Estimating Probabilities Machine Learning Copyright c 2016. Tom M. Mitchell. All rights reserved. *DRAFT OF January 24, 2016* *PLEASE DO NOT DISTRIBUTE WITHOUT AUTHOR S PERMISSION* This is a

On the effect of data set size on bias and variance in classification learning

On the effect of data set size on bias and variance in classification learning Abstract Damien Brain Geoffrey I Webb School of Computing and Mathematics Deakin University Geelong Vic 3217 With the advent

Machine Learning in Spam Filtering

Machine Learning in Spam Filtering A Crash Course in ML Konstantin Tretyakov kt@ut.ee Institute of Computer Science, University of Tartu Overview Spam is Evil ML for Spam Filtering: General Idea, Problems.

Methods for Interaction Detection in Predictive Modeling Using SAS Doug Thompson, PhD, Blue Cross Blue Shield of IL, NM, OK & TX, Chicago, IL

Paper SA01-2012 Methods for Interaction Detection in Predictive Modeling Using SAS Doug Thompson, PhD, Blue Cross Blue Shield of IL, NM, OK & TX, Chicago, IL ABSTRACT Analysts typically consider combinations

A Study Of Bagging And Boosting Approaches To Develop Meta-Classifier

A Study Of Bagging And Boosting Approaches To Develop Meta-Classifier G.T. Prasanna Kumari Associate Professor, Dept of Computer Science and Engineering, Gokula Krishna College of Engg, Sullurpet-524121,

Business Intelligence. Tutorial for Rapid Miner (Advanced Decision Tree and CRISP-DM Model with an example of Market Segmentation*)

Business Intelligence Professor Chen NAME: Due Date: Tutorial for Rapid Miner (Advanced Decision Tree and CRISP-DM Model with an example of Market Segmentation*) Tutorial Summary Objective: Richard would

AUTOMATION OF ENERGY DEMAND FORECASTING. Sanzad Siddique, B.S.

AUTOMATION OF ENERGY DEMAND FORECASTING by Sanzad Siddique, B.S. A Thesis submitted to the Faculty of the Graduate School, Marquette University, in Partial Fulfillment of the Requirements for the Degree

Keywords Data mining, Classification Algorithm, Decision tree, J48, Random forest, Random tree, LMT, WEKA 3.7. Fig.1. Data mining techniques.

International Journal of Emerging Research in Management &Technology Research Article October 2015 Comparative Study of Various Decision Tree Classification Algorithm Using WEKA Purva Sewaiwar, Kamal Kant

Evolutionary Detection of Rules for Text Categorization. Application to Spam Filtering

Advances in Intelligent Systems and Technologies Proceedings ECIT2004 - Third European Conference on Intelligent Systems and Technologies Iasi, Romania, July 21-23, 2004 Evolutionary Detection of Rules

The Math. P (x) = 5! = 1 2 3 4 5 = 120.

The Math Suppose there are n experiments, and the probability that someone gets the right answer on any given experiment is p. So in the first example above, n = 5 and p = 0.2. Let X be the number of correct

Data Mining Classification: Decision Trees

Data Mining Classification: Decision Trees Classification Decision Trees: what they are and how they work Hunt s (TDIDT) algorithm How to select the best split How to handle Inconsistent data Continuous

Overview. Evaluation Connectionist and Statistical Language Processing. Test and Validation Set. Training and Test Set

Overview Evaluation Connectionist and Statistical Language Processing Frank Keller keller@coli.uni-sb.de Computerlinguistik Universität des Saarlandes training set, validation set, test set holdout, stratification

Data Mining Applications in Fund Raising

Data Mining Applications in Fund Raising Nafisseh Heiat Data mining tools make it possible to apply mathematical models to the historical data to manipulate and discover new information. In this study,

TRANSACTIONAL DATA MINING AT LLOYDS BANKING GROUP

TRANSACTIONAL DATA MINING AT LLOYDS BANKING GROUP Csaba Főző csaba.fozo@lloydsbanking.com 15 October 2015 CONTENTS Introduction 04 Random Forest Methodology 06 Transactional Data Mining Project 17 Conclusions

DATA MINING TECHNIQUES AND APPLICATIONS

DATA MINING TECHNIQUES AND APPLICATIONS Mrs. Bharati M. Ramageri, Lecturer Modern Institute of Information Technology and Research, Department of Computer Application, Yamunanagar, Nigdi Pune, Maharashtra,

An Evaluation of Self-adjusting Binary Search Tree Techniques

SOFTWARE PRACTICE AND EXPERIENCE, VOL. 23(4), 369 382 (APRIL 1993) An Evaluation of Self-adjusting Binary Search Tree Techniques jim bell and gopal gupta Department of Computer Science, James Cook University,

Role of Customer Response Models in Customer Solicitation Center s Direct Marketing Campaign

Role of Customer Response Models in Customer Solicitation Center s Direct Marketing Campaign Arun K Mandapaka, Amit Singh Kushwah, Dr.Goutam Chakraborty Oklahoma State University, OK, USA ABSTRACT Direct