Predicting Car Purchase Intent Using Data Mining Approach
2011 Eighth International Conference on Fuzzy Systems and Knowledge Discovery (FSKD)

Yap Bee Wah (1), Nor Huwaina Ismail (2), Simon Fong (3)
(1, 2) Faculty of Computer and Mathematical Sciences, Universiti Teknologi MARA, Shah Alam, Selangor, Malaysia
(3) Department of Computer and Information Science, University of Macau, China

Abstract

Data mining involves the exploration and analysis of large databases to find patterns and valuable information that can aid decision making. This paper illustrates the use of a data mining approach to build predictive models of a customer's intent to purchase a car after booking it. Records show that customers who have booked a car have a tendency to cancel their booking. Three data mining predictive models, Logistic Regression (LR), Decision Tree (DT) and Neural Network (NN), were used to model the intent of purchase (IOP). The sample for this study has 1935 cases. The data was partitioned into training (70%) and validation (30%) samples. The three predictive models were compared on validation accuracy rate, sensitivity and specificity. Results show that the validation accuracy rates of all three models are quite similar (LR=91.79%, CART=91.17%, NN=91.17%), while LR has the highest sensitivity (LR=87.77%, CART=85.47%, NN=85.89%). Important customer characteristics were also revealed by these models.

Keywords: logistic regression, decision tree, data mining, classification, predictive modeling

I. INTRODUCTION

Data mining is one of the stages in the overall process of Knowledge Discovery in Databases (KDD). With the emergence of data mining software, data mining is gaining popularity among banks, telecommunication companies, insurance companies, educational institutions and business organizations as a way to extract valuable information from data that can aid decision making.
Such organizations can use data mining to find undiscovered patterns and relationships in large databases [1-5]. The goal of data mining is to find patterns in historical data that shed light on customer purchase behaviour, needs and preferences. Such information can help organizations improve their business performance and practices, for example in target marketing, sales and customer management. The different stages in the data mining process have been described in [2], [3] and [5].

The kinds of information that can be discovered depend on the data mining objectives and the techniques employed. Data mining techniques can be grouped into three categories: classification and prediction, cluster analysis and association analysis. Classification and prediction techniques fall under predictive modeling. Predictive modeling is also known as supervised classification or supervised learning because the prediction model is constructed from data in which the target or response variable is known. Linear Discriminant Analysis and logistic regression are two popular statistical methods for constructing predictive models [6]. However, with the emergence of data mining software such as SAS Enterprise Miner and SPSS Clementine, not only the classical methods but also newer predictive modeling and classification techniques such as decision trees, neural networks, support vector machines (SVM) and k-nearest neighbours are available for practical application to real data from various disciplines.

Studies in different subject areas have compared the predictive performance of these techniques. For example, neural network models were compared with conventional techniques such as discriminant analysis, probit analysis and logistic regression in evaluating credit risk in Egyptian banks [7].
Some of these data mining classification algorithms were compared in predicting breast cancer survival [8], while [9] used an integrated data mining methodology to predict graft survival for heart-lung transplantation patients. Reference [10] investigated the performance of the SVM approach in credit rating prediction in comparison with back-propagation neural networks, while [11] reported that, compared with neural networks, genetic programming and decision tree classifiers, the SVM classifier achieved identical classification accuracy with relatively few input variables. The performance of these data mining techniques will continue to be compared in different areas of application.

The objective of this paper is to develop a predictive model of a customer's intent of purchase after booking a car. This study considered and compared the predictive ability of Logistic Regression (LR), decision tree (C5.0, CHAID and CART) and Neural Network (NN) models.

This paper is organized as follows. In Section 2, we briefly review the applications of predictive models and the selection of variables. Section 3 presents the methodology for constructing the models. The results are discussed in Section 4. Finally, some concluding remarks are given in the last section.
II. METHODOLOGY

A. Logistic Regression

Logistic regression is a popular non-linear statistical model that is widely applied in many fields. In contrast to the multiple regression model, the logistic regression model handles a binary or polytomous dependent variable. For a binary dependent variable, the event of interest is coded as 1 and the non-event as 0. The logistic regression model is written as:

$$\log\left(\frac{P(Y=1)}{1-P(Y=1)}\right) = \alpha + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k \qquad (1)$$

Equation (1) can be solved to obtain

$$P(Y=1) = \frac{1}{1+e^{-z}} \qquad (2)$$

where $z = \alpha + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k$.

The logistic regression model enables us to calculate the probability of the event Y=1 occurring for each case. The predictors $X_1, \ldots, X_k$ can be a mixture of continuous and categorical variables.

B. Decision Tree

A decision tree model consists of a set of rules for dividing a large collection of observations into smaller homogeneous groups with respect to a particular target variable. The target variable is usually categorical, and the decision tree model is used either to calculate the probability that a given record belongs to each of the target categories, or to classify the record by assigning it to the most likely category. Decision trees can also be used for a continuous target variable, although multiple linear regression models are more suitable for such a variable. Given a target variable and a set of explanatory variables, decision tree algorithms automatically determine which variables are most important, and subsequently sort the observations into the correct output category [12].

The common decision tree algorithms in data mining software are CHAID (Chi-Square Automatic Interaction Detector), CART (Classification and Regression Tree) and C5. The CART algorithm uses the Gini index as the splitting criterion for a categorical dependent variable, while C5 uses entropy. CHAID uses the chi-square test as the splitting criterion.
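As a minimal sketch of two ideas from this section, the Python snippet below evaluates equation (2) for one case and computes the Gini index and entropy of a tree node. It is an illustration only, not the implementation used by SPSS Clementine; the function names and the sample node are hypothetical, and the exact impurity variants used by CART and C5 in that software may differ in detail.

```python
import math
from collections import Counter


def logistic_probability(alpha, betas, xs):
    """Equation (2): P(Y=1) = 1 / (1 + exp(-z)), z = alpha + sum(beta_i * x_i)."""
    z = alpha + sum(b * x for b, x in zip(betas, xs))
    return 1.0 / (1.0 + math.exp(-z))


def gini(labels):
    """Gini index of a node: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def entropy(labels):
    """Entropy of a node in bits: -sum(p * log2(p)) over the classes present."""
    n = len(labels)
    return -sum((count / n) * math.log2(count / n)
                for count in Counter(labels).values())


# Hypothetical node with an even mix of the two target classes.
node = ["cancel"] * 5 + ["purchase"] * 5

print(logistic_probability(0.0, [1.0], [0.0]))  # z = 0, so the probability is 0.5
print(gini(node))                               # 0.5, the maximum for two classes
print(entropy(node))                            # 1.0 bit, the maximum for two classes
```

Both impurity measures are largest for an evenly mixed node and fall to zero for a pure node, which is why a split that lowers them produces more homogeneous groups.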
These algorithms produce a tree diagram and a set of decision rules from which important information can be extracted.

C. Artificial Neural Networks

Artificial Neural Networks (ANNs) are seen as an attractive alternative to traditional statistical methods. They are modeled after the human brain, which can be perceived as a highly connected network of neurons (called nodes in neural network terminology). Each node in a layer receives inputs from at least one node in the previous layer, combines the inputs and generates an output to at least one node in the next layer. Generally, the independent variables comprise the input layer and the dependent variable comprises the output layer. Between the input and output layers, one or more hidden layers of nodes may exist. The multilayer perceptron (MLP) is the most widely used neural network model in data analysis. ANNs can identify and learn correlated patterns between input data sets and corresponding target values. However, ANNs have been criticized for their black-box approach and the difficulty of interpreting them. Nevertheless, they provide an alternative model to be compared with other classification techniques. After training, ANNs can be used to predict the outcome for new independent input data ([1], [4], [13], [14]).

D. Literature on Car Purchase

Building a predictive model requires historical data on customers who previously purchased a car or cancelled a car booking. Reference [15] conducted a study of one thousand recent buyers of a new car. Among them, seventeen percent considered only the brand of their previous car before purchasing another car.
The factors that influence the consideration of a single brand are satisfaction with the previous car and dealer, socio-demographic variables (being old, with a lower education and lower income), low perceived risk, and a number of product-specific elements (owning only one car, not owning a foreign car, staying in the same product segment, having driven only 30,000 kilometers with the previous car and having owned ten cars in the past).

In predicting purchase behavior from stated intentions, [16] proposed a unified model and applied it to a survey of 2000 randomly selected households. For the automobile data, the purchase intention is defined as 1 if the consumer intends to purchase (or actually purchases) an automobile within 12 months, and as 0 if the consumer does not intend to purchase (or does not actually purchase) an automobile within 12 months. They considered variables such as the occupation and education level of the household head, type of residence, income, and the number and age of cars currently in the household.

According to [17], current owners of cars are more likely to repurchase the brands they currently own when they are asked intent questions. In addition, the purchase behavior of current car owners is more consistent with their brand attitudes. First-time car buyers, on the other hand, are more likely to purchase brands that have large market shares.

Reference [18] presented a model which produces simultaneous forecasts of car holding, new car purchase and scrappage. All are sensitive to changes in income and prices or car costs. The basic theoretical foundation of their model is the assumption that a potential car holder is a person between 18 and 75 years old, where a car holder is a person holding a registered car. Car holding is largely determined by income, people's expectations and car cost components. The evolution of car holding is sensitive to economic circumstances.
Nevertheless, new car purchase is much more sensitive to economic circumstances than car holding is. Affordability, rather than attitudes and purchase intentions, is also an important predictor of purchase. That is why income is an important variable in economics and is examined extensively. Total family income (TFI) is used to segment markets, profile consumers, and provide explanations for changes in purchasing patterns [19].

Reference [20] examined the impact of gender on car buyer satisfaction and found that the attitudes of male and female consumers toward car purchasing showed a clear difference. The price of a car was shown to be important for both male and female buyers, but for different reasons. For male buyers, paying a higher price for a car means that they can have higher expectations and impress others more, while for female buyers a higher price is more important in assuring them that their car will perform as it should. Women are becoming an increasing force in the car buyer market. Their pattern of car buying differs from that of men. Women tend to buy lower-priced cars, and are strongest in the compact and subcompact segments. Hence, many car companies aim some of their advertising specifically at women.

In a forecasting model of car ownership in Sweden, income is reported as an important predictor of car ownership. Income is growing faster among women than among men, with 2 per cent growth in income for women and constant income for men. Male car ownership is forecast to grow by only 3 per cent to the year 2010, while female car ownership is forecast to grow by 70 per cent. Thus, the author suggested that female car ownership is now the strategic factor for the future development of motorization [21].

In a study on households' intention to replace the old car, the replacement intention has a positive relationship with the quality of the new car and a negative relationship with the perception of the old car. In other words, a household's intent to replace its old car is based on the total number of miles driven, the age of the car and the anticipated number of repairs [22].

E. Selection

Based on the literature review and the availability of data from the car dealer company, the variables in the dataset are described in Table 1.

TABLE I. DESCRIPTION OF VARIABLES

Name | Role | Type | Description
Intent of Purchase (IOP) | Target | Binary | 0: Purchase, 1: Cancel
age | Input | Continuous | Age in years
Income group | Input | Categorical | Monthly income (RM), six bands: <2000, 2000-4000, 4000-6000, 6000-8000, 8000-10,000, >10,000
Car status | Input | Categorical | Status of this car: 1: Additional car, 2: Replacement car, 3: First car
gender | Input | Binary | Applicant is 1: Male, 2: Female
LOU | Input | Categorical | Letter of undertaking: 1: No, 2: Yes
Car_Price | Input | Categorical | Price of car (RM), in bands up to >100,000
Book_fee | Input | Categorical | Booking fee (RM), four bands: <200, 200-300, 300-500, 500-1000
Down_pay | Input | Categorical | Down payment (RM), six bands from 1: 0 to 6: >100,000
Loan_amt | Input | Binary | Loan amount (RM), three bands
Model type | Input | Categorical | Twelve model types

F. Modeling using Clementine

The sample data was first partitioned into a training sample (70%) and a validation sample (30%). The training sample is used to build the models, while the validation sample is used to validate them. Fig. 1 depicts the data modeling process using SPSS Clementine.

Fig. 1. Data Mining Process Flow Diagram

The pentagon-shaped nodes show the construction of the models using logistic regression, decision trees (CART) and neural network. The diamond-shaped nodes show the outputs of the respective models. For the logistic regression model, four selection methods (ENTER, STEPWISE, FORWARDS, BACKWARDS) were compared using the Analysis and Evaluation nodes. For decision trees, the C5.0, CHAID and CART models were generated and compared. Then the three predictive models (stepwise logistic regression, CART and neural network) are connected to the Analysis node, which computes the accuracy rates, and to the Evaluation node, which produces the lift charts.
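The 70/30 partition described above can be sketched in plain Python. This is a hedged illustration, not the Clementine partition node: the function name `partition`, its parameters and the fixed seed are assumptions, and integer record identifiers stand in for the 1935 cases of the study.

```python
import random


def partition(records, train_percent=70, seed=42):
    """Shuffle the records and split them into training and validation samples.

    Integer arithmetic is used for the cut point so the split size is exact.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    cut = len(shuffled) * train_percent // 100
    return shuffled[:cut], shuffled[cut:]


# Hypothetical stand-in for the 1935 cases: one identifier per record.
cases = list(range(1935))
train, valid = partition(cases)

print(len(train), len(valid))  # 1354 581
```

Every record lands in exactly one of the two samples, so the validation sample is never seen while the models are being built. A tool that assigns each record to a sample at random with probability 0.7 would give only approximately these sizes.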
III. RESULTS

This section presents the results of the predictive models.

A. Logistic Regression Results

For the Enter method, all variables are significant predictors except for gender. The Forward, Backward and Stepwise methods selected the same significant predictors. Table 3 summarizes the logistic regression results for the Enter and Stepwise selection methods. Based on the results in Table 2, the Enter and Stepwise models achieved the same validation accuracy rate (91.79%). However, the Stepwise model has a higher validation sensitivity (87.77%).

TABLE II. ACCURACY RATE
Columns: Model, Sample, Accuracy rate, Sensitivity, Specificity
Rows: Enter and Stepwise, each for the training and validation samples

The results in Table 3 show that customers without an LOU, those with low income (< RM2000) and those who paid a low booking fee are more likely to cancel their booking. Cancellation is also more likely for those who are purchasing a first car. Further cross-tabulation results revealed that cancellation was higher for models 9 and 4, while model 12 has the lowest cancellation rate.

TABLE III. STEPWISE LOGISTIC REGRESSION RESULTS

Variable  B (Enter)  B (Stepwise)
Constant  **  **
Age  -.046**  -.046**
Gender = F  .12
LOU = N  7.162**  7.189**
Booking_Fee =  **  **
Booking_Fee =  **  **
Booking_Fee =
Car_Price =
Car_Price =
Car_Price =
Income_Group =  *  3.449*
Income_Group =  **  3.444**
Income_Group =  **  2.343**
Income_Group =  **  1.772**
Income_Group =  **  1.791**
Model_Type =  **  3.043**
Model_Type =  *  4.332**
Model_Type =
Model_Type =  **  5.959**
Model_Type =  **  4.929**
Model_Type =
Model_Type =  **  2.748**
Model_Type =  **  5.691**
Model_Type =  **  4.329**
Model_Type =  **  6.283**
Model_Type =  **
Car_Status =  **  **
Car_Status =
Chi-Square  **  **
-2LL
Nagelkerke R-Sq

B. Decision Tree Model Results

Decision trees are easy to understand and can easily be converted to a set of rules. Moreover, they can classify both categorical and numerical data and require no a priori assumptions about the data.
Because of these advantages, the decision tree approach is extensively used for both classification and prediction. The CART model finds four variables to be influential on the intent of purchase (LOU, booking fee, model type and car status). The decision tree rules are listed in Table 4, while Fig. 2 shows the CART model.

TABLE IV. CART RULES

CANCEL
Replacement car. Car model: 2, 4, 6, 8, 9, 10 or 11.
Replacement car. Car model: 1, 3, 5, 7 or 12. Booking fees are RM200-RM300, RM300-RM500 or RM500-RM1000.
First car or additional car. Income groups are <RM2000, RM2000-RM4000 or RM4000-RM6000.

PURCHASE
Customers have a letter of undertaking (LOU).
Replacement car. Car model: 1, 3, 5, 7 or 12. Booking fee is <RM200.
First car or additional car. Income groups are RM6000-RM8000, RM8000-RM10,000 or >RM10,000. Ages of customers are more than 43 years old. Car model: 1, 5, 11 or 12.

Table 5 displays the sensitivity, specificity and classification accuracy for each decision tree model. Sensitivity is the true positive rate (the percentage of customers who cancelled their booking that were predicted correctly), while specificity is the true negative rate (the percentage of customers who purchased that were predicted correctly). The performance of all three models is quite similar. CART produces simple rules and hence was chosen for comparison with the LR and NN models.
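The three rates defined above can be computed from the actual and predicted classes as in the following sketch. The data are invented purely for illustration and are not the study's results; the function name `classification_rates` is hypothetical, and "cancel" is treated as the positive class, following the definition of sensitivity given above.

```python
def classification_rates(actual, predicted, positive="cancel"):
    """Return accuracy, sensitivity (true positive rate) and specificity
    (true negative rate) for a two-class problem."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    tn = sum(1 for a, p in pairs if a != positive and p != positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Invented example: 4 cancellations and 6 purchases, with one error in each class.
actual = ["cancel"] * 4 + ["purchase"] * 6
predicted = ["cancel", "cancel", "cancel", "purchase"] + ["purchase"] * 5 + ["cancel"]

rates = classification_rates(actual, predicted)
print(rates)  # accuracy 0.8, sensitivity 0.75, specificity 5/6
```

Reporting sensitivity and specificity alongside accuracy matters when one class is rarer than the other, because a model can reach a high accuracy rate while still missing many cancellations.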
TABLE V. ACCURACY RATE, SENSITIVITY AND SPECIFICITY
Columns: Model, Sample, Accuracy rate, Sensitivity, Specificity
Rows: C5.0, CHAID and CART, each for the training and validation samples

C. Neural Network Model

The neural network (NN) model has 34 neurons in the input layer, 3 neurons in the hidden layer and 2 neurons in the output layer. Table 6 shows the importance of the input variables in descending order. The five most important input variables, in descending order of importance, are: letter of undertaking, income group, model type, car status and car price. The estimated accuracy rate of the neural network model is 90.79%. This is based on the correct classification rate in the training sample.

TABLE VI. RELATIVE IMPORTANCE OF INPUT VARIABLES
Input variables in descending order of importance value: Letter of Undertaking, Income Group, Model Type, Car Status, Car Price (0.07), Booking Fee, Age, Gender

D. Model Comparisons

The LR, CART and NN models were compared to determine the best model. The accuracy rates for the training and validation samples are given in Table 7. The predictive accuracy of all three models is quite comparable, with the logistic regression model having a slightly higher sensitivity.

TABLE VII. ACCURACY RATE
Columns: Model, Sample, Accuracy rate, Sensitivity, Specificity
Rows: Logistic Regression, CART and Neural Network, each for the training and validation samples

IV. CONCLUSION

There has been rapid growth of data mining in business applications and in social and medical research. Logistic regression is the most popular statistical model for predicting the probability of an event happening. With the emergence of data mining, non-traditional statistical methods such as neural networks, support vector machines and decision trees are gaining popularity in the search for a good predictive model. Data mining usually involves modeling large volumes of data, and the focus is on the practical importance of the information or knowledge gained from the models.
This study illustrated the construction and evaluation of three predictive models (logistic regression, decision tree and neural network) to predict the intent of purchase of a car. The results revealed that no model clearly outperforms the others, but important customer characteristics were obtained from the logistic regression and CART models. Work is in progress to cover other classification techniques such as SVM and Bayesian classification. The performance of predictive models depends on the data structure, data quality and variable selection. With the availability of data mining software, data mining models are easy to construct and apply in business. However, a successful data mining project requires the involvement of data mining experts, subject area experts and people in the business organization.

REFERENCES

[1] M. J. A. Berry and G. S. Linoff, Data Mining Techniques: For Marketing, Sales, and Customer Support. New York: John Wiley & Sons, Inc.
[2] H. Jiawei and K. Micheline, Data Mining: Concepts and Techniques. Morgan Kaufmann Publishers.
[3] A. Feelders, H. Daniels and M. Holsheimer, Methodological and practical aspects of data mining, Information & Management.
[4] G. Paolo, Applied Data Mining for Business and Industry. John Wiley & Sons.
[5] K. J. Cios and L. A. Kurgan, Trends in data mining and knowledge discovery, Advanced Information and Knowledge Processing, 1-26, 2005.
[6] D. J. Hand and W. E. Henley, Statistical classification methods in consumer credit scoring: A review, Journal of the Royal Statistical Society: Series A (Statistics in Society), 160(3), 1997.
[7] H. Abdou, J. Pointon and A. El-Masry, Neural nets versus conventional techniques in credit scoring in Egyptian banking, Expert Systems with Applications, 35.
[8] A. Endo, T. Shibata and H. Tanaka, Comparisons of seven algorithms to predict breast cancer survival, Biomedical Soft Computing and Human Sciences, Vol. 13, No. 2, 11-16.
[9] A. Oztekin, D. Delen and Z. Kong, Predicting the graft survival for heart-lung transplantation patients: An integrated data mining methodology, International Journal of Medical Informatics, 78(12), e84-e96, 2009.
[10] Z. Huang, H. Chen, C-J. Hsu, W-H. Chen and S. Wu, Credit rating analysis with support vector machines and neural networks: a market comparative study, Decision Support Systems, 37.
[11] C-L. Huang, M-C. Chen and C-J. Wang, Credit scoring with a data mining approach based on support vector machines, Expert Systems with Applications, 37, 2007.
[12] D. Olson and S. Yong, Introduction to Business Data Mining. McGraw Hill International Edition.
[13] J. D. Olden and D. A. Jackson, Illuminating the black box: a randomization approach for understanding variable contributions in artificial neural networks, Ecological Modeling, 154, 2002.
[14] C. K. Hian and K. L. Chan, Going concern prediction using data mining techniques, Managerial Auditing Journal, Vol. 19, No. 3.
[15] E. Lapersonne, G. Laurent and J-J. Le Goff, Consideration sets of size one: An empirical investigation of automobile purchases, International Journal of Research in Marketing, 12, 55-66, 1995.
[16] B. Sun and V. G. Morwitz, Stated intentions and purchase behavior: A unified model, International Journal of Research in Marketing, 27(4), 2010.
[17] G. J. Fitzsimons and V. G. Morwitz, The effect of measuring intent on brand-level purchase behavior, Journal of Consumer Research, 23, 1-11, 1996.
[18] F. Jorgensen and T. Wentzel-Larsen, Forecasting car holding, scrappage and new car purchase in Norway, Journal of Transport Economics and Policy, 24(2), 1990.
[19] A. S. Notani, Perceptions of affordability: Their role in predicting purchase intent and purchase, Journal of Economic Psychology, 18, 1995.
[20] L. Moutinho, F. Davies and B. Curry, The impact of gender on car buyer satisfaction and loyalty, Journal of Retailing and Consumer Services, 3(3), 1996.
[21] J. O. Jansson, Car demand modeling and forecasting: a new approach, Journal of Transport Economics and Policy, 23(2), 1989.
[22] A. Marell, P. Davidsson, T. Garling and T. Laitila, Direct and indirect effects on households' intentions to replace the old car, Journal of Retailing and Consumer Services, 11, 1-8.
Prediction of Stock Performance Using Analytical Techniques
136 JOURNAL OF EMERGING TECHNOLOGIES IN WEB INTELLIGENCE, VOL. 5, NO. 2, MAY 2013 Prediction of Stock Performance Using Analytical Techniques Carol Hargreaves Institute of Systems Science National University
Data quality in Accounting Information Systems
Data quality in Accounting Information Systems Comparing Several Data Mining Techniques Erjon Zoto Department of Statistics and Applied Informatics Faculty of Economy, University of Tirana Tirana, Albania
International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014
RESEARCH ARTICLE OPEN ACCESS A Survey of Data Mining: Concepts with Applications and its Future Scope Dr. Zubair Khan 1, Ashish Kumar 2, Sunny Kumar 3 M.Tech Research Scholar 2. Department of Computer
Potential Value of Data Mining for Customer Relationship Marketing in the Banking Industry
Advances in Natural and Applied Sciences, 3(1): 73-78, 2009 ISSN 1995-0772 2009, American Eurasian Network for Scientific Information This is a refereed journal and all articles are professionally screened
Towards applying Data Mining Techniques for Talent Mangement
2009 International Conference on Computer Engineering and Applications IPCSIT vol.2 (2011) (2011) IACSIT Press, Singapore Towards applying Data Mining Techniques for Talent Mangement Hamidah Jantan 1,
Data Mining Solutions for the Business Environment
Database Systems Journal vol. IV, no. 4/2013 21 Data Mining Solutions for the Business Environment Ruxandra PETRE University of Economic Studies, Bucharest, Romania [email protected] Over
Chapter 12 Discovering New Knowledge Data Mining
Chapter 12 Discovering New Knowledge Data Mining Becerra-Fernandez, et al. -- Knowledge Management 1/e -- 2004 Prentice Hall Additional material 2007 Dekai Wu Chapter Objectives Introduce the student to
Evaluation of Feature Selection Methods for Predictive Modeling Using Neural Networks in Credits Scoring
714 Evaluation of Feature election Methods for Predictive Modeling Using Neural Networks in Credits coring Raghavendra B. K. Dr. M.G.R. Educational and Research Institute, Chennai-95 Email: [email protected]
Data Mining Algorithms Part 1. Dejan Sarka
Data Mining Algorithms Part 1 Dejan Sarka Join the conversation on Twitter: @DevWeek #DW2015 Instructor Bio Dejan Sarka ([email protected]) 30 years of experience SQL Server MVP, MCT, 13 books 7+ courses
Artificial Neural Network, Decision Tree and Statistical Techniques Applied for Designing and Developing E-mail Classifier
International Journal of Recent Technology and Engineering (IJRTE) ISSN: 2277-3878, Volume-1, Issue-6, January 2013 Artificial Neural Network, Decision Tree and Statistical Techniques Applied for Designing
DATA MINING TECHNIQUES AND APPLICATIONS
DATA MINING TECHNIQUES AND APPLICATIONS Mrs. Bharati M. Ramageri, Lecturer Modern Institute of Information Technology and Research, Department of Computer Application, Yamunanagar, Nigdi Pune, Maharashtra,
Comparison of K-means and Backpropagation Data Mining Algorithms
Comparison of K-means and Backpropagation Data Mining Algorithms Nitu Mathuriya, Dr. Ashish Bansal Abstract Data mining has got more and more mature as a field of basic research in computer science and
Predicting required bandwidth for educational institutes using prediction techniques in data mining (Case Study: Qom Payame Noor University)
260 IJCSNS International Journal of Computer Science and Network Security, VOL.11 No.6, June 2011 Predicting required bandwidth for educational institutes using prediction techniques in data mining (Case
Data mining and statistical models in marketing campaigns of BT Retail
Data mining and statistical models in marketing campaigns of BT Retail Francesco Vivarelli and Martyn Johnson Database Exploitation, Segmentation and Targeting group BT Retail Pp501 Holborn centre 120
A Neural Network based Approach for Predicting Customer Churn in Cellular Network Services
A Neural Network based Approach for Predicting Customer Churn in Cellular Network Services Anuj Sharma Information Systems Area Indian Institute of Management, Indore, India Dr. Prabin Kumar Panigrahi
A Property & Casualty Insurance Predictive Modeling Process in SAS
Paper AA-02-2015 A Property & Casualty Insurance Predictive Modeling Process in SAS 1.0 ABSTRACT Mei Najim, Sedgwick Claim Management Services, Chicago, Illinois Predictive analytics has been developing
Silvermine House Steenberg Office Park, Tokai 7945 Cape Town, South Africa Telephone: +27 21 702 4666 www.spss-sa.com
SPSS-SA Silvermine House Steenberg Office Park, Tokai 7945 Cape Town, South Africa Telephone: +27 21 702 4666 www.spss-sa.com SPSS-SA Training Brochure 2009 TABLE OF CONTENTS 1 SPSS TRAINING COURSES FOCUSING
Joseph Twagilimana, University of Louisville, Louisville, KY
ST14 Comparing Time series, Generalized Linear Models and Artificial Neural Network Models for Transactional Data analysis Joseph Twagilimana, University of Louisville, Louisville, KY ABSTRACT The aim
Algorithmic Scoring Models
Applied Mathematical Sciences, Vol. 7, 2013, no. 12, 571-586 Algorithmic Scoring Models Kalamkas Nurlybayeva Mechanical-Mathematical Faculty Al-Farabi Kazakh National University Almaty, Kazakhstan [email protected]
Dr. U. Devi Prasad Associate Professor Hyderabad Business School GITAM University, Hyderabad Email: [email protected]
96 Business Intelligence Journal January PREDICTION OF CHURN BEHAVIOR OF BANK CUSTOMERS USING DATA MINING TOOLS Dr. U. Devi Prasad Associate Professor Hyderabad Business School GITAM University, Hyderabad
Predicting Student Performance by Using Data Mining Methods for Classification
BULGARIAN ACADEMY OF SCIENCES CYBERNETICS AND INFORMATION TECHNOLOGIES Volume 13, No 1 Sofia 2013 Print ISSN: 1311-9702; Online ISSN: 1314-4081 DOI: 10.2478/cait-2013-0006 Predicting Student Performance
Data Mining Framework for Direct Marketing: A Case Study of Bank Marketing
www.ijcsi.org 198 Data Mining Framework for Direct Marketing: A Case Study of Bank Marketing Lilian Sing oei 1 and Jiayang Wang 2 1 School of Information Science and Engineering, Central South University
Keywords data mining, prediction techniques, decision making.
Volume 5, Issue 4, April 2015 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Analysis of Datamining
A Basic Guide to Modeling Techniques for All Direct Marketing Challenges
A Basic Guide to Modeling Techniques for All Direct Marketing Challenges Allison Cornia Database Marketing Manager Microsoft Corporation C. Olivia Rud Executive Vice President Data Square, LLC Overview
A New Approach for Evaluation of Data Mining Techniques
181 A New Approach for Evaluation of Data Mining s Moawia Elfaki Yahia 1, Murtada El-mukashfi El-taher 2 1 College of Computer Science and IT King Faisal University Saudi Arabia, Alhasa 31982 2 Faculty
Predicting the Risk of Heart Attacks using Neural Network and Decision Tree
Predicting the Risk of Heart Attacks using Neural Network and Decision Tree S.Florence 1, N.G.Bhuvaneswari Amma 2, G.Annapoorani 3, K.Malathi 4 PG Scholar, Indian Institute of Information Technology, Srirangam,
USING LOGIT MODEL TO PREDICT CREDIT SCORE
USING LOGIT MODEL TO PREDICT CREDIT SCORE Taiwo Amoo, Associate Professor of Business Statistics and Operation Management, Brooklyn College, City University of New York, (718) 951-5219, [email protected]
Enhanced Boosted Trees Technique for Customer Churn Prediction Model
IOSR Journal of Engineering (IOSRJEN) ISSN (e): 2250-3021, ISSN (p): 2278-8719 Vol. 04, Issue 03 (March. 2014), V5 PP 41-45 www.iosrjen.org Enhanced Boosted Trees Technique for Customer Churn Prediction
An Overview of Knowledge Discovery Database and Data mining Techniques
An Overview of Knowledge Discovery Database and Data mining Techniques Priyadharsini.C 1, Dr. Antony Selvadoss Thanamani 2 M.Phil, Department of Computer Science, NGM College, Pollachi, Coimbatore, Tamilnadu,
White Paper. Data Mining for Business
White Paper Data Mining for Business January 2010 Contents 1. INTRODUCTION 2. WHY IS DATA MINING IMPORTANT? FUNDAMENTALS Example 1 Example 2 3. OPERATIONAL CONSIDERATIONS ORGANISATIONAL
Customer Classification And Prediction Based On Data Mining Technique
Customer Classification And Prediction Based On Data Mining Technique Ms. Neethu Baby 1, Mrs. Priyanka L.T 2 1 M.E CSE, Sri Shakthi Institute of Engineering and Technology, Coimbatore 2 Assistant Professor
NEURAL NETWORKS IN DATA MINING
NEURAL NETWORKS IN DATA MINING 1 DR. YASHPAL SINGH, 2 ALOK SINGH CHAUHAN 1 Reader, Bundelkhand Institute of Engineering & Technology, Jhansi, India 2 Lecturer, United Institute of Management, Allahabad,
What is Data Mining? MS4424 Data Mining & Modelling
MS4424 Data Mining & Modelling Lecturer : Dr Iris Yeung Room No : P7509 Tel No : 2788 8566 Email : [email protected] 1 Aims To introduce the basic concepts of data mining
International Journal of Computer Trends and Technology (IJCTT) volume 4 Issue 8 August 2013
A Short-Term Traffic Prediction On A Distributed Network Using Multiple Regression Equation Ms.Sharmi.S 1 Research Scholar, MS University,Thirunelvelli Dr.M.Punithavalli Director, SREC,Coimbatore. Abstract:
International Journal of World Research, Vol: I Issue XIII, December 2008, Print ISSN: 2347-937X DATA MINING TECHNIQUES AND STOCK MARKET
DATA MINING TECHNIQUES AND STOCK MARKET Mr. Rahul Thakkar, Lecturer and HOD, Naran Lala College of Professional & Applied Sciences, Navsari ABSTRACT Without trading in a stock market we can't understand
An Overview of Data Mining: Predictive Modeling for IR in the 21st Century
An Overview of Data Mining: Predictive Modeling for IR in the 21st Century Nora Galambos, PhD Senior Data Scientist Office of Institutional Research, Planning & Effectiveness Stony Brook University AIRPO
Data Mining - Evaluation of Classifiers
Data Mining - Evaluation of Classifiers Lecturer: JERZY STEFANOWSKI Institute of Computing Sciences Poznan University of Technology Poznan, Poland Lecture 4 SE Master Course 2008/2009 revised for 2010
Nine Common Types of Data Mining Techniques Used in Predictive Analytics
1 Nine Common Types of Data Mining Techniques Used in Predictive Analytics By Laura Patterson, President, VisionEdge Marketing Predictive analytics enable you to develop mathematical models to help better
A Review of Data Mining Techniques
Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 3, Issue. 4, April 2014,
TNS EX A MINE BehaviourForecast Predictive Analytics for CRM. TNS Infratest Applied Marketing Science
TNS EX A MINE BehaviourForecast Predictive Analytics for CRM 1 TNS BehaviourForecast Why is BehaviourForecast relevant for you? The concept of analytical Relationship Management (aCRM) becomes more and
Weather forecast prediction: a Data Mining application
Weather forecast prediction: a Data Mining application Ms. Ashwini Mandale, Mrs. Jadhawar B.A. Assistant professor, Dr.Daulatrao Aher College of engg,karad,[email protected],8407974457 Abstract
Data Mining Applications in Higher Education
Executive report Data Mining Applications in Higher Education Jing Luan, PhD Chief Planning and Research Officer, Cabrillo College Founder, Knowledge Discovery Laboratories Table of contents: Introduction
USE OF DATA MINING TO DERIVE CRM STRATEGIES OF AN AUTOMOBILE REPAIR SERVICE CENTER IN KOREA
USE OF DATA MINING TO DERIVE CRM STRATEGIES OF AN AUTOMOBILE REPAIR SERVICE CENTER IN KOREA Youngsam Yoon and Yongmoo Suh, Korea University, {mryys, ymsuh}@korea.ac.kr ABSTRACT Problems of a Korean automobile
A STUDY ON DATA MINING INVESTIGATING ITS METHODS, APPROACHES AND APPLICATIONS
A STUDY ON DATA MINING INVESTIGATING ITS METHODS, APPROACHES AND APPLICATIONS Mrs. Jyoti Nawade 1, Dr. Balaji D 2, Mr. Pravin Nawade 3 1 Lecturer, JSPM's Bhivrabai Sawant Polytechnic, Pune (India) 2 Assistant
Predictive Modeling of Titanic Survivors: a Learning Competition
SAS Analytics Day Predictive Modeling of Titanic Survivors: a Learning Competition Linda Schumacher Problem Introduction On April 15, 1912, the RMS Titanic sank resulting in the loss of 1502 out of 2224
The Data Mining Process
Sequence for Determining Necessary Data. Wrong: Catalog everything you have, and decide what data is important. Right: Work backward from the solution, define the problem explicitly, and map out the data
Identification of User Patterns in Social Networks by Data Mining Techniques: Facebook Case
Identification of User Patterns in Social Networks by Data Mining Techniques: Facebook Case A. Selman Bozkır 1, S. Güzin Mazman 2, and Ebru Akçapınar Sezer 1 1 Hacettepe University, Department of Computer
A Secured Approach to Credit Card Fraud Detection Using Hidden Markov Model
A Secured Approach to Credit Card Fraud Detection Using Hidden Markov Model Twinkle Patel, Ms. Ompriya Kale Abstract: - As the usage of credit card has increased the credit card fraud has also increased
AUTOMATION OF ENERGY DEMAND FORECASTING. Sanzad Siddique, B.S.
AUTOMATION OF ENERGY DEMAND FORECASTING by Sanzad Siddique, B.S. A Thesis submitted to the Faculty of the Graduate School, Marquette University, in Partial Fulfillment of the Requirements for the Degree
Small Business Credit Scoring: A Comparison of Logistic Regression, Neural Network, and Decision Tree Models
Small Business Credit Scoring: A Comparison of Logistic Regression, Neural Network, and Decision Tree Models Marijana Zekic-Susac University of J.J. Strossmayer in Osijek, Faculty of Economics in Osijek
Role of Customer Response Models in Customer Solicitation Center's Direct Marketing Campaign
Role of Customer Response Models in Customer Solicitation Center's Direct Marketing Campaign Arun K Mandapaka, Amit Singh Kushwah, Dr. Goutam Chakraborty Oklahoma State University, OK, USA ABSTRACT Direct
SURVIVABILITY ANALYSIS OF PEDIATRIC LEUKAEMIC PATIENTS USING NEURAL NETWORK APPROACH
SURVIVABILITY ANALYSIS OF PEDIATRIC LEUKAEMIC PATIENTS USING NEURAL NETWORK APPROACH T. M. D. Saumya 1, T. Rupasinghe 2 and P. Abeysinghe 3 1 Department of Industrial Management, University of Kelaniya,
FRAUD DETECTION IN ELECTRIC POWER DISTRIBUTION NETWORKS USING AN ANN-BASED KNOWLEDGE-DISCOVERY PROCESS
FRAUD DETECTION IN ELECTRIC POWER DISTRIBUTION NETWORKS USING AN ANN-BASED KNOWLEDGE-DISCOVERY PROCESS Breno C. Costa, Bruno. L. A. Alberto, André M. Portela, W. Maduro, Esdras O. Eler PDITec, Belo Horizonte,
How To Use Neural Networks In Data Mining
International Journal of Electronics and Computer Science Engineering 1449 Available Online at www.ijecse.org ISSN- 2277-1956 Neural Networks in Data Mining Priyanka Gaur Department of Information and
City University of Hong Kong. Information on a Course offered by Department of Management Sciences with effect from Semester A in 2010 / 2011
City University of Hong Kong Information on a Course offered by Department of Management Sciences with effect from Semester A in 2010 / 2011 Part I Course Title: Enterprise Data Mining Course Code: MS4224
Data Mining Techniques for Mortality at Advanced Age
Data Mining Techniques for Mortality at Advanced Age Lijia Guo, Ph.D., A.S.A. and Morgan C. Wang, Ph.D. University of Central Florida Abstract This paper addresses issues and techniques for advanced age
A NEW DECISION TREE METHOD FOR DATA MINING IN MEDICINE
A NEW DECISION TREE METHOD FOR DATA MINING IN MEDICINE Kasra Madadipouya 1 1 Department of Computing and Science, Asia Pacific University of Technology & Innovation ABSTRACT Today, enormous amount of data
Course Syllabus. Purposes of Course:
Course Syllabus Eco 5385.701 Predictive Analytics for Economists Summer 2014 TTh 6:00-8:50 pm and Sat. 12:00-2:50 pm First Day of Class: Tuesday, June 3 Last Day of Class: Tuesday, July 1 251 Maguire Building
Prediction of Cancer Count through Artificial Neural Networks Using Incidence and Mortality Cancer Statistics Dataset for Cancer Control Organizations
Using Incidence and Mortality Cancer Statistics Dataset for Cancer Control Organizations Shivam Sidhu 1, Upendra Kumar Meena 2, Narina Thakur 3 1,2 Department of CSE, Student, Bharati Vidyapeeth's College
A STUDY OF DATA MINING ACTIVITIES FOR MARKET RESEARCH
A STUDY OF DATA MINING ACTIVITIES FOR MARKET RESEARCH ABSTRACT MR. HEMANT KUMAR*; DR. SARMISTHA SARMA** *Assistant Professor, Department of Information Technology (IT), Institute of Innovation in Technology
How to Get More Value from Your Survey Data
Technical report How to Get More Value from Your Survey Data Discover four advanced analysis techniques that make survey research more effective Table of contents: Introduction
Data Mining with SAS. Mathias Lanner [email protected]. Copyright 2010 SAS Institute Inc. All rights reserved.
Data Mining with SAS Mathias Lanner [email protected] Copyright 2010 SAS Institute Inc. All rights reserved. Agenda Data mining Introduction Data mining applications Data mining techniques SEMMA
POST-HOC SEGMENTATION USING MARKETING RESEARCH
Annals of the University of Petroşani, Economics, 12(3), 2012, 39-48 39 POST-HOC SEGMENTATION USING MARKETING RESEARCH CRISTINEL CONSTANTIN * ABSTRACT: This paper is about an instrumental research conducted
Neural Networks in Data Mining
IOSR Journal of Engineering (IOSRJEN) ISSN (e): 2250-3021, ISSN (p): 2278-8719 Vol. 04, Issue 03 (March. 2014), V6 PP 01-06 www.iosrjen.org Neural Networks in Data Mining Ripundeep Singh Gill, Ashima Department
Predictive Dynamix Inc
Predictive Modeling Technology Predictive modeling is concerned with analyzing patterns and trends in historical and operational data in order to transform data into actionable decisions. This is accomplished
Easily Identify Your Best Customers
IBM SPSS Statistics Easily Identify Your Best Customers Use IBM SPSS predictive analytics software to gain insight from your customer database Contents: 1 Introduction 2 Exploring customer data Where do
Paper AA-08-2015. Get the highest bangs for your marketing bucks using Incremental Response Models in SAS Enterprise Miner TM
Paper AA-08-2015 Get the highest bangs for your marketing bucks using Incremental Response Models in SAS Enterprise Miner TM Delali Agbenyegah, Alliance Data Systems, Columbus, Ohio 0.0 ABSTRACT Traditional
STATISTICA. Financial Institutions. Case Study: Credit Scoring. and
Financial Institutions and STATISTICA Case Study: Credit Scoring STATISTICA Solutions for Business Intelligence, Data Mining, Quality Control, and Web-based Analytics Table of Contents INTRODUCTION: WHAT
Churn Prediction. Vladislav Lazarov. Marius Capota. [email protected]. [email protected]
Churn Prediction Vladislav Lazarov Technische Universität München [email protected] Marius Capota Technische Universität München [email protected] ABSTRACT The rapid growth of the market
PharmaSUG2011 Paper HS03
PharmaSUG2011 Paper HS03 Using SAS Predictive Modeling to Investigate the Asthma's Patient Future Hospitalization Risk Yehia H. Khalil, University of Louisville, Louisville, KY, US ABSTRACT The focus of
Title. Introduction to Data Mining. Dr Arulsivanathan Naidoo Statistics South Africa. OECD Conference Cape Town 8-10 December 2010.
Title Introduction to Data Mining Dr Arulsivanathan Naidoo Statistics South Africa OECD Conference Cape Town 8-10 December 2010 1 Outline Introduction Statistics vs Knowledge Discovery Predictive Modeling
DATA MINING AND REPORTING IN HEALTHCARE
DATA MINING AND REPORTING IN HEALTHCARE Divya Gandhi 1, Pooja Asher 2, Harshada Chaudhari 3 1,2,3 Department of Information Technology, Sardar Patel Institute of Technology, Mumbai,(India) ABSTRACT The
Predictive time series analysis of stock prices using neural network classifier
Predictive time series analysis of stock prices using neural network classifier Abhinav Pathak, National Institute of Technology, Karnataka, Surathkal, India [email protected] Abstract The work pertains
An Introduction to Data Mining
An Introduction to Intel Beijing [email protected] January 17, 2014 Outline 1 DW Overview What is Notable Application of Conference, Software and Applications Major Process in 2 Major Tasks in Detail
Identifying At-Risk Students Using Machine Learning Techniques: A Case Study with IS 100
Identifying At-Risk Students Using Machine Learning Techniques: A Case Study with IS 100 Erkan Er Abstract In this paper, a model for predicting students' performance levels is proposed which employs three
BIOINF 585 Fall 2015 Machine Learning for Systems Biology & Clinical Informatics http://www.ccmb.med.umich.edu/node/1376
Course Director: Dr. Kayvan Najarian (DCM&B, [email protected]) Lectures: Mondays and Wednesdays 9:00 AM - 10:30 AM, Rm. 2065 Palmer Commons Bldg. Labs: Wednesdays 10:30 AM - 11:30 AM (alternate weeks), Rm.
ENSEMBLE DECISION TREE CLASSIFIER FOR BREAST CANCER DATA
ENSEMBLE DECISION TREE CLASSIFIER FOR BREAST CANCER DATA D.Lavanya 1 and Dr.K.Usha Rani 2 1 Research Scholar, Department of Computer Science, Sree Padmavathi Mahila Visvavidyalayam, Tirupati, Andhra Pradesh,
Microsoft Azure Machine learning Algorithms
Microsoft Azure Machine learning Algorithms Tomaž KAŠTRUN @tomaz_tsql [email protected] http://tomaztsql.wordpress.com Our Sponsors Speaker info https://tomaztsql.wordpress.com Agenda Focus on explanation
A Hybrid Data Mining Model to Improve Customer Response Modeling in Direct Marketing
A Hybrid Data Mining Model to Improve Customer Response Modeling in Direct Marketing Maryam Daneshmandi [email protected] School of Information Technology Shiraz Electronics University Shiraz, Iran
International Journal of Computer Science Trends and Technology (IJCST) Volume 3 Issue 3, May-June 2015
RESEARCH ARTICLE OPEN ACCESS Data Mining Technology for Efficient Network Security Management Ankit Naik [1], S.W. Ahmad [2] Student [1], Assistant Professor [2] Department of Computer Science and Engineering
Data Mining: An Overview of Methods and Technologies for Increasing Profits in Direct Marketing. C. Olivia Rud, VP, Fleet Bank
Data Mining: An Overview of Methods and Technologies for Increasing Profits in Direct Marketing C. Olivia Rud, VP, Fleet Bank ABSTRACT Data Mining is a new term for the common practice of searching through
Neural Networks and Back Propagation Algorithm
Neural Networks and Back Propagation Algorithm Mirza Cilimkovic Institute of Technology Blanchardstown Blanchardstown Road North Dublin 15 Ireland [email protected] Abstract Neural Networks (NN) are important
MERGING BUSINESS KPIs WITH PREDICTIVE MODEL KPIs FOR BINARY CLASSIFICATION MODEL SELECTION
MERGING BUSINESS KPIs WITH PREDICTIVE MODEL KPIs FOR BINARY CLASSIFICATION MODEL SELECTION Matthew A. Lanham & Ralph D. Badinelli Virginia Polytechnic Institute and State University Department of Business
Comparing the Results of Support Vector Machines with Traditional Data Mining Algorithms
Comparing the Results of Support Vector Machines with Traditional Data Mining Algorithms Scott Pion and Lutz Hamel Abstract This paper presents the results of a series of analyses performed on direct mail
MS1b Statistical Data Mining
MS1b Statistical Data Mining Yee Whye Teh Department of Statistics Oxford http://www.stats.ox.ac.uk/~teh/datamining.html Outline Administrivia and Introduction Course Structure Syllabus Introduction to
Mobile Phone APP Software Browsing Behavior using Clustering Analysis
Proceedings of the 2014 International Conference on Industrial Engineering and Operations Management Bali, Indonesia, January 7 9, 2014 Mobile Phone APP Software Browsing Behavior using Clustering Analysis
Knowledge Based Descriptive Neural Networks
Knowledge Based Descriptive Neural Networks J. T. Yao Department of Computer Science, University of Regina, Regina, Saskatchewan, CANADA S4S 0A2 Email: [email protected] Abstract This paper presents a
An Introduction to Data Mining. Big Data World. Related Fields and Disciplines. What is Data Mining? 2/12/2015
An Introduction to Data Mining for Wind Power Management Spring 2015 Big Data World Every minute: Google receives over 4 million search queries Facebook users share almost 2.5 million pieces of content
New Work Item for ISO 3534-5 Predictive Analytics (Initial Notes and Thoughts) Introduction
Introduction New Work Item for ISO 3534-5 Predictive Analytics (Initial Notes and Thoughts) Predictive analytics encompasses the body of statistical knowledge supporting the analysis of massive data sets.
EFFICIENT DATA PRE-PROCESSING FOR DATA MINING
EFFICIENT DATA PRE-PROCESSING FOR DATA MINING USING NEURAL NETWORKS JothiKumar.R 1, Sivabalan.R.V 2 1 Research scholar, Noorul Islam University, Nagercoil, India Assistant Professor, Adhiparasakthi College
Feature Subset Selection in E-mail Spam Detection
Feature Subset Selection in E-mail Spam Detection Amir Rajabi Behjat, Universiti Technology MARA, Malaysia IT Security for the Next Generation Asia Pacific & MEA Cup, Hong Kong 14-16 March, 2012 Feature
Chapter 6. The stacking ensemble approach
This chapter proposes the stacking ensemble approach for combining different data mining classifiers to get better performance. Other combination techniques like voting, bagging, etc. are also described
from Larson Text By Susan Miertschin
Decision Tree Data Mining Example from Larson Text By Susan Miertschin 1 Problem The Maximum Miniatures Marketing Department wants to do a targeted mailing promoting the Mythic World line of figurines.
Stock Portfolio Selection using Data Mining Approach
IOSR Journal of Engineering (IOSRJEN) e-issn: 2250-3021, p-issn: 2278-8719 Vol. 3, Issue 11 (November. 2013), V1 PP 42-48 Stock Portfolio Selection using Data Mining Approach Carol Anne Hargreaves, Prateek
A Comparison of Decision Tree and Logistic Regression Model Xianzhe Chen, North Dakota State University, Fargo, ND
Paper D02-2009 A Comparison of Decision Tree and Logistic Regression Model Xianzhe Chen, North Dakota State University, Fargo, ND ABSTRACT This paper applies a decision tree model and logistic regression
Introduction to Data Mining
Introduction to Data Mining Jay Urbain Credits: Nazli Goharian & David Grossman @ IIT Outline Introduction Data Pre-processing Data Mining Algorithms Naïve Bayes Decision Tree Neural Network Association
Clustering Marketing Datasets with Data Mining Techniques
Clustering Marketing Datasets with Data Mining Techniques Özgür Örnek International Burch University, Sarajevo [email protected] Abdülhamit Subaşı International Burch University, Sarajevo [email protected]
