THE APPLICATION OF DATA MINING TECHNIQUES AND MULTIPLE CLASSIFIERS TO MARKETING DECISION




International Journal of Electronic Business Management, Vol. 3, No. 4, pp. 301-310 (2005)

Yu-Min Chang*, Yu-Chieh Lo and Shang-Yi Lin
Department of Industrial Engineering and Management
I-Shou University
Kaohsiung (840), Taiwan

ABSTRACT

Nowadays, the capability to collect data has expanded enormously and provides enterprises with huge amounts of data. Interesting knowledge and high-value information about customers can be extracted by data mining. By following a market segmentation strategy, an enterprise can increase its expected profits. For the enterprise, however, the customer's basic data, including some demographic and geographic variables, are easier to obtain than the customers' behavior data. The customer value may be predicted from the customer's basic data. By then directing marketing activities at the customers with high value, the enterprise can avoid unnecessary marketing cost. This research constructed a marketing decision model that uses the demographic and geographic variables as inputs to three individual classifiers (BP network, decision tree, and Mahalanobis distance) to predict a new customer's value. To improve the accuracy of prediction, the study combined the three classifiers through different combining methods. The results show that the multiple classifier system combined by BP has the best prediction accuracy.

Keywords: Customer Relation Management, Data Mining, Neural Network, Mahalanobis Distance, Decision Tree

1. INTRODUCTION

The issues of customer relationship management (CRM) have attracted much attention, and the business operation model has gradually turned from product-focused to customer-centric. Enterprises have come to realize that customer information is one of their key assets.
As enterprises explore customer behavior in depth, they find that not all customers bring profits: just a small percentage of all users of the products, the best customers, account for a large portion of an organization's sales. Since customers are different, concentrating on the heavy-user market segment is an attractive strategy. A CRM system is a process to compile information that increases understanding of how to manage an organization's relationships with its customers [6]. By cooperating with marketing activities, CRM systems can bring substantial profit and help enterprises survive in a continuously changing and competitive environment. CRM systems collect customers' data and intensify the relationship between customers and organizations, thus achieving the objective of establishing customer loyalty. Most businesses agree that it costs six times as much to get a new customer as it does to keep an old customer. Prahalad and Ramaswamy [10] showed that if an organization can increase its customer retention rate by 5%, the profits from customers will move up 25% on average.

* Corresponding author: ymchang@isu.edu.tw

Generally speaking, the customer data of an organization are gathered from interactions with customers, such as the customers' basic data and the sales transaction data. By analyzing the data, the organization can understand customer differences. As organizations learn more about their customers, they can use that knowledge to serve them better. Data mining is a well-known analytic technique that can be used to turn customer data into customer knowledge. Typically, 20 percent of the customers buy 80 percent of the product sold; this is the famous 80/20 principle. These 20 percent are heavy users and may be the best customers. In terms of marketing, focusing on heavy users can bring in more revenue. Therefore, organizations should divide a heterogeneous market into a number of smaller, more homogeneous subgroups, which is called customer segmentation.
In the past, most customer segmentation was differentiated by RFM (Recency, Frequency and Monetary) scores or consumption patterns. The segmentation process is implemented by mining huge historical transaction data. Because the customer's basic data are easy to obtain, an organization that could make the marketing decision for newly arriving customers according to their basic data could avoid unnecessary marketing cost and a complex data mining process. The aim of this study is to test the feasibility of using the customer's basic data, including demographic and geographic variables, in marketing decisions. For that purpose, the customer data of a warehouse in southern Taiwan, including the basic data and the purchasing transaction data, are used. The research focused on skin care products and their corresponding transaction data from June to November 2004.

The study applies classifiers to categorize customers into two groups by the customer's basic data. One group is the potential high-value customers, on whom organizations should take marketing activities. The other group is the customers with relatively low value, who do not warrant much marketing effort. Three kinds of classifiers, back-propagation (BP) neural network, decision tree, and Mahalanobis distance, are used for customer classification. Recently, multiple classifier systems have been used in practical applications to improve classification accuracy [14,11,15]. Consequently, the research combines the individual classifiers into a multiple classifier system to obtain better classification accuracy.

This paper is organized as follows. First, the issues of data mining techniques, multiple classifiers, and the neural network are presented. The following section describes the data and the analysis methodology. In the next section, the performance of the three individual classifiers and the multiple classifier system is reported. The final section summarizes the findings of the research and outlines some suggestions for future research.

2. LITERATURE REVIEW

2.1 Data Mining

Data mining (DM) is the exploration and analysis of large quantities of data in order to discover meaningful patterns and rules [1].
Given the enormous size of databases, DM is the technology for knowledge discovery in databases. DM is an interdisciplinary field that combines statistics, database management, computer science, artificial intelligence, machine learning, and mathematical algorithms. This technology provides different methodologies for decision-making, problem solving, analysis, diagnosis, integration, learning, and innovation [8]. Berry and Linoff [1] defined six common DM tasks: classification, estimation, prediction, affinity grouping or market basket analysis, clustering, and profiling. A popular application of data mining within CRM is customer segmentation. The purpose of segmentation is to identify behavioral segments and to tailor products, services, and marketing messages to each segment [5,3]. Data mining should be embedded in a business CRM strategy that spells out the actions to be taken as a result of what is learned through DM.

2.2 Multiple Classifiers

The combination of multiple classifiers has been used in practical applications to improve classification accuracy. The combination methods can be divided into two categories: serial combination and parallel combination. Serial combination links single classifiers in a sequence, and the output of each classifier is passed to the classifier in the next position in the sequence. The parallel combination approach, in contrast, considers the outputs of all the classifiers and integrates them by a combination algorithm. Previous methods for parallel combination of classifiers include majority vote, naive Bayes, the behavior-knowledge space method, Borda count, and neural network. Schiele [12] deemed that combining only a few classifiers can obtain good performance, and that combining a suitable number of classifiers can increase the robustness of the classifier. He also mentioned that combining complementary classifiers can raise the robustness and the accuracy of the total classifier. It is practicable to combine simple classifiers rather than design a single complex classifier. When the data is binary, the most used combination method is majority vote.
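As a minimal sketch of parallel combination by majority vote, the function below combines the binary outputs of several classifiers for one sample. The function name `majority_vote` and the label encoding (1 for one class, 2 for the other) are illustrative assumptions and not part of the original study; an odd number of voters is assumed so that no tie can occur.

```python
from collections import Counter


def majority_vote(labels):
    """Return the label predicted by most classifiers.

    `labels` holds one predicted class label per classifier.
    Hypothetical helper for illustration; it assumes binary labels
    and an odd number of classifiers so that a tie cannot occur.
    """
    if len(labels) % 2 == 0:
        raise ValueError("use an odd number of classifiers to avoid ties")
    winner, _count = Counter(labels).most_common(1)[0]
    return winner


# Three classifiers vote on one sample: two say class 1, one says class 2.
print(majority_vote([1, 2, 1]))  # prints 1
```

With three voters and two classes, a single wrong classifier is outvoted whenever the other two agree on the correct label.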
The major limitation of majority vote is that the number of classifiers must be odd. If the data is of continuous type, the ways to combine include the Bayes method, maximum precision, decision tree, self-organizing map (SOM), and so on. Table 1 summarizes previous research on multiple classifiers and the results. All of these studies showed that multiple classifiers perform better than a single classifier.

Table 1: Research on multiple classifiers and its contents

Authors              Research Contents
Xu et al. [15]       They used Bayesian formalism, voting principle, and Dempster-Shafer belief theory to combine different classifiers.
Wang et al. [14]     The paper proposed a Kohonen self-organizing neural network to integrate the results of 5 other neural network classifiers. They used a radar recognition problem as demonstration. The proposed combination method is insensitive to classifier correlation.
Sboner et al. [11]   They combined linear discriminant algorithm, k-NN and decision tree by majority votes to classify 52 skin images.

2.3 Artificial Neural Network

An artificial neural network is an information processing technology inspired by the biological brain and its neural system. It is a kind of computing system that consists of software and hardware, and it offers a mathematical model that attempts to mimic the ability of an organism's nervous system to learn from experience. An artificial neuron imitates the biological neuron, which receives signals or inputs from the outside environment or from other neurons. After the summation and transfer computation, the neuron outputs a signal to other neurons or to the outside environment. Neural network learning can be supervised or unsupervised. Learning is accomplished by modifying network connection weights while a set of input instances is repeatedly passed through the network. Neural networks (the "artificial" is usually dropped) continue to grow in popularity within the business, scientific, and academic worlds. They are powerful tools readily applied to prediction, classification, and clustering.

3. METHODOLOGY

This research intends to apply data mining techniques to segment customers based on the transaction data, and to construct a multiple classifier system that predicts the value of a new customer from the demographic and geographic variables of the customer's basic data. The architecture of the research is outlined in Figure 1. The study first applies back-propagation network (BPN), Mahalanobis distance (MD), and decision tree as the individual classifiers, then combines the three individual classifiers into a multiple classifier system. The parallel combination methods adopted in the research include majority vote, BPN, and SOM network. The prediction accuracy of the individual classifiers and of the multiple classifier system is compared in an empirical case. The following subsections introduce the three individual classifiers adopted in the study.

Figure 1: The architecture of the research (customers' demographic and geographic data feed Classifier 1, Classifier 2, and Classifier 3; if the predicted customer value is low, don't take marketing activities; if it is high, take marketing activities)

3.1 Back-propagation Neural Network

Back-propagation network is a feed-forward neural network that calculates output values from input values.
The structure of the network is typical of networks used for prediction and classification. The neurons are organized into three layers, as shown in Figure 2. Each neuron in the input layer is connected to exactly one input variable and does not do any computation; it copies its input value to its output value. In the study, the input variables are the customer's basic data. Before they are input to the network, the variables should be normalized. The next layer is called the hidden layer because it is connected directly neither to the input variables nor to the output of the network. Each neuron in the hidden layer is fully connected to all neurons in the input layer and calculates its output by multiplying the value of each input by its corresponding weight, adding these up, and applying the transfer function. The output layer represents the output variable of the network. Neurons in the output layer are fully connected to all neurons in the hidden layer.

Back-propagation learning adjusts the connection weights to minimize the error, that is, the difference between the network output and the target (actual result). The error is fed back through the network, and the adjustment process continues until the error converges to an acceptable value. The neural network is used to calculate a single value that represents whether to take marketing activities, so there is only one neuron in the output layer. For the network parameter settings, the research applied the design of experiments to determine the best network parameters.

Figure 2: The structure of the back-propagation neural network

The learning algorithm of the BP network is introduced briefly as follows.

Nomenclature

m: Number of neurons in input layer
l: Number of neurons in hidden layer
n: Number of neurons in output layer
$\eta$: Learning rate of the network
$W_{hi}$: Weight matrix of input layer to hidden layer, $h = 1, 2, \ldots, l$, $i = 1, 2, \ldots, m$
$W_{jh}$: Weight matrix of hidden layer to output layer, $h = 1, 2, \ldots, l$, $j = 1, 2, \ldots, n$
$\theta_h$: Bias vector of hidden layer, $h = 1, 2, \ldots, l$
$\theta_j$: Bias vector of output layer, $j = 1, 2, \ldots, n$
X: Input vector
T: Target vector
Y: Output vector
$\delta$: Error

Algorithm

Step 1: Set the learning rate $\eta$.

Step 2: Randomize the initial values of $W_{hi}$, $W_{jh}$, $\theta_h$, and $\theta_j$.

Step 3: Input a training sample with input vector X and target vector T. Suppose it is the t-th sample.

$$X(t) = [X_1(t), X_2(t), \ldots, X_m(t)] \tag{1}$$
$$T(t) = [T_1(t), T_2(t), \ldots, T_n(t)] \tag{2}$$

Step 4: Calculate the output of the hidden layer ($Y_h$) and the output of the output layer ($Y_j$).

$$net_h(t) = \sum_{i} W_{hi} X_i(t) - \theta_h \tag{3}$$
$$Y_h(t) = f(net_h(t)) = \frac{1}{1 + \exp(-net_h(t))} \tag{4}$$
$$net_j(t) = \sum_{h} W_{jh} Y_h(t) - \theta_j \tag{5}$$
$$Y_j(t) = f(net_j(t)) = \frac{1}{1 + \exp(-net_j(t))} \tag{6}$$

Step 5: Calculate the error $\delta$. The error of the output layer ($\delta_j$) is calculated by the following equation:

$$\delta_j(t) = Y_j(t)\,[1 - Y_j(t)]\,[T_j(t) - Y_j(t)] \tag{7}$$

The error of the hidden layer ($\delta_h$) is calculated as follows.

$$\delta_h(t) = Y_h(t)\,[1 - Y_h(t)] \sum_{j} W_{jh}\,\delta_j(t) \tag{8}$$

Step 6: Calculate the amount of the weight change ($\Delta W$) and bias change ($\Delta\theta$).

From hidden layer to output layer:

$$\Delta W_{jh} = \eta\,\delta_j\,Y_h \tag{9}$$
$$\Delta\theta_j = -\eta\,\delta_j \tag{10}$$

From input layer to hidden layer:

$$\Delta W_{hi} = \eta\,\delta_h\,X_i \tag{11}$$
$$\Delta\theta_h = -\eta\,\delta_h \tag{12}$$

Step 7: Update all the weights and biases.

From hidden layer to output layer:

$$W_{jh}(t+1) = W_{jh}(t) + \Delta W_{jh} \tag{13}$$
$$\theta_j(t+1) = \theta_j(t) + \Delta\theta_j \tag{14}$$

From input layer to hidden layer:

$$W_{hi}(t+1) = W_{hi}(t) + \Delta W_{hi} \tag{15}$$
$$\theta_h(t+1) = \theta_h(t) + \Delta\theta_h \tag{16}$$
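The forward pass and the update rules of Steps 3 to 7 can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not the authors' Matlab implementation: the class name `BPNetwork`, the initial weight range, the learning rate of 0.5, the layer sizes, and the sample values are all hypothetical. The bias is subtracted from the weighted sum, so the bias updates carry a minus sign.

```python
import math
import random


def sigmoid(v):
    """Logistic transfer function f(net) = 1 / (1 + exp(-net))."""
    return 1.0 / (1.0 + math.exp(-v))


class BPNetwork:
    """One-hidden-layer network trained one sample at a time (assumed design)."""

    def __init__(self, m, l, n, eta=0.5, seed=0):
        rng = random.Random(seed)
        self.eta = eta
        # Step 2: random initial weights and biases (range is an assumption).
        self.W_hi = [[rng.uniform(-0.5, 0.5) for _ in range(m)] for _ in range(l)]
        self.W_jh = [[rng.uniform(-0.5, 0.5) for _ in range(l)] for _ in range(n)]
        self.theta_h = [rng.uniform(-0.5, 0.5) for _ in range(l)]
        self.theta_j = [rng.uniform(-0.5, 0.5) for _ in range(n)]

    def forward(self, x):
        # Step 4: hidden outputs, then output-layer outputs.
        y_h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) - th)
               for row, th in zip(self.W_hi, self.theta_h)]
        y_j = [sigmoid(sum(w * yh for w, yh in zip(row, y_h)) - th)
               for row, th in zip(self.W_jh, self.theta_j)]
        return y_h, y_j

    def train_sample(self, x, t):
        """Apply one update; return the squared error measured before it."""
        y_h, y_j = self.forward(x)
        # Step 5: output-layer error, then hidden-layer error (old weights).
        d_j = [y * (1 - y) * (tj - y) for y, tj in zip(y_j, t)]
        d_h = [yh * (1 - yh) * sum(self.W_jh[j][h] * d_j[j] for j in range(len(d_j)))
               for h, yh in enumerate(y_h)]
        # Steps 6 and 7: hidden-to-output weights and biases.
        for j, dj in enumerate(d_j):
            for h, yh in enumerate(y_h):
                self.W_jh[j][h] += self.eta * dj * yh
            self.theta_j[j] -= self.eta * dj
        # Steps 6 and 7: input-to-hidden weights and biases.
        for h, dh in enumerate(d_h):
            for i, xi in enumerate(x):
                self.W_hi[h][i] += self.eta * dh * xi
            self.theta_h[h] -= self.eta * dh
        return sum((tj - y) ** 2 for tj, y in zip(t, y_j)) / 2


# Hypothetical normalized sample with four inputs and one target value.
net = BPNetwork(m=4, l=3, n=1)
x, t = [0.2, 0.7, 0.1, 0.9], [1.0]
first_error = net.train_sample(x, t)
for _ in range(50):
    last_error = net.train_sample(x, t)
print(first_error, last_error)
```

Calling `train_sample` over all training samples, cycle after cycle, until the error stops falling or a fixed number of cycles is reached gives the complete training loop.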

Step 8: Repeat Step 3 to Step 7 until the error converges to a specific value or the specified number of learning cycles has been executed.

3.2 Decision Tree

A decision tree is a simple structure in which non-terminal nodes represent tests on one or more attributes and terminal nodes represent decision outcomes. Decision trees are powerful and popular for both prediction and classification. They are attractive because they are easy to understand and can be transferred into rules. The path from the root node to a terminal node forms a classification rule. Figure 3 shows a general tree structure. Decision trees are constructed using only those attributes best able to differentiate the concepts to be learned. The data at each node are repeatedly split into smaller and smaller groups in such a way that each new generation of nodes has greater purity than its ancestors with respect to the target variable [1]. At the start of the process, there is a training set consisting of preclassified records, for which the target is known. The tree is built by splitting the records at each node according to a function of a single input variable. The remaining training set instances test the accuracy of the constructed tree. If the decision tree classifies the instances correctly, the process terminates.

Figure 3: A binary decision tree (a ROOT node tests Condition 1 and Condition 2; the leaves are Class A, Class B, and Class C)

The research adopts the CART [2] algorithm, which grows binary trees and continues splitting as long as new splits can be found that increase purity. The CART algorithm identifies candidate subtrees through a process of repeated pruning. CART relies on a concept called the adjusted error rate to identify the least useful branches, which are pruned. The process of using decision trees for data mining is as follows. First, select the decision tree algorithm; the paper adopts the CART algorithm. Second, construct the decision tree from the training data set. Third, prune the decision tree by evaluation.

3.3 Mahalanobis Distance

In ambiguous and inconsistent multi-dimensional data, the relationships among the variables are difficult to explore. The Mahalanobis distance (MD) is a distance function that measures the homogeneity of multivariate data by means of the covariance matrix. It is also considered one of the binary classification methods. When the data are homogeneous, the MD will be small, and in most cases the MD is less than 2. The MD has significance in pattern recognition [7]. It can also be used as the core of a manufacturing control system [6].

Suppose a customer has k characteristics ($Z_1, \ldots, Z_k$) that can be used for classification. Let $Z_{ij}$ denote the i-th characteristic of the j-th customer, $i = 1, \ldots, k$ and $j = 1, \ldots, n$, where k is the number of fields in the customer's basic data and n is the number of customers. The original data should be normalized by Eq. (17) before calculating the MD.

$$z_{ij} = \frac{Z_{ij} - m_i}{\sigma_i} \tag{17}$$

where

$$m_i = \frac{Z_{i1} + Z_{i2} + \cdots + Z_{in}}{n} \tag{18}$$
$$\sigma_i = \sqrt{\frac{(Z_{i1} - m_i)^2 + \cdots + (Z_{in} - m_i)^2}{n}} \tag{19}$$

The normalized data are shown in Table 2. All characteristics now have average 0 and standard deviation 1. The MD depends on the correlation between the characteristics. Let $r_{st}$ be the correlation coefficient between the s-th and t-th characteristics.

$$r_{st} = \frac{1}{n}\sum_{l=1}^{n} z_{sl}\, z_{tl} = \frac{z_{s1} z_{t1} + z_{s2} z_{t2} + \cdots + z_{sn} z_{tn}}{n} \tag{20}$$

The correlation matrix R can be expressed by Eq. (21).

$$R = \begin{bmatrix} 1 & r_{12} & \cdots & r_{1k} \\ r_{21} & 1 & \cdots & r_{2k} \\ \vdots & \vdots & \ddots & \vdots \\ r_{k1} & r_{k2} & \cdots & 1 \end{bmatrix} \tag{21}$$

Let A be the inverse of the correlation matrix R.

$$A = R^{-1} = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1k} \\ a_{21} & a_{22} & \cdots & a_{2k} \\ \vdots & \vdots & \ddots & \vdots \\ a_{k1} & a_{k2} & \cdots & a_{kk} \end{bmatrix} \tag{22}$$
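The normalization, the correlation matrix, and its inverse can be sketched in plain Python, together with the scaled quadratic form of Eq. (23), $D^2 = \frac{1}{k}\sum_i\sum_j a_{ij} z_i z_j$. This is an illustrative sketch only: the helper names, the two hypothetical characteristics, and the sample values are assumptions, and the standard deviation is taken in its population form (dividing by n).

```python
import math


def standardize(rows):
    """Normalize each characteristic to mean 0 and standard deviation 1."""
    n, k = len(rows), len(rows[0])
    means = [sum(r[i] for r in rows) / n for i in range(k)]
    sigmas = [math.sqrt(sum((r[i] - means[i]) ** 2 for r in rows) / n)
              for i in range(k)]
    z = [[(r[i] - means[i]) / sigmas[i] for i in range(k)] for r in rows]
    return z, means, sigmas


def correlation_matrix(z):
    """r_st = (1/n) * sum over customers of z_s * z_t."""
    n, k = len(z), len(z[0])
    return [[sum(row[s] * row[t] for row in z) / n for t in range(k)]
            for s in range(k)]


def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting."""
    k = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(k)]
           for i, row in enumerate(matrix)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("correlation matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(k):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * c for v, c in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]


def md_squared(z_row, a):
    """Scaled squared distance D^2 = (1/k) * z^T A z for one customer."""
    k = len(z_row)
    return sum(a[i][j] * z_row[i] * z_row[j]
               for i in range(k) for j in range(k)) / k


# Hypothetical reference group: (age, commercial circle) for six customers.
reference = [[25, 3], [32, 5], [47, 2], [51, 7], [38, 4], [29, 6]]
z, means, sigmas = standardize(reference)
a = invert(correlation_matrix(z))
d2 = [md_squared(row, a) for row in z]
print(sum(d2) / len(d2))  # averages to about 1 over the reference group

# A new customer is normalized with the reference means and sigmas.
new_customer = [60, 1]
z_new = [(v - m) / s for v, m, s in zip(new_customer, means, sigmas)]
print(md_squared(z_new, a))
```

Because the distances of the reference group average to about 1 under this scaling, a new customer whose value lies far above 1 is unlike the reference group; where to place the cut-off is a separate decision.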

Table 2: Standardized data

No.    Characteristics
       Z_1     Z_2     ...   Z_i     ...   Z_k
1      z_11    z_21    ...   z_i1    ...   z_k1
2      z_12    z_22    ...   z_i2    ...   z_k2
...
j      z_1j    z_2j    ...   z_ij    ...   z_kj
...
n      z_1n    z_2n    ...   z_in    ...   z_kn

The Mahalanobis distance is defined by Eq. (23).

$$D_\lambda^2 = \frac{1}{k}\sum_{i=1}^{k}\sum_{j=1}^{k} a_{ij}\, z_{i\lambda}\, z_{j\lambda} \quad (\lambda = 1, 2, \ldots, n) \tag{23}$$

All the MDs form the Mahalanobis space. In general, the value of MD is less than 2.5, and the probability of MD being larger than 4 is quite low. A threshold must be designated as the decision criterion in an MD space. In the study, the threshold is obtained by setting the Type I and Type II errors.

4. AN EMPIRICAL STUDY

The research took the customer data of a large warehouse in southern Taiwan as an example to verify the effectiveness of the proposed approach. We focused on the purchasing transaction data of skin care products from June 1 to November 30, 2004. There were 4119 customers and 036 transaction records in total.

4.1 Data Preprocessing and Customer Segmentation

The study utilized Microsoft SQL Server 2000 as the data processing platform and built a CRM database including both the transaction table and the customer table. The customer table has two kinds of data: demographic variables and geographic variables. Demographic variables include Age, Gender, Occupation, Income, Education, and Marital Status. There is just one geographic variable to be analyzed, Commercial Circle, which represents the distance between a customer and the warehouse. The paper distinguishes the customer value according to the segmentation results of Luo [9], in which the customers were differentiated into six clusters: premium customers, prospect customers, uneconomic customers, new customers, undesirable customers, and lost customers. If a business can increase its retention rate by 5%, it can increase profit by 25% on average. Since the customer values of the six segments are different, the enterprise should make its marketing decisions carefully. The study suggests advertising to the customers with high value and saving the marketing cost of low-value customers.
In terms of marketing, the research defines two categories of customers. The first category is defined as the marketing target and consists of premium customers and prospect customers. Customers in the second category, including uneconomic customers, new customers, undesirable customers, and lost customers, yield low profit or even negative profit. The enterprise should avoid directing marketing activities at customers in the second category. The distribution of the two categories is shown in Table 3.

Table 3: The distribution of customers

Category                          Customer Segmentation   Number of Customers   Subtotal   Proportion
Category 1 (Marketing)            Premium                 103                   739        17.94%
                                  Prospect                636
Category 2 (Needn't Marketing)    Uneconomic              164                   3380       82.06%
                                  New                     1238
                                  Undesirable             894
                                  Lost                    1084
Total                                                                           4119       100%

To increase the accuracy of prediction, the research combines three individual classifiers and compares the performance of the multiple classifier system with that of the individual classifiers. Accuracy is defined as the number of correct classifications divided by the total number of samples in the test data set. It can be observed that the number of customers in Category 2 is much larger than that in Category 1. To avoid over-learning on the training data set, which would lower accuracy, the study randomly selected 739 customers from Category 2 and divided the customers of the two categories into training samples and test samples, as shown in Table 4.

Table 4: The distribution of training and test data set

Category   Data Set   Number of Customers   Total
1          Training   493                   739
           Test       246
2          Training   493                   739
           Test       246

4.2 Accuracy of the Individual Classifier

The research divided customers into two categories. Category 1 contains the customers who create high value for the business and who are expected to respond to marketing activities. Category 2 contains customers with low value to the business, for whom marketing activities are not worthwhile. The purpose of the paper is to use the customer's basic data to predict which category a new customer belongs to. The basic data considered include Age, Gender, Occupation, Income, Education, Marital Status, and Commercial Circle. The following subsections illustrate the accuracy of the three individual classifiers (BP network, decision tree, and Mahalanobis distance) in predicting the value of a new customer.

4.2.1 Back-propagation Network

The collected customer's basic data consist of seven variables, but not all variables have a significant influence on classification. Applying the Chi-square test to all the variables between the two categories, we found that just four variables differ significantly: Commercial Circle, Age, Gender, and Education. These four variables are the input variables of the network. The paper applies the Taguchi method to determine the optimal parameters of the BP network. The output layer contains two neurons; one represents Category 1 and the other represents Category 2. We designed a network with one hidden layer, so the network architecture became 4-h-2, where h is the number of hidden neurons. The study tested three settings of h based on the sum of the input neurons and output neurons: half of the sum, the sum, and double the sum. Matlab 7.0 was used as the neural network analysis tool. The study chose the LM training algorithm for the BP network. Three important parameters of the LM training algorithm are µ, µ_dec, and µ_inc. We selected three levels for each of the factors affecting BP performance in the Taguchi experiments. After the parameter optimization procedure, the study obtained the corresponding weights of the network by learning from the training samples, and used the test samples to evaluate the performance of the BP classifier. The accuracy of the BP network on the test data set is 81.71%.

4.2.2 Decision Tree

The paper utilized Answer Tree 3.1 as the decision tree analysis tool. The training data set was used to construct a decision tree with 9 rules. After the pruning procedure, the decision tree has 4 rules (Figure 4). A rule is created by following one path of the tree. The 4 rules and their corresponding accuracy are described below.

Figure 4: The binary decision tree used to classify customers

Rule 1: IF Commercial Circle 5, THEN classify the customer to Category 1. (Accuracy is 85.4%)
Rule 2: IF Commercial Circle 5 & Age 34.5, THEN classify the customer to Category 2. (Accuracy is 85.85%)
Rule 3: IF Commercial Circle 5 & Age 34.5 & Age 49.5, THEN classify the customer to Category 1. (Accuracy is 7.03%)
Rule 4: IF Commercial Circle 5 & Age 49.5, THEN classify the customer to Category 2. (Accuracy is 70.83%)

Using the four rules to classify the test data set, the accuracy is 81.30%.

4.2.3 Mahalanobis Distance

The research utilized Matlab 7.0 to calculate the Mahalanobis distance between customers. The customer's basic data contain 7 variables. To test the effectiveness of the variables in the MD calculation, the research first performed Taguchi experiments by adopting the L8(2^7) orthogonal array. The experimental results showed that the most significant variables in the MD calculation are Commercial Circle, Age, Occupation, and Marital Status. These four variables were used as the characteristics described in Sec. 3.3 to calculate the MD. The MD classification results for the test samples indicate an accuracy of 68.29%.

The accuracy of the three individual classifiers is listed in Table 5. It can be observed that the BP network has the best performance in predicting the customer value, and the MD method has the worst performance.

Table 5: The accuracy of the individual classifier

Classifier     BP      MD      Decision Tree
Accuracy (%)   81.71   68.29   81.30

4.3 Multiple Classifier System

To increase the prediction accuracy of customer value, the study combined the three individual classifiers by majority vote strategy, BP combination, and SOM combination. The performance of the different combination methods is analyzed below.

4.3.1 Combination by Majority Vote Strategy

As the name of the method suggests, the result of the majority vote is decided by voting. When the affirmative votes for a category outnumber the negative votes, the classification result is that category. In this case, if at least two of the three classifiers assign a customer to Category 1, the voting strategy classifies the customer to Category 1, and vice versa. Using the majority vote strategy to combine the classifiers, the accuracy rises to 84.55%.

4.3.2 Combination by Back-propagation Network

The results of the three individual classifiers are taken as the network input, and the output layer still has 2 neurons to indicate the two categories into which customers are classified. If the result of an individual classifier is Category 1, the input value for the corresponding input neuron is 1; otherwise, the input value is 2. The number of hidden layers in this model was also set to one, and the number of hidden neurons was set to half of the sum of the input neurons and the output neurons.
The BP combination method achieved 85.57% classification accuracy on the test data set.

4.3.3 Combination by SOM Network
The input layer of the SOM network also has 3 neurons, which represent the outputs of the three individual classifiers. The SOM network topology is set to 2. The research also used Matlab 7.0 for the SOM network application. The classification result on the test data indicates an accuracy of 83.54%.

Table 6 lists the performance of the three combination methods. Compared with the accuracy shown in Table 5, the accuracy improves whichever combination method is used, and the BP combination method gives the best prediction result. Table 7 shows part of the classification results. If a customer is classified to the right category, the value is 1; if the customer is wrongly classified, the value is 0. For customer id 505607, only the decision tree among the three individual classifiers gives the correct classification; both BP and MD are wrong. However, after combination by BP and SOM, the customer is classified to the right category. Customers with id 505579, 505599, and 5056036 are wrongly classified by the decision tree, but after combination they are all classified to the right category. These results show that combining multiple classifiers can cover the mistakes of an individual classifier and therefore increases the classification accuracy.

Table 6: Comparison of different combination methods
              Majority Vote   BP Combination   SOM Combination
Accuracy (%)  84.55           85.57            83.54

5. CONCLUSIONS AND FUTURE WORKS
The research used the customer's basic data to predict customer value in order to offer a suitable marketing strategy. The customer's basic data are easier to obtain than customer behavior data extracted from purchasing transaction data. If businesses can use the customer's basic data to predict a new customer's value, they can make a suitable marketing decision in advance of the actual purchasing behavior.
The value of the research is to give businesses the opportunity to respond faster to new customers and to decrease marketing cost. According to the findings of the research, not all the variables in the customer's basic data are used as input variables in the three individual classifiers. Two input variables are common to all three classifiers: Commercial Circle and Age. This means that the two variables play an important role in the warehouse operation. The distance between the customer and the warehouse influences the customer's inclination to go to the warehouse, and the age of a customer has a great effect on the

purchasing behavior for skin care products. The empirical analyses show that the multiple classifier system performs better than the individual classifiers whichever combination method it uses. The best combination method is the BP network. The results are useful for businesses to estimate a new customer's value and apply a suitable marketing strategy to that customer.

Although the multiple classifier system performs well in predicting customer value, the authors suggest some directions for future research:
1. The MD method is only effective in classifying objects into two categories, which limits the customer classification and the corresponding marketing strategy. Further research can explore the performance of other classifiers.
2. Emotion marketing is based on the consumer's personality. Some research has tried to establish a relationship between constellation and personality. Since most businesses collect their customers' birthday data, there is a chance to investigate how constellation affects customer purchasing behavior.
3. Marketing cost is not considered in the paper. Moreover, the effect of marketing strategies is not evaluated. Both should be addressed in future research on the topic.

Table 7: Part of the classification results of all the classifiers used in the research
Columns: CARD_NO; Individual Classifier (BP, MD, Decision Tree); Multiple Classifier System (Majority Vote, SOM Combination, BP Combination)
505079 0
505087
5050857
50505 0 0 0 0
50556
505287
505344 0
505542
505579 0
505607 0 0 0
5052039
5052462
5052572 0
5052740
505292
5053034 0
5053655 0
505599 0
5056036 0
570002497 0 0 0 0 0
ABOUT THE AUTHORS
Yu-Min Chang is an assistant professor in the Department of Industrial Engineering and Management at I-Shou University, Taiwan, R.O.C. She received her Ph.D. degree in Industrial Engineering and Engineering Management from National Tsing-Hua University. Her current research and teaching interests are in logistics management, data mining, and automated inspection. She is a member of CIIE, ORST, and TAAI.

Yu-Chieh Lo is a graduate student of Industrial Engineering and Management at I-Shou University. Her research interest is customer relationship management.

Shang-Yi Lin is a graduate student of Industrial Engineering and Management at I-Shou University. His research interests are data mining and customer relationship management.

(Received September 2005, revised November 2005, accepted December 2005)