An Architecture to Support Distributed Data Mining Services in E-Commerce Environments

S. Krishnaswamy 1, A. Zaslavsky 1, S.W. Loke 2

1 School of Computer Science & Software Engineering, Monash University
900 Dandenong Road, Caulfield East 3145, Australia
Email: {shonali.krishnaswamy, arkady.zaslavsky}@csse.monash.edu.au

2 CRC for Enterprise Distributed Systems Technology
900 Dandenong Road, Caulfield East 3145, Australia
Email: swloke@dstc.monash.edu.au

Abstract

This paper presents our hybrid architectural model for Distributed Data Mining (DDM), which is tailored to meet the needs of e-businesses where application service providers sell DDM services to e-commerce users and systems. The hybrid architecture integrates the client-server and the mobile agent technologies. This model focuses on the optimisation and costing issues of DDM, which are particularly relevant in the context of billing users for data mining services.

1. Introduction

The paradigm of Application Service Providers (ASP) has emerged recently to address the application software needs of medium range enterprises. The underlying principle of ASP is the notion of renting software [4]. Thus, instead of buying a package and installing it, organisations log on to an application service provider (via either the Internet or a dedicated communication channel), use the application packages provided by the ASP and pay for this usage. This paradigm is particularly useful for small to medium range organisations as they are often constrained by the high cost of software. The emergence of ASP is closely tied with e-commerce [8]. Technologies like e-commerce provide an opportunity for small and medium range companies to compete in global markets, which were previously the domain of large organisations and multi-nationals. ASP technology provides smaller companies with cutting edge software technology that makes the e-business arena a more level playing field.

The objective of this paper is to demonstrate how distributed data mining (DDM) can be provided as a service hosted by an ASP in an e-commerce environment. We present our hybrid DDM architecture that is tailored to meet the specific needs of an environment where e-commerce systems interact with an ASP for fulfilment of their data mining needs.
Data mining has been recognised as a contributing technology in e-commerce frameworks [1]. However, it is the evolution of data mining along the dimension of distribution that has enabled easier integration with e-commerce environments (which are intrinsically linked with the World Wide Web, distribution and heterogeneity). Distributed data mining systems are largely seen as operating within intra-organisational domains (albeit to meet the needs of global organisations which have distributed and heterogeneous data resources). However, when distributed data mining moves from the confines of an organisation to become a generic service provided by an ASP and accessed by different components of an e-commerce system, there are additional requirements and challenges that have to be addressed. The primary issues of concern are billing of users based on estimated costs and response times, improved performance to meet real time needs and the ability to be flexible and extensible to the diverse data mining needs of different organisations. In this paper we present our hybrid DDM architecture which addresses the above questions.

The work reported in this paper has been funded in part by the Co-operative Research Centre Program through the Department of Industry, Science and Tourism of the Commonwealth Government of Australia.

The paper is
organised as follows. In section 2, we illustrate the interactions of an e-commerce environment with a DDM service provided by an ASP. In section 3 we present our hybrid architecture for DDM. In section 4 we present the cost models which we have developed for the distributed data mining process. In section 5 we discuss related work. Finally, in section 6 we discuss the current status of our work and the future directions.

2. Role of DDM in an E-Commerce Scenario

In this section, we present a hypothetical e-commerce scenario and focus on the role of distributed data mining as a service hosted by an ASP. Consider an on-line shopping centre which consists of customers, vendors and a trader. The customers access the shopping centre through a web interface and interact with the vendors via the trader. At one level, the trader provides catalogue services to customers in terms of vendor profiles and availability of goods and services. At another level, the trader negotiates transactions between the customers and the vendors.

The need for distributed data mining in such a scenario arises from two possible sources, namely, the vendor and the trader. The vendors' data mining requirements have their origins in traditional data mining applications such as market basket analysis. The trader's data mining needs will be centred on customer profiling to improve the level of service provided to individual customers. Since the environment is inherently distributed and heterogeneous, the focus is on distributed data mining. In addition to the complexity of distribution, e-commerce adds to the mining process an additional dimension of complexity by emphasising the importance of optimised response time. For example, in a situation where a product required by a customer is not currently available, the trader might want to provide the customer with details such as the likelihood of when the product would be available by analysing past trends or similar products offered by vendors. The trader might also want to give the customer an incentive for waiting by analysing dependencies with seasonal specials.

Traditionally, both the trader and individual vendors would have their own data mining systems to meet their individual business needs.
However, the rapidly emerging ASP trend provides a means for distributed data mining to be a generic service. The advantage of this is that it allows organisations to access data mining services without having to be concerned with the setting up costs. Further, such a service would have the advantage of being extensible enough to incorporate a suite of data mining algorithms that different users and/or the ASP would offer as an integrated service. A framework for the role of distributed data mining in the e-commerce scenario discussed above is illustrated in figure 1. The components of the figure are:

Customers. Customers are buyers who use the on-line shopping centre to procure goods and services.

E-Commerce system. The e-commerce system provides the infrastructure for the on-line shopping centre. It comprises a web interface, an e-catalog, an intermediary and a database. The web interface is the point of access for the customers into the shopping centre. The e-catalog is a directory of the goods, services and vendor profiles. The intermediary negotiates transactions between the customers and the vendors. The database is used to maintain transaction details, vendor and customer information for use by the e-catalog and the intermediary.

Vendors. The vendors are the businesses that use the on-line shopping centre as a medium for marketing and selling their products.

Application Service Providers (ASP). The ASP provides application services to the e-commerce system components and the vendors. The focus in the above scenario is on the data mining service that is provided by the ASP. Those vendors that require this service and the e-commerce system pay the ASP for accessing the distributed data mining system that is provided.

Distributed Data Mining (DDM) System. This is the distributed data mining system that the ASP uses to provide generic data mining services to its subscribers.

[Figure 1. DDM in an E-Commerce Environment. Labels: on-line shopping centre (web interface, e-catalog, trader, database); Vendor 1; Vendor 2; service provider with DDM service; data sources (flat files, Oracle, Sybase, legacy application).]
To function robustly in the environment illustrated in figure 1, the system needs to possess certain characteristics such as heterogeneity, costing infrastructure, optimisation, security and extensibility. Heterogeneity implies that the system must be able to mine data from heterogeneous and distributed locations. It must be able to support user requirements with respect to different distributed computing paradigms (including the client-server and mobile agent based models). The underlying philosophy is that the ASP should not impose one of the models on the users and must be able to
support specific needs and requirements that are suitable for the user. The costing infrastructure refers to the system having a framework for estimating the costs of different tasks. This implies that a task that requires higher computational resources and/or faster response time should cost the users more on a relative scale of costs. Further, the system should be able to optimise the distributed data mining process to provide the users with the best response time possible (given the constraints of the mining environment and the expenses the user is willing to incur). Security implies that in some instances, the user might be mining highly sensitive data that should not leave the owner's site. In such cases, the option is to use the mobile-agent model where the mining algorithm and the relevant parameters are shipped to the data site and at the end of the process the mobile agent is destroyed on the site itself (i.e. it does not leave the site). The system must be extensible to provide for a wide range of mining algorithms. Users must be able to register their algorithms with the ASP for use in their specific DDM jobs. This implies that there needs to be a high level semantic specification of the distributed data mining process.

The above discussion highlights the need for distributed data mining in e-commerce. It also outlines the specific requirements for a DDM system to operate in an e-commerce environment as a generic service provided by an ASP.

3. Hybrid Model for Distributed Data Mining

In this section we present our hybrid model for distributed data mining, which is tailored to meet the requirements for DDM systems to operate in e-commerce and ASP dominated environments. The distinguishing features of this architecture are the integration of the client-server model and the mobile-agent paradigm and an optimiser which builds cost estimates for DDM tasks. It supports the ability for mining to be performed at remote sites using mobile agents and also incorporates dedicated data mining servers with well-defined computational resources. This helps to deal with heterogeneous and varied client needs. The cost estimates address the issues of costing and optimisation of the DDM process. The hybrid model operates on the principle of adopting the most suitable approach for a DDM task depending on user and resource constraints.
Thus, it has the option of using the client-server model or the mobile-agent model or an integrated approach involving both. The components of the hybrid DDM architecture illustrated in figure 2 are as follows:

Users. The users request data mining services by connecting to the distributed data mining server.

Dedicated Distributed Data Mining Server. This is a server with high computational power that acts as both the point of control for the distributed data mining process and the provider of dedicated resources for mining. The server maintains the distributed data mining management system.

Distributed Data Mining Management System (DDMMS). The DDMMS is the software that performs the various tasks associated with the distributed data mining process. The DDMMS forms the core of this architecture and the way it is structured encapsulates the framework for resource optimisation. The components within the DDMMS include a user manager, algorithm manager, optimiser, mining process manager and an agent control centre. We now present a detailed outline of the functionality and structure of each of these subcomponents.

User Manager. The users connect to the distributed data mining system through the user manager. The user manager performs the following functions: authentication of users; profiling of the data mining task in terms of the user requirements, including the data mining query, the output required and the time frame within which the output is required; and assigning priorities to tasks as they arrive.

Algorithm Manager. The algorithm manager's primary task is to maintain the data mining algorithms that are part of the distributed data mining system. Users can register any mining algorithm with the system. The users can choose to make available the algorithms that they have registered to other users. At the time of incorporating an algorithm into the system, the algorithm manager records meta level information about the algorithm and its characteristics such as name, version, input parameters, operating environment and output produced. The algorithm manager feeds this information to the mining process manager, which maintains profiles about algorithmic characteristics.

Optimiser.
The optimiser is the component that is primarily responsible for building an estimated cost of alternative strategies and determining the best option for performing the data mining task to meet user needs. The optimiser interacts with the mining process manager in order to collect statistics regarding the current status of the communication channels and the task profile (specifically to determine the user requirements for task completion and the algorithm allocated for the task). It also interacts with the agent control centre (i.e. the mine sweeper agent) for details regarding the data set size. Using the data collected by the mine sweeper and the mining process manager, the optimiser builds an estimated cost model for the alternative ways to perform the data mining and decides on the option that will meet the user requirements as closely as possible.
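The optimiser's final decision step can be illustrated with a minimal sketch. It is not the authors' implementation: the `Strategy` record, the `choose_strategy` function, the monetary `est_cost` field and the selection rule (cheapest option that meets the user's time frame, otherwise the fastest option) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Strategy:
    """One alternative way to run a DDM task (hypothetical record)."""
    name: str
    est_response_time: float  # estimated response time, in seconds
    est_cost: float           # estimated charge to the user (assumed unit)


def choose_strategy(strategies, deadline):
    """Pick the option that meets the user's requirements most closely.

    Assumed rule: among the strategies whose estimated response time
    fits within the deadline, take the cheapest one; if none fits,
    fall back to the fastest strategy available.
    """
    if not strategies:
        raise ValueError("at least one strategy is required")
    feasible = [s for s in strategies if s.est_response_time <= deadline]
    if feasible:
        return min(feasible, key=lambda s: s.est_cost)
    return min(strategies, key=lambda s: s.est_response_time)


if __name__ == "__main__":
    options = [
        Strategy("mobile-agent", 120.0, 8.0),
        Strategy("client-server", 90.0, 12.0),
        Strategy("hybrid", 100.0, 10.0),
    ]
    print(choose_strategy(options, deadline=110.0).name)  # hybrid
    print(choose_strategy(options, deadline=50.0).name)   # client-server
```

In practice the estimates fed into such a rule would come from the statistics gathered by the mining process manager and the mine sweeper agent.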
[Figure 2. Hybrid Model For Distributed Data Mining. Labels: users (PC, workstation, notebook) and results; DDM management system (user manager, algorithm manager, optimiser, mining process manager); agent control centre (user agent, knowledge integrator, mine sweeper agent, mining agent, resource monitoring agents, network monitoring agent); DDM server with local computational resources; data server 1 and data server 2; client-server model (data transfer for mining locally); mobile agent model; status monitoring information.]

Mining Process Manager. This module forms the core of the distributed data mining system. It is the coordinating facility between the different components of the system and provides current information about the system. It forms a point of reference from which information can be obtained regarding the current status of various aspects of the system. The mining process manager can be viewed as a dynamic directory for distributed data mining tasks. We are currently developing a specification of the entire spectrum of the data mining process. This specification will form the basis for the operations of this component. To the best of our knowledge, the mining process manager is the first integrated attempt at dynamically tracking and specifying the components and their interactions within the distributed data mining framework.

Agent Control Centre (ACC). The agent control centre is the framework within which the agent activities in the distributed data mining system take place. The ACC is responsible for activating/generating/assembling other agents required for the data mining process. The different agent types and their tasks are briefly discussed below.

User Agents. The user agent's primary task is to support the push model for providing users with updates of the status of their tasks and the final results of the data mining.

Network Monitoring Agent. This agent continuously monitors the links to the data servers by traversing the network and updating the status of the communication links in the mining process manager.

Data Resource Monitoring Agent. A data resource monitoring agent is assigned to each data source that becomes part of the system. The agent is responsible for providing information about the contents of the data sources to the mining process manager.

Mine-Sweeper Agent.
The mine-sweeper agent is responsible for travelling to a data server, performing preprocessing of the data, determining the available computational resources at the data server and estimating the data size.

Mining Agent. The mining agent is an instantiation of the mining algorithm allocated for a task.

Knowledge Integrator. The knowledge integrator is a component that combines the data mining results from different data sources and provides the final result to the user agent, which in turn communicates this to the user.

In this section of the paper we have proposed an architecture for distributed data mining which includes an optimisation component. We now present a mathematical cost model used by the optimiser to develop cost estimates for DDM tasks.

4. Cost Models for Distributed Data Mining

In this section of the paper, we present our cost models for distributed data mining. The cost formulae are
estimates of the distributed data mining response time for a given task using a specified architectural model when environmental factors are taken into consideration. The cost model provides the theoretical basis for estimating response times for DDM tasks and is used by the optimiser in the hybrid architectural model. We initially present the general cost model for distributed data mining. We then illustrate how the general model is mapped to the cost functions for alternative DDM scenarios. The response time (expressed at a high level of abstraction) for distributed data mining is as follows:

$$T = t_{ddm} + t_{ki} \tag{1}$$

where $T$ is the response time, $t_{ddm}$ is the time taken to perform mining in a distributed environment and $t_{ki}$ is the time taken to perform knowledge integration. Depending on the model used for distributed data mining (i.e. mobile agent or client-server) and the different scenarios within each model, the factors which determine $t_{ddm}$ will change. This results in a consequent change in the actual cost function that determines $t_{ddm}$. In the following discussion we present the different distributed data mining scenarios and the cost functions to determine $t_{ddm}$ for each case.

4.1. Mobile Agent Model

This case is characterised by a given distributed data mining task being executed in its entirety using the mobile agent paradigm. The core steps involved are: submission of a task by a user, dispatching of a mobile agent (or agents) to the respective data server (or servers), data mining and the return of the mobile agent(s) from the data resource(s) with mining results. This model is characterised by a set of mobile agents traversing the relevant data servers to perform mining. In general, this can be expressed as $m$ mobile agents traversing $n$ data sources. There are three possible alternatives within this scenario. The first possibility is $m = n$, where the number of mobile agents is equal to the number of data servers. This implies that one data mining agent is sent to each data source involved in the distributed data mining task. The second option is $m < n$, where the number of mobile agents is less than the number of data servers. The implication of having fewer agents than servers is that some agents may be required to traverse more than one server.
We do not consider the third case of $m > n$ since this is in effect equivalent to case 1 above, where there is a mobile agent available per data server. Each of the above alternatives has its own cost function. These cost models are described as follows.

4.1.1. Equal number of mobile agents and data servers (m=n).

This is a case, as illustrated in figure 3, where data mining from different distributed data servers is performed in parallel. The algorithm used across the different data servers can be uniform or varied. The system dispatches a mobile agent encapsulating the data mining algorithm (with the relevant parameters) to each of the data servers participating in the distributed data mining activity. Let $n$ be the number of data servers. Therefore, the number of mobile agents is $n$ (since $m = n$). In order to derive the cost function for the general case involving $n$ data servers and $n$ data mining agents, we first formulate the cost function for the case where there is one data server and one data mining agent.

[Figure 3. Equal number of mobile agents and data sources. Labels: agent centre; Agent 1, Agent 2, Agent 3; Data Source 1, Data Source 2, Data Source n.]

Let us consider the case where data mining has to be performed at the $i$-th data server (i.e. $1 \le i \le n$). The cost function for the response time to perform distributed data mining involving the $i$-th data server is as follows:

$$t_{ddm} = t_{dm}(i) + t_{dmagent}(AC, i) + t_{resultagent}(i, AC) \tag{2}$$

The terms in the above cost estimate are discussed below.

$t_{ddm}$. This is the response time for performing distributed data mining. In this particular case the distributed data mining process is characterised by one data server and one mobile agent.

$t_{dmagent}(AC, i)$. In our cost model, the representation $t_{mobileagent}(x, y)$ refers to the time taken by the agent mobileagent to travel from node $x$ to node $y$. Therefore $t_{dmagent}(AC, i)$ is the time taken by the mobile agent dmagent (which is the agent encapsulating the mining algorithm and the relevant parameters) to travel from the agent centre ($AC$) to the data server ($i$). In general, the time taken for a mobile agent to travel depends on the following factors: the size of the agent and the bandwidth between nodes (e.g. in kilobits per second).
The travel time is proportional to the size of the agent and is inversely proportional to the bandwidth (i.e. the time taken increases as the agent size increases and decreases as the bandwidth increases). This can be expressed as follows:

$$t_{dmagent}(AC, i) \propto \text{size of } dmagent \tag{3}$$

$$t_{dmagent}(AC, i) \propto \frac{1}{\text{bandwidth}} \tag{4}$$

From (3) and (4):

$$t_{dmagent}(AC, i) = \frac{k \times \text{size of } dmagent}{\text{bandwidth between } AC \text{ and } i}$$

In the above expression for the time taken by the data mining agent to travel from the agent centre to the data
server, $k$ is a constant. In [12] the size of an agent is given by the following triple,

size of an agent = <Agent State, Agent Code, Agent Data>

where agent state is the execution state of the agent, agent code is the program encapsulated within the agent that performs the agent's functionality and agent data is the data that the agent carries (either as a result of some computation performed at a remote location or the additional parameters that the agent code requires). On adapting the above representation to express the size of the data mining agent (dmagent), we get,

size of dmagent = <dmagent state, data mining algorithm, input parameters>

$t_{resultagent}(i, AC)$. This is the time taken for the data mining results to be transferred from the data server ($i$) to the agent centre ($AC$). It is estimated similarly to $t_{dmagent}$. This agent does not carry code to be executed, but merely transfers the results to the agent centre for knowledge integration. It must be noted that unlike the time taken by the data mining agent to travel from the agent centre to the data site, the time taken for the result to be carried cannot be estimated a priori since the size of the results depends on the characteristics of the data.

$t_{dm}(i)$. This is the time taken to perform data mining at server $i$. The duration of the data mining process ($t_{dm}$) depends on factors such as the processor speed, available main memory, data size and complexity of the algorithm. We are currently working on developing techniques for estimating the data mining response time based on historical information.

In the foregoing discussion, we developed the cost estimate for distributed data mining involving a scenario where there was one mobile agent and one data server. We now extend the cost estimate for the general case characterised by $n$ mobile agents and $n$ distributed data sources. Let there be $n$ data sources which need to be accessed for a particular distributed data mining exercise. The agent centre dispatches $n$ mobile agents encapsulating the respective mining algorithms and parameters (i.e. one to each of the data sources) concurrently. Mining is performed at each of the sites in parallel and the results are returned to the agent centre.
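The one-agent, one-server estimate of equation (2) can be worked through numerically with a small sketch. The function names are hypothetical, and two simplifications are assumptions rather than part of the cost model: the size triple is collapsed into a single number by summing its components, and the same bandwidth is used in both directions.

```python
def agent_size(state, code, data):
    """Collapse the <state, code, data> triple into one size (assumption:
    the three components are simply summed, all in kilobits)."""
    return state + code + data


def agent_travel_time(size_kilobits, bandwidth_kbps, k=1.0):
    """Travel time of an agent: k * size / bandwidth, in seconds."""
    if bandwidth_kbps <= 0:
        raise ValueError("bandwidth must be positive")
    return k * size_kilobits / bandwidth_kbps


def single_server_ddm_time(mining_time, dmagent_size, result_size,
                           bandwidth_kbps, k=1.0):
    """Equation (2): mining time at the server, plus the mining agent's
    trip from the agent centre, plus the result agent's trip back."""
    outbound = agent_travel_time(dmagent_size, bandwidth_kbps, k)
    inbound = agent_travel_time(result_size, bandwidth_kbps, k)
    return mining_time + outbound + inbound


if __name__ == "__main__":
    # Illustrative numbers only: a 256-kilobit mining agent, 64 kilobits
    # of results, a 128 kbps link and 30 seconds of mining.
    dmagent = agent_size(state=16, code=200, data=40)
    print(single_server_ddm_time(30.0, dmagent, 64, 128))  # 32.5
```

Note that the result size, and hence the return trip, is only known after mining, so in an estimate made beforehand this value would itself have to be a guess.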
Since the mining is performed at the distributed locations concurrently, the total time taken is equal to the time interval required by the server which takes the longest time to perform mining and return results. Therefore,

$$t_{ddm} = \max_{i = 1..n} \left( t_{dm}(i) + t_{dmagent}(AC, i) + t_{resultagent}(i, AC) \right) \tag{5}$$

The expression $t_{dm}(i) + t_{dmagent}(AC, i) + t_{resultagent}(i, AC)$ represents the duration of the data mining process in the $i$-th server and $i$ varies from 1 to $n$. The parts of the expression such as $t_{dm}(i)$, $t_{dmagent}(AC, i)$ and $t_{resultagent}(i, AC)$ are estimated as explained previously. The knowledge integration can only take place after all the results are returned to the agent centre.

4.1.2. Fewer mobile agents than data servers (m<n).

This is the second case, as illustrated in figure 4, where the number of mobile agents ($m$) available for distributed data mining is less than the number of data sources ($n$) participating in a distributed data mining task (i.e. $m < n$). In order to derive the cost formula for the general case involving $m$ data mining agents and $n$ data sources, we first develop the cost model for the situation involving one agent and $n$ servers.

[Figure 4. Fewer mobile agents than data servers. Labels: agent centre; Agent 1, Agent 3; Data Source 1, Data Source 2, Data Source n.]

The principal difference between this scenario (where there are fewer agents than data sources) and the previous case (where there was one agent per data source) is that in the latter distributed data mining is performed concurrently in all the data sites. In the former, certain mobile agents are assigned to more than one data site for mining. Hence such an agent must carry the mining algorithm and the results obtained at the first data site in its path to all the specified servers where it has to mine. We make an assumption that the mining agent returns to the agent centre only after it has accomplished its task in all the respective data sources assigned to it. This implies that as the agent travels through its specified route of data servers, its size increases (to include the results obtained from each site that has been mined). Before considering the general case of $m$ agents and $n$ data sources, we first develop the cost model for the situation where there is only one mobile agent for $n$ distributed data sites.
The agent is dispatched from the agent centre to the first of the $n$ data sites. From there on, the agent completes the mining task and carries the results obtained to the next data site and so on, until each of the $n$ sites has been mined. The mining agent then finally returns to the agent centre where knowledge integration is performed prior to returning the consolidated results to the user. Therefore, the response time for distributed data mining is as stated in equation (1). In this scenario, $t_{ddm}$ is as expressed in equation (7) as follows:
$$t_{ddm} = t_{dmagent}(AC, 1) + \sum_{i=1}^{n-1} \left[ t_{dm}(i) + t_{dmagent}(i, i+1) \right] + t_{dm}(n) + t_{dmagent}(n, AC) \tag{7}$$

In the above cost estimate, the term $t_{dmagent}(AC, 1)$ is the time taken for the data mining agent to travel from the agent centre to the first data site (the data servers being numbered from 1 through $n$). The second term is the total time taken by the mining agent to perform data mining at server $i$ and then travel to server $i+1$ as $i$ ranges from 1 to $(n-1)$. The third term is the time taken to mine at the last site and the final term is the time taken to return from it to the agent centre. The time taken to perform data mining at a given site and the time taken for the agent to travel between any two sites are estimated as discussed previously. However, in this case, the size of the dmagent is specified as follows:

size of dmagent = <dmagent state, data mining algorithm(s), input parameters + data mining results>

Thus the data carried by the data mining agent in this case includes both the input parameters and the data mining results. This implies that the size of the agent increases incrementally as the number of sites that have been mined increases along its route.

We now extend the above cost estimate for the general case involving $m$ mobile agents and $n$ DDM sites. The $n$ data sites are numbered from 1 through $n$. Let $ds$ be the set of labels of the data sites. Therefore $ds = \{1, 2, \ldots, n\}$. Let $m$ be the number of data mining agents available. The set $ds$ is divided into $m$ subsets $ds_1, ds_2, \ldots, ds_m$ with the following property: $ds_i \subseteq ds$ (i.e. $ds_i \subseteq \{1, 2, \ldots, n\}$, $1 \le i \le m$). The sets $ds_i$ ($1 \le i \le m$) have the additional property of independence. That is, $ds_i \cap ds_j = \emptyset$, $i \ne j$. Since the subsets are disjoint and together cover $ds$, the sum of the cardinalities of the subsets $ds_1, ds_2, \ldots, ds_m$ is equal to the cardinality of $ds$. This implies that $|ds_1| + |ds_2| + \cdots + |ds_m| = |ds|$. Thus, the $n$ data servers are divided into $m$ subsets and the $i$-th data mining agent is assigned the task of mining the data sites in the subset $ds_i$ (where $1 \le i \le m$). Distributed data mining can occur concurrently at one level (i.e. there are $m$ different mining agents operating at the same time). However, each of these $m$ agents could be assigned several sites to mine (i.e. the agent has to travel to different sites). Thus the $i$-th agent (where $1 \le i \le m$) has to travel to and perform mining in the sites specified in $ds_i$.
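The single-agent route of equation (7), in which one agent visits every site in turn and grows as it accumulates results, can be sketched as follows. The function name, the argument layout and the idea of representing the agent's size as one number that increases by the result size at each site are assumptions made for illustration.

```python
def route_ddm_time(mining_times, result_sizes, base_agent_size,
                   bandwidths, k=1.0):
    """Estimate the time for one agent to mine n sites in sequence.

    mining_times[i]  : mining time at the i-th site on the route (s)
    result_sizes[i]  : size of the results produced there (kilobits)
    base_agent_size  : agent size when it leaves the agent centre
    bandwidths       : n + 1 link bandwidths (kbps), in route order:
                       centre -> site 1, site 1 -> site 2, ...,
                       site n -> centre
    """
    n = len(mining_times)
    if n == 0:
        raise ValueError("the route must contain at least one site")
    if len(result_sizes) != n or len(bandwidths) != n + 1:
        raise ValueError("expected n result sizes and n + 1 bandwidths")

    size = base_agent_size
    total = k * size / bandwidths[0]          # centre -> first site
    for i in range(n):
        total += mining_times[i]              # mine at this site
        size += result_sizes[i]               # agent now carries results
        total += k * size / bandwidths[i + 1]  # hop to next site or centre
    return total


if __name__ == "__main__":
    # Two sites, a 100-kilobit agent, 50 kilobits of results per site,
    # and 100 kbps on every link (illustrative numbers only).
    print(route_ddm_time([10.0, 20.0], [50, 50], 100, [100, 100, 100]))
    # 34.5
```

Because the agent is larger on every later hop, the order in which sites are visited can change the estimate when link bandwidths or result sizes differ between sites.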
The total time taken to mine is therefore the time taken by the agent which takes the maximum time interval to complete its task. Let $ds_i(j)$ denote the $j$-th data site on the route of the $i$-th agent. The cost estimate for the $i$-th agent's response time is as follows in equation (8):

$$t_{ddm}(i) = t_{dmagent}(AC, ds_i(1)) + \sum_{j=1}^{|ds_i|-1} \left[ t_{dm}(ds_i(j)) + t_{dmagent}(ds_i(j), ds_i(j+1)) \right] + t_{dm}(ds_i(|ds_i|)) + t_{dmagent}(ds_i(|ds_i|), AC) \tag{8}$$

In the above expression, the first term is the time taken by the data mining agent to travel from the agent centre to the first data server in its path (i.e. the first site in the $i$-th subset $ds_i$). The term involving the summation is the time taken for the agent to mine and to travel to the respective data sites within the set assigned to it (excluding the final site). The second last term is the time taken to mine at the last data site in its path. The final term in the expression is the time taken for the agent to travel from the last site on its path to the agent centre. Since there are $m$ agents operating concurrently, the time taken for completion of the distributed data mining process is the time taken by the agent requiring the longest completion time. Thus,

$$t_{ddm} = \max_{i = 1..m} \left( t_{ddm}(i) \right) \tag{9}$$

where $t_{ddm}(i)$ is estimated from equation (8).

4.2. Client-server Model

The cost estimate for the response time in DDM systems that use the traditional client-server paradigm is presented in this section. Typically, data from distributed sources is brought to the data mining server (a fast, parallel server) and then mined. Let there be $n$ data sites from which data has to be mined. Let $s_i$ be the data set obtained from the $i$-th site (where $1 \le i \le n$). The response time for DDM for the data set $s_i$ from the $i$-th site is as expressed in equation (11) as follows:

$$t_{ddm}(i) = t_{datatransfer}(i, DMS, s_i) + t_{dm}(DMS), \quad 1 \le i \le n \tag{11}$$

The term $t_{datatransfer}(i, DMS, s_i)$ is the time taken to transfer the data set ($s_i$) from the $i$-th site to the DDM server ($DMS$) and is estimated as follows:

$$t_{datatransfer}(i, DMS, s_i) = \frac{\text{size of } s_i}{\text{bandwidth between } i \text{ and } DMS}$$

The second term, $t_{dm}(DMS)$, is the time taken to mine at the data mining server and is estimated as discussed in section 4.1.1. As equation (11) shows, the data transfer component adds to $t_{ddm}$. This can be a significant addition when the data volumes are large and/or the bandwidth is low.
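A short numerical sketch of equation (11) shows how the transfer term can dominate the client-server estimate. The function names and the units (kilobits and kilobits per second) are assumptions chosen for illustration.

```python
def data_transfer_time(dataset_size_kilobits, bandwidth_kbps):
    """Time to move a data set from its site to the data mining server."""
    if bandwidth_kbps <= 0:
        raise ValueError("bandwidth must be positive")
    return dataset_size_kilobits / bandwidth_kbps


def client_server_ddm_time(dataset_size_kilobits, bandwidth_kbps,
                           mining_time_at_server):
    """Equation (11): transfer time plus mining time at the server."""
    transfer = data_transfer_time(dataset_size_kilobits, bandwidth_kbps)
    return transfer + mining_time_at_server


if __name__ == "__main__":
    # The same 80,000-kilobit data set and 20 s of mining, over a fast
    # and a slow link (illustrative numbers only).
    print(client_server_ddm_time(80_000, 1_000, 20.0))  # 100.0
    print(client_server_ddm_time(80_000, 100, 20.0))    # 820.0
```

With the slower link the transfer accounts for 800 of the 820 seconds, which is the situation in which shipping a small mining agent to the data may be preferable.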
If the mining is done in a parallel server, then the total response time for $n$ data sources is equal to the time taken by the data set requiring the maximum processing time. From equation (11),

$$T = \max_{i = 1..n} \left( t_{ddm}(i) \right) + t_{ki} \tag{12}$$

If on the other hand the client-server model mines the data sets sequentially, the total response time is as expressed in equation (13) as follows:

$$T = \sum_{i=1}^{n} t_{ddm}(i) + t_{ki} \tag{13}$$

4.3. Hybrid Model

The hybrid model has the following two principal characteristics. It combines the best aspects of the agent model and the client-server approach by incorporating an
agent framework with a dedicated data mining server. It brings with it the advantage of dedicated data mining resources (thus alleviating the issues associated with lack of control over remote computational resources in the agent model). It also has the ability to circumvent the communication overheads associated with the client-server approach. This gives the model the option of applying either approach to a particular DDM task. Let $n$ be the number of data sites to be mined. The optimiser decides, on the basis of the cost estimates that it builds, that $n_a$ sites are to be mined using the agent model and $n_{cs}$ sites are to be mined using the client-server paradigm: $n = n_a + n_{cs}$. When $n_{cs} = 0$, the DDM architectural model deployed is in effect the agent model and vice-versa. The response time in the hybrid model, assuming that the mobile agent-based mining and the client-server mining are done in parallel, is the time taken by the technique requiring the longer duration.

$$t_{hybrid} = \max \left( t_{na}, t_{cs} \right) \tag{14}$$

where $t_{na}$ is the time taken to mine the $n_a$ sites using the agent model and $t_{cs}$ is the time taken to mine the $n_{cs}$ sites using the client-server paradigm.

5. Related Work

Several DDM systems using either agent-based architectures or the client-server model have been proposed. The agent paradigm is the more popular approach for building DDM systems. These include Parallel Data Mining Using Agents (PADMA) [5], Generic Data Mining [2], InfoSleuth [7], Java Agents for Meta-Learning (JAM) [11], Besizing Knowledge through Distributed Heterogeneous Induction (BODHI) [6] and Papyrus [10]. DDM systems based on the client-server paradigm include DecisionCentre [3] and IntelliMiner [9]. To the best of our knowledge, there has been no previous attempt to develop an architectural model to support online DDM services provided by Application Service Providers.

6. Conclusions and Future Directions

In this paper, we have presented a hybrid architecture for distributed data mining that integrates the client-server and the mobile agent models. We have shown how the hybrid DDM model enables the provision of generic DDM services by an ASP in an e-commerce environment by addressing the issues of optimisation, costing and extensibility.
We have also developed a cost model for estimating the DDM response time for alternative scenarios to aid in the optimisation process. The experimental validation of the cost model and the implementation of the hybrid DDM architecture are currently in progress.

References

[1] Adam, N.R., Dogramaci, O., Gangopadhyay, A., and Yesha, Y., (1999), Electronic Commerce: Technical, Business and Legal Issues, Prentice Hall, New Jersey, USA.
[2] Botia, J.A., Garijo, J.R., and Skarmeta, A.F., (1998), A Generic Data Mining System: Basic Design and Implementation Guidelines, in Workshop on Distributed Data Mining at the 4th Int. Conf. on Data Mining and Knowledge Discovery (KDD-98), New York, USA, AAAI Press.
[3] Chattratichat, J., Darlington, J., Guo, Y., Hedvall, S., Köhler, M., and Syed, J., (1999), An Architecture for Distributed Enterprise Data Mining, in Proc. of the 7th Int. Conf. on High Performance Computing and Networking (HPCN Europe 99), Amsterdam, The Netherlands, Springer-Verlag LNCS 1593.
[4] Clark-Dickson, P., (1999), Flag-fall for Application Rental, Systems, (August), pp. 23-31.
[5] Kargupta, H., Hamzaoglu, I., and Stafford, B., (1997), Scalable, Distributed Data Mining Using An Agent Based Architecture, in Proc. of the 3rd Int. Conf. on Knowledge Discovery and Data Mining, Newport Beach, California, (eds) D. Heckerman, H. Mannila, D. Pregibon, and R. Uthurusamy, AAAI Press, pp. 211-214.
[6] Kargupta, H., Park, B., Hershberger, D., and Johnson, E., (1999), Collective Data Mining: A New Perspective Toward Distributed Data Mining, to appear in Advances in Distributed Data Mining, (eds) H. Kargupta and P. Chan, AAAI Press.
[7] Martin, G., Unruh, A., and Urban, S., (1999), An Agent Infrastructure for Knowledge Discovery and Event Detection, Technical Report MCC-INSL-003-99, Microelectronics and Computer Technology Corporation (MCC).
[8] Morency, J., (1999), Application Service Providers and E-Business, Network World Fusion Newsletter, URL: http://www.nwfusion.com/newsletters/nsm/0705nm.html
[9] Parthasarathy, S., and Subramonian, R., (1999), Facilitating Data Mining on a network of workstations, to appear in Advances in Distributed Data Mining, (eds) H. Kargupta and P. Chan, AAAI Press.
[10] Ramu, A.T., (1998), Incorporating Transportable Software Agents into a Wide Area High Performance Distributed Data Mining Systems, Masters Thesis, University of Illinois, Chicago, USA.
[11] Stolfo, S.J., Prodromidis, A.L., Tselepis, L., Lee, W., Fan, D., and Chan, P.K., (1997), JAM: Java Agents for Meta-Learning over Distributed Databases, in Proc. of the 3rd Int. Conf. on Data Mining and Knowledge Discovery (KDD-97), Newport Beach, California, (eds) D. Heckerman, H. Mannila, D. Pregibon, and R. Uthurusamy, AAAI Press, pp. 74-81.
[12] Straßer, M., and Schwehm, M., (1997), A Performance Model for Mobile Agent Systems, in Proc. of the Int. Conf. on Parallel and Distributed Processing Techniques and Applications (PDPTA 97), (eds) H. Arabnia, Vol II, CSREA, pp. 1132-1140.