Probabilistic Latent Semantic User Segmentation for Behavioral Targeted Advertising*
Xiaohui Wu 1,2, Jun Yan 2, Ning Liu 2, Shuicheng Yan 3, Ying Chen 1, Zheng Chen 2

1 Department of Computer Science, Beijing Institute of Technology, Beijing, China, xiaohuiwu85@gmail.com, chenying1@bit.edu.cn
2 Microsoft Research Asia, Sigma Center, 49 Zhichun Road, Beijing, China, {v-xwu, junyan, ningl, zhengc}@microsoft.com
3 National University of Singapore, Office E , 4 Engineering Drive 3, Singapore, eleyans@nus.edu.sg

ABSTRACT
Behavioral Targeting (BT), which aims to deliver the most appropriate advertisements to the most appropriate users, is attracting much attention in the online advertising market. A key challenge of BT is how to automatically segment users for ad delivery, and good user segmentation may significantly improve the ad click-through rate (CTR). Unlike classical user segmentation strategies, which rarely take the semantics of user behaviors into consideration, we propose in this paper a novel user segmentation algorithm named Probabilistic Latent Semantic User Segmentation (PLSUS). PLSUS adopts probabilistic latent semantic analysis to mine the relationship between users and their behaviors so as to segment users in a semantic manner. We perform experiments on the real-world ad click-through log of a commercial search engine. Compared with two classical clustering algorithms, K-Means and CLUTO, PLSUS can further improve the ads CTR by up to 100%. To the best of our knowledge, this work is an early semantic user segmentation study for BT in academia.

Categories and Subject Descriptors
H.3.5 [Information Storage and Retrieval]: Online Information Service, Commercial Service; I.5.1 [Pattern Recognition]: Models, Statistical

General Terms
Algorithms, Performance, Experimentation

Keywords
Behavioral Targeting (BT), user segmentation, probabilistic latent semantic analysis

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. ADKDD'09, June 28, 2009, Paris, France. Copyright 2009 ACM.

1. INTRODUCTION
Nowadays, a large number of advertisers would like to publish their advertisements through the Internet, which has given rise to a new developing field known as online advertising science. Sponsored search [9] and contextual ads [4] are two of the most widely studied online advertising business models. In addition, Behavioral Targeting, which analyzes users' behaviors to deliver appropriate ads to potential consumers, has been validated to make online advertising more effective [19]. A crucial part of BT is the problem of user segmentation, which aims at grouping users into user segments with similar behaviors. Since advertisers generally select the user segments most relevant to their ads, if users with similar purchase intentions are successfully gathered into the same segment, advertisers may gain more profit from the ad delivery. Thus, the quality of user segmentation has a dominant impact on the performance of behavioral targeted advertising.

In this paper, we focus on the problem of user segmentation for BT in search engine advertising. User segmentation is a process that arranges each user into one or more segments so that users with similar interests or purchase intentions are within the same segment. We formulate the problem of user segmentation as follows. Suppose a set of online users is given. For each user, we adopt his/her historical online behaviors, such as queries, to depict his/her interests. Some ads have been displayed to these users, and each ad is recorded together with whether it was clicked in the impression.
Our objective is to group all users into appropriate segments by analyzing user behaviors, in order to improve the ad click probability within the user segments in contrast to mass-market ads.

Conventional user segmentation approaches such as classification and clustering have two limitations. (1) Many traditional strategies utilize keywords as features, and then implement clustering or classification on these features. In this way, two users who have similar buying intentions but no words in common will not be put into the same segment. (2) Many classical clustering methods do not allow an object to belong to multiple clusters, which means one user can only stay in a unique segment.

*The work was done when the first author was an intern at Microsoft Research Asia.

We notice that semantic approaches such as Latent Semantic Analysis (LSA) [8], Probabilistic Latent Semantic Analysis (PLSA) [13] and Latent Dirichlet Allocation (LDA) [1] are widely studied and adopted in the field of document classification. Among those approaches, PLSA effectively mines the relationship between document and word with a hidden variable called topic. Besides, PLSA has the ability
to assign one document to multiple topics. Motivated by PLSA in the field of text mining, we propose to exploit the analogy between user-query and document-word, and present a semantic approach called Probabilistic Latent Semantic User Segmentation (PLSUS) to handle the limitations of the traditional user segmentation strategies.

In order to execute semantic user segmentation for BT, we first split users' search queries into terms, so that each user can be represented as a collection of terms [19]. Using this Bag of Words (BOW) representation [18], a latent variable, which represents the topics, i.e. semantic user interests, is introduced to represent users. This topical variable bridges users and their observed behaviors. We will show in our experiments that the latent semantic topics can represent users' interests, which imply potential purchase intention. For this reason, we directly utilize the latent semantic topics to segment users. The Expectation Maximization (EM) approach is applied to mine the latent semantic topics. Since a user may have multiple interests, to better reflect the user's interests, we set a threshold and put the user into those segments whose probabilities are larger than the predefined threshold.

In the experiments, we compare our proposed PLSUS and a modified version, known as Single-PLSUS, with two commonly used clustering algorithms, CLUTO and K-Means. Single-PLSUS only allows a user in a unique segment, as many traditional clustering algorithms do. The results show that PLSUS can improve the ads CTR by up to 100%. In addition, PLSUS performs well on the classical F-measure.

The rest of this paper is organized as follows. In Section 2, we introduce the background knowledge about BT and semantic graphical models in text mining. In Section 3, we describe our solution to semantic user segmentation, namely PLSUS. In Section 4, we introduce the experimental configuration and the results with analysis. Finally, in Section 5, we conclude this paper along with a discussion of future work.

2. BACKGROUND
In this section, we introduce the background knowledge needed to understand this work. We present the basic knowledge on BT, including the definition and related commercial systems, in Section 2.1. In addition, we review semantic approaches such as LSA, PLSA, and LDA in Section 2.2.

2.1 Behavior Targeting
Behavioral Targeting is an advertising methodology that is burgeoning in online advertising. With this technique, ads can be effectively delivered to the most relevant users. Behavior Targeting may improve the performance of online advertisement delivery in two major steps, namely user segmentation and user segments ranking. In the user segmentation step, users are assigned to user segments created in the system, based on online behavior such as visited websites, clicked pages and input queries. In the user segments ranking step, given an ad, user segments are ranked by relevance and the top segments are chosen for ad delivery. Thus, BT displays the ads to the most appropriate users.

At present, BT is attracting more and more attention in both industry and academia. In industry, a large number of commercial systems involving Behavioral Targeting have been proposed: Adlink [20], which takes the short user session into consideration for BT; DoubleClick [24], which adopts special features such as the browser types and operating systems of users to improve the user segmentation step; Specificmedia [28], which predicts each user's interest and purchase intention as a score; and Yahoo! Smart Ads [30], which integrates demographic and geographic targeting. Additionally, Almond Net [21], Blue Lithium [22], Burst [23], NebuAd [25], Phorm [26], Revenue Science [27], and TACODA [29] are commercial systems with BT. In academia, Yan et al. [19] first studied BT in commercial search engines from three aspects: its effectiveness, its improvement, and the best strategy for BT.

User segmentation is a process that arranges each user into one or more segments by a specific criterion.
In BT, this criterion is to guarantee, as far as possible, that users with similar interests and purchase intentions are in the same segment. However, we cannot derive that information directly. The most widespread way is to mine the user behaviors to represent user interests and purchase intentions; that is, users with similar behaviors are assumed to have similar preferences. Thus, user segmentation for BT can be described as attempting to place each user in one or more segments so that users with similar behaviors are in the same segment.

Since advertisers tend to pay for the most relevant segments, the quality of user segmentation is crucial. On one hand, if the system can gather more users with similar interests into one segment, advertisers need to buy fewer segments to deliver their ads. On the other hand, CTR should improve if the similarity between each pair of users within the same segment is large. Thus, user segmentation is a key problem in BT applications.

Traditional user segmentation approaches for BT can be classified into three categories, namely manual user segmentation, user classification, and user clustering. Manual rule-based user segmentation, which classifies users into segments manually, suffers from a significant time cost. Because large-scale data is used for BT, this method is hardly adopted by commercial systems. User classification and user clustering respectively apply classification and clustering to users. The traditional clustering or classification approaches have two limitations in this application scenario. (1) Users are segmented only based on the contents of their behaviors, not their semantic interests. With the Bag of Words model, traditional strategies utilize terms as features in order to implement clustering. That means two users with similar purchase intentions but without shared terms have little chance to be grouped into one segment. (2) Many clustering methods that are widely used for BT place each object in exactly one cluster.
On account of this limitation, if a user has two completely different interests, only one interest can be represented and the other one has to be discarded. Thus, new semantic segmentation approaches for BT are desirable.

2.2 Semantic Analysis
Semantic analysis, which is a well-established technique in industry, mines hidden semantic relationships among objects. Latent Semantic Analysis (LSA) [8] is the well-known approach for deriving latent semantic relationships and is widely used in automatic indexing and information retrieval. The main idea is to map high-dimensional vectors to low-dimensional ones in the latent semantic space. The Probabilistic Latent Semantic Analysis (PLSA) model [13, 15], which is derived from LSA, is able to capture hidden variables with a solid statistical foundation. Each
object is represented by a convex combination of topics, where the topic is a latent variable in PLSA. Latent Dirichlet Allocation (LDA) [1] is similar to PLSA. The difference between these two models is that the topic distribution is assumed to have a Dirichlet prior in LDA. In document classification, LDA derives more reasonable mixtures of topics. However, the work in [11] has shown that the PLSA model is equivalent to the LDA model under a uniform Dirichlet prior distribution. In this work, we focus on PLSA to derive our PLSUS model.

PLSA is a significant breakthrough, since it can discover latent variables with more flexibility. Besides, using the EM algorithm, we can easily estimate the parameters of PLSA. In practice, PLSA is widely used in many fields such as document classification [2, 3, 10, 17], information retrieval [14], web usage mining [16], co-citation analysis [5, 6] and collaborative filtering [7, 12]. However, few works apply PLSA to user segmentation for BT. In our study, following the Bag of Words model, we describe each user as a collection of terms extracted from his/her behaviors, which is similar to the commonly used document representation strategy.

3. PROBABILISTIC LATENT SEMANTIC USER SEGMENTATION (PLSUS)
In this section, we introduce our semantic user segmentation algorithm. PLSA, which can discover the latent relationship between two objects, is widely studied in document classification and clustering problems. In text mining, we generally use the Bag of Words model [18] to represent documents. According to the work of Yan et al. [19], users' behaviors can be represented by their historical queries. Since a query consists of terms, we can treat each query as a set of terms. In this way, each user can be represented by a bag of words, which is the same as the representation of a text document.
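As a rough illustration of this bag-of-words view, the sketch below turns (user, query) log pairs into a user-by-term count matrix. The function name `build_cooccurrence`, the record layout, and the lowercase whitespace tokenizer are illustrative assumptions, not details given in the paper.

```python
from collections import Counter, defaultdict


def build_cooccurrence(records):
    """Build user-by-term counts from (user_id, query) pairs.

    Returns (users, terms, counts), where counts[i][j] is the number of
    times users[i] used terms[j] across all of that user's queries.
    """
    per_user = defaultdict(Counter)
    for user_id, query in records:
        # Assumption: lowercasing + whitespace split stands in for the
        # (unspecified) term extraction used in the paper.
        per_user[user_id].update(query.lower().split())
    users = sorted(per_user)
    terms = sorted({t for bag in per_user.values() for t in bag})
    counts = [[per_user[u][t] for t in terms] for u in users]
    return users, terms, counts


# Hypothetical toy log: two users, three queries.
records = [("u1", "cheap books"), ("u1", "books online"), ("u2", "football scores")]
users, terms, counts = build_cooccurrence(records)
```

Each row of `counts` is one user's bag of words, exactly as a document's term-count vector would be in text mining.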
Let $u_i \in U = \{u_1, u_2, \ldots, u_n\}$ stand for a user, where $U$ is the set of all users for BT, and let $t_j \in T = \{t_1, t_2, \ldots, t_m\}$ be a term, where $T$ is the vocabulary of all terms used by all users. We define $T_{u_i}$ as the set of all terms used by $u_i$; thus,

$$T = \bigcup_{u_i \in U} T_{u_i}.$$

Then, we define the co-occurrence matrix $N = \{n(u_i, t_j)\}$, where $n(u_i, t_j)$ is the number of times $t_j$ is used by $u_i$. To semantically segment users, we introduce the latent variable $z_k \in Z = \{z_1, z_2, \ldots, z_l\}$, which represents the topics, i.e. the semantic intentions of users. This latent variable is closely related to both the user and the query, which has been transformed into terms. From the user's perspective, a topic implies a hidden interest of the user. From the term's perspective, the terms in one topic tend to come from some specific field. Here, we assume that, for a given topic variable $z_k$, users and terms are independent of each other.

We adopt the classical aspect model [13] in PLSUS. The graphical model of the aspect model is given in Figure 1.

Figure 1. Graph of the aspect model.

In the BT scenario, each user $u_i$ has the probability $P(z_k \mid u_i)$ to generate a topic $z_k$, and then $z_k$ has the probability $P(t_j \mid z_k)$ to generate term $t_j$. The basic model is

$$P(u_i, t_j) = P(u_i)\, P(t_j \mid u_i), \qquad P(t_j \mid u_i) = \sum_{z_k \in Z} P(t_j \mid z_k)\, P(z_k \mid u_i).$$

Notice that this model contains the probabilities $P(z_k \mid u_i)$ and $P(u_i)$, which are not convenient to compute. Thus, we transform this model into an equivalent form,

$$P(u_i, t_j) = \sum_{z_k \in Z} P(z_k)\, P(u_i \mid z_k)\, P(t_j \mid z_k),$$

where $P(z_k)$ is the probability that $z_k$ is observed in $Z$, $P(u_i \mid z_k)$ is the probability that $u_i$ is relevant to the given topic $z_k$, and $P(t_j \mid z_k)$ is the probability that $t_j$ is related to the given topic $z_k$. The graphical model representation is shown in Figure 2.

Figure 2. Graph of the PLSUS.

As with PLSA in the field of text mining, we aim to maximize the likelihood defined as

$$L = \sum_{i=1}^{n} \sum_{j=1}^{m} n(u_i, t_j) \log P(u_i, t_j) = \sum_{i=1}^{n} \sum_{j=1}^{m} n(u_i, t_j) \log \sum_{k=1}^{l} P(z_k)\, P(u_i \mid z_k)\, P(t_j \mid z_k).$$

In order to maximize $L$, we adopt the classical Expectation Maximization (EM) approach.
The EM approach is widely used for computing maximum likelihood estimates in latent variable models. EM is an iterative method that alternates between two steps.

(1) Expectation step (E step). Using the current estimates of the parameters, we compute the posterior probabilities $P(z_k \mid u_i, t_j)$ for the latent variable.

(2) Maximization step (M step). Aiming to maximize the expected complete-data log-likelihood $E[L^c]$, we update $P(z_k)$, $P(u_i \mid z_k)$ and $P(t_j \mid z_k)$.
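The two EM steps can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: it uses the standard PLSA updates for the symmetric aspect model (posterior-weighted counts, then re-normalisation), while the random initialisation, the fixed iteration count, and the names `plsa_em` and `log_likelihood` are assumptions introduced here.

```python
import math
import random


def _normalized(values):
    total = sum(values)
    return [v / total for v in values]


def plsa_em(counts, n_topics, n_iter=50, seed=0):
    """Fit P(u,t) = sum_k P(z_k) P(u|z_k) P(t|z_k) by EM.

    counts[i][j] is n(u_i, t_j). Returns (p_z, p_u_z, p_t_z) with
    p_u_z[k][i] = P(u_i|z_k) and p_t_z[k][j] = P(t_j|z_k).
    Assumes no topic loses all of its probability mass.
    """
    rng = random.Random(seed)
    n_users, n_terms = len(counts), len(counts[0])
    # Assumption: strictly positive random initialisation.
    p_z = _normalized([rng.random() + 0.1 for _ in range(n_topics)])
    p_u_z = [_normalized([rng.random() + 0.1 for _ in range(n_users)])
             for _ in range(n_topics)]
    p_t_z = [_normalized([rng.random() + 0.1 for _ in range(n_terms)])
             for _ in range(n_topics)]
    for _ in range(n_iter):
        acc_z = [0.0] * n_topics
        acc_u = [[0.0] * n_users for _ in range(n_topics)]
        acc_t = [[0.0] * n_terms for _ in range(n_topics)]
        for i, row in enumerate(counts):
            for j, n_ij in enumerate(row):
                if n_ij == 0:
                    continue
                # E step: posterior P(z_k | u_i, t_j), up to the factor 1/norm.
                joint = [p_z[k] * p_u_z[k][i] * p_t_z[k][j]
                         for k in range(n_topics)]
                norm = sum(joint)
                for k in range(n_topics):
                    w = n_ij * joint[k] / norm  # expected count for topic k
                    acc_z[k] += w
                    acc_u[k][i] += w
                    acc_t[k][j] += w
        # M step: turn the expected counts back into probabilities.
        p_z = _normalized(acc_z)
        p_u_z = [_normalized(acc_u[k]) for k in range(n_topics)]
        p_t_z = [_normalized(acc_t[k]) for k in range(n_topics)]
    return p_z, p_u_z, p_t_z


def log_likelihood(counts, p_z, p_u_z, p_t_z):
    """L = sum_ij n(u_i,t_j) log sum_k P(z_k) P(u_i|z_k) P(t_j|z_k)."""
    total = 0.0
    for i, row in enumerate(counts):
        for j, n_ij in enumerate(row):
            if n_ij:
                mix = sum(p_z[k] * p_u_z[k][i] * p_t_z[k][j]
                          for k in range(len(p_z)))
                total += n_ij * math.log(mix)
    return total
```

On real logs the dense double loop would be replaced by a sparse iteration over the non-zero counts; the dense form is kept here only for readability.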
After finishing the EM computation, PLSUS segments users with the obtained model. Since a topic is closely related to both users and terms, a topic can be used directly as a user segment. In this way, the semantic attributes become the dominant factors in user segmentation. The remaining question is how to assign users to the different topics. To answer it, we focus on the probability $P(z_k \mid u_i)$, which is the probability that the topic (user segment) $z_k$ is observed for a given user $u_i$. It describes how close the relationship between $z_k$ and $u_i$ is. $P(z_k \mid u_i)$ can be computed by

$$P(z_k \mid u_i) = \frac{\sum_{j=1}^{m} n(u_i, t_j)\, P(z_k \mid u_i, t_j)}{\sum_{j=1}^{m} \sum_{k'=1}^{l} n(u_i, t_j)\, P(z_{k'} \mid u_i, t_j)}.$$

Intuitively, the easiest way to segment users into topics is to compute all $P(z_k \mid u_i)$, $z_k \in Z$, for each $u_i$, and then put $u_i$ into the topic with the highest $P(z_k \mid u_i)$. However, this approach cannot handle the following circumstance: if a user is interested in both sports and cooking, and there are two topics that correspond exactly to sports and cooking, this segmentation method will still choose only one topic for the user. In this way, we may lose one of the user's interests. In order to overcome this deficiency, we present a novel approach for segmenting users based on the probability $P(z_k \mid u_i)$. Here, we apply a threshold for user segmentation. Let $S$ be the set of user segments and $s_k \in S$ the segment with topic $z_k$; the user segmentation rule is then

$$\begin{cases} u_i \in s_k & \text{if } P(z_k \mid u_i) > \text{threshold}, \\ u_i \notin s_k & \text{otherwise.} \end{cases}$$

Compared with traditional clustering methods, this simple method allows one user to belong to multiple segments.

4. EXPERIMENTS
In this section, we systematically evaluate the proposed PLSUS algorithm. Two common clustering methods are used as baselines in the experiments. Also, to better compare with common clustering approaches, a modified PLSUS, which we call Single-PLSUS, is introduced. Several metrics are used in our experiments to measure the performance of each approach.

4.1 Data Sets
In this part, we use one day's ad click-through log collected from a commercial search engine.
This data effectively represents users' click-through behaviors. Table 1 shows the format of the data used in our experiments. From this table, we can see that there are four properties of the data that we focus on. UserId identifies a specific user; different users have different UserIds. Similar to UserId, AdId is the unique identifier of each advertisement. Query is the content of a query issued by the user, which we divide into terms for use in PLSUS. ClickCnt is an important property that is used in our evaluation metrics such as CTR. From the example in Table 1, we know that the user with UserId EEEC97C25FD50C1AB282D39FB13976D9 issued a query whose content is "books", and then the system displayed the advertisement identified by AdId to this user. However, this user did not click this ad.

Table 1. Format of log record used in our experiments.
UserId | Query | AdId | ClickCnt
EEEC97C25FD50C1AB282D39FB13976D9 | Books | | 0

We use two datasets, containing 120,000 and 150,000 log records respectively, to verify the performance of PLSUS. Both of them contain thousands of users. In our experiments, we group all users in the 120,000-log dataset into 5 and 10 segments, while all users in the 150,000-log dataset are grouped into 10 and 20 segments, using the different approaches.

4.2 Experiment Setup
In this part, we introduce the key steps of our experiments. Let $A = \{a_1, a_2, \ldots, a_n\}$ be the set of ads in our dataset, and let $U_i = \{u_{i1}, u_{i2}, \ldots, u_{im_i}\}$ be the group of users to whom $a_i$ has been displayed. After we segment users with the different approaches, we obtain the user segments. We then define $D(U_i) = \{d_1(U_i), d_2(U_i), \ldots, d_K(U_i)\}$, $i = 1, 2, \ldots, n$, as the distribution of $U_i$ over the obtained user segments, where $d_k(U_i)$ is the set of users in $U_i$ who belong to the $k$-th segment. The $k$-th segment can therefore be described as

$$d_k = \bigcup_{i = 1, 2, \ldots, n} d_k(U_i).$$

The key steps in our experiments are:

(1) We compare PLSUS, Single-PLSUS, K-Means and CLUTO on our dataset, where Single-PLSUS is a modified PLSUS that we introduce in a later section of this paper.
(2) We vary the threshold that is used to segment users after the final model has been obtained by the EM algorithm, in order to test the sensitivity of PLSUS.

4.3 Evaluation Metrics
In [19], Yan et al. introduced several metrics that measure the performance of BT effectively. Following them, we use four metrics to measure the performance of each approach and to compare our solution with the baselines: the ads Click-Through Rate (CTR), the ads Click-Through Rate improvement, the ads click Entropy, and the F-measure. With the symbols we defined, CTR can be represented by

$$\mathrm{CTR}(a_i) = \frac{1}{m_i} \sum_{j=1}^{m_i} \delta(u_{ij}),$$

where $\delta(u_{ij})$ is defined as

$$\delta(u_{ij}) = \begin{cases} 1 & \text{if } u_{ij} \text{ clicked } a_i, \\ 0 & \text{otherwise.} \end{cases}$$

The CTR of $a_i$ over user segment $d_k$ is
$$\mathrm{CTR}(a_i \mid d_k) = \frac{1}{|d_k(U_i)|} \sum_{u_{ij} \in d_k(U_i)} \delta(u_{ij}),$$

where $|d_k(U_i)|$ is the number of users in $d_k(U_i)$. Note that $\mathrm{CTR}(a_i)$ is the raw CTR; in other words, $\mathrm{CTR}(a_i)$ is the CTR over all users who were displayed $a_i$. $\mathrm{CTR}(a_i \mid d_k)$ is the CTR of each user segment after segmentation.

In order to measure the improvement of CTR by user segmentation, we define a new evaluation metric for PLSUS. This new metric should satisfy two conditions. (1) Maximum: choose the segment that has the maximum CTR. This is reasonable because the ad publisher would like to recommend the user segment with the highest ad click probability to the advertiser for ad delivery. (2) Majority: the number of users in this segment cannot be less than average. This condition rules out some special situations. For example, suppose the $k$-th user segment has only 1 user and he/she clicked $a_i$. Then $\mathrm{CTR}(a_i \mid d_k) = 1$, but this segment is clearly not appropriate to recommend to the advertiser. Integrating these two conditions, we define the CTR improvement for $a_i$ as

$$\Delta(a_i) = \frac{\mathrm{CTR}(a_i \mid d^*(a_i)) - \mathrm{CTR}(a_i)}{\mathrm{CTR}(a_i)},$$

where

$$d^*(a_i) = \arg\max \{\mathrm{CTR}(a_i \mid d_k) : d_k \in \tilde{d}\}, \qquad \tilde{d} = \Big\{ d_k : |d_k(U_i)| \geq \frac{m_i}{K},\ k = 1, 2, \ldots, K \Big\}.$$

Thus, the CTR improvement over all ads is

$$\Delta = \frac{1}{n} \sum_{i=1}^{n} \Delta(a_i).$$

Entropy is defined as

$$\mathrm{Enp}(a_i) = -\sum_{k=1}^{K} P(d_k \mid a_i) \log P(d_k \mid a_i),$$

where

$$P(d_k \mid a_i) = \frac{1}{m_i} \sum_{u_{ij} \in d_k(U_i)} \delta(u_{ij}).$$

Note that the smaller the Entropy is, the better the results are [19]. The classical F-measure, including Precision, Recall and F-measure, is defined as

$$\mathrm{Pre}(a_i \mid d_k) = \mathrm{CTR}(a_i \mid d_k),$$

$$\mathrm{Rec}(a_i \mid d_k) = \frac{\sum_{u_{ij} \in d_k(U_i)} \delta(u_{ij})}{\sum_{j=1}^{m_i} \delta(u_{ij})},$$

$$F(a_i \mid d_k) = \frac{2\, \mathrm{Pre}(a_i \mid d_k)\, \mathrm{Rec}(a_i \mid d_k)}{\mathrm{Pre}(a_i \mid d_k) + \mathrm{Rec}(a_i \mid d_k)},$$

where the larger the F-measure is, the better the performance is.

4.4 Results
In this part, we introduce the details of our experiments and show the results. To show the performance of PLSUS, we compare PLSUS with traditional clustering methods. CLUTO and K-Means are selected as the baselines. However, it is unfair to compare CLUTO and K-Means with PLSUS directly, since PLSUS allows one user to belong to multiple segments, while both CLUTO and K-Means permit one user to belong to only one user segment. In order to solve this problem, Single-PLSUS is implemented to bridge the gap between PLSUS and the traditional clustering approaches.
In Single-PLSUS, a given user $u_i$ is placed in the unique segment $z_k$ that has the maximum $P(z_k \mid u_i)$. On one hand, comparing Single-PLSUS with CLUTO and K-Means shows whether the semantic approach improves the performance of BT. On the other hand, the comparison between PLSUS and Single-PLSUS clearly shows the impact of allowing one user to belong to multiple segments. The results are shown in Tables 2-4. Note that the best results are in bold face. Note also that we set threshold = 0.2; the explanation is given later in this section.

CTR is one of the most basic and critical evaluation metrics for online advertising problems. From Table 2, we can observe two phenomena. First, as the number of segments increases, the improvement of CTR increases as well. In the 150,000-log dataset, as the number of segments doubles, the improvement of CTR increases twofold. In the same dataset, with 20 segments, PLSUS improves CTR by up to 100% against traditional CLUTO. Second, all semantic approaches perform well on CTR improvement. On closer analysis, Single-PLSUS consistently exceeds CLUTO and K-Means. This indicates that the semantic approach is appropriate for BT. Since we gather all queries used by each user and divide these queries into terms, we exploit the correspondence between user-query and document-word, and the results support this idea. The comparison between PLSUS and Single-PLSUS shows the advantage of allowing a user to be placed into multiple segments.

Besides, in the work of Yan et al. [19], the CTR improvement with CLUTO and K-Means is around 100% when users are grouped into 20 segments, which is confirmed by our experimental results. Since the experiments of Yan et al. showed that the CTR improvement can reach 670% with 160 user segments on a large-scale dataset, we expect that we can improve CTR further if we group users into more segments. In our future work, we will increase the scalability to verify this conjecture.
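The two assignment rules compared above can be sketched as follows, taking a matrix of $P(z_k \mid u_i)$ values as input. The function names and the toy probabilities are hypothetical; the strict ">" comparison follows the "larger than the predefined threshold" wording of the introduction and should be read as an assumption.

```python
def segment_threshold(p_z_u, threshold=0.2):
    """PLSUS-style rule: user i joins every segment k with P(z_k|u_i) > threshold.

    p_z_u[i][k] holds P(z_k|u_i). Returns one list of user indices per segment.
    """
    n_topics = len(p_z_u[0])
    segments = [[] for _ in range(n_topics)]
    for i, row in enumerate(p_z_u):
        for k, p in enumerate(row):
            if p > threshold:
                segments[k].append(i)
    return segments


def segment_single(p_z_u):
    """Single-PLSUS-style rule: user i joins only the segment with maximal P(z_k|u_i)."""
    n_topics = len(p_z_u[0])
    segments = [[] for _ in range(n_topics)]
    for i, row in enumerate(p_z_u):
        best = max(range(n_topics), key=lambda k: row[k])
        segments[best].append(i)
    return segments


# Hypothetical posteriors for three users and three topics.
p_z_u = [
    [0.55, 0.40, 0.05],  # two strong interests
    [0.10, 0.10, 0.80],  # one dominant interest
    [0.30, 0.45, 0.25],  # three moderate interests
]
multi = segment_threshold(p_z_u, threshold=0.2)
single = segment_single(p_z_u)
```

With these toy values the first user lands in two segments under the threshold rule but in only one under the arg-max rule, which is the difference the PLSUS versus Single-PLSUS comparison is meant to isolate.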
We compute the average ads click Entropy over all ads in the dataset we used. The result is shown in Table 3. In general, the entropies of all user segmentation approaches are almost the same, so entropy distinguishes the methods less than CTR does. Looking at the details, we find that the entropy of PLSUS is larger than that of the others. The reason follows from the segmentation criterion of PLSUS, which allows a single user to belong to multiple segments. This means that an ad's clicking users may be counted in more than one segment. In this way, the entropy is naturally larger than that of user segmentation approaches that associate each user with only one segment.
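For reference, the per-ad metrics of Section 4.3 can be computed as in the sketch below. It is an illustration only: the data layout (a dict mapping each user who saw the ad to 0 or 1, and segments as sets of users), the function names, the natural logarithm in the entropy, and the fallback to the raw CTR when no segment is large enough are all assumptions made here.

```python
import math


def ctr(clicks):
    """Raw CTR(a_i); clicks maps each user who was shown the ad to 0 or 1."""
    return sum(clicks.values()) / len(clicks)


def segment_ctr(clicks, segment):
    """CTR(a_i | d_k): CTR restricted to the ad's users inside one segment."""
    members = [u for u in clicks if u in segment]
    return sum(clicks[u] for u in members) / len(members) if members else 0.0


def ctr_improvement(clicks, segments):
    """Relative gain of the best segment holding at least m_i / K of the ad's users.

    Assumes the ad received at least one click, so the raw CTR is non-zero.
    """
    base = ctr(clicks)
    min_size = len(clicks) / len(segments)
    eligible = [s for s in segments
                if sum(1 for u in clicks if u in s) >= min_size]
    best = max((segment_ctr(clicks, s) for s in eligible), default=base)
    return (best - base) / base


def click_entropy(clicks, segments):
    """Enp(a_i) = -sum_k P(d_k|a_i) log P(d_k|a_i), skipping zero terms."""
    m = len(clicks)
    shares = [sum(clicks[u] for u in clicks if u in s) / m for s in segments]
    return -sum(p * math.log(p) for p in shares if p > 0)


def f_measure(clicks, segment):
    """F(a_i | d_k) with Precision = CTR(a_i | d_k) and Recall = share of clicks covered."""
    precision = segment_ctr(clicks, segment)
    total_clicks = sum(clicks.values())
    covered = sum(clicks[u] for u in clicks if u in segment)
    recall = covered / total_clicks if total_clicks else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical ad shown to six users, two of whom clicked; two segments.
clicks = {"u1": 1, "u2": 0, "u3": 0, "u4": 1, "u5": 0, "u6": 0}
segments = [{"u1", "u2", "u4"}, {"u3", "u5", "u6"}]
improvement = ctr_improvement(clicks, segments)
entropy = click_entropy(clicks, segments)
f_first = f_measure(clicks, segments[0])
```

Averaging `ctr_improvement` over all ads gives the overall CTR improvement; the entropy and the F-measure are averaged over ads in the same way in the tables.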
Table 2. CTR improvement of different user segmentation strategies.
Columns: 5 segments in 120,000; 10 segments in 120,000; 10 segments in 150,000; 20 segments in 150,000. Rows: PLSUS, Single-PLSUS, CLUTO, K-Means. [Cell values are not recoverable.]

Table 3. Ads click Entropy of different user segmentation strategies.
Columns: 5 segments in 120,000; 10 segments in 120,000; 10 segments in 150,000; 20 segments in 150,000. Rows: PLSUS, Single-PLSUS, CLUTO, K-Means. [Cell values are not recoverable.]

Table 4. F-measure of different user segmentation strategies.
Columns: 5 segments in 120,000; 10 segments in 120,000; 10 segments in 150,000; 20 segments in 150,000. Rows: Precision, Recall and F (in %) for each of PLSUS, Single-PLSUS, CLUTO, K-Means. [Cell values are not recoverable.]

Precision, Recall and F-measure are shown in Table 4. Note that the results reported in this table are averages over all ads. First of all, we observe two facts. (1) The semantic approaches perform better in Precision. Since we use the CTR as the Precision, this result is consistent with the CTR improvement. (2) Among the semantic approaches, PLSUS performs better than the others. From these two facts, we conclude that our proposed methods help to improve the Precision (CTR).

An interesting observation is that the Recall of the traditional clustering approaches is higher than that of the others on our two small datasets. Considering their low precision, we infer that the high-CTR segments produced by CLUTO or K-Means include many users. In other words, traditional approaches improve the CTR of a segment by adding more users to this segment. On the contrary, semantic user segmentation can improve the CTR without building user segments with too large a population. This characteristic is very useful for accurate ad delivery. Integrating Precision and Recall, the F-measure evaluates the overall performance of user segmentation. From the high F-measures of PLSUS and Single-PLSUS, we conclude that semantic user segmentation performs better than classical clustering methods.

Finally, we discuss the influence of the threshold parameter in the PLSUS model.
We set up a series of experiments that group users into 10 segments on the 120,000 log record data. If threshold $\geq 0.5$, the output of PLSUS will be constant. Thus, we vary the threshold from 0.05 to 0.5, and Figures 3-4 display the results. Since a bigger threshold means that a user has a smaller chance to be placed into multiple segments, the CTR improvement decreases as the threshold becomes bigger in Figure 3. However, if we take too small a threshold, each user
will have a high chance of being placed in many segments. In this way, each segment will contain too many users, which leads to a large entropy. The result in Figure 4 shows this fact. Analyzing Figures 3-4, we consider that a threshold around 0.2 performs well on both CTR improvement and entropy. Therefore, we set threshold = 0.2 for PLSUS in the experiment that compares the four user segmentation approaches.

Figure 3. Change of CTR improvement with increasing threshold.

Figure 4. Change of ads click Entropy with increasing threshold.

5. CONCLUSION AND FUTURE WORK
In this paper, we developed a novel semantic approach called PLSUS for BT. We compared the proposed PLSUS algorithm with two traditional clustering-based user segmentation approaches, CLUTO and K-Means. From the experimental results, we conclude that the semantic approach PLSUS brings better improvements for BT than traditional user clustering, especially in terms of CTR improvement.

In our future work, we will pay more attention to Latent Dirichlet Allocation (LDA). It has been noted that LDA achieves better results in document classification than PLSA. Thus, we will study this model and attempt to apply it to user segmentation, to verify whether it performs better for BT than PLSUS does. In addition, we will modify the EM algorithm to parallelize PLSUS. We believe this will further increase the algorithmic scalability and improve the efficiency.

6. REFERENCES
[1] D. Blei, A. Ng, and M. Jordan. Latent Dirichlet allocation. Journal of Machine Learning Research, 3 (2003).
[2] T. Brants, F. Chen, and I. Tsochantaridis. Topic-based document segmentation with probabilistic latent semantic analysis. In Proceedings of CIKM '02 (Las Palmas, June 2002), ACM Press.
[3] T. Brants and R. Stolle. Find similar documents in document collections. In Proceedings of LREC '02 (Spain, June).
[4] A. Broder, M. Fontoura, V. Josifovski and L. Riedel. A semantic approach to contextual advertising. In Proceedings of SIGIR '07 (Amsterdam, July 2007), ACM Press.
[5] D. Cohn and H. Chang. Learning to probabilistically identify authoritative documents. In Proceedings of ICML '00 (Stanford, June 2000), Morgan Kaufmann.
[6] D. Cohn and T. Hofmann. The missing link: A probabilistic model of document content and hypertext connectivity. In Proceedings of NIPS '00 (Denver, November 2000), MIT Press.
[7] A. Das, M. Datar and A. Garg. Google news personalization: scalable online collaborative filtering. In Proceedings of WWW '07 (Banff, May 2007), ACM Press.
[8] S. Deerwester, S. Dumais, G. Furnas, T. Landauer, and R. Harshman. Indexing by latent semantic analysis. Journal of the American Society for Information Science, 41 (1990).
[9] D. C. Fain and J. O. Pedersen. Sponsored search: a brief history. In Bulletin of the American Society for Information Science and Technology.
[10] E. Gaussier, C. Goutte, K. Popat, and F. Chen. A hierarchical model for clustering and categorizing documents. In Advances in Information Retrieval, Proceedings of the 24th BCS-IRSG European Colloquium on IR Research (Glasgow, March).
[11] M. Girolami and A. Kabán. On an equivalence between PLSI and LDA. In Proceedings of SIGIR '03 (Toronto, July 2003), ACM Press.
[12] A. Harpale and Y. Yang. Personalized active learning for collaborative filtering. In Proceedings of SIGIR '08 (Singapore, July 2008), ACM Press.
[13] T. Hofmann. Probabilistic latent semantic analysis. In Proceedings of UAI '99 (Stockholm, July 1999), Morgan Kaufmann.
[14] T. Hofmann. Probabilistic latent semantic indexing. In Proceedings of SIGIR '99 (Berkeley, August 1999), ACM Press.
[15] T. Hofmann. Unsupervised learning by probabilistic latent semantic analysis. Machine Learning Journal, 42 (2001).
[16] X. Jin, Y. Zhou, and B. Mobasher. Web usage mining based on probabilistic latent semantic analysis. In Proceedings of KDD '04 (Seattle, August 2004), ACM Press.
[17] Y. Kim, J. Chang, and B. Zhang. An empirical study on dimensionality optimization in text mining for linguistic knowledge acquisition. In Proceedings of KDD '03 (Seoul, April 2003), ACM Press.
[18] G. Salton and C. Buckley. Term-weighting approaches in automatic text retrieval. Information Processing and Management: an International Journal, 24 (1988).
[19] J. Yan, N. Liu, G. Wang, W. Zhang, Y. Jiang and Z. Chen. How much the Behavioral Targeting can help online advertising? In Proceedings of WWW '09 (Madrid, April 2009), ACM Press.
[20] Adlink, Dc28hZShnCI
[21] Almond Net,
[22] Blue Lithium,
[23] Burst,
[24] Double Click,
[25] NebuAd,
[26] Phorm,
[27] Revenue Science, ons.asp
[28] Specificmedia,
[29] TACODA,
[30] Yahoo! Smart Ads,
CelersSystems DEFINING %COMPLETE IN MICROSOFT PROJECT PREPARED BY James E Aksel, PMP, PMI-SP, MVP For Addtonal Informaton about Earned Value Management Systems and reportng, please contact: CelersSystems,
More informationPEER REVIEWER RECOMMENDATION IN ONLINE SOCIAL LEARNING CONTEXT: INTEGRATING INFORMATION OF LEARNERS AND SUBMISSIONS
PEER REVIEWER RECOMMENDATION IN ONLINE SOCIAL LEARNING CONTEXT: INTEGRATING INFORMATION OF LEARNERS AND SUBMISSIONS Yunhong Xu, Faculty of Management and Economcs, Kunmng Unversty of Scence and Technology,
More informationExploiting Recommendation on Social Media Networks
Internatonal Journal of Scence and Research IJSR) ISSN Onln: 2319-7064 Index Coperncus Value 2013): 6.14 Impact Factor 2013): 4.438 Explotng Recommendaton on Socal Meda Networs Swat A. Adhav 1, Sheetal
More informationModule 2 LOSSLESS IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur
Module LOSSLESS IMAGE COMPRESSION SYSTEMS Lesson 3 Lossless Compresson: Huffman Codng Instructonal Objectves At the end of ths lesson, the students should be able to:. Defne and measure source entropy..
More informationA DATA MINING APPLICATION IN A STUDENT DATABASE
JOURNAL OF AERONAUTICS AND SPACE TECHNOLOGIES JULY 005 VOLUME NUMBER (53-57) A DATA MINING APPLICATION IN A STUDENT DATABASE Şenol Zafer ERDOĞAN Maltepe Ünversty Faculty of Engneerng Büyükbakkalköy-Istanbul
More informationAn Alternative Way to Measure Private Equity Performance
An Alternatve Way to Measure Prvate Equty Performance Peter Todd Parlux Investment Technology LLC Summary Internal Rate of Return (IRR) s probably the most common way to measure the performance of prvate
More informationForecasting the Demand of Emergency Supplies: Based on the CBR Theory and BP Neural Network
700 Proceedngs of the 8th Internatonal Conference on Innovaton & Management Forecastng the Demand of Emergency Supples: Based on the CBR Theory and BP Neural Network Fu Deqang, Lu Yun, L Changbng School
More informationWhat is Candidate Sampling
What s Canddate Samplng Say we have a multclass or mult label problem where each tranng example ( x, T ) conssts of a context x a small (mult)set of target classes T out of a large unverse L of possble
More informationWeb Object Indexing Using Domain Knowledge *
Web Object Indexng Usng Doman Knowledge * Muyuan Wang Department of Automaton Tsnghua Unversty Bejng 100084, Chna (86-10)51774518 Zhwe L, Le Lu, We-Yng Ma Mcrosoft Research Asa Sgma Center, Hadan Dstrct
More informationHow To Understand The Results Of The German Meris Cloud And Water Vapour Product
Ttel: Project: Doc. No.: MERIS level 3 cloud and water vapour products MAPP MAPP-ATBD-ClWVL3 Issue: 1 Revson: 0 Date: 9.12.1998 Functon Name Organsaton Sgnature Date Author: Bennartz FUB Preusker FUB Schüller
More informationSingle and multiple stage classifiers implementing logistic discrimination
Sngle and multple stage classfers mplementng logstc dscrmnaton Hélo Radke Bttencourt 1 Dens Alter de Olvera Moraes 2 Vctor Haertel 2 1 Pontfíca Unversdade Católca do Ro Grande do Sul - PUCRS Av. Ipranga,
More informationVision Mouse. Saurabh Sarkar a* University of Cincinnati, Cincinnati, USA ABSTRACT 1. INTRODUCTION
Vson Mouse Saurabh Sarkar a* a Unversty of Cncnnat, Cncnnat, USA ABSTRACT The report dscusses a vson based approach towards trackng of eyes and fngers. The report descrbes the process of locatng the possble
More informationDocument Clustering Analysis Based on Hybrid PSO+K-means Algorithm
Document Clusterng Analyss Based on Hybrd PSO+K-means Algorthm Xaohu Cu, Thomas E. Potok Appled Software Engneerng Research Group, Computatonal Scences and Engneerng Dvson, Oak Rdge Natonal Laboratory,
More informationJoint Optimization of Bid and Budget Allocation in Sponsored Search
Jont Optmzaton of Bd and Budget Allocaton n Sponsored Search Wenan Zhang Shangha Jao Tong Unversty Shangha, 224, P. R. Chna wnzhang@apex.sjtu.edu.cn Yong Yu Shangha Jao Tong Unversty Shangha, 224, P. R.
More informationPSYCHOLOGICAL RESEARCH (PYC 304-C) Lecture 12
14 The Ch-squared dstrbuton PSYCHOLOGICAL RESEARCH (PYC 304-C) Lecture 1 If a normal varable X, havng mean µ and varance σ, s standardsed, the new varable Z has a mean 0 and varance 1. When ths standardsed
More informationHow Sets of Coherent Probabilities May Serve as Models for Degrees of Incoherence
1 st Internatonal Symposum on Imprecse Probabltes and Ther Applcatons, Ghent, Belgum, 29 June 2 July 1999 How Sets of Coherent Probabltes May Serve as Models for Degrees of Incoherence Mar J. Schervsh
More informationEnterprise Master Patient Index
Enterprse Master Patent Index Healthcare data are captured n many dfferent settngs such as hosptals, clncs, labs, and physcan offces. Accordng to a report by the CDC, patents n the Unted States made an
More informationCan Auto Liability Insurance Purchases Signal Risk Attitude?
Internatonal Journal of Busness and Economcs, 2011, Vol. 10, No. 2, 159-164 Can Auto Lablty Insurance Purchases Sgnal Rsk Atttude? Chu-Shu L Department of Internatonal Busness, Asa Unversty, Tawan Sheng-Chang
More informationIMPACT ANALYSIS OF A CELLULAR PHONE
4 th ASA & μeta Internatonal Conference IMPACT AALYSIS OF A CELLULAR PHOE We Lu, 2 Hongy L Bejng FEAonlne Engneerng Co.,Ltd. Bejng, Chna ABSTRACT Drop test smulaton plays an mportant role n nvestgatng
More informationData Broadcast on a Multi-System Heterogeneous Overlayed Wireless Network *
JOURNAL OF INFORMATION SCIENCE AND ENGINEERING 24, 819-840 (2008) Data Broadcast on a Mult-System Heterogeneous Overlayed Wreless Network * Department of Computer Scence Natonal Chao Tung Unversty Hsnchu,
More informationOn the Optimal Control of a Cascade of Hydro-Electric Power Stations
On the Optmal Control of a Cascade of Hydro-Electrc Power Statons M.C.M. Guedes a, A.F. Rbero a, G.V. Smrnov b and S. Vlela c a Department of Mathematcs, School of Scences, Unversty of Porto, Portugal;
More informationMethodology to Determine Relationships between Performance Factors in Hadoop Cloud Computing Applications
Methodology to Determne Relatonshps between Performance Factors n Hadoop Cloud Computng Applcatons Lus Eduardo Bautsta Vllalpando 1,2, Alan Aprl 1 and Alan Abran 1 1 Department of Software Engneerng and
More informationMultiple-Period Attribution: Residuals and Compounding
Multple-Perod Attrbuton: Resduals and Compoundng Our revewer gave these authors full marks for dealng wth an ssue that performance measurers and vendors often regard as propretary nformaton. In 1994, Dens
More informationA Secure Password-Authenticated Key Agreement Using Smart Cards
A Secure Password-Authentcated Key Agreement Usng Smart Cards Ka Chan 1, Wen-Chung Kuo 2 and Jn-Chou Cheng 3 1 Department of Computer and Informaton Scence, R.O.C. Mltary Academy, Kaohsung 83059, Tawan,
More informationTitle Language Model for Information Retrieval
Ttle Language Model for Informaton Retreval Rong Jn Language Technologes Insttute School of Computer Scence Carnege Mellon Unversty Alex G. Hauptmann Computer Scence Department School of Computer Scence
More informationA Simple Approach to Clustering in Excel
A Smple Approach to Clusterng n Excel Aravnd H Center for Computatonal Engneerng and Networng Amrta Vshwa Vdyapeetham, Combatore, Inda C Rajgopal Center for Computatonal Engneerng and Networng Amrta Vshwa
More informationANALYZING THE RELATIONSHIPS BETWEEN QUALITY, TIME, AND COST IN PROJECT MANAGEMENT DECISION MAKING
ANALYZING THE RELATIONSHIPS BETWEEN QUALITY, TIME, AND COST IN PROJECT MANAGEMENT DECISION MAKING Matthew J. Lberatore, Department of Management and Operatons, Vllanova Unversty, Vllanova, PA 19085, 610-519-4390,
More informationProduct Quality and Safety Incident Information Tracking Based on Web
Product Qualty and Safety Incdent Informaton Trackng Based on Web News 1 Yuexang Yang, 2 Correspondng Author Yyang Wang, 2 Shan Yu, 2 Jng Q, 1 Hual Ca 1 Chna Natonal Insttute of Standardzaton, Beng 100088,
More informationFREQUENCY OF OCCURRENCE OF CERTAIN CHEMICAL CLASSES OF GSR FROM VARIOUS AMMUNITION TYPES
FREQUENCY OF OCCURRENCE OF CERTAIN CHEMICAL CLASSES OF GSR FROM VARIOUS AMMUNITION TYPES Zuzanna BRO EK-MUCHA, Grzegorz ZADORA, 2 Insttute of Forensc Research, Cracow, Poland 2 Faculty of Chemstry, Jagellonan
More informationA DYNAMIC CRASHING METHOD FOR PROJECT MANAGEMENT USING SIMULATION-BASED OPTIMIZATION. Michael E. Kuhl Radhamés A. Tolentino-Peña
Proceedngs of the 2008 Wnter Smulaton Conference S. J. Mason, R. R. Hll, L. Mönch, O. Rose, T. Jefferson, J. W. Fowler eds. A DYNAMIC CRASHING METHOD FOR PROJECT MANAGEMENT USING SIMULATION-BASED OPTIMIZATION
More informationDesign and Development of a Security Evaluation Platform Based on International Standards
Internatonal Journal of Informatcs Socety, VOL.5, NO.2 (203) 7-80 7 Desgn and Development of a Securty Evaluaton Platform Based on Internatonal Standards Yuj Takahash and Yoshm Teshgawara Graduate School
More informationContext-aware Mobile Recommendation System Based on Context History
TELKOMNIKA Indonesan Journal of Electrcal Engneerng Vol.12, No.4, Aprl 2014, pp. 3158 ~ 3167 DOI: http://dx.do.org/10.11591/telkomnka.v124.4786 3158 Context-aware Moble Recommendaton System Based on Context
More informationMETHODOLOGY TO DETERMINE RELATIONSHIPS BETWEEN PERFORMANCE FACTORS IN HADOOP CLOUD COMPUTING APPLICATIONS
METHODOLOGY TO DETERMINE RELATIONSHIPS BETWEEN PERFORMANCE FACTORS IN HADOOP CLOUD COMPUTING APPLICATIONS Lus Eduardo Bautsta Vllalpando 1,2, Alan Aprl 1 and Alan Abran 1 1 Department of Software Engneerng
More informationThe Current Employment Statistics (CES) survey,
Busness Brths and Deaths Impact of busness brths and deaths n the payroll survey The CES probablty-based sample redesgn accounts for most busness brth employment through the mputaton of busness deaths,
More informationImproved SVM in Cloud Computing Information Mining
Internatonal Journal of Grd Dstrbuton Computng Vol.8, No.1 (015), pp.33-40 http://dx.do.org/10.1457/jgdc.015.8.1.04 Improved n Cloud Computng Informaton Mnng Lvshuhong (ZhengDe polytechnc college JangSu
More informationAnalyzing Search Engine Advertising: Firm Behavior and Cross-Selling in Electronic Markets
WWW 008 / Refereed Track: Internet Monetzaton - Sponsored Search Aprl -5, 008 Beng, Chna Analyzng Search Engne Advertsng: Frm Behavor and Cross-Sellng n Electronc Markets Anndya Ghose Stern School of Busness
More informationUsing Supervised Clustering Technique to Classify Received Messages in 137 Call Center of Tehran City Council
Usng Supervsed Clusterng Technque to Classfy Receved Messages n 137 Call Center of Tehran Cty Councl Mahdyeh Haghr 1*, Hamd Hassanpour 2 (1) Informaton Technology engneerng/e-commerce, Shraz Unversty (2)
More informationEVALUATING THE PERCEIVED QUALITY OF INFRASTRUCTURE-LESS VOIP. Kun-chan Lan and Tsung-hsun Wu
EVALUATING THE PERCEIVED QUALITY OF INFRASTRUCTURE-LESS VOIP Kun-chan Lan and Tsung-hsun Wu Natonal Cheng Kung Unversty klan@cse.ncku.edu.tw, ryan@cse.ncku.edu.tw ABSTRACT Voce over IP (VoIP) s one of
More informationPerformance Analysis and Coding Strategy of ECOC SVMs
Internatonal Journal of Grd and Dstrbuted Computng Vol.7, No. (04), pp.67-76 http://dx.do.org/0.457/jgdc.04.7..07 Performance Analyss and Codng Strategy of ECOC SVMs Zhgang Yan, and Yuanxuan Yang, School
More informationA Performance Analysis of View Maintenance Techniques for Data Warehouses
A Performance Analyss of Vew Mantenance Technques for Data Warehouses Xng Wang Dell Computer Corporaton Round Roc, Texas Le Gruenwald The nversty of Olahoma School of Computer Scence orman, OK 739 Guangtao
More informationFault tolerance in cloud technologies presented as a service
Internatonal Scentfc Conference Computer Scence 2015 Pavel Dzhunev, PhD student Fault tolerance n cloud technologes presented as a servce INTRODUCTION Improvements n technques for vrtualzaton and performance
More informationNEURO-FUZZY INFERENCE SYSTEM FOR E-COMMERCE WEBSITE EVALUATION
NEURO-FUZZY INFERENE SYSTEM FOR E-OMMERE WEBSITE EVALUATION Huan Lu, School of Software, Harbn Unversty of Scence and Technology, Harbn, hna Faculty of Appled Mathematcs and omputer Scence, Belarusan State
More informationOpen Access A Load Balancing Strategy with Bandwidth Constraint in Cloud Computing. Jing Deng 1,*, Ping Guo 2, Qi Li 3, Haizhu Chen 1
Send Orders for Reprnts to reprnts@benthamscence.ae The Open Cybernetcs & Systemcs Journal, 2014, 8, 115-121 115 Open Access A Load Balancng Strategy wth Bandwdth Constrant n Cloud Computng Jng Deng 1,*,
More informationA Replication-Based and Fault Tolerant Allocation Algorithm for Cloud Computing
A Replcaton-Based and Fault Tolerant Allocaton Algorthm for Cloud Computng Tork Altameem Dept of Computer Scence, RCC, Kng Saud Unversty, PO Box: 28095 11437 Ryadh-Saud Araba Abstract The very large nfrastructure
More informationRecurrence. 1 Definitions and main statements
Recurrence 1 Defntons and man statements Let X n, n = 0, 1, 2,... be a MC wth the state space S = (1, 2,...), transton probabltes p j = P {X n+1 = j X n = }, and the transton matrx P = (p j ),j S def.
More informationRisk-based Fatigue Estimate of Deep Water Risers -- Course Project for EM388F: Fracture Mechanics, Spring 2008
Rsk-based Fatgue Estmate of Deep Water Rsers -- Course Project for EM388F: Fracture Mechancs, Sprng 2008 Chen Sh Department of Cvl, Archtectural, and Envronmental Engneerng The Unversty of Texas at Austn
More informationAn Enhanced Super-Resolution System with Improved Image Registration, Automatic Image Selection, and Image Enhancement
An Enhanced Super-Resoluton System wth Improved Image Regstraton, Automatc Image Selecton, and Image Enhancement Yu-Chuan Kuo ( ), Chen-Yu Chen ( ), and Chou-Shann Fuh ( ) Department of Computer Scence
More informationPricing Model of Cloud Computing Service with Partial Multihoming
Prcng Model of Cloud Computng Servce wth Partal Multhomng Zhang Ru 1 Tang Bng-yong 1 1.Glorous Sun School of Busness and Managment Donghua Unversty Shangha 251 Chna E-mal:ru528369@mal.dhu.edu.cn Abstract
More informationCluster Analysis of Data Points using Partitioning and Probabilistic Model-based Algorithms
Internatonal Journal of Appled Informaton Systems (IJAIS) ISSN : 2249-0868 Foundaton of Computer Scence FCS, New York, USA Volume 7 No.7, August 2014 www.jas.org Cluster Analyss of Data Ponts usng Parttonng
More informationRank Based Clustering For Document Retrieval From Biomedical Databases
Jayanth Mancassamy et al /Internatonal Journal on Computer Scence and Engneerng Vol.1(2), 2009, 111-115 Rank Based Clusterng For Document Retreval From Bomedcal Databases Jayanth Mancassamy Department
More informationTHE APPLICATION OF DATA MINING TECHNIQUES AND MULTIPLE CLASSIFIERS TO MARKETING DECISION
Internatonal Journal of Electronc Busness Management, Vol. 3, No. 4, pp. 30-30 (2005) 30 THE APPLICATION OF DATA MINING TECHNIQUES AND MULTIPLE CLASSIFIERS TO MARKETING DECISION Yu-Mn Chang *, Yu-Cheh
More informationCloud-based Social Application Deployment using Local Processing and Global Distribution
Cloud-based Socal Applcaton Deployment usng Local Processng and Global Dstrbuton Zh Wang *, Baochun L, Lfeng Sun *, and Shqang Yang * * Bejng Key Laboratory of Networked Multmeda Department of Computer
More informationLuby s Alg. for Maximal Independent Sets using Pairwise Independence
Lecture Notes for Randomzed Algorthms Luby s Alg. for Maxmal Independent Sets usng Parwse Independence Last Updated by Erc Vgoda on February, 006 8. Maxmal Independent Sets For a graph G = (V, E), an ndependent
More informationMobile App Recommendations with Security and Privacy Awareness
Moble App Recommendatons wth Securty and Prvacy Awareness Hengshu Zhu 1 Hu Xong 2 Yong Ge 3 Enhong Chen 1 1 Unversty of Scence and Technology of Chna, 2 Rutgers Unversty, 3 UNC Charlotte zhs@mal.ustc.edu.cn,
More informationA Dynamic Load Balancing for Massive Multiplayer Online Game Server
A Dynamc Load Balancng for Massve Multplayer Onlne Game Server Jungyoul Lm, Jaeyong Chung, Jnryong Km and Kwanghyun Shm Dgtal Content Research Dvson Electroncs and Telecommuncatons Research Insttute Daejeon,
More informationL10: Linear discriminants analysis
L0: Lnear dscrmnants analyss Lnear dscrmnant analyss, two classes Lnear dscrmnant analyss, C classes LDA vs. PCA Lmtatons of LDA Varants of LDA Other dmensonalty reducton methods CSCE 666 Pattern Analyss
More informationModeling and Simulation of Multi-Agent System of China's Real Estate Market Based on Bayesian Network Decision-Making
Int. J. on Recent Trends n Engneerng and Technology, Vol. 11, No. 1, July 2014 Modelng and Smulaton of Mult-Agent System of Chna's Real Estate Market Based on Bayesan Network Decson-Makng Yang Shen, Shan
More informationiavenue iavenue i i i iavenue iavenue iavenue
Saratoga Systems' enterprse-wde Avenue CRM system s a comprehensve web-enabled software soluton. Ths next generaton system enables you to effectvely manage and enhance your customer relatonshps n both
More informationCalculation of Sampling Weights
Perre Foy Statstcs Canada 4 Calculaton of Samplng Weghts 4.1 OVERVIEW The basc sample desgn used n TIMSS Populatons 1 and 2 was a two-stage stratfed cluster desgn. 1 The frst stage conssted of a sample
More informationRobust Design of Public Storage Warehouses. Yeming (Yale) Gong EMLYON Business School
Robust Desgn of Publc Storage Warehouses Yemng (Yale) Gong EMLYON Busness School Rene de Koster Rotterdam school of management, Erasmus Unversty Abstract We apply robust optmzaton and revenue management
More informationIWFMS: An Internal Workflow Management System/Optimizer for Hadoop
IWFMS: An Internal Workflow Management System/Optmzer for Hadoop Lan Lu, Yao Shen Department of Computer Scence and Engneerng Shangha JaoTong Unversty Shangha, Chna lustrve@gmal.com, yshen@cs.sjtu.edu.cn
More informationbenefit is 2, paid if the policyholder dies within the year, and probability of death within the year is ).
REVIEW OF RISK MANAGEMENT CONCEPTS LOSS DISTRIBUTIONS AND INSURANCE Loss and nsurance: When someone s subject to the rsk of ncurrng a fnancal loss, the loss s generally modeled usng a random varable or
More informationFace Verification Problem. Face Recognition Problem. Application: Access Control. Biometric Authentication. Face Verification (1:1 matching)
Face Recognton Problem Face Verfcaton Problem Face Verfcaton (1:1 matchng) Querymage face query Face Recognton (1:N matchng) database Applcaton: Access Control www.vsage.com www.vsoncs.com Bometrc Authentcaton
More informationProject Networks With Mixed-Time Constraints
Project Networs Wth Mxed-Tme Constrants L Caccetta and B Wattananon Western Australan Centre of Excellence n Industral Optmsaton (WACEIO) Curtn Unversty of Technology GPO Box U1987 Perth Western Australa
More informationStochastic Protocol Modeling for Anomaly Based Network Intrusion Detection
Stochastc Protocol Modelng for Anomaly Based Network Intruson Detecton Juan M. Estevez-Tapador, Pedro Garca-Teodoro, and Jesus E. Daz-Verdejo Department of Electroncs and Computer Technology Unversty of
More informationBERNSTEIN POLYNOMIALS
On-Lne Geometrc Modelng Notes BERNSTEIN POLYNOMIALS Kenneth I. Joy Vsualzaton and Graphcs Research Group Department of Computer Scence Unversty of Calforna, Davs Overvew Polynomals are ncredbly useful
More informationAssessing Student Learning Through Keyword Density Analysis of Online Class Messages
Assessng Student Learnng Through Keyword Densty Analyss of Onlne Class Messages Xn Chen New Jersey Insttute of Technology xc7@njt.edu Brook Wu New Jersey Insttute of Technology wu@njt.edu ABSTRACT Ths
More information320 The Internatonal Arab Journal of Informaton Technology, Vol. 5, No. 3, July 2008 Comparsons Between Data Clusterng Algorthms Osama Abu Abbas Computer Scence Department, Yarmouk Unversty, Jordan Abstract:
More informationAN APPOINTMENT ORDER OUTPATIENT SCHEDULING SYSTEM THAT IMPROVES OUTPATIENT EXPERIENCE
AN APPOINTMENT ORDER OUTPATIENT SCHEDULING SYSTEM THAT IMPROVES OUTPATIENT EXPERIENCE Yu-L Huang Industral Engneerng Department New Mexco State Unversty Las Cruces, New Mexco 88003, U.S.A. Abstract Patent
More informationGender Classification for Real-Time Audience Analysis System
Gender Classfcaton for Real-Tme Audence Analyss System Vladmr Khryashchev, Lev Shmaglt, Andrey Shemyakov, Anton Lebedev Yaroslavl State Unversty Yaroslavl, Russa vhr@yandex.ru, shmaglt_lev@yahoo.com, andrey.shemakov@gmal.com,
More informationA Novel Methodology of Working Capital Management for Large. Public Constructions by Using Fuzzy S-curve Regression
Novel Methodology of Workng Captal Management for Large Publc Constructons by Usng Fuzzy S-curve Regresson Cheng-Wu Chen, Morrs H. L. Wang and Tng-Ya Hseh Department of Cvl Engneerng, Natonal Central Unversty,
More informationNetwork Security Situation Evaluation Method for Distributed Denial of Service
Network Securty Stuaton Evaluaton Method for Dstrbuted Denal of Servce Jn Q,2, Cu YMn,2, Huang MnHuan,2, Kuang XaoHu,2, TangHong,2 ) Scence and Technology on Informaton System Securty Laboratory, Bejng,
More informationAn Analysis of Central Processor Scheduling in Multiprogrammed Computer Systems
STAN-CS-73-355 I SU-SE-73-013 An Analyss of Central Processor Schedulng n Multprogrammed Computer Systems (Dgest Edton) by Thomas G. Prce October 1972 Techncal Report No. 57 Reproducton n whole or n part
More informationA Programming Model for the Cloud Platform
Internatonal Journal of Advanced Scence and Technology A Programmng Model for the Cloud Platform Xaodong Lu School of Computer Engneerng and Scence Shangha Unversty, Shangha 200072, Chna luxaodongxht@qq.com
More informationTopic Identification based on Bayesian Belief Networks in the context of an Air Traffic Control Task
Procesamento del Lenguaje Natural, núm. 35 (2005), pp. 327-332 recbdo 06-05-2005; aceptado 01-06-2005 Topc Identfcaton based on Bayesan Belef Networs n the context of an Ar Traffc Control Tas F. Fernández,
More informationThe OC Curve of Attribute Acceptance Plans
The OC Curve of Attrbute Acceptance Plans The Operatng Characterstc (OC) curve descrbes the probablty of acceptng a lot as a functon of the lot s qualty. Fgure 1 shows a typcal OC Curve. 10 8 6 4 1 3 4
More informationResearch on Evaluation of Customer Experience of B2C Ecommerce Logistics Enterprises
3rd Internatonal Conference on Educaton, Management, Arts, Economcs and Socal Scence (ICEMAESS 2015) Research on Evaluaton of Customer Experence of B2C Ecommerce Logstcs Enterprses Yle Pe1, a, Wanxn Xue1,
More informationA Novel Auction Mechanism for Selling Time-Sensitive E-Services
A ovel Aucton Mechansm for Sellng Tme-Senstve E-Servces Juong-Sk Lee and Boleslaw K. Szymansk Optmaret Inc. and Department of Computer Scence Rensselaer Polytechnc Insttute 110 8 th Street, Troy, Y 12180,
More informationEfficient Project Portfolio as a tool for Enterprise Risk Management
Effcent Proect Portfolo as a tool for Enterprse Rsk Management Valentn O. Nkonov Ural State Techncal Unversty Growth Traectory Consultng Company January 5, 27 Effcent Proect Portfolo as a tool for Enterprse
More informationWeb Spam Detection Using Machine Learning in Specific Domain Features
Journal of Informaton Assurance and Securty 3 (2008) 220-229 Web Spam Detecton Usng Machne Learnng n Specfc Doman Features Hassan Najadat 1, Ismal Hmed 2 Department of Computer Informaton Systems Faculty
More informationFast Fuzzy Clustering of Web Page Collections
Fast Fuzzy Clusterng of Web Page Collectons Chrstan Borgelt and Andreas Nürnberger Dept. of Knowledge Processng and Language Engneerng Otto-von-Guercke-Unversty of Magdeburg Unverstätsplatz, D-396 Magdeburg,
More informationStudy on Model of Risks Assessment of Standard Operation in Rural Power Network
Study on Model of Rsks Assessment of Standard Operaton n Rural Power Network Qngj L 1, Tao Yang 2 1 Qngj L, College of Informaton and Electrcal Engneerng, Shenyang Agrculture Unversty, Shenyang 110866,
More informationWhen Network Effect Meets Congestion Effect: Leveraging Social Services for Wireless Services
When Network Effect Meets Congeston Effect: Leveragng Socal Servces for Wreless Servces aowen Gong School of Electrcal, Computer and Energy Engeerng Arzona State Unversty Tempe, AZ 8587, USA xgong9@asuedu
More informationA Dynamic Energy-Efficiency Mechanism for Data Center Networks
A Dynamc Energy-Effcency Mechansm for Data Center Networks Sun Lang, Zhang Jnfang, Huang Daochao, Yang Dong, Qn Yajuan A Dynamc Energy-Effcency Mechansm for Data Center Networks 1 Sun Lang, 1 Zhang Jnfang,
More informationSearching for Interacting Features for Spam Filtering
Searchng for Interactng Features for Spam Flterng Chuanlang Chen 1, Yun-Chao Gong 2, Rongfang Be 1,, and X. Z. Gao 3 1 Department of Computer Scence, Bejng Normal Unversty, Bejng 100875, Chna 2 Software
More informationAD-SHARE: AN ADVERTISING METHOD IN P2P SYSTEMS BASED ON REPUTATION MANAGEMENT
1 AD-SHARE: AN ADVERTISING METHOD IN P2P SYSTEMS BASED ON REPUTATION MANAGEMENT Nkos Salamanos, Ev Alexogann, Mchals Vazrganns Department of Informatcs, Athens Unversty of Economcs and Busness salaman@aueb.gr,
More informationAPPLICATION OF PROBE DATA COLLECTED VIA INFRARED BEACONS TO TRAFFIC MANEGEMENT
APPLICATION OF PROBE DATA COLLECTED VIA INFRARED BEACONS TO TRAFFIC MANEGEMENT Toshhko Oda (1), Kochro Iwaoka (2) (1), (2) Infrastructure Systems Busness Unt, Panasonc System Networks Co., Ltd. Saedo-cho
More informationGaining Insights to the Tea Industry of Sri Lanka using Data Mining
Proceedngs of the Internatonal MultConference of Engneers and Computer Scentsts 2008 Vol I Ganng Insghts to the Tea Industry of Sr Lanka usng Data Mnng H.C. Fernando, W. M. R Tssera, and R. I. Athauda
More informationScale Dependence of Overconfidence in Stock Market Volatility Forecasts
Scale Dependence of Overconfdence n Stoc Maret Volatlty Forecasts Marus Glaser, Thomas Langer, Jens Reynders, Martn Weber* June 7, 007 Abstract In ths study, we analyze whether volatlty forecasts (judgmental
More informationOverview of monitoring and evaluation
540 Toolkt to Combat Traffckng n Persons Tool 10.1 Overvew of montorng and evaluaton Overvew Ths tool brefly descrbes both montorng and evaluaton, and the dstncton between the two. What s montorng? Montorng
More informationDynamic Pricing for Smart Grid with Reinforcement Learning
Dynamc Prcng for Smart Grd wth Renforcement Learnng Byung-Gook Km, Yu Zhang, Mhaela van der Schaar, and Jang-Won Lee Samsung Electroncs, Suwon, Korea Department of Electrcal Engneerng, UCLA, Los Angeles,
More informationCausal, Explanatory Forecasting. Analysis. Regression Analysis. Simple Linear Regression. Which is Independent? Forecasting
Causal, Explanatory Forecastng Assumes cause-and-effect relatonshp between system nputs and ts output Forecastng wth Regresson Analyss Rchard S. Barr Inputs System Cause + Effect Relatonshp The job of
More information