Probabilistic Latent Semantic User Segmentation for Behavioral Targeted Advertising*


Xiaohui Wu 1,2, Jun Yan 2, Ning Liu 2, Shuicheng Yan 3, Ying Chen 1, Zheng Chen 2

1 Department of Computer Science, Beijing Institute of Technology, Beijing, China, xiaohuiwu85@gmail.com, chenying1@bit.edu.cn
2 Microsoft Research Asia, Sigma Center, 49 Zhichun Road, Beijing, China, {v-xiwu, junyan, ningl, zhengc}@microsoft.com
3 National University of Singapore, Office E, 4 Engineering Drive 3, Singapore, eleyans@nus.edu.sg

* The work was done when the first author was an intern at Microsoft Research Asia.

ABSTRACT
Behavioral Targeting (BT), which aims to deliver the most appropriate advertisements to the most appropriate users, is attracting much attention in the online advertising market. A key challenge of BT is how to automatically segment users for ad delivery, and good user segmentation may significantly improve the ad click-through rate (CTR). Unlike classical user segmentation strategies, which rarely take the semantics of user behaviors into consideration, we propose in this paper a novel user segmentation algorithm named Probabilistic Latent Semantic User Segmentation (PLSUS). PLSUS adopts probabilistic latent semantic analysis to mine the relationship between users and their behaviors so as to segment users in a semantic manner. We perform experiments on the real-world ad click-through log of a commercial search engine. Compared with two classical clustering algorithms, K-Means and CLUTO, PLSUS can further improve the ads CTR by up to 100%. To the best of our knowledge, this work is an early semantic user segmentation study for BT in academia.

Categories and Subject Descriptors
H.3.5 [Information Storage and Retrieval]: Online Information Services - Commercial Services; I.5.1 [Pattern Recognition]: Models - Statistical

General Terms
Algorithms, Performance, Experimentation

Keywords
Behavioral Targeting (BT), User segmentation, probabilistic latent semantic analysis

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. ADKDD '09, June 28, 2009, Paris, France. Copyright 2009 ACM.

1. INTRODUCTION
Nowadays, a large number of advertisers would like to publish their advertisements through the Internet, which has given rise to a new, developing field known as online advertising science. Sponsored search [9] and contextual ads [4] are two of the most widely studied online advertising business models. In addition, Behavioral Targeting, which analyzes users' behaviors to deliver appropriate ads to potential consumers, has been shown to make online advertising more effective [19]. A crucial part of BT is the problem of user segmentation, which aims at grouping users with similar behaviors into user segments. Since advertisers generally select the user segments most relevant to their ads, if users with similar purchase intentions are successfully gathered into the same segment, advertisers may gain more profit from ad delivery. Thus, the quality of user segmentation has a dominant impact on the performance of behavioral targeted advertising.

In this paper, we focus on the problem of user segmentation for BT in search engine advertising. User segmentation is a process that assigns each user to one or more segments so that users with similar interests or purchase intentions fall within the same segment. We formulate the problem as follows. Suppose a set of online users is given. For each user, we use his or her historical online behaviors, such as queries, to depict his or her interests. Some ads have been displayed to these users, and each impression is recorded together with whether the ad was clicked. Our objective is to group all users into appropriate segments by analyzing user behaviors, in order to improve the ad click probability within the user segments in contrast to mass-market ads.

Conventional user segmentation approaches such as classification and clustering have two limitations. (1) Many traditional strategies use keywords as features and then perform clustering or classification on these features. As a result, two users who have similar buying intentions but share no common words will not be put into the same segment. (2) Many classical clustering methods do not allow an object to belong to multiple clusters, which means one user can only stay in a single segment.

We notice that semantic approaches such as Latent Semantic Analysis (LSA) [8], Probabilistic Latent Semantic Analysis (PLSA) [13] and Latent Dirichlet Allocation (LDA) [1] are widely studied and adopted in the field of document classification. Among these approaches, PLSA effectively mines the relationship between documents and words through a hidden variable called the topic. Moreover, PLSA is able to associate one document with multiple topics. Motivated by PLSA in the field of text mining, we exploit the analogy between the user-query and document-word relationships and present a semantic approach called Probabilistic Latent Semantic User Segmentation (PLSUS) to address the limitations of traditional user segmentation strategies.

In order to perform semantic user segmentation for BT, we first split users' search queries into terms, so that each user can be represented as a collection of terms [19]. Using this Bag of Words (BOW) representation [18], a latent variable representing the topics, i.e. the semantic user interests, is introduced to represent users. This topical variable bridges users and their observed behaviors. Our experiments will show that the latent semantic topics can represent users' interests, which imply potential purchase intentions. For this reason, we directly use the latent semantic topics to segment users. The Expectation Maximization (EM) approach is applied to mine the latent semantic topics. Since a user may have multiple interests, to better capture them we set a threshold and put the user into every segment whose probability is larger than the predefined threshold.

In the experiments, we compare the proposed PLSUS and a modified version, known as Single-PLSUS, with two commonly used clustering algorithms, CLUTO and K-Means. Single-PLSUS allows a user to be in only one segment, as many traditional clustering algorithms do. The results show that PLSUS can improve the ads CTR by up to 100%. In addition, PLSUS performs well on the classical F-measure.

The rest of this paper is organized as follows. In Section 2, we introduce the background knowledge about BT and semantic graphical models in text mining. In Section 3, we describe our solution to semantic user segmentation, namely PLSUS. In Section 4, we introduce the experimental configuration and results with analysis. Finally, in Section 5, we conclude this paper and discuss future work.

2. BACKGROUND
In this section, we introduce the background knowledge needed to understand this work. We present the basics of BT, including its definition and related commercial systems, in Section 2.1. In addition, we review semantic approaches such as LSA, PLSA, and LDA in Section 2.2.

2.1 Behavior Targeting
Behavioral Targeting is an advertising methodology that is burgeoning in online advertising. With this technique, ads can be effectively delivered to the most relevant users. Behavior Targeting may improve the performance of online advertisement delivery through two major steps, namely user segmentation and user segment ranking. In the user segmentation step, users are placed into user segments created in the system, based on online behaviors such as visited websites, clicked pages and input queries. In the user segment ranking step, given an ad, user segments are ranked by relevance and the top segments are chosen for ad delivery. Thus, BT displays the ads to the most appropriate users.

At present, BT is attracting more and more attention in both industry and academia. In industry, a large number of commercial systems involving Behavioral Targeting have been proposed: Adlink [20], which takes the short user session into consideration for BT; DoubleClick [24], which adopts special features such as browser types and operating systems of users to improve the user segmentation step; Specificmedia [28], which predicts each user's interest and purchase intention as a score; and Yahoo! Smart Ads [30], which integrates demographic and geographic targeting. Additionally, Almond Net [21], Blue Lithium [22], Burst [23], NebuAd [25], Phorm [26], Revenue Science [27], and TACODA [29] are commercial systems with BT. In academia, Yan et al. [19] first studied the improvement of BT in commercial search engines from three aspects: its effectiveness, the degree of improvement, and the best strategy for BT.

User segmentation is a process that assigns each user to one or more segments according to a specific criterion.
In BT, this criterion is to ensure, as far as possible, that users with similar interests and purchase intentions are in the same segment. However, we cannot observe that information directly. The most widespread way is to mine user behaviors to represent user interests and purchase intentions; that is, users with similar behaviors are assumed to have similar preferences. Thus, user segmentation for BT can be described as attempting to place each user in one or more segments so that users with similar behaviors are in the same segment. Since advertisers tend to pay for the most relevant segments, the quality of user segmentation is crucial. On the one hand, if the system can gather more users with similar interests into one segment, advertisers will need to buy fewer segments to deliver their ads. On the other hand, CTR is expected to improve if the similarity between each pair of users within the same segment is large. Thus, user segmentation is a key problem in BT applications.

Traditional user segmentation approaches for BT can be classified into three categories, namely manual user segmentation, user classification, and user clustering. Manual rule-based user segmentation, which assigns users to segments by hand, suffers from a significant time cost. Because large-scale data is used for BT, this method is hardly adopted by commercial systems. User classification and user clustering apply classification and clustering to users, respectively. The traditional clustering and classification approaches have two limitations in this application scenario. (1) Users are segmented only based on the contents of their behaviors, not their semantic interests. With the Bag of Words model, traditional strategies use terms as features for clustering. That means two users with similar purchase intentions but without terms in common have little chance of being grouped into one segment. (2) Many clustering methods that are widely used for BT place one object in exactly one cluster. Because of this limitation, if a user has two completely different interests, only one interest can be represented and the other one has to be discarded. Thus, new semantic segmentation approaches for BT are desirable.

2.2 Semantic Analysis
Semantic analysis, which is a well-established technique in industry, mines hidden semantic relationships among objects. Latent Semantic Analysis (LSA) [8] is the well-known approach for deriving latent semantic relationships and is widely used in automatic indexing and information retrieval. The main idea is to map high-dimensional vectors to low-dimensional ones in the latent semantic space. The Probabilistic Latent Semantic Analysis (PLSA) model [13, 15], which is derived from LSA, is able to capture hidden variables with a solid statistical foundation. Each object is represented by a convex combination of topics, where the topic is a latent variable in PLSA. Latent Dirichlet Allocation (LDA) [1] is similar to PLSA. The difference between these two models is that the topic distribution is assumed to have a Dirichlet prior in LDA. In document classification, LDA derives more reasonable mixtures of topics. However, the work in [11] has proved that the PLSA model is equivalent to the LDA model under a uniform Dirichlet prior distribution. In this work, we focus on PLSA to derive our PLSUS model.

PLSA is a significant breakthrough, since it can discover latent variables with more flexibility. Besides, its parameters can be easily estimated using the EM algorithm. In practice, PLSA is widely used in many fields such as document classification [2, 3, 10, 17], information retrieval [14], web usage mining [16], co-citation analysis [5, 6] and collaborative filtering [7, 12]. However, few works apply PLSA to user segmentation for BT. In our study, we describe each user as a collection of terms extracted from his or her behaviors, so that users can be represented in the Bag of Words model, similar to the commonly used document representation strategy.

3. PROBABILISTIC LATENT SEMANTIC USER SEGMENTATION (PLSUS)
In this section, we introduce our semantic user segmentation algorithm. PLSA, which can discover the latent relationship between two kinds of objects, is widely studied in document classification and clustering problems. In text mining, we generally use the Bag of Words model [18] to represent documents. According to the work of Yan et al. [19], users' behaviors can be represented by their historical queries. Since a query consists of terms, we can treat each query as a set of terms. In this way, each user can be represented by a bag of words, which is the same as the representation of a text document.
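The bag-of-words user representation described above can be sketched in Python as follows. This is a minimal illustration, not the authors' preprocessing: the function name `build_cooccurrence`, the toy query log, and the whitespace/lowercase tokenization are all assumptions made for the example.

```python
from collections import Counter


def build_cooccurrence(user_queries):
    """Build user-term counts from each user's historical queries.

    user_queries maps a user id to a list of query strings.
    Returns (users, terms, counts) where counts[i][j] is the number of
    times users[i] used terms[j].
    Assumption: terms are obtained by lowercasing and splitting on whitespace.
    """
    users = sorted(user_queries)
    bags = {
        u: Counter(term for query in user_queries[u] for term in query.lower().split())
        for u in users
    }
    terms = sorted({term for bag in bags.values() for term in bag})
    counts = [[bags[u][term] for term in terms] for u in users]
    return users, terms, counts


# Hypothetical toy log: two users with two queries each.
logs = {
    "u1": ["cheap flights", "flights to paris"],
    "u2": ["books", "cheap books"],
}
users, terms, counts = build_cooccurrence(logs)
```

Each row of `counts` is one user's bag of words over the shared vocabulary, which is the form a document-term matrix takes in text mining.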
Let $u_i \in U = \{u_1, u_2, \ldots, u_n\}$ stand for a user, where $U$ represents the set of all users for BT, and suppose $t_j \in T = \{t_1, t_2, \ldots, t_m\}$ is a term, where $T$ represents the vocabulary of all terms used by all users. We define $T_{u_i}$ as the set of all terms used by $u_i$; thus,

$$T = \bigcup_{u_i \in U} T_{u_i}.$$

Then, we define the co-occurrence matrix $N = \{n(u_i, t_j)\}$, where $n(u_i, t_j)$ is the number of times $t_j$ is used by $u_i$. To semantically segment users, we introduce the latent variable $z_k \in Z = \{z_1, z_2, \ldots, z_l\}$, which represents the topics, i.e. the semantic intentions of users. This latent variable is closely related to both users and queries, where queries have been transformed into terms. From the user's perspective, a topic implies a hidden interest of the user. From the term's perspective, the terms in one topic tend to belong to some specific field. Here, we assume that, given a topic variable $z_k$, users and terms are independent of each other.

We adopt the classical aspect model [13] in PLSUS. The graphical model of the aspect model is given in Figure 1.

Figure 1. Graph of the aspect model.

In the BT scenario, each user $u_i$ has the probability $P(z_k \mid u_i)$ of generating a topic $z_k$, and then $z_k$ has the probability $P(t_j \mid z_k)$ of generating the term $t_j$. The basic model is

$$P(u_i, t_j) = P(u_i)\, P(t_j \mid u_i), \qquad P(t_j \mid u_i) = \sum_{z_k \in Z} P(t_j \mid z_k)\, P(z_k \mid u_i).$$

Notice that this model contains the probabilities $P(z_k \mid u_i)$ and $P(u_i)$, which are not convenient to compute. Thus, we transform this model into an equivalent form,

$$P(u_i, t_j) = \sum_{z_k \in Z} P(z_k)\, P(u_i \mid z_k)\, P(t_j \mid z_k),$$

where $P(z_k)$ is the probability that $z_k$ is observed in $Z$, $P(u_i \mid z_k)$ is the probability that $u_i$ is relevant to the given topic $z_k$, and $P(t_j \mid z_k)$ is the probability that $t_j$ is related to the given topic $z_k$. The graphical model representation is shown in Figure 2.

Figure 2. Graph of the PLSUS.

As with PLSA in the field of text mining, we aim to maximize the likelihood defined as

$$L = \sum_{i=1}^{n} \sum_{j=1}^{m} n(u_i, t_j) \log P(u_i, t_j) = \sum_{i=1}^{n} \sum_{j=1}^{m} n(u_i, t_j) \log \sum_{k=1}^{l} P(z_k)\, P(u_i \mid z_k)\, P(t_j \mid z_k).$$

In order to maximize $L$, we adopt the classical Expectation Maximization (EM) approach. EM is widely used for computing maximum likelihood estimates in latent variable models. It is an iterative method that alternates between two steps. (1) Expectation step (E step). Using the current estimates of the parameters, we compute the posterior probabilities $P(z_k \mid u_i, t_j)$ of the latent variable. (2) Maximization step (M step). Aiming to maximize the expected complete-data likelihood $E[L^c]$, we update $P(z_k)$, $P(u_i \mid z_k)$ and $P(t_j \mid z_k)$.
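The E and M steps above can be sketched in pure Python as follows. The update formulas are not printed in this text, so the sketch uses the standard EM updates for the symmetric aspect model of PLSA [13]; that choice, the function name `plsa_em`, the random initialization, and the toy count matrix are assumptions of this example rather than the authors' implementation.

```python
import math
import random


def _normalize(values):
    """Scale non-negative values to sum to 1 (uniform if they are all zero)."""
    total = sum(values)
    if total == 0:
        return [1.0 / len(values)] * len(values)
    return [v / total for v in values]


def plsa_em(counts, n_topics, n_iter=50, seed=0):
    """Fit P(z), P(u|z), P(t|z) by EM on a user-term count matrix.

    counts[i][j] plays the role of n(u_i, t_j).
    Returns (p_z, p_u_z, p_t_z, loglik) with p_z[k] = P(z_k),
    p_u_z[k][i] = P(u_i|z_k), p_t_z[k][j] = P(t_j|z_k), and loglik the
    log-likelihood L evaluated at the start of every iteration.
    """
    rng = random.Random(seed)
    n_users, n_terms = len(counts), len(counts[0])
    p_z = [1.0 / n_topics] * n_topics
    p_u_z = [_normalize([rng.random() + 0.1 for _ in range(n_users)]) for _ in range(n_topics)]
    p_t_z = [_normalize([rng.random() + 0.1 for _ in range(n_terms)]) for _ in range(n_topics)]
    loglik = []
    for _ in range(n_iter):
        acc_z = [0.0] * n_topics
        acc_u = [[0.0] * n_users for _ in range(n_topics)]
        acc_t = [[0.0] * n_terms for _ in range(n_topics)]
        ll = 0.0
        for i in range(n_users):
            for j in range(n_terms):
                n_ij = counts[i][j]
                if n_ij == 0:
                    continue
                # E step: posterior P(z_k | u_i, t_j) is joint[k] / total.
                joint = [p_z[k] * p_u_z[k][i] * p_t_z[k][j] for k in range(n_topics)]
                total = sum(joint)  # this is P(u_i, t_j) under the current model
                ll += n_ij * math.log(total)
                for k in range(n_topics):
                    weight = n_ij * joint[k] / total
                    acc_z[k] += weight
                    acc_u[k][i] += weight
                    acc_t[k][j] += weight
        loglik.append(ll)
        # M step: re-estimate each distribution from the expected counts.
        p_z = _normalize(acc_z)
        p_u_z = [_normalize(row) for row in acc_u]
        p_t_z = [_normalize(row) for row in acc_t]
    return p_z, p_u_z, p_t_z, loglik


# Hypothetical toy data: users 0-1 share terms 0-1, users 2-3 share terms 2-3.
counts = [[2, 1, 0, 0], [1, 2, 0, 0], [0, 0, 2, 1], [0, 0, 1, 2]]
p_z, p_u_z, p_t_z, loglik = plsa_em(counts, n_topics=2, n_iter=30)
```

Because EM never decreases the likelihood, the values in `loglik` should be non-decreasing, which gives a simple sanity check on any implementation of these two steps.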

After finishing the EM computation, PLSUS segments users with the obtained model. Since a topic is closely related to both users and terms, topics can naturally be used as user segments. In this way, the semantic attributes become the dominant factors in user segmentation. The remaining question is how to assign users to topics. To answer it, we focus on the probability $P(z_k \mid u_i)$, i.e. the probability that the topic (user segment) $z_k$ is observed given a user $u_i$. It describes how close the relationship between $z_k$ and $u_i$ is, and can be computed by

$$P(z_k \mid u_i) = \frac{\sum_{j=1}^{m} n(u_i, t_j)\, P(z_k \mid u_i, t_j)}{\sum_{j=1}^{m} \sum_{k'=1}^{l} n(u_i, t_j)\, P(z_{k'} \mid u_i, t_j)}.$$

Intuitively, the easiest way to segment users into topics is to compute all $P(z_k \mid u_i)$, $z_k \in Z$, for each $u_i$, and then put $u_i$ into the topic with the highest $P(z_k \mid u_i)$. However, this approach cannot handle the following circumstance: if a user is interested in both sports and cooking, and there are two topics that exactly correspond to sports and cooking, this segmentation method will still choose at most one topic for the user. In this way, we may lose one of the user's interests. To overcome this deficiency, we present a novel approach for segmenting users based on the probability $P(z_k \mid u_i)$, in which a threshold is applied. Let $S$ be the set of user segments and $s_k \in S$ the segment with topic $z_k$; the user segmentation rule is then

$$u_i \in s_k \ \text{ if } P(z_k \mid u_i) > \text{threshold}, \qquad u_i \notin s_k \ \text{ otherwise}.$$

Compared with traditional clustering methods, this simple method allows one user to belong to multiple segments.

4. EXPERIMENTS
In this section, we systematically evaluate the proposed PLSUS algorithm. Two standard clustering methods are used as baselines in the experiments. In addition, to allow a fairer comparison with standard clustering approaches, a modified PLSUS, which we call Single-PLSUS, is introduced. Several evaluation metrics are used to measure the performance of each approach.

4.1 Data Sets
In this part, we use one day's ad click-through log collected from a commercial search engine. This data effectively reflects users' click-through behaviors. Table 1 shows the format of the data used in our experiments. There are four properties of the data that we focus on. UserId identifies a specific user; different users have different UserIds. Similar to UserId, AdId is the unique identifier of each advertisement. Query shows the content of a query issued by the user, which we divide into terms for use in PLSUS. ClickCnt is an important property that is used in our evaluation metrics such as CTR. From the example in Table 1, we know that the user with UserId EEEC97C25FD50C1AB282D39FB13976D9 issued a query whose content is "books", and the system then displayed the advertisement identified by AdId to this user. However, this user did not click the ad.

Table 1. Format of log record used in our experiments.
UserId: EEEC97C25FD50C1AB282D39FB13976D9
Query: Books
AdId: [value not preserved]
ClickCnt: 0

We use two datasets, containing 120,000 and 150,000 log records respectively, to verify the performance of PLSUS. Both of them contain thousands of users. In our experiments, we group all users in the 120,000-log dataset into 5 and 10 segments, while all users in the 150,000-log dataset are grouped into 10 and 20 segments, using the different approaches.

4.2 Experiment Setup
In this part, we introduce the key steps of our experiments. Let $A = \{a_1, a_2, \ldots, a_n\}$ be the set of ads in our dataset, and let $U_i = \{u_{i1}, u_{i2}, \ldots, u_{im_i}\}$ be the group of users to whom $a_i$ has been displayed. After we segment users with the different approaches, we obtain the user segments. Thus, we define $D(U_i) = \{d_1(U_i), d_2(U_i), \ldots, d_K(U_i)\}$, $i = 1, 2, \ldots, n$, as the distribution of $U_i$ over the obtained user segments, where $d_k(U_i)$ is the set of users in $U_i$ who belong to the $k$th segment. The $k$th segment can then be described as

$$d_k = \bigcup_{i = 1, 2, \ldots, n} d_k(U_i).$$

The key steps in our experiments are:
(1) We compare PLSUS, Single-PLSUS, K-Means and CLUTO on our dataset, where Single-PLSUS is a modified PLSUS that we introduce in a later section of this paper.
(2) We vary the threshold that is used to segment users after the final model has been obtained by the EM algorithm, in order to test the sensitivity of PLSUS.

4.3 Evaluation Metrics
In [19], Yan et al. introduced several metrics that measure the performance of BT effectively. Following them, we use four metrics to measure the performance of each approach and to compare our solution with the baselines: ads Click-Through Rate (CTR), ads Click-Through Rate Improvement, ads click Entropy and F-measure. With the symbols defined above, CTR can be represented by

$$CTR(a_i) = \frac{1}{m_i} \sum_{j=1}^{m_i} \delta(u_{ij}),$$

where $\delta(u_{ij})$ is defined as

$$\delta(u_{ij}) = \begin{cases} 1, & \text{if } u_{ij} \text{ clicked } a_i \\ 0, & \text{otherwise.} \end{cases}$$

The CTR of $a_i$ over the user segment $d_k$ is

$$CTR(a_i \mid d_k) = \frac{1}{|d_k(U_i)|} \sum_{u_{ij} \in d_k(U_i)} \delta(u_{ij}),$$

where $|d_k(U_i)|$ is the number of users in $d_k(U_i)$. Note that $CTR(a_i)$ is the raw CTR; in other words, it is the CTR over all users to whom $a_i$ was displayed. $CTR(a_i \mid d_k)$ is the CTR of each user segment after segmentation.

In order to measure the improvement of CTR by user segmentation, we define a new evaluation metric for PLSUS. This metric should satisfy two conditions. (1) Maximum: we choose the segment that has the maximum CTR. This is reasonable because the ad publisher would like to recommend the user segment with the highest ad click probability to the advertiser for ad delivery. (2) Majority: the number of users in this segment cannot be less than the average. This condition rules out degenerate situations. For example, if the $k$th user segment has only one user and he or she clicked $a_i$, then $CTR(a_i \mid d_k) = 1$, yet this segment is clearly not appropriate to recommend to the advertiser. Integrating these two conditions, we define the CTR improvement for $a_i$ as

$$\Delta(a_i) = \frac{CTR(a_i \mid d^*(a_i)) - CTR(a_i)}{CTR(a_i)},$$

where

$$d^*(a_i) = \arg\max_{d_k \in \tilde{D}} \{CTR(a_i \mid d_k)\} \quad \text{and} \quad \tilde{D} = \{d_k : |d_k(U_i)| \geq m_i / K,\ k = 1, 2, \ldots, K\}.$$

Thus, the overall CTR improvement is $\sum_{i=1}^{n} \Delta(a_i) / n$.

Entropy is defined as

$$Enp(a_i) = -\sum_{k=1}^{K} P(d_k \mid a_i) \log P(d_k \mid a_i),$$

where

$$P(d_k \mid a_i) = \frac{1}{m_i} \sum_{u_{ij} \in d_k(U_i)} \delta(u_{ij}).$$

Note that the smaller the Entropy is, the better the results are [19].

The classical F-measure, including Precision, Recall and F-measure, is defined as

$$Pre(a_i \mid d_k) = CTR(a_i \mid d_k),$$

$$Rec(a_i \mid d_k) = \frac{\sum_{u_{ij} \in d_k(U_i)} \delta(u_{ij})}{\sum_{j=1}^{m_i} \delta(u_{ij})},$$

$$F(a_i \mid d_k) = \frac{2\, Pre(a_i \mid d_k)\, Rec(a_i \mid d_k)}{Pre(a_i \mid d_k) + Rec(a_i \mid d_k)},$$

where the larger the F-measure is, the better the performance is.

4.4 Results
In this part, we introduce the details of our experiments and show the results. To show the performance of PLSUS, we compare it with traditional clustering methods; CLUTO and K-Means are selected as the baselines. However, it is unfair to compare CLUTO and K-Means directly with PLSUS, since PLSUS allows one user to belong to multiple segments, while both CLUTO and K-Means permit one user to belong to only one user segment. To solve this problem, Single-PLSUS is implemented to bridge the gap between PLSUS and traditional clustering approaches.
With Single-PLSUS, a given user $u_i$ is placed in the unique segment $z_k$ that has the maximum $P(z_k \mid u_i)$. On the one hand, comparing Single-PLSUS with CLUTO and K-Means shows whether the semantic approach improves BT's performance. On the other hand, comparing PLSUS with Single-PLSUS clearly shows the impact of allowing one user to belong to multiple segments. The results are shown in Tables 2-4. Note that the best results are in boldface, and that we set the threshold to 0.2; further explanation is given later in this section.

CTR is one of the most basic and critical evaluation metrics for online advertising problems. From Table 2, we can observe two general phenomena. First, as the number of segments increases, the improvement of CTR increases as well. In the 150,000-log dataset, as the number of segments doubles, the improvement of CTR increases twofold. In the same dataset, with 20 segments, PLSUS improves CTR by up to 100% over traditional CLUTO. Second, all semantic approaches perform well on CTR improvement. On closer analysis, Single-PLSUS consistently exceeds CLUTO and K-Means. This supports the view that the semantic approach is appropriate for BT. Since we gathered all queries issued by each user and divided these queries into terms, we exploit the correspondence between user-query and document-word, and the results support the correctness of this idea. The comparison between PLSUS and Single-PLSUS shows the advantage of allowing a user to be placed in multiple segments. Besides, in the work of Yan et al. [19], the CTR improvement with CLUTO and K-Means is around 100% when users are grouped into 20 segments, which is consistent with our experimental results. Since the experiments of Yan et al. showed that CTR improvement can reach 670% with 160 user segments on a large-scale dataset, we expect to improve CTR further if we group users into more segments. In our future work, we will increase the scalability to verify this conclusion.
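The difference between the PLSUS and Single-PLSUS assignments discussed above can be sketched in Python as follows. The function names and the tiny hand-built model are hypothetical, and the strict "greater than" comparison is an assumption based on the description of putting users into segments whose probability is larger than the threshold.

```python
def topic_given_user(counts, p_z, p_u_z, p_t_z):
    """Compute p_z_u[i][k] = P(z_k | u_i) from fitted model parameters.

    counts[i][j] is n(u_i, t_j); p_z[k] = P(z_k); p_u_z[k][i] = P(u_i|z_k);
    p_t_z[k][j] = P(t_j|z_k).
    Assumption: every user has at least one term with a non-zero count.
    """
    n_topics = len(p_z)
    p_z_u = []
    for i, row in enumerate(counts):
        weights = [0.0] * n_topics
        for j, n_ij in enumerate(row):
            if n_ij == 0:
                continue
            joint = [p_z[k] * p_u_z[k][i] * p_t_z[k][j] for k in range(n_topics)]
            total = sum(joint)
            for k in range(n_topics):
                # n(u_i, t_j) * P(z_k | u_i, t_j)
                weights[k] += n_ij * joint[k] / total
        norm = sum(weights)
        p_z_u.append([w / norm for w in weights])
    return p_z_u


def plsus_segments(p_z_u, threshold):
    """PLSUS: a user joins every segment k with P(z_k | u_i) > threshold."""
    return [[k for k, p in enumerate(row) if p > threshold] for row in p_z_u]


def single_plsus_segments(p_z_u):
    """Single-PLSUS: a user joins only the segment with the largest P(z_k | u_i)."""
    return [max(range(len(row)), key=row.__getitem__) for row in p_z_u]


# Hypothetical model: one user, two terms, two topics, each topic emitting one term.
counts = [[3, 1]]
p_z = [0.5, 0.5]
p_u_z = [[1.0], [1.0]]
p_t_z = [[1.0, 0.0], [0.0, 1.0]]

p_z_u = topic_given_user(counts, p_z, p_u_z, p_t_z)
multi = plsus_segments(p_z_u, threshold=0.2)
single = single_plsus_segments(p_z_u)
```

In this toy case the user has probabilities 0.75 and 0.25 for the two topics, so the threshold rule keeps both interests while the single-segment rule keeps only the first.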
We compute the average ads click Entropy over all ads in the dataset we used. The result is shown in Table 3. In general, the entropies of all user segmentation approaches are almost the same, so entropy distinguishes these methods less clearly than CTR does. On closer observation, we find that the entropy of PLSUS is larger than that of the others. The reason follows from the design of the methods: PLSUS allows a single user to belong to multiple segments, which means that the same ad impression may be counted in more than one segment. As a result, its entropy is naturally larger than that of user segmentation approaches that associate each user with only one segment.
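The evaluation metrics of Section 4.3 can be sketched for a single ad in Python as follows. This is an illustrative reading of the definitions rather than the evaluation code used in the experiments; the function names, the toy impression data, and the use of the natural logarithm in the entropy are assumptions.

```python
import math


def ctr(shown, clicked):
    """Raw CTR: fraction of users shown the ad who clicked it."""
    return sum(1 for u in shown if u in clicked) / len(shown)


def segment_stats(shown, clicked, segment):
    """Return (number of shown users in the segment, clicks among them)."""
    members = [u for u in shown if u in segment]
    return len(members), sum(1 for u in members if u in clicked)


def ctr_improvement(shown, clicked, segments):
    """Relative CTR gain of the best segment holding at least m_i / K users."""
    base = ctr(shown, clicked)
    if base == 0:
        raise ValueError("raw CTR is zero, so the improvement is undefined")
    min_size = len(shown) / len(segments)  # the "majority" condition
    best = 0.0
    for segment in segments:
        size, clicks = segment_stats(shown, clicked, segment)
        if size > 0 and size >= min_size:
            best = max(best, clicks / size)
    return (best - base) / base


def click_entropy(shown, clicked, segments):
    """Entropy over P(d_k | a_i) = clicks in segment k / users shown the ad."""
    entropy = 0.0
    for segment in segments:
        _, clicks = segment_stats(shown, clicked, segment)
        p = clicks / len(shown)
        if p > 0:
            entropy -= p * math.log(p)
    return entropy


def f_measure(shown, clicked, segment):
    """F-measure with precision = segment CTR and recall = share of all clicks."""
    size, clicks = segment_stats(shown, clicked, segment)
    if size == 0 or clicks == 0:
        return 0.0
    precision = clicks / size
    recall = clicks / sum(1 for u in shown if u in clicked)
    return 2 * precision * recall / (precision + recall)


# Hypothetical ad shown to eight users, two of whom clicked; two segments of four.
shown = ["u1", "u2", "u3", "u4", "u5", "u6", "u7", "u8"]
clicked = {"u1", "u2"}
segments = [{"u1", "u2", "u3", "u4"}, {"u5", "u6", "u7", "u8"}]

improvement = ctr_improvement(shown, clicked, segments)
entropy = click_entropy(shown, clicked, segments)
f_score = f_measure(shown, clicked, segments[0])
```

Here the raw CTR is 0.25 and the best eligible segment has a CTR of 0.5, so the improvement is 1.0, i.e. 100%. Averaging these per-ad values over all ads gives the dataset-level figures.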

Table 2. CTR improvement of different user segmentation strategies.
Columns: 5 segments in 120,000 logs, 10 segments in 120,000 logs, 10 segments in 150,000 logs, 20 segments in 150,000 logs. Rows: PLSUS, Single-PLSUS, CLUTO, K-Means. [Cell values not preserved.]

Table 3. Ads click Entropy of different user segmentation strategies.
Columns: 5 segments in 120,000 logs, 10 segments in 120,000 logs, 10 segments in 150,000 logs, 20 segments in 150,000 logs. Rows: PLSUS, Single-PLSUS, CLUTO, K-Means. [Cell values not preserved.]

Table 4. F-measure of different user segmentation strategies.
Columns: 5 segments in 120,000 logs, 10 segments in 120,000 logs, 10 segments in 150,000 logs, 20 segments in 150,000 logs. Rows: Precision, Recall and F for each of PLSUS, Single-PLSUS, CLUTO and K-Means. [Cell values not preserved.]

Precision, Recall and F-measure are shown in Table 4. Note that the results reported in this table are averages over all ads. First of all, we observe two facts. (1) Semantic approaches achieve better Precision. Since we use the CTR as the Precision, this result is expected from the CTR improvement. (2) Among the semantic approaches, PLSUS performs best. From these two facts, we can conclude that our proposed methods help improve the Precision (CTR). An interesting observation is that the Recall of the traditional clustering approaches is higher than that of the others on our two small datasets. Considering their low Precision, we can infer that the high-CTR segments produced by CLUTO or K-Means include many users. In other words, the way a segment's CTR is improved in traditional approaches is by adding more users to this segment. In contrast, semantic user segmentation can improve the CTR without building user segments with too large a population. This characteristic is very useful for accurate ad delivery. Integrating Precision and Recall, the F-measure evaluates the overall performance of user segmentation. From the high F-measures of PLSUS and Single-PLSUS, we can draw the conclusion that semantic user segmentation performs better than classical clustering methods.

Finally, we discuss the influence of the threshold parameter in the PLSUS model.
We set up a series of experiments that group users into 10 segments on the 120,000-log dataset. If threshold $\geq 0.5$, the output of PLSUS will be constant. Thus, we vary the threshold from 0.05 to 0.5; Figures 3 and 4 display the results. Since a bigger threshold means that a user has a smaller chance of being placed in multiple segments, the CTR improvement decreases as the threshold becomes bigger in Figure 3. However, if we choose too small a threshold, each user is likely to be placed in many segments. In this way, each segment will contain too many users, which leads to a large entropy. The result in Figure 4 shows this fact. Analyzing Figures 3 and 4, we consider that a threshold around 0.2 performs well on both CTR improvement and entropy. Therefore, we set the threshold to 0.2 for PLSUS in the experiment that compares the four user segmentation approaches.

Figure 3. Change of CTR improvement with increasing threshold.

Figure 4. Change of ads click Entropy with increasing threshold.

5. CONCLUSION AND FUTURE WORK
In this paper, we developed a novel semantic approach called PLSUS for BT. We compared the proposed PLSUS algorithm with two traditional clustering-based user segmentation approaches, CLUTO and K-Means. From the experimental results we can draw the conclusion that the semantic approach PLSUS brings larger improvements for BT than traditional user clustering, especially in terms of CTR improvement.

In our future work, we will pay more attention to Latent Dirichlet Allocation (LDA). It has been noted that LDA achieves better results in document classification than PLSA. Thus, we will study this model and attempt to apply it to user segmentation, to verify whether it performs better for BT than PLSUS does. In addition, we will modify the EM algorithm to parallelize PLSUS. We believe this will help to further increase the algorithmic scalability and improve the efficiency.

6. REFERENCES
[1] D. Blei, A. Ng, and M. Jordan. Latent Dirichlet allocation. Journal of Machine Learning Research, 3 (2003).
[2] T. Brants, F. Chen, and I. Tsochantaridis. Topic-based document segmentation with probabilistic latent semantic analysis. In Proceedings of CIKM '02 (Las Palmas, June 2002), ACM Press.
[3] T. Brants and R. Stolle. Find similar documents in document collections. In Proceedings of LREC '02 (Spain, June 2002).
[4] A. Broder, M. Fontoura, V. Josifovski and L. Riedel. A semantic approach to contextual advertising. In Proceedings of SIGIR '07 (Amsterdam, July 2007), ACM Press.
[5] D. Cohn and H. Chang. Learning to probabilistically identify authoritative documents. In Proceedings of ICML '00 (Stanford, June 2000), Morgan Kaufmann.
[6] D. Cohn and T. Hofmann. The missing link: A probabilistic model of document content and hypertext connectivity. In Proceedings of NIPS '00 (Denver, November 2000), MIT Press.
[7] A. Das, M. Datar and A. Garg. Google news personalization: scalable online collaborative filtering. In Proceedings of WWW '07 (Banff, May 2007), ACM Press.
[8] S. Deerwester, S. Dumais, G. Furnas, T. Landauer, and R. Harshman. Indexing by latent semantic analysis. Journal of the American Society for Information Science, 41 (1990).
[9] D. C. Fain and J. O. Pedersen. Sponsored search: a brief history. In Bulletin of the American Society for Information Science and Technology.
[10] E. Gaussier, C. Goutte, K. Popat, and F. Chen. A hierarchical model for clustering and categorizing documents. In Advances in Information Retrieval: Proceedings of the 24th BCS-IRSG European Colloquium on IR Research (Glasgow, March).
[11] M. Girolami and A. Kabán. On an equivalence between PLSI and LDA. In Proceedings of SIGIR '03 (Toronto, July 2003), ACM Press.
[12] A. Harpale and Y. Yang. Personalized active learning for collaborative filtering. In Proceedings of SIGIR '08 (Singapore, July 2008), ACM Press.
[13] T. Hofmann. Probabilistic latent semantic analysis. In Proceedings of UAI '99 (Stockholm, July 1999), Morgan Kaufmann.
[14] T. Hofmann. Probabilistic latent semantic indexing. In Proceedings of SIGIR '99 (Berkeley, August 1999), ACM Press.
[15] T. Hofmann. Unsupervised learning by probabilistic latent semantic analysis. Machine Learning Journal, 42 (2001).
[16] X. Jin, Y. Zhou, and B. Mobasher. Web usage mining based on probabilistic latent semantic analysis. In Proceedings of KDD '04 (Seattle, August 2004), ACM Press.
[17] Y. Kim, J. Chang, and B. Zhang. An empirical study on dimensionality optimization in text mining for linguistic knowledge acquisition. In Proceedings of KDD '03 (Seoul, April 2003), ACM Press.
[18] G. Salton and C. Buckley. Term-weighting approaches in automatic text retrieval. Information Processing and Management: an International Journal, 24 (1988).
[19] J. Yan, N. Liu, G. Wang, W. Zhang, Y. Jiang and Z. Chen. How much the Behavioral Targeting can help online advertising? In Proceedings of WWW '09 (Madrid, April 2009), ACM Press.
[20] Adlink.
[21] Almond Net.
[22] Blue Lithium.
[23] Burst.
[24] Double Click.
[25] NebuAd.
[26] Phorm.
[27] Revenue Science.
[28] Specificmedia.
[29] TACODA.
[30] Yahoo! Smart Ads.


More information

Mining Multiple Large Data Sources

Mining Multiple Large Data Sources The Internatonal Arab Journal of Informaton Technology, Vol. 7, No. 3, July 2 24 Mnng Multple Large Data Sources Anmesh Adhkar, Pralhad Ramachandrarao 2, Bhanu Prasad 3, and Jhml Adhkar 4 Department of

More information

DEFINING %COMPLETE IN MICROSOFT PROJECT

DEFINING %COMPLETE IN MICROSOFT PROJECT CelersSystems DEFINING %COMPLETE IN MICROSOFT PROJECT PREPARED BY James E Aksel, PMP, PMI-SP, MVP For Addtonal Informaton about Earned Value Management Systems and reportng, please contact: CelersSystems,

More information

PEER REVIEWER RECOMMENDATION IN ONLINE SOCIAL LEARNING CONTEXT: INTEGRATING INFORMATION OF LEARNERS AND SUBMISSIONS

PEER REVIEWER RECOMMENDATION IN ONLINE SOCIAL LEARNING CONTEXT: INTEGRATING INFORMATION OF LEARNERS AND SUBMISSIONS PEER REVIEWER RECOMMENDATION IN ONLINE SOCIAL LEARNING CONTEXT: INTEGRATING INFORMATION OF LEARNERS AND SUBMISSIONS Yunhong Xu, Faculty of Management and Economcs, Kunmng Unversty of Scence and Technology,

More information

Exploiting Recommendation on Social Media Networks

Exploiting Recommendation on Social Media Networks Internatonal Journal of Scence and Research IJSR) ISSN Onln: 2319-7064 Index Coperncus Value 2013): 6.14 Impact Factor 2013): 4.438 Explotng Recommendaton on Socal Meda Networs Swat A. Adhav 1, Sheetal

More information

Module 2 LOSSLESS IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur

Module 2 LOSSLESS IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur Module LOSSLESS IMAGE COMPRESSION SYSTEMS Lesson 3 Lossless Compresson: Huffman Codng Instructonal Objectves At the end of ths lesson, the students should be able to:. Defne and measure source entropy..

More information

A DATA MINING APPLICATION IN A STUDENT DATABASE

A DATA MINING APPLICATION IN A STUDENT DATABASE JOURNAL OF AERONAUTICS AND SPACE TECHNOLOGIES JULY 005 VOLUME NUMBER (53-57) A DATA MINING APPLICATION IN A STUDENT DATABASE Şenol Zafer ERDOĞAN Maltepe Ünversty Faculty of Engneerng Büyükbakkalköy-Istanbul

More information

An Alternative Way to Measure Private Equity Performance

An Alternative Way to Measure Private Equity Performance An Alternatve Way to Measure Prvate Equty Performance Peter Todd Parlux Investment Technology LLC Summary Internal Rate of Return (IRR) s probably the most common way to measure the performance of prvate

More information

Forecasting the Demand of Emergency Supplies: Based on the CBR Theory and BP Neural Network

Forecasting the Demand of Emergency Supplies: Based on the CBR Theory and BP Neural Network 700 Proceedngs of the 8th Internatonal Conference on Innovaton & Management Forecastng the Demand of Emergency Supples: Based on the CBR Theory and BP Neural Network Fu Deqang, Lu Yun, L Changbng School

More information

What is Candidate Sampling

What is Candidate Sampling What s Canddate Samplng Say we have a multclass or mult label problem where each tranng example ( x, T ) conssts of a context x a small (mult)set of target classes T out of a large unverse L of possble

More information

Web Object Indexing Using Domain Knowledge *

Web Object Indexing Using Domain Knowledge * Web Object Indexng Usng Doman Knowledge * Muyuan Wang Department of Automaton Tsnghua Unversty Bejng 100084, Chna (86-10)51774518 Zhwe L, Le Lu, We-Yng Ma Mcrosoft Research Asa Sgma Center, Hadan Dstrct

More information

How To Understand The Results Of The German Meris Cloud And Water Vapour Product

How To Understand The Results Of The German Meris Cloud And Water Vapour Product Ttel: Project: Doc. No.: MERIS level 3 cloud and water vapour products MAPP MAPP-ATBD-ClWVL3 Issue: 1 Revson: 0 Date: 9.12.1998 Functon Name Organsaton Sgnature Date Author: Bennartz FUB Preusker FUB Schüller

More information

Single and multiple stage classifiers implementing logistic discrimination

Single and multiple stage classifiers implementing logistic discrimination Sngle and multple stage classfers mplementng logstc dscrmnaton Hélo Radke Bttencourt 1 Dens Alter de Olvera Moraes 2 Vctor Haertel 2 1 Pontfíca Unversdade Católca do Ro Grande do Sul - PUCRS Av. Ipranga,

More information

Vision Mouse. Saurabh Sarkar a* University of Cincinnati, Cincinnati, USA ABSTRACT 1. INTRODUCTION

Vision Mouse. Saurabh Sarkar a* University of Cincinnati, Cincinnati, USA ABSTRACT 1. INTRODUCTION Vson Mouse Saurabh Sarkar a* a Unversty of Cncnnat, Cncnnat, USA ABSTRACT The report dscusses a vson based approach towards trackng of eyes and fngers. The report descrbes the process of locatng the possble

More information

Document Clustering Analysis Based on Hybrid PSO+K-means Algorithm

Document Clustering Analysis Based on Hybrid PSO+K-means Algorithm Document Clusterng Analyss Based on Hybrd PSO+K-means Algorthm Xaohu Cu, Thomas E. Potok Appled Software Engneerng Research Group, Computatonal Scences and Engneerng Dvson, Oak Rdge Natonal Laboratory,

More information

Joint Optimization of Bid and Budget Allocation in Sponsored Search

Joint Optimization of Bid and Budget Allocation in Sponsored Search Jont Optmzaton of Bd and Budget Allocaton n Sponsored Search Wenan Zhang Shangha Jao Tong Unversty Shangha, 224, P. R. Chna wnzhang@apex.sjtu.edu.cn Yong Yu Shangha Jao Tong Unversty Shangha, 224, P. R.

More information

PSYCHOLOGICAL RESEARCH (PYC 304-C) Lecture 12

PSYCHOLOGICAL RESEARCH (PYC 304-C) Lecture 12 14 The Ch-squared dstrbuton PSYCHOLOGICAL RESEARCH (PYC 304-C) Lecture 1 If a normal varable X, havng mean µ and varance σ, s standardsed, the new varable Z has a mean 0 and varance 1. When ths standardsed

More information

How Sets of Coherent Probabilities May Serve as Models for Degrees of Incoherence

How Sets of Coherent Probabilities May Serve as Models for Degrees of Incoherence 1 st Internatonal Symposum on Imprecse Probabltes and Ther Applcatons, Ghent, Belgum, 29 June 2 July 1999 How Sets of Coherent Probabltes May Serve as Models for Degrees of Incoherence Mar J. Schervsh

More information

Enterprise Master Patient Index

Enterprise Master Patient Index Enterprse Master Patent Index Healthcare data are captured n many dfferent settngs such as hosptals, clncs, labs, and physcan offces. Accordng to a report by the CDC, patents n the Unted States made an

More information

Can Auto Liability Insurance Purchases Signal Risk Attitude?

Can Auto Liability Insurance Purchases Signal Risk Attitude? Internatonal Journal of Busness and Economcs, 2011, Vol. 10, No. 2, 159-164 Can Auto Lablty Insurance Purchases Sgnal Rsk Atttude? Chu-Shu L Department of Internatonal Busness, Asa Unversty, Tawan Sheng-Chang

More information

IMPACT ANALYSIS OF A CELLULAR PHONE

IMPACT ANALYSIS OF A CELLULAR PHONE 4 th ASA & μeta Internatonal Conference IMPACT AALYSIS OF A CELLULAR PHOE We Lu, 2 Hongy L Bejng FEAonlne Engneerng Co.,Ltd. Bejng, Chna ABSTRACT Drop test smulaton plays an mportant role n nvestgatng

More information

Data Broadcast on a Multi-System Heterogeneous Overlayed Wireless Network *

Data Broadcast on a Multi-System Heterogeneous Overlayed Wireless Network * JOURNAL OF INFORMATION SCIENCE AND ENGINEERING 24, 819-840 (2008) Data Broadcast on a Mult-System Heterogeneous Overlayed Wreless Network * Department of Computer Scence Natonal Chao Tung Unversty Hsnchu,

More information

On the Optimal Control of a Cascade of Hydro-Electric Power Stations

On the Optimal Control of a Cascade of Hydro-Electric Power Stations On the Optmal Control of a Cascade of Hydro-Electrc Power Statons M.C.M. Guedes a, A.F. Rbero a, G.V. Smrnov b and S. Vlela c a Department of Mathematcs, School of Scences, Unversty of Porto, Portugal;

More information

Methodology to Determine Relationships between Performance Factors in Hadoop Cloud Computing Applications

Methodology to Determine Relationships between Performance Factors in Hadoop Cloud Computing Applications Methodology to Determne Relatonshps between Performance Factors n Hadoop Cloud Computng Applcatons Lus Eduardo Bautsta Vllalpando 1,2, Alan Aprl 1 and Alan Abran 1 1 Department of Software Engneerng and

More information

Multiple-Period Attribution: Residuals and Compounding

Multiple-Period Attribution: Residuals and Compounding Multple-Perod Attrbuton: Resduals and Compoundng Our revewer gave these authors full marks for dealng wth an ssue that performance measurers and vendors often regard as propretary nformaton. In 1994, Dens

More information

A Secure Password-Authenticated Key Agreement Using Smart Cards

A Secure Password-Authenticated Key Agreement Using Smart Cards A Secure Password-Authentcated Key Agreement Usng Smart Cards Ka Chan 1, Wen-Chung Kuo 2 and Jn-Chou Cheng 3 1 Department of Computer and Informaton Scence, R.O.C. Mltary Academy, Kaohsung 83059, Tawan,

More information

Title Language Model for Information Retrieval

Title Language Model for Information Retrieval Ttle Language Model for Informaton Retreval Rong Jn Language Technologes Insttute School of Computer Scence Carnege Mellon Unversty Alex G. Hauptmann Computer Scence Department School of Computer Scence

More information

A Simple Approach to Clustering in Excel

A Simple Approach to Clustering in Excel A Smple Approach to Clusterng n Excel Aravnd H Center for Computatonal Engneerng and Networng Amrta Vshwa Vdyapeetham, Combatore, Inda C Rajgopal Center for Computatonal Engneerng and Networng Amrta Vshwa

More information

ANALYZING THE RELATIONSHIPS BETWEEN QUALITY, TIME, AND COST IN PROJECT MANAGEMENT DECISION MAKING

ANALYZING THE RELATIONSHIPS BETWEEN QUALITY, TIME, AND COST IN PROJECT MANAGEMENT DECISION MAKING ANALYZING THE RELATIONSHIPS BETWEEN QUALITY, TIME, AND COST IN PROJECT MANAGEMENT DECISION MAKING Matthew J. Lberatore, Department of Management and Operatons, Vllanova Unversty, Vllanova, PA 19085, 610-519-4390,

More information

Product Quality and Safety Incident Information Tracking Based on Web

Product Quality and Safety Incident Information Tracking Based on Web Product Qualty and Safety Incdent Informaton Trackng Based on Web News 1 Yuexang Yang, 2 Correspondng Author Yyang Wang, 2 Shan Yu, 2 Jng Q, 1 Hual Ca 1 Chna Natonal Insttute of Standardzaton, Beng 100088,

More information

FREQUENCY OF OCCURRENCE OF CERTAIN CHEMICAL CLASSES OF GSR FROM VARIOUS AMMUNITION TYPES

FREQUENCY OF OCCURRENCE OF CERTAIN CHEMICAL CLASSES OF GSR FROM VARIOUS AMMUNITION TYPES FREQUENCY OF OCCURRENCE OF CERTAIN CHEMICAL CLASSES OF GSR FROM VARIOUS AMMUNITION TYPES Zuzanna BRO EK-MUCHA, Grzegorz ZADORA, 2 Insttute of Forensc Research, Cracow, Poland 2 Faculty of Chemstry, Jagellonan

More information

A DYNAMIC CRASHING METHOD FOR PROJECT MANAGEMENT USING SIMULATION-BASED OPTIMIZATION. Michael E. Kuhl Radhamés A. Tolentino-Peña

A DYNAMIC CRASHING METHOD FOR PROJECT MANAGEMENT USING SIMULATION-BASED OPTIMIZATION. Michael E. Kuhl Radhamés A. Tolentino-Peña Proceedngs of the 2008 Wnter Smulaton Conference S. J. Mason, R. R. Hll, L. Mönch, O. Rose, T. Jefferson, J. W. Fowler eds. A DYNAMIC CRASHING METHOD FOR PROJECT MANAGEMENT USING SIMULATION-BASED OPTIMIZATION

More information

Design and Development of a Security Evaluation Platform Based on International Standards

Design and Development of a Security Evaluation Platform Based on International Standards Internatonal Journal of Informatcs Socety, VOL.5, NO.2 (203) 7-80 7 Desgn and Development of a Securty Evaluaton Platform Based on Internatonal Standards Yuj Takahash and Yoshm Teshgawara Graduate School

More information

Context-aware Mobile Recommendation System Based on Context History

Context-aware Mobile Recommendation System Based on Context History TELKOMNIKA Indonesan Journal of Electrcal Engneerng Vol.12, No.4, Aprl 2014, pp. 3158 ~ 3167 DOI: http://dx.do.org/10.11591/telkomnka.v124.4786 3158 Context-aware Moble Recommendaton System Based on Context

More information

METHODOLOGY TO DETERMINE RELATIONSHIPS BETWEEN PERFORMANCE FACTORS IN HADOOP CLOUD COMPUTING APPLICATIONS

METHODOLOGY TO DETERMINE RELATIONSHIPS BETWEEN PERFORMANCE FACTORS IN HADOOP CLOUD COMPUTING APPLICATIONS METHODOLOGY TO DETERMINE RELATIONSHIPS BETWEEN PERFORMANCE FACTORS IN HADOOP CLOUD COMPUTING APPLICATIONS Lus Eduardo Bautsta Vllalpando 1,2, Alan Aprl 1 and Alan Abran 1 1 Department of Software Engneerng

More information

The Current Employment Statistics (CES) survey,

The Current Employment Statistics (CES) survey, Busness Brths and Deaths Impact of busness brths and deaths n the payroll survey The CES probablty-based sample redesgn accounts for most busness brth employment through the mputaton of busness deaths,

More information

Improved SVM in Cloud Computing Information Mining

Improved SVM in Cloud Computing Information Mining Internatonal Journal of Grd Dstrbuton Computng Vol.8, No.1 (015), pp.33-40 http://dx.do.org/10.1457/jgdc.015.8.1.04 Improved n Cloud Computng Informaton Mnng Lvshuhong (ZhengDe polytechnc college JangSu

More information

Analyzing Search Engine Advertising: Firm Behavior and Cross-Selling in Electronic Markets

Analyzing Search Engine Advertising: Firm Behavior and Cross-Selling in Electronic Markets WWW 008 / Refereed Track: Internet Monetzaton - Sponsored Search Aprl -5, 008 Beng, Chna Analyzng Search Engne Advertsng: Frm Behavor and Cross-Sellng n Electronc Markets Anndya Ghose Stern School of Busness

More information

Using Supervised Clustering Technique to Classify Received Messages in 137 Call Center of Tehran City Council

Using Supervised Clustering Technique to Classify Received Messages in 137 Call Center of Tehran City Council Usng Supervsed Clusterng Technque to Classfy Receved Messages n 137 Call Center of Tehran Cty Councl Mahdyeh Haghr 1*, Hamd Hassanpour 2 (1) Informaton Technology engneerng/e-commerce, Shraz Unversty (2)

More information

EVALUATING THE PERCEIVED QUALITY OF INFRASTRUCTURE-LESS VOIP. Kun-chan Lan and Tsung-hsun Wu

EVALUATING THE PERCEIVED QUALITY OF INFRASTRUCTURE-LESS VOIP. Kun-chan Lan and Tsung-hsun Wu EVALUATING THE PERCEIVED QUALITY OF INFRASTRUCTURE-LESS VOIP Kun-chan Lan and Tsung-hsun Wu Natonal Cheng Kung Unversty klan@cse.ncku.edu.tw, ryan@cse.ncku.edu.tw ABSTRACT Voce over IP (VoIP) s one of

More information

Performance Analysis and Coding Strategy of ECOC SVMs

Performance Analysis and Coding Strategy of ECOC SVMs Internatonal Journal of Grd and Dstrbuted Computng Vol.7, No. (04), pp.67-76 http://dx.do.org/0.457/jgdc.04.7..07 Performance Analyss and Codng Strategy of ECOC SVMs Zhgang Yan, and Yuanxuan Yang, School

More information

A Performance Analysis of View Maintenance Techniques for Data Warehouses

A Performance Analysis of View Maintenance Techniques for Data Warehouses A Performance Analyss of Vew Mantenance Technques for Data Warehouses Xng Wang Dell Computer Corporaton Round Roc, Texas Le Gruenwald The nversty of Olahoma School of Computer Scence orman, OK 739 Guangtao

More information

Fault tolerance in cloud technologies presented as a service

Fault tolerance in cloud technologies presented as a service Internatonal Scentfc Conference Computer Scence 2015 Pavel Dzhunev, PhD student Fault tolerance n cloud technologes presented as a servce INTRODUCTION Improvements n technques for vrtualzaton and performance

More information

NEURO-FUZZY INFERENCE SYSTEM FOR E-COMMERCE WEBSITE EVALUATION

NEURO-FUZZY INFERENCE SYSTEM FOR E-COMMERCE WEBSITE EVALUATION NEURO-FUZZY INFERENE SYSTEM FOR E-OMMERE WEBSITE EVALUATION Huan Lu, School of Software, Harbn Unversty of Scence and Technology, Harbn, hna Faculty of Appled Mathematcs and omputer Scence, Belarusan State

More information

Open Access A Load Balancing Strategy with Bandwidth Constraint in Cloud Computing. Jing Deng 1,*, Ping Guo 2, Qi Li 3, Haizhu Chen 1

Open Access A Load Balancing Strategy with Bandwidth Constraint in Cloud Computing. Jing Deng 1,*, Ping Guo 2, Qi Li 3, Haizhu Chen 1 Send Orders for Reprnts to reprnts@benthamscence.ae The Open Cybernetcs & Systemcs Journal, 2014, 8, 115-121 115 Open Access A Load Balancng Strategy wth Bandwdth Constrant n Cloud Computng Jng Deng 1,*,

More information

A Replication-Based and Fault Tolerant Allocation Algorithm for Cloud Computing

A Replication-Based and Fault Tolerant Allocation Algorithm for Cloud Computing A Replcaton-Based and Fault Tolerant Allocaton Algorthm for Cloud Computng Tork Altameem Dept of Computer Scence, RCC, Kng Saud Unversty, PO Box: 28095 11437 Ryadh-Saud Araba Abstract The very large nfrastructure

More information

Recurrence. 1 Definitions and main statements

Recurrence. 1 Definitions and main statements Recurrence 1 Defntons and man statements Let X n, n = 0, 1, 2,... be a MC wth the state space S = (1, 2,...), transton probabltes p j = P {X n+1 = j X n = }, and the transton matrx P = (p j ),j S def.

More information

Risk-based Fatigue Estimate of Deep Water Risers -- Course Project for EM388F: Fracture Mechanics, Spring 2008

Risk-based Fatigue Estimate of Deep Water Risers -- Course Project for EM388F: Fracture Mechanics, Spring 2008 Rsk-based Fatgue Estmate of Deep Water Rsers -- Course Project for EM388F: Fracture Mechancs, Sprng 2008 Chen Sh Department of Cvl, Archtectural, and Envronmental Engneerng The Unversty of Texas at Austn

More information

An Enhanced Super-Resolution System with Improved Image Registration, Automatic Image Selection, and Image Enhancement

An Enhanced Super-Resolution System with Improved Image Registration, Automatic Image Selection, and Image Enhancement An Enhanced Super-Resoluton System wth Improved Image Regstraton, Automatc Image Selecton, and Image Enhancement Yu-Chuan Kuo ( ), Chen-Yu Chen ( ), and Chou-Shann Fuh ( ) Department of Computer Scence

More information

Pricing Model of Cloud Computing Service with Partial Multihoming

Pricing Model of Cloud Computing Service with Partial Multihoming Prcng Model of Cloud Computng Servce wth Partal Multhomng Zhang Ru 1 Tang Bng-yong 1 1.Glorous Sun School of Busness and Managment Donghua Unversty Shangha 251 Chna E-mal:ru528369@mal.dhu.edu.cn Abstract

More information

Cluster Analysis of Data Points using Partitioning and Probabilistic Model-based Algorithms

Cluster Analysis of Data Points using Partitioning and Probabilistic Model-based Algorithms Internatonal Journal of Appled Informaton Systems (IJAIS) ISSN : 2249-0868 Foundaton of Computer Scence FCS, New York, USA Volume 7 No.7, August 2014 www.jas.org Cluster Analyss of Data Ponts usng Parttonng

More information

Rank Based Clustering For Document Retrieval From Biomedical Databases

Rank Based Clustering For Document Retrieval From Biomedical Databases Jayanth Mancassamy et al /Internatonal Journal on Computer Scence and Engneerng Vol.1(2), 2009, 111-115 Rank Based Clusterng For Document Retreval From Bomedcal Databases Jayanth Mancassamy Department

More information

THE APPLICATION OF DATA MINING TECHNIQUES AND MULTIPLE CLASSIFIERS TO MARKETING DECISION

THE APPLICATION OF DATA MINING TECHNIQUES AND MULTIPLE CLASSIFIERS TO MARKETING DECISION Internatonal Journal of Electronc Busness Management, Vol. 3, No. 4, pp. 30-30 (2005) 30 THE APPLICATION OF DATA MINING TECHNIQUES AND MULTIPLE CLASSIFIERS TO MARKETING DECISION Yu-Mn Chang *, Yu-Cheh

More information

Cloud-based Social Application Deployment using Local Processing and Global Distribution

Cloud-based Social Application Deployment using Local Processing and Global Distribution Cloud-based Socal Applcaton Deployment usng Local Processng and Global Dstrbuton Zh Wang *, Baochun L, Lfeng Sun *, and Shqang Yang * * Bejng Key Laboratory of Networked Multmeda Department of Computer

More information

Luby s Alg. for Maximal Independent Sets using Pairwise Independence

Luby s Alg. for Maximal Independent Sets using Pairwise Independence Lecture Notes for Randomzed Algorthms Luby s Alg. for Maxmal Independent Sets usng Parwse Independence Last Updated by Erc Vgoda on February, 006 8. Maxmal Independent Sets For a graph G = (V, E), an ndependent

More information

Mobile App Recommendations with Security and Privacy Awareness

Mobile App Recommendations with Security and Privacy Awareness Moble App Recommendatons wth Securty and Prvacy Awareness Hengshu Zhu 1 Hu Xong 2 Yong Ge 3 Enhong Chen 1 1 Unversty of Scence and Technology of Chna, 2 Rutgers Unversty, 3 UNC Charlotte zhs@mal.ustc.edu.cn,

More information

A Dynamic Load Balancing for Massive Multiplayer Online Game Server

A Dynamic Load Balancing for Massive Multiplayer Online Game Server A Dynamc Load Balancng for Massve Multplayer Onlne Game Server Jungyoul Lm, Jaeyong Chung, Jnryong Km and Kwanghyun Shm Dgtal Content Research Dvson Electroncs and Telecommuncatons Research Insttute Daejeon,

More information

L10: Linear discriminants analysis

L10: Linear discriminants analysis L0: Lnear dscrmnants analyss Lnear dscrmnant analyss, two classes Lnear dscrmnant analyss, C classes LDA vs. PCA Lmtatons of LDA Varants of LDA Other dmensonalty reducton methods CSCE 666 Pattern Analyss

More information

Modeling and Simulation of Multi-Agent System of China's Real Estate Market Based on Bayesian Network Decision-Making

Modeling and Simulation of Multi-Agent System of China's Real Estate Market Based on Bayesian Network Decision-Making Int. J. on Recent Trends n Engneerng and Technology, Vol. 11, No. 1, July 2014 Modelng and Smulaton of Mult-Agent System of Chna's Real Estate Market Based on Bayesan Network Decson-Makng Yang Shen, Shan

More information

iavenue iavenue i i i iavenue iavenue iavenue

iavenue iavenue i i i iavenue iavenue iavenue Saratoga Systems' enterprse-wde Avenue CRM system s a comprehensve web-enabled software soluton. Ths next generaton system enables you to effectvely manage and enhance your customer relatonshps n both

More information

Calculation of Sampling Weights

Calculation of Sampling Weights Perre Foy Statstcs Canada 4 Calculaton of Samplng Weghts 4.1 OVERVIEW The basc sample desgn used n TIMSS Populatons 1 and 2 was a two-stage stratfed cluster desgn. 1 The frst stage conssted of a sample

More information

Robust Design of Public Storage Warehouses. Yeming (Yale) Gong EMLYON Business School

Robust Design of Public Storage Warehouses. Yeming (Yale) Gong EMLYON Business School Robust Desgn of Publc Storage Warehouses Yemng (Yale) Gong EMLYON Busness School Rene de Koster Rotterdam school of management, Erasmus Unversty Abstract We apply robust optmzaton and revenue management

More information

IWFMS: An Internal Workflow Management System/Optimizer for Hadoop

IWFMS: An Internal Workflow Management System/Optimizer for Hadoop IWFMS: An Internal Workflow Management System/Optmzer for Hadoop Lan Lu, Yao Shen Department of Computer Scence and Engneerng Shangha JaoTong Unversty Shangha, Chna lustrve@gmal.com, yshen@cs.sjtu.edu.cn

More information

benefit is 2, paid if the policyholder dies within the year, and probability of death within the year is ).

benefit is 2, paid if the policyholder dies within the year, and probability of death within the year is ). REVIEW OF RISK MANAGEMENT CONCEPTS LOSS DISTRIBUTIONS AND INSURANCE Loss and nsurance: When someone s subject to the rsk of ncurrng a fnancal loss, the loss s generally modeled usng a random varable or

More information

Face Verification Problem. Face Recognition Problem. Application: Access Control. Biometric Authentication. Face Verification (1:1 matching)

Face Verification Problem. Face Recognition Problem. Application: Access Control. Biometric Authentication. Face Verification (1:1 matching) Face Recognton Problem Face Verfcaton Problem Face Verfcaton (1:1 matchng) Querymage face query Face Recognton (1:N matchng) database Applcaton: Access Control www.vsage.com www.vsoncs.com Bometrc Authentcaton

More information

Project Networks With Mixed-Time Constraints

Project Networks With Mixed-Time Constraints Project Networs Wth Mxed-Tme Constrants L Caccetta and B Wattananon Western Australan Centre of Excellence n Industral Optmsaton (WACEIO) Curtn Unversty of Technology GPO Box U1987 Perth Western Australa

More information

Stochastic Protocol Modeling for Anomaly Based Network Intrusion Detection

Stochastic Protocol Modeling for Anomaly Based Network Intrusion Detection Stochastc Protocol Modelng for Anomaly Based Network Intruson Detecton Juan M. Estevez-Tapador, Pedro Garca-Teodoro, and Jesus E. Daz-Verdejo Department of Electroncs and Computer Technology Unversty of

More information

BERNSTEIN POLYNOMIALS

BERNSTEIN POLYNOMIALS On-Lne Geometrc Modelng Notes BERNSTEIN POLYNOMIALS Kenneth I. Joy Vsualzaton and Graphcs Research Group Department of Computer Scence Unversty of Calforna, Davs Overvew Polynomals are ncredbly useful

More information

Assessing Student Learning Through Keyword Density Analysis of Online Class Messages

Assessing Student Learning Through Keyword Density Analysis of Online Class Messages Assessng Student Learnng Through Keyword Densty Analyss of Onlne Class Messages Xn Chen New Jersey Insttute of Technology xc7@njt.edu Brook Wu New Jersey Insttute of Technology wu@njt.edu ABSTRACT Ths

More information

320 The Internatonal Arab Journal of Informaton Technology, Vol. 5, No. 3, July 2008 Comparsons Between Data Clusterng Algorthms Osama Abu Abbas Computer Scence Department, Yarmouk Unversty, Jordan Abstract:

More information

AN APPOINTMENT ORDER OUTPATIENT SCHEDULING SYSTEM THAT IMPROVES OUTPATIENT EXPERIENCE

AN APPOINTMENT ORDER OUTPATIENT SCHEDULING SYSTEM THAT IMPROVES OUTPATIENT EXPERIENCE AN APPOINTMENT ORDER OUTPATIENT SCHEDULING SYSTEM THAT IMPROVES OUTPATIENT EXPERIENCE Yu-L Huang Industral Engneerng Department New Mexco State Unversty Las Cruces, New Mexco 88003, U.S.A. Abstract Patent

More information

Gender Classification for Real-Time Audience Analysis System

Gender Classification for Real-Time Audience Analysis System Gender Classfcaton for Real-Tme Audence Analyss System Vladmr Khryashchev, Lev Shmaglt, Andrey Shemyakov, Anton Lebedev Yaroslavl State Unversty Yaroslavl, Russa vhr@yandex.ru, shmaglt_lev@yahoo.com, andrey.shemakov@gmal.com,

More information

A Novel Methodology of Working Capital Management for Large. Public Constructions by Using Fuzzy S-curve Regression

A Novel Methodology of Working Capital Management for Large. Public Constructions by Using Fuzzy S-curve Regression Novel Methodology of Workng Captal Management for Large Publc Constructons by Usng Fuzzy S-curve Regresson Cheng-Wu Chen, Morrs H. L. Wang and Tng-Ya Hseh Department of Cvl Engneerng, Natonal Central Unversty,

More information

Network Security Situation Evaluation Method for Distributed Denial of Service

Network Security Situation Evaluation Method for Distributed Denial of Service Network Securty Stuaton Evaluaton Method for Dstrbuted Denal of Servce Jn Q,2, Cu YMn,2, Huang MnHuan,2, Kuang XaoHu,2, TangHong,2 ) Scence and Technology on Informaton System Securty Laboratory, Bejng,

More information

An Analysis of Central Processor Scheduling in Multiprogrammed Computer Systems

An Analysis of Central Processor Scheduling in Multiprogrammed Computer Systems STAN-CS-73-355 I SU-SE-73-013 An Analyss of Central Processor Schedulng n Multprogrammed Computer Systems (Dgest Edton) by Thomas G. Prce October 1972 Techncal Report No. 57 Reproducton n whole or n part

More information

A Programming Model for the Cloud Platform

A Programming Model for the Cloud Platform Internatonal Journal of Advanced Scence and Technology A Programmng Model for the Cloud Platform Xaodong Lu School of Computer Engneerng and Scence Shangha Unversty, Shangha 200072, Chna luxaodongxht@qq.com

More information

Topic Identification based on Bayesian Belief Networks in the context of an Air Traffic Control Task
