Rank Optimization of Personalized Search

Lin LI, Zhenglu YANG, Masaru KITSUREGAWA

Lin LI: Dept. of Information and Communication Engineering, University of Tokyo, lilin@tkl.iis.u-tokyo.ac.jp
Zhenglu YANG: Institute of Industrial Science, University of Tokyo, yangzl@tkl.iis.u-tokyo.ac.jp
Masaru KITSUREGAWA (Regular Member): Institute of Industrial Science, University of Tokyo, kitsure@tkl.iis.u-tokyo.ac.jp

Augmenting the global ranking based on the linkage structure of the Web is one of the popular approaches in the data engineering community today for enhancing the search and ranking quality of Web information systems. This is typically done through automated learning of user interests and re-ranking of search results through semantic based personalization. In this paper, we propose a query context window (QCW) based framework for Selective uTilization of search history in personalized learning and re-ranking (STAR). We conduct extensive experiments to compare our STAR approach with the popular directory-based search methods (e.g., Google Directory search) and the general model of most existing re-ranking schemes of personalized search. Our experimental results show that the proposed STAR framework can effectively capture user-specific query-dependent personalization and improve the accuracy of personalized search over existing approaches.

1. Introduction

Encoding human search experiences and personalizing the search result delivery through ranking optimization is a popular approach to enhance Web search. Although general Web search today is still performed and delivered predominantly through search algorithms, e.g., Google's PageRank [17] based query independent ranking algorithms, interest in improving the global notion of importance in ranking search results by creating a personalized view of importance has been growing over recent years. We categorize the research efforts on personalized search into three classes of strategies: 1) query modification or augmentation [3], [26], 2) link-based score personalization [8], [9], [15], [17], [19], [22], and 3) search result re-ranking [4], [5], [12], [14], [26], [29], [30]. A general process of re-ranking is to devise efficient mechanisms to re-order the search result ranking using the global importance by personalized ranking criteria. Such criteria are typically derived from the modeling of users' search behavior and interests.

In this paper, we develop a rank optimization framework that promotes Selective uTilization of search history for personalized learning and re-ranking (STAR). Our STAR framework consists of three design principles and a suite of algorithms for learning and encoding a user's short-term and long-term search interests and re-ranking search results through a careful combination of recent and previous search histories. We show that even though short-term interest based personalization using the most recent search histories may be effective at times [15], [25], [26], it is generally unstable and fails to capture the changing behavior of the users. Furthermore, most existing long-term interest based personalization, which uses the entire recent and previous search histories, fails to distinguish the relevant search history from the irrelevant search history [4], [18], [30], making it harder to be an effective measure alone for search personalization.

Bearing these observations in mind, our STAR framework advocates three design principles for rank optimization. First, we devise a so-called query context window (QCW) model to capture the user's search behavior through a collection of her per-query based click-through data. Second, we develop a query-to-query similarity model to distinguish the relevant search memories of personalized search behavior from irrelevant ones in the QCW of each user, reducing the noise incurred by using either a recent fragment or the entire QCW. Third, we develop a fading memory based weight function to carefully combine the frequency of relevant search behavior (long term interests) with the most recent search behavior (short term interests).
To show the effectiveness of our STAR framework in enhancing the quality of personalized search, we propose length and depth based hierarchical semantic similarity metrics and compare the effectiveness of four re-ranking strategies: 1) naive re-ranking that is query and time independent; 2) relevant search memory based re-ranking that is query dependent but time independent; 3) fading memory based re-ranking that is time dependent but query independent; and 4) hybrid re-ranking that is both query and time dependent. Our experiments show that the hybrid re-ranking scheme can effectively combine the previous and recent memories through a smooth and gradually fading memory based weighting function. More importantly, our experimental results show that the proposed STAR framework for personalized search and re-ranking can effectively capture user-specific query-dependent personalization preference and improve the accuracy of personalized search over the popular directory-based search methods (e.g., Google Directory search) and the general model of most existing re-ranking schemes of personalized search.

The remainder of this paper is organized as follows. The overview of our STAR framework is presented in Section 2. Then, we discuss building QCW-based user profiles and designing re-rank strategies in Section 3 and Section 4 respectively. Experimental results are given in Section 5. Related works are reviewed in Section 6. Finally, we conclude the paper in Section 7.

2. The STAR Framework Overview

The goal of the STAR framework is to design a semantically rich user profile model that captures the query context and the search behavior of each user, and to utilize such user profiles intelligently to enhance the quality of personalized search by effectively re-ranking the search results returned from a general purpose search engine. Figure 1 gives a sketch of the STAR framework, consisting of three main components.

The first component is the text classification module that performs hierarchical Web page classification. The popular way is to classify the documents into a directory-based ontology, such as Yahoo! Directory [11], ODP (http://dmoz.org) [27], and so on. Other studies [1], [10], [16] preferred to build their own ontologies. Because hierarchical text classification is well studied in the field of text processing, in the first prototype design of our STAR framework we directly utilize the classified search results from Google Directory search.

The second component is the context aware learning of the user's search behavior. We utilize the per-query based click-through data to capture query dependent context and search behavior, and develop the query context window (QCW) model to encode this learning process. By automatically generating QCW based user query profiles, the user learning module automatically captures the query dependent context of user search behavior. For example, our approach focuses on the user's visited search results (Web pages), which tell us not only what kind of content a user is interested in (topics) but also how much the user is interested in it (click frequency).

The third component is the query and time dependent, hybrid re-ranking scheme that produces a new user-centric, query dependent rank list for each user query through a three-step process. First, it selects the relevant click records from the entire QCW of a user through the query-to-query similarity analysis. Second, it combines the recent search memories with the previous search memories by applying a fading memory based weighting function over the selected QCW click records of a user.
Finally, it employs hierarchical semantic similarity measures to compute the personalized ranking of the search results returned from a general search engine. In the subsequent sections we will focus on the technical details of the user learning module and the re-rank module.

Figure 1 Overview of the STAR framework

3. QCW Based User Learning Module

Our STAR framework devises the query context window (QCW) to encode the user specific and query dependent search behavior. Given a user, her query context window consists of m query-dependent context click records, denoted as $u_1, u_2, \ldots, u_m$. Each click record in the QCW is composed of the submitted query, the topics associated with the clicked search results, the click frequency of each topic, and the returned search results of the given query. The topics are extracted from Google Directory and structured as a hierarchical tree, so that each click record has its own tree. This topic tree records the click behavior of a user on a specific query, which can tell us what kind of content the user is interested in. The technical details of click record selection are given in the next section.

Moreover, we implement each QCW as a queue. The tail of the queue holds the most recently requested queries, while its head holds the least recently requested queries. When a new query is submitted, the corresponding record is added to the tail of the queue and the user model (QCW) is updated accordingly. This queue keeps the chronological order of the click records, which makes it easy to differentiate recent from old search histories in the re-ranking strategies.

Figure 2 shows an example of a QCW with three context records, each of which corresponds to one query and its context encoding of the query dependent click-through data. For example, when a user inputs the query Disneyland to the Google Directory search engine, the input query Disneyland becomes the root node, followed by the clicked topics. The search results are kept in the search result buffer (SRB). Node F is represented by [Theme\Parks, 6], which means the user has clicked search results associated with the topic Theme\Parks six times in this search. In addition, for each topic, we store the top four levels of its full path in Google Directory in a record.
For example, node F is actually stored as [\Recreation\ThemeParks].

Figure 2 Query Context Window: click records are queued up in chronological order

4. The Re-rank Module

The QCW based re-ranking module needs to address three key challenges: (1) how to select relevant context records from the entire QCW given a user query (Section 4.1); (2) whether all the selected query-relevant context records play the same role in re-ranking the search results of the current query (Section 4.2); and (3) how to re-order the search results (Sections 4.3 and 4.4).

We use calligraphic upper-case letters to represent sets. The elements of a set are denoted by lower-case letters. For example, $\mathcal{U}$ is the set of click records in the QCW and $u$ is an element (click record) of $\mathcal{U}$. $|\mathcal{U}|$ is the cardinality of the set $\mathcal{U}$.

4.1 Selecting Relevant Click Records

Given a new input query, we first select the relevant QCW click records whose encoded queries are similar to the current input query by using a query-to-query similarity measure. Estimating the similarity (relatedness) between queries has a long history in traditional Information Retrieval [6], [21], [32]. It is still an active topic in Web Information Retrieval [2], [7], [31]. Up to now it has not been possible to prove that any of these measures outperforms all others in a large set of experiments [33]. The similarity between two queries can be induced from the overlap of the two lists of search results (URLs) returned. Query-result vectors have been reported to give a better similarity metric than query term vectors [21]. Accordingly, we formally define the query-to-query similarity measure as follows:

$$Q(u_i, q^n) = \frac{|SRB_i \cap \mathcal{P}^n|}{|SRB_i \cup \mathcal{P}^n|} \tag{1}$$

We can get the URL set of the search results of $u_i$ from the search result buffer $SRB_i$ of the click record, and the URL set of search results $\mathcal{P}^n$ of the current query $q^n$ from the current search. The similarity between the two queries is estimated as the fraction of URLs shared by the two sets. In our experiments, two URLs are treated as the same when their host names match. We would like to note that our STAR framework can easily incorporate other similarity and specificity measures.

4.2 Weighing Relevant Click Records

The selected click records are the collection of the user's previous and recent search behaviors, which reflect her interests. We assume that the user's interests gradually decay as time goes on, so we assign more weight to more recent QCW click records and decreasing weight to older QCW click records to further improve the accuracy of the personalized search, using a fading memory based weight function defined as follows:

$$F(u_i) = e^{-\frac{\log 2 \, (|\mathcal{U}| - i)}{hf \cdot |\mathcal{U}|}} \tag{2}$$

where $i$ is the chronological position of $u_i$ in the QCW and $hf$ is a half fading parameter. In our experiments, $hf$ is set in the range [0.1, 1]. After the click record $u_i$ is selected as relevant according to Equation 1, its effect on the quality of personalized search depends on its temporal order. As the value of $hf$ increases, the rate of fading slows and the weights on previous memories increase.
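As an illustration of Equations 1 and 2, the following Python sketch computes a host-name based query similarity and a fading weight. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, the denominator of Equation 1 is taken to be the union of the two host sets, and click records are assumed to be numbered from 1 (oldest) to |U| (newest) so that the exponent of Equation 2 is -log 2 (|U| - i) / (hf |U|).

```python
import math
from urllib.parse import urlsplit


def hosts(urls):
    """Return the set of host names found in a list of absolute URLs."""
    found = set()
    for url in urls:
        host = urlsplit(url).hostname  # None when the URL has no network location
        if host:
            found.add(host)
    return found


def query_similarity(srb_urls, current_urls):
    """Equation 1 (assumed form): shared hosts divided by all hosts seen."""
    a, b = hosts(srb_urls), hosts(current_urls)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def fading_weight(i, size, hf):
    """Equation 2 (assumed form) for the i-th of `size` click records.

    i = size is the newest record and gets weight 1; a record that lies
    hf * size positions back gets weight 1/2.
    """
    return math.exp(-math.log(2) * (size - i) / (hf * size))
```

Under these assumptions, with a window of 10 records and hf = 0.5, the newest record has weight 1 and the record five positions older has weight 1/2, which matches the reading of hf as a half fading parameter.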
This fading memory function unifies the user's long-term and short-term interests encoded in the QCW click records by assigning different weights to click records that appear in different temporal order.

4.3 Capturing Search Interests

After the relevant QCW click records and their weights are determined by Equations 1 and 2, the topics in these QCW click records reflect the user's current search interests. Now we can devise a re-ranking mechanism that re-orders the search results by putting those that are more similar to the selected topics closer to the top of the final re-ordered rank list. In our STAR framework, the topics in the relevant QCW click records are structured in a semantic concept hierarchy as shown in Figure 2. Hierarchical similarity measures can be used to assess the similarity between the related topics and the search results of the given query. Let $h$ be the depth of the subsumer (the deepest node common to two nodes), $l$ be the shortest path length between two topics, and $M$ be the maximum depth of topic directory possessed by a QCW click record. Two combinations of depth and length based similarity measures between the topic $t_k^n$ of a search result and a topic $t_j$ of a click record are defined as follows:

$$HS_{C1}(t_k^n, t_j) = \frac{2h}{l + 2h} \tag{3}$$

$$HS_{C2}(t_k^n, t_j) = e^{-0.2\,l} \cdot \frac{e^{0.6\,h} - e^{-0.6\,h}}{e^{0.6\,h} + e^{-0.6\,h}} \tag{4}$$

Equation 3 is a simple linear transformation function of the length and the depth, while Equation 4 transforms the length and the depth by a nonlinear function and then combines them by multiplication [13]. A QCW click record may record more than one topic, depending on the user's click behavior. We further define the similarity between a QCW click record $u_i$ and a search result $p_k^n$ as:

$$s(u_i, p_k^n) = \frac{1}{|\mathcal{T}_i|} \sum_{t_j \in \mathcal{T}_i} c_j \cdot HS(t_k^n, t_j) \tag{5}$$

where $\mathcal{T}_i$ is the set of topics stored in $u_i$ and each topic $t_j \in \mathcal{T}_i$ is weighted by its corresponding click frequency $c_j$. We further normalize the sum of hierarchical similarity scores by dividing it by the number of topics stored in a click record.

4.4 QCW Based Re-ranking

In this section we describe four strategies to re-order the search results of the user's current query.
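The pieces defined so far can be put together in a few lines. The following Python sketch is illustrative only and is not the authors' implementation. It assumes that topics are stored as tuples of directory levels, reads h as the length of the common path prefix and l as the number of edges between the two topics, and takes the C1 measure to be 2h / (l + 2h). All function names are hypothetical. The last function shows how the four strategies of this section (Equations 6 to 9) weight the same per-record similarities, given query similarities and fading weights that were computed elsewhere.

```python
import math


def depth_and_length(a, b):
    """Return (h, l): common-prefix depth and path length between two topic paths."""
    h = 0
    for x, y in zip(a, b):
        if x != y:
            break
        h += 1
    return h, (len(a) - h) + (len(b) - h)


def hs_c1(a, b):
    """C1 (assumed form of Equation 3): 2h / (l + 2h)."""
    h, l = depth_and_length(a, b)
    return 0.0 if h == 0 else 2 * h / (l + 2 * h)


def hs_c2(a, b):
    """C2 (Equation 4): exp(-0.2 l) * tanh(0.6 h)."""
    h, l = depth_and_length(a, b)
    return math.exp(-0.2 * l) * math.tanh(0.6 * h)


def record_similarity(record_topics, result_topic, hs=hs_c2):
    """Equation 5: click-frequency weighted HS, divided by the number of topics.

    record_topics maps a topic path to its click frequency.
    """
    if not record_topics:
        return 0.0
    total = sum(freq * hs(result_topic, topic) for topic, freq in record_topics.items())
    return total / len(record_topics)


def strategy_scores(rec_sims, q_sims, fades):
    """Scores of one search result under Strategies 1 to 4.

    The three lists hold, per click record, s(u_i, p), Q(u_i, q) and F(u_i).
    """
    n = len(rec_sims)
    if n == 0:
        return 0.0, 0.0, 0.0, 0.0
    s1 = sum(rec_sims) / n
    s2 = sum(q * s for q, s in zip(q_sims, rec_sims)) / n
    s3 = sum(f * s for f, s in zip(fades, rec_sims)) / n
    s4 = sum(f * q * s for f, q, s in zip(fades, q_sims, rec_sims)) / n
    return s1, s2, s3, s4
```

Re-ranking then amounts to computing the chosen strategy's score for every returned result and sorting the results by that score in descending order.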

1) Strategy 1: Query and time independent scheme

$$s_1(p_k^n) = \frac{1}{|\mathcal{U}|} \sum_{u_i \in \mathcal{U}} s(u_i, p_k^n) \tag{6}$$

Strategy 1 is query and time independent, a naive strategy. Click records of different past queries are assigned equal weights regardless of the current input query. The similarity scores of past queries with a search result are summed together and divided by the number of click records ($|\mathcal{U}|$) in $\mathcal{U}$. Strategy 1 treats all the past histories (click records) as related to a user's current query. As we discussed in Section 1, the entire QCW includes noisy memories unrelated to the current query. Strategy 1 represents the general idea of most re-ranking based Web search personalization methods in the literature [4], [5], [12], [14], [18], [26], [27], [30], and serves as the point of comparison for the following three strategies.

2) Strategy 2: Query dependent scheme

$$s_2(p_k^n) = \frac{1}{|\mathcal{U}|} \sum_{u_i \in \mathcal{U}} Q(u_i, q^n) \, s(u_i, p_k^n) \tag{7}$$

We define Strategy 2 as a query dependent and time independent strategy, which is selective about click records and uses Equation 1 to weight them. Tan et al. [29] gave a preliminary discussion of query-dependent selection of user profiles. However, their work only exploits the long-term search histories of users and ignores the changes of a user's interests over time.

3) Strategy 3: Time dependent scheme

$$s_3(p_k^n) = \frac{1}{|\mathcal{U}|} \sum_{u_i \in \mathcal{U}} F(u_i) \, s(u_i, p_k^n) \tag{8}$$

Strategy 3 strengthens recent memories and weakens the effect of previous memories by applying Equation 2 to each QCW click record, without selecting relevant contexts for the input query as Strategy 2 does. If $hf$ is set to a very small value, the previous memories cannot have an influential effect on re-ranking. Other studies [15], [25], [26] emphasize that the most recent search is closest to the user's current information need, which can be regarded as a special case of Strategy 3 where $hf$ is close to zero.

4) Strategy 4: Query and time dependent scheme

$$s_4(p_k^n) = \frac{1}{|\mathcal{U}|} \sum_{u_i \in \mathcal{U}} F(u_i) \, Q(u_i, q^n) \, s(u_i, p_k^n) \tag{9}$$

Strategy 4 is query and time dependent, a hybrid strategy. Users have their own characteristics of search behavior.
To handle the most general case, where there are many kinds of Web users who show different search behaviors, Strategy 4 is designed to select relevant click records by Equation 1 and also to assign greater weights to the more recent click records by Equation 2.

Given one of the four strategies, a new relevance score is calculated for each of the search results. We output the list of the search results in order of their assigned scores. In the following experiments, we evaluate the effectiveness of the above four re-rank strategies.

5. Experiments

5.1 Experiment Setup and Evaluation Measure

The goal of this paper is to achieve a personalized ranking by scoring the similarity between a user profile and the returned search results. Instead of creating our own Web search engine, we retrieve results from the Google Directory search engine and use them as a baseline in the following evaluation. Moreover, as discussed in [23], informational queries (IQ) are queries where the user does not have a particular page in mind and intends to find Web pages related to a topic. We further classify the goal of IQ into three categories: new IQ, semi-new IQ, and repeated IQ. A query is a new IQ if the user has never searched for such a topic before, which means that no relevant search histories are available. A semi-new IQ has topical content similar to some of the user's search histories. A repeated IQ refers to a query by which the user has already obtained the desired information and is searching for it again. The following experiments evaluate the performance on semi-new and repeated IQs, since our STAR framework uses the previous relevant memories to enhance the current search. For new IQs, collaborative information retrieval will be an interesting direction in our future work.

The evaluation of our framework is a challenge because currently there are no stable query log data sets available as a public benchmark. We created our own real dataset [12], which was collected over a ten-day period (from October 23rd, 2006, to November 1st, 2006).
Twelve users were invited to search through our framework and judge whether the clicked results were relevant or not. Users were asked to input search queries related to their professional knowledge in the first four days, and search queries related to their hobbies in the next three days. Then, in the last three days, each user was requested to repeat some searches with the queries entered in the previous days. We obtained a log of about 300 queries, averaging 25 queries per subject, and about 1200 records of the pages the users clicked in total. The size of this real data set is relatively small because the click data collection and the users' judgments are labor intensive. The evaluation measure is MAP (mean average precision), which is widely used in ranking problems.

5.2 Results and Discussions

In the real data set, the queries in the last three days are regarded as repeated IQs. The click-through data of the first seven days is divided into two parts (odd-day and even-day) as semi-new informational searches. One part is used for setting up the QCW user profile and the other for re-ranking search results based on the learned user profile, and then the two parts are exchanged to run the evaluation once again. Here, we set M to be 5 and hf to be 20. In Figure 3 and Table 1, we summarize the performance of the proposed four re-rank strategies according to the different hierarchical semantic measures. MAP difference means the difference between the MAP of our strategy and that of the baseline, and MAP% represents the improvement percentage of our strategy over the baseline.
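For readers who want to reproduce the two reported quantities, the following Python sketch shows one common way to compute them. It is a sketch under stated assumptions rather than the evaluation code used in the paper: the function names are hypothetical, average precision is normalized by the number of relevant results found in the ranked list, and MAP% is read as the MAP difference relative to the baseline MAP.

```python
def average_precision(relevance):
    """Average precision of one ranked list of booleans (True = judged relevant)."""
    hits = 0
    total = 0.0
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            total += hits / rank  # precision at this relevant result
    return total / hits if hits else 0.0


def mean_average_precision(runs):
    """MAP over the ranked lists of several queries."""
    return sum(average_precision(r) for r in runs) / len(runs)


def map_improvement(map_ours, map_baseline):
    """Return (MAP difference, MAP% improvement over the baseline)."""
    diff = map_ours - map_baseline
    return diff, 100.0 * diff / map_baseline
```

For example, a ranked list judged relevant, irrelevant, relevant has precisions 1/1 and 2/3 at its two relevant positions, so its average precision is their mean, 5/6.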

Figure 3 The MAP differences between our strategies and the baseline (panels: Semi-new IQ and Repeated IQ; series: C1 and C2; vertical axis: MAP difference)

Table 1 The MAP improvement percentage of our strategies over the baseline

          Semi-new IQ (%)                                   Repeated IQ (%)
Measure   Strategy 1  Strategy 2  Strategy 3  Strategy 4    Strategy 1  Strategy 2  Strategy 3  Strategy 4
C1        24.92       25.71       27.13       34.35         55.95       57.72       61.92       72.16
C2        25.50       28.78       26.64       34.88         56.93       66.71       61.92       75.00

The experimental results show that the proposed user-context aware re-rank strategies are more effective than the baseline. Strategy 1, representing the general idea of most existing personalized re-ranking schemes, is inferior to the other three strategies. Among the proposed four re-rank strategies, Strategy 4 shows the best performance across both measures and both query types. Strategy 2, with selective utilization of user profiles, on average produces better results than Strategy 1 and Strategy 3.

In Figure 3 and Table 1 the improvement for repeated IQs is more obvious than that for semi-new IQs. The larger improvement for repeated IQs shows that our re-rank strategies can effectively retrieve the Web pages previously clicked by users, since these queries have been submitted before and the user's click behavior has been stored in our QCW. Moreover, in Figure 3 and in Table 1 we observe that the similarity measure using the nonlinear transformation function (i.e., C2, shown in orange columns) generally produces better performance than the similarity measure using the linear transformation (i.e., C1, shown in blue columns). In short, Strategy 4 with C2 produces the largest improvement: its improvements over the baseline are 34.88% and 75% for semi-new IQs and repeated IQs respectively. From the results, we can say that re-ranking of search results through semantic based personalization can indeed enhance general search.
We also confirm that there are two critical factors: (1) the query-to-query similarity, which captures the long term search interests of a user (query dependent), and (2) the most recent search interest, which reflects the short term search behavior of a user (time dependent). The two factors indicate that both short-term and long-term memories contribute to the improvement.

6. Related Work

In this section we give a brief overview of some related works in the literature of personalized search. There are two kinds of context information we can use to model search experience and capture user search histories. One is short-term context, which emphasizes that the most recent search is closest to the user's current information need [15], [25], [26]. Successive searches in a session usually have the same information need. Detecting a session boundary, however, is a difficult task. The other is long-term context, which generally assumes that users hold their interests over a relatively long time. It means that any search in the past may have some effect on the current search [4], [14], [18], [30]. These studies commonly used all available contexts as a whole to improve the search result quality and ranking. The preliminary discussion of this problem in [29] only exploits the long-term search history of users.

In addition, several researchers have used a taxonomic hierarchy (a simple directory based ontology) to represent the user's interests in Web search [4], [10], [16], [18], [20], [24]. However, very few have taken into consideration the hierarchical structure of the directory-based ontology when calculating similarity values between the current search of a user and her search history. Chirita et al. [4] used a hierarchical semantic measure but required users to manually select the topics they are interested in. A unique characteristic of our STAR framework is the development of a selective use of personalized search history and a combination of long term and short term user search histories in rank optimization of personalized search.

7.
Conclusions

We presented a STAR framework for selective utilization of user search behaviors for personalized learning and re-ranking. We designed a novel user search profile called the query context window (QCW) to record the search behavior of a user. We developed a query-to-query similarity model and a fading memory based weight function. We showed how our STAR framework carefully chooses and weighs the relevant click records as useful user context given an input query, and how we applied hierarchical semantic similarity measures in our re-rank strategies. The experimental results show that our STAR approach to personalized search and re-ranking can effectively learn user-specific query-dependent personalization preference and substantially improve the accuracy of personalized search over most existing personalized re-ranking schemes. Our ongoing research includes designing an effective updating policy for user profiles, and more effective rank aggregation methods for further optimization of personalized search.

[References]
[1] M. S. Aktas, M. A. Nacar, and F. Menczer. Personalizing PageRank based on domain profiles. In WEBKDD, 83-90, 2004.
[2] D. Beeferman and A. L. Berger. Agglomerative clustering of a search engine query log. In KDD, 407-416, 2000.
[3] P.-A. Chirita, C. S. Firan, and W. Nejdl. Personalized query expansion for the web. In SIGIR, 7-14, 2007.
[4] P. A. Chirita, W. Nejdl, R. Paiu, and C. Kohlschütter. Using ODP metadata to personalize search. In SIGIR, 178-185, 2005.
[5] Z. Dou, R. Song, and J.-R. Wen. A large-scale evaluation and analysis of personalized search strategies. In WWW, 581-590, 2007.
[6] L. Fitzpatrick and M. Dent. Automatic feedback using past queries: Social searching? In SIGIR, 306-313, 1997.
[7] N. S. Glance. Community search assistant. In IUI, 91-96, 2001.
[8] T. H. Haveliwala. Topic-sensitive pagerank: A context-sensitive ranking algorithm for web search. IEEE Trans. Knowl. Data Eng., 15(4):784-796, 2003.
[9] G. Jeh and J. Widom. Scaling personalized web search. In WWW, 271-279, 2003.
[10] H. R. Kim and P. K. Chan. Learning implicit user interest hierarchy for context in personalization. In IUI, 101-108, 2003.
[11] Y. Labrou and T. W. Finin. Yahoo! as an ontology: Using Yahoo! categories to describe documents. In CIKM, 180-187, 1999.
[12] L. Li, Z. Yang, B. Wang, and M. Kitsuregawa. Dynamic adaptation strategies for long-term and short-term user profile to personalize search. In APWeb/WAIM, 228-240, 2007.
[13] Y. Li, Z. Bandar, and D. McLean. An approach for measuring semantic similarity between words using multiple information sources. IEEE Trans. Knowl. Data Eng., 15(4):871-882, 2003.
[14] F. Liu, C. T. Yu, and W. Meng. Personalized web search for improving retrieval effectiveness. IEEE Trans. Knowl. Data Eng., 16(1):28-40, 2004.
[15] Y. Lv, L. Sun, J. Zhang, J.-Y. Nie, W. Chen, and W. Zhang. An iterative implicit feedback approach to personalized search. In ACL, 585-592, 2006.
[16] N. Nanas, V. S. Uren, and A. N. D. Roeck. Building and applying a concept hierarchy representation of a user profile. In SIGIR, 198-204, 2003.
[17] L. Page, S. Brin, R. Motwani, and T. Winograd. The pagerank citation ranking: Bringing order to the web. Technical report, Stanford Digital Library Technologies Project, 1998.
[18] A. Pretschner and S. Gauch. Ontology based personalized search. In ICTAI, 391-398, 1999.
[19] F. Qiu and J. Cho. Automatic identification of user interest for personalized search. In WWW, 727-736, 2006.
[20] H. rae Kim and P. K. Chan. Personalized ranking of search results with learned user interest hierarchies from bookmarks. In WEBKDD, 32-43, 2005.
[21] V. V. Raghavan and H. Sever. On the reuse of past optimal queries. In SIGIR, 344-350, 1995.
[22] M. Richardson and P. Domingos. The intelligent surfer: Probabilistic combination of link and content information in pagerank. In NIPS, 1441-1448, 2001.
[23] D. E. Rose and D. Levinson. Understanding user goals in web search. In WWW, 13-19, 2004.
[24] V. Schickel-Zuber and B. Faltings. Inferring user's preferences using ontologies. In AAAI, 1413-1418, 2006.
[25] X. Shen, B. Tan, and C. Zhai. Context-sensitive information retrieval using implicit feedback. In SIGIR, 43-50, 2005.
[26] X. Shen, B. Tan, and C. Zhai. Implicit user modeling for personalized search. In CIKM, 824-831, 2005.
[27] M. Speretta and S. Gauch. Personalized search based on user search histories. In WI, 622-628, 2005.
[28] K. Sugiyama, K. Hatano, and M. Yoshikawa. Adaptive web search based on user profile constructed without any effort from users. In WWW, 675-684, 2004.
[29] B. Tan, X. Shen, and C. Zhai. Mining long-term search history to improve search accuracy. In KDD, 718-723, 2006.
[30] J. Teevan, S. T. Dumais, and E. Horvitz. Personalizing search via automated analysis of interests and activities. In SIGIR, 449-456, 2005.
[31] J.-R. Wen, J.-Y. Nie, and H. Zhang. Query clustering using user logs. ACM Trans. Inf. Syst., 20(1):59-81, 2002.
[32] J. Xu and W. B. Croft. Query expansion using local and global document analysis. In SIGIR, 4-11, 1996.
[33] J. Zobel and A. Moffat. Exploring the similarity space. SIGIR Forum, 32(1):18-34, 1998.

Lin LI
Ph.D. student at the Graduate School of Information Science and Technology, the University of Tokyo. She received her Master's degree from Wuhan University of Technology, China. Her research interests include Web mining, personalization and recommendation.

Zhenglu YANG
Research associate at the Institute of Industrial Science, the University of Tokyo. He received his Ph.D. degree from the University of Tokyo in 2008. His research interests include sequence mining and data mining.

Masaru KITSUREGAWA
Professor and director of the center for information fusion at the Institute of Industrial Science, the University of Tokyo. He received his Ph.D. degree in information engineering in 1983 from the University of Tokyo. His research interests include parallel processing and database engineering. He is a member of the steering committees of IEEE ICDE, PAKDD and WAIM, and has been a trustee of the VLDB Endowment. He was the chair of the data engineering special interest group of the Institute of Electronic, Information, Communication, Engineering, Japan, and the chair of ACM SIGMOD Japan Chapter. He is currently a trustee of DBSJ.