Discriminative Models of Integrating Document Evidence and Document-Candidate Associations for Expert Search

Yi Fang
Department of Computer Science, Purdue University, West Lafayette, IN 47907, USA
fangy@cs.purdue.edu

Luo Si
Department of Computer Science, Purdue University, West Lafayette, IN 47907, USA
lsi@cs.purdue.edu

Aditya P. Mathur
Department of Computer Science, Purdue University, West Lafayette, IN 47907, USA
apm@cs.purdue.edu

ABSTRACT

Generative models such as statistical language modeling have been widely studied in the task of expert search to model the relationship between experts and their expertise indicated in supporting documents. On the other hand, discriminative models have received little attention in expert search research, although they have been shown to outperform generative models in many other information retrieval and machine learning applications. In this paper, we propose a principled relevance-based discriminative learning framework for expert search and derive specific discriminative models from the framework. Compared with the state-of-the-art language models for expert search, the proposed research can naturally integrate various document evidence and document-candidate associations into a single model without extra modeling assumptions or effort. An extensive set of experiments has been conducted on two TREC Enterprise track corpora (i.e., W3C and CERC) to demonstrate the effectiveness and robustness of the proposed framework.

Categories and Subject Descriptors
H.3 [Information Storage and Retrieval]: H.3.3 Information Search and Retrieval; H.3.4 Systems and Software

General Terms
Algorithms, Design, Experimentation

Keywords
Expert search, enterprise search, discriminative models

1. INTRODUCTION

With vast amounts of information available within large organizations, the key challenge is to harness existing knowledge and expertise in a timely and effective manner. Consequently, enterprise information retrieval systems are increasingly required to return people with specific knowledge and skills in response to a user's query.
A class of vertical search engines known as expert finders has emerged for enterprise organizations. As an important IR application, expert search (also known as expert finding) has received substantial attention in the IR research community. Rapid progress has been made in modeling and evaluation since the launch of the TREC Enterprise Track in 2005 [12]. A notable observation is that probabilistic generative models have dominated the literature of expert search. In particular, many statistical language modeling techniques were proposed to model the relationship between a candidate expert and a query. These models usually characterize a generative process of how a query is generated from the supporting documents of an expert. The key ingredient in these methods is to determine associations between people and documents, because the associations are ambiguous in the TREC scenarios as well as in many realistic settings. Previous work has investigated different metrics, or combinations of them, to measure the associations, but the way of choosing or combining them is often heuristic and lacks a clear justification. Furthermore, document evidence such as document or expert authority information, internal and external document structures, and global evidence has been shown to significantly improve expert retrieval performance, but incorporating these features often requires many modeling assumptions and is often unwieldy.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. SIGIR'10, July 19-23, 2010, Geneva, Switzerland. Copyright 2010 ACM 978-1-60558-896-4/10/07...$10.00.
On the other hand, discriminative models, another important class of probabilistic models with a solid statistical foundation, are nearly absent in the research of expert search, especially in the TREC evaluations. In fact, discriminative models have been preferred over generative models in many machine learning applications in the recent past, partly because of their attractive theoretical properties. In the domain of IR, various discriminative models have also been applied to many retrieval problems (e.g., [23]). However, very limited research has been conducted to design discriminative models for expert search.

In this work, we present a relevance-based discriminative learning framework for expert search and derive specific discriminative models from the framework. Similar to some prominent language models, the proposed models aggregate document evidence and document-candidate associations through supporting documents. Unlike the language models, we directly model the conditional probability of relevance given a query and an expert. As a result, heterogeneous or even arbitrary features can be naturally included in a single model. The parameters associated with the features are automatically learned from training data. We

report an extensive set of experiments on two TREC corpora to evaluate the effectiveness and robustness of the proposed discriminative framework.

The next section discusses related work. Section 3 introduces the state-of-the-art generative language models for expert search. Section 4 presents our proposed approaches and discusses the advantages of discriminative models in the context of expert search. Section 5 explains our experimental methodology and Section 6 presents the experimental results. The final section concludes and points out some future work.

2. RELATED WORK

The early work on expert finding systems was initiated in the Knowledge Management community, usually in the form of yellow pages [9]. These systems relied on experts to judge and input their own skills against a predefined set of keywords, and thus the task was time-consuming. More recent techniques locate experts in an automatic fashion. An overview of early automatic expert finding systems is provided in [36]. The task of expert search has received a significant amount of attention since it was included in the TREC Enterprise track from 2005 to 2008 [12, 32, 1, 7]. The TREC Enterprise tracks provided a common platform for researchers to empirically evaluate methods for expert search. They demonstrated the feasibility of expert search on heterogeneous data collections. In the TREC corpora, the relationship between documents and experts is ambiguous, and thus modeling the document-candidate associations is a key issue in expert search research.

Most of the recent work on expert search falls into two categories: profile-centric and document-centric approaches. Balog et al. [3] formalize the two methods by proposing two generative language models. Their Model 1 directly models the knowledge of an expert from associated documents, which is equivalent to a profile-centric approach, and their Model 2 first locates documents on the topic and then finds the associated experts, which is a document-centric approach.
It has been shown in [3] that Model 2 is generally more effective than Model 1, and it has since become one of the most prominent language models for expert search. In [8], a two-stage language model combining a document relevance model and a co-occurrence model is proposed, which is essentially equivalent to Model 2. An attempt to further improve these models is made by proposing a proximity-based document representation for incorporating sequential information in text [25]. There are many other generative probabilistic models proposed for expert finding. For example, Serdyukov and Hiemstra [30] propose an expert-centric language model. Fang and Zhai [14] derive two families of generative models by applying the probability ranking principle. Probabilistic topic models are also proposed to simultaneously model the topical distribution of expertise evidence and experts [34].

Some alternative approaches to expert search exist beyond language modeling. One effective approach is to treat the problem of ranking experts as a voting problem based on data fusion techniques [21]. Eleven different voting strategies were proposed to aggregate over the documents associated with an expert. Another approach is to model the process of expert finding by probabilistic random walks on so-called expertise graphs [31]. Many other expert finding methods were proposed during the TREC Enterprise tracks.

Besides the models, some researchers have shown that suitable features can significantly boost the performance of expert finding. These features include document authority information such as PageRank, indegree, and URL length [38], graph-based expert authority [10], internal document structures that indicate the experts' associations with the content of documents [6], non-local evidence [2], and evidence that can be acquired outside of an enterprise [29]. Additional evidence can be integrated by identifying home pages of candidate experts and clustering relevant documents [20].
Proximity features that characterize the co-occurrence of query and expert mentions in a document have also been shown to be indicative by the top runs in the TREC evaluations [16]. This led to several window-based approaches including [25, 4, 20].

On the other hand, the early work of applying discriminative models in IR dates back to the early 1980s, when the maximum entropy approach was investigated to get around term independence assumptions in probabilistic generative models [11]. More recently, Nallapati [23] compared the performance of the maximum entropy model and support vector machines with that of language modeling in ad hoc retrieval and homepage finding, and argued that SVMs are preferred over language models because of their ability to learn arbitrary features automatically. Furthermore, it has been shown that feature-based discriminative models can consistently and significantly outperform current state-of-the-art retrieval models with the correct choice of features [22]. Discriminative models have received increasing attention in IR as a related area, learning to rank for IR, has sparked genuine interest among researchers in the community [18]. Most learning to rank models are discriminative in nature and they have shown improvements over their generative counterparts in ad hoc retrieval. Benchmark data sets such as LETOR [19] are also available for research on learning to rank.

Although valuable work has been done on discriminative models for ad hoc retrieval and other IR domains, very limited research has been conducted to design discriminative models for expert search. The only relevant work that we are aware of is [15], which addressed the issue of differentiating heterogeneous sources according to specific queries and experts by learning associated weights from data, but that work did not model the document-candidate relationship or address how to incorporate new document evidence, which are two key issues in expert search.

3.
GENERATIVE MODELS

To predict a class $C$ given an observation $x$, the desired choice of $C$ is given by the conditional class probabilities $P(C|x)$. Depending on how $P(C|x)$ is computed, the existing classification techniques can be broadly classified into two major categories: generative models and discriminative models. In a discriminative approach, a parametric model is introduced for $P(C|x)$, and the values of the parameters are inferred from a set of labeled training data. In contrast, the generative approach attempts to capture the manner in which an observation $x$ is generated from given classes $C$ by specifying a prior distribution $P(C)$ over classes and a class-conditional distribution $P(x|C)$ over the observation. The posterior $P(C|x)$ is obtained from Bayes' Theorem as

$$P(C|x) \propto P(x|C)\,P(C) \qquad (1)$$

In the context of expert search, the task is to find out what

is the probability of a candidate $e$ being an expert given a query topic $q$. In other words, we want to know $P(e|q)$ in order to rank candidates according to this probability. Similarly, by invoking Bayes' Theorem, we have:

$$P(e|q) \propto P(q|e)\,P(e) \qquad (2)$$

where $P(e)$ is the prior probability of a candidate, which is generally assumed uniform. Thus, the key quantity to estimate in the generative models is the probability of a query given the candidate, $P(q|e)$. Many language modeling techniques have been proposed to estimate this quantity. One of the most prominent and effective ones is the document model (often referred to as Model 2) [3], where documents act as a hidden variable in the process which accumulates expertise evidence. Formally, it is expressed as

$$P(q|e) = \sum_{t=1}^{n} P(q|d_t)\,P(d_t|e) \qquad (3)$$

where $P(q|d_t)$ is the probability that the document $d_t$ generates the query $q$, which can be calculated using a standard language model, $P(d_t|e)$ is the probability of association between the document $d_t$ and the candidate $e$, and $n$ is the number of documents in the collection. Model 2 mimics the process one might use to find experts using a document retrieval system. Here, relevant documents are retrieved for the expertise requested, and they are used as evidence to indicate whether the associated candidates are experts. After aggregating all such evidence, the experts can be identified.

As $P(q|d_t)$ is relatively easy to determine in language models, the key ingredient in this model (and also in many other language models for expert search) is to estimate the document-candidate associations: $P(d_t|e)$, or $P(e|d_t)$ if $P(d_t)$ is assumed to be uniform. $P(e|d_t)$ can be estimated by various methods. The simplest form is the boolean model, where associations are binary decisions: $P(e|d_t) = 1$ if the candidate appears in the document; otherwise, $P(e|d_t) = 0$. More sophisticated methods are frequency based and consider the number of times that a candidate appears in the document. A set of heuristic combinations of all these metrics is also compared and investigated in [6].

4.
DISCRIMINATIVE MODELS FOR EXPERT SEARCH

4.1 Discriminative Learning Framework for Expert Search

For text-based retrieval, conventional relevance-based probabilistic models rank documents by sorting the conditional probability that each document would be judged relevant to the given query [17]. The underlying principle of using probabilistic models for information retrieval is called the probability ranking principle [26]. The Binary Independence Model (BIM) [27] is a realization of this principle. In the domain of expert search, a similar principle can be used, where experts are ranked in descending order of the conditional probability of relevance given an expert and a query. Fang and Zhai [14] applied this principle in studying the expert search problem. Both BIM and the models in [14] are generative, and they use Bayes' theorem to reverse the original conditional probability.

We propose a discriminative learning framework to directly model the conditional probability of relevance by a parametric probability function. We cast expert search as a binary classification problem that treats the relevant query-expert pairs as positive data and irrelevant pairs as negative data. Formally, we use a relevance variable $r \in \{1, 0\}$ to denote whether two entities are relevant or not, and thus the conditional probability of relevance $P_\theta(r|e,q)$ represents the extent to which the expert $e$ is relevant to the query $q$. In our framework, $P_\theta(r|e,q)$ can take any functional form with parameter $\theta$, which needs to be estimated from training data. Different forms of $P_\theta$ result in different discriminative models. Given the relevance judgment $r_{mk}$ for each training expert-query pair $(e_k, q_m)$, which is assumed to be independently generated, the conditional likelihood $L$ of the training data is

$$L = \prod_{m=1}^{M} \prod_{k=1}^{K} P_\theta(r=1|e_k,q_m)^{r_{mk}}\, P_\theta(r=0|e_k,q_m)^{1-r_{mk}} \qquad (4)$$

where $M$ is the number of queries and $K$ is the number of experts.
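As a small worked illustration of Eq. 4, the sketch below evaluates the conditional likelihood for three expert-query pairs. The probabilities and judgments are toy values chosen for illustration (they are not taken from the experiments), and the function name `conditional_likelihood` is hypothetical.

```python
import math


def conditional_likelihood(probs, labels):
    """Eq. 4 over a flat list of training pairs: prod p^r * (1 - p)^(1 - r)."""
    likelihood = 1.0
    for p, r in zip(probs, labels):
        # A relevant pair (r = 1) contributes p, an irrelevant one 1 - p.
        likelihood *= p if r == 1 else (1.0 - p)
    return likelihood


# Hypothetical model outputs P_theta(r=1 | e_k, q_m) for three pairs and
# their relevance judgments r_mk (toy values, an assumption of this sketch).
toy_probs = [0.9, 0.2, 0.6]
toy_labels = [1, 0, 1]

toy_likelihood = conditional_likelihood(toy_probs, toy_labels)  # 0.9 * 0.8 * 0.6
toy_log_likelihood = math.log(toy_likelihood)
```

Because the product of many probabilities underflows quickly, implementations usually work with the logarithm of this quantity, which turns the product into a sum.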
The parameters can then be estimated by maximizing the following log-likelihood function

$$\theta^{*} = \arg\max_{\theta} \sum_{m=1}^{M} \sum_{k=1}^{K} \Big[ r_{mk} \log P_\theta(r=1|e_k,q_m) + (1 - r_{mk}) \log\big(1 - P_\theta(r=1|e_k,q_m)\big) \Big] \qquad (5)$$

The estimated parameters can then be plugged back into $P_\theta(r=1|e_k,q_m)$. According to the probability ranking principle, the experts are presented to users in descending order of $P_\theta(r=1|e_k,q_m)$. In the next section, we propose a specific discriminative model by defining the form of $P_\theta(r=1|e_k,q_m)$.

4.2 A Discriminative Model

According to previous work, Model 2 has turned out to be one of the most effective formal models for expert search. The success of the model lies in its effective process of collecting expertise evidence from documents. Our discriminative model builds on the same process, in which the supporting document $d$ serves as a bridge connecting expert $e$ and query $q$. Given a document $d$, whether $e$ and $q$ are relevant depends on two factors: document evidence and document-candidate associations. More specifically, we consider: 1) whether the document $d$ is relevant to the query $q$; 2) whether the expert $e$ is relevant to the document $d$. The final relevance decision for $(e, q)$ is made by averaging over all the documents. Formally, this can be expressed as

$$P_\theta(r=1|e,q) = \sum_{t=1}^{n} P(r_1=1|q,d_t)\, P(r_2=1|e,d_t)\, P(d_t) \qquad (6)$$

where $P(r_1=1|q,d_t)$ models the probability that a document $d_t$ matches a topic $q$, which indicates the document evidence, and $P(r_2=1|e,d_t)$ models the probability that a supporting document $d_t$ mentions a candidate $e$, which indicates the document-candidate associations. A document $d_t$ with higher values on both probabilities contributes more to the value of $P(r=1|e,q)$. The prior probability of a document, $P(d_t)$, is generally assumed uniform (i.e., $P(d_t) = 1/n$). We model both $P(r_1=1|q,d_t)$ and $P(r_2=1|e,d_t)$ by logistic functions on a linear combination

of features. Formally, they are parameterized as follows:

$$P(r_1=1|q,d_t) = \sigma\Big(\sum_{i=1}^{N_f} \alpha_i f_i(q,d_t)\Big) \qquad (7)$$

$$P(r_2=1|e,d_t) = \sigma\Big(\sum_{j=1}^{N_g} \beta_j g_j(e,d_t)\Big) \qquad (8)$$

where $\sigma(x) = 1/(1 + \exp(-x))$ is the standard logistic function, $\alpha_i$ is the weight for the $i$-th query-document feature $f_i(q,d_t)$, and $\beta_j$ is the weight for the $j$-th document-candidate feature $g_j(e,d_t)$. Specifically, $f_i(q,d_t)$ is document evidence, such as a document retrieval score, that indicates how relevant the document is to the query. $g_j(e,d_t)$ is a feature, such as a boolean association, that describes the strength of association between a document and a candidate. $N_f$ denotes the number of document evidence features and $N_g$ denotes the number of document-candidate association features.

The weight parameters can be learned by maximizing the conditional log-likelihood of the data (i.e., Eq. 5). Because there is no analytical solution, we use the BFGS quasi-Newton method for the optimization [13]. The method requires the objective function and its gradients. The partial derivatives of the log-likelihood $\log L$ with respect to $\alpha_i$ and $\beta_j$ are given as

$$\frac{\partial \log L}{\partial \alpha_i} = \sum_{m=1}^{M} \sum_{k=1}^{K} \frac{r_{mk} - P_r}{P_r (1 - P_r)} \cdot \frac{1}{n} \sum_{t=1}^{n} \sigma_\alpha (1 - \sigma_\alpha)\, \sigma_\beta\, f_i(q_m, d_t)$$

$$\frac{\partial \log L}{\partial \beta_j} = \sum_{m=1}^{M} \sum_{k=1}^{K} \frac{r_{mk} - P_r}{P_r (1 - P_r)} \cdot \frac{1}{n} \sum_{t=1}^{n} \sigma_\beta (1 - \sigma_\beta)\, \sigma_\alpha\, g_j(e_k, d_t)$$

where $P_r$, $\sigma_\alpha$ and $\sigma_\beta$ denote the probabilities of Eq. 6, Eq. 7, and Eq. 8, respectively, evaluated for the pair $(e_k, q_m)$ and the document $d_t$, and the factor $1/n$ is the uniform prior $P(d_t)$ of Eq. 6. The main computation of the gradient method is evaluating the log-likelihood function and its gradients with respect to the parameters. Both have computational complexity $O(nMK(N_f + N_g))$. In practice, we only have a small number of relevance judgments for training and thus $K$ is relatively small. In addition, the number of documents associated with each expert and the number of features used are also usually relatively small. Therefore, the training procedure can be efficient.

We can see that both Model 2 and this discriminative model try to aggregate document evidence and document-candidate associations through the bridge of documents, but they differ in how these two probabilities are estimated.
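The model of Eqs. 6 to 8 and the gradient of the log-likelihood with respect to the weights $\alpha_i$ can be sketched in a few lines. This is a minimal sketch, not the implementation used in the experiments: the function names, the data layout (one feature vector per document for each expert-query pair), and the toy numbers are all assumptions made for illustration, and the uniform prior $P(d_t) = 1/n$ is applied explicitly.

```python
import math


def sigmoid(x):
    """Standard logistic function sigma(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def dot(weights, feats):
    return sum(w * v for w, v in zip(weights, feats))


def amd_probability(alpha, beta, doc_feats, assoc_feats):
    """Eq. 6 with uniform P(d_t) = 1/n: the mean over documents of
    sigma(alpha . f(q, d_t)) * sigma(beta . g(e, d_t))."""
    total = 0.0
    for f_t, g_t in zip(doc_feats, assoc_feats):
        total += sigmoid(dot(alpha, f_t)) * sigmoid(dot(beta, g_t))
    return total / len(doc_feats)


def log_likelihood(alpha, beta, pairs):
    """Objective of Eq. 5; pairs holds (r, doc_feats, assoc_feats) tuples."""
    ll = 0.0
    for r, doc_feats, assoc_feats in pairs:
        p = amd_probability(alpha, beta, doc_feats, assoc_feats)
        ll += r * math.log(p) + (1 - r) * math.log(1.0 - p)
    return ll


def grad_alpha(alpha, beta, pairs):
    """Partial derivatives of the log-likelihood w.r.t. each alpha_i."""
    grad = [0.0] * len(alpha)
    for r, doc_feats, assoc_feats in pairs:
        n = len(doc_feats)
        p = amd_probability(alpha, beta, doc_feats, assoc_feats)
        outer = (r - p) / (p * (1.0 - p))
        for f_t, g_t in zip(doc_feats, assoc_feats):
            s_a = sigmoid(dot(alpha, f_t))
            s_b = sigmoid(dot(beta, g_t))
            for i, f_ti in enumerate(f_t):
                grad[i] += outer * s_a * (1.0 - s_a) * s_b * f_ti / n
    return grad


# Toy training set (hypothetical values): two expert-query pairs, each with
# two supporting documents, two document features and one association feature.
toy_alpha = [0.5, -0.2]
toy_beta = [1.0]
toy_pairs = [
    (1, [[1.0, 0.5], [0.2, 0.1]], [[1.0], [0.0]]),
    (0, [[0.3, 0.9], [0.0, 0.4]], [[0.0], [1.0]]),
]
toy_gradient = grad_alpha(toy_alpha, toy_beta, toy_pairs)
```

In a full training loop the objective and its gradient (including the analogous derivative for $\beta_j$) would be handed to a quasi-Newton optimizer such as BFGS; comparing the analytic gradient against a finite-difference estimate is a cheap way to catch sign or index mistakes before doing so.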
In Model 2, the document evidence (i.e., $P(q|d_t)$) is calculated by standard language models and the document-candidate associations (i.e., $P(d_t|e)$) are estimated by a heuristic combination of document-candidate association features. In our proposed discriminative model, both quantities are modeled by logistic functions with arbitrary features and the parameters are automatically determined from training data. From Eq. 6, we can see that $P_\theta(r=1|e,q)$ is essentially the arithmetic mean of $P(r=1|q,d,e)$ with respect to $d$. Thus we refer to the model as the arithmetic mean discriminative (AMD) model.

4.3 An Alternative Discriminative Model with Geometric Mean

It has been shown that in certain cases the geometric mean (the product rule) is better than the arithmetic mean (the sum rule) in combining evidence [35]. This observation motivates an alternative discriminative model, which we refer to as the geometric mean discriminative (GMD) model, where $P_\theta(r=1|e,q)$ is modeled by the geometric mean as follows:

$$P(r=1|e,q) = \frac{1}{Z} \prod_{t=1}^{n} \Big( P(r_1=1|q,d_t)\, P(r_2=1|e,d_t) \Big)^{\frac{1}{n}} \qquad (9)$$

where $Z$ is the normalization factor that scales the geometric mean to be a proper probability distribution:

$$Z = \sum_{r_1 \in \{0,1\},\, r_2 \in \{0,1\}} \prod_{t=1}^{n} \Big( P(r_1|q,d_t)\, P(r_2|e,d_t) \Big)^{\frac{1}{n}} \qquad (10)$$

Both $P(r_1=1|q,d_t)$ and $P(r_2=1|e,d_t)$ here take the same form as in Eq. 7 and Eq. 8. By plugging them and Eq. 10 into Eq. 9, we get

$$P(r=1|e,q) = \frac{1}{1 + \exp(E) + \exp(F) + \exp(G)} \qquad (11)$$

where

$$E = -\sum_{i=1}^{N_f} \alpha_i \Big( \frac{1}{n} \sum_{t=1}^{n} f_i(q,d_t) \Big), \qquad F = -\sum_{j=1}^{N_g} \beta_j \Big( \frac{1}{n} \sum_{t=1}^{n} g_j(e,d_t) \Big),$$

$$G = -\sum_{i=1}^{N_f} \alpha_i \Big( \frac{1}{n} \sum_{t=1}^{n} f_i(q,d_t) \Big) - \sum_{j=1}^{N_g} \beta_j \Big( \frac{1}{n} \sum_{t=1}^{n} g_j(e,d_t) \Big)$$

We can notice that in Eq. 11 there are three exponential terms in the denominator, which means that neither the query-document features $f_i(q,d_t)$ nor the document-candidate features $g_j(e,d_t)$ alone can dominate the final relevance $P(r=1|e,q)$. The parameters of the model can also be estimated by maximizing the conditional log-likelihood function using BFGS. The GMD model has the same computational complexity as AMD.
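The closed form of Eq. 11 can be checked numerically against the direct definition in Eqs. 9 and 10. The sketch below assumes that $E$ and $F$ are the negated weighted averages of the document and association features and that $G = E + F$, which is what substituting the logistic forms of Eqs. 7 and 8 into Eq. 9 yields; the function names and toy values are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(weights, feats):
    return sum(w * v for w, v in zip(weights, feats))


def gmd_closed_form(alpha, beta, doc_feats, assoc_feats):
    """Eq. 11, assuming E and F are negated averaged linear scores."""
    n = len(doc_feats)
    e_term = -sum(dot(alpha, f_t) for f_t in doc_feats) / n
    f_term = -sum(dot(beta, g_t) for g_t in assoc_feats) / n
    g_term = e_term + f_term
    return 1.0 / (1.0 + math.exp(e_term) + math.exp(f_term) + math.exp(g_term))


def gmd_direct(alpha, beta, doc_feats, assoc_feats):
    """Eq. 9 normalized by Eq. 10, evaluated without simplification."""
    n = len(doc_feats)

    def geometric_mean(r1, r2):
        log_sum = 0.0
        for f_t, g_t in zip(doc_feats, assoc_feats):
            p1 = sigmoid(dot(alpha, f_t))
            p2 = sigmoid(dot(beta, g_t))
            # Use P(r=0) = 1 - P(r=1) for the non-relevant outcomes.
            log_sum += math.log(p1 if r1 == 1 else 1.0 - p1)
            log_sum += math.log(p2 if r2 == 1 else 1.0 - p2)
        return math.exp(log_sum / n)

    z = sum(geometric_mean(r1, r2) for r1 in (0, 1) for r2 in (0, 1))
    return geometric_mean(1, 1) / z


# Toy inputs (hypothetical): two documents, two document features,
# one association feature.
toy_alpha = [0.5, -0.2]
toy_beta = [1.0]
toy_doc_feats = [[1.0, 0.5], [0.2, 0.1]]
toy_assoc_feats = [[1.0], [0.0]]

p_closed = gmd_closed_form(toy_alpha, toy_beta, toy_doc_feats, toy_assoc_feats)
p_direct = gmd_direct(toy_alpha, toy_beta, toy_doc_feats, toy_assoc_feats)
```

Under these assumptions the denominator factorizes as $(1 + \exp(E))(1 + \exp(F))$, so the GMD score equals the product of two logistic functions applied to the document-averaged linear scores.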
4.4 Advantages of Discriminative Models for Expert Search

Some theoretical results show that discriminative models tend to have a lower asymptotic error [24]. Besides the theoretical considerations, we believe there are specific reasons in the domain of expert search that make discriminative models a suitable choice.

First of all, the proposed discriminative models can effortlessly incorporate features. As shown in Section 2 and prior research, expert search can benefit from including various types of features. Language modeling approaches often require many modeling assumptions and extra modeling effort to include new features, especially when heterogeneous features are present.

Secondly, discriminative models typically make fewer model assumptions than their generative counterparts. For example, many state-of-the-art generative models, including Model 2, the candidate-generation model [14] and the two-stage language model approach [8], assume that the query $q$ and candidate $e$ are independent given the document $d$, i.e., $p(e|q,d) = p(e|d)$. It requires extra modeling effort for these models to overcome the assumption [4]. In contrast, our proposed discriminative models can easily get around it. For example, $P(r_2=1|e,d_t)$ in Eq. 6 can be replaced by $P(r_2=1|e,q,d_t)$, where no independence assumption is made on $P(r_2=1|e,q,d_t)$.

Thirdly, the discriminative models directly and naturally characterize the notion of relevance. In Model 2 and many other language models, there

is no explicit reference to the class variable that denotes whether an expert is relevant or not. We use $P(r=1|e,q)$ instead of $P(e|q)$ to make it explicit that the relevance of an expert is measured with respect to a query. This explicit notion of relevance can help quantify the extent to which a user's information need is satisfied.

5. EXPERIMENTS

5.1 Data Collections

Our experiments are carried out in the setting of the Expert Search task of the TREC Enterprise tracks from 2005 to 2008. For TREC 2005 and 2006, the document collection was a crawl of the World Wide Web Consortium (W3C) [12, 32]. For TREC 2007 and 2008, a different and more realistic corpus was introduced, which is a crawl of the website of the Commonwealth Scientific and Industrial Research Organization (CSIRO). The corpus is known as the CSIRO Enterprise Research Collection (CERC) [1, 7]. Table 1 gives detailed statistics of the collections and query sets.

The W3C data is supplemented with a list of 1,092 candidate experts represented by their full names and email addresses, while the CERC data does not contain a predefined list of candidates. Based on the observation that most CSIRO employees have a CSIRO email address following the pattern firstname.lastname@csiro.au, we extract from the text a list of candidates with email addresses matching this pattern. We also use heuristic rules to filter out non-personal addresses (e.g., education.act@csiro.au). The total number of candidates extracted is 3,482.

In 2005, 50 queries were created based on the working groups in W3C (there were also 10 training topics available in 2005). In 2006, 49 queries were developed by the track participants collectively, using the provided list of supporting documents for each candidate. The 50 queries used in 2007 were created with the help of CSIRO's Science Communicators, while the judgments of the 77 queries in 2008 were made by participants. To evaluate the proposed models on W3C, we use the TREC 2006 topics plus the 10 available TREC 2005 training topics for training and test the models on the TREC 2005 topics.
Similarly on CERC, we use the TREC 2008 topics for training and the TREC 2007 topics for testing. Although topics were assessed in different ways in different years, we will see in the experiments that the discriminative models can still gain significant improvements from the training data. Our choice of training and testing configurations is mainly based on the number of relevance judgments available. We need a reasonable amount of training data for the discriminative models, and there are relatively more relevance judgments in 2006 for W3C and in 2008 for CERC. Because the two test collections have very different characteristics, we do not evaluate the models across the corpora.

To obtain a balanced training set, we randomly select the same number of negative instances as the number of positive instances for each training query, following the undersampling method in [23]. To acquire negative instances for the queries without non-relevance judgments (i.e., the 10 TREC 2005 training topics), we use the Base method introduced in Section 6.1 to identify a list of unjudged/irrelevant experts for each query. Evaluation measures are mean average precision (MAP), R-precision (R-Prec), mean reciprocal rank (MRR), precision@5 (P@5) and precision@10 (P@10).

Table 1: Statistics of the W3C and CERC testbeds

                               W3C            CERC
# Documents                    331,037        370,715
# People                       1,092          3,482
Avg. Doc Length (in Tokens)    983.4          354.8
Avg. # Rel Experts/Topic       51.5 (2006)    10.4 (2008)
  (TREC Year)                  30.2 (2005)    3.0 (2007)
Training Queries               2006 (49)      2008 (77)
                               2005 (10)
Testing Queries                2005 (50)      2007 (50)

5.2 Research Questions

An extensive set of experiments was designed to address the following questions of the proposed research: Can the discriminatively trained model perform better than its generative counterpart when the same set of features is available for use? (Section 6.1) Can the integration of additional features into the discriminative model improve the performance? (Section 6.1) What features are likely more important in terms of the relative values of the learned weights in the discriminative model?
(Section 6.1) What is the effect of only retrieving a subset of documents on the proposed model? (Section 6.2) How robust is the proposed discriminative model with respect to the underlying document retrieval methods? (Section 6.3) How robust is the proposed discriminative learning framework with respect to specific discriminative models? (Section 6.4)

In all the sections except Section 6.4, we only use the arithmetic mean discriminative (AMD) model to assess the discriminative learning approach, since we care less about the difference between discriminative models than about the difference between generative and discriminative models.

5.3 Experimental Setup

In all our experiments, we do minimal preprocessing, in which both queries and documents are stemmed using the Krovetz stemmer. We only use the title or query fields in the topics without using extra information (e.g., the narrative). No query expansion or external resource is utilized. As shown in Section 4, each query-expert pair is characterized by two feature vectors, i.e., document evidence $f_i(q,d_t)$ and document-candidate associations $g_j(e,d_t)$. Table 2 summarizes the features used in the discriminative models. These features include the score from the standard document language model ($f_1$), document features ($f_2$-$f_5$), external document structure features ($f_6$-$f_9$), basic association features ($g_1$-$g_5$), internal document structure features ($g_6$-$g_9$), and proximity features ($g_{10}$-$g_{13}$). Here the external document structure features are boolean variables that represent whether a document (in W3C) comes from specific types of documents (e.g., $f_8 = 1$ means the document is either from www or esw). The evaluations on W3C use all the features, while the features $f_6$-$f_9$ and $g_6$-$g_9$ are not applied to CERC, as the CERC dataset does not

contain explicit document types or many emails with internal structure information useful for expert search [38].

Table 2: Features used in the discriminative models. B denotes that the feature takes boolean values and N denotes numerical values.

Feature              Description       Type   Reference
$f_1$                LM                N      [37]
$f_2$                PageRank          N      [38]
$f_3$                URL length        N      [38]
$f_4$                Anchor text       N      [38]
$f_5$                Title             N      [38]
$f_6$                From lists        B      [12]
$f_7$                From people       B      [12]
$f_8$                From www+esw      B      [12]
$f_9$                From other+dev    B      [12]
$g_1$                Exact name match  B      [3]
$g_2$                Name match        B      [3]
$g_3$                Last name match   B      [3]
$g_4$                Email match       B      [3]
$g_5$                LM score          N      [6]
$g_6$                EMAIL FROM        B      [5]
$g_7$                EMAIL TO          B      [5]
$g_8$                EMAIL CC          B      [5]
$g_9$                EMAIL CONTENT     B      [5]
$g_{10}$-$g_{13}$    Proximity         B      -

The $f_1$ feature is the document retrieval score by LM using the topic as the query. The smoothing method of LM is Jelinek-Mercer with the parameter $\lambda = 0.5$ (we use the same smoothing for other LMs). The $g_5$ feature is the retrieval score by LM using the candidate identifier as the query [6]. The proximity features ($g_{10}$-$g_{13}$) are boolean variables indicating whether the candidate identifier co-occurs with a query term in a window of a given size. We use 20, 50, 100 and 250 as the window sizes (in number of words), approximating the sizes of a sentence, passage, paragraph and section, respectively. The details about these features can be found in the corresponding references. To normalize the features, we use query-based normalization for each feature as suggested in [19].

Many of these features have been shown to be useful for expert search. Because of the generative nature of language models, it is difficult for them to incorporate such heterogeneous features in a unified modeling framework, but discriminative models can effortlessly include all these features and many more. Since the focus of this study is on the probabilistic models rather than feature engineering, we do not intend to choose a complete set of features, but this is one of the most comprehensive and diverse feature sets used in a single work in the existing expert search research.

6.
RESULTS

6.1 Discriminative Model vs. Model 2

In this section, we compare the proposed discriminative model with its generative counterpart, Model 2. The proposed model is evaluated on four different feature configurations, which are presented in Table 3. The Base method is the implementation of Model 2 following [3], which includes 4 types of document-candidate associations. The R1 configuration uses these 4 association features plus $f_1$ as document evidence. Thus, identical information is available for R1 and Base to use. The weights in Base are set by following the choice of the best run in [3]. R4 is the configuration with all applicable features for the discriminative model (the R4 configuration is the default setting in all the experiments unless explicitly noted).

Table 3: Experimental configurations

Base   Balog et al.'s Model 2 (candidate-centric) with 4 association features (i.e., $g_1$-$g_4$) [3]
R1     Discriminative model with 4 association features ($g_1$-$g_4$) and the LM document evidence feature ($f_1$)
R2     Discriminative model with full document evidence features and 4 association features ($g_1$-$g_4$)
R3     Discriminative model with full association features and one document evidence feature ($f_1$)
R4     Discriminative model with full document evidence features and full association features

Table 4: Comparison of the discriminative model (AMD) with the Base method on W3C and CERC. Best results on each collection are marked with an asterisk. Statistical significance is measured against Base at the 0.95 confidence level.

W3C     MAP       R-Prec    MRR       P@5       P@10
Base    0.1909    0.2445    0.5081    0.3760    0.3120
R1      0.2001    0.2552    0.5300    0.3820    0.3310
R2      0.2282    0.2764    0.5624    0.3960    0.3370
R3      0.2412    0.2904    0.6232*   0.4020    0.3560
R4      0.2598*   0.3035*   0.6196    0.4130*   0.3680*

CERC    MAP       R-Prec    MRR       P@5       P@10
Base    0.4039    0.3514    0.5389    0.2240    0.1540
R1      0.4123    0.3569    0.5593    0.2280    0.1540
R2      0.4453    0.3854    0.5924    0.2390    0.1650
R3      0.4569    0.3879    0.5886    0.2610*   0.1660
R4      0.4604*   0.3938*   0.6143*   0.2520    0.1770*

Table 4 contains the evaluation results on the two test collections.
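For reference, the per-topic quantities behind the measures reported in Table 4 can be computed as in the sketch below, using their standard definitions; averaging `average_precision` and `reciprocal_rank` over all test topics gives MAP and MRR. The function names, candidate identifiers and toy ranking are hypothetical and serve only as an illustration.

```python
def average_precision(ranked, relevant):
    """Mean of the precision values at the ranks of the relevant candidates."""
    hits = 0
    total = 0.0
    for rank, candidate in enumerate(ranked, start=1):
        if candidate in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant candidate, or 0 if none is retrieved."""
    for rank, candidate in enumerate(ranked, start=1):
        if candidate in relevant:
            return 1.0 / rank
    return 0.0


def precision_at_k(ranked, relevant, k):
    """Fraction of the top k candidates that are relevant."""
    return sum(1 for candidate in ranked[:k] if candidate in relevant) / k


def r_precision(ranked, relevant):
    """Precision at rank R, where R is the number of relevant candidates."""
    r = len(relevant)
    return precision_at_k(ranked, relevant, r) if r else 0.0


# Toy ranking for one topic (hypothetical candidate identifiers).
toy_ranked = ["e3", "e1", "e7", "e2", "e5"]
toy_relevant = {"e1", "e2"}

toy_ap = average_precision(toy_ranked, toy_relevant)    # (1/2 + 2/4) / 2
toy_rr = reciprocal_rank(toy_ranked, toy_relevant)      # first hit at rank 2
toy_p5 = precision_at_k(toy_ranked, toy_relevant, 5)    # 2 of the top 5
toy_rprec = r_precision(toy_ranked, toy_relevant)       # 1 of the top 2
```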
We can see that the discriminative model performs at least as well as Base across all the feature configurations on all measures. With the full set of features (i.e., R4 vs. Base), all the differences are statistically significant by a two-tailed Student's t-test at the 0.95 confidence level. For R1 vs. Base, although the differences are not significant, the discriminative model matches or outperforms the Base method on every evaluation metric (the only tie is P10 on CERC).

Since all the features are normalized, the weight associated with each feature reflects, to some degree, the importance of the feature. Table 5 reports the top 3 features with the largest weights among the document evidence features f_i and the document-candidate association features g_i, respectively, in the learned AMD model. These features are ordered alphabetically in the table since their weights are not very distinct from each other. We find that the features listed for the two testbeds are generally different, with the exception of f1 and f2, which shows the importance of these two features across the corpora. An interesting observation is that the g8 feature (EMAIL CC), when used on W3C, has a large weight among the document-candidate association features. This is intuitive in the sense that a person in the email cc field is likely an authority on the topics of the email, which is also consistent with what was reported in [5]. Another observation is that the Proximity features have large weights on both testbeds (i.e., g13 for W3C and g11 for CERC), but with different window sizes: the larger size is on W3C. This may come from the fact that these two collections have very different average document lengths.

Table 5: The top 3 features with the largest weights in AMD (R4) learned from training data

        Doc evidence    Doc-candidate associations
W3C     f1, f2, f6      g1, g8, g13
CERC    f1, f2, f5      g4, g5, g11

6.2 The Effect of the Size of Retrieved Documents

Similar to Model 2, the learned discriminative model can be efficiently used on top of an existing document search engine as follows: (1) perform a standard document retrieval run using the topic as a query and retrieve the top M documents; (2) for each candidate associated with the retrieved documents, calculate the probability of relevance using Eq. 6 on these M documents. In this section, we investigate the effect of the number of documents retrieved on the performance of the discriminative model. We use LM as the document retrieval run.

Figure 1: Impact of varying the number of documents retrieved (M) on the discriminative model. Top: impact on W3C; Bottom: impact on CERC. Each panel plots MAP against the number of documents retrieved (logarithmic x-axis) for Base and AMD.

Figure 1 shows the MAP results obtained by varying M on the two test collections. Note that the scales on the x-axis and y-axis differ per plot. From the figure, we can see that as M increases, the discriminative model follows a trend similar to the baseline: increasing, reaching a maximum, and then flattening. On W3C, the MAP value peaks at about 300 retrieved documents, fewer than the baseline needs (i.e., 400). For CERC, both models need around 50 documents for best performance. Therefore, using a subset of documents could speed up the process of expert search, as the best performers use far fewer documents than the whole set of retrieved documents. At the same time, the retrieval performance can be improved, although the differences are not found to be statistically significant.
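The two-step procedure above can be illustrated with a small sketch. Eq. 6 is not reproduced in this section, so the scoring form below is an assumption made only for illustration: each retrieved document contributes the product of a logistic document-relevance probability (from the f features) and a logistic association probability (from the g features), averaged over the top M documents with a uniform document prior, and a document contributes nothing to candidates it is not associated with. All names (`rank_candidates`, `alpha`, `beta`) and the toy data are hypothetical.

```python
import math
from collections import defaultdict


def sigmoid(x):
    """Logistic function mapping a real-valued score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def dot(weights, features):
    """Inner product of a weight vector and a feature vector."""
    return sum(w * v for w, v in zip(weights, features))


def rank_candidates(retrieved, associations, alpha, beta, m):
    """Score candidates over the top-m retrieved documents.

    retrieved:    list of (doc_id, f_vector), best document first
    associations: dict doc_id -> {candidate: g_vector}
    alpha, beta:  learned weights for the f and g features (assumed given)
    """
    top = retrieved[:m]                      # step 1: keep the top-m documents
    if not top:
        return []
    scores = defaultdict(float)
    for doc_id, f_vec in top:                # step 2: accumulate per candidate
        p_doc = sigmoid(dot(alpha, f_vec))
        for candidate, g_vec in associations.get(doc_id, {}).items():
            scores[candidate] += p_doc * sigmoid(dot(beta, g_vec))
    ranked = [(c, s / len(top)) for c, s in scores.items()]
    ranked.sort(key=lambda item: (-item[1], item[0]))
    return ranked


# Toy data: two retrieved documents, two candidates.
retrieved = [("d1", [1.0]), ("d2", [0.0])]
associations = {"d1": {"alice": [1.0]}, "d2": {"alice": [0.0], "bob": [2.0]}}
print(rank_candidates(retrieved, associations, [1.0], [1.0], m=2))
```

Because only the top M documents are visited, the cost of step 2 grows with M rather than with the collection size, which is why a small M can speed up expert search.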
6.3 Experiments Using Different Document Retrieval Methods

As shown in Section 6.1 as well as in prior work, the document retrieval score f1 is an important feature of document evidence for expert search. In this experiment, we assess the extent to which the performance of the discriminative model is affected by the choice of the underlying document retrieval model. Besides LM, two other document retrieval methods are used (i.e., BM25 [28] and Indri [33]). Specifically, the f1 feature is replaced by each of these two retrieval scores in turn in the R4 configuration. Table 6 shows the results of the proposed model across the three retrieval models. From the table, we can see that the results are quite similar and they are all significantly better than the baseline. This indicates that the discriminative model is robust to the underlying document retrieval method.

Table 6: Evaluation of AMD with different document retrieval methods on W3C and CERC

W3C     MAP       R-Prec    MRR       P5        P10
LM      0.2598    0.3035    0.6196    0.4130    0.3680
BM25    0.2658    0.3141    0.6238    0.4060    0.3700
Indri   0.2562    0.3066    0.6149    0.4090    0.3640

CERC    MAP       R-Prec    MRR       P5        P10
LM      0.4604    0.3938    0.6143    0.2520    0.1770
BM25    0.4551    0.3895    0.5877    0.2470    0.1740
Indri   0.4667    0.4086    0.6000    0.2550    0.1780

6.4 The Alternative Discriminative Model vs. Base and AMD

In this section, we conduct an experiment to evaluate the alternative discriminative model (GMD). The aim is to investigate the robustness of the proposed discriminative framework with respect to the choice of specific discriminative models derived from the framework. Table 7 contains the results.

Table 7: Comparison of the geometric mean discriminative model (GMD) with Base and AMD (R4) on W3C and CERC. Statistical significance of GMD against Base is assessed at the 0.95 confidence level.

W3C     MAP       R-Prec    MRR       P5        P10
Base    0.1909    0.2445    0.5081    0.3760    0.3120
AMD     0.2598    0.3035    0.6196    0.4130    0.3680
GMD     0.2512    0.3010    0.6266    0.4110    0.3640

CERC    MAP       R-Prec    MRR       P5        P10
Base    0.4039    0.3514    0.5389    0.2240    0.1540
AMD     0.4604    0.3938    0.6143    0.2520    0.1770
GMD     0.4669    0.4030    0.6274    0.2500    0.1790
From the table, we can see that all the results achieved by GMD significantly outperform the baseline. Furthermore, these results are quite similar to those achieved by the AMD (R4) model. In particular, the GMD model is generally better than AMD on CERC and generally worse on W3C, but the differences between GMD and AMD are not statistically significant. These results demonstrate that the proposed discriminative framework generates accurate and robust results with both types of discriminative models.
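To give some intuition for why the two models can rank candidates differently, the sketch below contrasts averaging and multiplying per-document probabilities (cf. [35]). It assumes, as the model names suggest, that AMD and GMD differ in combining per-document evidence by an arithmetic versus a geometric mean; the exact model definitions are given earlier in the paper, and the probabilities used here are made-up toy values, not experimental data.

```python
import math


def arithmetic_mean(probs):
    """Average of per-document probabilities (averaging combination)."""
    return sum(probs) / len(probs)


def geometric_mean(probs):
    """Geometric mean, computed in log space; requires all probs > 0."""
    if any(p <= 0.0 for p in probs):
        raise ValueError("geometric mean needs strictly positive values")
    return math.exp(sum(math.log(p) for p in probs) / len(probs))


# Hypothetical per-document relevance probabilities for two candidates.
candidate_a = [0.9, 0.1]    # one strong and one weak supporting document
candidate_b = [0.45, 0.45]  # two moderately supporting documents

print(arithmetic_mean(candidate_a), arithmetic_mean(candidate_b))
print(geometric_mean(candidate_a), geometric_mean(candidate_b))
```

In this toy case the arithmetic mean prefers the candidate with one strong document, while the geometric mean prefers the candidate with consistent support, because a single low probability pulls a product down sharply.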

7. CONCLUSIONS AND FUTURE WORK

In this work, we propose a discriminative learning framework and derive specific models for expert search. The main advantage of the proposed approaches is their ability to integrate a variety of document evidence and document-candidate association features. The evaluations on two TREC Enterprise track testbeds have shown the effectiveness and robustness of the proposed framework.

There are several possibilities to extend the research in this paper. We chose out-of-order training in the experiments because more training data are available in 2006 and 2008. It would be interesting to perform the in-order experiments (i.e., training on 2005 or 2007), which would allow fair comparisons with the TREC submitted runs. The relevance judgments in 2005 and 2007 also seem more likely to be obtainable in a real enterprise. In fact, lack of training data hinders the applicability of many discriminative models. On the other hand, generative models may be able to effectively utilize abundant unlabeled data. It is desirable to develop a hybrid of discriminative and generative models to obtain the best of both for expert search. In addition, in certain scenarios, pairwise comparisons between experts might be easier to collect than pointwise judgments for each expert. We will explore extending the proposed discriminative learning framework to handle this type of training data.

8. ACKNOWLEDGMENTS

We thank the anonymous reviewers for many valuable comments. This research was partially supported by a grant from the Indiana Economic Development Company, the NSF research grant IIS-0749462, and a grant from Purdue University. Any opinions, findings, conclusions, or recommendations expressed in this paper are the authors' and do not necessarily reflect those of the sponsor.

9. REFERENCES

[1] P. Bailey, N. Craswell, A. de Vries, and I. Soboroff. Overview of the TREC-2007 enterprise track. In TREC-15, 2007.
[2] K. Balog. Non-local evidence for expert finding. In CIKM, 2008.
[3] K. Balog, L. Azzopardi, and M. de Rijke. Formal models for expert finding in enterprise corpora. In SIGIR, 2006.
[4] K. Balog, L. Azzopardi, and M. de Rijke. A language modeling framework for expert finding. Information Processing & Management, 45(1):1-19, 2009.
[5] K. Balog and M. de Rijke. Finding experts and their details in e-mail corpora. In WWW, page 1036. ACM, 2006.
[6] K. Balog and M. de Rijke. Associating people and documents. In ECIR, 2008.
[7] K. Balog, I. Soboroff, P. Thomas, N. Craswell, A. de Vries, and P. Bailey. Overview of the TREC-2008 enterprise track. In TREC-16, 2008.
[8] Y. Cao, J. Liu, S. Bao, and H. Li. Research on expert search at enterprise track of TREC 2005. In TREC-13, 2005.
[9] P. Carlile. Working knowledge: how organizations manage what they know. Human Resource Planning, 21(4):58-60, 1998.
[10] H. Chen, H. Shen, J. Xiong, S. Tan, and X. Cheng. Social network structure behind the mailing lists: ICT-IIIS at TREC 2006 expert finding track. In TREC-14, 2006.
[11] W. Cooper. Exploiting the maximum entropy principle to increase retrieval effectiveness. JASIST, 34(1):31-39.
[12] N. Craswell, A. de Vries, and I. Soboroff. Overview of the TREC-2005 enterprise track. In TREC-13, 2005.
[13] J. Dennis and R. Schnabel. Numerical Methods for Unconstrained Optimization and Nonlinear Equations. Society for Industrial Mathematics, 1996.
[14] H. Fang and C. Zhai. Probabilistic models for expert finding. In ECIR, 2007.
[15] Y. Fang, L. Si, and A. Mathur. Ranking experts with discriminative probabilistic models. In SIGIR Workshop on Learning to Rank for Information Retrieval, 2009.
[16] Y. Fu, W. Yu, Y. Li, Y. Liu, M. Zhang, and S. Ma. THUIR at TREC 2005: Enterprise track. In TREC-14, 2006.
[17] N. Fuhr. Probabilistic models in information retrieval. The Computer Journal, 35(3):243, 1992.
[18] T. Liu. Learning to rank for information retrieval. Foundations and Trends in Information Retrieval, 3(3):225-331, 2009.
[19] T. Liu, J. Xu, T. Qin, W. Xiong, and H. Li. LETOR: Benchmark dataset for research on learning to rank for information retrieval. In SIGIR Workshop on Learning to Rank for Information Retrieval, 2007.
[20] C. Macdonald, D. Hannah, and I. Ounis. High quality expertise evidence for expert search. In ECIR, 2008.
[21] C. Macdonald and I. Ounis. Voting for candidates: adapting data fusion techniques for an expert search task. In CIKM, 2006.
[22] D. Metzler and W. Bruce Croft. Linear feature-based models for information retrieval. Information Retrieval, 10(3):257-274, 2007.
[23] R. Nallapati. Discriminative models for information retrieval. In SIGIR, 2004.
[24] A. Ng and M. Jordan. On discriminative vs. generative classifiers: a comparison of logistic regression and naive Bayes. NIPS, 2002.
[25] D. Petkova and W. Croft. Proximity-based document representation for named entity retrieval. In CIKM, 2007.
[26] S. Robertson. The probability ranking principle in IR. Journal of Documentation, 33(4):294-304, 1977.
[27] S. Robertson and K. Jones. Relevance weighting of search terms. JASIST, 27(3):129-146, 1976.
[28] S. Robertson, S. Walker, S. Jones, M. Hancock-Beaulieu, and M. Gatford. Okapi at TREC-4. In TREC-4, 1996.
[29] P. Serdyukov and D. Hiemstra. Being omnipresent to be almighty: The importance of the global web evidence for organizational expert finding. In SIGIR Workshop on Future Challenges in Expertise Retrieval, 2008.
[30] P. Serdyukov and D. Hiemstra. Modeling documents as mixtures of persons for expert finding. In ECIR, 2008.
[31] P. Serdyukov, H. Rode, and D. Hiemstra. Modeling multi-step relevance propagation for expert finding. In CIKM, 2008.
[32] I. Soboroff, A. de Vries, and N. Craswell. Overview of the TREC-2006 enterprise track. In TREC-14, 2006.
[33] T. Strohman, D. Metzler, H. Turtle, and W. Croft. Indri: A language model-based search engine for complex queries. In International Conference on Intelligence Analysis, 2004.
[34] J. Tang, J. Zhang, L. Yao, J. Li, L. Zhang, and Z. Su. ArnetMiner: Extraction and mining of academic social networks. In SIGKDD, 2008.
[35] D. Tax, M. Van Breukelen, R. Duin, and J. Kittler. Combining multiple classifiers by averaging or by multiplying? Pattern Recognition, 33(9):1475-1485, 2000.
[36] D. Yimam-Seid and A. Kobsa. Expert finding systems for organizations. Sharing Expertise: Beyond Knowledge Management, 2003.
[37] C. Zhai and J. Lafferty. A study of smoothing methods for language models applied to information retrieval. TOIS, 22(2):214, 2004.
[38] J. Zhu, X. Huang, D. Song, and S. Ruger. Integrating multiple document features in language models for expert finding. Knowledge and Information Systems, pages 1-26.