Latent Credibility Analysis

Jeff Pasternack
Facebook, Inc.
1601 Willow Road
Menlo Park, California 94025
jeffp@fb.com

Dan Roth
University of Illinois, Urbana-Champaign
201 North Goodwin
Champaign, Illinois 61801
danr@illinois.edu

ABSTRACT
A frequent problem when dealing with data gathered from multiple sources on the web (ranging from booksellers to Wikipedia pages to stock analyst predictions) is that these sources disagree, and we must decide which of their (often mutually exclusive) claims we should accept. Current state-of-the-art information credibility algorithms, known as fact-finders, are transitive voting systems with rules specifying how votes iteratively flow from sources to claims and then back to sources. While this is quite tractable and often effective, fact-finders also suffer from substantial limitations; in particular, a lack of transparency obfuscates their credibility decisions and makes them difficult to adapt and analyze: knowing the mechanics of how votes are calculated does not readily tell us what those votes mean, and finding, for example, that a source has a score of 6 is not informative. We introduce a new approach to information credibility, Latent Credibility Analysis (LCA), constructing strongly principled, probabilistic models where the truth of each claim is a latent variable and the credibility of a source is captured by a set of model parameters. This gives LCA models clear semantics and modularity that make extending them to capture additional observed and latent credibility factors straightforward. Experiments over four real-world datasets demonstrate that LCA models can outperform the best fact-finders in both unsupervised and semi-supervised settings.

Categories and Subject Descriptors
H.3.3 [Information Systems]: Information Search and Retrieval - Information filtering; I.2 [Computing Methodologies]: Artificial Intelligence

General Terms
Algorithms, Experimentation, Measurement, Reliability

Keywords
Credibility, Graphical Models, Trust, Veracity

1. INTRODUCTION
Conflicts among information sources are commonplace: Twitter users debate the effects of healthcare reform, Wikipedia authors provide differing populations for the same city, online retailers offer discordant descriptions of the same products, financial analysts disagree on the future prices of securities, and medical blogs prescribe different courses of treatment. Consequently, we need a means of discerning which of the asserted claims are true, especially on the web, where three of our four experimental datasets (from current, real problems in information credibility) originate. Presently this is addressed by simple or weighted voting or, with more sophisticated fact-finder algorithms (e.g. [4, 18, 14]), transitive voting, but these methods tend to be ad hoc and difficult to analyze and extend. Latent Credibility Analysis is a new method of approaching the credibility problem by instead modeling the joint probability of the sources making claims and the unseen (latent) truth of those claims. Finding the probability that a particular claim is true is then performed via inference in a probabilistic graphical model, using one of the many extant exact and approximate inference algorithms.
Unlike those of fact-finders, the resulting credibility decisions and the parameters capturing the credibility of the sources are distributions and probabilities with clear semantics: for example, in the SimpleLCA model we reason that a claim is likely to be true because the probability that everyone who asserted it was lying (as given by the Honesty parameters of the sources) is relatively small. This transparency is important both when we need to explain the model's decisions to users (who might otherwise distrust the system itself) and when we adapt an LCA model to real-world problems; in our experiments, we are able to formulate reasonable priors and anticipate (to a degree) the most appropriate, best performing models by understanding the domain. Such clarity is a common trait of probabilistic models, but a substantial improvement over fact-finders, where the closest analog to priors is typically the number of votes each claim is initialized with; further, fact-finders in general have few, if any, other tunable parameters that can be adjusted, and where present (like the Investment fact-finder's growth rate value [13]) they tend to be both ad hoc and opaque: it is rarely possible to anticipate what values are suitable for a particular problem before evaluating them on labeled data. LCA models are also much simpler to modify on a more substantial level: there is a straightforward path from a generative story about why sources assert the claims that they do to the joint distribution, and augmenting this core (e.g. to incorporate the idea that observed attributes of the sources, like academic degrees, influence their credibility) is as simple as finding a product across several independent components.
Even in experiments ignorant of such factors and using the fact-finders' standard unsupervised setting, LCA models substantially outperform fact-finders in establishing the credibility of city populations, book authorship, stock predictions, and predictions of the Supreme Court of the United States. Perhaps surprisingly, this needn't come at an exorbitant cost: two of our models scale linearly, as fact-finders do, and the remaining two, while not linear time, nonetheless proved tractable even over relatively large (tens of thousands of sources and claims) datasets in our experiments. In the remainder of this paper we first provide a more detailed description of fact-finders. We subsequently discuss the fundamentals of LCA before introducing, in order of increasing sophistication, four specific LCA models: SimpleLCA, GuessLCA, MistakeLCA, and LieLCA, and then explore the performance of these models in comparison to fact-finders in our experiments.

2. BACKGROUND: FACT-FINDERS
A fact-finder takes as its input a list of assertions of the form "source s asserts claim c" and a list of disjoint mutual exclusion sets of claims m [14]. Exactly one of the claims in each mutual exclusion set is true, and this is what the fact-finder endeavors to identify. This is done via an iterative transitive voting system: starting from some initial belief score in all the claims, the algorithm calculates the trustworthiness of each information source (e.g. a Wikipedia editor, a financial analyst, a website, a classifier, etc.) based on the claims it makes, and then in turn calculates the belief in the claims based on the trustworthiness of the sources asserting them; this process then repeats for a fixed number of iterations or until convergence. Fact-finders are differentiated by their various update rules, whereby the trustworthiness of sources and the belief in claims are calculated. For example, the Sums fact-finder is derived from Hubs and Authorities [9], where source trustworthiness can be considered the hub score and claim belief the authority score; at each iteration i we calculate the trustworthiness of each source as the sum of the belief in its claims, T_i(s) = \sum_{c : s \rightarrow c} B_{i-1}(c), and then the belief score of each claim as the sum of the trustworthiness of the sources asserting it, B_i(c) = \sum_{s : s \rightarrow c} T_i(s). Of course, fact-finders can be considerably more complex and varied; in the Investment and PooledInvestment [13] algorithms, sources invest their credibility in the claims they make, and claim belief is then non-linearly grown and apportioned back to the sources based on the size of their investments.
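To make the transitive-voting template concrete, here is a minimal sketch of the Sums iteration in Python (the dictionary-based `assertions` input, the initial belief of 1, and the per-iteration rescaling are our illustrative choices, not details prescribed by [9] or [13]):

```python
def sums(assertions, iterations=20):
    """Transitive voting a la Sums / Hubs-and-Authorities: trust flows from claims to sources and back."""
    claims = {c for asserted in assertions.values() for c in asserted}
    belief = {c: 1.0 for c in claims}                     # initial belief B_0(c); an illustrative choice
    for _ in range(iterations):
        # T_i(s) = sum over the claims c asserted by s of B_{i-1}(c)
        trust = {s: sum(belief[c] for c in asserted) for s, asserted in assertions.items()}
        # B_i(c) = sum over the sources s asserting c of T_i(s)
        belief = {c: sum(t for s, t in trust.items() if c in assertions[s]) for c in claims}
        top = max(belief.values()) or 1.0                 # rescale to keep scores from growing without bound
        belief = {c: b / top for c, b in belief.items()}
    return belief
```

The claim with the highest belief within each mutual exclusion set would then be selected as true.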
Several fact-finders have probabilistic elements. TruthFinder [19] calculates claim belief as 1 - \prod_{s : s \rightarrow c} (1 - T(s)), with the idea that T(s) is the probability that s tells the truth, so the probability that a claim is wrong is the probability that all the (independent) sources are liars. However, these semantics are problematic: the pseudo-probabilities over all the claims in a mutual exclusion set will not sum to 1 and cannot be readily normalized, since the trustworthiness of a source is calculated as the arithmetic mean of the belief in the claims it makes. [17] explicitly seeks to create a fact-finder with an (approximate) Bayesian justification, but relies on substantial assumptions, the most important being that P(s \rightarrow c | True(c)) \approx P(s \rightarrow c), i.e. the probability a source asserts a claim is independent of the truth of that claim (which does not hold in practice). [21] is something of an anomaly, as it, like Latent Credibility Analysis, models the credibility problem as a graphical model (a Bayesian network), but it specializes in situations where the truth is a collection of entities (e.g. identifying all the authors of a book) and the model has the advantage of reasoning about these directly; other approaches (including LCA) instead simply treat these as binary claims ("is John Smith an author of the book or not?"). More importantly, the model makes an implicit assumption (as noted by the authors) that each source is predominately honest, which often does not hold in real data (e.g. vandalism in Wikipedia).

Additionally, some fact-finders have incorporated aspects beyond source trustworthiness and claim belief into their update rules. 3-Estimates [4] adds parameters to attempt to capture the difficulty of a claim, an idea also present in our LCA models. Fact-finders have also been applied to instances where the claims are not extracted in a prior step but rather snippets of textual evidence are effectively clustered using similarity metrics, as applied by the Apollo system to tweets [10] or to news articles by [16]. AccuVote [3] attempts to identify source dependence (one source copying another) to give greater credence to more independent sources, an aspect that is important in certain domains (e.g. blog postings, which are routinely derivative) and could be incorporated in future LCA models, although we do not consider it here. Finally, frameworks have been created capable of extending any fact-finder. [13] applies declarative prior knowledge (in the form of first-order logic) to fact-finders by using linear programming to constrain claim beliefs; in our experiments, we use this method in an extremely simple form to apply supervision to fact-finders (our constraints are of the type "claim c is true"), which are otherwise wholly unsupervised algorithms. For LCA models, declarative constraints may be enforced by one of several methods for constraining the posterior distributions of probabilistic models, such as Posterior Regularization [5] or Constraint Driven Learning [1]. Further, [14] introduces generalized fact-finders, which adapt the bipartite unweighted graphs of standard fact-finders to weighted, k-partite graphs, allowing such factors as source features (e.g. "source has a doctorate in a relevant field") and uncertainty in information extraction to be incorporated, essentially changing how votes flow throughout the network. LCA models naturally support these forms of prior knowledge and data in a principled way, as we will discuss shortly, and can incorporate many others (such as priors over the honesty of the sources and real-valued features) that generalized fact-finders cannot.

3. LATENT CREDIBILITY ANALYSIS
3.1 Fundamentals
A Latent Credibility Analysis model is a probabilistic model where the true claim c in each mutual exclusion set of claims m is a (multinomial) latent variable, y_m. An observed assertion is the probability of c as claimed by s, b_{s,c}, typically in {0, 1} (e.g. "John claims Obama was born in Hawaii"), but distributional claims are also possible (e.g. "John is 95% certain Obama was born in Hawaii and 5% certain he was born in Alaska"). Note that, for each source s and mutual exclusion set m, \sum_{c \in m} b_{s,c} = 1. Every source also has a [0, ∞) confidence in his assertions over the claims in m, w_{s,m}, again typically in {0, 1} (0 if the source makes no assertions about m, 1 if it does), but other values may be used to express degrees of confidence with straightforward semantics.
Notation | Description | Example / Definition
s | An information source | Amazon.com; Dan Rather
c | A claim | "President Barack Obama born in 1953"
m | A mutually exclusive (ME) set of claims | Claimed Birth Years of Barack Obama
y_m | The true claim in m | "President Barack Obama was born in 1961"
b_{s,c} | The (observed) probability of c asserted by s | 0; 1; 0.7
w_{s,m} | The [0, ∞) confidence of s in the distribution asserted over all c ∈ m | 0; 1; 4.5
H_s | The probability s makes an honest, accurate assertion | 0.4; 0.9
D_{g/m/s} | The probability s knows y_m (global, per-ME-set, or per-source) | 0.3; 0.7
S | Set of all sources | = {s}
C | Set of all claims | = {c}
M | Set of all mutual exclusion sets | = {m}
B | |S| × |C| matrix of all observed assertions | = {b_{s,c}}
W | |S| × |M| matrix of all assertion confidences | = {w_{s,m}}
Y | Set of all true claims | = {y_m : m ∈ M}
Y_U | Set of all latent true claims | Y_U ⊆ Y
Y_L | Set of all observed true claims (labels) | Y_L ⊆ Y
X | Set of all observations (including B) | = B ∪ {all other features}
θ | Set of all latent model parameters | e.g. {H_s : s ∈ S} ∪ {D_m : m ∈ M}

Table 1: LCA Notation

As can be seen from the joint distributions of our LCA models, a w_{s,m} of 0.5 causes assertions made by s about claims in m to affect the log-likelihood only half as much as those of a source with w_{s,m} = 1, and w_{s,m} = 2 is equivalent to making the same assertions twice. This can be useful if, for example, a source expresses abundant or reduced confidence in his assertions, e.g. "John is 50% confident that 'Obama was born in Hawaii with 95% probability...'", comparable in function and purpose to belief and plausibility in Dempster-Shafer theory [20, 15] and uncertainty in subjective logic [8, 7]. Since we are not interested in modeling why a source decides to make an assertion about the claims in a mutual exclusion set (and with what confidence), the confidence matrix W = {w_{s,m}} is taken as a given constant rather than an observation. Our observations are the assertion matrix B = {b_{s,c}}, together with whatever observed features (such as attributes of the sources) are relevant to the particular model; we will collectively refer to these observed variables as X. Similarly, we will refer to our latent variables as Y = {y_m}, and the model parameters (in the models we describe later these include the honesty of each source and the difficulty of identifying the true claims) as θ. Finally, when we write the joint probabilities, we assume all mutual exclusion sets contain at least two claims; this is a notational convenience, since any uncontested claim must be true (there is no alternative), the probability of a source asserting it is thus 1, and it does not affect the joint probability.

As an example, consider a problem with two mutual exclusion sets, m_p = "Obama's Birthplace" and m_d = "Obama's Birthdate", where we observe a source s_j = "John" make a single assertion c_h = "Obama was born in Hawaii". Then b_{s_j,c_h} = 1, b_{s_j,c} = 0 for all c ∈ m_p \ c_h, w_{s_j,m_p} = 1, and w_{s_j,m_d} = 0 (rendering the values of {b_{s_j,c} : c ∈ m_d} irrelevant). Latent variables y_{m_p} and y_{m_d} are Obama's true birthplace and birthdate, respectively, so y_{m_p} = "Hawaii" and y_{m_d} = "August 4th, 1961".

3.2 Inference
Information credibility problems can be classed as unsupervised or semi-supervised; in the unsupervised case, we are only given observations X and none of the y_m are known, so Y_U = Y and Y_L = ∅ (Y_U and Y_L are the sets of unlabeled [latent] and labeled [observed] true claims, respectively). Alternatively, when semi-supervised, we know the true claims in some mutual exclusion sets, Y_L ⊂ Y, already, and only need to determine the remaining Y_U = Y \ Y_L. In both cases, our goal is to infer:

P(Y_U | X, Y_L) = \frac{\int_\theta P(Y_U, Y_L, X | \theta) P(\theta)}{\int_\theta \sum_{Y_U} P(Y_U, Y_L, X | \theta) P(\theta)}

This is the distribution over the possible true claims for each mutual exclusion set where the true claim is not already known, given the observations and the true claims already identified.
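To make the notation concrete, the running example could be encoded as follows (a minimal sketch; the dictionary layout and the alternative claims shown are illustrative placeholders, not a prescribed representation):

```python
# One possible encoding of the observations for the birthplace/birthdate example:
# mutual exclusion sets map to their candidate claims, B holds b_{s,c},
# and W holds w_{s,m} (1 if the source asserts something about m, else 0).
mutex_sets = {
    "obama_birthplace": ["Hawaii", "Kenya"],               # alternative claims are hypothetical
    "obama_birthdate":  ["August 4th, 1961", "1953"],
}
B = {("John", "Hawaii"): 1.0, ("John", "Kenya"): 0.0}       # b_{s,c}
W = {("John", "obama_birthplace"): 1.0,                     # w_{s,m}
     ("John", "obama_birthdate"):  0.0}

def b(source, claim):
    # unasserted (source, claim) pairs default to 0
    return B.get((source, claim), 0.0)
```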
In our experiments we solve this approximately, by using EM [2] to find the maximum a posteriori (MAP) point estimate of the parameters, θ* = argmax_θ P(X | θ)P(θ), and then simply calculating:

P(Y_U | X, Y_L, θ*) = \frac{P(Y_U, X, Y_L | θ*)}{\sum_{Y_U} P(Y_U, X, Y_L | θ*)}

The expectation and maximization update rules used to find the maximum a posteriori point estimate θ* are:

Expectation Step: ∀m: P(y_m = c | X, θ^t) = \frac{P(y_m = c, X | θ^t)}{\sum_{v \in m} P(y_m = v, X | θ^t)}

Maximization Step: θ^{t+1} = argmax_θ E_{Y|X,θ^t}[\log(P(X, Y | θ)P(θ))]

In LCA models, the E-step is always easy, since the y_m values are independent given the observations X and the parameters θ^t at iteration t.
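In code, the overall procedure is a standard EM loop; the following sketch leaves the model-specific joint probability and M-step abstract (all helper names are ours, and the treatment of labeled sets is noted only in a comment):

```python
def e_step(mutex_sets, joint_per_claim, theta):
    """E-step: P(y_m = c | X, theta) by normalizing the per-claim joint within each mutual exclusion set."""
    posterior = {}
    for m, claims in mutex_sets.items():
        scores = {c: joint_per_claim(m, c, theta) for c in claims}   # P(y_m = c, X_m | theta)
        z = sum(scores.values()) or 1.0
        posterior[m] = {c: v / z for c, v in scores.items()}
        # (in the semi-supervised case, a labeled y_m would simply be clamped to its known value)
    return posterior

def em(mutex_sets, joint_per_claim, m_step, theta, iterations=50):
    """Alternate E- and M-steps to obtain a MAP point estimate theta* and the claim posteriors."""
    for _ in range(iterations):
        posterior = e_step(mutex_sets, joint_per_claim, theta)
        theta = m_step(posterior)   # argmax of the expected log-likelihood plus log prior
    return theta, e_step(mutex_sets, joint_per_claim, theta)
```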
The M-step can be more difficult: in SimpleLCA, θ^{t+1} can be calculated in closed form provided that P(θ) is uniform; otherwise, gradient ascent must be used. Where this can be done parameter-by-parameter, the time required for the M-step scales linearly in the number of parameters; in MistakeLCA and LieLCA, joint gradient ascent requires a number of steps increasing linearly in the number of dimensions [12] (since the Lipschitz constant and squared diameter both increase linearly), while the cost to compute the gradient and the function value also increases linearly (provided the number of assertions per source and claims per mutual exclusion set remain constant), yielding O(|θ|^2) complexity. However, even on our largest experiments, MistakeLCA and LieLCA took no more than 200 times as long as SimpleLCA and GuessLCA, far less than suggested by this worst-case quadratic bound. Exact runtimes varied, but for concreteness LieLCA took approximately 20 minutes on the population dataset, 30 minutes on the stock dataset (per time interval), and from 25-80 minutes on the books dataset (single-threaded on a 3GHz Core 2 Duo E8400); by comparison, GuessLCA was 40 seconds, one minute, and 3-4 minutes, respectively.

[Figure 1: A plate diagram of a basic SimpleLCA model with observed assertions as the sole features (X = B).]

3.3 SimpleLCA
SimpleLCA, as with all our models, is a joint distribution that reflects a story of how sources decide which claims to assert. For both this and subsequent LCA models, we assume that each b_{s,c} ∈ {0, 1} and each w_{s,m} ∈ {0, 1}; this matches our experimental domains (where sources assert a single claim in a mutual exclusion set with full certainty) and simplifies the equations for the joint by avoiding a cumbersome normalization factor. If these assumptions are relaxed, the joint distributions as written will no longer be distributions and must be normalized. In SimpleLCA, each source s has a probability of being honest, H_s. A source then decides to assert the true claim c in mutual exclusion set m with probability H_s; otherwise, it chooses uniformly at random from the other claims in m with probability 1 - H_s. From this intuitive idea, we can immediately derive a joint distribution over y_m and X_m:

P(y_m, X_m | H) = P(y_m) \prod_s \left( H_s^{b_{s,y_m}} \prod_{c \in m \setminus y_m} \left( \frac{1 - H_s}{|m| - 1} \right)^{b_{s,c}} \right)^{w_{s,m}}
= P(y_m) \prod_s \left( H_s^{b_{s,y_m}} \left( \frac{1 - H_s}{|m| - 1} \right)^{1 - b_{s,y_m}} \right)^{w_{s,m}}

Here, P(y_m) is our prior probability of y_m being the true claim in m, and w_{s,m} will be 1 if the source asserts (with full certainty) a claim in m, or 0 if the source says nothing about m. In the second equation we have simplified the expression by noting that \sum_{c \in m} b_{s,c} = 1, so \sum_{c \in m \setminus y_m} b_{s,c} = 1 - b_{s,y_m}. Observing that all sources make their assertions independently and taking θ = {H_s}, we can write the full joint as:

P(Y, X | θ) = \prod_m P(y_m) \prod_s \left( H_s^{b_{s,y_m}} \left( \frac{1 - H_s}{|m| - 1} \right)^{1 - b_{s,y_m}} \right)^{w_{s,m}}

The expected log-likelihood maximized in the M-step is then

E_{Y|X,θ^t}[\log(P(X, Y | θ)P(θ))]
= \log P(θ) + \sum_Y P(Y | X, θ^t) \log \prod_m P(y_m) \prod_s \left( H_s^{b_{s,y_m}} \left( \frac{1 - H_s}{|m| - 1} \right)^{1 - b_{s,y_m}} \right)^{w_{s,m}}
= \log P(θ) + \sum_m \sum_{y_m} P(y_m | X, θ^t) \left( \log P(y_m) + \sum_s w_{s,m} \left( b_{s,y_m} \log H_s + (1 - b_{s,y_m}) \log \frac{1 - H_s}{|m| - 1} \right) \right)

Finding the derivative with respect to each H_s ∈ θ,

\frac{\delta}{\delta H_s} E_{Y|X,θ^t}[\log(P(X, Y | θ)P(θ))] = P(H_s)^{-1} \frac{\delta P(H_s)}{\delta H_s} + \sum_m \sum_{y_m} \frac{P(y_m | X, θ^t)\, w_{s,m} (b_{s,y_m} - H_s)}{H_s - (H_s)^2}

Now we can maximize each H_s independently in our M-step using gradient ascent to find the new, maximizing θ^{t+1}. However, when the priors P(H_s) are uniform (so \frac{\delta P(H_s)}{\delta H_s} = 0), the gradient simplifies, allowing us to set it to 0 and solve the resulting equation explicitly for the new maximizing value of H_s at the stationary point:

H_s = \frac{\sum_m \sum_{y_m} P(y_m | X, θ^t)\, w_{s,m}\, b_{s,y_m}}{\sum_m w_{s,m}}

As we would intuitively expect, we thus estimate the honesty of a source, that is, the probability that it provides the true claim, as essentially the expected proportion of true claims made by the source given our current parameters.
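As an illustration, a minimal SimpleLCA rendering of the per-claim joint and the closed-form M-step under uniform priors might look like the following; it plugs into the EM sketch above, and is our reading of the equations rather than a released implementation:

```python
def simple_lca_joint(m, c, H, mutex_sets, B, W, sources, claim_prior):
    """P(y_m = c, X_m | H) for SimpleLCA with binary b_{s,c} and w_{s,m}."""
    p = claim_prior(m, c)
    k = len(mutex_sets[m])                         # |m|
    for s in sources:
        if W.get((s, m), 0.0) == 0.0:              # source says nothing about m
            continue
        if B.get((s, c), 0.0) == 1.0:              # s asserted c (b_{s,c} = 1 under hypothesis y_m = c)
            p *= H[s]
        else:                                      # s asserted some other claim in m
            p *= (1.0 - H[s]) / (k - 1)
    return p

def simple_lca_m_step(posterior, W, B, mutex_sets, sources):
    """Closed-form update under uniform P(H_s): expected fraction of true claims per source."""
    H = {}
    for s in sources:
        num, den = 0.0, 0.0
        for m, claims in mutex_sets.items():
            w = W.get((s, m), 0.0)
            if w == 0.0:
                continue
            den += w
            num += w * sum(posterior[m][c] * B.get((s, c), 0.0) for c in claims)
        H[s] = num / den if den > 0 else 0.5       # default for sources with no assertions; an assumption
    return H

# Plugged into the EM sketch above, e.g. (initial H_s = 0.8 is an arbitrary starting point):
# joint = lambda m, c, H: simple_lca_joint(m, c, H, mutex_sets, B, W, sources, claim_prior)
# m_step = lambda post: simple_lca_m_step(post, W, B, mutex_sets, sources)
# H_star, posterior = em(mutex_sets, joint, m_step, {s: 0.8 for s in sources})
```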
This closed-form update rule also means that SimpleLCA with uniform honesty priors is as fast as fact-finders in practice, making it extremely scalable. When alternative priors are used, gradient ascent requires about twice as much time per EM iteration, but even on our largest datasets this was a matter of seconds.

3.4 GuessLCA
SimpleLCA is indeed quite simple. But it is also clear that, for sources, identifying the truth in some mutual exclusion sets is much harder than in others; for example, a source who merely guessed randomly would be assigned an honesty of 0.5 by SimpleLCA if it only made claims in mutual exclusion sets of size 2, and 0.25 if size 4. In GuessLCA, a source has a probability of knowing and telling the truth, H_s. Thus, with probability H_s, it asserts the true claim. However, with probability 1 - H_s, it guesses claim c with probability P_g(c) (where \sum_{c \in m} P_g(c) = 1). This gives us the joint probability:

P(X, Y | θ) = \prod_m P(y_m) \prod_s (H_s + (1 - H_s)P_g(y_m))^{b_{s,y_m} w_{s,m}} \prod_{c \in m \setminus y_m} ((1 - H_s)P_g(c))^{b_{s,c} w_{s,m}}

This joint can be easily understood by considering the marginal cases for each s and m; the probability that the source asserts the true claim (b_{s,y_m} = 1) is then just H_s + (1 - H_s)P_g(y_m), the probability of knowing the truth plus the chance of not knowing the truth and (fortunately) guessing it; since \sum_{c \in m} b_{s,c} = 1 implies \sum_{c \neq y_m} b_{s,c} = 0, the product \prod_{c \in m \setminus y_m} (\cdot)^{b_{s,c}} = 1 is moot. Conversely, the probability of asserting an untrue claim (b_{s,c \neq y_m} = 1) can be similarly found as the probability of not knowing the truth and guessing c, (1 - H_s)P_g(c). Omitting the intermediate steps for brevity, we find that the gradient of the expected log-likelihood with respect to H_s simplifies to

\frac{\delta}{\delta H_s} E_{Y|X,θ^t}[\log(P(X, Y | θ)P(θ))] = P(H_s)^{-1} \frac{\delta P(H_s)}{\delta H_s} + \sum_m \sum_{y_m} P(y_m | X, θ^t)\, w_{s,m} \left( \frac{b_{s,y_m}(1 - P_g(y_m))}{H_s + (1 - H_s)P_g(y_m)} - \frac{1 - b_{s,y_m}}{1 - H_s} \right)

As in SimpleLCA, the gradient with respect to each H_s is independent of the other parameters θ \ H_s, allowing us to maximize the expected log-likelihood in the M-step using gradient ascent parameter-by-parameter, which is very fast in practice.

The guess distribution P_g(c) is provided to the model as a prior; we could, for example, set P_g(c) to the distribution of sources asserting the claims in m, under the assumption that a guessing source chooses randomly according to the distribution of votes it observes at the time. This mitigates sources becoming trusted by asserting obvious or well-known claims: the assumed probability of guessing these will then be high (because a large majority of sources already assert them, and we assume that guessers tend to go with the crowd), so the model is free to set H_s low as the observations can be effectively explained away by (1 - H_s)P_g(y_m); conversely, a source asserting a true claim with a low probability of being guessed will be attributed a high H_s. GuessLCA thus rewards getting hard claims right and penalizes getting easy claims wrong. GuessLCA does require that this difficulty information be provided a priori rather than learned by the model, and while in most domains the distribution of guesses is easy to approximate (e.g. if the sources tend to guess with the crowd, probably the most prevalent behavior in practice, we can use the distribution of the number of assertions made by other sources for each alternative within the mutual exclusion set, and if the sources are believed to guess randomly we use a uniform prior over the possibilities), this cannot capture the latent difficulty implied by, for example, the disagreement of two highly honest sources (since honesty is itself latent). More significantly, the model assumes that no source will do worse than guessing: even if H_s = 0, a source still has a P_g(c) probability of guessing the correct claim c. This assumption is violated when sources are systematically wrong.
This may be due to intentional deception, or, more commonly, a recurring mistake: for example, there are multiple ways of defining the population of a city (metro area, city limits, etc.) and some Wikipedia editors consistently use definitions that disagree with the truth (census data).

3.5 MistakeLCA
To overcome these problems, MistakeLCA models difficulty explicitly, as the probability of an honest source making a mistake. For a source to assert the true claim it must both intend to tell the truth with probability H_s and must know what the truth is with probability D. D may be global (in which case all sources have probability D_g of knowing the truth across all mutual exclusion sets) or tied to each mutual exclusion set (in which case sources have probability D_m of knowing the truth in a particular mutual exclusion set m); this results in two variants of the model, which we will refer to as MistakeLCA_g and MistakeLCA_m. A source thus asserts the true claim c with probability H_s D, but otherwise, with probability 1 - H_s D, chooses another claim c' ∈ m \ c according to P_e(c' | c, m). Recall that, in GuessLCA, our guessing probability P_g was not conditioned on the true claim, but P_e specifies the distribution of mistakes a source will make given that c is true, with P_e(c | c, m) = 0. Like P_g, P_e is provided as a prior, but conditioning on the true claim means that it can also encode very useful information about similar or easily confused claims; for example, if there are three claims about a person's age, 35, 45, and 46, P_e(45 | 46, m) and P_e(46 | 45, m) would both be high. The joint probability is given by (with D standing for D_g or D_m as appropriate):

P(X, Y | θ) = \prod_m P(y_m) \prod_s (H_s D)^{b_{s,y_m} w_{s,m}} \prod_{c \in m \setminus y_m} (P_e(c | y_m, m)(1 - H_s D))^{b_{s,c} w_{s,m}}
The gradients of the expected log-likelihood are given by:

\frac{\delta(\ldots)}{\delta H_s} = P(H_s)^{-1} \frac{\delta P(H_s)}{\delta H_s} + \sum_m \sum_{y_m} P(y_m | X, θ^t)\, w_{s,m} \left( \frac{b_{s,y_m} - H_s D_m}{H_s - H_s^2 D_m} \right)

\frac{\delta(\ldots)}{\delta D_m} = P(D_m)^{-1} \frac{\delta P(D_m)}{\delta D_m} + \sum_s \sum_{y_m} P(y_m | X, θ^t)\, w_{s,m} \left( \frac{b_{s,y_m} - H_s D_m}{D_m - D_m^2 H_s} \right)

The gradient for D_g is identical, except that we sum over all mutual exclusion sets as well as all sources. Since all the H_s are linked by D, we must optimize all parameters jointly in the M-step.

3.6 LieLCA
MistakeLCA makes no distinction between intentional lies caused by a lack of honesty and honest mistakes that occur with probability (1 - D); we can imagine that the former case is governed by a distribution over possible lies, whereas the latter results in guessing. In LieLCA, a source asserts the true claim c if it is both honest and knows the answer (with probability H_s D). A dishonest source who knows the truth, however, chooses a lie c' with probability (1 - H_s)D P_l(c' | c, m), where P_l is the distribution over possible lies given the truth (P_l(c | c, m) = 0). Finally, any source who does not know the truth guesses a claim c' with probability (1 - D)P_g(c'). The D parameters may be per-source, per-mutual exclusion set, or global, resulting in LieLCA_s, LieLCA_m, and LieLCA_g variants. The joint probability is thus:

P(X, Y | θ) = \prod_m P(y_m) \prod_s (H_s D + (1 - D)P_g(y_m))^{b_{s,y_m} w_{s,m}} \prod_{c \in m \setminus y_m} ((1 - H_s)D P_l(c | y_m, m) + (1 - D)P_g(c))^{b_{s,c} w_{s,m}}

The gradients of the expected log-likelihood with respect to H_s and D can be found as:

\frac{\delta(\ldots)}{\delta H_s} = P(H_s)^{-1} \frac{\delta P(H_s)}{\delta H_s} + \sum_m \sum_{y_m} P(y_m | X, θ^t)\, w_{s,m} \left( \frac{b_{s,y_m} D}{(1 - D)P_g(y_m) + D H_s} - \sum_{c \in m \setminus y_m} \frac{b_{s,c} D P_l(c | y_m, m)}{(1 - D)P_g(c) + D(1 - H_s)P_l(c | y_m, m)} \right)

\frac{\delta(\ldots)}{\delta D_g} = P(D_g)^{-1} \frac{\delta P(D_g)}{\delta D_g} + \sum_s \sum_m \sum_{y_m} P(y_m | X, θ^t)\, w_{s,m} \left( \frac{b_{s,y_m}(H_s - P_g(y_m))}{D_g H_s + (1 - D_g)P_g(y_m)} + \sum_{c \in m \setminus y_m} \frac{b_{s,c}((1 - H_s)P_l(c | y_m, m) - P_g(c))}{(1 - D_g)P_g(c) + D_g(1 - H_s)P_l(c | y_m, m)} \right)

Again, the gradients for D_s and D_m are identical, except that the sum over both sources and mutual exclusion sets is replaced by a sum over mutual exclusion sets and a sum over sources, respectively. It is interesting to note that LieLCA_s is a special case, since each pair of (H_s, D_s) parameters may be optimized independently of the others, with the same linearly scaling complexity as SimpleLCA and GuessLCA; otherwise, as with MistakeLCA, the parameters must be optimized jointly.

It is important to note that we are abusing language somewhat here; in LieLCA, a "lie" is an intentional, incorrect assertion by a source who knows the truth, but it need not imply malice or an intent to deceive. A Wikipedia editor who (perhaps out of ignorance) accurately lists the populations of cities by their greater metro area rather than by their city limits, when the latter is held to be the true measure, would not normally be considered a liar, even though the model considers their assertions to be lies (and in this particular case those lies may be quite informative, since we know they will be drawn from values strictly greater than the true population, such that P_l(c' | c, m) > 0 iff c' > c).

3.7 Discussion
3.7.1 Model Complexity and Semantics
Given that we have presented a series of increasingly complex models, it might be tempting to think of these hierarchically along the lines of SimpleLCA ⊂ GuessLCA ⊂ MistakeLCA ⊂ LieLCA. However, this is incorrect: it is easy to see that there are some worlds that SimpleLCA can model (a source with an honesty of 0 who always asserts the wrong claim) that, for example, GuessLCA cannot (at worst a source will still sometimes guess the truth). We can similarly observe that the H_s parameters have subtly different meanings in each model: in SimpleLCA, it is simply the probability that a source asserts the correct claim; in GuessLCA, it is the probability that it both knows and asserts it; and in MistakeLCA and LieLCA, it is the probability the source intends to tell the truth.
Such distinctions are of practical importance: because each model tells a different story with different semantics, we should not expect, for instance, that the more sophisticated LieLCA will necessarily outperform the SimpleLCA model given sufficient data (as we might if SimpleLCA were indeed subsumed by LieLCA); rather, we expect that relative performance will depend on which model more closely reflects the actual behavior of sources within a particular domain. That said, our experiments showed that, indeed, some models appear to be more plausible than others, and the more complex models are vulnerable to overfitting: in particular, GuessLCA performs substantially better than SimpleLCA overall and is competitive with MistakeLCA and LieLCA, especially where these models overfit (e.g. on the stock dataset).

3.7.2 Extensions
A key benefit of LCA is its flexibility and transparency relative to fact-finders. Bayesian priors over the parameters, claims, and other phenomena (such as the mistake distribution, P_e) provide a straightforward way of encoding domain knowledge, but many extensions are also possible. The modularity of LCA can be illustrated by an example: consider a case where we have features X_f (such as the quality of a source's website, his academic degrees, years of experience, etc.) associated with the credibility of our sources.
By assuming that these features are independent from the sources' assertions given their credibility, we can create a new model by simply concatenating two joint distributions: P(X, Y | θ) = P_LCA(X_b, Y | θ) P_f(X_f | θ), where P_LCA(X_b, Y | θ) is an LCA model over observed assertions X_b and P_f(X_f | θ) is the probability of observing features X_f given the credibility of the sources (captured by parameters θ). Additionally, LCA models (and fact-finders) will normally only give credibility to claims that are known to exist and asserted by at least one source (an unknown alternative obviously cannot be explicitly considered in the set of possibilities, and the model infers a distribution over the possible values of y_m). However, we can easily create a new "none of the above" claim u and assign it a prior probability P(u); believing one of the known, asserted claims will then depend on the evidence outweighing our prior inclination toward doubt.

4. EXPERIMENTS
We evaluate our models on two unsupervised datasets, book authorship [19] and city population [13], and two semi-supervised datasets, stock prediction and U.S. Supreme Court decision prediction.¹ Our evaluation compares our four basic LCA models with several top-performing fact-finders found in the literature: TruthFinder [19], Investment, PooledInvestment, and Average-Log [13], Sums [9], and 3-Estimates [4], as well as simple voting (choose the claim with the most sources asserting it). For Investment and PooledInvestment we used the same values for g as [13], 1.2 and 1.4, respectively. We run both the fact-finders and EM (for LCA) until convergence (within 50 iterations in our experiments). Additionally, we supplement our real-world experiments with synthetic data sampled from SimpleLCA joint distributions to more carefully analyze the relative performance of the LCA models in a controlled context.

¹ The Supreme Court, city population, and book authorship datasets are available at http://lotho.cs.illinois.edu/data/. Unfortunately, we are unable to release the stock prediction data due to licensing restrictions.

4.1 Books
The books dataset [19] is a collection of 14,287 claims of the authorship of various books by 894 websites, with an evaluation set of 605 true claims collected by examining the book covers. We used uniform priors for the parameters, P(θ). For the claim priors P(c) and guess priors P_g(c) we used "voted" priors corresponding to the distribution of sources asserting each claim relative to the number of sources asserting any claim within the mutual exclusion set: |{t : w_{t,m} = b_{t,c} = 1}| / \sum_{v \in m} |{t : w_{t,m} = b_{t,v} = 1}|. Finally, the mistake and lie priors were also voted, computed as P_e(c | c', m) = P_l(c | c', m) = |{t : w_{t,m} = b_{t,c} = 1}| / \sum_{v \in m \setminus c'} |{t : w_{t,m} = b_{t,v} = 1}| for c ≠ c'; this is the proportion of sources asserting c relative to the total number of sources asserting any claim in m other than the true claim c'. For simplicity, the distributions are the same for all sources. For LieLCA_s, LieLCA_m, and MistakeLCA_m, the D_s or D_m parameters in the model are much more variable than a single global D_g (which tends to be high), resulting in greater emphasis on the voted P_e priors and making voted claim priors P(c) effectively redundant; to correct this, we instead use uniform claim priors in these models.
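A small sketch of computing these voted priors from the assertion counts, using the toy data structures from Section 3 (the function and variable names are ours, not part of a reference implementation):

```python
def voted_priors(mutex_sets, B, W, sources):
    """Voted claim/guess priors and voted mistake/lie priors from assertion counts."""
    claim_prior, mistake_prior = {}, {}
    for m, claims in mutex_sets.items():
        # number of sources asserting each claim c in m
        n = {c: sum(1 for s in sources
                    if W.get((s, m), 0) == 1 and B.get((s, c), 0) == 1)
             for c in claims}
        total = sum(n.values())
        for c in claims:
            # used for both P(c) and P_g(c); uniform fallback for unasserted sets is an assumption
            claim_prior[(m, c)] = n[c] / total if total else 1.0 / len(claims)
        # P_e(c | c', m) = P_l(c | c', m): proportion among sources asserting anything other than c'
        for c_true in claims:
            rest = total - n[c_true]
            for c in claims:
                if c == c_true:
                    mistake_prior[(m, c, c_true)] = 0.0
                else:
                    mistake_prior[(m, c, c_true)] = n[c] / rest if rest else 1.0 / (len(claims) - 1)
    return claim_prior, mistake_prior
```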
The results are shown in Table 2; we calculate confidence intervals with the simplifying assumption that the predictions over each mutual exclusion set are independent from the others. The only fact-finder to do better than any of the LCA models is PooledInvestment, still more than 3% below LieLCA_s. The LieLCA_s generative story fits especially well with what we know about online booksellers a priori: some sources will consistently corrupt, abbreviate or omit author names (in other words, they consistently "lie" with a low H_s), while others guess by copying prevailing sources, since they tend not to research the information themselves (low D_s).

4.2 Population
The population dataset [13] contains 44,761 claims about the population of a city in a specific year made by 171,171 Wikipedia editors in infoboxes, with an evaluation set of 274 true claims identified from U.S. census data. Our evaluation set is marginally smaller than that of [13] because when an editor made multiple claims about the population of a city in the same year, we kept only the most recent edit and discarded the rest; this resulted in some true claims becoming uncontested and thus eliminated from the evaluation set. Our priors remained the same as before, except that the claim priors followed the distribution of the number of revisions a claim was present in, rather than the number of sources asserting it, as per [13]. Additionally, we noticed that some models could achieve better results if we knew exactly when to stop them prior to convergence (which is not possible given the unsupervised setting); Investment is the most extreme example of this, as at 20 iterations its accuracy is 86.86%, but it ultimately converges to 75.55%.

There is a wide variance in the cities in this dataset; some, like Ventura, California, are relatively contentious (49 edits asserted a population of 105,000 in 2006, while 68 asserted 106,744), while in others things are more lopsided (in Springfield the split was 202 edits vs. 10). As a consequence, some cities can be considered much "harder" than others, since an overwhelming majority for one option over the others means both that the answer is well-known and that an editor need only follow the crowd to identify it. Given this, we would expect those models that are capable of capturing this variable difficulty to perform the best, and this matches our experiments exactly: GuessLCA (which attributes greater honesty [H_s] to sources that assert true but hard-to-guess claims and less to those that assert false, easy-to-guess claims) and LieLCA_m and MistakeLCA_m (which model the variable difficulty of each city directly with D_m parameters) are the best performing among the LCA models. TruthFinder also does quite well, but the opaque nature of fact-finders precludes an explanation why, or a prediction of the domains where it might similarly perform well in the future. LieLCA_m's strong performance, however, is a result of having both D_m parameters to model latent difficulty (e.g. as demonstrated by incorrect assertions by highly honest sources) and guessing priors to incorporate the more obvious situations of lopsided and even votes, where the difficulty is apparent even without having an estimate of the honesty of the sources involved.

4.3 Predicting Stock Returns
We took the set of stocks that were in the S&P 500 Index on January 1st, 2000 (the index changes composition over time) and followed them through February 1st, 2012.
Model | Books (Unsupervised) | Population (Unsupervised) | Stock (Semi-Supervised) | Supreme Court (Semi-Supervised)
Voting | 84.95 ± 2.85 | 79.93 ± 4.74 | 47.14 ± 4.13 | 54.72 ± 13.40
Sums | 82.87 ± 3.00 | 82.12 ± 4.54 | 48.93 ± 4.14 | 56.60 ± 13.34
3-Estimates | 85.12 ± 2.84 | 74.45 ± 5.16 | 47.14 ± 4.13 | 52.83 ± 13.44
TruthFinder | 86.16 ± 2.75 | 85.04 ± 4.22 | 47.14 ± 4.13 | 58.49 ± 13.27
Average-Log | 85.47 ± 2.81 | 81.02 ± 4.64 | 46.61 ± 4.13 | 52.83 ± 13.44
Investment | 80.10 ± 3.18 | 75.55 ± 5.09 | 51.61 ± 4.14 | 75.47 ± 11.58
PooledInvestment | 87.72 ± 2.62 | 79.93 ± 4.74 | 48.93 ± 4.14 | 77.36 ± 11.27
SimpleLCA | 86.51 ± 2.72 | 82.48 ± 4.50 | 56.96 ± 4.10 | 79.25 ± 10.92
GuessLCA | 89.10 ± 2.48 | 83.58 ± 4.39 | 56.25 ± 4.11 | 88.68 ± 8.53
MistakeLCA_g | 86.33 ± 2.74 | 82.12 ± 4.54 | 55.54 ± 4.12 | N/A
MistakeLCA_m | 88.58 ± 2.53 | 86.13 ± 4.09 | 50.89 ± 4.14 | N/A
LieLCA_g | 89.62 ± 2.43 | 81.39 ± 4.61 | 57.86 ± 4.09 | N/A
LieLCA_m | 87.89 ± 2.60 | 83.94 ± 4.35 | 51.61 ± 4.14 | N/A
LieLCA_s | 90.83 ± 2.30 | 82.85 ± 4.46 | 53.39 ± 4.13 | N/A

Table 2: Experimental Results (N/A: Not Available). Values are percent accuracy (proportion of true claims correctly identified) and 95% confidence intervals. The best LCA model outperforms the best fact-finder with statistical significance in the Books, Stock and Supreme Court datasets.

Our results average predictive accuracy across 10 dates: July 1st, 2011 and every two weeks thereafter. We pretend that each of these dates is "the present time" and interpret stock analysts' buy or sell predictions as claims about whether each stock will yield a return higher or lower than the baseline S&P 500 return over the next 60 days. For example, when we pretend that the date is July 1st, 2011 and are considering Microsoft stock, we know the buy or sell recommendations analysts have made over the previous two weeks (in late June), and the latent truth we seek to identify is, of course, whether or not the stock will actually outperform the S&P 500 over the next 60 days. As a technical detail, stocks are assumed to be bought piecemeal over a week, starting on the subsequent day, and then sold piecemeal over a week, starting 60 days later (this reduces the day-to-day price variance). At each of these dates, we also know which recommendations analysts made more than 60 days ago were proven true, and this observed truth of whether each stock went up or down is our labeled data. Similarly, the remainder of the predictions (those recommendations made in the last 60 days) are effectively unlabeled data, since we do not know if they will be proven true yet. In total, there are approximately 4K distinct analysts and 80K distinct stock predictions, and our evaluation set consists of 560 true claims about stocks where analysts disagreed.

One thing we can quickly observe is that analysts are, in fact, usually wrong, as reflected by the 47.14% accuracy of voting. We therefore used uniform claim priors, which are a better alternative to the voted priors of our previous experiments; all other priors remain the same. Given the difficulty of the problem (as the oft-cited efficient market hypothesis that consistent risk-adjusted returns relative to the market are impossible would suggest [11]), we would expect no analyst to be especially good (otherwise they would presumably be running a hedge fund) nor any stock to be especially easy to predict; modeling these features, then, would offer little benefit but substantial risk of overfitting, as we observe in LieLCA_s, MistakeLCA_m, and LieLCA_m, the three lowest-performing LCA models. Conversely, LieLCA_g, balancing the overall difficulty of stock prediction with each source's ability (captured by H_s), does the best (D_g essentially serves as a latent, universal cap on how accurate any analyst can be at the task).
Amusingly, the (aptly-named) Investment is the only fact-finder to do better than 50%, although it surpasses only one LCA model (MistakeLCA_m). Given the practical importance of this domain, a natural question to ask is if these models would work in practice as an investment strategy, given the 58% accuracy of LieLCA_g. It is important to observe, however, that we considered only binary "outperform" and "underperform" labels and, critically, not how much would have been gained (or lost) on each stock; overall excess returns relative to the market as a whole are likely to be minor. Furthermore, since the market changes over time, there is no guarantee that a strategy that works on historical data would continue to work in the future, nor can we easily quantify this risk (and unexpected, unlikely events can collectively pose a major hazard to any strategy, e.g. the collapse of Long-Term Capital Management [6]).

4.4 Predicting Supreme Court Decisions
Finally, we considered the FantasySCOTUS project; here, 1,138 people (largely law students) have made predictions about the outcomes of 53 U.S. Supreme Court cases that have already been decided, and 24 that have not been. Using the same priors as the Books experiment (based on voting), we evaluated with 10-fold cross-validation. Within each fold, Investment, PooledInvestment, SimpleLCA and GuessLCA were tuned by nested 4-fold cross-validation. For Investment and PooledInvestment, the growth parameter (from 1 to 2 in increments of 0.1) was tuned, whereas for SimpleLCA and GuessLCA the parameter priors P(H_s) were tuned over sets of 10 possible Beta distributions. Since the votes for most cases are nearly tied, we concluded that most sources did little better than guessing, and selected Beta distributions biased towards 0 for GuessLCA (such that the prior on the probability of doing better than guessing is low), and biased towards 1/2 for SimpleLCA (such that the prior probability of asserting the truth is near random).
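A sketch of this nested tuning procedure (the `run_model` callback and the particular Beta(a, b) grid are hypothetical stand-ins rather than the exact configuration used):

```python
import random

def nested_cv_tune(mutex_sets, labels, candidate_priors, run_model, outer=10, inner=4, seed=0):
    """Outer k-fold CV estimates accuracy; inner k-fold CV on the training labels picks a prior per fold."""
    items = list(labels.items())                          # (mutual exclusion set, true claim) pairs
    random.Random(seed).shuffle(items)
    folds = [items[i::outer] for i in range(outer)]
    accuracies = []
    for i, test in enumerate(folds):
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        inner_folds = [train[k::inner] for k in range(inner)]

        def inner_score(prior):
            scores = []
            for k, held in enumerate(inner_folds):
                seen = dict(x for j, fold in enumerate(inner_folds) if j != k for x in fold)
                predictions = run_model(mutex_sets, seen, prior)   # semi-supervised run with labels `seen`
                scores.append(sum(predictions[m] == y for m, y in held) / len(held))
            return sum(scores) / len(scores)

        best_prior = max(candidate_priors, key=inner_score)
        predictions = run_model(mutex_sets, dict(train), best_prior)
        accuracies.append(sum(predictions[m] == y for m, y in test) / len(test))
    return sum(accuracies) / len(accuracies)

# e.g. a hypothetical grid of Beta(a, b) priors over H_s biased towards 0, as for GuessLCA:
candidate_beta_priors = [(0.5, 5.0), (1.0, 4.0), (1.0, 9.0), (2.0, 8.0)]
```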
The other fact-finders were not tuned because they lacked tunable parameters; LieLCA and MistakeLCA results are omitted because the experiments were not feasible: 10-fold cross-validation with 4-fold nested cross-validated tuning across 10 possible distributions for the priors P(H_s) and P(D) is 4,000 times as expensive as a normal run (and running a greatly reduced cross-validation regimen with just a few alternative priors for each parameter would underestimate performance relative to our other LCA results). This is a tradeoff for the greater sophistication of the LieLCA and MistakeLCA models: not only is there an additional set of parameters (the D) to select priors for, the M-step requires a substantially more expensive optimization (up to about 200 times as expensive as that for SimpleLCA or GuessLCA, as previously discussed; a single, normal run of LieLCA on this dataset takes 20-30 minutes). However, we note that this cross-validated tuning is parallelizable, and a real-world implementation could handle the task by splitting it over a cluster of machines.

4.5 Synthetic Results and Analysis
In our experimental results, our understanding of the domain allowed us to regularly anticipate which models would be most appropriate: in the books domain, the propensity of different booksellers to copy each others' claims ("guessing") or systematically disagree with the truth ("lying", e.g. an idiosyncratic way of abbreviating author names) suggested that LieLCA was the best fit. For Wikipedia population claims, LieLCA_m and MistakeLCA_m captured the widely varying difficulty of identifying the true population among the cities. In predicting stocks we could expect LieLCA_m and MistakeLCA_m to not work, because predicting stocks is more-or-less uniformly challenging across companies, and per-company difficulty parameters merely worsen the chance of overfitting. Finally, in the Supreme Court domain, we know that historically some sources have been much more accurate than others, but given the even split of votes in most cases it is clear that other sources (a majority) are more-or-less guessing; here we would expect LieLCA_m (which models both guessing and varying difficulty amongst mutual exclusion sets) to perform best, although it is similarly clear why GuessLCA outperforms SimpleLCA.

However, these are qualitative judgements, and while they certainly help us narrow down the set of potential models, it is not always clear precisely which should be used, particularly when partial supervision is not available to empirically estimate performance; e.g. in city population it is not obvious why MistakeLCA_m outperforms LieLCA_m. Arguably, since both of these models do well (and are presumably both reasonably good approximations to the collection of highly varied processes that sources really do follow in generating claims), we could acknowledge that either would be a satisfactory choice. Still, we also wanted to briefly investigate the idea of model fit quantitatively, empirically observing how well these models perform given varying quantities of data and a precise knowledge of how the data were really generated (as opposed to real-world datasets, where we are left to speculate using our knowledge of the domain). To do so, we generated data using the SimpleLCA joint distribution, with the intent of obtaining a simple underlying process that would allow us to focus on the models' behavior.

4.5.1 SimpleLCA Generation
We ran two sets of experiments using a SimpleLCA model to generate data; SimpleLCA does not incorporate guessing, mistake or lie prior probabilities, so in the first set we give GuessLCA, MistakeLCA and LieLCA uniform probabilities. In the second set, however, we generate these priors randomly², with the idea that this will give some insight into the effects of a poor model choice when mixed with a bad (random and independent of reality) prior.
In each experiment we had 100 sources and 100 mutual exclusion sets, each containing between 2 and 5 claims (selected uniformly at random). The number of claims made by each source was fixed at 3, 5, 10, or 20, and increasing this effectively increased the amount of data provided to the models. To mitigate statistical noise, every experiment was repeated 100 times with 100 different generated datasets, and the reported accuracies are an average of those runs (and, within each experiment, the same 100 randomly-generated datasets were used to test each model). The distribution of H_s was Beta(7, 3); this prior over H_s was used in all models in both experiment sets, despite H_s having somewhat different semantics in each model (the intent is to observe performance when the models do not fit the data in a well-understood way).

The results of our synthetic experiments may be found in Table 3. There are a number of interesting phenomena that we may observe in these results:

- Surprisingly, with uniform priors, two of the models (GuessLCA and LieLCA_g) consistently outperform SimpleLCA on data generated by a SimpleLCA process. In SimpleLCA, the model tends to conclude that, given a disagreement between sources, one is perfectly honest (H_s = 1) and the other is constantly wrong (H_s = 0). Other models avoid this with guessing, such that even the worst source can always make a lucky guess, which prevents the model from disregarding their claims entirely. With sufficient data this overfitting is avoided entirely.

- MistakeLCA_g versus MistakeLCA_m: the latter fares quite poorly in all experiments, while the former does quite well, reflecting a substantial difference in the models in practice despite a similar joint distribution. MistakeLCA_g's global D_g parameter controls the frequency with which sources make mistakes, again creating an alternative explanation for a source's errors other than complete dishonesty (since some of their inaccuracy will be attributed to honest mistakes rather than dishonesty). MistakeLCA_m, by contrast, has far more freedom to set its 100 D_m parameters to extreme values (overfitting).

- With randomized priors handicapping the other models, SimpleLCA leads the pack, as expected.

² We generated these distributions by drawing a [0,1] value uniformly for each claim and then normalizing over the mutual exclusion set for P_g(c), and normalizing over the claims in the mutual exclusion set excluding y_m for P_l(c | y_m, m) and P_e(c | y_m, m). This results in a rather complex distribution: for example, given two claims A and B, the probability of s guessing A is taken as a/(a + b), where a is the value drawn for claim A, and b is the value drawn for claim B. Marginalizing out b gives P_g(A) = a(log(a + 1) - log(a)).
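For concreteness, a minimal sketch of sampling one synthetic dataset from this SimpleLCA process (the way sources pick which mutual exclusion sets to report on is not specified above, so the uniform sampling of sets below is an assumption):

```python
import random

def generate_simple_lca_data(n_sources=100, n_sets=100, claims_per_source=5, seed=0):
    """Sample one dataset from a SimpleLCA process: honest sources assert y_m, others pick a wrong claim uniformly."""
    rng = random.Random(seed)
    # claims are (set, index) pairs so that claim identifiers are globally unique
    mutex_sets = {m: [(m, j) for j in range(rng.randint(2, 5))] for m in range(n_sets)}
    truth = {m: rng.choice(claims) for m, claims in mutex_sets.items()}   # latent y_m
    honesty = {s: rng.betavariate(7, 3) for s in range(n_sources)}        # H_s ~ Beta(7, 3)
    B, W = {}, {}
    for s in range(n_sources):
        # which sets each source reports on is sampled uniformly here (an assumption)
        for m in rng.sample(sorted(mutex_sets), claims_per_source):
            W[(s, m)] = 1.0
            if rng.random() < honesty[s]:
                asserted = truth[m]                                       # truthful with probability H_s
            else:
                asserted = rng.choice([c for c in mutex_sets[m] if c != truth[m]])
            B[(s, asserted)] = 1.0
    return mutex_sets, truth, honesty, B, W
```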
Claims per Source | Uniform P_g, P_e, P_l: 3 | 5 | 10 | 20 | Randomized P_g, P_e, P_l: 3 | 5 | 10 | 20
SimpleLCA | 79.92 | 87.80 | 95.83 | 99.54 | 79.92 | 87.80 | 95.83 | 99.54
GuessLCA | 80.10 | 88.14 | 95.96 | 99.54 | 77.67 | 84.73 | 92.51 | 96.27
MistakeLCA_g | 79.90 | 88.08 | 96.00 | 99.52 | 78.03 | 86.38 | 94.52 | 99.10
MistakeLCA_m | 75.48 | 78.08 | 78.87 | 80.45 | 70.53 | 68.99 | 60.33 | 56.60
LieLCA_g | 80.10 | 88.06 | 96.01 | 99.54 | 78.83 | 86.96 | 95.20 | 99.28
LieLCA_m | 79.90 | 87.92 | 95.85 | 99.53 | 76.14 | 82.24 | 89.59 | 94.51
LieLCA_s | 78.35 | 86.89 | 95.58 | 99.52 | 75.23 | 84.54 | 94.94 | 99.29

Table 3: Performance of LCA Models with Synthetic Data from a SimpleLCA Process. Each experiment was run over 100 random datasets and the results averaged.

- With randomized priors, MistakeLCA_m suffers from worsening performance as more assertions are made in each mutual exclusion set, increasing the D_m gradients relative to those of H_s and pushing D_m to lower values (it is easier to explain away bad assertions by decreasing the D_m for the mutual exclusion set than by decreasing the H_s for many sources). This then places greater weight on the (random) mistake priors.

- The other models prove remarkably robust given their completely incorrect priors, although it is clear that this does cap the possible performance of GuessLCA and LieLCA_m a bit, whereas MistakeLCA_g and LieLCA_g can simply set a high D_g, eliminating or reducing their influence, respectively.

In our real-world data, SimpleLCA was often among the least accurate LCA models; the synthetic results here suggest that, indeed, even in an artificial best-case scenario other models are able to perform almost as well. However, SimpleLCA remains easy to implement, easy to understand, and very tractable, and so should not be discounted entirely. It is also apparent that MistakeLCA_m may face severe difficulty in some cases; whereas LieLCA_m can believe a source will assert the correct claim by guessing even if the D_m parameter for the relevant mutual exclusion set is 0, MistakeLCA_m has no safety valve of this sort: if D_m is 0, the source must always get the claim wrong (this creates a sort of perverse "anti-vote", whereby the claim with the fewest assertions is likely to be believed). This danger manifests itself in the high variance we see in the model's real-world performance; while the top performer in the population domain, it is also the lowest performer in the stock domain. Care must therefore be taken to ensure that MistakeLCA_m is a reasonably good fit to the domain, whereas the other models are much more forgiving.

4.5.2 Discussion
Our synthetic experiments are limited in scope, but they do inform our approach to real-world problems.
LieLCA and MitakeLCA are, on the other hand, ore appropriate where the behavior of ource i well undertood (e.g. the book doain) and where partial uperviion can be ued to avoid overfitting (e.g. the tock doain). 5. CONCLUSION Latent Credibility Analyi i a flexible and powerful approach to odeling the inforation credibility proble; although we have really only begun to explore it potential in our experient o far, we have nonethele een that the perforance of LCA odel urpae that of fact-finder on both ei-upervied and unupervied real-world dataet, often ubtantially. GueLCA in particular i proiing due to it conitently trong perforance and tractability, caling linearly with the ize of the proble a fact-finder do, although other, ore expreive (and expenive) LCA odel can achieve better reult when ued judiciouly. Future work hould extend the LCA fraework, capturing phenoena uch a ource dependency and real-valued clai that will allow it to odel an even wider range of doain; for now, however, LCA are a new approach to credibility that i already both eantically appealing and of ubtantial practical utility. 6. REFERENCES [1] M. Chang, L. Ratinov, and D. Roth. Structured Learning with Contrained Conditional Model. Machine Learning, 88(3):399 431, 2012. [2] A. Depter, N. Laird, and D. Rubin. Maxiu likelihood fro incoplete data via the EM algorith. Journal of the Royal Statitical Society. Serie B (Methodological), 39(1):1 38, 1977. 1018
[3] X. Dong, L. Berti-Equille, and D. Srivastava. Truth discovery and copying detection in a dynamic world. VLDB, 2009.
[4] A. Galland, S. Abiteboul, A. Marian, and P. Senellart. Corroborating information from disagreeing views. In WSDM, 2010.
[5] K. Ganchev, J. Graca, J. Gillenwater, and B. Taskar. Posterior Regularization for Structured Latent Variable Models. Journal of Machine Learning Research, 2010.
[6] P. Jorion. Risk management lessons from Long-Term Capital Management. European Financial Management, 6(3):277-300, 2000.
[7] A. Josang. Artificial reasoning with subjective logic. 2nd Australian Workshop on Commonsense Reasoning, 1997.
[8] A. Josang, S. Marsh, and S. Pope. Exploring different types of trust propagation. Lecture Notes in Computer Science, 3986:179, 2006.
[9] J. M. Kleinberg. Authoritative sources in a hyperlinked environment. Journal of the ACM, 46(5):604-632, 1999.
[10] H. K. Le, J. Pasternack, H. Ahmadi, M. Gupta, Y. Sun, T. Abdelzaher, J. Han, D. Roth, B. Szymanski, and S. Adali. Apollo: Towards Factfinding in Participatory Sensing. IPSN, 2011.
[11] B. G. Malkiel. The efficient market hypothesis and its critics. Journal of Economic Perspectives, pages 59-82, 2003.
[12] Y. Nesterov. Introductory Lectures on Convex Optimization: A Basic Course, volume 87. Springer, 2004.
[13] J. Pasternack and D. Roth. Knowing What to Believe (when you already know something). In COLING, 2010.
[14] J. Pasternack and D. Roth. Making Better Informed Trust Decisions with Generalized Fact-Finding. In IJCAI, 2011.
[15] G. Shafer. A Mathematical Theory of Evidence. Princeton University Press, Princeton, NJ, 1976.
[16] V. G. Vydiswaran, C. X. Zhai, and D. Roth. Content-driven trust propagation framework. In Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 974-982. ACM, 2011.
[17] D. Wang, T. Abdelzaher, H. Ahmadi, J. Pasternack, D. Roth, M. Gupta, J. Han, O. Fatemieh, H. Le, and C. Aggarwal. On Bayesian interpretation of fact-finding in information networks. Information Fusion, 2011.
[18] X. Yin, J. Han, and P. S. Yu. Truth discovery with multiple conflicting information providers on the web. In Proc. of SIGKDD, 2007.
[19] X. Yin, P. S. Yu, and J. Han. Truth Discovery with Multiple Conflicting Information Providers on the Web. IEEE Transactions on Knowledge and Data Engineering, 20(6):796-808, 2008.
[20] B. Yu and M. P. Singh. Detecting deception in reputation management. Proceedings of the Second International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS '03), page 73, 2003.
[21] B. Zhao, B. I. P. Rubinstein, J. Gemmell, and J. Han. A Bayesian approach to discovering truth from conflicting sources for data integration. Proceedings of the VLDB Endowment, 5(6):550-561, 2012.