Robust Reading of Ambiguous Writing


Xin Li, Paul Morie, Dan Roth
Department of Computer Science, University of Illinois, Urbana, IL

In submission; please do not distribute.

Abstract

A given entity, representing a person, a location or an organization, may be mentioned in text in multiple, ambiguous ways. Understanding natural language requires identifying whether different mentions of a name, within and across documents, represent the same entity. We develop an unsupervised learning approach that is shown to resolve accurately several aspects of the name identity and tracing problem. At the heart of our approach is a generative model of how documents are generated and how names are sprinkled into them; in particular, we model appearance similarity between names representing the same entity, contextual correlation among entities, and co-occurrence probabilities of entities within a document. We show how to estimate the model and do inference with it, and how this resolves several aspects of the problem from the perspective of applications such as question answering.

1 Introduction

Reading and understanding text is a task that requires the ability to disambiguate at several levels, abstracting away details and using background knowledge in a variety of ways. One of the difficulties that humans resolve instantaneously and unconsciously is that of reading names. Most names of people, locations, organizations and others have multiple writings that are used freely within and across documents. The variability in writing a given concept, along with the fact that different concepts may have very similar writings, poses a significant challenge to progress in natural language related tasks. Consider, for example, an open domain question answering system (Voorhees, 2002) that attempts, given a question like "When was President Kennedy born?"
to search a large collection of articles in order to pinpoint the concise answer: "on May 29, ...". The sentence, and even the document that contains the answer, may not contain the name "President Kennedy"; it may refer to this entity as "Kennedy", "JFK" or "John Fitzgerald Kennedy". Other documents may state that "John F. Kennedy, Jr. was born on November 25, 1960", but this fact refers to our target entity's son. Other mentions, such as "Senator Kennedy" or "Mrs. Kennedy", are even closer to the writing of the target entity, but clearly refer to different entities. Even the statement "John Kennedy, born ..." turns out to refer to a different entity, as one can tell by observing that this document discusses Kennedy's batting statistics. A similar problem exists for other entity types, such as locations and organizations. Ad hoc solutions, as we show, fail to provide a reliable and accurate solution to this problem.

This paper presents the first attempt to apply a unified approach to all major aspects of this problem, presented here from the perspective of the question answering task:
(1) Entity Identity: do mentions A and B (typically occurring in different documents, or in a question and a document, etc.) refer to the same entity? This problem requires both identifying when different writings refer to the same entity, and when very similar or identical writings refer to different entities.
(2) Name Expansion: given a writing of a name (say, in a question), find other likely writings of the same name.
(3) Prominence: given the question "What is Bush's foreign policy?", and given that any large collection of documents may contain several Bushes, there is a need to identify the most prominent, or relevant, "Bush", perhaps also taking into account some contextual information.

At the heart of our approach is a global probabilistic view on how documents are generated and how names (of different entity types) are sprinkled into them.
In its most general form, our model assumes: (1) a joint distribution over entities, so that a document that mentions "President Kennedy" is more likely to mention "Oswald" or "White House" than "Roger Clemens"; (2) an "author" model, which makes sure that at least one mention of a name in a document is easily identifiable, and then generates other mentions via (3) an appearance model, governing how mentions are transformed from the "representative" mention. Our goal is to learn the model from a large corpus and use it to support robust reading: enabling on-the-fly identification and tracing of entities.

This work presents the first study of our proposed model and several relaxations of it. Given a collection of documents we learn the models in an unsupervised way; that is, the system is not told during training whether two mentions represent the same entity. We only assume the
ability to recognize names, using a named entity recognizer run as a preprocessor. We define several inferences that correspond to the solutions we seek, and evaluate the models by performing these inferences against a large corpus we annotated (the corpus will be available on the web). Our experimental results suggest that the problem can be solved accurately, giving accuracies (F1) close to 90%, depending on the specific task, as opposed to 80% given by state-of-the-art ad hoc approaches.

Previous work in the context of question answering has not addressed the problem studied here. Several works in NLP and Databases, though, have addressed some aspects of the problem. From the natural language perspective, there has been a lot of work on the related problem of coreference resolution (Soon et al., 2001; Ng and Cardie, 2003; Kehler, 2002), which aims at linking occurrences of noun phrases and pronouns within a given document based on their appearance and local context. In the context of databases, several works have looked at the problem of record linkage: recognizing duplicate records in a database (Cohen and Richman, 2002; Hernandez and Stolfo, 1995; Bilenko and Mooney, 2003). Specifically, (Pasula et al., 2002) considers the problem of identity uncertainty in the context of citation matching and suggests a probabilistic model for that. Perhaps the work that is most related to ours, in terms of the problem definition and in the sense that it works with text data and across documents, is (Mann and Yarowsky, 2003), which considers the problem of distinguishing occurrences of identical names in different documents. Like ours, this problem is global, but they consider only one aspect of the identity problem, and only for identical names of people (e.g., do occurrences of "Jim Clark" in different documents refer to the same person or not).

The rest of this paper is organized as follows: We formalize the "robust reading" problem in Sec. 2. Sec.
3 describes a generative view of document creation and three practical probabilistic models designed based on it, and discusses inference in these models. Sec. 4 illustrates how to learn these models in an unsupervised setting, and Sec. 5 describes the experimental study. Sec. 6 concludes.

2 Robust Reading

We consider reading a large number of documents $D = \{d_1, d_2, \ldots, d_m\}$, each of which may contain mentions (i.e., real occurrences) of $|T|$ types of entities. In the current evaluation we consider $T = \{Person, Location, Organization\}$.

An entity refers to the "real" concept behind the mention and can be viewed as a unique identifier of an object in the real world. Examples might be the person "John F. Kennedy" who became a president, "White House", the residence of the US presidents, etc. $E$ denotes the collection of all possible entities in the world and $E^d = \{e_i^d\}_1^{l_d}$ is the set of entities mentioned in document $d$. $M$ denotes the collection of all possible mentions and $M^d = \{m_i^d\}_1^{n_d}$ is the set of mentions in document $d$. $M_i^d$ ($1 \le i \le l_d$) is the set of mentions that refer to entity $e_i^d \in E^d$. For example, for the entity "John F. Kennedy", the corresponding set of mentions in a document may contain "Kennedy", "J. F. Kennedy" and "President Kennedy".

Among all mentions of an entity $e_i^d$ in document $d$ we distinguish the one occurring first, $r_i^d \in M_i^d$, as the representative of $e_i^d$. In practice, the representative is usually also the longest mention of an entity in the document, and other mentions are variations of it. Representatives can be viewed as a typical representation of an entity mentioned in a specific time and place. For example, "President J. F. Kennedy" and "Congressman John Kennedy" may be representatives of "John F. Kennedy" in different documents. $R$ denotes the collection of all possible representatives and $R^d = \{r_i^d\}_1^{l_d} \subseteq M^d$ is the set of representatives in document $d$. This way, each document is represented as the collection of its entities, representatives and mentions: $d = \{E^d, R^d, M^d\}$.
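As an illustration of the notation in Sec. 2, the following minimal sketch stores one document as its ordered mentions plus a map from each entity to the mentions that refer to it, and derives the representative as the first-occurring mention. The class name, field names and the toy document are assumptions made for this example only; they are not part of the system described in the paper.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Document:
    """One document d = {E^d, R^d, M^d}; all names here are illustrative."""
    mentions: List[str]  # M^d, in order of appearance
    # entity -> indices into `mentions` (the set M_i^d for entity e_i^d)
    mentions_of: Dict[str, List[int]] = field(default_factory=dict)

    def entities(self) -> List[str]:
        """E^d: the entities mentioned in this document."""
        return list(self.mentions_of)

    def representative(self, entity: str) -> str:
        """r_i^d: the first-occurring mention of the given entity."""
        return self.mentions[min(self.mentions_of[entity])]

    def representatives(self) -> List[str]:
        """R^d: one representative per entity."""
        return [self.representative(e) for e in self.mentions_of]


# Toy document (hypothetical data).
doc = Document(
    mentions=["President Kennedy", "Kennedy", "White House", "J. F. Kennedy"],
    mentions_of={"John F. Kennedy": [0, 1, 3], "White House": [2]},
)
```

Here `doc.representatives()` picks "President Kennedy" and "White House", matching the convention that the first mention of an entity in a document acts as its representative.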
Elements in the name space $W = E \cup R \cup M$ each have an identifying writing (denoted as $wrt(n)$ for $n \in W$) (footnote 1) and an ordered list of attributes, $A = \{a_1, \ldots, a_p\}$, which depends on the entity type. Attributes used in the current evaluation include both internal attributes, such as, for People, {title, firstname, middlename, lastname, gender}, and contextual attributes, such as {time, location, propernames}. Propernames refers to a list of proper names that occur around the mention in the document. All attributes are string-valued and can be empty when the values are missing or unknown (footnote 2).

Footnote 1: The observed writing of a mention is its identifying writing, e.g., "President Kennedy". For entities, it is a standard representation of them, e.g., the full name of a person.
Footnote 2: Contextual attributes are not part of the current evaluation, and will be evaluated in the next step of this work.

The fundamental problem we address in robust reading is to decide what entities are mentioned in a given document (given the observed set $M^d$) and what the most likely assignment of entity to each mention is.

3 A Model of Document Generation

We define a probability distribution over documents $d = \{E^d, R^d, M^d\}$ by describing how documents are generated. In its most general form the model has the following three components: (1) A joint probability distribution $P(E^d)$ that governs how entities (of different types) are distributed into a document and reflects their co-occurrence dependencies. (2) The number of entities in a document, $size(E^d)$, and the number of mentions of each entity in $E^d$, $size(M_i^d)$, need to be decided. The current evaluation
makes the simplifying assumption that these numbers are determined uniformly over a small plausible range. (3) The appearance probability of a name generated (transformed) from its representative is modelled as a product distribution over relational transformations of attribute values. This model captures the similarity between appearances of two names. In the current evaluation the same appearance model is used to calculate both the probability $P(r \mid e)$ that generates a representative $r$ given an entity $e$ and the probability $P(m \mid r)$ that generates a mention $m$ given a representative $r$. Attribute transformations are relational, in the sense that the distribution is over transformation types and independent of the specific names.

[Figure 1: Generating a document. Entities in $E$ (John Fitzgerald Kennedy; House of Representatives) are selected into $E^d$; they generate the representatives in $R^d$ (President John F. Kennedy; House of Representatives), which in turn generate the mentions in $M^d$ ({President Kennedy, Kennedy, JFK}; {House of Representatives, The House}).]

Given these, a document $d$ is assumed to be generated as follows (see Fig. 1): A set of $size(E^d)$ entities $E^d \subseteq E$ is selected to appear in a document $d$, according to $P(E^d)$. For each entity $e_i^d \in E^d$, a representative $r_i^d \in R$ is chosen according to $P(r_i^d \mid e_i^d)$, generating $R^d$. Then the mentions $M_i^d$ of an entity are generated from each representative $r_i^d \in R^d$: each mention $m_j^d \in M_i^d$ is independently transformed from $r_i^d$ according to the appearance probability $P(m_j^d \mid r_i^d)$, after $size(M_i^d)$ is determined. Assuming conditional independence between $M^d$ and $E^d$ given $R^d$, the probability distribution over documents is therefore

$$P(d) = P(E^d, R^d, M^d) = P(E^d)\, P(R^d \mid E^d)\, P(M^d \mid R^d),$$

and the probability of the document collection $D$ is

$$P(D) = \prod_{d \in D} P(d).$$

Given a mention $m$ in a document $d$ ($M^d$ is the set of observed mentions in $d$), the key inference problem is to determine the most likely entity $e_m^*$ that corresponds to it. This is done by computing:

$$E^{d*} = \arg\max_{E^d \subseteq E} P(E^d, R^d \mid M^d, \theta) \tag{1}$$
$$\phantom{E^{d*}} = \arg\max_{E^d \subseteq E} P(E^d, R^d, M^d \mid \theta), \tag{2}$$

where $\theta$ is the learned model's parameters. This gives the assignment of the most likely entity $e_m^*$ for $m$.

3.1 Relaxations of the Model

In order to simplify model estimation and to evaluate some assumptions, several relaxations are made to form three simpler probabilistic models.

Model I (the simplest model): The key relaxation here is in losing the notion of an "author": rather than first choosing a representative for each document, mentions are generated independently and directly given an entity. That is, an entity $e_i$ is selected from $E$ according to the prior probability $P(e_i)$; then its actual mention $m_i$ is selected according to $P(m_i \mid e_i)$. Also, an entity is selected into a document independently of other entities. In this way, the probability of the whole document set can be written in a simpler way:

$$P(D) = P(\{(e_i, m_i)\}_{i=1}^{n}) = \prod_{i=1}^{n} P(e_i)\, P(m_i \mid e_i),$$

and the inference problem for the most likely entity given $m$ is:

$$e^* = \arg\max_{e \in E} P(e \mid m, \theta) \tag{3}$$
$$\phantom{e^*} = \arg\max_{e \in E} P(e)\, P(m \mid e). \tag{4}$$

Model II (more expressive): The major relaxation made here is in assuming a simple model of choosing entities to appear in documents. Thus, in order to generate a document $d$, after we decide $size(E^d)$ and $\{size(M_1^d), size(M_2^d), \ldots\}$ according to uniform distributions, each entity $e_i^d$ is selected into $d$ independently of others according to $P(e_i^d)$. Next, the representative $r_i^d$ for each entity $e_i^d$ is selected according to $P(r_i^d \mid e_i^d)$, and for each representative the actual mentions are selected independently according to $P(m_j^d \mid r_j^d)$. Here, we have individual documents along with representatives, and the distribution over documents is:

$$P(d) = P(E^d, R^d, M^d) = P(E^d)\, P(R^d \mid E^d)\, P(M^d \mid R^d)$$
$$= \Big[P(size(E^d)) \prod_{i=1}^{|E^d|} P(e_i^d)\Big] \Big[P(size(M_1^d), size(M_2^d), \ldots) \prod_{i=1}^{|E^d|} P(r_i^d \mid e_i^d)\Big] \prod_{(r_j^d, m_j^d)} P(m_j^d \mid r_j^d)$$
$$\approx \prod_{i=1}^{|E^d|} \big[P(e_i^d)\, P(r_i^d \mid e_i^d)\big] \prod_{(r_j^d, m_j^d)} P(m_j^d \mid r_j^d)$$

after we ignore the size components. The inference problem here is the same as in Eq. (2).
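The Model I decision rule, $e^* = \arg\max_{e \in E} P(e) P(m \mid e)$, can be sketched in a few lines. This is a minimal illustration under stated assumptions: `toy_appearance` is a hypothetical stand-in for the learned appearance model (a token-overlap score, not a real probability), and the prior values are made up.

```python
from typing import Callable, Dict


def toy_appearance(name: str, source: str) -> float:
    """Hypothetical stand-in for P(name | source): Jaccard overlap of tokens."""
    a, b = set(name.lower().split()), set(source.lower().split())
    return len(a & b) / len(a | b)


def most_likely_entity(mention: str,
                       prior: Dict[str, float],
                       appearance: Callable[[str, str], float]) -> str:
    """Model I inference: return argmax over e of P(e) * P(m | e)."""
    return max(prior, key=lambda e: prior[e] * appearance(mention, e))


# Made-up priors over three candidate entities.
prior = {"George Bush": 0.2, "George W. Bush": 0.5, "Steve Bush": 0.3}
bush = most_likely_entity("Bush", prior, toy_appearance)
steve = most_likely_entity("Steve Bush", prior, toy_appearance)
```

With these numbers the ambiguous mention "Bush" goes to the entity with the highest prior, while "Steve Bush" is distinctive enough for the appearance score to win. Each mention is scored on its own, so the cost per mention is linear in the number of candidate entities.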
Model III: This model performs the least relaxation. After deciding $size(E^d)$ according to a uniform distribution, instead of assuming independence among entities, which does not hold in reality (for example, "Gore" and "George W. Bush" occur together frequently, but "Gore" and "Steve Bush" do not), we select entities using a graph-based algorithm: entities in $E$ are viewed
as nodes in a weighted directed graph with edges $(i, j)$ labelled $P(e_j \mid e_i)$, representing the probability that entity $e_j$ is chosen into a document that contains entity $e_i$. We distribute entities to $E^d$ via a random walk on this graph starting from $e_1^d$ with a prior probability $P(e_1^d)$. Representatives and mentions are generated in the same way as in Model II. Therefore, a more general model for the distribution over documents is:

$$P(d) \approx P(e_1^d)\, P(r_1^d \mid e_1^d) \prod_{i=2}^{|E^d|} \big[P(e_i^d \mid e_{i-1}^d)\, P(r_i^d \mid e_i^d)\big] \prod_{(r_j^d, m_j^d)} P(m_j^d \mid r_j^d).$$

The inference problem is the same as in Eq. (2).

[Figure 2: A conceptual example. Entities in E: e1 = George Bush, e2 = George W. Bush, e3 = Steve Bush. Document d1: m1, r1 = President Bush; m2 = Bush; m3 = J. Quayle. Document d2: m4, r2 = Steve Bush; m5 = Bush. The arrows represent the correct assignment of entities to mentions. r1, r2 are representatives.]

3.2 Inference

The fundamental problem in robust reading can be solved as inference with the models: given a mention $m$, seek the most probable entity $e \in E$ for $m$ according to Eq. (4) for Model I or Eq. (2) for Models II and III. The inference algorithm for Model I (with time complexity $O(|E|)$) is simple and direct: just compute $P(e, m)$ for each candidate entity $e \in E$ and then choose the one with the highest value. Due to the exponential number of possible assignments of $E^d, R^d$ to $M^d$ in Models II and III, exact inference is infeasible. Approximate algorithms are therefore designed.

In Model II, we adopt a two-step algorithm. First, we seek the representatives $R^d$ for the mentions $M^d$ in document $d$ by sequentially clustering the mentions according to the appearance model. The first mention in each group is treated as the representative. Specifically, when considering a mention $m \in M^d$, $P(m \mid r)$ is computed for each representative $r$ that has already been created, and a fixed threshold is then used to decide whether to create a new group for $m$ or to add it to the existing group with the highest $P(m \mid r)$ value. In the second step, each representative $r_i^d \in R^d$ is assigned to its most likely entity according to $e^* = \arg\max_{e \in E} P(e)\, P(r \mid e)$ (footnote 3). This algorithm has a total time complexity of $O(|M^d|^2 + |E| \cdot |R^d|)$.

Footnote 3: $E$ is known after learning the model in a closed document collection that $d$ belongs to.

Model III has a two-step algorithm similar to that of Model II. The only difference is that we need to consider the global dependency between entities. Thus in the second step, instead of seeking an entity $e$ for each representative $r$ separately, we determine a set of entities $E^d$ for $R^d$ in a Hidden Markov Model with entities in $E$ as hidden states and $R^d$ as observations. The prior probabilities, the transition probabilities and the observation probabilities for this HMM are given by $P(e)$, $P(e_j \mid e_i)$ and $P(r \mid e)$ respectively. In this step we seek the most likely sequence of entities given those representatives in their order of appearance, using the Viterbi algorithm. The total time complexity is $O(|M^d|^2 + |E|^2 \cdot |R^d|)$.

3.3 Discussion

Besides the different assumptions of the models, there are some fundamental differences in inference with the models as well. In Model I, the entity of a mention is determined completely independently of other mentions, while in Model II the way of figuring out the entity relies on local similarity among mentions in the same document. In Model III, it is related not only to other mentions but also to a global dependency over entities. The following conceptual example, shown in Fig. 2, illustrates those differences.

Example 3.1 Given $E$ = {George Bush, George W. Bush, Steve Bush}, documents $d_1$, $d_2$ and 5 mentions in them, and supposing the prior probability of entity George W. Bush is higher than those of the other two entities, the probable assignment of entities to mentions in the three models could be as follows:

For Model I, $mentions(e_1) = \phi$, $mentions(e_2) = \{m_1, m_2, m_5\}$ and $mentions(e_3) = \{m_4\}$. The result is caused by the fact that a mention tends to be assigned to the entity with the higher prior probability when the appearance similarity is not distinctive.

For Model II, $mentions(e_1) = \phi$, $mentions(e_2) = \{m_1, m_2\}$ and $mentions(e_3) = \{m_4, m_5\}$. Local dependency (appearance similarity) among mentions inside each document enforces the constraint that they should refer to the same entity, like "Steve Bush" and "Bush" in $d_2$.

For Model III, $mentions(e_1) = \{m_1, m_2\}$, $mentions(e_2) = \phi$ and $mentions(e_3) = \{m_4, m_5\}$. With the help of global dependency among entities, for example between "George Bush" and "J. Quayle", an entity can be distinguished from another entity with a similar writing.

3.4 Other Tasks

The three basic problems related to Robust Reading can be solved based on the solutions to the key inference problem above.
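The two-step approximate inference described for Model II in Sec. 3.2 (sequential clustering of mentions around representatives, then assigning each representative to an entity) might be sketched as follows. This is only an illustration under assumptions: `toy_appearance` is a hypothetical token-overlap score standing in for the learned appearance model, and the priors and the threshold of 0.4 are made up.

```python
from typing import Callable, Dict, List

Appearance = Callable[[str, str], float]


def toy_appearance(name: str, source: str) -> float:
    """Hypothetical stand-in for P(name | source): Jaccard overlap of tokens."""
    a, b = set(name.lower().split()), set(source.lower().split())
    return len(a & b) / len(a | b)


def cluster_mentions(mentions: List[str], appearance: Appearance,
                     threshold: float) -> List[List[str]]:
    """Step 1: sequential clustering; groups[i][0] is the representative."""
    groups: List[List[str]] = []
    for m in mentions:
        # Score m against every existing representative r with P(m | r).
        scored = [(appearance(m, g[0]), i) for i, g in enumerate(groups)]
        best_p, best_i = max(scored, default=(0.0, -1))
        if best_i >= 0 and best_p >= threshold:
            groups[best_i].append(m)   # join the best existing group
        else:
            groups.append([m])         # m starts a new group
    return groups


def resolve(mentions: List[str], prior: Dict[str, float],
            appearance: Appearance, threshold: float) -> Dict[str, str]:
    """Step 2: assign each representative r to argmax over e of P(e) * P(r | e)."""
    assignment: Dict[str, str] = {}
    for group in cluster_mentions(mentions, appearance, threshold):
        rep = group[0]
        entity = max(prior, key=lambda e: prior[e] * appearance(rep, e))
        for m in group:
            assignment[m] = entity     # all mentions follow their representative
    return assignment


prior = {"George Bush": 0.2, "George W. Bush": 0.5, "Steve Bush": 0.3}
d1 = resolve(["President Bush", "Bush"], prior, toy_appearance, threshold=0.4)
d2 = resolve(["Steve Bush", "Bush"], prior, toy_appearance, threshold=0.4)
```

In this toy run the mention "Bush" in the second document follows its representative "Steve Bush", whereas in the first document it follows "President Bush" to the entity with the highest prior. This mirrors the Model II behaviour in Example 3.1; Model III would replace the per-representative argmax in step 2 with Viterbi decoding over the entity graph.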
Entity Identity: Given two mentions $m_1 \in d_1$, $m_2 \in d_2$, determine whether they correspond to the same entity ($m_1 \sim m_2$) by:

$$m_1 \sim m_2 \iff \arg\max_{e \in E} P(e, m_1) = \arg\max_{e \in E} P(e, m_2)$$

for Model I, and

$$m_1 \sim m_2 \iff \arg\max_{e \in E} P(E^{d_1}, R^{d_1}, M^{d_1}) = \arg\max_{e \in E} P(E^{d_2}, R^{d_2}, M^{d_2})$$

for Models II and III.

Name Expansion: Given a mention $m_q$ in a query $q$, decide whether mention $m$ in the document collection $D$ is a "legal" expansion of $m_q$:

$$m_q \sim m \iff e^*_{m_q} = \arg\max_{e \in E} P(E^q, R^q, M^q) \ \text{and}\ m \in mentions(e^*_{m_q}).$$

We assume here that we already know the possible mentions of $e^*_{m_q}$ after learning the models in $D$.

Prominence: Given a name $n \in W$, the most prominent entity for $n$ is given by $e^* = \arg\max_{e \in E} P(e)\, P(n \mid e)$. $P(e)$ is given by the prior distribution $P_E$ and $P(n \mid e)$ is given by the appearance model.

4 Learning the Models

Constrained by the labor of annotating data, we learn the probabilistic models in an unsupervised way given a collection of documents; that is, the system is not told during training whether two mentions represent the same entity. A greedy search algorithm modified from the standard EM algorithm (we call it the Truncated EM algorithm) is adopted here to avoid complex computation.

Given a set of documents $D$ to be studied and the observed mentions $M^d$ in each document, this algorithm iteratively updates the model parameter $\theta$ (the several underlying probability distributions described before) and the structure (that is, $E^d$ and $R^d$) of each document $d$. Unlike the standard EM algorithm, in the E-step it seeks the most likely $E^d$ and $R^d$ for each document rather than the expected assignment.

4.1 Truncated EM Algorithm

The basic framework of the Truncated EM algorithm to learn Models II and III is as follows:

1. In the initial (I) step, an initial $E_0^d$ and $R_0^d$ is assigned to each document $d$ using an initialization algorithm. After this step, we can assume that we have labelled documents $D_0 = \{(E_0^d, R_0^d, M^d)\}$.
2. In the M-step, we seek the model parameter $\theta_{t+1}$ that maximizes $P(D_t \mid \theta)$.
Given the labels supplied by the model in the previous I- or E-step, this amounts to the maximum likelihood estimation described in Sec. 4.3.
3. In the E-step, we seek $(E_{t+1}^d, R_{t+1}^d)$ for each document $d$ that maximizes $P(D_{t+1} \mid \theta_{t+1})$, where $D_{t+1} = \{(E_{t+1}^d, R_{t+1}^d, M^d)\}$. This is the same inference problem as in Sec. 3.2.
4. Stopping criterion: If no increase is achieved over $P(D_t \mid \theta_t)$, the algorithm exits. Otherwise the algorithm iterates over the M-step and E-step.

The algorithm for Model I is similar to the above but much simpler, in the sense that it does not have the notions of documents and representatives. So in the E-step we only need to seek the most likely entity $e$ for each mention $m \in D$, and this simplifies the parameter estimation in the M-step accordingly. It usually takes 3 to 10 iterations before the algorithm stops for all the models in our experiments.

4.2 Initialization

The purpose of the initial step is to acquire an initial guess of document structures and to seek the set of entities $E$ in a closed collection of documents $D$. The hope is to find all entities without loss, even if repeated entities might be created. For all the models, we use the same algorithm: First, a local clustering is performed to group all mentions inside each document. A set of simple heuristics for matching attributes is applied to calculate the similarity between mentions, and pairs of mentions with similarity above a threshold are clustered together. The first mention in each group is chosen as the representative (only in Models II and III) and an entity having the same writing as the representative is created for each cluster (footnote 4). For all the models, the set of entities created in different documents becomes the global entity set $E$ in the following M- and E-steps.

4.3 Estimating the Model Parameters

In the learning process, assuming we have obtained labelled documents $D = \{(e, r, m)\}_1^n$ from the previous I- or E-step, several probability distributions underlying the relaxed models are estimated according to maximum likelihood estimation in each M-step.
The model parameters include a prior distribution over entities $P_E$, a transitive probability distribution over pairs of entities $P_{E|E}$ (only in Model III) and the appearance probability $P_{W|W}$ of a name in the name space $W$ being transformed from another name.

Footnote 4: Note that the performance of the initialization algorithm is 97.3% precision and 10.1% recall, measures defined in our later experimental study in Sec. 5.

The prior distribution $P_E$ is modelled as a multinomial distribution. Given a set of labelled entity-mention pairs $\{(e_i, m_i)\}_1^n$,

$$P(e) = \frac{freq(e)}{n},$$

where $freq(e)$ denotes the number of pairs containing entity $e$.

Given all the entities appearing in $D$, the transitive probability between entities $P(e \mid e)$ is estimated by

$$P(e_2 \mid e_1) \approx P(wrt(e_2) \mid wrt(e_1)) = \frac{doc\#(wrt(e_2), wrt(e_1))}{doc\#(wrt(e_1))}.$$

Here, the conditional probability between two real entities $P(e_2 \mid e_1)$ is backed off to the conditional probability between the identifying writings of the two entities $P(wrt(e_2) \mid wrt(e_1))$ in the document set $D$, to avoid the sparsity problem. Given $D = \{d_1, d_2, \ldots, d_m\}$, $doc\#(w_1, w_2, \ldots)$ denotes the number of documents in which the writings $w_1, w_2, \ldots$ co-occur.

The appearance probability, the probability of one name being transformed from another, denoted as $P(n_2 \mid n_1)$ ($n_1, n_2 \in W$), is modelled as a product of the transformation probabilities over attribute values. The transformation probability for each attribute in $A$ is further modelled as a multinomial distribution over a set of predetermined typical transformation types that depend on the entity types: $TT = \{copy, missing, typical, non\text{-}typical\}$ (footnote 5).

Suppose $n_1 = (a_1 = v_1, a_2 = v_2, \ldots, a_p = v_p)$ and $n_2 = (a_1 = v'_1, a_2 = v'_2, \ldots, a_p = v'_p)$ are two names belonging to the same entity type. The transformation probabilities $P_{M|R}$, $P_{R|E}$ and $P_{M|E}$ are all modelled as a product distribution (naive Bayes) over attributes:

$$P(n_2 \mid n_1) = \prod_{k=1}^{p} P(v'_k \mid v_k).$$

We manually collected typical and non-typical transformations for attributes such as titles, first names, last names, organizations and locations from multiple sources such as the U.S. government census and online dictionaries. For other attributes like gender, only the copy transformation is allowed.
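To make the estimates above concrete, the sketch below computes the multinomial prior $P(e) = freq(e)/n$ from labelled pairs and evaluates the appearance probability as a product over per-attribute transformation types. The function names, the tiny table of "typical" variations and all probability values are assumptions for illustration; in the paper the transformation lists are collected manually and the probabilities are learned.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple


def estimate_prior(pairs: List[Tuple[str, str]]) -> Dict[str, float]:
    """P(e) = freq(e) / n over labelled (entity, mention) pairs."""
    counts = Counter(entity for entity, _ in pairs)
    return {e: c / len(pairs) for e, c in counts.items()}


def transformation_type(v_from: str, v_to: str,
                        typical: Set[Tuple[str, str]]) -> str:
    """Classify the transformation of one attribute value into a type in TT."""
    if not v_to:
        return "missing"
    if v_to == v_from:
        return "copy"
    if (v_from, v_to) in typical:
        return "typical"
    return "non-typical"


def appearance_probability(n1: Dict[str, str], n2: Dict[str, str],
                           p_type: Dict[str, Dict[str, float]],
                           typical: Set[Tuple[str, str]]) -> float:
    """P(n2 | n1) as a product over attributes of P(type, attribute)."""
    prob = 1.0
    for attr, v in n1.items():
        t = transformation_type(v, n2.get(attr, ""), typical)
        prob *= p_type[attr][t]
    return prob


# Hypothetical data: typical variations and made-up type probabilities.
typical = {("Professor", "Prof."), ("Andrew", "Andy")}
types = {"copy": 0.5, "missing": 0.3, "typical": 0.15, "non-typical": 0.05}
p_type = {a: dict(types) for a in ("title", "firstname", "lastname")}

n1 = {"title": "Professor", "firstname": "Andrew", "lastname": "Smith"}
n2 = {"title": "Prof.", "firstname": "", "lastname": "Smith"}
p = appearance_probability(n1, n2, p_type, typical)  # typical * missing * copy
prior = estimate_prior([("e1", "m1"), ("e1", "m2"), ("e2", "m3"), ("e1", "m4")])
```

Because the distribution is over transformation types rather than specific strings, the same small table of parameters scores any pair of names of the same entity type.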
Assuming a multinomial distribution for each attribute, the maximum likelihood estimation of the transformation probability $P(t, k)$ ($t \in TT$, $a_k \in A$) from labelled representative-mention pairs $\{(r, m)\}_1^n$ is:

$$P(t, k) = \frac{freq\{(r, m) : v_k^r \xrightarrow{t} v_k^m\}}{n},$$

where $v_k^r \xrightarrow{t} v_k^m$ denotes that the transformation from attribute $a_k$ of $r$ to that of $m$ is of type $t$. Simple smoothing is performed here for unseen transformations.

Footnote 5: copy denotes that $v'_k$ is exactly the same as $v_k$; missing denotes a missing value for $v'_k$; typical denotes that $v'_k$ is a typical variation of $v_k$, for example, "Prof." for "Professor" or "Andy" for "Andrew"; non-typical denotes a non-typical transformation.

5 Experimental Study

Our experimental study focuses on (1) evaluating our three models on the name identity task using three entity types (People, Locations, Organizations); (2) comparing our induced similarity measure between names with other similarity measures; (3) evaluating the contribution of the global nature of our model; and (4) evaluating our models on name expansion and prominence ranking.

5.1 Methodology

We collected 300 documents from randomly sampled New York Times articles in the TREC corpus (Voorhees, 2002). The documents were annotated by a named entity tagger for People, Locations and Organizations. The annotation was then corrected and each name mention was labelled with its corresponding entity by two annotators. In total, about 8,000 mentions of named entities, which correspond to about 2,000 entities, were labelled. The training process gets to see only the 300 documents and extracts attribute values for each mention. No supervision is supplied. These records are used to learn the probabilistic models.

In testing, 130,000 pairs of mentions that correspond to the same entity are generated, and are used to evaluate the models' performance. Since the probabilistic models are learned in an unsupervised setting, testing can be viewed simply as the evaluation of the learned model, and is thus done on the same data.
The same setting was used for all models and all comparisons performed (see below).

To evaluate the performance, we pair two mentions iff the learned model determined that they correspond to the same entity. The list of pairs is then compared with the annotated list of pairs. We measure Precision (P), the percentage of predicted pairs that are correct; Recall (R), the percentage of correct pairs that were predicted; and $F_1 = \frac{2PR}{P + R}$.

Comparisons: Our global model induces a similarity measure between names: the appearance model. In order to understand whether the behavior of our model is dominated by the quality of the induced pairwise similarity or by the global aspects of the model, we (1) replace this measure by two other "local" similarity measures and (2) study the performance on entity identity at three levels: local decision, straightforward clustering over local similarity, and our global model. The first similarity measure we use is a simple baseline algorithm according to which two names are similar iff
they have identical writings. The second is a state-of-the-art similarity measure for entity names (SoftTFIDF with Jaro-Winkler distance and $\theta = 0.9$); it was ranked the best measure in a recent study (Cohen et al., 2003).

Local decision (Pairwise) is done by pairing two mentions iff the similarity between them is above a fixed threshold. For Clustering, a graph-based clustering algorithm is used, where two mentions are paired iff they belong to the same connected component. Finally, we use the baseline and the SoftTFIDF in the context of Model II, where the appearance model is replaced by the similarity measure (footnote 6).

5.2 Results

The bottom line result is given in Tab. 1. All local similarity measures are compared in the context of the three levels of processing: local decision, clustering and our probabilistic Model II. Comparing the similarity measures at the local decision levels indicates that our unsupervised, learning-based appearance model is about the same as the state-of-the-art SoftTFIDF similarity. Comparing the decision levels, though, shows the contribution of our global model, and that the local appearance model behaves better with it than a fixed similarity measure does. A secondary observation is that our appearance model for Location is not as good as the one for People and Organization, probably due to the attribute transformation types chosen.

All (P/L/O)   Identity                SoftTFIDF               Appearance
Pairwise      70.7 (64.7/64.1/83.7)   82.1 (79.9/77.3/89.5)   81.5 (83.6/70.9/90.7)
Clustering    70.7 (64.7/64.1/83.7)   79.8 (70.6/76.7/91.0)   79.6 (70.9/76.1/91.0)
Model II      70.7 (64.7/64.1/83.7)   82.5 (79.8/77.4/90.2)   89.0 (92.7/81.9/92.9)

Table 1: Comparison of different decision levels and similarity measures. Three similarity measures (columns) are evaluated across three decision levels (rows). Performance is evaluated by the F1 values over the whole test set. The first number averages all entity types; numbers in parentheses represent People, Location and Organization respectively.

Tab. 2 presents a more detailed evaluation of the different approaches on the entity identity task.
All three probabilistic models outperform the discriminatory approaches in this experiment, an indication of the effectiveness of the generative model. We note that although Model III is more expressive and reasonable than Model II, it does not always perform better. Indeed, the global dependency among entities in Model III has a twofold outcome: it achieves better precision, but may degrade the recall.

Footnote 6: Note that both the appearance model $s(n_1, n_2) = P(n_1 \mid n_2)$ and the SoftTFIDF similarity measure are not symmetric. Also, we found that the SoftTFIDF similarity measure behaves very badly in the context of the probabilistic model, and improved it by converting it to $P(n_1 \mid n_2) = \frac{e^{c \cdot s(n_1, n_2)} - 1}{e^{c} - 1}$. $c$ was set to 10 in the experiments.

[Table 2: Performance of different approaches over all test examples. Columns: Entity Type; Model; InDoc F1 (%); InterDoc F1 (%); All R (%), P (%), F1 (%). Rows: models B, D, I, II and III for each of the entity types All, P, L and O. B, D, I, II and III denote the baseline model, the SoftTFIDF similarity model with clustering, and the three probabilistic models. All, P, L, O denote all entities, People, Locations and Organizations respectively. We distinguish between pairs of mentions that are inside the same document (InDoc, 15% of the pairs) or not (InterDoc). The numeric entries are missing from this transcription.]

The following example, taken from the corpus, illustrates the advantage of this model.

Example 5.1 "Sherman Williams" is mentioned along with the football team "Dallas Cowboys" in eight out of 300 documents, while "Jeff Williams" is mentioned along with "LA Dodgers" in two documents. In all the models except Model III, "Jeff Williams" is judged to correspond to the same entity as "Sherman Williams", since they are quite similar and the prior probability of the latter is higher than that of the former. Only in Model III, due to the dependency between "Jeff Williams" and "Dodgers", does the system identify it as corresponding to a different entity than "Sherman Williams".

While this exhibits the better precision achieved by Model III, the recall may go down.
The reason is that the global dependency among entities in Model III enforces restrictions over possible groupings of similar mentions; in addition, with a limited document set, estimating this global dependency cannot be done accurately, especially in the setting where the entities themselves need to be found when learning the model. We expect that Model III will dominate Model II when we have enough data to estimate the global dependencies more accurately.

Hard Cases: To analyze the experimental results further, we evaluated separately two types of harder cases of the entity identity task: (1) mentions with different writings that refer to the same entity; and (2) mentions with similar writings that refer to different entities. Models II and III outperform the other models in those two cases as well. Tab. 3 presents the F1 performance of different approaches in the first case. The best F1 value is only 73.1%, indicating that appearance similarity and global dependency are not sufficient to solve this problem when the writings are very different. Tab. 4 shows the performance of different approaches for disambiguating similar writings that
correspond to different entities. Both these cases exhibit the difficulty of the problem, and show that our approach provides a significant improvement over the state-of-the-art similarity measure (column D vs. column II in Tab. 4). They also show that it is necessary to use contextual attributes of the names, which are not yet included in this evaluation.

Columns: Model B, D, I, II, III. Rows: Peop, Loc, Org, All.
Table 3: Identifying different writings of the same entity (F1). We filter out identical writings and report only on cases of different writings of the same entity. The test set contains 46,376 matching pairs (but in different writings) in the whole data set.

Columns: Model B, D, I, II, III. Rows: Peop, Loc, Org, All.
Table 4: Identifying similar writings of different entities (F1). The test set contains 39,837 pairs of mentions that are associated with different entities in the 300 documents and have at least one token in common.

5.3 Other Tasks

In the following experiments, we evaluate our generative model on other tasks related to robust reading. We present results only for Model II.

Name Expansion: Given a mention m (for example, in an IR query q), we find the most likely entity e ∈ E for m using our inference algorithm. All unique mentions of the entity in the documents are output as the expansions of m. The accuracy of Name Expansion for a given mention is defined as the number of correct expansions over the total number of names output. The average accuracy of Name Expansion of Model II is shown in Tab. 5. Here is an example of a query:
Query: Who is Gore?
Expansions: Vice President Al Gore, Al Gore, Gore.

Prominence Information: We refer to Example 3.1 and use it to exemplify quantitatively how our system supports prominence ranking. The following examples show the ranking of entities with regard to the value of P(e) · P(m | e) (shown in brackets) using Model II, given a query name m.
Input: George Bush. 1. George Bush (2.49E-4) 2. George W. Bush (6.64E-7)
Input: Bush. 1. George W. Bush (5.13E-7) 2. George Bush (1.42E-7) 3.
Steve Bush (5.69E-10)

Columns: Entity Type (People, Location, Organization); Accuracy (%).
Table 5: Accuracy of Name Expansion. Accuracy is averaged over 30 randomly chosen queries for each entity type.

6 Conclusion and Future Work

This paper presents an unsupervised learning approach to several aspects of the robust reading problem: cross-document resolution of ambiguous writings of names. We developed a model that describes the natural generation process of a document and the process by which names are sprinkled into it, taking into account dependencies between entities across types and an author model. Several relaxations of this model were developed and studied experimentally, and compared to a state-of-the-art model that does not take a global view. The experiments exhibit good results and show the advantage of several aspects of our model.

This work is a preliminary exploration of the robust reading problem. There are several critical issues that our model can support but that were not included in this preliminary evaluation. Some of the issues that will be included in future steps are: (1) integration with more contextual information (like time and place) related to the target entities, both to support a better model and to allow temporal tracing of entities; (2) studying an incremental approach to learning the model; that is, when a new document is observed, how can we update our model parameters and the corresponding knowledge base? (3) integration of this work with other aspects of coreference resolution (e.g., other terms like pronouns that refer to an entity) and named entity recognition (which we now take as a given); and (4) scalability issues in applying the system to very large corpora.

References

M. Bilenko and R. Mooney. Adaptive duplicate detection using learnable string similarity measures. In KDD.
W. Cohen and J. Richman. Learning to match and cluster large high-dimensional data sets for data integration. In KDD.
W. Cohen, P. Ravikumar, and S. Fienberg. A comparison of string metrics for name-matching tasks. In IIWeb Workshop.
M. Hernandez and S. Stolfo. The merge/purge problem for large databases. In SIGMOD.
A. Kehler. Coherence, Reference, and the Theory of Grammar. CSLI Publications.
G. Mann and D. Yarowsky. Unsupervised personal name disambiguation. In CoNLL.
V. Ng and C. Cardie. Improving machine learning approaches to coreference resolution. In ACL.
H. Pasula, B. Marthi, B. Milch, S. Russell, and I. Shpitser. Identity uncertainty and citation matching. In NIPS.
W. Soon, H. Ng, and D. Lim. A machine learning approach to coreference resolution of noun phrases. Computational Linguistics (Special Issue on Computational Anaphora Resolution), 27:
E. Voorhees. Overview of the TREC-2002 question answering track. In Proceedings of TREC, pages