Random Walk Inference and Learning in A Large Scale Knowledge Base

Size: px
Start display at page:

Download "Random Walk Inference and Learning in A Large Scale Knowledge Base"

Transcription

1 Random Walk Inferene and Learning in A Large Sale Knowledge Base Ni Lao Carnegie Mellon University 5000 Forbes Avenue Pittsburgh, PA nlao@s.mu.edu Tom Mithell Carnegie Mellon University 5000 Forbes Avenue Pittsburgh, PA tom.mithell@s.mu.edu William W. Cohen Carnegie Mellon University 5000 Forbes Avenue Pittsburgh, PA wohen@s.mu.edu Abstrat We onsider the problem of performing learning and inferene in a large sale knowledge base ontaining imperfet knowledge with inomplete overage. We show that a soft inferene proedure based on a ombination of onstrained, weighted, random walks through the knowledge base graph an be used to reliably infer new beliefs for the knowledge base. More speifially, we show that the system an learn to infer different target relations by tuning the weights assoiated with random walks that follow different paths through the graph, using a version of the Path Ranking Algorithm (Lao and Cohen, 2010b). We apply this approah to a knowledge base of approximately 500,000 beliefs extrated imperfetly from the web by NELL, a never-ending language learner (Carlson et al., 2010). This new system improves signifiantly over NELL s earlier Horn-lause learning and inferene method: it obtains nearly double the preision at rank 100, and the new learning method is also appliable to many more inferene tasks. 1 Introdution Although there is a great deal of reent researh on extrating knowledge from text (Agihtein and Gravano, 2000; Etzioni et al., 2005; Snow et al., 2006; Pantel and Pennahiotti, 2006; Banko et al., 2007; Yates et al., 2007), muh less progress has been made on the problem of drawing reliable inferenes from this imperfetly extrated knowledge. In partiular, traditional logial inferene methods are too brittle to be used to make omplex inferenes from automatially-extrated knowledge, and probabilisti inferene methods (Rihardson and Domingos, 2006) suffer from salability problems. This paper onsiders the problem of onstruting inferene methods that an sale to large KB s, and that are robust to imperfet knowledge. The KB we onsider is a large triple store, whih an be represented as a labeled, direted graph in whih eah entity x is a node, eah binary relation R(x, y) is an edge labeled R between x and y, and unary onepts C(x) are represented as an edge labeled isa between the node for the entity x and a node for the onept C. We present a trainable inferene method that learns to infer relations by ombining the results of different random walks through this graph, and show that the method ahieves good saling properties and robust inferene in a KB ontaining over 500,000 triples extrated from the web by the NELL system (Carlson et al., 2010). 1.1 The NELL Case Study To evaluate our approah experimentally, we study it in the ontext of the NELL (Never Ending Language Learning) researh projet, whih is an effort to develop a never-ending learning system that operates 24 hours per day, for years, to ontinuously improve its ability to read (extrat strutured fats from) the web (Carlson et al., 2010). NELL began operation in January As of Marh 2011, NELL has built a knowledge base ontaining several million andidate beliefs whih it has extrated from the web with varying onfidene. Among these,

2 NELL has fairly high onfidene in approximately half a million, whih we refer to as NELL s (onfident) beliefs. NELL has lower onfidene in a few million others, whih we refer to as its andidate beliefs. NELL is given as input an ontology that defines hundreds of ategories (e.g., person, beverage, athlete, sport) and two-plae typed relations among these ategories (e.g., atheleteplayssport( athlete, sport )), whih it must learn to extrat from the web. It is also provided a set of 10 to 20 positive seed examples of eah suh ategory and relation, along with a downloaded olletion of 500 million web pages from the ClueWeb2009 orpus (Callan and Hoy, 2009) as unlabeled data, and aess to 100,000 queries eah day to Google s searh engine. Eah day, NELL has two tasks: (1) to extrat additional beliefs from the web to populate its growing knowledge base (KB) with instanes of the ategories and relations in its ontology, and (2) to learn to perform task 1 better today than it ould yesterday. We an measure its learning ompetene by allowing it to onsider the same text douments today as it did yesterday, and reording whether it extrats more beliefs, more aurately today than it did yesterday. NELL uses a large-sale semi-supervised multitask learning algorithm that ouples the training of over 1500 different lassifiers and extration methods (see (Carlson et al., 2010)). Although many of the details of NELL s learning method are not entral to this paper, two points should be noted. First, NELL is a multistrategy learning system, with omponents that learn from different views of the data (Blum and Mithell, 1998): for instane, one view uses orthographi features of a potential entity name (like ontains apitalized words ), and another uses free-text ontexts in whih the noun phrase is found (e.g., X frequently follows the bigram mayor of ). Seond, NELL is a bootstrapping system, whih self-trains on its growing olletion of onfident beliefs. 1.2 Knowledge Base Inferene: Horn Clauses Although NELL has now grown a sizable knowledge base, its ability to perform inferene over this knowledge base is urrently very limited. At present its only inferene method beyond simple inheritane HinesWard Eli Manning AthletePlays ForTeam AthletePlays ForTeam Steelers Giants TeamPlays InLeague TeamPlays InLeague TeamPlays InLeague Figure 1: An example subgraph. NFL MLB involves applying first order Horn lause rules to infer new beliefs from urrent beliefs. For example, it may use a Horn lause suh as AthletePlaysForTeam(x, y) (1) TeamPlaysInLeague(y, z) AthletePlaysInLeague(x,z) to infer that AthletePlaysInLeague(HinesWard,NFL), if it has already extrated the beliefs in the preonditions of the rule, with variables x, y and z bound to HinesWard, PittsburghSteelers and NFL respetively as shown in Figure 1. NELL urrently has a set of approximately 600 suh rules, whih it has learned by data mining its knowledge base of beliefs. Eah learned rule arries a onditional probability that its onlusion will hold, given that its preonditions are satisfied. NELL learns these Horn lause rules using a variant of the FOIL algorithm (Quinlan and Cameron-Jones, 1993), heneforth N-FOIL. N-FOIL takes as input a set of positive and negative examples of a rule s onsequent (e.g., +AthletePlaysInLeague(HinesWard,NFL), AthletePlaysInLeague(HinesWard,NBA)), and uses a separate-and-onquer strategy to learn a set of Horn lauses that fit the data well. Eah Horn lause is learned by starting with a general rule and progressively speializing it, so that it still overs many positive examples but overs few negative examples. After a lause is learned, the examples overed by that lause are removed from the training set, and the proess repeats until no positive examples remain. Learning first-order Horn lauses is omputationally expensive not only is the searh spae large, but some Horn lauses an be ostly to evaluate (Cohen and Page, 1995). N-FOIL uses two triks to improve its salability. First, it assumes that the onsequent prediate is funtional e.g., that eah Athlete plays in at most one League. This means that expliit negative examples need not be

3 provided (Zelle et al., 1995): e.g., if AthletePlaysIn- League(HinesWard,NFL) is a positive example, then AthletePlaysInLeague(HinesWard,z ) for any other value of z is negative. In general, this onstraint guides the searh algorithm toward Horn lauses that have fewer possible instantiations, and hene are less expensive to math. Seond, N-FOIL uses relational pathfinding (Rihards and Mooney, 1992) to produe general rules i.e., the starting point for a lause of the form R(x, y) is found by looking at positive instanes R(a, b) of the onsequent, and finding a lause that orresponds to a bounded-length path of binary relations that link a to b. In the example above, a start lause might be the lause (1). As in FOIL, the lause is then (potentially) speialized by greedily adding additional onditions (like ProfessionalAthlete(x)) or by replaing variables with onstants (eg, replaing z with NFL). For eah N-FOIL rule, an estimated onditional probability ˆP (onlusion preonditions) is alulated using a Dirihlet prior aording to ˆP = (N + + m prior)/(n + + N + m) (2) where N + is the number of positive instanes mathed by this rule in the FOIL training data, N is the number of negative instanes mathed, m = 5 and prior = 0.5. As the results below show, N-FOIL generally learns a small number of high-preision inferene rules. One important role of these inferene rules is that they ontribute to the bootstrapping proedure, as inferenes made by N-FOIL inrease either the number of andidate beliefs, or (if the inferene is already a andidate) improve NELL s onfidene in andidate beliefs. 1.3 Knowledge Base Inferene: Graph Random Walks In this paper, we onsider an alternative approah, based on the Path Ranking Algorithm (PRA) of Lao and Cohen (2010b), desribed in detail below. PRA learns to rank graph nodes y relative to a query node x. PRA begins by enumerating a large set of bounded-length edge-labeled path types, similar to the initial lauses used in NELL s variant of FOIL. These path types are treated as ranking experts, eah performing a random walk through the graph, onstrained to follow that sequene of edge types, and ranking nodes y by their weights in the resulting distribution. Finally PRA ombines these experts using logisti regression. As an example, onsider a path from x to y via the sequene of edge types isa, isa 1 (the inverse of isa), and AthletePlaysInLeague, whih orresponds to the Horn lause isa(x, ) isa 1 (, x ) (3) AthletePlaysInLeague(x, y) AthletePlaysInLeague(x, y) Suppose a random walk starts at a query node x (say x=hinesward). If HinesWard is linked to the single onept node ProfessionalAthlete via isa, the walk will reah that node with probability 1 after one step. If A is the set of ProfessionalAthlete s in the KB, then after two steps, the walk will have probability 1/ A of being at any x A. If L is the set of athleti leagues and l L, let A l be the set of athletes in league l: after three steps, the walk will have probability A l / A of being at any point y L. In short, the ranking assoiated with this path gives the prior probability of a value y being an athleti league for x whih is useful as a feature in a ombined ranking method, although not by itself a high-preision inferene rule. Note that the rankings produed by this expert will hange as the knowledge base evolves for instane, if the system learns about proportionally more soer players than hokey players over time, then the league rankings for the path of lause (3) will hange. Also, the ranking is speifi to the query node x. For instane, suppose the KB ontains fats whih reflet the ambiguity of the team name Giants 1 as in Figure 1. Then the path for lause (1) above will give lower weight to y = NFL for x = EliManning than to y = NFL for x = HinesWard. The main ontribution of this paper is to introdue and evaluate PRA as an algorithm for making probabilisti inferene in large KBs. Compared to Horn lause inferene, the key harateristis of this new inferene method are as follows: The evidene in support of inferring a relation instane R(a, b) is based on many existing paths between a and b in the urrent KB, ombined using a learned logisti funtion. 1 San Franiso s Major-League Baseball and New York s National Football League teams are both alled the Giants.

4 The onfidene in an inferene is sensitive to the urrent state of the knowledge base, and the speifi entities being queried (sine the paths used in the inferene have these properties). Experimentally, the inferene method yields many more moderately-onfident inferenes than the Horn lauses learned by N-FOIL. The learning and inferene are more effiient than N-FOIL, in part beause we an exploit effiient approximation shemes for random walks (Lao and Cohen, 2010a). The resulting inferene is as fast as 10 milliseonds per query on average. The Path Ranking Algorithm (PRA) we use is similar to that desribed elsewhere (Lao and Cohen, 2010b), exept that to ahieve effiient model learning, the paths between a and b are determined by the statistis from a population of training queries rather than enumerated ompletely. PRA uses random walks to generate relational features on graph data, and ombine them with a logisti regression model. Compared to other relational models (e.g. FOIL, Markov Logi Networks), PRA is extremely effiient at link predition or retrieval tasks, in whih we are interested in identifying top links from a large number of andidates, instead of fousing on a partiular node pair or joint inferenes. 1.4 Related Work The TextRunner system (Cafarella et al., 2006) answers list queries on a large knowledge base produed by open domain information extration. Spreading ativation is used to measure the loseness of any node to the query term nodes. This approah is similar to the random walk with restart approah whih is used as a baseline in our experiment. The FatRank system (Jain and Pantel, 2010) ompares different ways of onstruting random walks, and ombining them with extration sores. However, the shortoming of both approahes is that they ignore edge type information, whih is important for ahieving high auray preditions. The HOLMES system (Shoenmakers et al., 2008) derives new assertions using a few manually written inferene rules. A Markov network orresponding to the grounding of these rules to the knowledge base is onstruted for eah query, and then belief propagation is used for inferene. In omparison, our proposed approah disovers inferene rules automatially from training data. Below we desribe our approah in greater detail, provide experimental evidene of its value for performing inferene in NELL s knowledge base, and disuss impliations of this work and diretions for future researh. 2 Approah In this setion, we first desribe how we formulate link (relation) predition on a knowledge base as a ranking task. Then we review the Path Ranking Algorithm (PRA) introdued by Lao and Cohen (2010b; 2010a). After that, we desribe two improvement to the PRA method so that it is more suitable for the task of link predition on knowledge base. The first improvement helps PRA deal with the large number of relations whih is typial of large knowledge bases. The seond improvement aims at improving the quality of inferene by applying low variane sampling. 2.1 Learning with NELL s Knowledge Base For eah relation R in the knowledge base we train a model for the link predition task: given a onept x, find all other onepts y whih potentially have the relation R(x, y). This predition is made based on an existing knowledge base extrated imperfetly from the web. Although a model an potentially benefit from prediting multiple relations jointly, suh joint inferene is beyond the sope of this work. To ensure reasonable number of training instanes, we generate labeled training queries from 48 relations whih have more than 100 instanes in the knowledge base. We reate two tasks for eah relation i.e., prediting y given x and prediting x given y yielding 96 tasks in all. Eah node x whih has relation R in the knowledge base with any other node is treated as a training query, the atual nodes y in the knowledge base known to satisfy R(x, y) are treated as labeled positive examples, and any other nodes are treated as negative examples.

5 2.2 Path Ranking Algorithm Review We now review the Path Ranking Algorithm introdued by Lao and Cohen (2010b). A relation path P is defined as a sequene of relations R 1... R l, and in order to emphasize the types assoiated with eah step, P an also be written as R T 1 R 0... l Tl, where T i = range(r i ) = domain(r i+1 ), and we also define domain(p ) T 0, range(p ) T l. In the experiments in this paper, there is only one type of nodes alled onept, whih an be onneted through different types of relations. In this notation, relations like the team ertain player plays for, and the league ertain player s team is in an be expressed by the paths below (respetively): P 1 : onept AtheletePlayesForTeam onept P 2 : onept AtheletePlayesForTeam onept TeamPlaysInLeagure onept For any relation path P = R 1... R l and a seed node s domain(p ), a path onstrained random walk defines a distribution h s,p reursively as follows. If P is the empty path, then define { 1, if e = s h s,p (e) = (4) 0, otherwise If P = R 1... R l is nonempty, then let P = R 1... R l 1, and define h s,p (e) = h s,p (e ) P (e e ; R l ), (5) e range(p ) where P (e e ; R l ) = R l(e,e) R l (e, ) is the probability of reahing node e from node e with a one step random walk with edge type R l. R(e, e) indiates whether there exists an edge with type R that onnet e to e. More generally, given a set of paths P 1,..., P n, one ould treat eah h s,pi (e) as a path feature for the node e, and rank nodes by a linear model θ 1 h s,p1 (e) + θ 2 h s,p2 (e) +... θ n h s,pn (e) where θ i are appropriate weights for the paths. This gives a ranking of nodes e related to the query node s by the following soring funtion sore(e; s) = P P l h s,p (e)θ P, (6) where P l is the set of relation paths with length l. Given a relation R and a set of node pairs {(s i, t i )}, we an onstrut a training dataset D = {(x i, r i )}, where x i is a vetor of all the path features for the pair (s i, t i ) i.e., the j-th omponent of x i is h si,p j (t i ), and r i indiates whether R(s i, t i ) is true. Parameter θ is estimated by maximizing the following regularized objetive funtion O(θ) = i o i (θ) λ 1 θ 1 λ 2 θ 2, (7) where λ 1 ontrols L 1 -regularization to help struture seletion, and λ 2 ontrols L 2 -regularization to prevent overfitting. o i (θ) is a per-instane objetive funtion, whih is defined as o i (θ) = w i [r i ln p i + (1 y i ) ln(1 p i )], (8) where p i is the predited relevane defined as p i = p(r i = 1 x i ; θ) = exp(θt x i ) 1+exp(θ T x i ), and w i is an importane weight to eah example. A biased sampling proedure selets only a small subset of negative samples to be inluded in the objetive (see (Lao and Cohen, 2010b) for detail). 2.3 Data-Driven Path Finding In prior work with PRA, P l was defined as all relation paths of length at most l. When the number of edge types is small, one an generate P l by enumeration; however, for domains with a large number of relations (e.g., a knowledge base), it is impratial to enumerate all possible relation paths even for small l. For instane, if the number of relations related to eah node is 100, even the number of length three paths easily reahes millions. For other domains like parsed natural language sentenes, useful paths an be as long as ten relations (Minkov and Cohen, 2008). In this ase, even with smaller number of possible relations, the total number of relation path is still too large for systemati enumeration. In order to apply PRA to these domains, we modify the path generation proedure in PRA to produe only paths whih are potentially useful for the task. Define a query s to be supporting a path P if h s,p (e) 0 for any entity e. We require that any node reated during path finding needs to

6 Table 1: Number of paths in PRA models of maximum path length 3 and 4. Averaged over 96 tasks. l=3 l=4 all paths up to length L 15, 376 1, 906, 624 +query support α = ever reah a target entity L 1 regularization be supported by at least a fration α of the training queries s i, as well as being of length no more than l (In the experiments, we set α = 0.01) We also require that a path inluded in the PRA model only if it retrieves at least one target entity t i in the training set. As we an see from Table 1, together these two onstraints dramatially redue the number of paths that need to be onsidered, relative to systematially enumerating all possible paths. L1 regularization redues the size of the model even more. The idea of finding paths that onnets nodes in a graph is not new. It has been embodied previously in first-order learning systems (Rihards and Mooney, 1992) as well as N-FOIL, and relational database searhing systems (Bhalotia et al., 2002). These approahes onsider a single query during path finding. In omparison, the data-driven path finding method we desribed here uses statistis from a population of queries, and therefore an potentially determine the importane of a path more reliably. 2.4 Low-Variane Sampling Lao and Cohen (2010a) previously showed that sampling tehniques like finger printing and partile filtering an signifiantly speedup random walk without sarifiing retrieval quality. However, the sampling proedures an indue a loss of diversity in the partile population. For example, onsider a node in the graph with just two out links with equal weights, and suppose we are required to generate two walkers starting from this node. A disappointing result is that with 50 perent hane both walkers will follow the same branh, and leave the other branh with no probability mass. To overome this problem, we apply a tehnique alled Low-Variane Sampling (LVS) (Thrun et al., 2005), whih is ommonly used in robotis to improve the quality of sampling. Instead of generating independent samples from a distribution, LVS uses a single random number to generate all Table 2: Comparing PRA with RWR models. MRRs and training times are averaged over 96 tasks. l=2 l=3 MRR Time MRR Time RWR(no train) RWR s s PRA s s samples, whih are evenly distributed aross the whole distribution. Note that given a distribution P (x), any number r in [0, 1] points to exatly one x value, namely x = arg min j m=1..j P (m) r. Suppose we want to generate M samples from P (x). LVS first generates a random number r in the interval [0, M 1 ]. Then LVS repeatedly adds the fixed amount M 1 to r and hooses x values orresponding to the resulting numbers. 3 Results This setion reports empirial results of applying random walk inferene to NELL s knowledge base after the 165th iteration of its learning proess. We first investigate PRA s behavior by ross validation on the training queries. Then we ompare PRA and N-FOIL s ability to reliably infer new beliefs, by leveraging the Amazon Mehanial Turk servie. 3.1 Cross Validation on the Training Queries Random Walk with Restart (RWR) (also alled personalized PageRank (Haveliwala, 2002)) is a general-purpose graph proximity measure whih has been shown to be fairly suessful for many types of tasks. We ompare PRA to two versions of RWR on the 96 tasks of link predition with NELL s knowledge base. The two baseline methods are an untrained RWR model and a trained RWR model as desribed by Lao and Cohen (2010b). (In brief, in the trained RWR model, the walker will probabilistially prefer to follow edges assoiated with different labels, where the weight for eah edge label is hosen to minimize a loss funtion, suh as Equation 7. In the untrained model, edge weights are uniform.) We explored a range of values for the regularization parameters L 1 and L 2 using ross validation on the training data, and we fix both L 1 and L 2 parameters to for all tasks. The

7 MRR k 10k Exat 10k Independent Fingerprinting Low Variane Fingerprinting Independent Filtering Low Variane Filtering Random Walk Speedup Figure 2: Compare inferene speed and quality over 96 tasks. The speedup is relative to exat inferene, whih is on average 23ms per query. 1k 100 maximum path length is fixed to 3. 2 Table 2 ompares the three methods using 5-fold ross validation and the Mean Reiproal Rank (MRR) 3 measure, whih is defined as the inverse rank of the highest ranked relevant result in a set of results. If the the first returned result is relevant, then MRR is 1.0, otherwise, it is smaller than 1.0. Supervised training an signifiantly improve retrieval quality (p-value= omparing untrained and trained RWR), and leveraging path information an produe further improvement (p-value= omparing trained RWR with PRA). The average training time for a prediate is only a few seonds. We also investigate the effet of low-variane sampling on the quality of predition. Figure 2 ompares independent and low variane sampling when applied to finger printing and partile filtering (Lao and Cohen, 2010a). The horizontal axis orresponds to the speedup of random walk ompared with exat inferene, and the vertial axis measures the quality of predition by MRR with three fold ross validation on the training query set. Low-variane sampling an improve predition for both finger 2 Results with maximum length 4 are not reported here. Generally models with length 4 paths produe slightly better results, but are 4-5 times slower to train 3 For a set of queries Q, MRR = 1 Q q Q 1k 1 rank of the first orret answer for q printing and partile filtering. The numbers on the urves indiate the number of partiles (or walkers). When using a large number of partiles, the partile filtering methods onverge to the exat inferene. Interestingly, when using a large number of walkers, the finger printing methods produe even better predition quality than exat inferene. Lao and Cohen notied a similar improvement on retrieval tasks, and onjetured that it is beause the sampling inferene imposes a regularization penalty on longer relation paths (2010a). 3.2 Evaluation by Mehanial Turk The ross-validation result above assumes that the knowledge base is omplete and orret, whih we know to be untrue. To aurately ompare PRA and N-FOIL s ability to reliably infer new beliefs from an imperfet knowledge base, we use human assessments obtained from Amazon Mehanial Turk. To limit labeling osts, and sine our goal is to improve the performane of NELL, we do not inlude RWR-based approahes in this omparison. Among all the 24 funtional prediates, N-FOIL disovers onfident rules for 8 of them (it produes no result for the other 16 prediates). Therefore, we ompare the quality of PRA to N-FOIL on these 8 prediates only. Among all the 72 non-funtional prediates whih N-FOIL annot be applied to PRA exhibits a wide range of performane in ross-validation. The are 43 tasks for whih PRA obtains MRR higher than 0.4 and builds a model with more than 10 path features. We randomly sampled 8 of these prediates to be evaluated by Amazon Mehanial Turk. Table 3 shows the top two weighted PRA features for eah task on whih N-FOIL an suessfully learn rules. These PRA rules an be ategorized into broad overage rules whih behave like priors over orret answers (e.g. 1-2, 4-6, 15), aurate rules whih leverage speifi relation sequenes (e.g. 9, 11, 14), rules whih leverage information about the synonyms of the query node (e.g. 7-8, 10, 12), and rules whih leverage information from a loal neighborhood of the query node (e.g. 3, 12-13, 16). The synonym paths are useful, beause an entity may have multiple names on the web. We find that all 17 general rules (no speialization) learned by N-FOIL an be expressed as length two relation

8 Table 3: The top two weighted PRA paths for tasks on whih N-FOIL disovers onfident rules. stands for onept. ID athleteplaysforteam leagueplayers PRA Path (Comment) athleteplaysforteam 1 (teams with many players in the athlete s league) 2 leagueteams teamagainstteam (teams that play against many teams in the athlete s league) 3 athleteplayssport players (the league that players of a ertain sport belong to) 4 isa isa 1 athleteplayssport 5 isa isa 1 6 stadiumloatedincity 7 stadiumhometeam 8 latitudelongitude teamhomestadium 9 teamplaysincity 10 teammember teamplaysincity 11 teamhomestadium (popular leagues with many players) athleteplayssport (popular sports of all the athletes) superpartoforganization teamhomestadium teamplayssport (popular sports of a ertain league) stadiumloatedincity (ity of the stadium with the same team) latitudelongitudeof stadiumloatedincity (ity of the stadium with the same loation) itystadiums (stadiums loated in the same ity with the query team) athleteplaysforteam stadiumloatedincity (ity of the team s home stadium) 12 teamhomestadium stadiumhometeam teamplaysinleague 13 teamplayssport players teamplaysagainstteam 14 teamplayssport 15 isa isa 1 16 teamplaysinleague teamplayssport teamhomestadium (home stadium of teams whih share players with the query) teamplaysincity (ity of teams with the same home stadium as the query) (the league that the query team s members belong to) teamplaysinleague (the league that the query team s ompeting team belongs to) (sports played by many teams) leagueteams teamplayssport (the sport played by other teams in the league) Table 4: Amazon Mehanial Turk evaluation for the promoted knowledge. Using paired t-test at task level, PRA is not statistially different than N-FOIL for p@10 (p-value=0.3), but is signifiantly better for p@100 (p-value=0.003) PRA N-FOIL Task P majority #Paths p@10 p@100 p@1000 #Rules #Query p@10 p@100 p@1000 athleteplaysforteam (+1) (+30) athleteplayssport (+30) stadiumloatedincity (+0) teamhomestadium (+0) teamplaysincity (+0) teamplaysinleague (+151) teamplayssport (+86) average teammember ompaniesheadquarteredin publiationjournalist produedby N-FOIL does not produe results ompeteswith for non-funtional prediates hasoffieincity teamwontrophy worksfor average

9 Table 5: Compare Mehanial Turk workers voted assessments with our gold standard labels based on 100 samples. AMT=F AMT=T Gold=F 25% 15% Gold=T 11% 49% paths suh as path 11. In omparison, PRA explores a feature spae with many length three paths. For eah relation R to be evaluated, we generate test queries s whih belong to domain(r). Queries whih appear in the training set are exluded. For eah query node s, we applied a trained model (either PRA or N-FOIL) to generate a ranked list of andidate t nodes. For PRA, the andidates are sorted by their sores as in Eq. (6). For N-FOIL, the andidates are sorted by the estimated auraies of the rules as in Eq. (2) (whih generate the andidates). Sine there are about 7 thousand (and 13 thousand) test queries s for eah funtional (and non-funtional) prediate R, and there are (potentially) thousands of andidates t returned for eah query s, we annot evaluate all andidates of all queries. Therefore, we first sort the queries s for eah prediate R by the sores of their top ranked andidate t in desending order, and then alulate preisions at top 10, 100 and 1000 positions for the 1 ),..., where list of result R(s R,1, t R,1 1 ), R(s R,2, t R,2 s R,1 is the first query for prediate R, t R,1 1 is its first andidate, s R,2 is the seond query for prediate R, t R,2 1 is its first andidate, so on and so forth. To redue the labeling load, we judge all top 10 queries for eah prediate, but randomly sample 50 out of the top 100, and randomly sample 50 out of the top Eah belief is evaluated by 5 workers at Mehanial Turk, who are given assertions like Hines Ward plays for the team Steelers, as well as Google searh links for eah entity, and the ombination of both entities. Statistis shows that the workers spend on average 25 seonds to judge eah belief. We also remove some workers judgments whih are obviously inorret 4. We sampled 100 beliefs, and ompared their voted result to gold-standard labels produed by one author of this paper. Table 5 shows that 74% of the time the workers voted result agrees with our judgement. 4 Certain workers label all the questions with the same answer Table 4 shows the evaluation result. The P majority olumn shows for eah prediate the auray ahieved by the majority predition: given a query R(x,?), predit the y that most often satisfies R over all possible x in the knowledge base. Thus, the higher P majority is, the simpler the task. Prediting the funtional prediates is generally easier prediting the non-funtional prediates. The #Query olumn shows the number of queries on whih N-FOIL is able to math any of its rules, and hene produe a andidate belief. For most prediates, N-FOIL is only able to produe results for at most a few hundred queries. In omparison, PRA is able to produe results for 6,599 queries on average for eah funtional prediate, and 12,519 queries on average for eah non-funtional prediate. Although the preision at 10 (p@10) of N-FOIL is omparable to that of PRA, preision at 10 and at 1000 (p@100 and p@1000) are muh lower 5. The #Path olumn shows the number of paths learned by PRA, and the #Rule olumn shows the number of rules learned by N-FOIL, with the numbers before brakets orrespond to unspeialized rules, and the numbers in brakets orrespond to speialized rules. Generally, speialized rules have muh smaller reall than unspeialized rules. Therefore, the PRA approah ahieves high reall partially by ombining a large number of unspeialized paths, whih orrespond to unspeialized rules. However, learning more aurate speialized paths is part of our future work. A signifiant advantage of PRA over N-FOIL is that it an be applied to non-funtional prediates. The last eight rows of Table 4 show PRA s performane on eight of these prediates. Compared to the result on funtional prediates, preisions at 10 and at 100 of non-funtional prediates are slightly lower, but preisions at 1000 are omparable. We note that for some prediates preision at 1000 is better than at 100. After some investigation we found that for many relations, the top portion of the result list is more diverse: i.e. showing produts produed by different ompanies, journalist working at different publiations. 5 If a method makes k preditions, and k < n, then p@n is the number orret out of the k preditions, divided by n

10 While the lower half of the result list is more homogeneous: i.e. showing relations onentrated on one or two ompanies/publiations. On the other hand, through the proess of labeling the Mehanial Turk workers seem to build up a prior about whih ompany/publiation is likely to have orret beliefs, and their judgments are positively biased towards these ompanies/publiations. These two fators ombined together result in positive bias towards the lower portion of the result list. In future work we hope to design a labeling strategy whih avoids this bias. 4 Conlusions and Future Work We have shown that a soft inferene proedure based on a ombination of onstrained, weighted, random walks through the knowledge base graph an be used to reliably infer new beliefs for the knowledge base. We applied this approah to a knowledge base of approximately 500,000 beliefs extrated imperfetly from the web by NELL. This new system improves signifiantly over NELL s earlier Horn-lause learning and inferene method: it obtains nearly double the preision at rank 100. The inferene and learning are both very effiient our experiment shows that the inferene time is as fast as 10 milliseonds per query on average, and the training for a prediate takes only a few seonds. There are some prominent diretions for future work. First, inferene starting from both the query nodes and target nodes (Rihards and Mooney, 1992) an be muh more effiient in disovering long paths than just inferene from the query nodes. Seond, inferene starting from the target nodes of training queries is a potential way to disover speialized paths (with grounded nodes). Third, generalizing inferene paths to inferene trees or graphs an produe more expressive random walk inferene models. Overall, we believe that random walk is a promising way to sale up relational learning to domains with very large data sets. Aknowledgments This work was supported by NIH under grant R01GM081293, by NSF under grant IIS , by DARPA under awards FA and AF C-0179, and by a gift from Google. We thank Geoffrey J. Gordon for the suggestion of applying low variane sampling to random walk inferene. We also thank Bryan Kisiel for help with the NELL system. Referenes Eugene Agihtein and Luis Gravano Snowball: extrating relations from large plain-text olletions. In Proeedings of the fifth ACM onferene on Digital libraries - DL 00, pages 85 94, New York, New York, USA. ACM Press. Mihele Banko, Mihael J. Cafarella, Stephen Soderland, Matthew Broadhead, and Oren Etzioni Open Information Extration from the Web. In IJCAI, pages Gaurav Bhalotia, Arvind Hulgeri, Charuta Nakhe, Soumen Chakrabarti, and S. Sudarshan Keyword searhing and browsing in databases using banks. ICDE, pages Avrim Blum and Tom Mithell Combining labeled and unlabeled data with o-training. In Proeedings of the eleventh annual onferene on Computational learning theory - COLT 98, pages , New York, New York, USA. ACM Press. MJ Cafarella, M Banko, and O Etzioni Relational Web Searh. In WWW. Jamie Callan and Mark Hoy Clueweb09 data set. Andrew Carlson, Justin Betteridge, Bryan Kisiel, Burr Settles, Estevam R. Hrushka Jr., and Tom M. Mithell Toward an Arhiteture for Never-Ending Language Learning. In AAAI. William W. Cohen and David Page Polynomial learnability and indutive logi programming: Methods and results. New Generation Comput., 13(3&4): Oren Etzioni, Mihael Cafarella, Doug Downey, Anamaria Popesu, Tal Shaked, Stephen Soderl, Daniel S. Weld, and Er Yates Unsupervised namedentity extration from the web: An experimental study. Artifiial Intelligene, 165: Taher H. Haveliwala Topi-sensitive pagerank. In WWW, pages Alpa Jain and Patrik Pantel Fatrank: Random walks on a web of fats. In COLING, pages Ni Lao and William W. Cohen. 2010a. Fast query exeution for retrieval models based on path-onstrained random walks. KDD. Ni Lao and William W. Cohen. 2010b. Relational retrieval using a ombination of path-onstrained random walks. Mahine Learning.

11 Einat Minkov and William W. Cohen Learning graph walk based similarity measures for parsed text. EMNLP, pages Patrik Pantel and Maro Pennahiotti Espresso: Leveraging Generi Patterns for Automatially Harvesting Semanti Relations. In ACL. J. Ross Quinlan and R. Mike Cameron-Jones FOIL: A Midterm Report. In ECML, pages Bradley L. Rihards and Raymond J. Mooney Learning relations by pathfinding. In Proeedings of the Tenth National Conferene on Artifiial Intelligene (AAAI-92), pages 50 55, San Jose, CA, July. Matthew Rihardson and Pedro Domingos Markov logi networks. Mahine Learning. Stefan Shoenmakers, Oren Etzioni, and Daniel S. Weld Saling Textual Inferene to the Web. In EMNLP, pages Rion Snow, Daniel Jurafsky, and Andrew Y. Ng Semanti Taxonomy Indution from Heterogenous Evidene. In ACL. Sebastian Thrun, Wolfram Burgard, and Dieter Fox Probabilisti Robotis (Intelligent Robotis and Autonomous Agents). The MIT Press. Alexander Yates, Mihele Banko, Matthew Broadhead, Mihael J. Cafarella, Oren Etzioni, and Stephen Soderland TextRunner: Open Information Extration on the Web. In HLT-NAACL (Demonstrations), pages John M. Zelle, Cynthia A. Thompson, Mary Elaine Califf, and Raymond J. Mooney Induing logi programs without expliit negative examples. In Proeedings of the Fifth International Workshop on Indutive Logi Programming (ILP-95), pages , Leuven, Belgium.

A Holistic Method for Selecting Web Services in Design of Composite Applications

A Holistic Method for Selecting Web Services in Design of Composite Applications A Holisti Method for Seleting Web Servies in Design of Composite Appliations Mārtiņš Bonders, Jānis Grabis Institute of Information Tehnology, Riga Tehnial University, 1 Kalu Street, Riga, LV 1658, Latvia,

More information

Channel Assignment Strategies for Cellular Phone Systems

Channel Assignment Strategies for Cellular Phone Systems Channel Assignment Strategies for Cellular Phone Systems Wei Liu Yiping Han Hang Yu Zhejiang University Hangzhou, P. R. China Contat: wliu5@ie.uhk.edu.hk 000 Mathematial Contest in Modeling (MCM) Meritorious

More information

Hierarchical Clustering and Sampling Techniques for Network Monitoring

Hierarchical Clustering and Sampling Techniques for Network Monitoring S. Sindhuja Hierarhial Clustering and Sampling Tehniques for etwork Monitoring S. Sindhuja ME ABSTRACT: etwork monitoring appliations are used to monitor network traffi flows. Clustering tehniques are

More information

Open and Extensible Business Process Simulator

Open and Extensible Business Process Simulator UNIVERSITY OF TARTU FACULTY OF MATHEMATICS AND COMPUTER SCIENCE Institute of Computer Siene Karl Blum Open and Extensible Business Proess Simulator Master Thesis (30 EAP) Supervisors: Luiano Garía-Bañuelos,

More information

A Context-Aware Preference Database System

A Context-Aware Preference Database System J. PERVASIVE COMPUT. & COMM. (), MARCH 005. TROUBADOR PUBLISHING LTD) A Context-Aware Preferene Database System Kostas Stefanidis Department of Computer Siene, University of Ioannina,, kstef@s.uoi.gr Evaggelia

More information

Discovering Trends in Large Datasets Using Neural Networks

Discovering Trends in Large Datasets Using Neural Networks Disovering Trends in Large Datasets Using Neural Networks Khosrow Kaikhah, Ph.D. and Sandesh Doddameti Department of Computer Siene Texas State University San Maros, Texas 78666 Abstrat. A novel knowledge

More information

Chapter 1 Microeconomics of Consumer Theory

Chapter 1 Microeconomics of Consumer Theory Chapter 1 Miroeonomis of Consumer Theory The two broad ategories of deision-makers in an eonomy are onsumers and firms. Eah individual in eah of these groups makes its deisions in order to ahieve some

More information

' R ATIONAL. :::~i:. :'.:::::: RETENTION ':: Compliance with the way you work PRODUCT BRIEF

' R ATIONAL. :::~i:. :'.:::::: RETENTION ':: Compliance with the way you work PRODUCT BRIEF ' R :::i:. ATIONAL :'.:::::: RETENTION ':: Compliane with the way you work, PRODUCT BRIEF In-plae Management of Unstrutured Data The explosion of unstrutured data ombined with new laws and regulations

More information

Sebastián Bravo López

Sebastián Bravo López Transfinite Turing mahines Sebastián Bravo López 1 Introdution With the rise of omputers with high omputational power the idea of developing more powerful models of omputation has appeared. Suppose that

More information

Weighting Methods in Survey Sampling

Weighting Methods in Survey Sampling Setion on Survey Researh Methods JSM 01 Weighting Methods in Survey Sampling Chiao-hih Chang Ferry Butar Butar Abstrat It is said that a well-designed survey an best prevent nonresponse. However, no matter

More information

Ranking Community Answers by Modeling Question-Answer Relationships via Analogical Reasoning

Ranking Community Answers by Modeling Question-Answer Relationships via Analogical Reasoning Ranking Community Answers by Modeling Question-Answer Relationships via Analogial Reasoning Xin-Jing Wang Mirosoft Researh Asia 4F Sigma, 49 Zhihun Road Beijing, P.R.China xjwang@mirosoft.om Xudong Tu,Dan

More information

Big Data Analysis and Reporting with Decision Tree Induction

Big Data Analysis and Reporting with Decision Tree Induction Big Data Analysis and Reporting with Deision Tree Indution PETRA PERNER Institute of Computer Vision and Applied Computer Sienes, IBaI Postbox 30 11 14, 04251 Leipzig GERMANY pperner@ibai-institut.de,

More information

Findings and Recommendations

Findings and Recommendations Contrating Methods and Administration Findings and Reommendations Finding 9-1 ESD did not utilize a formal written pre-qualifiations proess for seleting experiened design onsultants. ESD hose onsultants

More information

Deadline-based Escalation in Process-Aware Information Systems

Deadline-based Escalation in Process-Aware Information Systems Deadline-based Esalation in Proess-Aware Information Systems Wil M.P. van der Aalst 1,2, Mihael Rosemann 2, Marlon Dumas 2 1 Department of Tehnology Management Eindhoven University of Tehnology, The Netherlands

More information

User s Guide VISFIT: a computer tool for the measurement of intrinsic viscosities

User s Guide VISFIT: a computer tool for the measurement of intrinsic viscosities File:UserVisfit_2.do User s Guide VISFIT: a omputer tool for the measurement of intrinsi visosities Version 2.a, September 2003 From: Multiple Linear Least-Squares Fits with a Common Interept: Determination

More information

Improved Vehicle Classification in Long Traffic Video by Cooperating Tracker and Classifier Modules

Improved Vehicle Classification in Long Traffic Video by Cooperating Tracker and Classifier Modules Improved Vehile Classifiation in Long Traffi Video by Cooperating Traker and Classifier Modules Brendan Morris and Mohan Trivedi University of California, San Diego San Diego, CA 92093 {b1morris, trivedi}@usd.edu

More information

BUILDING A SPAM FILTER USING NAÏVE BAYES. CIS 391- Intro to AI 1

BUILDING A SPAM FILTER USING NAÏVE BAYES. CIS 391- Intro to AI 1 BUILDING A SPAM FILTER USING NAÏVE BAYES 1 Spam or not Spam: that is the question. From: "" Subjet: real estate is the only way... gem oalvgkay Anyone an buy real estate with no

More information

Recommending Questions Using the MDL-based Tree Cut Model

Recommending Questions Using the MDL-based Tree Cut Model WWW 2008 / Refereed Trak: Data Mining - Learning April 2-25, 2008 Beijing, China Reommending Questions Using the MDL-based Tree Cut Model Yunbo Cao,2, Huizhong Duan, Chin-Yew Lin 2, Yong Yu, and Hsiao-Wuen

More information

Behavior Analysis-Based Learning Framework for Host Level Intrusion Detection

Behavior Analysis-Based Learning Framework for Host Level Intrusion Detection Behavior Analysis-Based Learning Framework for Host Level Intrusion Detetion Haiyan Qiao, Jianfeng Peng, Chuan Feng, Jerzy W. Rozenblit Eletrial and Computer Engineering Department University of Arizona

More information

Granular Problem Solving and Software Engineering

Granular Problem Solving and Software Engineering Granular Problem Solving and Software Engineering Haibin Zhu, Senior Member, IEEE Department of Computer Siene and Mathematis, Nipissing University, 100 College Drive, North Bay, Ontario, P1B 8L7, Canada

More information

A Comparison of Service Quality between Private and Public Hospitals in Thailand

A Comparison of Service Quality between Private and Public Hospitals in Thailand International Journal of Business and Soial Siene Vol. 4 No. 11; September 2013 A Comparison of Servie Quality between Private and Hospitals in Thailand Khanhitpol Yousapronpaiboon, D.B.A. Assistant Professor

More information

Interpretable Fuzzy Modeling using Multi-Objective Immune- Inspired Optimization Algorithms

Interpretable Fuzzy Modeling using Multi-Objective Immune- Inspired Optimization Algorithms Interpretable Fuzzy Modeling using Multi-Objetive Immune- Inspired Optimization Algorithms Jun Chen, Mahdi Mahfouf Abstrat In this paper, an immune inspired multi-objetive fuzzy modeling (IMOFM) mehanism

More information

Static Fairness Criteria in Telecommunications

Static Fairness Criteria in Telecommunications Teknillinen Korkeakoulu ERIKOISTYÖ Teknillisen fysiikan koulutusohjelma 92002 Mat-208 Sovelletun matematiikan erikoistyöt Stati Fairness Criteria in Teleommuniations Vesa Timonen, e-mail: vesatimonen@hutfi

More information

A Keyword Filters Method for Spam via Maximum Independent Sets

A Keyword Filters Method for Spam via Maximum Independent Sets Vol. 7, No. 3, May, 213 A Keyword Filters Method for Spam via Maximum Independent Sets HaiLong Wang 1, FanJun Meng 1, HaiPeng Jia 2, JinHong Cheng 3 and Jiong Xie 3 1 Inner Mongolia Normal University 2

More information

) ( )( ) ( ) ( )( ) ( ) ( ) (1)

) ( )( ) ( ) ( )( ) ( ) ( ) (1) OPEN CHANNEL FLOW Open hannel flow is haraterized by a surfae in ontat with a gas phase, allowing the fluid to take on shapes and undergo behavior that is impossible in a pipe or other filled onduit. Examples

More information

FIRE DETECTION USING AUTONOMOUS AERIAL VEHICLES WITH INFRARED AND VISUAL CAMERAS. J. Ramiro Martínez-de Dios, Luis Merino and Aníbal Ollero

FIRE DETECTION USING AUTONOMOUS AERIAL VEHICLES WITH INFRARED AND VISUAL CAMERAS. J. Ramiro Martínez-de Dios, Luis Merino and Aníbal Ollero FE DETECTION USING AUTONOMOUS AERIAL VEHICLES WITH INFRARED AND VISUAL CAMERAS. J. Ramiro Martínez-de Dios, Luis Merino and Aníbal Ollero Robotis, Computer Vision and Intelligent Control Group. University

More information

Neural network-based Load Balancing and Reactive Power Control by Static VAR Compensator

Neural network-based Load Balancing and Reactive Power Control by Static VAR Compensator nternational Journal of Computer and Eletrial Engineering, Vol. 1, No. 1, April 2009 Neural network-based Load Balaning and Reative Power Control by Stati VAR Compensator smail K. Said and Marouf Pirouti

More information

THE PERFORMANCE OF TRANSIT TIME FLOWMETERS IN HEATED GAS MIXTURES

THE PERFORMANCE OF TRANSIT TIME FLOWMETERS IN HEATED GAS MIXTURES Proeedings of FEDSM 98 998 ASME Fluids Engineering Division Summer Meeting June 2-25, 998 Washington DC FEDSM98-529 THE PERFORMANCE OF TRANSIT TIME FLOWMETERS IN HEATED GAS MIXTURES John D. Wright Proess

More information

Intelligent Measurement Processes in 3D Optical Metrology: Producing More Accurate Point Clouds

Intelligent Measurement Processes in 3D Optical Metrology: Producing More Accurate Point Clouds Intelligent Measurement Proesses in 3D Optial Metrology: Produing More Aurate Point Clouds Charles Mony, Ph.D. 1 President Creaform in. mony@reaform3d.om Daniel Brown, Eng. 1 Produt Manager Creaform in.

More information

Recovering Articulated Motion with a Hierarchical Factorization Method

Recovering Articulated Motion with a Hierarchical Factorization Method Reovering Artiulated Motion with a Hierarhial Fatorization Method Hanning Zhou and Thomas S Huang University of Illinois at Urbana-Champaign, 405 North Mathews Avenue, Urbana, IL 680, USA {hzhou, huang}@ifpuiuedu

More information

RATING SCALES FOR NEUROLOGISTS

RATING SCALES FOR NEUROLOGISTS RATING SCALES FOR NEUROLOGISTS J Hobart iv22 WHY Correspondene to: Dr Jeremy Hobart, Department of Clinial Neurosienes, Peninsula Medial Shool, Derriford Hospital, Plymouth PL6 8DH, UK; Jeremy.Hobart@

More information

Pattern Recognition Techniques in Microarray Data Analysis

Pattern Recognition Techniques in Microarray Data Analysis Pattern Reognition Tehniques in Miroarray Data Analysis Miao Li, Biao Wang, Zohreh Momeni, and Faramarz Valafar Department of Computer Siene San Diego State University San Diego, California, USA faramarz@sienes.sdsu.edu

More information

Scalable Hierarchical Multitask Learning Algorithms for Conversion Optimization in Display Advertising

Scalable Hierarchical Multitask Learning Algorithms for Conversion Optimization in Display Advertising Salable Hierarhial Multitask Learning Algorithms for Conversion Optimization in Display Advertising Amr Ahmed Google amra@google.om Abhimanyu Das Mirosoft Researh abhidas@mirosoft.om Alexander J. Smola

More information

OpenScape 4000 CSTA V7 Connectivity Adapter - CSTA III, Part 2, Version 4.1. Developer s Guide A31003-G9310-I200-1-76D1

OpenScape 4000 CSTA V7 Connectivity Adapter - CSTA III, Part 2, Version 4.1. Developer s Guide A31003-G9310-I200-1-76D1 OpenSape 4000 CSTA V7 Connetivity Adapter - CSTA III, Part 2, Version 4.1 Developer s Guide A31003-G9310-I200-1-76 Our Quality and Environmental Management Systems are implemented aording to the requirements

More information

An Enhanced Critical Path Method for Multiple Resource Constraints

An Enhanced Critical Path Method for Multiple Resource Constraints An Enhaned Critial Path Method for Multiple Resoure Constraints Chang-Pin Lin, Hung-Lin Tai, and Shih-Yan Hu Abstrat Traditional Critial Path Method onsiders only logial dependenies between related ativities

More information

Capacity at Unsignalized Two-Stage Priority Intersections

Capacity at Unsignalized Two-Stage Priority Intersections Capaity at Unsignalized Two-Stage Priority Intersetions by Werner Brilon and Ning Wu Abstrat The subjet of this paper is the apaity of minor-street traffi movements aross major divided four-lane roadways

More information

A Design Environment for Migrating Relational to Object Oriented Database Systems

A Design Environment for Migrating Relational to Object Oriented Database Systems To appear in: 1996 International Conferene on Software Maintenane (ICSM 96); IEEE Computer Soiety, 1996 A Design Environment for Migrating Relational to Objet Oriented Database Systems Jens Jahnke, Wilhelm

More information

Improved SOM-Based High-Dimensional Data Visualization Algorithm

Improved SOM-Based High-Dimensional Data Visualization Algorithm Computer and Information Siene; Vol. 5, No. 4; 2012 ISSN 1913-8989 E-ISSN 1913-8997 Published by Canadian Center of Siene and Eduation Improved SOM-Based High-Dimensional Data Visualization Algorithm Wang

More information

An Efficient Network Traffic Classification Based on Unknown and Anomaly Flow Detection Mechanism

An Efficient Network Traffic Classification Based on Unknown and Anomaly Flow Detection Mechanism An Effiient Network Traffi Classifiation Based on Unknown and Anomaly Flow Detetion Mehanism G.Suganya.M.s.,B.Ed 1 1 Mphil.Sholar, Department of Computer Siene, KG College of Arts and Siene,Coimbatore.

More information

How To Fator

How To Fator CHAPTER hapter 4 > Make the Connetion 4 INTRODUCTION Developing seret odes is big business beause of the widespread use of omputers and the Internet. Corporations all over the world sell enryption systems

More information

WORKFLOW CONTROL-FLOW PATTERNS A Revised View

WORKFLOW CONTROL-FLOW PATTERNS A Revised View WORKFLOW CONTROL-FLOW PATTERNS A Revised View Nik Russell 1, Arthur H.M. ter Hofstede 1, 1 BPM Group, Queensland University of Tehnology GPO Box 2434, Brisbane QLD 4001, Australia {n.russell,a.terhofstede}@qut.edu.au

More information

i_~f e 1 then e 2 else e 3

i_~f e 1 then e 2 else e 3 A PROCEDURE MECHANISM FOR BACKTRACK PROGRAMMING* David R. HANSON + Department o Computer Siene, The University of Arizona Tuson, Arizona 85721 One of the diffiulties in using nondeterministi algorithms

More information

Procurement auctions are sometimes plagued with a chosen supplier s failing to accomplish a project successfully.

Procurement auctions are sometimes plagued with a chosen supplier s failing to accomplish a project successfully. Deision Analysis Vol. 7, No. 1, Marh 2010, pp. 23 39 issn 1545-8490 eissn 1545-8504 10 0701 0023 informs doi 10.1287/dea.1090.0155 2010 INFORMS Managing Projet Failure Risk Through Contingent Contrats

More information

Robust Classification and Tracking of Vehicles in Traffic Video Streams

Robust Classification and Tracking of Vehicles in Traffic Video Streams Proeedings of the IEEE ITSC 2006 2006 IEEE Intelligent Transportation Systems Conferene Toronto, Canada, September 17-20, 2006 TC1.4 Robust Classifiation and Traking of Vehiles in Traffi Video Streams

More information

Chapter 5 Single Phase Systems

Chapter 5 Single Phase Systems Chapter 5 Single Phase Systems Chemial engineering alulations rely heavily on the availability of physial properties of materials. There are three ommon methods used to find these properties. These inlude

More information

From a strategic view to an engineering view in a digital enterprise

From a strategic view to an engineering view in a digital enterprise Digital Enterprise Design & Management 2013 February 11-12, 2013 Paris From a strategi view to an engineering view in a digital enterprise The ase of a multi-ountry Telo Hervé Paault Orange Abstrat In

More information

GABOR AND WEBER LOCAL DESCRIPTORS PERFORMANCE IN MULTISPECTRAL EARTH OBSERVATION IMAGE DATA ANALYSIS

GABOR AND WEBER LOCAL DESCRIPTORS PERFORMANCE IN MULTISPECTRAL EARTH OBSERVATION IMAGE DATA ANALYSIS HENRI COANDA AIR FORCE ACADEMY ROMANIA INTERNATIONAL CONFERENCE of SCIENTIFIC PAPER AFASES 015 Brasov, 8-30 May 015 GENERAL M.R. STEFANIK ARMED FORCES ACADEMY SLOVAK REPUBLIC GABOR AND WEBER LOCAL DESCRIPTORS

More information

Classical Electromagnetic Doppler Effect Redefined. Copyright 2014 Joseph A. Rybczyk

Classical Electromagnetic Doppler Effect Redefined. Copyright 2014 Joseph A. Rybczyk Classial Eletromagneti Doppler Effet Redefined Copyright 04 Joseph A. Rybzyk Abstrat The lassial Doppler Effet formula for eletromagneti waves is redefined to agree with the fundamental sientifi priniples

More information

An integrated optimization model of a Closed- Loop Supply Chain under uncertainty

An integrated optimization model of a Closed- Loop Supply Chain under uncertainty ISSN 1816-6075 (Print), 1818-0523 (Online) Journal of System and Management Sienes Vol. 2 (2012) No. 3, pp. 9-17 An integrated optimization model of a Closed- Loop Supply Chain under unertainty Xiaoxia

More information

protection p1ann1ng report

protection p1ann1ng report f1re~~ protetion p1ann1ng report BUILDING CONSTRUCTION INFORMATION FROM THE CONCRETE AND MASONRY INDUSTRIES Signifiane of Fire Ratings for Building Constrution NO. 3 OF A SERIES The use of fire-resistive

More information

Agile ALM White Paper: Redefining ALM with Five Key Practices

Agile ALM White Paper: Redefining ALM with Five Key Practices Agile ALM White Paper: Redefining ALM with Five Key Praties by Ethan Teng, Cyndi Mithell and Chad Wathington 2011 ThoughtWorks ln. All rights reserved www.studios.thoughtworks.om Introdution The pervasiveness

More information

Chapter 6 A N ovel Solution Of Linear Congruenes Proeedings NCUR IX. (1995), Vol. II, pp. 708{712 Jerey F. Gold Department of Mathematis, Department of Physis University of Utah Salt Lake City, Utah 84112

More information

FOOD FOR THOUGHT Topical Insights from our Subject Matter Experts

FOOD FOR THOUGHT Topical Insights from our Subject Matter Experts FOOD FOR THOUGHT Topial Insights from our Sujet Matter Experts DEGREE OF DIFFERENCE TESTING: AN ALTERNATIVE TO TRADITIONAL APPROACHES The NFL White Paper Series Volume 14, June 2014 Overview Differene

More information

Availability, Reliability, Maintainability, and Capability

Availability, Reliability, Maintainability, and Capability Availability, Reliability, Maintainability, and Capability H. Paul Barringer, P.E. Barringer & Assoiates, In. Humble, TX Triplex Chapter Of The Vibrations Institute Hilton Hotel Beaumont, Texas February

More information

Customer Efficiency, Channel Usage and Firm Performance in Retail Banking

Customer Efficiency, Channel Usage and Firm Performance in Retail Banking Customer Effiieny, Channel Usage and Firm Performane in Retail Banking Mei Xue Operations and Strategi Management Department The Wallae E. Carroll Shool of Management Boston College 350 Fulton Hall, 140

More information

In order to be able to design beams, we need both moments and shears. 1. Moment a) From direct design method or equivalent frame method

In order to be able to design beams, we need both moments and shears. 1. Moment a) From direct design method or equivalent frame method BEAM DESIGN In order to be able to design beams, we need both moments and shears. 1. Moment a) From diret design method or equivalent frame method b) From loads applied diretly to beams inluding beam weight

More information

arxiv:astro-ph/0304006v2 10 Jun 2003 Theory Group, MS 50A-5101 Lawrence Berkeley National Laboratory One Cyclotron Road Berkeley, CA 94720 USA

arxiv:astro-ph/0304006v2 10 Jun 2003 Theory Group, MS 50A-5101 Lawrence Berkeley National Laboratory One Cyclotron Road Berkeley, CA 94720 USA LBNL-52402 Marh 2003 On the Speed of Gravity and the v/ Corretions to the Shapiro Time Delay Stuart Samuel 1 arxiv:astro-ph/0304006v2 10 Jun 2003 Theory Group, MS 50A-5101 Lawrene Berkeley National Laboratory

More information

Outline. Planning. Search vs. Planning. Search vs. Planning Cont d. Search vs. planning. STRIPS operators Partial-order planning.

Outline. Planning. Search vs. Planning. Search vs. Planning Cont d. Search vs. planning. STRIPS operators Partial-order planning. Outline Searh vs. planning Planning STRIPS operators Partial-order planning Chapter 11 Artifiial Intelligene, lp4 2005/06, Reiner Hähnle, partly based on AIMA Slides Stuart Russell and Peter Norvig, 1998

More information

Software Ecosystems: From Software Product Management to Software Platform Management

Software Ecosystems: From Software Product Management to Software Platform Management Software Eosystems: From Software Produt Management to Software Platform Management Slinger Jansen, Stef Peeters, and Sjaak Brinkkemper Department of Information and Computing Sienes Utreht University,

More information

3 Game Theory: Basic Concepts

3 Game Theory: Basic Concepts 3 Game Theory: Basi Conepts Eah disipline of the soial sienes rules omfortably ithin its on hosen domain: : : so long as it stays largely oblivious of the others. Edard O. Wilson (1998):191 3.1 and and

More information

Chapter 1: Introduction

Chapter 1: Introduction Chapter 1: Introdution 1.1 Pratial olumn base details in steel strutures 1.1.1 Pratial olumn base details Every struture must transfer vertial and lateral loads to the supports. In some ases, beams or

More information

A Three-Hybrid Treatment Method of the Compressor's Characteristic Line in Performance Prediction of Power Systems

A Three-Hybrid Treatment Method of the Compressor's Characteristic Line in Performance Prediction of Power Systems A Three-Hybrid Treatment Method of the Compressor's Charateristi Line in Performane Predition of Power Systems A Three-Hybrid Treatment Method of the Compressor's Charateristi Line in Performane Predition

More information

The Optimal Deterrence of Tax Evasion: The Trade-off Between Information Reporting and Audits

The Optimal Deterrence of Tax Evasion: The Trade-off Between Information Reporting and Audits The Optimal Deterrene of Tax Evasion: The Trade-off Between Information Reporting and Audits Yulia Paramonova Department of Eonomis, University of Mihigan Otober 30, 2014 Abstrat Despite the widespread

More information

Impact Simulation of Extreme Wind Generated Missiles on Radioactive Waste Storage Facilities

Impact Simulation of Extreme Wind Generated Missiles on Radioactive Waste Storage Facilities Impat Simulation of Extreme Wind Generated issiles on Radioative Waste Storage Failities G. Barbella Sogin S.p.A. Via Torino 6 00184 Rome (Italy), barbella@sogin.it Abstrat: The strutural design of temporary

More information

Electrician'sMathand BasicElectricalFormulas

Electrician'sMathand BasicElectricalFormulas Eletriian'sMathand BasiEletrialFormulas MikeHoltEnterprises,In. 1.888.NEC.CODE www.mikeholt.om Introdution Introdution This PDF is a free resoure from Mike Holt Enterprises, In. It s Unit 1 from the Eletrial

More information

Learning Curves and Stochastic Models for Pricing and Provisioning Cloud Computing Services

Learning Curves and Stochastic Models for Pricing and Provisioning Cloud Computing Services T Learning Curves and Stohasti Models for Priing and Provisioning Cloud Computing Servies Amit Gera, Cathy H. Xia Dept. of Integrated Systems Engineering Ohio State University, Columbus, OH 4310 {gera.,

More information

A Survey of Usability Evaluation in Virtual Environments: Classi cation and Comparison of Methods

A Survey of Usability Evaluation in Virtual Environments: Classi cation and Comparison of Methods Doug A. Bowman bowman@vt.edu Department of Computer Siene Virginia Teh Joseph L. Gabbard Deborah Hix [ jgabbard, hix]@vt.edu Systems Researh Center Virginia Teh A Survey of Usability Evaluation in Virtual

More information

university of illinois library AT URBANA-CHAMPAIGN BOOKSTACKS

university of illinois library AT URBANA-CHAMPAIGN BOOKSTACKS university of illinois library AT URBANA-CHAMPAIGN BOOKSTACKS CENTRAL CIRCULATION BOOKSTACKS The person harging this material is responsible for its renewal or its return to the library from whih it was

More information

Bayes Bluff: Opponent Modelling in Poker

Bayes Bluff: Opponent Modelling in Poker Bayes Bluff: Opponent Modelling in Poker Finnegan Southey, Mihael Bowling, Brye Larson, Carmelo Piione, Neil Burh, Darse Billings, Chris Rayner Department of Computing Siene University of Alberta Edmonton,

More information

Computational Analysis of Two Arrangements of a Central Ground-Source Heat Pump System for Residential Buildings

Computational Analysis of Two Arrangements of a Central Ground-Source Heat Pump System for Residential Buildings Computational Analysis of Two Arrangements of a Central Ground-Soure Heat Pump System for Residential Buildings Abstrat Ehab Foda, Ala Hasan, Kai Sirén Helsinki University of Tehnology, HVAC Tehnology,

More information

Supply chain coordination; A Game Theory approach

Supply chain coordination; A Game Theory approach aepted for publiation in the journal "Engineering Appliations of Artifiial Intelligene" 2008 upply hain oordination; A Game Theory approah Jean-Claude Hennet x and Yasemin Arda xx x LI CNR-UMR 668 Université

More information

AUDITING COST OVERRUN CLAIMS *

AUDITING COST OVERRUN CLAIMS * AUDITING COST OVERRUN CLAIMS * David Pérez-Castrillo # University of Copenhagen & Universitat Autònoma de Barelona Niolas Riedinger ENSAE, Paris Abstrat: We onsider a ost-reimbursement or a ost-sharing

More information

TECHNOLOGY-ENHANCED LEARNING FOR MUSIC WITH I-MAESTRO FRAMEWORK AND TOOLS

TECHNOLOGY-ENHANCED LEARNING FOR MUSIC WITH I-MAESTRO FRAMEWORK AND TOOLS TECHNOLOGY-ENHANCED LEARNING FOR MUSIC WITH I-MAESTRO FRAMEWORK AND TOOLS ICSRiM - University of Leeds Shool of Computing & Shool of Musi Leeds LS2 9JT, UK +44-113-343-2583 kia@i-maestro.org www.i-maestro.org,

More information

State of Maryland Participation Agreement for Pre-Tax and Roth Retirement Savings Accounts

State of Maryland Participation Agreement for Pre-Tax and Roth Retirement Savings Accounts State of Maryland Partiipation Agreement for Pre-Tax and Roth Retirement Savings Aounts DC-4531 (08/2015) For help, please all 1-800-966-6355 www.marylandd.om 1 Things to Remember Complete all of the setions

More information

SLA-based Resource Allocation for Software as a Service Provider (SaaS) in Cloud Computing Environments

SLA-based Resource Allocation for Software as a Service Provider (SaaS) in Cloud Computing Environments 2 th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing SLA-based Resoure Alloation for Software as a Servie Provider (SaaS) in Cloud Computing Environments Linlin Wu, Saurabh Kumar

More information

Srinivas Bollapragada GE Global Research Center. Abstract

Srinivas Bollapragada GE Global Research Center. Abstract Sheduling Commerial Videotapes in Broadast Television Srinivas Bollapragada GE Global Researh Center Mihael Bussiek GAMS Development Corporation Suman Mallik University of Illinois at Urbana Champaign

More information

Impedance Method for Leak Detection in Zigzag Pipelines

Impedance Method for Leak Detection in Zigzag Pipelines 10.478/v10048-010-0036-0 MEASUREMENT SCIENCE REVIEW, Volume 10, No. 6, 010 Impedane Method for Leak Detetion in igzag Pipelines A. Lay-Ekuakille 1, P. Vergallo 1, A. Trotta 1 Dipartimento d Ingegneria

More information

MATE: MPLS Adaptive Traffic Engineering

MATE: MPLS Adaptive Traffic Engineering MATE: MPLS Adaptive Traffi Engineering Anwar Elwalid Cheng Jin Steven Low Indra Widjaja Bell Labs EECS Dept EE Dept Fujitsu Network Communiations Luent Tehnologies Univ. of Mihigan Calteh Pearl River,

More information

10.1 The Lorentz force law

10.1 The Lorentz force law Sott Hughes 10 Marh 2005 Massahusetts Institute of Tehnology Department of Physis 8.022 Spring 2004 Leture 10: Magneti fore; Magneti fields; Ampere s law 10.1 The Lorentz fore law Until now, we have been

More information

RISK-BASED IN SITU BIOREMEDIATION DESIGN JENNINGS BRYAN SMALLEY. A.B., Washington University, 1992 THESIS. Urbana, Illinois

RISK-BASED IN SITU BIOREMEDIATION DESIGN JENNINGS BRYAN SMALLEY. A.B., Washington University, 1992 THESIS. Urbana, Illinois RISK-BASED IN SITU BIOREMEDIATION DESIGN BY JENNINGS BRYAN SMALLEY A.B., Washington University, 1992 THESIS Submitted in partial fulfillment of the requirements for the degree of Master of Siene in Environmental

More information

Hierarchical Beta Processes and the Indian Buffet Process

Hierarchical Beta Processes and the Indian Buffet Process Hierarhial Beta Proesses and the Indian Buffet Proess Romain Thibaux Dept. of EECS University of California, Berkeley Berkeley, CA 9472 Mihael I. Jordan Dept. of EECS and Dept. of Statistis University

More information

INCOME TAX WITHHOLDING GUIDE FOR EMPLOYERS

INCOME TAX WITHHOLDING GUIDE FOR EMPLOYERS Virginia Department of Taxation INCOME TAX WITHHOLDING GUIDE FOR EMPLOYERS www.tax.virginia.gov 2614086 Rev. 07/14 * Table of Contents Introdution... 1 Important... 1 Where to Get Assistane... 1 Online

More information

Asymmetric Error Correction and Flash-Memory Rewriting using Polar Codes

Asymmetric Error Correction and Flash-Memory Rewriting using Polar Codes 1 Asymmetri Error Corretion and Flash-Memory Rewriting using Polar Codes Eyal En Gad, Yue Li, Joerg Kliewer, Mihael Langberg, Anxiao (Andrew) Jiang and Jehoshua Bruk Abstrat We propose effiient oding shemes

More information

Marker Tracking and HMD Calibration for a Video-based Augmented Reality Conferencing System

Marker Tracking and HMD Calibration for a Video-based Augmented Reality Conferencing System Marker Traking and HMD Calibration for a Video-based Augmented Reality Conferening System Hirokazu Kato 1 and Mark Billinghurst 2 1 Faulty of Information Sienes, Hiroshima City University 2 Human Interfae

More information

The Application of Mamdani Fuzzy Model for Auto Zoom Function of a Digital Camera

The Application of Mamdani Fuzzy Model for Auto Zoom Function of a Digital Camera (IJCSIS) International Journal of Computer Siene and Information Seurity, Vol. 6, No. 3, 2009 The Appliation of Mamdani Fuzzy Model for Auto Funtion of a Digital Camera * I. Elamvazuthi, P. Vasant Universiti

More information

A novel active mass damper for vibration control of bridges

A novel active mass damper for vibration control of bridges IABMAS 08, International Conferene on Bridge Maintenane, Safety and Management, 3-7 July 008, Seoul, Korea A novel ative mass damper for vibration ontrol of bridges U. Starossek & J. Sheller Strutural

More information

protection p1ann1ng report

protection p1ann1ng report ( f1re protetion p1ann1ng report I BUILDING CONSTRUCTION INFORMATION FROM THE CONCRETE AND MASONRY INDUSTRIES NO. 15 OF A SERIES A Comparison of Insurane and Constrution Costs for Low-Rise Multifamily

More information

IEEE TRANSACTIONS ON DEPENDABLE AND SECURE COMPUTING, VOL. 9, NO. 3, MAY/JUNE 2012 401

IEEE TRANSACTIONS ON DEPENDABLE AND SECURE COMPUTING, VOL. 9, NO. 3, MAY/JUNE 2012 401 IEEE TRASACTIOS O DEPEDABLE AD SECURE COMPUTIG, VOL. 9, O. 3, MAY/JUE 2012 401 Mitigating Distributed Denial of Servie Attaks in Multiparty Appliations in the Presene of Clok Drifts Zhang Fu, Marina Papatriantafilou,

More information

REDUCTION FACTOR OF FEEDING LINES THAT HAVE A CABLE AND AN OVERHEAD SECTION

REDUCTION FACTOR OF FEEDING LINES THAT HAVE A CABLE AND AN OVERHEAD SECTION C I E 17 th International Conferene on Eletriity istriution Barelona, 1-15 May 003 EUCTION FACTO OF FEEING LINES THAT HAVE A CABLE AN AN OVEHEA SECTION Ljuivoje opovi J.. Elektrodistriuija - Belgrade -

More information

CIS570 Lecture 4 Introduction to Data-flow Analysis 3

CIS570 Lecture 4 Introduction to Data-flow Analysis 3 Introdution to Data-flow Analysis Last Time Control flow analysis BT disussion Today Introdue iterative data-flow analysis Liveness analysis Introdue other useful onepts CIS570 Leture 4 Introdution to

More information

Optimal Sales Force Compensation

Optimal Sales Force Compensation Optimal Sales Fore Compensation Matthias Kräkel Anja Shöttner Abstrat We analyze a dynami moral-hazard model to derive optimal sales fore ompensation plans without imposing any ad ho restritions on the

More information

5.2 The Master Theorem

5.2 The Master Theorem 170 CHAPTER 5. RECURSION AND RECURRENCES 5.2 The Master Theorem Master Theorem In the last setion, we saw three different kinds of behavior for reurrenes of the form at (n/2) + n These behaviors depended

More information

Suggested Answers, Problem Set 5 Health Economics

Suggested Answers, Problem Set 5 Health Economics Suggested Answers, Problem Set 5 Health Eonomis Bill Evans Spring 2013 1. The graph is at the end of the handout. Fluoridated water strengthens teeth and redues inidene of avities. As a result, at all

More information

INCOME TAX WITHHOLDING GUIDE FOR EMPLOYERS

INCOME TAX WITHHOLDING GUIDE FOR EMPLOYERS Virginia Department of Taxation INCOME TAX WITHHOLDING GUIDE FOR EMPLOYERS www.tax.virginia.gov 2614086 Rev. 01/16 Table of Contents Introdution... 1 Important... 1 Where to Get Assistane... 1 Online File

More information

Henley Business School at Univ of Reading. Pre-Experience Postgraduate Programmes Chartered Institute of Personnel and Development (CIPD)

Henley Business School at Univ of Reading. Pre-Experience Postgraduate Programmes Chartered Institute of Personnel and Development (CIPD) MS in International Human Resoure Management For students entering in 2012/3 Awarding Institution: Teahing Institution: Relevant QAA subjet Benhmarking group(s): Faulty: Programme length: Date of speifiation:

More information

Agent-Based Grid Load Balancing Using Performance-Driven Task Scheduling

Agent-Based Grid Load Balancing Using Performance-Driven Task Scheduling Agent-Based Grid Load Balaning Using Performane-Driven Task Sheduling Junwei Cao *1, Daniel P. Spooner, Stephen A. Jarvis, Subhash Saini and Graham R. Nudd * C&C Researh Laboratories, NEC Europe Ltd.,

More information

Price-based versus quantity-based approaches for stimulating the development of renewable electricity: new insights in an old debate

Price-based versus quantity-based approaches for stimulating the development of renewable electricity: new insights in an old debate Prie-based versus -based approahes for stimulating the development of renewable eletriity: new insights in an old debate uthors: Dominique FINON, Philippe MENNTEU, Marie-Laure LMY, Institut d Eonomie et

More information

Performance Analysis of IEEE 802.11 in Multi-hop Wireless Networks

Performance Analysis of IEEE 802.11 in Multi-hop Wireless Networks Performane Analysis of IEEE 80.11 in Multi-hop Wireless Networks Lan Tien Nguyen 1, Razvan Beuran,1, Yoihi Shinoda 1, 1 Japan Advaned Institute of Siene and Tehnology, 1-1 Asahidai, Nomi, Ishikawa, 93-19

More information

VOLUME 13, ARTICLE 5, PAGES 117-142 PUBLISHED 05 OCTOBER 2005 DOI: 10.4054/DemRes.2005.13.

VOLUME 13, ARTICLE 5, PAGES 117-142 PUBLISHED 05 OCTOBER 2005  DOI: 10.4054/DemRes.2005.13. Demographi Researh a free, expedited, online journal of peer-reviewed researh and ommentary in the population sienes published by the Max Plank Institute for Demographi Researh Konrad-Zuse Str. 1, D-157

More information

Design Implications for Enterprise Storage Systems via Multi-Dimensional Trace Analysis

Design Implications for Enterprise Storage Systems via Multi-Dimensional Trace Analysis Design Impliations for Enterprise Storage Systems via Multi-Dimensional Trae Analysis Yanpei Chen, Kiran Srinivasan, Garth Goodson, Randy Katz University of California, Berkeley, NetApp In. {yhen2, randy}@ees.berkeley.edu,

More information