# Study Manual: Probabilistic Reasoning

Silja Renooij, August 2015


General information

This study manual is designed to help guide your self-study. As such, it does not include material that is not already present in the syllabus or discussed in class; rather, it should help you take a deeper dive into the subjects and enhance your understanding. The manual details the knowledge and abilities students enrolling in this course are expected to acquire. In addition, it lists for each chapter of the syllabus a number of questions that can guide your self-study. The syllabus itself contains exercises, with answers provided for some of them (indicated by a *). The exercises enable you to practice what you have learned and give an idea of what you should be able to do if you fully understand the material. Most of the exercises are former exam questions; examples of previous exams are available through the course website. Together, the questions and exercises should indicate how you are doing thus far.

Examination

The Probabilistic Reasoning course is a 7.5 ECTS course, which means that an average student is expected to need 210 hours of work to complete and pass it. The course is graded based on a number of practical assignments and a written exam. For the exam, you are expected to have knowledge of, insight into, and, most importantly, understanding of the subjects presented in the syllabus and (sometimes in more detail) in the course slides. This also means that you have practised applying the different procedures and algorithms and can do so correctly in little time.
In general, I expect at least the following:

- you know all preliminaries by heart, as well as any formula that can easily be reconstructed from understanding the idea behind it;
- you understand the origin of the components of all parameters/messages in Pearl's algorithm (Chapter 4), but you need not know the exact way they are computed (the lemmas);
- you understand the entropy formula (Chapter 5) and its effects, but you are not required to reproduce it;
- you know what the different symbols and notations mean, how the different concepts are defined, and you are able to apply the different methods and algorithms discussed (NB: make sure you practice with Pearl's algorithm; this will save you a lot of time during the exam!);
- you have enough understanding of all subjects treated to be able to identify advantages and drawbacks of different concepts and algorithms, and to use these insights in generating or understanding variations on what was treated during the course.

More specifically:

- Chapter 2 + course slides: you know all definitions, propositions and theorems by heart.
- Chapter 3 + course slides: you know all definitions, theorems, corollaries and lemmas by heart.
- Chapter 4 + course slides: you know by heart all definitions, all corollaries, proposition 4.1.3, lemmas 4.2.1 and 4.2.8, and all algorithms (test your inference answers with one of the Bayesian network software packages; see the links on the course website). You will be given lemmas 4.2.4, 4.2.5, 4.2.6 and 4.2.7 during the exam (recall that the formulas for trees are just a special case of those for singly connected graphs).
- Chapter 5 + course slides: you know by heart all definitions except 5.2.2 and 5.2.3, all lemmas, corollaries and algorithms, and all formulas for the (leaky) noisy-or gate. If necessary, you will be given definition 5.2.2 during the exam.
- Chapter 6 + course slides: you know all concepts and formulas in 6.1 and 6.2 by heart; you understand the global ideas in 6.3.

An important note on applying procedures and algorithms: there is a difference between solving a problem and applying an algorithm that solves the problem. If an exercise or exam question asks you to solve a problem, you can use whatever method you think is most suitable to do the job. If, however, you are required to solve a problem using a certain algorithm, then you are basically asked to illustrate how the algorithm can be used to solve the problem, and you should execute the steps prescribed by the algorithm yourself. For example, if you are asked to compute a certain probability from a set of given probabilities, you may use whatever rules from probability theory you can apply. If you are asked to compute a probability using Pearl's algorithm (see Chapter 4), then you should perform exactly those computations prescribed by the algorithm, even if the example is such that there are more efficient ways of solving the problem by hand!
In illustrating Pearl's algorithm it is sufficient, though, to do only those computations that are required to answer the question at hand, instead of computing each and every parameter for all variables. Basically, if you are asked to apply an algorithm, I am testing whether or not you understand how the algorithm works, not whether you are able to solve the problem for which the algorithm is designed.

Chapter 1

The first chapter gives some historical background on the use of probability theory and other uncertainty formalisms for decision support. It briefly motivates the emergence of Bayesian networks and the reason why Bayesian network applications have historically often concerned the medical domain.

Questions

- What is mutually exclusive? What is collectively exhaustive?
- Which assumption is made in a naive Bayes approach for reasons of time complexity, what is the reduction achieved, and why?
- Which assumption is made in a naive Bayes approach for reasons of space complexity, what is the reduction achieved, and why?
- Why is a naive (or idiot) Bayes approach naive? Is it still used?
- What is the relation between a naive Bayes approach and GLADYS?

Chapter 2

The second chapter refreshes the necessary concepts from graph theory and probability theory that play a central role in the Bayesian network framework. These concepts have already been discussed in other courses and can be found in any textbook on graph theory or probability theory. The chapter also introduces the notation that is used throughout the syllabus, which may differ from what you encountered previously. Two important things to note here are the following:

- In this course we often address paths in graphs; unless stated otherwise, these paths are assumed to be simple paths.
- A lot of formulas in the course material contain a summation of the form ∑_{c_V} f(c_V) for some expression f(c_V) depending on c_V. Note that this is a summation over the set of configurations c_V of a set of variables V, not over the elements of the set V itself. If V is empty, this does not mean the summation is undefined! Rather, the summation is over the single configuration c_V = c_∅ = T (true), and in that case it reduces to the single term f(T).
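The empty-set convention can be made concrete with a small sketch (the variable names and value sets below are made up for illustration): enumerating the configurations of a set V amounts to taking the Cartesian product of the value sets of the variables in V, and the product over zero sets contains exactly one, empty, configuration, so the sum ∑_{c_V} f(c_V) always has at least one term:

```python
from itertools import product

# Hypothetical binary variables and their value sets (illustration only).
values = {"A": (True, False), "B": (True, False)}

def configurations(V, values):
    """All configurations c_V of the variable set V, as dicts."""
    names = sorted(V)
    return [dict(zip(names, combo))
            for combo in product(*(values[n] for n in names))]

def sum_over_configurations(V, values, f):
    """Compute the sum over all configurations c_V of f(c_V)."""
    return sum(f(c) for c in configurations(V, values))

# Over V = {A, B} the sum has 2 * 2 = 4 terms ...
print(len(configurations({"A", "B"}, values)))
# ... but over V = {} it has exactly one term: f of the empty configuration.
print(sum_over_configurations(set(), values, lambda c: 1.0))
```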

Questions

- Is there a limit to the number of times a certain vertex may be included in a path?
- What is the difference between a path and a chain in a directed graph?
- What is the difference between a loop and a cycle in a directed graph?
- What is the difference between a tree, a singly connected digraph and a multiply connected digraph?
- When can we use the operators and? When can we use the operators and?
- What is the difference between the proposition True and a value true for a variable?
- What is the difference between a joint probability distribution, a conditional probability distribution, and a marginal probability distribution?
- What is the difference between a value assignment, a configuration and a configuration template?
- Why is it the case that, for two sets of variables X and Y, C_{X∪Y} = C_X × C_Y is the only correct way of expressing the configuration template over both sets in terms of the separate configuration templates?
- What is the relation between the marginalisation and the conditioning property?
- What is the difference between independence of propositions and independence of variables?
- Suppose a (not necessarily binary) statistical variable V can take on the values v_1, ..., v_n. Why does the definition of a probability distribution imply that ∑_{i=1}^n Pr(v_i) = 1?
- What is the difference between (in)dependence and conditional (in)dependence?

Refreshing exercises

The following exercises were designed for a bachelor course on statistics and can be used to refresh your memory and intuition.

1. Joost has been suffering from fits of sneezing. He decides to see his general practitioner (GP) and have himself tested for hay fever. The GP tells Joost that about 1 in 10 people in the Netherlands get hay fever.

a. The prior probability Pr(h) that Joost has hay fever is the probability of hay fever in case we know nothing about Joost. What is this probability?

The GP performs a skin test; this test turns up positive, i.e. the test indicates that Joost indeed has hay fever.

b. Can you now give an estimate of the probability that Joost does not suffer from hay fever?

Joost asks his GP how reliable the skin test is. The GP indicates that the test gives a correct prediction for 70% of all people with hay fever, and for 90% of all people without hay fever.

c. Which conditional probabilities are described here?

d. Without calculating, what do you think the probability is of Joost not having hay fever, despite the positive test result: between 0 and 0.2, between 0.2 and 0.4, between 0.4 and 0.6, between 0.6 and 0.8, or between 0.8 and 1?

e. Use the conditioning rule to compute the prior probability of a positive test result. NB: this rule is also referred to as the rule of total probability.

The reliability of a test is given in terms of the probability of obtaining a certain test result given that the patient tested has, or does not have, respectively, the disease tested for. In designing a treatment for a patient, however, it is important to know the opposite: what is the probability of the disease given the test result?

f. Which rule can be used for reversing conditional probabilities?

g. Compute the probability of Joost not having hay fever, despite the positive test result. Does the answer coincide with your estimate for part d?

Intermezzo: two equivalent definitions of independence of two propositions x and y exist:

(a) x and y are independent if Pr(x ∧ y) = Pr(x) · Pr(y);
(b) x and y are independent if Pr(x | y) = Pr(x).

In addition, two equivalent definitions exist of conditional independence of two propositions x and y given a third proposition z:

(a) x and y are independent given z if Pr(x ∧ y | z) = Pr(x | z) · Pr(y | z);
(b) x and y are independent given z if Pr(x | y ∧ z) = Pr(x | z).

2. Suppose that the GP from the previous exercise also performs a blood test on Joost.

a. Are the results from the skin test and the blood test dependent or independent?

b. Suppose you are clairvoyant (all-seeing) and know for sure that Joost has hay fever. Are the results from the skin test and the blood test dependent or independent, given this knowledge?

Suppose that you have available, for both tests, the probabilities of a positive result given the presence or absence, respectively, of hay fever.
You want to use Bayes' rule to compute the probability of Joost having hay fever given the results of both tests.

c. Write down Bayes' rule for this probability.

d. Which assumptions with respect to (conditional) (in)dependence do you have to make in order to compute this probability from the information about the reliability of the tests?

3. Consider a population of 1000 males and females. Consider the following propositions: m = {is male}, f = {is female}, a = {wears make-up} and b = {doesn't wear make-up}.

a. How many statistical variables do we consider here, and what are their values?

b. Use your intuition to determine whether the propositions m and b are dependent or independent.

Suppose that the following frequency table is given for the population under consideration:

        m    f
   a
   b

c. Use the definition of independence to determine whether the propositions m and b are independent.

d. Suppose that we only consider the subset of grown-ups from the above population. Use your intuition to determine whether, given the proposition {is grown-up}, the propositions m and b are dependent or independent.

e. Suppose that we only consider the subset of children from the above population. Use your intuition to determine whether, given the proposition {is child}, the propositions m and b are dependent or independent.

Suppose we divide the population into children k and grown-ups g:

        m         f
        g    k    g    k
   a
   b

f. Use the definition of independence to determine whether, given the proposition k, the propositions m and b are independent.

4. Consider a population of 1000 individuals. Consider the following propositions: h = {dog owner}, k = {no dog owner}, s = {wears glasses} and g = {does not wear glasses}.

a. How many statistical variables do we consider here, and what are their values?

b. Use your intuition to determine whether the propositions h and s are dependent or independent.

Suppose that the following frequency table is given for the population under consideration:

        s    g
   h
   k

c. Use the definition of independence to determine whether the propositions h and s are independent.

d. Suppose we consider only the women in the above population. Use your intuition to determine whether, given the proposition {is female}, the propositions h and s are dependent or independent.

e. Suppose we consider only the severely visually disabled from the population. Use your intuition to determine whether, given the proposition {is blind}, the propositions h and s are dependent or independent.

Suppose we divide the population into blind b and not blind z:

        s         g
        z    b    z    b
   h
   k

f. Use the definition of independence to determine whether, given the proposition b, the propositions s and h are independent.
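A quick way to check this kind of question numerically is to apply definition (a) of independence directly to a frequency table. The counts below are invented for illustration (they are not the entries from the exercise); with them, Pr(m ∧ b) differs from Pr(m) · Pr(b), so m and b come out dependent:

```python
# Invented contingency table for a population of 1000 (illustration only):
# rows a = wears make-up, b = doesn't; columns m = male, f = female.
counts = {("a", "m"): 50, ("a", "f"): 450,
          ("b", "m"): 450, ("b", "f"): 50}
N = sum(counts.values())

pr_m = (counts[("a", "m")] + counts[("b", "m")]) / N  # Pr(m)
pr_b = (counts[("b", "m")] + counts[("b", "f")]) / N  # Pr(b)
pr_m_and_b = counts[("b", "m")] / N                   # Pr(m and b)

# Definition (a): m and b are independent iff Pr(m and b) = Pr(m) * Pr(b).
independent = abs(pr_m_and_b - pr_m * pr_b) < 1e-12
print(independent)  # with these invented counts: dependent
```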
5. Consider the propositions a and b_1, ..., b_n, and the marginalisation rule:

Pr(a) = ∑_{i=1}^n Pr(a ∧ b_i)

Invent a proposition a and at least three propositions b_i to motivate the existence of the marginalisation rule.
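The marginalisation (conditioning) rule and Bayes' rule can be illustrated with the hay-fever numbers from exercise 1: a prior of 1 in 10, and a test that is correct for 70% of the people with hay fever and 90% of the people without. A sketch of the computations asked for in parts e and g of that exercise:

```python
pr_h = 0.1                # prior Pr(h): 1 in 10 people gets hay fever
pr_pos_given_h = 0.7      # test correct for 70% of people with hay fever
pr_pos_given_not_h = 0.1  # test wrong for 10% of people without hay fever

# Conditioning / total probability:
# Pr(pos) = Pr(pos | h) Pr(h) + Pr(pos | not h) Pr(not h)
pr_pos = pr_pos_given_h * pr_h + pr_pos_given_not_h * (1 - pr_h)

# Bayes' rule to "reverse" the conditional: Pr(not h | pos)
pr_not_h_given_pos = pr_pos_given_not_h * (1 - pr_h) / pr_pos

print(pr_pos)              # about 0.16
print(pr_not_h_given_pos)  # about 0.56: despite the positive test, the
                           # chance that Joost has no hay fever is over half
```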

6. Suppose a course is graded based on n assignments; each assignment T_i concerns the material discussed in week i, i = 1, ..., n. Let t_i be the proposition {student passes assignment T_i}.

a. Suppose the course requirements are such that you can only pass the assignment in week i if you have passed assignments T_1 through T_{i-1}. Give a formula for the probability Pr(t_1 ∧ t_2 ∧ t_3 ∧ t_4 ∧ t_5 ∧ t_6) that someone passes the first 6 assignments.

The formula that you constructed in part a (if correct) is known as the chain rule.

b. Now suppose that the course consists of various subjects, each covering two weeks. The material for the assignments in the odd weeks is new; the material for the assignments in the even weeks is related to the material of the previous week. Rewrite the formula from part a to reflect these properties.

Chapter 3

This chapter formalises two types of independence relation. The first, I_Pr, is the type of relation that can be captured by a probability distribution. These independence relations form a proper subset of a more general type of independence relation I that abstracts away from probability distributions. The chapter also discusses different representations of independence relations, most notably (un)directed graphs. An important notion introduced in this chapter is the concept of d-separation. Independence relations and their representation are still an area of ongoing research; see for example the work by Milan Studený from Prague. This chapter, however, describes the full body of research as far as required for our discussion of Bayesian networks in the subsequent chapters.

Questions

- Intuitively, what is the difference between the independence relations I_Pr and I?
- What is the difference between the properties satisfied by I_Pr compared to those satisfied by I?
- What are possible encodings of an independence relation?
- Why does an I-map represent independencies and a D-map dependencies?
- Why is it unfortunate that not every independence relation has an (un)directed P-map?
- Why is an independence relation for which a P-map exists termed graph-isomorphic?
- Does a graph-isomorphic independence relation have a unique P-map?
- Does the (d-)separation criterion require a graph to be acyclic? Why/why not?
- Why does the following statement hold for both directed and undirected graphs: if a set Z blocks a path, then Z ∪ Y blocks the path for any Y?
- Why does the following statement for directed graphs only hold if Z ≠ ∅ and at least one Z_i ∈ Z is on the chain under consideration: if a set Z blocks a chain, then Z ∪ Y blocks the chain for any Y?

- Why does the following statement hold for undirected graphs: if a set Z separates two other sets, then Z ∪ Y separates the sets for any Y?
- Why does the following statement not hold for directed graphs: if a set Z d-separates two other sets, then Z ∪ Y d-separates the sets for any Y?

Chapter 4

This chapter defines the concept of a Bayesian network and reconstructs the algorithm for exact probabilistic inference as designed by Judea Pearl; for ease of exposition, only binary variables are considered. Other, and often more efficient, algorithms for exact inference exist, such as join-tree propagation (S.L. Lauritzen, D.J. Spiegelhalter, F.V. Jensen, P.P. Shenoy, G. Shafer) and variable elimination (R. Dechter, G.F. Cooper) methods. The reason for discussing Pearl's algorithm is that it explicitly exploits the graphical structure of the network without transforming it, and therefore seems the easiest one to explain and comprehend. Current research focuses on finding still more efficient algorithms, both for exact and approximate inference, and also for dealing with networks that include continuous variables. Also, now and again, researchers look into better algorithms for loop cutset conditioning (A. Becker, D. Geiger), the extension to Pearl's algorithm that is required to do inference in multiply connected graphs. To gain more insight into inference and the effects of loop cutset conditioning, try one of the free software packages available through the course website.

Questions

The network as a graphical model:

- Why is a Bayesian network often called a belief network, or causal network?
- What is the independence relation I modelled by a Bayesian network?
- Why have people chosen the graph of a Bayesian network to represent a (minimal) I-map of the independence relation, as opposed to a D-map?
- Why have people chosen a directed graph to capture the independence relation for a Bayesian network?

The probability distribution:

- Why is the digraph of a Bayesian network assumed to be acyclic?
- What is the difference between an assessment function for a vertex and a probability distribution for that vertex?
- Which piece(s) of information is/are necessary to determine the number of assessment functions required for specifying a Bayesian network?

- Which pieces of information are necessary to determine the number of (conditional) probabilities required for specifying all assessment functions?
- How does the joint probability distribution Pr defined by a Bayesian network exploit the fact that the digraph of the network is an I-map of the independence relation of Pr?
- How do the space requirements for Bayesian networks compare to those of the naive Bayes approach?

Probabilistic inference in general:

- How do the time requirements for Bayesian networks compare to those of the naive Bayes approach?
- What is the difference between a prior and a posterior probability?
- Is there actually a difference between Pr(v_i | v_j) and Pr_{v_j}(v_i)? What is the difference between Pr(v_i) and Pr_{v_i}(v_i)?
- How can a set of instantiated or observed vertices be related to the blocking set introduced in Chapter 3?
- What is the relation between a configuration, an instantiation, and a partial configuration?
- What is the computational time complexity of probabilistic inference?

Pearl's algorithm, definitions:

- Why does the definition of V_i differ across different types of graph?
- How do the messages of Pearl's algorithm exploit the fact that the digraph is an I-map of the independence relation of Pr?
- Why do the formulas for computing messages differ across different types of graph?
- Which lemma represents the first step of Pearl's algorithm?
- What is the difference between a causal and a diagnostic parameter?
- What are the differences and similarities between causal/diagnostic parameters and probabilities?
- Why do the compound causal parameters for a vertex sum to 1? Why do the causal parameters sent by a vertex sum to 1?
- Why do the compound diagnostic parameters for a vertex in general not sum to 1? Why do the diagnostic parameters sent by a vertex in general not sum to 1?
- Why is it the case that if the compound diagnostic parameter of a vertex equals 1, all diagnostic parameters sent by that vertex are equivalent (in a directed tree: equal to 1)?
- When is a vertex ready to send a causal parameter? When is a vertex ready to send a diagnostic parameter?

- When is a vertex ready to calculate its (posterior) probability?
- To calculate prior probabilities, do you require causal parameters, diagnostic parameters, or both? To calculate posterior probabilities, do you require causal parameters, diagnostic parameters, or both?
- Why do the formulas in the lemmas, and similar formulas for singly connected digraphs, assume V_i to be an uninstantiated vertex?
- Why do observations have to be entered by means of dummy nodes?
- Why does the dummy-node approach enable us to use lemmas 4.2.5, etc. for instantiated vertices as well?
- Why can Pearl's algorithm lead to incorrect results when applied to multiply connected graphs?

Pearl's algorithm, normalisation:

- What is the use of a normalisation constant α? Why does the normalisation constant α differ across messages?
- Is normalisation always used for messages that sum to 1, or could they sum to a different constant? (Compare the α in the formula for separate diagnostic parameters in singly connected graphs to the other α's.)
- Why can normalisation be postponed to the final step of Pearl's algorithm? When does it come in handy not to postpone normalisation?
- Why is normalisation not actually required when computing prior probabilities? (This can be used as a correctness check.)

Loop cutsets:

- Why does Pearl's algorithm as presented not work in multiply connected graphs?
- What is the idea behind the method of loop cutset conditioning?
- Which of the following statements is correct (the difference is rather subtle): "Loop cutset conditioning is a method for probabilistic inference in multiply connected networks that uses Pearl's algorithm for computing some of the probabilities it requires", or "Loop cutset conditioning is a method that is used within Pearl's algorithm in order to enable its correct application to multiply connected networks"?
- Consider a Bayesian network with loops in its digraph G; can the loop cutset for G be the empty set?
- Given the definition of a loop cutset, can a vertex with two incoming arcs in G be included in a loop cutset for G?

- For a graph G, will the Suermondt & Cooper algorithm return a loop cutset that includes a vertex with two or more incoming arcs in G?
- Why does every loop in a Bayesian network's digraph have at least one vertex with at most one incoming arc?
- Is a loop cutset a property of a graph or a property of a Bayesian network? Does a loop cutset change after entering evidence into a Bayesian network?
- Does the probability distribution over configurations of the loop cutset change after entering evidence into a Bayesian network?
- Why can the prior probability distribution over configurations of the loop cutset not be computed using Pearl's algorithm, but has to be computed by marginalisation from the joint distribution instead?
- In loop cutset conditioning, in general, which probabilities can and which probabilities cannot be computed using Pearl's algorithm (as a black box)?
- What is the use of a global supervisor?
- What is the first step in performing loop cutset conditioning (not: in finding a loop cutset!)?
- Why is loop cutset conditioning computationally expensive? Is finding a loop cutset computationally expensive?
- Is a minimal loop cutset always optimal? Is an optimal loop cutset always minimal?
- Why should vertices with a degree of one or less not be included in a loop cutset?
- Why are vertices with an indegree of at most one selected as candidates for the loop cutset by the heuristic Suermondt & Cooper algorithm? What are possible effects of this choice?
- Why does the heuristic Suermondt & Cooper algorithm for finding a loop cutset choose to add a candidate with the highest degree to the loop cutset? What if there are several candidates with the highest degree?
- What are the properties of the graph and the loop cutset before and after performing the algorithm for finding a loop cutset?
Chapter 5

This chapter and the next one differ from the previous chapters in the sense that the previous chapters describe more or less finished research, whereas this and the following chapter discuss topics that are the subject of ongoing research.¹ As a result, these chapters tell a number of short stories with lots of open endings. We provide the basics of various algorithms and methodologies that still underlie, and are necessary for understanding, state-of-the-art results. The assumption that we consider binary variables only is now lifted, although it is still employed in some specific situations.

¹ As a matter of fact, the Decision Support Systems group of our Department does research into quite a number of the subjects discussed in this chapter (see the group website).
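The entropy formula referred to in the exam guidance above, and used in the MDL measure discussed in this chapter, can be sketched as follows; the base-2 logarithm is an assumption here, and the p > 0 guard implements the standard convention 0 log 0 = 0:

```python
import math

def entropy(distribution):
    """H(Pr) = -sum over v of Pr(v) * log2(Pr(v)); the p > 0 guard
    implements the standard convention 0 log 0 = 0."""
    return sum(-p * math.log2(p) for p in distribution if p > 0)

print(entropy([0.5, 0.5]))  # 1.0: maximal uncertainty for a binary variable
print(entropy([1.0, 0.0]))  # 0.0: a degenerate distribution, no uncertainty
```

Note how entropy tracks uncertainty: it is maximal for the uniform distribution and zero when one value is certain.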

This chapter discusses the construction of a Bayesian network: how do we get the graphical structure (by hand, or automatically?) and where do the probabilities come from (experts, data, ...)? Try to construct a (small) Bayesian network yourself using one of the numerous software packages or online tools developed for constructing and reasoning with Bayesian networks (see the course website).

Remark: in this chapter you will find formulas including a term of the form a log b, where a and/or b can be zero. A standard convention is that 0 log 0 = 0.

Questions

Construction in general:

- What are typical trade-offs made when constructing a Bayesian network for a real application domain?
- What is the difference between domain variables and Bayesian network variables?
- What is the difference between single-valued variables, multi-valued variables and statistical variables?
- Given the definition of a Bayesian network in Chapter 4, why is it, in general, not possible to allow continuous statistical variables?
- What are the advantages and disadvantages of the different ways of modelling a multi-valued domain variable in a Bayesian network?

Construction by hand:

- What is the problem with using the notion of causality for constructing the digraph?
- Why is it important to distinguish between direct and indirect relations?
- Why do digraphs which are constructed by hand often contain cycles at some stage of the construction?
- Why does the assumption of a disjunctive interaction simplify probability assessment? Is the assumption realistic?
- Why is a noisy-or gate called noisy? Why is its leaky version called leaky?
- This chapter formulates the noisy-or gate for binary variables. Is it easy to generalise to non-binary variables?
- Why does the inhibitor probability in the leaky noisy-or gate differ semantically from the inhibitor probability in the noisy-or gate?
- Why is it difficult to assess probabilities from the literature? Why is it difficult to use domain experts for assessing probabilities?
- What is a typical trade-off when eliciting probabilities from experts?
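For the (leaky) noisy-or questions above, the defining formula of disjunctive interaction can be sketched as follows. This is the standard formulation, not a reproduction of the syllabus text, and the parameter names are mine: the effect is absent only if every present cause fails to establish it and, in the leaky version, the leak does not fire either.

```python
def noisy_or(p_present, p_leak=0.0):
    """Pr(effect | the given causes are present), assuming disjunctive
    interaction: each present cause i establishes the effect with
    probability p_i independently; p_leak models all unmodelled causes."""
    pr_effect_absent = 1.0 - p_leak
    for p in p_present:
        pr_effect_absent *= 1.0 - p
    return 1.0 - pr_effect_absent

print(noisy_or([0.8, 0.5]))      # 1 - (0.2 * 0.5) = 0.9
print(noisy_or([], p_leak=0.1))  # leaky: effect possible with no cause present
```

With a leak of zero and no present causes the effect is impossible, which is exactly the behaviour the leaky variant was introduced to repair.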

- What is the use of sensitivity analysis?

Automated construction:

- Are the assumptions required of a database in order to use it for learning a network realistic?
- What are possible solutions to, and the effects of, handling missing values in a database?
- N(c_X) gives the number of cases in a database for which the set of variables X has the given configuration. Why can we correctly assume that N(c_∅) = N?
- Is frequency counting an efficient way of probability estimation?
- What is the difference between a conditional independence learning algorithm and a metric learning algorithm?
- Does the order of variables used for conditional independence learning affect the resulting graph?
- Why are a quality measure and a search heuristic necessary ingredients for a metric learning algorithm?
- What is the function of the different ingredients of the MDL quality measure?
- Why does the MDL measure as stated assume variables to be binary? Can this restriction be lifted?
- What are the ranges of the different terms in the MDL measure?
- Why is the log P term a constant if you assume a uniform prior distribution over possible digraphs?
- You could say entropy ≈ chaos ≈ uncertainty. Why does lower entropy go hand in hand with denser digraphs?
- Why do denser digraphs go hand in hand with less reliable probability estimates? What solution to this problem is present in the MDL measure?
- What are the benefits and drawbacks of using a search heuristic?
- When employing the search heuristic as described in Chapter 5, it is sufficient to consider only the gain in quality. Why?
- For the search heuristic described: why does a difference in vertex quality for a single vertex correspond to a difference in graph quality?
- When the search heuristic adds an arc (V_i, V_j), why does the quality of V_j change and not the quality of V_i?
- If adding a certain arc decreases the quality of the graph, will that arc ever be added? If adding a certain arc increases the quality of the graph, will that arc always be added?
- Does the choice of an arc to add affect the set of arcs that may subsequently be added?
- Does the procedure of adding arcs that give the largest increase in quality result in a graph with maximum quality?

- Is the MDL measure a suitable measure?
- When should you learn networks of restricted topology?
- What are the possible effects of learning a network from a database that does not obey the required assumptions?

Chapter 6

This chapter tries to bridge the gap between the construction and the actual use of a Bayesian network application. It first discusses possible methods for evaluating the behaviour of your network; then some example applications are discussed in which a Bayesian network is used as a component in a decision support process.²

² The subjects of evaluation and of test selection, which are briefly touched upon here, are also directions of research pursued in the Decision Support Systems group of our Department.

Questions

Sensitivity analysis:

- What is the use of sensitivity analysis?
- Compare the following terms: sensitivity, robustness, and correctness.
- Is it computationally feasible to perform a sensitivity analysis?
- Why can no parameter probability have a non-linear influence on a prior probability of interest?
- Is the sensitivity set a property of a Bayesian network's digraph?
- Why can variables in the sensitivity set be excluded from a sensitivity analysis?
- What do sensitivity functions look like in general? Why are sensitivity functions for a prior joint probability linear?
- What is the amount of data you get from a sensitivity analysis? How can you select parameters of interest?
- What are the advantages and disadvantages of performing a two-way sensitivity analysis as opposed to a one-way analysis?

Evaluation:

- In evaluating a Bayesian network against a standard of validity, why would you ideally use a gold standard? Why is this not always possible?
- What information does and doesn't a percentage correct convey?

- What is an acceptable percentage correct?
- What information does and doesn't the Brier score convey? What is an acceptable Brier score?

Application:

- What is the use of a two-layer architecture?
- In the threshold model for patient management, are the utilities patient-specific? What about the threshold probabilities?
- Can the threshold model for patient management be applied to a series of tests?
- Why are utilities in the threshold model said to be subjective, and in diagnostic problem solving objective?
- In diagnostic problem solving, utilities are defined for values of binary variables only. Can this be generalised to non-binary variables?
- In diagnostic problem solving, utilities represent the shift in probability of a hypothesis as a result of evidence. Why are large shifts in either direction awarded large utilities?
- In diagnostic problem solving, is the utility of a variable's value a fixed number? What about the expected utility of a node?
- In selective evidence gathering, what is the use of a stopping criterion?
- In selective evidence gathering, what are the effects of the single-disorder and myopic assumptions?
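For the evaluation questions above, one common formulation of the Brier score for binary predictions is the mean squared difference between the forecast probability and the actual 0/1 outcome; whether the syllabus uses this exact variant is an assumption here. It illustrates what a percentage correct misses: the score also rewards well-calibrated probabilities, not just correct classifications.

```python
def brier_score(forecasts, outcomes):
    """Mean squared difference between forecast probabilities and
    the actual outcomes (coded 0/1); lower is better, 0 is perfect."""
    n = len(forecasts)
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / n

# A perfectly confident and perfectly correct forecaster scores 0 ...
print(brier_score([1.0, 0.0, 1.0], [1, 0, 1]))  # 0.0
# ... while always saying "fifty-fifty" scores 0.25 regardless of outcomes.
print(brier_score([0.5, 0.5, 0.5], [1, 0, 1]))  # 0.25
```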
