Introduction to Graphical Models

Size: px
Start display at page:

Download "Introduction to Graphical Models"

Transcription

1 obert Collins CSE586 Credits: Several slides are from: Introduction to Graphical odels eadings in Prince textbook: Chapters 0 and but mainly only on directed graphs at this time eview: Probability Theory Sum rule (marginal distributions) Product rule From these we have Bayes theorem with normalization factor eview: Conditional Probabilty Conditional Probability (rewriting product rule) A B) P (A, B) / B) Chain ule A,B,C,D) A) A,B) A,B,C) A,B,C,D) A) A,B) A,B,C) Conditional Independence A) B A) C A,B) D A,B,C) A, B C) A C ) B C) statistical independence A, B) A) B) Christopher Bishop, S Overview of Graphical odels Graphical odels model conditional dependence/ independence Graph structure specifies how joint probability factors Directed graphs Example:H The Joint Distribution ecipe for making a joint distribution of variables: Example: Boolean variables A, B, C Undirected graphs Example:F Inference by : belief propagation Sum-product algorithm ax-product (in-sum if using logs) We will focus mainly on directed graphs right now. Andrew oore, CU

2 The Joint Distribution Example: Boolean variables A, B, C The Joint Distribution Example: Boolean variables A, B, C ecipe for making a joint distribution of variables: A B C ecipe for making a joint distribution of variables: A B C Prob ake a truth table listing all combinations of values of your variables (if there are Boolean variables then the table will have 2 rows) ake a truth table listing all combinations of values of your variables (if there are Boolean variables then the table will have 2 rows). 2. For each combination of values, say how probable it is Andrew oore, CU Andrew oore, CU The Joint Distribution Example: Boolean variables A, B, C Joint distributions ecipe for making a joint distribution of variables:. ake a truth table listing all combinations of values of your variables (if there are Boolean variables then the table will have 2 rows). 2. For each combination of values, say how probable it is. 3. If you subscribe to the axioms of probability, those numbers must sum to. A B C Prob A truth table C Good news Once you have a joint distribution, you can answer all sorts of probabilistic questions involving combinations of attributes 0.30 B 0.0 Andrew oore, CU Using the Joint Using the Joint Poor ale) E) row) rows matching E Poor) E) row) rows matching E Andrew oore, CU Andrew oore, CU 2

3 Inference with the Joint Inference with the Joint computing conditional probabilities E E2) E E2) E ) 2 rows matching E and E2 rows matching E2 row) row) E E2) E E2) E ) 2 rows matching E and E2 rows matching E2 row) row) ale Poor) / Andrew oore, CU Andrew oore, CU Joint distributions Good news Once you have a joint distribution, you can answer all sorts of probabilistic questions involving combinations of attributes Bad news Impossible to create JD for more than about ten attributes because there are so many numbers needed when you build the thing. For 0 binary variables you need to specify numbers 023. (question for class: why the -?) How to use Fewer Numbers Factor the joint distribution into a product of distributions over subsets of variables Identify (or just assume) independence between some subsets of variables Use that independence to simplify some of the distributions Graphical models provide a principled way of doing this. Factoring Directed versus Undirected Graphs Consider an arbitrary joint distribution We can always factor it, by application of the chain rule what this factored form looks like as a graphical model Directed Graph Examples: Bayes nets Hs Undirected Graph Examples FS Note: The word graphical denotes the graph structure underlying the model, not the fact that you can draw a pretty picture of it (although that helps). Christopher Bishop, S Christopher Bishop, S 3

4 Graphical odel Concepts Graphical odel Concepts s)0.3 S )0.6 s)0.3 S )0.6 ^S)0.05 ^~S)0. ~^S)0. ~^~S)0.2 T T )0.3 T ~)0.8 )0.3 ~)0.6 ^S)0.05 ^~S)0. ~^S)0. ~^~S)0.2 T T )0.3 T ~)0.8 )0.3 ~)0.6 Nodes represent random variables. Edges (or lack of edges) represent conditional dependence (or independence). Each node is annotated with a table of conditional probabilities wrt parents. Note: The word graphical denotes the graph structure underlying the model, not the fact that you can draw a pretty picture of it using graphics. Directed Acyclic Graphs Directed acyclic means we can t follow arrows around in a cycle. Examples: chains; trees Also, things that look like this: Factoring Examples Joint distribution where denotes the parents of i We can read the factored form of the joint distribution immediately from a directed graph where denotes the parents of i x parents of x) y parents of y) Factoring Examples Joint distribution Factoring Examples We can read the form of the joint distribution directly from the directed graph where denotes the parents of i where denotes the parents of i p(x,y)p(x)p(y x) p(x,y)p(y)p(x y) p(x,y)p(x)p(y) parents of ) parents of ) parents of ) 4

5 Factoring Examples We can read the form of the joint distribution directly from the directed graph Factoring Examples We can read the form of the joint distribution directly from a directed graph where denotes the parents of i where denotes the parents of i,,) ) ) ) parents of ) parents of ) parents of ) Factoring Examples Graphical odel Concepts We can read the form of the joint distribution directly from a directed graph s)0.3 S )0.6 where denotes the parents of i ^S)0.05 ^~S)0. ~^S)0. ~^~S)0.2 T T )0.3 T ~)0.8 )0.3 ~)0.6 Note:,,),))) How about this one?,,,s,t) Graphical odel Concepts Factoring Examples s)0.3 S )0.6 Joint distribution ^S)0.05 ^~S)0. ~^S)0. ~^~S)0.2 T T )0.3 T ~)0.8 )0.3 ~)0.6 where denotes the parents of i What about this one? How about this one?,,,s,t) S)) S,) )T ) No directed cycles Note to mention T )+T ~) Christopher Bishop, S 5

6 Factoring Examples How many probabilities do we have to specify/learn (assuming each x i is a binary variable)? if fully connected, we would need 2^7-27 but, for this connectivity, we need Important Case: Time Series Consider modeling a time series of sequential data x, x2,..., xn These could represent locations of a tracked object over time observations of the weather each day spectral coefficients of a speech signal joint angles during human motion Note: If all nodes were independent, we would only need 7! odeling Time Series odeling Time Series Simplest model of a time series is that all observations are independent. In the most general case, we could use chain rule to state that any node is dependent on all previous nodes... This would be appropriate for modeling successive tosses {heads,tails} of an unbiased coin. However, it doesn t really treat the series as a sequence. That is, we could permute the ordering of the observations and not change a thing. x,x2,x3,x4,...) x)x2 x)x3 x,x2)x4 x,x2,x3)... ook for an intermediate model between these two extremes. odeling Time Series odeling Time Series arkov assumption: xn x,x2,...,xn-) xn xn-) that is, assume all conditional distributions depend only on the most recent previous observation. Generalization: State-Space odels You have a arkov chain of latent (unobserved) states Each state generates an observation x! x 2! x n-! x n! x n+! The result is a first-order arkov Chain y! y 2! y n-! y n! y n+! x,x2,x3,x4,...) x)x2 x)x3 x2)x4 x3)... Goal: Given a sequence of observations, predict the sequence of unobserved states that imizes the joint probability. 6

7 odeling Time Series Examples of State Space models Hidden arkov model Kalman filter x! x 2! x n-! x n! x n+! odeling Time Series x,x2,x3,x4,...,y,y2,y3,y4,...) x)y x)x2 x)y2 x2)x3 x2)y3 x3)x4 x3)y4 x4)... x! x 2! x n-! x n! x n+! y! y 2! y n-! y n! y n+! y! y 2! y n-! y n! y n+! Example of a Tree-structured odel essage Passing Confusion alert: Our textbook uses w to denote a world state variable and x to denote a measurement. (we have been using x to denote world state and y as the measurement). essage Passing : Belief Propagation Example: D chain Find marginal for a particular node Key Idea of essage Passing multiplication distributes over addition a * b + a * c a * (b + c) as a consequence: for -state nodes, cost is exponential in length of chain but, we can exploit the graphical structure (conditional independences) is number of discrete values a variable can take is number of variables Applicable to both directed and undirected graphs. 7

8 Example essage Passing In the next several slides, we will consider an example of a simple, four-variable arkov chain. 48 multiplications + 23 additions 5 multiplications + 6 additions For, this principle is applied to functions of random variables, rather than the variables as done here. essage Passing Now consider computing the marginal distribution of variable x3 essage Passing ultiplication distributes over addition... essage Passing, aka Forward-Backward Algorithm Can view as sending/combining messages... Forward-Backward Algorithm Express marginals as product of messages evaluated forward from ancesters of xi and backwards from decendents of xi α β Forw * Back ecursive evaluation of messages - + Find Z by normalizing Works in both directed and undirected graphs Christopher Bishop, S 8

9 Confusion Alert! This standard notation for defining heavily overloads the notion of multiplication, e.g. the messages are not scalars it is more appropriate to think of them as vectors, matrices, or even tensors depending on how many variables are involved, with multiplication defined accordingly. [] Note: these are conditional probability tables, so values in each row must sum to one Not scalar multiplication! [] Note: these are conditional probability tables, so values in each row must sum to one [] Note: these are conditional probability tables, so values in each row must sum to one Interpretation x) x2) x2) x22) x22 X) 0.6 Sample computations: x, x2, x3, x4) (.7)(.4)(.8)().2 x2, x2, x32, x4) (.3)()(.2)(.7).02 Joint Probability, represented in a truth table x x2 x3 x4 x,x2,x3,x4) Joint Probability x x2 x3 x4 x,x2,x3,x4) Compute marginal of x3: x3) x32) 042 9

10 now compute via now compute via [] [] x)x2 x) x)x22 x).3.3 x2)x2 x2) x2)x22 x2) (.7)(.4) + (.3)() (.7)(.6) + (.3)() x)x2 x) x)x22 x).3.3 x2)x2 x2) x2)x22 x2) (.7)(.4) + (.3)() (.7)(.6) + (.3)() simpler way to compute this... [] [.43 7] i.e. matrix multiply can do the combining and marginalization all at once!!!! now compute via now compute via [] [] [] [.43 7] [ ] [] compute sum along rows of x4 x3) Can also do that with matrix multiply: [.43 7] note: this is not [ ] a coincidence essage Passing Can view as sending/combining messages... essage Passing Can view as sending/combining messages... Forw [ ] Belief that x3, from front part of chain Belief that x32, from front part of chain * How to combine them? Back Belief that x32, from back part of chain Belief that x3, from back part of chain Forw [ ] * Back [ (.458)() (42)() ] [ ] [ ] (after normalizing, but note that it was already normalized. Again, not a coincidence) These are the same values for the marginal x3) that we computed from the raw joint probability table. Whew!!! 0

11 [] [] If we want to compute all marginals, we can do it in one shot by cascading, for a big computational savings. We need one cascaded forward pass, one separate cascaded backward pass, then a combination and normalization at each node. forward pass [] backward pass Then combine each by elementwise multiply and normalize forward pass [] backward pass Then combine each by elementwise multiply and normalize Forward: [] [.43 7] [ ] [ ] Backward: [ ] [ ] [ ] [ ] combined+ normalized [] [.43 7] [ ] [ ] Note: In this example, a directed arkov chain using true conditional probabilities (rows sum to one), only the forward pass is needed. This is true because the backward pass sums along rows, and always produces [ ]. ax arginals What if we want to know the most probably state (mode of the distribution)? Since the marginal distributions can tell us which value of each variable yields the highest marginal probability (that is, which value is most likely), we might try to just take the arg of each marginal distribution. We didn t really need forward AND backward in this example. Forward: [] [.43 7] [ ] [ ] Backward: [ ] [ ] [ ] [ ] combined+ normalized [] [.43 7] [ ] [ ] these are already the marginals for this example Didn t need to do these steps marginals computed by belief propagation marginals [] [.43 7] [ ] [ ] arg x x 2 2 x 3 2 x 4 Although that s correct in this example, it isn t always the case

12 ax arginals can Fail to Find the AP However, the marginals find most likely values of each variable treated individually, which may not be the combination of values that jointly imize the distribution. marginals: w4, w22 actual AP solution: w2, w24 ax-product Algorithm Goal: find define the marginal then essage passing algorithm with sum replaced by Generalizes to any two operations forming a semiring Christopher Bishop, S Computing AP Value product Can solve using algorithm with sum replaced by. [] In our chain, we start at the end and work our way back to the root (x) using the -product algorithm, keeping track of the value as we go. marginal stage.7.4 x)x2 x).3 x2)x2 x2).7.6 x)x22 x).3 x2)x22 x2) stage3 stage2 [(.7)(.4), (.3)()] [(.7)(.6), (.3)()] product product [] [] [.28.42] [ ] (.28)(.8) (.28)(.2) (.42)(.2) (.42)(.8) Note that this is no longer matrix multiplication, since we are not summing down the columns but taking instead... [.28.42] [ ] (.28)(.8) (.28)(.2) (.42)(.2) (.42)(.8) compute along rows of x4 x3).7 2

13 [] product Joint Probability, represented in a truth table x x2 x3 x4 x,x2,x3,x4) argest value of joint prob mode AP interpretation: the mode of the joint distribution is.2352, and the value of variable x3, in the configuration that yields the mode value, is 2. [ ] [(.224)(), (.336)(.7)] arg 2 x3 interpretation: the mode of the joint distribution is.2352, and the value of variable x3, in the configuration that yields the mode value, is [(.224)(), (.336)(.7)].2352 arg 2 x3 Computing Arg-ax of AP Value : AP Estimate Chris Bishop, P: At this point, we might be tempted simply to continue with the algorithm [sending forward-backward messages and combining to compute arg for each variable node]. However, because we are now imizing rather than summing, it is possible that there may be multiple configurations of x all of which give rise to the imum value for p(x). In such cases, this strategy can fail because it is possible for the individual variable values obtained by imizing the product of messages at each node to belong to different imizing configurations, giving an overall configuration that no longer corresponds to a imum. The problem can be resolved by adopting a rather different kind of... Essentially, the solution is to write a dynamic programming algorithm based on -product. DP State Space Trellis : AP Estimate Joint Probability, represented in a truth table x x2 x3 x4 x,x2,x3,x4) DP State Space Trellis argest value of joint prob mode AP achieved for x, x22, x32, x4 3

14 Belief Propagation Summary Definition can be extended to general tree-structured graphs Works for both directed AND undirected graphs Efficiently computes marginals and AP configurations At each node: form product of incoming messages and local evidence marginalize to give outgoing message one message in each direction across every link oopy Belief Propagation BP applied to graph that contains loops needs a propagation schedule needs multiple iterations might not converge Typically works well, even though it isn t supposed to State-of-the-art performance in error-correcting codes Gives exact answer in any acyclic graph (no loops). Christopher Bishop, S Christopher Bishop, S 4

The Basics of Graphical Models

The Basics of Graphical Models The Basics of Graphical Models David M. Blei Columbia University October 3, 2015 Introduction These notes follow Chapter 2 of An Introduction to Probabilistic Graphical Models by Michael Jordan. Many figures

More information

Course: Model, Learning, and Inference: Lecture 5

Course: Model, Learning, and Inference: Lecture 5 Course: Model, Learning, and Inference: Lecture 5 Alan Yuille Department of Statistics, UCLA Los Angeles, CA 90095 yuille@stat.ucla.edu Abstract Probability distributions on structured representation.

More information

SYSM 6304: Risk and Decision Analysis Lecture 5: Methods of Risk Analysis

SYSM 6304: Risk and Decision Analysis Lecture 5: Methods of Risk Analysis SYSM 6304: Risk and Decision Analysis Lecture 5: Methods of Risk Analysis M. Vidyasagar Cecil & Ida Green Chair The University of Texas at Dallas Email: M.Vidyasagar@utdallas.edu October 17, 2015 Outline

More information

These axioms must hold for all vectors ū, v, and w in V and all scalars c and d.

These axioms must hold for all vectors ū, v, and w in V and all scalars c and d. DEFINITION: A vector space is a nonempty set V of objects, called vectors, on which are defined two operations, called addition and multiplication by scalars (real numbers), subject to the following axioms

More information

9.4. The Scalar Product. Introduction. Prerequisites. Learning Style. Learning Outcomes

9.4. The Scalar Product. Introduction. Prerequisites. Learning Style. Learning Outcomes The Scalar Product 9.4 Introduction There are two kinds of multiplication involving vectors. The first is known as the scalar product or dot product. This is so-called because when the scalar product of

More information

Approximation Algorithms

Approximation Algorithms Approximation Algorithms or: How I Learned to Stop Worrying and Deal with NP-Completeness Ong Jit Sheng, Jonathan (A0073924B) March, 2012 Overview Key Results (I) General techniques: Greedy algorithms

More information

CHAPTER 2 Estimating Probabilities

CHAPTER 2 Estimating Probabilities CHAPTER 2 Estimating Probabilities Machine Learning Copyright c 2016. Tom M. Mitchell. All rights reserved. *DRAFT OF January 24, 2016* *PLEASE DO NOT DISTRIBUTE WITHOUT AUTHOR S PERMISSION* This is a

More information

3. The Junction Tree Algorithms

3. The Junction Tree Algorithms A Short Course on Graphical Models 3. The Junction Tree Algorithms Mark Paskin mark@paskin.org 1 Review: conditional independence Two random variables X and Y are independent (written X Y ) iff p X ( )

More information

Factor Graphs and the Sum-Product Algorithm

Factor Graphs and the Sum-Product Algorithm 498 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 47, NO. 2, FEBRUARY 2001 Factor Graphs and the Sum-Product Algorithm Frank R. Kschischang, Senior Member, IEEE, Brendan J. Frey, Member, IEEE, and Hans-Andrea

More information

Math 4310 Handout - Quotient Vector Spaces

Math 4310 Handout - Quotient Vector Spaces Math 4310 Handout - Quotient Vector Spaces Dan Collins The textbook defines a subspace of a vector space in Chapter 4, but it avoids ever discussing the notion of a quotient space. This is understandable

More information

a 11 x 1 + a 12 x 2 + + a 1n x n = b 1 a 21 x 1 + a 22 x 2 + + a 2n x n = b 2.

a 11 x 1 + a 12 x 2 + + a 1n x n = b 1 a 21 x 1 + a 22 x 2 + + a 2n x n = b 2. Chapter 1 LINEAR EQUATIONS 1.1 Introduction to linear equations A linear equation in n unknowns x 1, x,, x n is an equation of the form a 1 x 1 + a x + + a n x n = b, where a 1, a,..., a n, b are given

More information

LS.6 Solution Matrices

LS.6 Solution Matrices LS.6 Solution Matrices In the literature, solutions to linear systems often are expressed using square matrices rather than vectors. You need to get used to the terminology. As before, we state the definitions

More information

ECON 459 Game Theory. Lecture Notes Auctions. Luca Anderlini Spring 2015

ECON 459 Game Theory. Lecture Notes Auctions. Luca Anderlini Spring 2015 ECON 459 Game Theory Lecture Notes Auctions Luca Anderlini Spring 2015 These notes have been used before. If you can still spot any errors or have any suggestions for improvement, please let me know. 1

More information

Lecture Note 1 Set and Probability Theory. MIT 14.30 Spring 2006 Herman Bennett

Lecture Note 1 Set and Probability Theory. MIT 14.30 Spring 2006 Herman Bennett Lecture Note 1 Set and Probability Theory MIT 14.30 Spring 2006 Herman Bennett 1 Set Theory 1.1 Definitions and Theorems 1. Experiment: any action or process whose outcome is subject to uncertainty. 2.

More information

Dynamic Programming. Lecture 11. 11.1 Overview. 11.2 Introduction

Dynamic Programming. Lecture 11. 11.1 Overview. 11.2 Introduction Lecture 11 Dynamic Programming 11.1 Overview Dynamic Programming is a powerful technique that allows one to solve many different types of problems in time O(n 2 ) or O(n 3 ) for which a naive approach

More information

Click on the links below to jump directly to the relevant section

Click on the links below to jump directly to the relevant section Click on the links below to jump directly to the relevant section What is algebra? Operations with algebraic terms Mathematical properties of real numbers Order of operations What is Algebra? Algebra is

More information

Decision Making under Uncertainty

Decision Making under Uncertainty 6.825 Techniques in Artificial Intelligence Decision Making under Uncertainty How to make one decision in the face of uncertainty Lecture 19 1 In the next two lectures, we ll look at the question of how

More information

Binary Adders: Half Adders and Full Adders

Binary Adders: Half Adders and Full Adders Binary Adders: Half Adders and Full Adders In this set of slides, we present the two basic types of adders: 1. Half adders, and 2. Full adders. Each type of adder functions to add two binary bits. In order

More information

Lecture Notes 2: Matrices as Systems of Linear Equations

Lecture Notes 2: Matrices as Systems of Linear Equations 2: Matrices as Systems of Linear Equations 33A Linear Algebra, Puck Rombach Last updated: April 13, 2016 Systems of Linear Equations Systems of linear equations can represent many things You have probably

More information

A Few Basics of Probability

A Few Basics of Probability A Few Basics of Probability Philosophy 57 Spring, 2004 1 Introduction This handout distinguishes between inductive and deductive logic, and then introduces probability, a concept essential to the study

More information

3.2 Matrix Multiplication

3.2 Matrix Multiplication 3.2 Matrix Multiplication Question : How do you multiply two matrices? Question 2: How do you interpret the entries in a product of two matrices? When you add or subtract two matrices, you add or subtract

More information

The PageRank Citation Ranking: Bring Order to the Web

The PageRank Citation Ranking: Bring Order to the Web The PageRank Citation Ranking: Bring Order to the Web presented by: Xiaoxi Pang 25.Nov 2010 1 / 20 Outline Introduction A ranking for every page on the Web Implementation Convergence Properties Personalized

More information

Question 2: How do you solve a matrix equation using the matrix inverse?

Question 2: How do you solve a matrix equation using the matrix inverse? Question : How do you solve a matrix equation using the matrix inverse? In the previous question, we wrote systems of equations as a matrix equation AX B. In this format, the matrix A contains the coefficients

More information

Bayesian Updating with Discrete Priors Class 11, 18.05, Spring 2014 Jeremy Orloff and Jonathan Bloom

Bayesian Updating with Discrete Priors Class 11, 18.05, Spring 2014 Jeremy Orloff and Jonathan Bloom 1 Learning Goals Bayesian Updating with Discrete Priors Class 11, 18.05, Spring 2014 Jeremy Orloff and Jonathan Bloom 1. Be able to apply Bayes theorem to compute probabilities. 2. Be able to identify

More information

Similarity and Diagonalization. Similar Matrices

Similarity and Diagonalization. Similar Matrices MATH022 Linear Algebra Brief lecture notes 48 Similarity and Diagonalization Similar Matrices Let A and B be n n matrices. We say that A is similar to B if there is an invertible n n matrix P such that

More information

5.5. Solving linear systems by the elimination method

5.5. Solving linear systems by the elimination method 55 Solving linear systems by the elimination method Equivalent systems The major technique of solving systems of equations is changing the original problem into another one which is of an easier to solve

More information

GREEN CHICKEN EXAM - NOVEMBER 2012

GREEN CHICKEN EXAM - NOVEMBER 2012 GREEN CHICKEN EXAM - NOVEMBER 2012 GREEN CHICKEN AND STEVEN J. MILLER Question 1: The Green Chicken is planning a surprise party for his grandfather and grandmother. The sum of the ages of the grandmother

More information

DERIVATIVES AS MATRICES; CHAIN RULE

DERIVATIVES AS MATRICES; CHAIN RULE DERIVATIVES AS MATRICES; CHAIN RULE 1. Derivatives of Real-valued Functions Let s first consider functions f : R 2 R. Recall that if the partial derivatives of f exist at the point (x 0, y 0 ), then we

More information

Applied Algorithm Design Lecture 5

Applied Algorithm Design Lecture 5 Applied Algorithm Design Lecture 5 Pietro Michiardi Eurecom Pietro Michiardi (Eurecom) Applied Algorithm Design Lecture 5 1 / 86 Approximation Algorithms Pietro Michiardi (Eurecom) Applied Algorithm Design

More information

Conditional Random Fields: An Introduction

Conditional Random Fields: An Introduction Conditional Random Fields: An Introduction Hanna M. Wallach February 24, 2004 1 Labeling Sequential Data The task of assigning label sequences to a set of observation sequences arises in many fields, including

More information

E3: PROBABILITY AND STATISTICS lecture notes

E3: PROBABILITY AND STATISTICS lecture notes E3: PROBABILITY AND STATISTICS lecture notes 2 Contents 1 PROBABILITY THEORY 7 1.1 Experiments and random events............................ 7 1.2 Certain event. Impossible event............................

More information

Simplifying Logic Circuits with Karnaugh Maps

Simplifying Logic Circuits with Karnaugh Maps Simplifying Logic Circuits with Karnaugh Maps The circuit at the top right is the logic equivalent of the Boolean expression: f = abc + abc + abc Now, as we have seen, this expression can be simplified

More information

Section 10.4 Vectors

Section 10.4 Vectors Section 10.4 Vectors A vector is represented by using a ray, or arrow, that starts at an initial point and ends at a terminal point. Your textbook will always use a bold letter to indicate a vector (such

More information

Likelihood: Frequentist vs Bayesian Reasoning

Likelihood: Frequentist vs Bayesian Reasoning "PRINCIPLES OF PHYLOGENETICS: ECOLOGY AND EVOLUTION" Integrative Biology 200B University of California, Berkeley Spring 2009 N Hallinan Likelihood: Frequentist vs Bayesian Reasoning Stochastic odels and

More information

LECTURE 4. Last time: Lecture outline

LECTURE 4. Last time: Lecture outline LECTURE 4 Last time: Types of convergence Weak Law of Large Numbers Strong Law of Large Numbers Asymptotic Equipartition Property Lecture outline Stochastic processes Markov chains Entropy rate Random

More information

TU e. Advanced Algorithms: experimentation project. The problem: load balancing with bounded look-ahead. Input: integer m 2: number of machines

TU e. Advanced Algorithms: experimentation project. The problem: load balancing with bounded look-ahead. Input: integer m 2: number of machines The problem: load balancing with bounded look-ahead Input: integer m 2: number of machines integer k 0: the look-ahead numbers t 1,..., t n : the job sizes Problem: assign jobs to machines machine to which

More information

Bayesian Networks. Read R&N Ch. 14.1-14.2. Next lecture: Read R&N 18.1-18.4

Bayesian Networks. Read R&N Ch. 14.1-14.2. Next lecture: Read R&N 18.1-18.4 Bayesian Networks Read R&N Ch. 14.1-14.2 Next lecture: Read R&N 18.1-18.4 You will be expected to know Basic concepts and vocabulary of Bayesian networks. Nodes represent random variables. Directed arcs

More information

2x + y = 3. Since the second equation is precisely the same as the first equation, it is enough to find x and y satisfying the system

2x + y = 3. Since the second equation is precisely the same as the first equation, it is enough to find x and y satisfying the system 1. Systems of linear equations We are interested in the solutions to systems of linear equations. A linear equation is of the form 3x 5y + 2z + w = 3. The key thing is that we don t multiply the variables

More information

Compression algorithm for Bayesian network modeling of binary systems

Compression algorithm for Bayesian network modeling of binary systems Compression algorithm for Bayesian network modeling of binary systems I. Tien & A. Der Kiureghian University of California, Berkeley ABSTRACT: A Bayesian network (BN) is a useful tool for analyzing the

More information

Linear Programming. March 14, 2014

Linear Programming. March 14, 2014 Linear Programming March 1, 01 Parts of this introduction to linear programming were adapted from Chapter 9 of Introduction to Algorithms, Second Edition, by Cormen, Leiserson, Rivest and Stein [1]. 1

More information

3. Mathematical Induction

3. Mathematical Induction 3. MATHEMATICAL INDUCTION 83 3. Mathematical Induction 3.1. First Principle of Mathematical Induction. Let P (n) be a predicate with domain of discourse (over) the natural numbers N = {0, 1,,...}. If (1)

More information

2.4 Real Zeros of Polynomial Functions

2.4 Real Zeros of Polynomial Functions SECTION 2.4 Real Zeros of Polynomial Functions 197 What you ll learn about Long Division and the Division Algorithm Remainder and Factor Theorems Synthetic Division Rational Zeros Theorem Upper and Lower

More information

6.3 Conditional Probability and Independence

6.3 Conditional Probability and Independence 222 CHAPTER 6. PROBABILITY 6.3 Conditional Probability and Independence Conditional Probability Two cubical dice each have a triangle painted on one side, a circle painted on two sides and a square painted

More information

CONTENTS 1. Peter Kahn. Spring 2007

CONTENTS 1. Peter Kahn. Spring 2007 CONTENTS 1 MATH 304: CONSTRUCTING THE REAL NUMBERS Peter Kahn Spring 2007 Contents 2 The Integers 1 2.1 The basic construction.......................... 1 2.2 Adding integers..............................

More information

Solutions to Math 51 First Exam January 29, 2015

Solutions to Math 51 First Exam January 29, 2015 Solutions to Math 5 First Exam January 29, 25. ( points) (a) Complete the following sentence: A set of vectors {v,..., v k } is defined to be linearly dependent if (2 points) there exist c,... c k R, not

More information

Chapter 7 - Roots, Radicals, and Complex Numbers

Chapter 7 - Roots, Radicals, and Complex Numbers Math 233 - Spring 2009 Chapter 7 - Roots, Radicals, and Complex Numbers 7.1 Roots and Radicals 7.1.1 Notation and Terminology In the expression x the is called the radical sign. The expression under the

More information

Formal Languages and Automata Theory - Regular Expressions and Finite Automata -

Formal Languages and Automata Theory - Regular Expressions and Finite Automata - Formal Languages and Automata Theory - Regular Expressions and Finite Automata - Samarjit Chakraborty Computer Engineering and Networks Laboratory Swiss Federal Institute of Technology (ETH) Zürich March

More information

10.2 ITERATIVE METHODS FOR SOLVING LINEAR SYSTEMS. The Jacobi Method

10.2 ITERATIVE METHODS FOR SOLVING LINEAR SYSTEMS. The Jacobi Method 578 CHAPTER 1 NUMERICAL METHODS 1. ITERATIVE METHODS FOR SOLVING LINEAR SYSTEMS As a numerical technique, Gaussian elimination is rather unusual because it is direct. That is, a solution is obtained after

More information

Discrete Mathematics and Probability Theory Fall 2009 Satish Rao, David Tse Note 2

Discrete Mathematics and Probability Theory Fall 2009 Satish Rao, David Tse Note 2 CS 70 Discrete Mathematics and Probability Theory Fall 2009 Satish Rao, David Tse Note 2 Proofs Intuitively, the concept of proof should already be familiar We all like to assert things, and few of us

More information

MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS

MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS Systems of Equations and Matrices Representation of a linear system The general system of m equations in n unknowns can be written a x + a 2 x 2 + + a n x n b a

More information

Decision Analysis. Here is the statement of the problem:

Decision Analysis. Here is the statement of the problem: Decision Analysis Formal decision analysis is often used when a decision must be made under conditions of significant uncertainty. SmartDrill can assist management with any of a variety of decision analysis

More information

Lecture 2 Matrix Operations

Lecture 2 Matrix Operations Lecture 2 Matrix Operations transpose, sum & difference, scalar multiplication matrix multiplication, matrix-vector product matrix inverse 2 1 Matrix transpose transpose of m n matrix A, denoted A T or

More information

Solving Quadratic Equations

Solving Quadratic Equations 9.3 Solving Quadratic Equations by Using the Quadratic Formula 9.3 OBJECTIVES 1. Solve a quadratic equation by using the quadratic formula 2. Determine the nature of the solutions of a quadratic equation

More information

5 Directed acyclic graphs

5 Directed acyclic graphs 5 Directed acyclic graphs (5.1) Introduction In many statistical studies we have prior knowledge about a temporal or causal ordering of the variables. In this chapter we will use directed graphs to incorporate

More information

CSE 326, Data Structures. Sample Final Exam. Problem Max Points Score 1 14 (2x7) 2 18 (3x6) 3 4 4 7 5 9 6 16 7 8 8 4 9 8 10 4 Total 92.

CSE 326, Data Structures. Sample Final Exam. Problem Max Points Score 1 14 (2x7) 2 18 (3x6) 3 4 4 7 5 9 6 16 7 8 8 4 9 8 10 4 Total 92. Name: Email ID: CSE 326, Data Structures Section: Sample Final Exam Instructions: The exam is closed book, closed notes. Unless otherwise stated, N denotes the number of elements in the data structure

More information

A Non-Linear Schema Theorem for Genetic Algorithms

A Non-Linear Schema Theorem for Genetic Algorithms A Non-Linear Schema Theorem for Genetic Algorithms William A Greene Computer Science Department University of New Orleans New Orleans, LA 70148 bill@csunoedu 504-280-6755 Abstract We generalize Holland

More information

Why? A central concept in Computer Science. Algorithms are ubiquitous.

Why? A central concept in Computer Science. Algorithms are ubiquitous. Analysis of Algorithms: A Brief Introduction Why? A central concept in Computer Science. Algorithms are ubiquitous. Using the Internet (sending email, transferring files, use of search engines, online

More information

MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS. + + x 2. x n. a 11 a 12 a 1n b 1 a 21 a 22 a 2n b 2 a 31 a 32 a 3n b 3. a m1 a m2 a mn b m

MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS. + + x 2. x n. a 11 a 12 a 1n b 1 a 21 a 22 a 2n b 2 a 31 a 32 a 3n b 3. a m1 a m2 a mn b m MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS 1. SYSTEMS OF EQUATIONS AND MATRICES 1.1. Representation of a linear system. The general system of m equations in n unknowns can be written a 11 x 1 + a 12 x 2 +

More information

The last three chapters introduced three major proof techniques: direct,

The last three chapters introduced three major proof techniques: direct, CHAPTER 7 Proving Non-Conditional Statements The last three chapters introduced three major proof techniques: direct, contrapositive and contradiction. These three techniques are used to prove statements

More information

University of Lille I PC first year list of exercises n 7. Review

University of Lille I PC first year list of exercises n 7. Review University of Lille I PC first year list of exercises n 7 Review Exercise Solve the following systems in 4 different ways (by substitution, by the Gauss method, by inverting the matrix of coefficients

More information

1 Review of Least Squares Solutions to Overdetermined Systems

1 Review of Least Squares Solutions to Overdetermined Systems cs4: introduction to numerical analysis /9/0 Lecture 7: Rectangular Systems and Numerical Integration Instructor: Professor Amos Ron Scribes: Mark Cowlishaw, Nathanael Fillmore Review of Least Squares

More information

7 Gaussian Elimination and LU Factorization

7 Gaussian Elimination and LU Factorization 7 Gaussian Elimination and LU Factorization In this final section on matrix factorization methods for solving Ax = b we want to take a closer look at Gaussian elimination (probably the best known method

More information

Rational Exponents. Squaring both sides of the equation yields. and to be consistent, we must have

Rational Exponents. Squaring both sides of the equation yields. and to be consistent, we must have 8.6 Rational Exponents 8.6 OBJECTIVES 1. Define rational exponents 2. Simplify expressions containing rational exponents 3. Use a calculator to estimate the value of an expression containing rational exponents

More information

Mathematics for Computer Science/Software Engineering. Notes for the course MSM1F3 Dr. R. A. Wilson

Mathematics for Computer Science/Software Engineering. Notes for the course MSM1F3 Dr. R. A. Wilson Mathematics for Computer Science/Software Engineering Notes for the course MSM1F3 Dr. R. A. Wilson October 1996 Chapter 1 Logic Lecture no. 1. We introduce the concept of a proposition, which is a statement

More information

Study Manual. Probabilistic Reasoning. Silja Renooij. August 2015

Study Manual. Probabilistic Reasoning. Silja Renooij. August 2015 Study Manual Probabilistic Reasoning 2015 2016 Silja Renooij August 2015 General information This study manual was designed to help guide your self studies. As such, it does not include material that is

More information

Conditional Probability, Independence and Bayes Theorem Class 3, 18.05, Spring 2014 Jeremy Orloff and Jonathan Bloom

Conditional Probability, Independence and Bayes Theorem Class 3, 18.05, Spring 2014 Jeremy Orloff and Jonathan Bloom Conditional Probability, Independence and Bayes Theorem Class 3, 18.05, Spring 2014 Jeremy Orloff and Jonathan Bloom 1 Learning Goals 1. Know the definitions of conditional probability and independence

More information

Row Echelon Form and Reduced Row Echelon Form

Row Echelon Form and Reduced Row Echelon Form These notes closely follow the presentation of the material given in David C Lay s textbook Linear Algebra and its Applications (3rd edition) These notes are intended primarily for in-class presentation

More information

Bayesian networks - Time-series models - Apache Spark & Scala

Bayesian networks - Time-series models - Apache Spark & Scala Bayesian networks - Time-series models - Apache Spark & Scala Dr John Sandiford, CTO Bayes Server Data Science London Meetup - November 2014 1 Contents Introduction Bayesian networks Latent variables Anomaly

More information

Structured Learning and Prediction in Computer Vision. Contents

Structured Learning and Prediction in Computer Vision. Contents Foundations and Trends R in Computer Graphics and Vision Vol. 6, Nos. 3 4 (2010) 185 365 c 2011 S. Nowozin and C. H. Lampert DOI: 10.1561/0600000033 Structured Learning and Prediction in Computer Vision

More information

Matrix Algebra. Some Basic Matrix Laws. Before reading the text or the following notes glance at the following list of basic matrix algebra laws.

Matrix Algebra. Some Basic Matrix Laws. Before reading the text or the following notes glance at the following list of basic matrix algebra laws. Matrix Algebra A. Doerr Before reading the text or the following notes glance at the following list of basic matrix algebra laws. Some Basic Matrix Laws Assume the orders of the matrices are such that

More information

MATH10212 Linear Algebra. Systems of Linear Equations. Definition. An n-dimensional vector is a row or a column of n numbers (or letters): a 1.

MATH10212 Linear Algebra. Systems of Linear Equations. Definition. An n-dimensional vector is a row or a column of n numbers (or letters): a 1. MATH10212 Linear Algebra Textbook: D. Poole, Linear Algebra: A Modern Introduction. Thompson, 2006. ISBN 0-534-40596-7. Systems of Linear Equations Definition. An n-dimensional vector is a row or a column

More information

(Refer Slide Time: 2:03)

(Refer Slide Time: 2:03) Control Engineering Prof. Madan Gopal Department of Electrical Engineering Indian Institute of Technology, Delhi Lecture - 11 Models of Industrial Control Devices and Systems (Contd.) Last time we were

More information

Hidden Markov Models in Bioinformatics. By Máthé Zoltán Kőrösi Zoltán 2006

Hidden Markov Models in Bioinformatics. By Máthé Zoltán Kőrösi Zoltán 2006 Hidden Markov Models in Bioinformatics By Máthé Zoltán Kőrösi Zoltán 2006 Outline Markov Chain HMM (Hidden Markov Model) Hidden Markov Models in Bioinformatics Gene Finding Gene Finding Model Viterbi algorithm

More information

Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model

Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model 1 September 004 A. Introduction and assumptions The classical normal linear regression model can be written

More information

Satisfiability Checking

Satisfiability Checking Satisfiability Checking SAT-Solving Prof. Dr. Erika Ábrahám Theory of Hybrid Systems Informatik 2 WS 10/11 Prof. Dr. Erika Ábrahám - Satisfiability Checking 1 / 40 A basic SAT algorithm Assume the CNF

More information

Question: What is the probability that a five-card poker hand contains a flush, that is, five cards of the same suit?

Question: What is the probability that a five-card poker hand contains a flush, that is, five cards of the same suit? ECS20 Discrete Mathematics Quarter: Spring 2007 Instructor: John Steinberger Assistant: Sophie Engle (prepared by Sophie Engle) Homework 8 Hints Due Wednesday June 6 th 2007 Section 6.1 #16 What is the

More information

Solving Mass Balances using Matrix Algebra

Solving Mass Balances using Matrix Algebra Page: 1 Alex Doll, P.Eng, Alex G Doll Consulting Ltd. http://www.agdconsulting.ca Abstract Matrix Algebra, also known as linear algebra, is well suited to solving material balance problems encountered

More information

CS104: Data Structures and Object-Oriented Design (Fall 2013) October 24, 2013: Priority Queues Scribes: CS 104 Teaching Team

CS104: Data Structures and Object-Oriented Design (Fall 2013) October 24, 2013: Priority Queues Scribes: CS 104 Teaching Team CS104: Data Structures and Object-Oriented Design (Fall 2013) October 24, 2013: Priority Queues Scribes: CS 104 Teaching Team Lecture Summary In this lecture, we learned about the ADT Priority Queue. A

More information

Statistical Machine Translation: IBM Models 1 and 2

Statistical Machine Translation: IBM Models 1 and 2 Statistical Machine Translation: IBM Models 1 and 2 Michael Collins 1 Introduction The next few lectures of the course will be focused on machine translation, and in particular on statistical machine translation

More information

Bayesian Networks. Mausam (Slides by UW-AI faculty)

Bayesian Networks. Mausam (Slides by UW-AI faculty) Bayesian Networks Mausam (Slides by UW-AI faculty) Bayes Nets In general, joint distribution P over set of variables (X 1 x... x X n ) requires exponential space for representation & inference BNs provide

More information

! Solve problem to optimality. ! Solve problem in poly-time. ! Solve arbitrary instances of the problem. !-approximation algorithm.

! Solve problem to optimality. ! Solve problem in poly-time. ! Solve arbitrary instances of the problem. !-approximation algorithm. Approximation Algorithms Chapter Approximation Algorithms Q Suppose I need to solve an NP-hard problem What should I do? A Theory says you're unlikely to find a poly-time algorithm Must sacrifice one of

More information

Solving Systems of Linear Equations

Solving Systems of Linear Equations LECTURE 5 Solving Systems of Linear Equations Recall that we introduced the notion of matrices as a way of standardizing the expression of systems of linear equations In today s lecture I shall show how

More information

Boolean Algebra. Boolean Algebra. Boolean Algebra. Boolean Algebra

Boolean Algebra. Boolean Algebra. Boolean Algebra. Boolean Algebra 2 Ver..4 George Boole was an English mathematician of XIX century can operate on logic (or Boolean) variables that can assume just 2 values: /, true/false, on/off, closed/open Usually value is associated

More information

Chapter 3. if 2 a i then location: = i. Page 40

Chapter 3. if 2 a i then location: = i. Page 40 Chapter 3 1. Describe an algorithm that takes a list of n integers a 1,a 2,,a n and finds the number of integers each greater than five in the list. Ans: procedure greaterthanfive(a 1,,a n : integers)

More information

- Easy to insert & delete in O(1) time - Don t need to estimate total memory needed. - Hard to search in less than O(n) time

- Easy to insert & delete in O(1) time - Don t need to estimate total memory needed. - Hard to search in less than O(n) time Skip Lists CMSC 420 Linked Lists Benefits & Drawbacks Benefits: - Easy to insert & delete in O(1) time - Don t need to estimate total memory needed Drawbacks: - Hard to search in less than O(n) time (binary

More information

Beginner s Matlab Tutorial

Beginner s Matlab Tutorial Christopher Lum lum@u.washington.edu Introduction Beginner s Matlab Tutorial This document is designed to act as a tutorial for an individual who has had no prior experience with Matlab. For any questions

More information

Au = = = 3u. Aw = = = 2w. so the action of A on u and w is very easy to picture: it simply amounts to a stretching by 3 and 2, respectively.

Au = = = 3u. Aw = = = 2w. so the action of A on u and w is very easy to picture: it simply amounts to a stretching by 3 and 2, respectively. Chapter 7 Eigenvalues and Eigenvectors In this last chapter of our exploration of Linear Algebra we will revisit eigenvalues and eigenvectors of matrices, concepts that were already introduced in Geometry

More information

Notes on Determinant

Notes on Determinant ENGG2012B Advanced Engineering Mathematics Notes on Determinant Lecturer: Kenneth Shum Lecture 9-18/02/2013 The determinant of a system of linear equations determines whether the solution is unique, without

More information

Recursive Algorithms. Recursion. Motivating Example Factorial Recall the factorial function. { 1 if n = 1 n! = n (n 1)! if n > 1

Recursive Algorithms. Recursion. Motivating Example Factorial Recall the factorial function. { 1 if n = 1 n! = n (n 1)! if n > 1 Recursion Slides by Christopher M Bourke Instructor: Berthe Y Choueiry Fall 007 Computer Science & Engineering 35 Introduction to Discrete Mathematics Sections 71-7 of Rosen cse35@cseunledu Recursive Algorithms

More information

Regular Languages and Finite Automata

Regular Languages and Finite Automata Regular Languages and Finite Automata 1 Introduction Hing Leung Department of Computer Science New Mexico State University Sep 16, 2010 In 1943, McCulloch and Pitts [4] published a pioneering work on a

More information

Chapter 6 Work and Energy

Chapter 6 Work and Energy Chapter 6 WORK AND ENERGY PREVIEW Work is the scalar product of the force acting on an object and the displacement through which it acts. When work is done on or by a system, the energy of that system

More information

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.436J/15.085J Fall 2008 Lecture 5 9/17/2008 RANDOM VARIABLES

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.436J/15.085J Fall 2008 Lecture 5 9/17/2008 RANDOM VARIABLES MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.436J/15.085J Fall 2008 Lecture 5 9/17/2008 RANDOM VARIABLES Contents 1. Random variables and measurable functions 2. Cumulative distribution functions 3. Discrete

More information

Solution to Homework 2

Solution to Homework 2 Solution to Homework 2 Olena Bormashenko September 23, 2011 Section 1.4: 1(a)(b)(i)(k), 4, 5, 14; Section 1.5: 1(a)(b)(c)(d)(e)(n), 2(a)(c), 13, 16, 17, 18, 27 Section 1.4 1. Compute the following, if

More information

DATA ANALYSIS II. Matrix Algorithms

DATA ANALYSIS II. Matrix Algorithms DATA ANALYSIS II Matrix Algorithms Similarity Matrix Given a dataset D = {x i }, i=1,..,n consisting of n points in R d, let A denote the n n symmetric similarity matrix between the points, given as where

More information

Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization. Learning Goals. GENOME 560, Spring 2012

Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization. Learning Goals. GENOME 560, Spring 2012 Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization GENOME 560, Spring 2012 Data are interesting because they help us understand the world Genomics: Massive Amounts

More information

Math 223 Abstract Algebra Lecture Notes

Math 223 Abstract Algebra Lecture Notes Math 223 Abstract Algebra Lecture Notes Steven Tschantz Spring 2001 (Apr. 23 version) Preamble These notes are intended to supplement the lectures and make up for the lack of a textbook for the course

More information

Linear Algebra I. Ronald van Luijk, 2012

Linear Algebra I. Ronald van Luijk, 2012 Linear Algebra I Ronald van Luijk, 2012 With many parts from Linear Algebra I by Michael Stoll, 2007 Contents 1. Vector spaces 3 1.1. Examples 3 1.2. Fields 4 1.3. The field of complex numbers. 6 1.4.

More information

1 Review of Newton Polynomials

1 Review of Newton Polynomials cs: introduction to numerical analysis 0/0/0 Lecture 8: Polynomial Interpolation: Using Newton Polynomials and Error Analysis Instructor: Professor Amos Ron Scribes: Giordano Fusco, Mark Cowlishaw, Nathanael

More information

Solving Systems of Linear Equations Using Matrices

Solving Systems of Linear Equations Using Matrices Solving Systems of Linear Equations Using Matrices What is a Matrix? A matrix is a compact grid or array of numbers. It can be created from a system of equations and used to solve the system of equations.

More information

Chapter 3. Cartesian Products and Relations. 3.1 Cartesian Products

Chapter 3. Cartesian Products and Relations. 3.1 Cartesian Products Chapter 3 Cartesian Products and Relations The material in this chapter is the first real encounter with abstraction. Relations are very general thing they are a special type of subset. After introducing

More information