On the Robustness of Most Probable Explanations




Hei Chan
School of Electrical Engineering and Computer Science
Oregon State University
Corvallis, OR 97330
chanhe@eecs.oregonstate.edu

Adnan Darwiche
Computer Science Department
University of California, Los Angeles
Los Angeles, CA 90095
darwiche@cs.ucla.edu

(This work was completed while Hei Chan was at UCLA.)

Abstract

In Bayesian networks, a Most Probable Explanation (MPE) is a complete variable instantiation with the highest probability given the current evidence. In this paper, we discuss the problem of finding robustness conditions of the MPE under single parameter changes. Specifically, we ask the question: How much change in a single network parameter can we afford to apply while keeping the MPE unchanged? We will describe a procedure, which is the first of its kind, that computes this answer for all parameters in the Bayesian network in time $O(n \exp(w))$, where $n$ is the number of network variables and $w$ is its treewidth.

1 Introduction

A Most Probable Explanation (MPE) in a Bayesian network is a complete variable instantiation which has the highest probability given current evidence [1]. Given an MPE solution for some piece of evidence, we concern ourselves in this paper with the following question: What is the amount of change one can apply to some network parameter without changing this current MPE solution? Our goal is then to deduce robustness conditions for MPE under single parameter changes.

This problem falls into the realm of sensitivity analysis. Here, we treat the Bayesian network as a system which accepts network parameters as inputs, and produces the MPE as an output. Our goal is then to characterize conditions under which the output is guaranteed to be the same (or different) given a change in some input value. This question is very useful in a number of application areas, including what-if analysis, in addition to the design and debugging of Bayesian networks.

Figure 1: An example Bayesian network where we are interested in the MPE and its robustness.
For an example, consider Figure 1, which depicts a Bayesian network for diagnosing potential problems in a car. Suppose now that we have the following evidence: the dashboard test and the lights test came out positive, while the engine test came out negative. When we compute the MPE in this case, we get a scenario in which all car components are working normally. This seems to be counterintuitive, as we expect the most likely scenario to indicate at least that the engine is not working. The methods developed in this paper can be used to debug this scenario. In particular, we will be able to identify the amount of change in each network parameter which is necessary to produce a different MPE solution. We will revisit this example later in the paper and discuss the specific recommendations computed by our proposed algorithm.

Previous results on sensitivity analysis have focused mostly on the robustness of probability values, such as the probability of evidence, under single or multiple parameter changes [2, 3, 4, 5, 6, 7, 8, 9]. Because probability values are continuous, while MPE solutions are discrete instantiations, abrupt changes in MPE solutions may occur when we change a parameter value. This makes the sensitivity analysis of MPE quite different from previous work on the subject.

This paper is structured as follows. We first provide the formal definition of Bayesian networks and MPE in Section 2. Then in Section 3, we explore the relationship between the MPE and a single network parameter, and also look into the case where we change co-varying parameters in Section 4. We deduce that the relationship can be captured by two constants that are independent of the given parameter. Next in Section 5, we show how we can compute these constants for all network parameters, allowing us to automatically identify robustness conditions for MPE, and provide a complexity analysis of our proposed approach. Finally, we show some concrete examples in Section 6, and then extend our analysis to evidence change in Section 7.

2 Most Probable Explanations

We will formally define most probable explanations in this section, but we specify some of our notational conventions first. We will denote variables by uppercase letters ($X$) and their values by lowercase letters ($x$). Sets of variables will be denoted by bold-face uppercase letters ($\mathbf{X}$) and their instantiations by bold-face lowercase letters ($\mathbf{x}$). For variable $X$ and value $x$, we will often write $x$ instead of $X = x$, and hence, $\Pr(x)$ instead of $\Pr(X = x)$. For a binary variable $X$ with values true and false, we will use $x$ to denote $X = \text{true}$ and $\bar{x}$ to denote $X = \text{false}$. Therefore, $\Pr(X = \text{true})$ and $\Pr(x)$ represent the same probability in this case. Similarly, $\Pr(X = \text{false})$ and $\Pr(\bar{x})$ represent the same probability. Finally, for instantiation $\mathbf{x}$ of variables $\mathbf{X}$, we will write $\neg\mathbf{x}$ to mean the set of all instantiations $\mathbf{x}' \neq \mathbf{x}$ of variables $\mathbf{X}$. For example, we will write $\Pr(\mathbf{x}) + \Pr(\neg\mathbf{x}) = 1$.

A Bayesian network is specified by its structure, a directed acyclic graph (DAG), and a set of conditional probability tables (CPTs), with one CPT for each network variable [1]. In the CPT for variable $X$ with parents $\mathbf{U}$, we define a network parameter $\theta_{x|\mathbf{u}}$ for every family instantiation $x\mathbf{u}$ such that $\theta_{x|\mathbf{u}} = \Pr(x \mid \mathbf{u})$.
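Given the network parameters, we can compute the probability of a complete variable instantiation $\mathbf{x}$ as follows:

\[ \Pr(\mathbf{x}) = \prod_{x\mathbf{u} \sim \mathbf{x}} \theta_{x|\mathbf{u}}, \tag{1} \]

where $\sim$ is the compatibility relation between instantiations, i.e., $x\mathbf{u} \sim \mathbf{x}$ means that $x\mathbf{u}$ is compatible with $\mathbf{x}$. Now assume that we are given evidence $\mathbf{e}$. A most probable explanation (MPE) given $\mathbf{e}$ is a complete variable instantiation that is consistent with $\mathbf{e}$ and has the highest probability [1]:

\[ \mathrm{MPE}(\mathbf{e}) \stackrel{\mathrm{def}}{=} \arg\max_{\mathbf{x} \sim \mathbf{e}} \Pr(\mathbf{x}) = \arg\max_{\mathbf{x} \sim \mathbf{e}} \prod_{x\mathbf{u} \sim \mathbf{x}} \theta_{x|\mathbf{u}}. \tag{2} \]

We note that the MPE may not be a unique instantiation, as there can be multiple instantiations with the same highest probability. Therefore, we will define $\mathrm{MPE}(\mathbf{e})$ as a set of instantiations instead of just one instantiation. Moreover, we will sometimes use $\mathrm{MPE}(\mathbf{e}, \neg x)$ to denote the MPE instantiations that are consistent with $\mathbf{e}$ but inconsistent with $x$.

In the following discussion, we will find it necessary to distinguish between the MPE identity and the MPE probability. By the MPE identity, we mean the set of instantiations having the highest probability. By the MPE probability, we mean the probability assumed by a most likely instantiation, which is denoted by:

\[ \mathrm{MPE}_p(\mathbf{e}) \stackrel{\mathrm{def}}{=} \max_{\mathbf{x} \sim \mathbf{e}} \Pr(\mathbf{x}). \tag{3} \]

This distinction is important when discussing robustness conditions for MPE, since a change in some network parameter may change the MPE probability, but not the MPE identity.

3 Relation Between MPE and Network Parameters

Assume that we are given evidence $\mathbf{e}$ and are able to find its MPE, $\mathrm{MPE}(\mathbf{e})$. We now address the following question: How much change can we apply to a network parameter $\theta_{x|\mathbf{u}}$ without changing the MPE identity of evidence $\mathbf{e}$? To simplify the discussion, we will first assume that we can change this parameter without changing any co-varying parameters, such as $\theta_{\bar{x}|\mathbf{u}}$, but we will relax this assumption later.

Our solution to this problem is based on some basic observations which we discuss next. In particular, we observe that complete variable instantiations $\mathbf{x}$ which are consistent with $\mathbf{e}$ can be divided into two categories:

- Those that are consistent with $x\mathbf{u}$. From Equation 1, the probability of each such instantiation $\mathbf{x}$ is a linear function of the parameter $\theta_{x|\mathbf{u}}$.
- Those that are inconsistent with $x\mathbf{u}$. From Equation 1, the probability of each such instantiation $\mathbf{x}$ is a constant which is independent of the parameter $\theta_{x|\mathbf{u}}$.

Let us denote the first set of instantiations by $\Sigma_{\mathbf{e}, x\mathbf{u}}$ and the second set by $\Sigma_{\mathbf{e}, \neg(x\mathbf{u})}$. We can then conclude that:

- The set of most likely instantiations in $\Sigma_{\mathbf{e}, x\mathbf{u}}$ remains unchanged regardless of the value of parameter $\theta_{x|\mathbf{u}}$, even though the probability of such instantiations may change according to the value of $\theta_{x|\mathbf{u}}$. This is because the probability of each instantiation $\mathbf{x} \in \Sigma_{\mathbf{e}, x\mathbf{u}}$ is a linear function of the value of $\theta_{x|\mathbf{u}}$: $\Pr(\mathbf{x}) = r \cdot \theta_{x|\mathbf{u}}$, where $r$ is a coefficient independent of the value of $\theta_{x|\mathbf{u}}$. Therefore, the relative probabilities among instantiations in $\Sigma_{\mathbf{e}, x\mathbf{u}}$ remain unchanged as we change the value of $\theta_{x|\mathbf{u}}$. Note also that the most likely instantiations in this set are just $\mathrm{MPE}(\mathbf{e}, x\mathbf{u})$ and their probability is $\mathrm{MPE}_p(\mathbf{e}, x\mathbf{u})$. Therefore, if we define:

\[ r(\mathbf{e}, x\mathbf{u}) \stackrel{\mathrm{def}}{=} \frac{\partial\, \mathrm{MPE}_p(\mathbf{e}, x\mathbf{u})}{\partial\, \theta_{x|\mathbf{u}}}, \tag{4} \]

  we will then have $\Pr(\mathbf{x}) = r(\mathbf{e}, x\mathbf{u}) \cdot \theta_{x|\mathbf{u}}$ for any $\mathbf{x} \in \mathrm{MPE}(\mathbf{e}, x\mathbf{u})$.

- Both the identity and probability of the most likely instantiations in $\Sigma_{\mathbf{e}, \neg(x\mathbf{u})}$ are independent of the value of parameter $\theta_{x|\mathbf{u}}$. This is because the probability of each instantiation $\mathbf{x} \in \Sigma_{\mathbf{e}, \neg(x\mathbf{u})}$ is independent of the value of $\theta_{x|\mathbf{u}}$. Note that the most likely instantiations in this set are just $\mathrm{MPE}(\mathbf{e}, \neg(x\mathbf{u}))$. We will define the probability of such an instantiation as:

\[ k(\mathbf{e}, x\mathbf{u}) \stackrel{\mathrm{def}}{=} \mathrm{MPE}_p(\mathbf{e}, \neg(x\mathbf{u})). \tag{5} \]

Given the above observations, $\mathrm{MPE}(\mathbf{e})$ will either be $\mathrm{MPE}(\mathbf{e}, x\mathbf{u})$, $\mathrm{MPE}(\mathbf{e}, \neg(x\mathbf{u}))$, or their union, depending on the value of parameter $\theta_{x|\mathbf{u}}$:

\[ \mathrm{MPE}(\mathbf{e}) = \begin{cases} \mathrm{MPE}(\mathbf{e}, x\mathbf{u}), & \text{if } r(\mathbf{e}, x\mathbf{u}) \cdot \theta_{x|\mathbf{u}} > k(\mathbf{e}, x\mathbf{u}); \\ \mathrm{MPE}(\mathbf{e}, \neg(x\mathbf{u})), & \text{if } r(\mathbf{e}, x\mathbf{u}) \cdot \theta_{x|\mathbf{u}} < k(\mathbf{e}, x\mathbf{u}); \\ \mathrm{MPE}(\mathbf{e}, x\mathbf{u}) \cup \mathrm{MPE}(\mathbf{e}, \neg(x\mathbf{u})), & \text{otherwise.} \end{cases} \]

Moreover, the MPE probability can always be expressed as:

\[ \mathrm{MPE}_p(\mathbf{e}) = \max\big(r(\mathbf{e}, x\mathbf{u}) \cdot \theta_{x|\mathbf{u}},\; k(\mathbf{e}, x\mathbf{u})\big). \]

Figure 2 plots the relation between the MPE probability $\mathrm{MPE}_p(\mathbf{e})$ and the value of parameter $\theta_{x|\mathbf{u}}$. According to the figure, if $\theta_{x|\mathbf{u}} > k(\mathbf{e}, x\mathbf{u}) / r(\mathbf{e}, x\mathbf{u})$, i.e., region A of the plot, then we have $\mathrm{MPE}(\mathbf{e}) = \mathrm{MPE}(\mathbf{e}, x\mathbf{u})$, and thus the MPE solutions are consistent with $x\mathbf{u}$. Moreover, the MPE identity will remain unchanged as long as the value of $\theta_{x|\mathbf{u}}$ remains greater than $k(\mathbf{e}, x\mathbf{u}) / r(\mathbf{e}, x\mathbf{u})$.

Figure 2: A plot of the relation between the MPE probability $\mathrm{MPE}_p(\mathbf{e})$ (vertical axis) and the value of parameter $\theta_{x|\mathbf{u}}$ (horizontal axis). Region B lies to the left of $k(\mathbf{e}, x\mathbf{u}) / r(\mathbf{e}, x\mathbf{u})$ and region A to its right.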
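On the other hand, if $\theta_{x|\mathbf{u}} < k(\mathbf{e}, x\mathbf{u}) / r(\mathbf{e}, x\mathbf{u})$, i.e., region B of the plot, then we have $\mathrm{MPE}(\mathbf{e}) = \mathrm{MPE}(\mathbf{e}, \neg(x\mathbf{u}))$, and thus the MPE solutions are inconsistent with $x\mathbf{u}$. Moreover, the MPE identity and probability will remain unchanged as long as the value of $\theta_{x|\mathbf{u}}$ remains less than $k(\mathbf{e}, x\mathbf{u}) / r(\mathbf{e}, x\mathbf{u})$.

Therefore, $\theta_{x|\mathbf{u}} = k(\mathbf{e}, x\mathbf{u}) / r(\mathbf{e}, x\mathbf{u})$ is the point where there is a change in the MPE identity if we were to change the value of parameter $\theta_{x|\mathbf{u}}$. At this point, $\mathrm{MPE}(\mathbf{e}) = \mathrm{MPE}(\mathbf{e}, x\mathbf{u}) \cup \mathrm{MPE}(\mathbf{e}, \neg(x\mathbf{u}))$ and we have both MPE solutions consistent with $x\mathbf{u}$ and MPE solutions inconsistent with $x\mathbf{u}$. There are no other points where there is a change in the MPE identity. If we are able to find the constants $r(\mathbf{e}, x\mathbf{u})$ and $k(\mathbf{e}, x\mathbf{u})$ for the network parameter $\theta_{x|\mathbf{u}}$, we can then compute robustness conditions for MPE with respect to changes in this parameter.

4 Dealing with Co-Varying Parameters

The above analysis assumed that we can change a parameter $\theta_{x|\mathbf{u}}$ without needing to change any other parameters in the network. This is not realistic, though, in the context of Bayesian networks, where co-varying parameters need to add up to 1 for the network to induce a valid probability distribution. For example, if variable $X$ has two values, $x$ and $\bar{x}$, we must always have:

\[ \theta_{x|\mathbf{u}} + \theta_{\bar{x}|\mathbf{u}} = 1. \]

We will therefore extend the analysis conducted in the previous section to account for simultaneous changes in the co-varying parameters. We will restrict our attention to binary variables to simplify the discussion, but our results can be easily extended to multi-valued variables, as we will show later.

In particular, assuming that we are changing parameters $\theta_{x|\mathbf{u}}$ and $\theta_{\bar{x}|\mathbf{u}}$ simultaneously for a binary variable $X$, we can now categorize all network instantiations which are consistent with evidence $\mathbf{e}$ into three groups, depending on whether they are consistent with $x\mathbf{u}$, $\bar{x}\mathbf{u}$, or $\neg\mathbf{u}$. Moreover, the most likely instantiations in each group are just $\mathrm{MPE}(\mathbf{e}, x\mathbf{u})$, $\mathrm{MPE}(\mathbf{e}, \bar{x}\mathbf{u})$, and $\mathrm{MPE}(\mathbf{e}, \neg\mathbf{u})$ respectively. Therefore, if $\mathbf{x} \in \mathrm{MPE}(\mathbf{e})$, then:

\[ \Pr(\mathbf{x}) = \begin{cases} r(\mathbf{e}, x\mathbf{u}) \cdot \theta_{x|\mathbf{u}}, & \text{if } \mathbf{x} \in \mathrm{MPE}(\mathbf{e}, x\mathbf{u}); \\ r(\mathbf{e}, \bar{x}\mathbf{u}) \cdot \theta_{\bar{x}|\mathbf{u}}, & \text{if } \mathbf{x} \in \mathrm{MPE}(\mathbf{e}, \bar{x}\mathbf{u}); \\ k(\mathbf{e}, \mathbf{u}), & \text{if } \mathbf{x} \in \mathrm{MPE}(\mathbf{e}, \neg\mathbf{u}); \end{cases} \]

where:

\[ r(\mathbf{e}, x\mathbf{u}) = \frac{\partial\, \mathrm{MPE}_p(\mathbf{e}, x\mathbf{u})}{\partial\, \theta_{x|\mathbf{u}}}; \qquad r(\mathbf{e}, \bar{x}\mathbf{u}) = \frac{\partial\, \mathrm{MPE}_p(\mathbf{e}, \bar{x}\mathbf{u})}{\partial\, \theta_{\bar{x}|\mathbf{u}}}; \qquad k(\mathbf{e}, \mathbf{u}) = \mathrm{MPE}_p(\mathbf{e}, \neg\mathbf{u}); \]

and the MPE probability is:

\[ \mathrm{MPE}_p(\mathbf{e}) = \max\big(r(\mathbf{e}, x\mathbf{u}) \cdot \theta_{x|\mathbf{u}},\; r(\mathbf{e}, \bar{x}\mathbf{u}) \cdot \theta_{\bar{x}|\mathbf{u}},\; k(\mathbf{e}, \mathbf{u})\big). \]

Therefore, changing the co-varying parameters $\theta_{x|\mathbf{u}}$ and $\theta_{\bar{x}|\mathbf{u}}$ will not affect the identity of either $\mathrm{MPE}(\mathbf{e}, x\mathbf{u})$ or $\mathrm{MPE}(\mathbf{e}, \bar{x}\mathbf{u})$, nor will it affect the identity or probability of $\mathrm{MPE}(\mathbf{e}, \neg\mathbf{u})$.

The robustness condition of an MPE solution can now be summarized as follows:

- If an MPE solution is consistent with $x\mathbf{u}$, it remains a solution as long as the following inequalities are true:
  $r(\mathbf{e}, x\mathbf{u}) \cdot \theta_{x|\mathbf{u}} \geq r(\mathbf{e}, \bar{x}\mathbf{u}) \cdot \theta_{\bar{x}|\mathbf{u}}$;
  $r(\mathbf{e}, x\mathbf{u}) \cdot \theta_{x|\mathbf{u}} \geq k(\mathbf{e}, \mathbf{u})$.
- If an MPE solution is consistent with $\bar{x}\mathbf{u}$, it remains a solution as long as the following inequalities are true:
  $r(\mathbf{e}, \bar{x}\mathbf{u}) \cdot \theta_{\bar{x}|\mathbf{u}} \geq r(\mathbf{e}, x\mathbf{u}) \cdot \theta_{x|\mathbf{u}}$;
  $r(\mathbf{e}, \bar{x}\mathbf{u}) \cdot \theta_{\bar{x}|\mathbf{u}} \geq k(\mathbf{e}, \mathbf{u})$.
- If an MPE solution is consistent with $\neg\mathbf{u}$, it remains a solution as long as the following inequalities are true:
  $k(\mathbf{e}, \mathbf{u}) \geq r(\mathbf{e}, x\mathbf{u}) \cdot \theta_{x|\mathbf{u}}$;
  $k(\mathbf{e}, \mathbf{u}) \geq r(\mathbf{e}, \bar{x}\mathbf{u}) \cdot \theta_{\bar{x}|\mathbf{u}}$.

We note here that one can easily deduce whether an MPE solution is consistent with $x\mathbf{u}$, $\bar{x}\mathbf{u}$, or $\neg\mathbf{u}$, since it is a complete variable instantiation. Therefore, all we need are the constants $r(\mathbf{e}, x\mathbf{u})$ and $k(\mathbf{e}, \mathbf{u})$ for each network parameter $\theta_{x|\mathbf{u}}$ in order to define robustness conditions for MPE.

The constants $k(\mathbf{e}, \mathbf{u})$ can be easily computed from the constants $r(\mathbf{e}, x\mathbf{u})$ by observing the following:

\[ k(\mathbf{e}, \mathbf{u}) = \mathrm{MPE}_p(\mathbf{e}, \neg\mathbf{u}) = \max_{\mathbf{u}' : \mathbf{u}' \neq \mathbf{u}} \mathrm{MPE}_p(\mathbf{e}, \mathbf{u}') = \max_{x\mathbf{u}' : \mathbf{u}' \neq \mathbf{u}} \mathrm{MPE}_p(\mathbf{e}, x\mathbf{u}') = \max_{x\mathbf{u}' : \mathbf{u}' \neq \mathbf{u}} r(\mathbf{e}, x\mathbf{u}') \cdot \theta_{x|\mathbf{u}'}. \tag{6} \]

As the algorithm we will describe later computes the $r(\mathbf{e}, x\mathbf{u})$ constants for all family instantiations $x\mathbf{u}$, the algorithm will then allow us to compute all the $k(\mathbf{e}, \mathbf{u})$ constants as well.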
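The robustness inequalities and Equation 6 can be sketched by brute force on a very small network. The following is only an illustration under stated assumptions: the two-variable network A → B, its CPT values, and the helper names `mpe_p`, `r_const`, `k_const` and `stays_mpe` are all hypothetical, the evidence is empty, and the constants are obtained by enumerating instantiations rather than by the circuit-based algorithm of Section 5.

```python
from itertools import product

# Hypothetical binary network A -> B (True stands for a / b, False for the
# barred value). All numbers are illustrative assumptions.
THETA_A = {True: 0.5, False: 0.5}
# THETA_B[(b, a)] = Pr(B = b | A = a)
THETA_B = {(True, True): 0.2, (False, True): 0.8,
           (True, False): 0.6, (False, False): 0.4}


def mpe_p(consistent):
    """Highest joint probability among instantiations accepted by `consistent`."""
    probs = [THETA_A[a] * THETA_B[(b, a)]
             for a, b in product((True, False), repeat=2)
             if consistent(a, b)]
    return max(probs, default=0.0)


def r_const(b, a):
    """r(e, xu) for the parameter theta_{b|a}, with empty evidence.

    Pr(x) is linear in theta_{b|a}, so the derivative equals
    MPE_p(e, xu) / theta_{b|a} whenever the parameter is non-zero.
    """
    return mpe_p(lambda a_, b_: (a_, b_) == (a, b)) / THETA_B[(b, a)]


def k_const(a):
    """k(e, u) via Equation 6: maximise r(e, xu') * theta_{x|u'} over u' != u."""
    return max(r_const(b, a2) * THETA_B[(b, a2)]
               for b in (True, False) for a2 in (True, False) if a2 != a)


def stays_mpe(b, a, new_theta):
    """Check both inequalities for an MPE solution consistent with (B=b, A=a)
    after setting theta_{b|a} = new_theta and its co-varying parameter
    to 1 - new_theta."""
    own = r_const(b, a) * new_theta
    sibling = r_const(not b, a) * (1.0 - new_theta)
    return own >= sibling and own >= k_const(a)
```

Under these assumed values, `stays_mpe(False, True, 0.7)` holds while `stays_mpe(False, True, 0.5)` does not, because in the second case the product of the coefficient and the new parameter value drops below `k_const(True)`.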
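As a simple example, for the Bayesian network whose CPTs are shown in Figure 3, the current MPE solution without any evidence is $A = a$, $B = \bar{b}$, and has probability .4. For the parameters in the CPT of $B$, we can compute the corresponding $r(\mathbf{e}, x\mathbf{u})$ constants. In particular, we have $r(\mathbf{e}, ba) = r(\mathbf{e}, \bar{b}a) = r(\mathbf{e}, b\bar{a}) = r(\mathbf{e}, \bar{b}\bar{a}) = .5$ in this case. The $k(\mathbf{e}, \mathbf{u})$ constants can also be computed as $k(\mathbf{e}, a) = .3$ and $k(\mathbf{e}, \bar{a}) = .4$. Given these constants, we can easily compute the amount of change we can apply to co-varying parameters, say $\theta_{b|a}$ and $\theta_{\bar{b}|a}$, such that the MPE solution remains the same. The conditions we must satisfy are:

\[ r(\mathbf{e}, \bar{b}a) \cdot \theta_{\bar{b}|a} \geq r(\mathbf{e}, ba) \cdot \theta_{b|a}; \qquad r(\mathbf{e}, \bar{b}a) \cdot \theta_{\bar{b}|a} \geq k(\mathbf{e}, a). \]

This leads to $\theta_{\bar{b}|a} \geq \theta_{b|a}$ and $\theta_{\bar{b}|a} \geq .6$. Therefore, the current MPE solution will remain so as long as $\theta_{\bar{b}|a} \geq .6$, which has a current value of .8.

We close this section by pointing out that our robustness equations can be extended to multi-valued variables as follows. If variable $X$ has values $x_1, \ldots, x_j$, with $j > 2$, then each of the conditions we showed earlier will consist of $j$ inequalities instead of just two. For example, if an MPE solution is consistent with $x_1\mathbf{u}$, it remains a solution as long as the following inequalities are true:

\[ r(\mathbf{e}, x_1\mathbf{u}) \cdot \theta_{x_1|\mathbf{u}} \geq r(\mathbf{e}, x\mathbf{u}) \cdot \theta_{x|\mathbf{u}} \ \text{ for all } x \neq x_1; \qquad r(\mathbf{e}, x_1\mathbf{u}) \cdot \theta_{x_1|\mathbf{u}} \geq k(\mathbf{e}, \mathbf{u}). \]

5 Computing Robustness Conditions

In this section, we will develop an algorithm for computing the constants $r(\mathbf{e}, x\mathbf{u})$ for all network parameters $\theta_{x|\mathbf{u}}$. In particular, we will show that they can be computed in time and space which is $O(n \exp(w))$, where $n$ is the number of network variables and $w$ is its treewidth.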

A     Θ_A
a     .5
ā     .5

A   B     Θ_{B|A}
a   b     .2
a   b̄     .8
ā   b     .6
ā   b̄     .4

Figure 3: The CPTs for Bayesian network A → B.

Figure 4: An arithmetic circuit for the above Bayesian network. The bold lines depict a complete sub-circuit, corresponding to the term $\lambda_a \lambda_b \theta_a \theta_{b|a}$.

5.1 Arithmetic Circuits

Our algorithm for computing the $r(\mathbf{e}, x\mathbf{u})$ constants is based on an arithmetic circuit representation of the Bayesian network [10]. Figure 4 depicts an arithmetic circuit for a small network consisting of two binary nodes, $A$ and $B$, shown in Figure 3. An arithmetic circuit is a rooted DAG, where each internal node corresponds to multiplication ($\times$) or addition ($+$), and each leaf node corresponds either to a network parameter $\theta_{x|\mathbf{u}}$ or an evidence indicator $\lambda_x$; see Figure 4. Operationally, the circuit can be used to compute the probability of any evidence $\mathbf{e}$ by evaluating the circuit while setting the evidence indicator $\lambda_x$ to 0 if $x$ contradicts $\mathbf{e}$ and setting it to 1 otherwise. Semantically, though, the arithmetic circuit is simply a factored representation of an exponential-size function that captures the network distribution. For example, the circuit in Figure 4 is simply a factored representation of the following function:

\[ \lambda_a \lambda_b \theta_a \theta_{b|a} + \lambda_a \lambda_{\bar{b}} \theta_a \theta_{\bar{b}|a} + \lambda_{\bar{a}} \lambda_b \theta_{\bar{a}} \theta_{b|\bar{a}} + \lambda_{\bar{a}} \lambda_{\bar{b}} \theta_{\bar{a}} \theta_{\bar{b}|\bar{a}}. \]

This function, called the network polynomial, includes a term for each instantiation $\mathbf{x}$ of network variables, where the term is simply a product of the network parameters and evidence indicators which are consistent with $\mathbf{x}$. Moreover, the term for $\mathbf{x}$ evaluates to the probability value $\Pr(\mathbf{e}, \mathbf{x})$ when the evidence indicators are set according to $\mathbf{e}$. Note that this function is multilinear. Therefore, a corresponding arithmetic circuit will have the property that two sub-circuits that feed into the same multiplication node will never contain a common variable. This property is important for some of the following developments.

5.2 Complete Sub-Circuits and Their Coefficients

Each term in the network polynomial corresponds to a complete sub-circuit in the arithmetic circuit. A complete sub-circuit can be constructed recursively from the root, by including all children of each multiplication node, and exactly one child of each addition node. The bold lines in Figure 4 depict a complete sub-circuit, corresponding to the term $\lambda_a \lambda_b \theta_a \theta_{b|a}$. In fact, it is easy to check that the circuit in Figure 4 has four complete sub-circuits, corresponding to the four terms in the network polynomial.

A key observation about complete sub-circuits is that if a network parameter is included in a complete sub-circuit, there is a unique path from the root to this parameter in this sub-circuit, even though there may be multiple paths from the root to this parameter in the original arithmetic circuit. This path is important, as one can relate the value of the term corresponding to the sub-circuit and the parameter value by simply traversing the path, as we show next.

Consider now a complete sub-circuit which includes a network parameter $\theta_{x|\mathbf{u}}$ and let $\alpha$ be the unique path in this sub-circuit connecting the root to parameter $\theta_{x|\mathbf{u}}$. We will now define the sub-circuit coefficient w.r.t. $\theta_{x|\mathbf{u}}$, denoted as $r$, in terms of the path $\alpha$ such that $r \cdot \theta_{x|\mathbf{u}}$ is just the value of the term corresponding to the sub-circuit. Let $\Sigma$ be the set of all multiplication nodes on this path $\alpha$. The sub-circuit coefficient w.r.t. $\theta_{x|\mathbf{u}}$ is defined as the product of all children of nodes in $\Sigma$ which are themselves not on the path $\alpha$. Consider for example the complete sub-circuit highlighted in Figure 4 and the path from the root to the network parameter $\theta_a$. The coefficient w.r.t. $\theta_a$ is $r = \lambda_a \lambda_b \theta_{b|a}$. Moreover, $r \cdot \theta_a = \lambda_a \lambda_b \theta_a \theta_{b|a}$, which is the term corresponding to the sub-circuit.

5.3 Maximizer Circuits

An arithmetic circuit can be easily modified into a maximizer circuit to compute the MPE solutions, by simply replacing each addition node with a maximization node; see Figure 5. This corresponds to a circuit that computes the value of the maximum term in a network polynomial, instead of adding up the values of these terms. The value of the root will thus be the MPE probability $\mathrm{MPE}_p(\mathbf{e})$. The maximizer circuit in Figure 5 is evaluated under evidence $A = a$, leading to an MPE probability of .4.

Figure 5: A maximizer circuit for a Bayesian network, evaluated under evidence $A = a$. Given this evidence, the evidence indicators are set to $\lambda_a = 1$, $\lambda_{\bar{a}} = 0$, $\lambda_b = 1$, $\lambda_{\bar{b}} = 1$. The bold lines depict the MPE sub-circuit.

To recover an MPE solution from a maximizer circuit, all we need to do is construct the MPE sub-circuit recursively from the root, by including all children of each multiplication node, and one child $c$ for each maximization node $v$, such that $v$ and $c$ have the same value; see Figure 5. The MPE sub-circuit will then correspond to an MPE solution. Moreover, if a parameter $\theta_{x|\mathbf{u}}$ is in the MPE sub-circuit, and the sub-circuit coefficient w.r.t. $\theta_{x|\mathbf{u}}$ is $r$, then we have $r \cdot \theta_{x|\mathbf{u}}$ as the probability of MPE, $\mathrm{MPE}_p(\mathbf{e})$.

Consider Figure 5 and the highlighted MPE sub-circuit, evaluated under evidence $A = a$. The term corresponding to this sub-circuit is $A = a$, $B = \bar{b}$, which is therefore an MPE solution. Moreover, we have two parameters in this sub-circuit, $\theta_a$ and $\theta_{\bar{b}|a}$, with coefficients $.8 = (1)(.8)$ and $.5 = (.5)(1)(1)$ respectively. Therefore, the MPE probability can be obtained by multiplying any of these coefficients with the corresponding parameter value, as $(.8)\theta_a = (.8)(.5) = .4$ and $(.5)\theta_{\bar{b}|a} = (.5)(.8) = .4$.

Algorithm 1 D-MAXC(M: maximizer circuit, e: evidence)
  evaluate the circuit M under evidence e; afterwards the value of each node v is p[v]
  r[v] ← 1 for root v of circuit M
  r[v] ← 0 for all non-root nodes v in circuit M
  for non-leaf nodes v (parents before children) do
    if node v is a maximization node then
      r[c] ← max(r[c], r[v]) for each child c of node v
    if node v is a multiplication node then
      r[c] ← max(r[c], r[v] · ∏_{c'} p[c']) for each child c of node v, where c' are the other children of node v
5.4 Computing r(e, xu)

Suppose now that our goal is to compute $\mathrm{MPE}(\mathbf{e}, x\mathbf{u})$ for some network parameter $\theta_{x|\mathbf{u}}$. Suppose further that $\alpha_1, \ldots, \alpha_m$ are all the complete sub-circuits that include $\theta_{x|\mathbf{u}}$. Moreover, let $\mathbf{x}_1, \ldots, \mathbf{x}_m$ be the instantiations corresponding to these sub-circuits and let $r_1, \ldots, r_m$ be their corresponding coefficients w.r.t. $\theta_{x|\mathbf{u}}$. It then follows that the probabilities of these instantiations are $r_1 \cdot \theta_{x|\mathbf{u}}, \ldots, r_m \cdot \theta_{x|\mathbf{u}}$ respectively. Moreover, it follows that:

\[ \mathrm{MPE}_p(\mathbf{e}, x\mathbf{u}) = \max_i \; r_i \cdot \theta_{x|\mathbf{u}}, \]

and hence, from Equation 4:

\[ \frac{\partial\, \mathrm{MPE}_p(\mathbf{e}, x\mathbf{u})}{\partial\, \theta_{x|\mathbf{u}}} = r(\mathbf{e}, x\mathbf{u}) = \max_i \; r_i. \]

Therefore, if we can compute the maximum of these coefficients, then we have computed the constant $r(\mathbf{e}, x\mathbf{u})$.

Algorithm 1 provides a procedure which evaluates the maximizer circuit and then traverses it top-down, parents before children, computing simultaneously the constants $r(\mathbf{e}, x\mathbf{u})$ for all network parameters. The procedure maintains an additional register value $r[\cdot]$ for each node in the circuit, and updates these registers as it visits nodes. When the procedure terminates, it is guaranteed that the register value $r[\theta_{x|\mathbf{u}}]$ will be the constant $r(\mathbf{e}, x\mathbf{u})$. We will also see later that the register value $r[\lambda_x]$ is also a constant which provides valuable information for the MPE solutions. Figure 6 depicts an example of this procedure.

Figure 6: A maximizer circuit for a Bayesian network, evaluated under evidence $A = a$. Next to each node is the value $r[\cdot]$ computed by Algorithm 1.

Algorithm 1 can be modelled as the all-pairs shortest path procedure, with edge $v \rightarrow c$ having weight $0 = -\ln 1$ if $v$ is a maximization node, and weight $-\ln \pi$ if $v$ is a multiplication node, where $\pi$ is the product of the values of the other children $c' \neq c$ of node $v$. The length of the shortest path from the root to the network parameter $\theta_{x|\mathbf{u}}$ is then $-\ln r(\mathbf{e}, x\mathbf{u})$.

It should be clear that the time and space complexity of the above algorithm is linear in the number of circuit nodes.¹ It is well known that we can compile a circuit for any Bayesian network in $O(n \exp(w))$ time and space, where $n$ is the number of network variables and $w$ is its treewidth [10]. Therefore, all constants $r(\mathbf{e}, x\mathbf{u})$ can be computed with the same complexity.

¹ More precisely, this algorithm is linear in the number of circuit nodes only if the number of children per multiplication node is bounded. If not, one can use a technique which gives a linear complexity by simply storing two additional bits with each multiplication node [10].

We close this section by pointing out that one can in principle use the jointree algorithm to compute $\mathrm{MPE}_p(\mathbf{e}, x\mathbf{u}) = r(\mathbf{e}, x\mathbf{u}) \cdot \theta_{x|\mathbf{u}}$ for all family instantiations $x\mathbf{u}$ with the above complexity. In particular, by replacing summation with maximization in the jointree algorithm, one obtains $\mathrm{MPE}_p(\mathbf{e}, \mathbf{c})$ for each cluster instantiation $\mathbf{c}$. Projecting on the families $X\mathbf{U}$ in cluster $\mathbf{C}$, one can then obtain $\mathrm{MPE}_p(\mathbf{e}, x\mathbf{u})$ for all family instantiations $x\mathbf{u}$, which is all we need to compute robustness conditions for MPE.² Our method above, however, is more general for two reasons:

- The arithmetic circuit for a Bayesian network can be much smaller than the corresponding jointree by exploiting the local structures of the Bayesian network [12, 13].
- The constants computed by the algorithm for the evidence indicators can be used to answer additional MPE queries, which result from variations on the current evidence. This will be discussed in Section 7.

² However, in case some of the parameters are equal to 0, one needs to use a special jointree [11].

6 Example

We now go back to the example network in Figure 1, and compute robustness conditions for the current MPE solution using the inequalities we obtained in Section 4, and an implementation of Algorithm 1. After going through the CPT of each variable, our procedure found nine possible parameter changes that would produce a different MPE solution, as shown in Figure 7.

Figure 7: A list of parameter changes that would produce a different MPE solution.
From these nine suggested changes, only three changes make sense from a qualitative point of view:

- Decreasing the probability that the ignition is working from .9925 to at most .9133. (6th row)
- Decreasing the probability that the engine is working given both the battery and the ignition are working from .97 to at most .9108. (1st row)
- Decreasing the false-negative rate of the engine test from .09 to at most .0285. (9th row)

If we apply the first parameter change, we get a new MPE solution in which both the ignition and the engine are not working. If we apply either the second or third parameter change, we get a new MPE solution in which the engine is not working.

7 MPE under Evidence Change

We have discussed in Section 5.2 the notion of a complete sub-circuit and its coefficient with respect to a network parameter $\theta_{x|\mathbf{u}}$ which is included in the sub-circuit. In particular, we have shown how each sub-circuit corresponds to a term in the network polynomial, and that if a complete sub-circuit has coefficient $r$ with respect to parameter $\theta_{x|\mathbf{u}}$, then $r \cdot \theta_{x|\mathbf{u}}$ will be the value of the term corresponding to this sub-circuit.

The notion of a sub-circuit coefficient can be extended to evidence indicators as well. In particular, if a complete sub-circuit has coefficient $r$ with respect to an evidence indicator $\lambda_x$ which is included in the sub-circuit, then $r \cdot \lambda_x$ will be the value of the term corresponding to this sub-circuit.

Suppose now that $\alpha_1, \ldots, \alpha_m$ are all the complete sub-circuits that include $\lambda_x$. Moreover, let $\mathbf{x}_1, \ldots, \mathbf{x}_m$ be the terms corresponding to these sub-circuits and let $r_1, \ldots, r_m$ be their corresponding coefficients with respect to $\lambda_x$. It then follows that the values of these terms are $r_1 \cdot \lambda_x, \ldots, r_m \cdot \lambda_x$ respectively. Moreover, it follows that:

\[ \mathrm{MPE}_p(\mathbf{e} - X, x) = \max_i \; r_i, \]

where $\mathbf{e} - X$ denotes the new evidence after having retracted the value of variable $X$ from $\mathbf{e}$ (if $X \in \mathbf{E}$, otherwise $\mathbf{e} - X = \mathbf{e}$). Therefore, if we can compute the maximum of these coefficients, then we have computed $\mathrm{MPE}_p(\mathbf{e} - X, x)$.

Note, however, that Algorithm 1 already computes the maximum of these coefficients for each $\lambda_x$, as the evidence indicators are nodes in the maximizer circuit as well, and therefore the register value $r[\lambda_x]$ gives us $\mathrm{MPE}_p(\mathbf{e} - X, x)$ for every variable $X$ and value $x$.

Consider for example the circuit in Figure 6, and the coefficients computed by Algorithm 1 for the four evidence indicators. According to the above analysis, these coefficients have the following meanings:

λ_x     e − X, x     r[λ_x] = MPE_p(e − X, x)
λ_a     a            .4
λ_ā     ā            .3
λ_b     a, b         .1
λ_b̄     a, b̄         .4

For example, the second row above tells us that the MPE probability would be .3 if the evidence was $A = \bar{a}$ instead of $A = a$.

In general, if we have $n$ variables, we then have $O(n)$ variations on the current evidence of the form $\mathbf{e} - X, x$. The MPE probabilities of all of these variations are immediately available from the coefficients with respect to the evidence indicators.

The computation of these coefficients allows us to deduce the MPE identity after evidence retraction. In particular, suppose that variable $X$ is set as $x$ in evidence $\mathbf{e}$, and $\mathrm{MPE}_p(\mathbf{e}) \geq \mathrm{MPE}_p(\mathbf{e} - X, x')$ for all other values $x' \neq x$. We can then conclude that $\mathrm{MPE}_p(\mathbf{e}) = \mathrm{MPE}_p(\mathbf{e} - X)$. Moreover, $\mathrm{MPE}(\mathbf{e}) = \mathrm{MPE}(\mathbf{e} - X)$ if $\mathrm{MPE}_p(\mathbf{e}) > \mathrm{MPE}_p(\mathbf{e} - X, x')$ for all other values $x' \neq x$, or $\mathrm{MPE}(\mathbf{e}) \subset \mathrm{MPE}(\mathbf{e} - X)$ if there exists some $x' \neq x$ such that $\mathrm{MPE}_p(\mathbf{e}) = \mathrm{MPE}_p(\mathbf{e} - X, x')$.
Therefore, the current MPE solutions will remain so even after we retract $X = x$ from the evidence. This means that $X = x$ is not integral to the determination of the current MPE solutions given the other evidence, i.e., $e - X$.

The result above also has implications for the identification of multiple MPE solutions given evidence $e$. In particular, suppose that variable $X$ is not set in evidence $e$. Then:

- If the coefficients for the evidence indicators $\lambda_x$ and $\lambda_{\bar{x}}$ are equal, we must have both MPE solutions with $X = x$ and MPE solutions with $X = \bar{x}$. In fact, the coefficients must both equal the MPE probability $MPE_p(e)$ in this case.
- If the coefficient for the evidence indicator $\lambda_x$ is larger than the coefficient for the evidence indicator $\lambda_{\bar{x}}$, then every MPE solution must have $X = x$.

In the example above, we have $r[\lambda_{\bar{b}}] > r[\lambda_b]$, suggesting that every MPE solution must have $\bar{b}$ in this case.

8 Conclusion

We considered in this paper the problem of finding robustness conditions for MPE solutions of a Bayesian network under single parameter changes. We were able to solve this problem by identifying some interesting relationships between an MPE solution and the network parameters. In particular, we found that the robustness condition of an MPE solution under a single parameter change depends on two constants that are independent of the parameter value. We also proposed a method for computing such constants and, therefore, the robustness conditions of MPE in $O(n \exp(w))$ time and space, where $n$ is the number of network variables and $w$ is the network treewidth. Our algorithm is the first of its kind for ensuring the robustness of MPE solutions under parameter changes in a Bayesian network.

Acknowledgments

This work has been partially supported by Air Force grant #FA9550-05-1-0075-P00002 and JPL/NASA grant #1272258. We would also like to thank James Park for reviewing this paper and making the observation on how to compute k(e, u) in Equation 6.

References

[1] Judea Pearl. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference.
Morgan Kaufmann Publishers, San Francisco, California, 1988.
[2] Hei Chan and Adnan Darwiche. When do numbers really matter? Journal of Artificial Intelligence Research, 17:265-287, 2002.
[3] Hei Chan and Adnan Darwiche. Sensitivity analysis in Bayesian networks: From single to multiple parameters. In Proceedings of the Twentieth Conference on Uncertainty in Artificial Intelligence (UAI), pages 67-75, Arlington, Virginia, 2004. AUAI Press.
[4] Enrique Castillo, José Manuel Gutiérrez, and Ali S. Hadi. Sensitivity analysis in discrete Bayesian networks. IEEE Transactions on Systems, Man, and Cybernetics, Part A (Systems and Humans), 27:412-423, 1997.
[5] Veerle M. H. Coupé, Niels Peek, Jaap Ottenkamp, and J. Dik F. Habbema. Using sensitivity analysis for efficient quantification of a belief network. Artificial Intelligence in Medicine, 17:223-247, 1999.
[6] Uffe Kjærulff and Linda C. van der Gaag. Making sensitivity analysis computationally efficient. In Proceedings of the Sixteenth Conference on Uncertainty in Artificial Intelligence (UAI), pages 317-325, San Francisco, California, 2000. Morgan Kaufmann Publishers.
[7] Kathryn B. Laskey. Sensitivity analysis for probability assessments in Bayesian networks. IEEE Transactions on Systems, Man, and Cybernetics, 25:901-909, 1995.
[8] Malcolm Pradhan, Max Henrion, Gregory Provan, Brendan Del Favero, and Kurt Huang. The sensitivity of belief networks to imprecise probabilities: An experimental investigation. Artificial Intelligence, 85:363-397, 1996.
[9] Linda C. van der Gaag and Silja Renooij. Analysing sensitivity data from probabilistic networks. In Proceedings of the Seventeenth Conference on Uncertainty in Artificial Intelligence (UAI), pages 530-537, San Francisco, California, 2001. Morgan Kaufmann Publishers.
[10] Adnan Darwiche. A differential approach to inference in Bayesian networks. Journal of the ACM, 50:280-305, 2003.
[11] James D. Park and Adnan Darwiche. A differential semantics for jointree algorithms. Artificial Intelligence, 156:197-216, 2004.
[12] Mark Chavira and Adnan Darwiche. Compiling Bayesian networks with local structure. In Proceedings of the Nineteenth International Joint Conference on Artificial Intelligence (IJCAI), pages 1306-1312, Denver, Colorado, 2005. Professional Book Center.
[13] Mark Chavira, Adnan Darwiche, and Manfred Jaeger. Compiling relational Bayesian networks for exact inference. International Journal of Approximate Reasoning, 42:4-20, 2006.