Lecture 21: Uncertainty 6
Victor R. Lesser, CMPSCI 683, Fall 2010

Today's Lecture: Decision Trees and Networks

Decision Trees
A decision tree is an explicit representation of all the possible scenarios from a given state. Each path corresponds to decisions made by the agent, actions taken, possible observations, state changes, and a final outcome node. It is similar to a game played against nature, with decision nodes (the agent's moves) and chance nodes (nature's moves).

Example 1: Software Development
The decision is whether to make, reuse, or buy a software component; each choice leads to chance outcomes with the following development costs:
- make: simple (P=0.3) costs $380K; difficult (P=0.7) costs $450K
- reuse: minor changes (P=0.4) cost $275K; major changes (P=0.6) lead to a further chance node: simple (P=0.2) costs $310K, complex (P=0.8) costs $490K
- buy: minor changes (P=0.7) cost $210K; major changes (P=0.3) cost $400K

Expected costs:
EU(make) = 0.3 x $380K + 0.7 x $450K = $429K
EU(reuse) = 0.4 x $275K + 0.6 x [0.2 x $310K + 0.8 x $490K] = $382.4K
EU(buy) = 0.7 x $210K + 0.3 x $400K = $267K <-- best choice (lowest expected cost)
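The expected-cost calculation above can be reproduced by a small recursive back-up over the tree. This is a minimal sketch; the node encoding (terminal cost vs. list of probability/subtree pairs) is my own, while the numbers come from the slide:

```python
# Evaluate Example 1's decision tree: chance nodes average their
# children's costs; the top-level decision picks the lowest expected cost.

def expected_cost(node):
    """A node is either a terminal cost (a number) or a list of
    (probability, subtree) pairs forming a chance node."""
    if isinstance(node, (int, float)):
        return node
    return sum(p * expected_cost(child) for p, child in node)

# The three options from the slide, with costs in $K.
options = {
    "make":  [(0.3, 380), (0.7, 450)],
    "reuse": [(0.4, 275), (0.6, [(0.2, 310), (0.8, 490)])],
    "buy":   [(0.7, 210), (0.3, 400)],
}

costs = {name: expected_cost(tree) for name, tree in options.items()}
best = min(costs, key=costs.get)   # lowest expected cost wins
print({k: round(v, 1) for k, v in costs.items()})
print(best)  # buy
```

Nesting the "major changes" chance node inside reuse reproduces the bracketed term in EU(reuse) above.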
Example 2: Buying a Car
There are two candidate cars, C1 and C2; each can be of good quality (+) or bad quality (-). There are two possible tests: T1 on C1 (costs $50) and T2 on C2 (costs $20).
- C1 costs $1500 ($500 below market value), but if it is of bad quality the repair cost is $700: a $500 gain or a $200 loss.
- C2 costs $1150 ($250 below market value), but if it is of bad quality the repair cost is $150: a $250 gain or a $100 gain.
The buyer must buy one of the cars and can perform at most one test. What other information is needed?

Example 2: Buying a Car (cont.)
The chances that the cars are of good quality are 0.70 for C1 and 0.80 for C2.
- Test T1 on C1 will confirm good quality with probability 0.80 if C1 is good, and will confirm bad quality with probability 0.65 if C1 is bad (imperfect information).
- Test T2 on C2 will confirm good quality with probability 0.75 and will confirm bad quality with probability 0.70.
What are the decisions, and how can you judge their outcomes?

Evaluating Decision Trees
The tree for Example 2 starts with a decision node (T0 = no test, T1 on C1, or T2 on C2), followed by a chance node for the test outcome (pass/fail), a decision node for which car to buy, and a chance node for that car's quality (+/-), ending in 20 leaf outcomes.
1. Traverse the tree in a depth-first manner:
   (a) Assign a value to each leaf node based on the outcome, then back up outcome values.
   (b) Calculate the average utility at each chance node based on the likelihood of each outcome.
   (c) Calculate the maximum utility at each decision node, while marking the maximum branch.
2. Trace back the marked branches, from the root node down, to find the desired optimal (conditional) plan.
The same procedure also yields the value of (perfect or imperfect) information in a decision tree.
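The depth-first back-up procedure can be sketched generically. The node encoding and function names below are my own; the tiny usage example at the end uses the t2=fail sub-tree whose numbers are worked out later in the lecture:

```python
# Depth-first evaluation of a decision tree: chance nodes average,
# decision nodes maximize and record the marked branch, so the optimal
# conditional plan can be read off afterwards.

def evaluate(node):
    kind = node["kind"]
    if kind == "leaf":
        return node["value"], {}
    if kind == "chance":
        # Average utility over the likelihoods of each outcome.
        total, plan = 0.0, {}
        for _label, p, child in node["branches"]:
            v, sub = evaluate(child)
            total += p * v
            plan.update(sub)
        return total, plan
    # Decision node: take the best branch and mark it in the plan.
    best_v, best_label, best_sub = float("-inf"), None, {}
    for label, child in node["choices"]:
        v, sub = evaluate(child)
        if v > best_v:
            best_v, best_label, best_sub = v, label, sub
    plan = {node["name"]: best_label}
    plan.update(best_sub)
    return best_v, plan

# Usage: after observing t2=fail, choose which car to buy.
buy = {"kind": "decision", "name": "buy|t2=fail", "choices": [
    ("C1", {"kind": "chance", "branches": [
        ("+", 0.7, {"kind": "leaf", "value": 480}),
        ("-", 0.3, {"kind": "leaf", "value": -220})]}),
    ("C2", {"kind": "chance", "branches": [
        ("+", 0.59, {"kind": "leaf", "value": 230}),
        ("-", 0.41, {"kind": "leaf", "value": 80})]}),
]}
value, plan = evaluate(buy)
print(round(value, 1), plan)   # 270.0 {'buy|t2=fail': 'C1'}
```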
Additional Information: Details of Example 2
- Buyer knows car C1 is good quality with probability 0.7: P(c1=good) = .7
- Buyer knows car C2 is good quality with probability 0.8: P(c2=good) = .8
- Test T1 checks the quality of car C1: P(t1=pass | c1=good) = .8, P(t1=pass | c1=bad) = .35
- Test T2 checks the quality of car C2: P(t2=pass | c2=good) = .75, P(t2=pass | c2=bad) = .3

Case 1: P(c1=good | t2=fail) = P(c1=good) = .7, since test T2 says nothing about C1.
  Utility = 2000 (value of car) - 1500 (cost of car) - 20 (cost of test) = 480
Case 2: P(c1=bad | t2=fail) = P(c1=bad) = 1 - P(c1=good) = .3
  Utility = 2000 - 1500 - 700 (cost of repair) - 20 = -220
Expected utility of the chance node over cases 1 & 2: .7 x 480 + .3 x (-220) = 270

Case 3: P(c2=good | t2=fail) = P(t2=fail | c2=good) P(c2=good) / P(t2=fail) = (.25 x .8 = .2) / P(t2=fail)
  Normalizing, P(t2=fail) = .2 + .14 = .34, so P(c2=good | t2=fail) = .2/.34 ≈ .59 (and .14/.34 ≈ .41 for the c2=bad case below).
  Utility = 1400 - 1150 - 20 = 230
Case 4: P(c2=bad | t2=fail) = P(t2=fail | c2=bad) P(c2=bad) / P(t2=fail) = (.7 x .2 = .14) / P(t2=fail) ≈ .41
  Utility = 1400 - 1150 - 20 - 150 (cost of repair) = 80
Expected utility of the chance node over cases 3 & 4: .59 x 230 + .41 x 80 = 168.5

What is the decision if we decide to do test T2 and it comes out fail: do we buy C1 or C2?
- EU(buy C1 | t2=fail) = expected utility of the chance node over cases 1 & 2 = 270
- EU(buy C2 | t2=fail) = expected utility of the chance node over cases 3 & 4 = 168.5
So after T2 fails, buy C1.
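The Bayes-rule updates in cases 3 and 4 can be checked numerically. This is a minimal sketch of the arithmetic above; variable names are my own:

```python
# Posterior quality of C2 after test T2 fails, by Bayes rule,
# and the resulting expected utility of buying C2.

p_good = 0.8                    # prior P(c2=good)
p_fail_given_good = 1 - 0.75    # T2 confirms good w.p. .75 if c2=good
p_fail_given_bad = 0.70         # T2 confirms bad  w.p. .70 if c2=bad

joint_good = p_fail_given_good * p_good       # .25 x .8 = .20
joint_bad = p_fail_given_bad * (1 - p_good)   # .70 x .2 = .14
p_fail = joint_good + joint_bad               # normalizing constant .34

post_good = joint_good / p_fail               # ≈ .59
post_bad = joint_bad / p_fail                 # ≈ .41

u_good = 1400 - 1150 - 20                     # case 3 utility: 230
u_bad = 1400 - 1150 - 20 - 150                # case 4 utility: 80
eu_buy_c2 = post_good * u_good + post_bad * u_bad
print(round(post_good, 2), round(post_bad, 2), round(eu_buy_c2, 1))
```

Note that the slide's 168.5 comes from using the posteriors rounded to .59/.41; with unrounded posteriors the expected utility is ≈ 168.2. Either way it is well below the 270 for buying C1.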
Example 2: Buying a Car (cont.)
The full tree branches first on the test decision (T0 = no test, T1, T2), then on the test result (pass/fail), then on the buy decision (C1 or C2), then on each car's quality (+/-). The leaf utilities are: after T2 ($20), 480/-220 for C1 and 230/80 for C2; after T1 ($50), 450/-250 for C1 and 200/50 for C2; with no test, 500/-200 for C1 and 250/100 for C2.

Decision Networks / Influence Diagrams
Decision networks, or influence diagrams, are an extension of belief networks that allow for reasoning about actions and utility. The network represents information about the agent's current state, its possible actions, the possible outcomes of those actions, and their utility.

Nodes in a Decision Network
- Chance nodes (ovals) have CPTs (conditional probability tables) that depend on the states of the parent nodes (chance or decision).
- Decision nodes (squares) represent options available to the decision maker.
- Utility nodes (diamonds), or value nodes, represent the overall utility based on the states of the parent nodes.

Example 3: Taking an Umbrella
- Rain: hidden chance variable
- WeatherReport: does the weather report indicate rain? (chance node, child of Rain)
- Bring Umbrella?: decision node, informed by WeatherReport
- Utility: depends on Rain and the umbrella decision
Parameters: P(Rain), P(WeatherReport | Rain), P(WeatherReport | ~Rain), Utility(Rain, Umbrella)
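For the umbrella network, the decision given a report is found by conditioning Rain on the report and comparing expected utilities. The structure follows the slide, but all the numbers below are illustrative assumptions of mine, not from the lecture:

```python
# Illustrative parameters (assumed, not from the lecture).
p_rain = 0.3
p_report_given_rain = 0.9      # P(report says rain | rain)
p_report_given_dry = 0.2       # P(report says rain | no rain)
# Utility(rain?, umbrella?): invented numbers.
U = {(True, True): 20, (True, False): -100,
     (False, True): 70, (False, False): 100}

def decide(report_rain):
    """Return (best umbrella action, EU table) after observing the report."""
    # Posterior P(rain | report) by Bayes rule.
    like_rain = p_report_given_rain if report_rain else 1 - p_report_given_rain
    like_dry = p_report_given_dry if report_rain else 1 - p_report_given_dry
    joint_rain = like_rain * p_rain
    joint_dry = like_dry * (1 - p_rain)
    post = joint_rain / (joint_rain + joint_dry)
    # Expected utility for each setting of the decision node.
    eu = {a: post * U[(True, a)] + (1 - post) * U[(False, a)]
          for a in (True, False)}
    return max(eu, key=eu.get), eu

action, eu = decide(report_rain=True)
print(action, {a: round(v, 1) for a, v in eu.items()})
```

With these made-up numbers the agent takes the umbrella when the report indicates rain and leaves it otherwise; changing Utility(Rain, Umbrella) changes the policy.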
Taking an Umbrella as a Decision Tree
In tree form, the tree branches on the observed Weather Report, then on the decision (Yes, take umbrella / No, don't take umbrella), then on the chance outcome (Rain / Not Rain) under each choice.

Knowledge in an Influence Diagram
- Causal knowledge about how events influence each other in the domain.
- Knowledge about what action sequences are feasible in any given set of circumstances; this lays out a possible temporal ordering of decisions.
- Normative (utility) knowledge about how desirable the consequences are.

Example 2 as an Influence Diagram
- T is the decision of whether to do a test, and which one; it carries the cost of the test, r(T).
- D is the decision of which car to buy.
- Chance nodes C1 and C2 (car quality) feed the utility terms r(C1,D) and r(C2,D); the test results T1 and T2 depend on C1 and C2 respectively, and inform D.

Decision Trees vs. Influence Diagrams
Decision trees are not convenient for representing domain knowledge:
- They require a tremendous amount of storage: multiple decision nodes expand the tree, and knowledge is duplicated along different paths (compare a joint probability distribution vs. a Bayes net).
- Better: generate the decision tree on the fly from more economical forms of knowledge, via depth-first expansion of the tree for computing the optimal decision.
Topology of Decision Networks
1. The directed graph has no cycles.
2. The utility nodes have no children.
3. There is a directed path that contains all of the decision nodes.
4. A CPT is attached to each chance node A, specifying P(A | parents(A)).
5. A real-valued function over parents(U) is attached to each utility node U.

Semantics
- Links into decision nodes are called information links; they indicate that the state of the parent is known prior to the decision.
- The directed path that goes through all the decision nodes defines a temporal sequence of decisions. It also partitions the chance variables into sets: I0 is the set of variables observed before any decision is made, I1 the set observed after the first and before the second decision, etc.; In is the set of unobserved variables.
- The no-forgetting assumption is that the decision maker remembers all past observations and decisions (a non-Markov assumption).

Example 4: Airport Siting Problem
The decision node Airport Site, together with chance nodes Air Traffic, Litigation, and Construction, influences the chance nodes Deaths, Noise, and Cost, which feed the utility node U = Utility(deaths, noise, cost). The CPT for Cost contains entries such as P(cost=high | airportsite=Darien, airtraffic=low, litigation=high, construction=high).

Evaluating Decision Networks
1. Set the evidence variables for the current state.
2. For each possible value of the decision node(s):
   (a) Set the decision node to that value.
   (b) Calculate the posterior probabilities for the parent nodes of the utility node.
   (c) Calculate the expected utility for the action.
3. Return the action/decision with the highest expected utility.
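The three evaluation steps can be sketched on a deliberately tiny network with one decision node D, one chance node C (the sole parent of the utility node), and no evidence. The network shape and all numbers are invented for illustration:

```python
# Tiny decision network: decision D in {a, b}; chance node C whose CPT
# depends on D; utility U(C). Evaluation follows the slide's algorithm.

cpt_c = {                           # P(C | D): invented numbers
    "a": {"low": 0.8, "high": 0.2},
    "b": {"low": 0.3, "high": 0.7},
}
utility = {"low": 10, "high": -5}   # U(C): invented numbers

def best_decision():
    eu = {}
    for d, dist in cpt_c.items():   # step 2(a): fix the decision value
        # Step 2(b): distribution over U's parents. With no evidence,
        # this is just the CPT row for C given D = d.
        # Step 2(c): expected utility of the action.
        eu[d] = sum(p * utility[c] for c, p in dist.items())
    return max(eu, key=eu.get), eu  # step 3: argmax over decisions

d, eu = best_decision()
print(d)   # a
```

In a real network, step 2(b) would require full Bayesian-network inference to get the posterior over the utility node's parents given the evidence and the chosen action.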
Imperfect Information: Example 5, Mildew
Two months before the harvest of a wheat field, the farmer observes the state Q of the crop, and he observes whether it has been attacked by mildew, M. If there is an attack, he will decide on a treatment with fungicides.

Mildew Decision Model
Maximize (value of crop - cost of action). There are five variables:
- Q: state of the crop: fair (f), not too bad (n), average (a), good (g)
- M: mildew attack: no (no), little (l), moderate (m), severe (s)
- H: state of the harvest, determined by Q and M: rotten (r), bad (b), poor (p)
- OQ: observation of Q; imperfect information on Q
- OM: observation of M; imperfect information on M
The treatment decision A has cost V; M* is the state of mildew after treatment; together with Q it determines H, whose utility U is the value of the crop. This model has one action in general.

Multiple Decisions: Policy Generation
A single decision node D may have links to some chance nodes, with a set of utility functions U1, ..., Un over domains X1, ..., Xn. Goal: find the decision d that maximizes EU(D=d | e). How can such problems be solved using a standard Bayesian network package? With multiple decisions, as in the car example (test T, then buy D), a more complex evaluation technique is needed, since we must generate a policy: a sequence of decisions conditioned on the intervening observations, not a single decision.
The Domain of Decision Nodes: Options at Decision Node D
T: t0, t1, t2
D:
- If T = t0: { Buy 1, Buy 2 }
- If T = t1: { Buy 1 if t1=pass else Buy 2; Buy 2 if t1=pass else Buy 1; always Buy 1; always Buy 2 }
- If T = t2: { Buy 1 if t2=pass else Buy 2; Buy 2 if t2=pass else Buy 1; always Buy 1; always Buy 2 }

(The next set of slides was not covered in detail in class and thus will not be tested on the final exam.)

Evaluation by Graph Reduction
Basic idea (Ross Shachter): perform a sequence of transformations to the diagram that preserve the optimal policy and its value, until only the utility node remains. This is similar in spirit to transformation into a polytree. There are four basic value/utility-preserving reductions:
- Barren node removal
- Chance node removal (marginalization)
- Decision node removal (maximization)
- Arc reversal (Bayes rule)

Barren Node Reduction
Let Xj represent a subset of nodes of interest in an influence diagram, and let Xk represent a subset of evidence nodes. We are interested in P(f(Xj) | Xk). A node is barren if it has no successors and it is not a member of Xj or Xk. The elimination of barren nodes does not affect the value of P(f(Xj) | Xk).
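The option sets above define the entire policy space for the car example: a test choice plus a rule mapping the test outcome to a car. Small enough to enumerate, it can be scored directly with the probabilities and payoffs given earlier in the lecture. The encoding is my own; a sketch, not the lecture's algorithm:

```python
# Enumerate every (test, buy-rule) policy for the car example and score
# it: expected gain (car value minus price, minus repair if bad) minus
# the test cost.

priors = {"C1": 0.7, "C2": 0.8}                 # P(car = good)
pass_prob = {"t1": ("C1", {True: 0.8, False: 0.35}),   # P(pass | quality)
             "t2": ("C2", {True: 0.75, False: 0.3})}
payoff = {"C1": {True: 500, False: -200},       # gain if good / if bad
          "C2": {True: 250, False: 100}}
test_cost = {"t0": 0, "t1": 50, "t2": 20}

def ev_car(car):
    """Prior expected payoff of buying `car` with no relevant evidence."""
    return priors[car] * payoff[car][True] + (1 - priors[car]) * payoff[car][False]

def eu_policy(test, rule):
    """`rule` maps a test outcome ('pass'/'fail', or None for t0) to a car."""
    if test == "t0":
        return ev_car(rule[None]) - test_cost[test]
    tested_car, like = pass_prob[test]
    total = 0.0
    for outcome in ("pass", "fail"):
        for good in (True, False):
            p_quality = priors[tested_car] if good else 1 - priors[tested_car]
            p_out = like[good] if outcome == "pass" else 1 - like[good]
            car = rule[outcome]
            if car == tested_car:
                total += p_quality * p_out * payoff[car][good]
            else:
                # The other car's quality is independent of this test.
                total += p_quality * p_out * ev_car(car)
    return total - test_cost[test]

policies = [("t0", {None: c}) for c in ("C1", "C2")]
policies += [(t, {"pass": a, "fail": b})
             for t in ("t1", "t2") for a in ("C1", "C2") for b in ("C1", "C2")]
best = max(policies, key=lambda p: eu_policy(*p))
print(best, round(eu_policy(*best), 1))
```

With the slide's numbers, the enumeration finds that neither test is worth its cost: the best policy is to skip the test and buy C1, with expected utility 290 (versus ≈282.7 for the best t1 policy and 270 for the best t2 policy).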
Barren Node Removal
A barren node is simply deleted from the diagram along with its incoming arcs; removing it may make its parents barren in turn, so the reduction can be applied repeatedly.

Notation for Shachter's Algorithm
- For chance nodes: S(i) = direct successors (children); C(i) = conditional predecessors (parents).
- For decision nodes: I(i) = information predecessors (parents).

Chance Node Removal (Marginalization)
A chance node i that is directly linked to the utility node v is removed by marginalization: the new utility function is the expectation of v over i. Before the reduction, the relevant parent sets are C(i) \ C(v) (nodes connected to i but not to v), C(i) ∩ C(v), and C(v) \ C(i) \ {i} (nodes connected to v but not to i, and not i itself); after the removal, all of these nodes become parents of v.
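Chance-node removal can be illustrated on a tiny table. Here a binary chance node i with a single parent feeds the utility node v directly, and is folded into v by taking expectations; node names and numbers are invented:

```python
# Remove chance node i (binary) that feeds utility v directly: the new
# utility function, defined over i's parents, is the expectation of v
# over i's distribution.

p_i = {"a": 0.9, "b": 0.4}        # P(i=True | parent), invented numbers
v = {True: 10.0, False: -2.0}     # utility as a function of i, invented

# After the removal, v's parent set becomes C(i) ∪ C(v) \ {i} = {parent}.
new_v = {pa: p_i[pa] * v[True] + (1 - p_i[pa]) * v[False] for pa in p_i}
print({pa: round(u, 2) for pa, u in new_v.items()})   # {'a': 8.8, 'b': 2.8}
```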
Decision Node Removal (Maximization)
A decision node i whose remaining successor is the utility node v is removed by maximization: for each configuration of the remaining parents (I(i) ∩ C(v) after the reduction), the new utility is the maximum of v over the options of i. The nodes in I(i) \ C(v) drop away with i.

Arc Reversal (Bayes Rule)
Given an influence diagram containing an arc from i to j, but no other directed path from i to j, it is possible to transform the diagram into one with an arc from j to i. (If j is deterministic, it then becomes probabilistic.) With Pa = parents, and Pa(A) \ Pa(B) denoting the parents of A that are not parents of B, the relevant sets are C(i) \ C(j), C(i) ∩ C(j), and C(j) \ C(i) \ {i}; after the reversal, i and j each inherit the other's parents, so all of these nodes become parents of both.
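On two binary chance nodes with no other parents, arc reversal is exactly Bayes rule. The sketch below reverses the arc i -> j and checks that the joint distribution is preserved; the numbers are invented:

```python
from itertools import product

# Original factorization: P(i) and P(j | i), invented numbers.
p_i = {True: 0.3, False: 0.7}
p_j_given_i = {True: {True: 0.9, False: 0.1},
               False: {True: 0.2, False: 0.8}}

# Reversal: P'(j) = sum_i P(j|i) P(i);  P'(i|j) = P(j|i) P(i) / P'(j).
p_j = {j: sum(p_j_given_i[i][j] * p_i[i] for i in p_i)
       for j in (True, False)}
p_i_given_j = {j: {i: p_j_given_i[i][j] * p_i[i] / p_j[j] for i in p_i}
               for j in (True, False)}

# The transformation is value-preserving: the joint P(i, j) is unchanged.
for i, j in product((True, False), repeat=2):
    before = p_i[i] * p_j_given_i[i][j]
    after = p_j[j] * p_i_given_j[j][i]
    assert abs(before - after) < 1e-12
print("joint preserved; P'(j=True) =", round(p_j[True], 2))
```

With more parents present, the same computation is done per configuration of C(i) ∪ C(j), which is why both nodes end up with the union of the parent sets.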
Next Lecture: Introduction to Different Forms of Learning