Informatics 2D Reasoning and Agents Semester 2, 2015-16

Similar documents

Decision Making under Uncertainty

Simple Decision Making

Decision making in the presence of uncertainty II

Decision Trees and Networks

1 Uncertainty and Preferences

Lecture 11 Uncertainty

Introduction to Game Theory IIIii. Payoffs: Probability and Expected Utility

Choice Under Uncertainty

Lecture notes for Choice Under Uncertainty

Financial Markets. Itay Goldstein. Wharton School, University of Pennsylvania

Compression algorithm for Bayesian network modeling of binary systems

1 Introduction to Option Pricing

DECISION MAKING UNDER UNCERTAINTY:

Lecture Note 1 Set and Probability Theory. MIT Spring 2006 Herman Bennett

Choice Under Uncertainty Insurance Diversification & Risk Sharing AIG. Uncertainty

Lecture 8 The Subjective Theory of Betting on Theories

A Few Basics of Probability

Why is Insurance Good? An Example Jon Bakija, Williams College (Revised October 2013)

Choice under Uncertainty

Insurance. Michael Peters. December 27, 2013

Economics 1011a: Intermediate Microeconomics

Applied Economics For Managers Recitation 5 Tuesday July 6th 2004

The Basics of Graphical Models

What Is Probability?

Statistics in Geophysics: Introduction and Probability Theory

CHAPTER 2 Estimating Probabilities

Moral Hazard. Itay Goldstein. Wharton School, University of Pennsylvania

Lecture Note 14: Uncertainty, Expected Utility Theory and the Market for Risk

Elementary Statistics and Inference. Elementary Statistics and Inference. 17 Expected Value and Standard Error. 22S:025 or 7P:025.

Prospect Theory Ayelet Gneezy & Nicholas Epley

MA 1125 Lecture 14 - Expected Values. Friday, February 28, Objectives: Introduce expected values.

Week 5: Expected value and Betting systems

Chapter 21: The Discounted Utility Model

Lecture Notes on Actuarial Mathematics

Decision Theory Rational prospecting

Bayesian networks - Time-series models - Apache Spark & Scala

Economics 2020a / HBS 4010 / HKS API-111 Fall 2011 Practice Problems for Lectures 1 to 11

Basic Probability. Probability: The part of Mathematics devoted to quantify uncertainty

2. Information Economics

Prediction Markets, Fair Games and Martingales

Decision & Risk Analysis Lecture 6. Risk and Utility

Unit 19: Probability Models

Economics 1011a: Intermediate Microeconomics

Chapter 28. Bayesian Networks

Economics of Insurance

AMS 5 CHANCE VARIABILITY

Bayesian Analysis for the Social Sciences

Risk and Uncertainty. Vani K Borooah University of Ulster

Decision Making Under Uncertainty. Professor Peter Cramton Economics 300

In the situations that we will encounter, we may generally calculate the probability of an event

3.2 Roulette and Markov Chains

STA 371G: Statistics and Modeling

1 Representation of Games. Kerschbamer: Commitment and Information in Games

Lecture 10 - Risk and Insurance

Online Appendix Feedback Effects, Asymmetric Trading, and the Limits to Arbitrage

Chance and Uncertainty: Probability Theory

Introductory Probability. MATH 107: Finite Mathematics University of Louisville. March 5, 2014

Principal Agent Model. Setup of the basic agency model. Agency theory and Accounting. 2 nd session: Principal Agent Problem

John Kerrich s coin-tossing Experiment. Law of Averages - pg. 294 Moore s Text

Optimization under uncertainty: modeling and solution methods

Machine Learning.

Lecture Notes on the Mathematics of Finance

Data Modeling & Analysis Techniques. Probability & Statistics. Manfred Huber

36 Odds, Expected Value, and Conditional Probability

Lecture 15. Ranking Payoff Distributions: Stochastic Dominance. First-Order Stochastic Dominance: higher distribution

Bayesian Networks. Mausam (Slides by UW-AI faculty)

arxiv: v1 [math.pr] 5 Dec 2011

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.436J/15.085J Fall 2008 Lecture 5 9/17/2008 RANDOM VARIABLES

Intermediate Micro. Expected Utility

Decision Analysis. Here is the statement of the problem:

Information Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay

Econ 132 C. Health Insurance: U.S., Risk Pooling, Risk Aversion, Moral Hazard, Rand Study 7

Decision trees DECISION NODES. Kluwer Academic Publishers / X_220. Stuart Eriksen 1, Candice Huynh 1, and L.

If A is divided by B the result is 2/3. If B is divided by C the result is 4/7. What is the result if A is divided by C?

Math/Stats 425 Introduction to Probability. 1. Uncertainty and the axioms of probability

We never talked directly about the next two questions, but THINK about them they are related to everything we ve talked about during the past week:

3 Introduction to Assessing Risk

1 Interest rates, and risk-free investments

ECE302 Spring 2006 HW4 Solutions February 6,

Game Theory and Algorithms Lecture 10: Extensive Games: Critiques and Extensions

Before we discuss a Call Option in detail we give some Option Terminology:

Probabilities. Probability of a event. From Random Variables to Events. From Random Variables to Events. Probability Theory I

MONEY MANAGEMENT. Guy Bower delves into a topic every trader should endeavour to master - money management.

Compact Representations and Approximations for Compuation in Games

MONITORING AND DIAGNOSIS OF A MULTI-STAGE MANUFACTURING PROCESS USING BAYESIAN NETWORKS

Question 1 Formatted: Formatted: Formatted: Formatted:

פרויקט מסכם לתואר בוגר במדעים )B.Sc( במתמטיקה שימושית

Math 141. Lecture 2: More Probability! Albyn Jones 1. jones/courses/ Library 304. Albyn Jones Math 141

Pascal is here expressing a kind of skepticism about the ability of human reason to deliver an answer to this question.

Probability and Expected Value

Lecture 13: Risk Aversion and Expected Utility

A Portfolio Model of Insurance Demand. April Kalamazoo, MI East Lansing, MI 48824

Bayesian Updating with Discrete Priors Class 11, 18.05, Spring 2014 Jeremy Orloff and Jonathan Bloom

Transcription:

Informatics 2D Reasoning and Agents Semester 2, 2015-16 Alex Lascarides alex@inf.ed.ac.uk Lecture 29 Decision Making Under Uncertainty 24th March 2016 Informatics UoE Informatics 2D 1

Where are we? Last time... Looked at Dynamic Bayesian Networks General, powerful method for describing temporal probabilistic problems Unfortunately exact inference computationally too hard Methods for approximate inference (particle filtering) Today... Decision Making under Uncertainty Informatics UoE Informatics 2D 197

Combining beliefs and desires Rational agents do things that are an optimal tradeoff between: the likelihood of reaching a particular resultant state (given one s actions) and The desirability of that state So far we have done the likelihood bit: we know how to evaluate the probability of being in a particular state at a particular time. But we ve not looked at an agent s preferences or desires Now we will discuss utility theory in more detail to obtain a full picture of decision-theoretic agent design Informatics UoE Informatics 2D 198

Agent s preferences between world states are described using a utility function UF assigns some numerical value U(S) to each state S to express its desirability for the agent Nondeterministic action a has results Result(a) and probabilities P(Result(a) = s a,e) summarise agent s knowledge about its effects given evidence observations e. Can be combined with probabilities for outcomes to obtain expected utility of action: EU(A E) = s P(Result(a) = s a,e)u(s ) Informatics UoE Informatics 2D 199

Principle of maximum expected utility (MEU) says agent should use action that maximises expected utility In a sense, this summarises the whole endeavour of AI: If agent maximises utility function that correctly reflects the performance measure applied to it, then optimal performance will be achieved by averaging over all environments in which agent could be placed Of course, this doesn t tell us how to define utility function or how to determine probabilities for any sequence of actions in a complex environment For now we will only look at one-shot decisions, not sequential decisions (next lecture) Informatics UoE Informatics 2D 200

MEU sounds reasonable, but why should this be the best quantity to maximise? Why are numerical utilities sensible? Why single number? Questions can be answered by looking at constraints on preferences Notation: A B A B A B A is preferred to B the agent is indifferent between A and B the agent prefers A to B or is indifferent between them But what are A and B? Introduce lotteries with outcomes C 1...C n and accompanying probabilities L = [p 1, C 1 ; p 2, C 2 ;...;p n, C n ] Informatics UoE Informatics 2D 201

Outcome of a lottery can be state or another lottery Can be used to understand how preferences between complex lotteries are defined in terms of preferences among their (outcome) states The following are considered reasonable axioms of utility theory Orderability: (A B) (B A) (A B) Transitivity: If agent prefers A over B and B over C then he must prefer A over C: (A B) (B C) (A C) Example: Assume A B C A and A, B, C are goods Agent might trade A and some money for C if he has A We then offer B for C and some cash and then trade A for B Agent would lose all his money over time Informatics UoE Informatics 2D 202

Continuity: If B is between A and C in preference, then with some probability agent will be indifferent between getting B for sure and a lottery over A and C A B C p [p,a;1 p,c] B Substitutability: Indifference between lotteries leads to indifference between complex lotteries built from them A B [p,a;1 p,c] [p,b;1 p,c] Monotonicity: Preferring A to B implies preference for any lottery that assigns higher probability to A A B (p q [p,a;1 p,b] [q,a;1 q,b] Informatics UoE Informatics 2D 203

Decomposability example Decomposability: Compound lotteries can be reduced to simpler one [p,a;1 p,[q,b;1 q,c]] [p,a;(1 p)q,b;(1 p)(1 q),c] p A q B (1 p) is equivalent to (1 q) A C p B (1 p)(1 q) C Informatics UoE Informatics 2D 204

From preferences to utility The following axioms of utility ensure that utility functions follow the above axioms on preference: Utility principle: there exists a function such that U(A) > U(B) A B U(A) = U(B) A B MEU principle: utility of lottery is sum of probability of outcomes times their utilities U([p 1,S 1 ;...;p n,s n ]) = i p i U(S i ) But an agent might not know even his own utilities! But you can work out his (or even your own!) utilities by observing his (your) behaviour and assuming that he (you) chooses to MEU. Informatics UoE Informatics 2D 205

According to the above axioms, arbitrary preferences can be expressed by utility functions I prefer to have a prime number of in my bank account; when I have 10 I will give away 3. But usually preferences are more systematic, a typical example being money (roughly, we like to maximise our money) Agents exhibit monotonic preference toward money, but how about lotteries involving money? Who wants to be a millionaire -type problem, is pocketing a smaller amount irrational? Expected monetary value (EMV) is actual expectation of outcome Informatics UoE Informatics 2D 206

Utility of money Assume you can keep 1 million or risk it with the prospect of getting three millions at the toss of a (fair) coin EMV of accepting gamble is 0.5 0 + 0.5 3, 000, 000 which is greater than 1, 000, 000 Use S n to denote state of possessing wealth n dollars, current wealth S k Expected utilities become: EU(Accept) = 1 2 U(S k) + 1 2 U(S k+3,000,000) EU(Decline) = U(Sk+1,000,000 ) But it all depends on utility values you assign to levels of monetary wealth (is first million more valuable than second?) Informatics UoE Informatics 2D 207

Utility of money (empirical study) It turns out that for most people this is usually concave (curve (a)), showing that going into debt is considered disastrous relative to small gains in money risk averse. U U o o o o o o o o o o o o 150,000 o 800,000 o $ $ o (a) But if you re already $10M in debt, your utility curve is more like (b) risk seeking when desperate! (b) Informatics UoE Informatics 2D 208

Utility scales Axioms don t say anything about scales For example transformation of U(S) into U (S) = k 1 + k 2 U(S) (k 2 positive) doesn t affect behaviour In deterministic contexts behaviour is unchanged by any monotonic transformation (utility function is value function/ordinal function) One procedure for assessing utilities is to use normalised utility between best possible prize (u = 1) and worst possible catastrophe (u = 0) Ask agent to indicate preference between S and the standard lottery [p, u : (1 p), u ], adjust p until agent is indifferent between S and standard lottery, set U(S) = p Informatics UoE Informatics 2D 209

Representing problems with DNs Evaluating decision networks What we now need is a way of integrating utilities into our view of probabilistic reasoning (influence diagrams) combine BNs with additional node types for actions and utilities Illustrate with airport siting problem: Airport Site Air Traffic Deaths Litigation Noise U Construction Cost Informatics UoE Informatics 2D 210

Representing problems with DNs Evaluating decision networks Representing decision problems with DNs Chance nodes (ovals) represent random variables with CPTs, parents can be decision nodes Decision nodes represent decision-making points at which actions are available Utility nodes represent utility function connected to all nodes that affect utility directly Often nodes describing outcome states are omitted and expected utility associated with actions is expressed (rather than states) action-utility tables Informatics UoE Informatics 2D 211

Representing problems with DNs Evaluating decision networks Representing decision problems with DNs Simplified version with action-utility tables Less flexible but simpler (like pre-compiled version of general case) Airport Site Air Traffic Litigation U Construction Informatics UoE Informatics 2D 212

Representing problems with DNs Evaluating decision networks Evaluating decision networks Evaluation of a DN works by setting decision node to every possible value Algorithm : 1. Set evidence variables for current state 2. For each value of decision node: 2.1 Set decision node to that value 2.2 Calculate posterior probabilities for parents of utility node 2.3 Calculate resulting (expected) utility for action 3. Return action with highest (expected) utility Using any algorithm for BN inference, this yields a simple framework for building agents that make single-shot decisions Informatics UoE Informatics 2D 213

Foundations for rational decision making under uncertainty Utility theory and its axioms, utility functions Possible points of criticism? nicely blend with our BN framework Only looked at one-shot decisions so far Next time: Markov Decision Processes Informatics UoE Informatics 2D 214