Measuring the Performance of an Agent




The rational agent we are aiming at should be successful in the task it is performing. To assess this success we need a performance measure.

What is rational at any given time depends on
- the performance measure that defines the criterion of success,
- the agent's prior knowledge of the environment,
- the actions that the agent can perform, and
- the agent's percept sequence to date.

For each possible percept sequence, a rational agent should select an action that is expected to maximize its performance measure, given the evidence provided by the percept sequence and whatever built-in knowledge the agent has.

For example, a junk-mail filter has to classify e-mail messages as junk or relevant, and a physician has to decide whether or not to operate on a patient. The number of correctly classified instances is not the best possible measure of performance, because right and wrong decisions carry different weights: moving an important message to the junk folder is worse than occasionally letting some junk mail through. The four possible outcomes of such a binary decision are:

                      classified junk    classified relevant
  actually junk       true positive      false negative
  actually relevant   false positive     true negative
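The asymmetry between the two error types can be captured by weighting them differently in the performance measure. A minimal sketch in Python, with illustrative weights and counts (not taken from the text):

```python
# Cost-sensitive performance measure for a junk-mail filter (illustrative).
# A false positive (a relevant mail moved to the junk folder) is penalized
# much more heavily than a false negative (a junk mail let through).

def performance(tp, fn, fp, tn, fp_cost=10.0, fn_cost=1.0):
    """Reward correct decisions, charge asymmetric costs for the errors."""
    return (tp + tn) - fp_cost * fp - fn_cost * fn

# Two filters with the same accuracy (90 correct decisions out of 100)
# differ sharply under the asymmetric measure:
cautious = performance(tp=40, fn=10, fp=0, tn=50)    # lets some junk through
aggressive = performance(tp=50, fn=0, fp=10, tn=40)  # junks relevant mail
print(cautious, aggressive)  # 80.0 -10.0
```

Both filters make 90 correct decisions, but only the cautious one scores well once the errors are weighted.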

Properties of Environments

Task environments can be classified at least by the following properties.

Fully vs. partially observable
- In a fully observable environment the agent's sensors give it all the aspects relevant to its choice of action. Hence the agent does not need to maintain any internal state (an understanding of the state of the world).
- An environment may be partially observable because of noisy and inaccurate sensors, or simply due to its basic nature.

Deterministic vs. stochastic
- If the next state of the environment is completely determined by the current state and the executed action, the environment is deterministic; otherwise it is stochastic.
- A deterministic environment may appear stochastic if it is only partially observable.

Episodic vs. sequential
- In an episodic task environment the next episode does not depend on the actions taken in previous episodes.
- In a sequential environment the current decision can affect all future decisions.

Static vs. dynamic
- In a static environment the agent may stop to deliberate on its actions without fearing that the world changes in the meantime. In a dynamic environment the agent has to keep observing the state of the environment.
- In a semidynamic environment the environment itself does not change with the passage of time, but the agent's performance score does (the agent is penalized for the time it needs to plan its actions).

Discrete vs. continuous
- The distinction can be made with respect to the state of the environment, to time, and to the percepts and actions of the agent.

Single-agent vs. multiagent
- Depends on which entities one wants to view as agents rather than as stochastically behaving objects. In a multiagent environment one can compete or cooperate.

Known vs. unknown
- Whether the "laws of physics" of the environment are known. In a known environment the outcomes (or, in a stochastic environment, the outcome probabilities) of all actions are given. A known environment can still be partially observable.

The Real World from the Perspective of a Robot
- Partially observable: sensors are not perfect and perceive only the nearby environment (this holds for humans, too).
- Stochastic: wheels slip, batteries run out, parts break; one can never be certain that the intended action is carried out.
- Sequential: the effects of actions change over time; a robot has to manage sequential decision problems and be able to learn.
- Dynamic: the robot has to decide when to deliberate and when to act.
- Continuous/infinite: algorithms must work in this environment, not, e.g., in a finite discrete search space.
- Single-/multiagent: depends on whether one wants to view other objects as agents or as stochastic parts of the environment.

2.4 The Structure of Agents

Because the agent's history of percept-action pairs, stored in a table, describes the external behavior of the agent, the control program of the agent could in principle be based on table lookup. Obviously this works only for very small environments; in more realistic situations tabulating percept-action pairs is not a viable solution. Instead, the agent program has to be able to decide on the desired action for any percept history without tabulating all the possible alternatives. The following agent types are the most common solutions to this problem.

Simple / Model-based Reflex Agents

The simplest possible control program makes the agent operate solely on the current percept, discarding the percept history. Reflexes are used in emergency situations by humans as well as by robots. Reflexive behavior yields correct decisions only when the task environment is fully observable. Alternatively, one can choose the set of currently applicable rules based on the agent's internal state, and hence maintain a model of the world covering some information that is not directly observable.
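These two designs can be sketched in a few lines of Python for a toy two-square vacuum world. The world, the percept format, and the rules are illustrative assumptions, not from the text:

```python
# Sketches of a simple reflex agent and a model-based reflex agent for a
# toy two-square vacuum world (squares "A" and "B"); all names invented.

def simple_reflex_agent(percept):
    """Acts on the current percept only; the percept history is discarded."""
    location, dirty = percept
    if dirty:
        return "Suck"
    return "Right" if location == "A" else "Left"

class ModelBasedReflexAgent:
    """Keeps an internal model of the squares seen so far, i.e. information
    that is no longer directly observable through the sensors."""

    def __init__(self):
        self.model = {"A": None, "B": None}  # believed status of each square

    def act(self, percept):
        location, dirty = percept
        self.model[location] = "Dirty" if dirty else "Clean"
        if dirty:
            return "Suck"
        if self.model["A"] == self.model["B"] == "Clean":
            return "NoOp"  # the model says everything is clean: stop
        return "Right" if location == "A" else "Left"

agent = ModelBasedReflexAgent()
print(agent.act(("A", True)))  # Suck
```

The simple agent never knows when to stop; the model-based agent can infer from its internal state that both squares are clean, even though it only senses the square it occupies.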

2.4.4 Goal-based Agents

In addition to its percepts, the agent possesses knowledge of its goal. The goal is some assertion concerning the environment that should be satisfied. By combining the goal with knowledge of the effects of the available actions, the agent can try to satisfy the goal. If the goal cannot be satisfied directly through one action, the agent has to find a sequence of actions that satisfies it; for this it may resort to search algorithms (next section) or to planning (later on).

2.4.5 Utility-based Agents

The agent may achieve its goal in many different ways, and different solutions may differ in quality; setting a goal alone does not suffice to express more complex preferences. If the possible states of the environment are assigned an order through a utility function, then the agent can try to improve the value it attains. In partially observable and stochastic environments we choose the action that maximizes the expected utility of the outcomes. The utility function maps a state (or a sequence of states) to a real number. To take advantage of this approach, the agent does not necessarily have to possess a utility function explicitly.
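Choosing the action with the maximum expected utility can be sketched directly. The utility values and outcome probabilities below are invented for illustration:

```python
# Pick the action maximizing expected utility over stochastic outcomes.

def expected_utility(outcomes):
    """outcomes: list of (probability, utility) pairs for one action."""
    return sum(p * u for p, u in outcomes)

def best_action(actions):
    """actions: dict mapping an action name to its outcome distribution."""
    return max(actions, key=lambda a: expected_utility(actions[a]))

actions = {
    "safe":  [(1.0, 40.0)],                  # certain, moderate utility
    "risky": [(0.5, 100.0), (0.5, -40.0)],   # high utility only half the time
}
print(best_action(actions))  # safe (expected utilities: 40.0 vs. 30.0)
```

Note that a goal-based agent treating utility 100 as the goal would prefer the risky action; the expected-utility criterion ranks the certain moderate outcome higher.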

2.4.6 Learning Agents

Programming agents by hand for all possible tasks appears to be hopeless. Already Turing (1950) proposed machine learning as a method of creating intelligent systems. Learning allows the agent to operate in initially unknown environments and to become more competent than its initial knowledge alone would allow. The learning element of an agent has to be kept distinct from the actual performance element. We come back to the techniques of machine learning towards the end of the course.

3 SOLVING PROBLEMS BY SEARCHING

A goal-based agent aims at solving problems by performing actions that lead to desirable states. Let us first consider the uninformed situation, in which the agent is given no information about the problem other than its definition. In blind search only the structure of the problem to be solved is formulated. The agent aims at reaching a goal state; the world is static, discrete, and deterministic. The possible states of the world define the state space of the search problem.

In the beginning the world is in the initial state s1, and the agent aims at reaching one of the goal states G. Quite often one uses a successor function S, which maps each state to a subset of the state space, to define the agent's possible actions: each state s has a set of legal successor states S(s) that can be reached in one step. Paths have a non-negative cost, most often the sum of the costs of the steps on the path.

These definitions naturally determine a directed, weighted graph. The simplest problem in searching is to find out whether any goal state can be reached from the initial state s1 at all. The actual solution to the problem, however, is a path from the initial state to a goal state. Usually one also takes the costs of paths into account and tries to find a cost-optimal path to a goal state. Many tasks of a goal-based agent are easy to formulate directly in this representation.

[Figure: 8-puzzle. Start state: 7 2 4 / 5 _ 6 / 8 3 1; goal state: _ 1 2 / 3 4 5 / 6 7 8.]

For example, the states of the world of the 8-puzzle (a sliding-block puzzle) are all 9!/2 = 181 440 reachable configurations of the tiles. The initial state is one of the possible states, and the goal state is the one given on the right above. The possible values of the successor function are obtained by moving the blank left, right, up, or down. Each move costs one unit, and the path cost is the total number of moves.

Donald Knuth (1964) illustrated how infinite state spaces can arise. Conjecture: starting with the number 4, a sequence of factorial, square root, and floor operations will reach any desired positive integer; for example, floor(sqrt(sqrt(sqrt(sqrt(sqrt((4!)!)))))) = 5.
- States: positive numbers
- Initial state s1: 4
- Actions: apply the factorial, square root, or floor operation (factorial for integers only)
- Transition model: as given by the mathematical definitions of the operations
- Goal test: the state is the desired positive integer
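Knuth's problem is easy to cast as a search problem in code. The sketch below runs breadth-first search over the state space; the depth limit and the restriction of factorial to small integers are practical assumptions that keep the infinite space manageable for a demo:

```python
import math
from collections import deque

def knuth_search(target, max_depth=12):
    """Breadth-first search in Knuth's state space: start from 4 and apply
    factorial, square root, or floor. Returns the list of operations that
    reaches the target, or None within the depth bound."""
    frontier = deque([(4.0, [])])
    seen = {4.0}
    while frontier:
        state, path = frontier.popleft()
        if state == target:
            return path
        if len(path) >= max_depth:
            continue
        successors = []
        if state == int(state) and state <= 60:
            # factorial is applied to (small) integers only
            successors.append((float(math.factorial(int(state))), "factorial"))
        if state > 1:
            successors.append((math.sqrt(state), "sqrt"))
        if state != int(state):
            successors.append((float(math.floor(state)), "floor"))
        for nxt, op in successors:
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, path + [op]))
    return None

print(knuth_search(5))
```

For target 5 the search finds a plan of the form factorial, factorial, a run of square roots, and one final floor.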

Search Tree

When the search for a goal state begins from the initial state and proceeds by steps determined by the successor function, we can view the progress of the search as a tree structure. When the root of the search tree, which corresponds to the initial state, is expanded, we apply the successor function to it and generate new search nodes into the tree, as children of the root, corresponding to the successor states. The search continues by expanding the other nodes of the tree in the same manner. The search strategy (search algorithm) determines the order in which the nodes of the tree are expanded.

[Figure: the first levels of the 8-puzzle search tree: the root state 7 2 4 / 5 _ 6 / 8 3 1 and the child nodes generated by moving the blank.]

A node is a data structure with five components:
- the state to which the node corresponds,
- a link to the parent node,
- the action that was applied to the parent to generate this node,
- the cost of the path from the initial state to the node, and
- the depth of the node.

Global parameters of a search tree include b, the (average or maximum) branching factor; d, the depth of the shallowest goal; and m, the maximum length of any path in the state space.

3.4 Uninformed Search Strategies

The search algorithms are implemented as special cases of normal tree traversal. The time complexity of search is usually measured by the number of nodes generated into the tree. Space complexity, on the other hand, measures the number of nodes maintained in memory at the same time. A search algorithm is complete if it is guaranteed to find a solution (reach a goal state starting from the initial state) whenever one exists. The solution returned by a complete algorithm is not necessarily optimal: several goal states with different costs may be reachable from the initial state.
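The five-component node translates directly into code. A sketch (the field and helper names are my own):

```python
# Search-tree node with the five components listed above.

from dataclasses import dataclass
from typing import Any, Optional

@dataclass
class Node:
    state: Any                       # the state the node corresponds to
    parent: Optional["Node"] = None  # link to the parent node
    action: Any = None               # action applied to the parent
    path_cost: float = 0.0           # cost of the path from the initial state
    depth: int = 0                   # depth of the node in the tree

def child_node(parent, action, state, step_cost=1.0):
    """Generate a child node from a parent via the given action."""
    return Node(state, parent, action,
                parent.path_cost + step_cost, parent.depth + 1)

def solution(node):
    """Follow parent links back to the root to recover the action sequence."""
    actions = []
    while node.parent is not None:
        actions.append(node.action)
        node = node.parent
    return list(reversed(actions))
```

The parent links are what make it possible to return a path, not just a goal state, once a goal node is found.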

3.4.1 Breadth-first Search

When the nodes of the search tree are traversed in level order, the tree gets searched in breadth-first order: all nodes at a given depth are expanded before any nodes at the next level. Suppose that the solution is at depth d. In the worst case we expand all but the last node at level d. Every node that is generated must remain in memory, because it may belong to the solution path. Let b be the branching factor of the search. Thus the worst-case time and space complexity is b + b^2 + ... + b^d + (b^(d+1) - b) = O(b^(d+1)).

[Figure: breadth-first traversal order of a binary tree with nodes A-G: A, B, C, D, E, F, G.]
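Breadth-first search itself is short to write down. A sketch with a FIFO queue of paths and a small invented example graph:

```python
# Breadth-first search: expand all nodes at one depth before the next.

from collections import deque

def breadth_first_search(initial, goal_test, successors):
    if goal_test(initial):
        return [initial]
    frontier = deque([[initial]])        # FIFO queue of paths
    explored = {initial}
    while frontier:
        path = frontier.popleft()
        for s in successors(path[-1]):
            if s not in explored:
                new_path = path + [s]
                if goal_test(s):
                    return new_path      # shallowest goal found first
                explored.add(s)
                frontier.append(new_path)
    return None                          # no goal state is reachable

graph = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F", "G"],
         "D": [], "E": [], "F": [], "G": []}
print(breadth_first_search("A", lambda s: s == "G", lambda s: graph[s]))
```

Checking the goal when a node is generated, rather than when it is expanded, is what gives the "all but the last node at level d" worst case quoted above.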

In general, the weakness of breadth-first search is its time and space usage, both exponential in the depth. In particular the need to keep all explored nodes in memory causes problems. For example, when the tree has branching factor 10, a million nodes are generated per second, and each node requires 1 000 bytes of storage, then:

  Depth   Nodes    Time        Memory
    2     110      0.11 ms     107 kB
    4     11 110   11 ms       10.6 MB
    6     10^6     1.1 s       1 GB
    8     10^8     2 min       103 GB
   10     10^10    3 h         10 TB
   12     10^12    13 days     1 PB
   14     10^14    3.5 years   99 PB
   16     10^16    350 years   10 EB

Breadth-first search is optimal when all step costs are equal, because it always expands the shallowest unexpanded node. It is a special case of uniform-cost search, which always expands the node with the lowest path cost. If the cost of every step is strictly positive, the solution returned by uniform-cost search is guaranteed to be optimal. The space complexity of uniform-cost search is still high, and when all step costs are equal it is identical to breadth-first search.
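The expand-the-cheapest-node-first strategy replaces the FIFO queue of breadth-first search with a priority queue ordered by path cost. A sketch of uniform-cost search over a small invented weighted graph:

```python
# Uniform-cost search: always expand the frontier node with the lowest
# path cost, using a binary heap as the priority queue.

import heapq

def uniform_cost_search(initial, goal_test, successors):
    """successors(state) yields (next_state, step_cost) pairs."""
    frontier = [(0.0, initial, [initial])]   # (path cost, state, path)
    best_cost = {initial: 0.0}
    while frontier:
        cost, state, path = heapq.heappop(frontier)
        if goal_test(state):
            return cost, path                # lowest-cost goal pops first
        for nxt, step in successors(state):
            new_cost = cost + step
            if new_cost < best_cost.get(nxt, float("inf")):
                best_cost[nxt] = new_cost
                heapq.heappush(frontier, (new_cost, nxt, path + [nxt]))
    return None

graph = {"A": [("B", 1), ("C", 5)], "B": [("C", 1)], "C": []}
print(uniform_cost_search("A", lambda s: s == "C", lambda s: graph[s]))
```

Note that the direct edge A-C is rejected: the detour through B costs 2 instead of 5, which is why the goal test must wait until a node is expanded, not generated.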

3.4.3 Depth-first Search

When the nodes of a search tree are expanded in preorder, the tree gets searched in depth-first order: the deepest node in the current fringe of the search tree gets expanded. When one branch from the root to a leaf has been explored, the search backs up to the next shallowest node that still has unexplored successors. Depth-first search has very modest memory requirements: it needs to store only a single path from the root to a leaf node, along with the remaining unexpanded sibling nodes of each node on the path. Hence depth-first search requires storage of only bm + 1 nodes.

[Figure: depth-first traversal order of a binary tree with nodes A-G: A, B, D, E, C, F, G.]
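Depth-first search differs from the breadth-first sketch only in using a LIFO stack for the frontier. The cycle check along the current path is an extra safeguard I have added for graph-shaped state spaces:

```python
# Depth-first search: the frontier is a stack, so the deepest path is
# always continued first.

def depth_first_search(initial, goal_test, successors):
    frontier = [[initial]]                    # stack of paths
    while frontier:
        path = frontier.pop()                 # deepest path first
        state = path[-1]
        if goal_test(state):
            return path
        for s in reversed(successors(state)): # preserve left-to-right order
            if s not in path:                 # avoid cycles along this path
                frontier.append(path + [s])
    return None

graph = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F", "G"],
         "D": [], "E": [], "F": [], "G": []}
print(depth_first_search("A", lambda s: s == "E", lambda s: graph[s]))
```

Only the current path and the unexpanded siblings along it are stored, which is the modest O(bm) memory use noted above.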

Under the same assumptions as in the previous example, depth-first search would require only 156 kB at depth 16 (instead of 10 EB, i.e. 7 trillion times less). If the search tree is infinite, depth-first search is not complete: the only goal node may always lie in the branch of the tree that is examined last. In the worst case depth-first search also takes exponential time, O(b^m). When m >> d, the time taken by depth-first search may be much greater than that of breadth-first search. Moreover, we cannot guarantee the optimality of the solution that it comes up with.

3.4.4 Depth-limited Search

We can avoid examining unbounded branches by limiting the search to depth l: the nodes at level l are treated as if they have no successors. Depth-first search can be viewed as the special case l = m. When l >= d the search algorithm is complete, but in general one cannot guarantee finding a solution. Obviously, the algorithm does not guarantee finding an optimal solution either. The time complexity is now O(b^l) and the space complexity O(bl).

3.4.5 Iterative Deepening Search

We can combine the good properties of depth-limited search and general depth-first search by letting the value of the depth limit l grow gradually, l = 0, 1, 2, ..., until a goal node is found. In this way we gain a combination of the benefits of breadth-first and depth-first search: the space complexity stays under control because the underlying algorithm is depth-first search, while the gradual growth of the limit guarantees that the method is complete. It is also optimal when the cost of a path is a nondecreasing function of the depth of the node.

3.4.6 Bidirectional Search

If the predecessors of a node are easy to compute, e.g. Pred(n) = S(n), then search can proceed simultaneously forward from the initial state and backward from the goal; the process stops when the two searches meet. The motivation for this idea is that 2 * b^(d/2) << b^d. If the searches proceeding in both directions are implemented as breadth-first searches, the method is complete and leads to an optimal result. If there is a single goal state, the backward search is straightforward, but having several goal states may require creating a new temporary goal state.
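Iterative deepening is depth-limited depth-first search run repeatedly with growing limits. A sketch (the recursive formulation and the safety cap on the limit are my own choices):

```python
# Iterative deepening: run depth-limited DFS with limits l = 0, 1, 2, ...
# until a goal is found.

def depth_limited(state, goal_test, successors, limit, path=None):
    """Recursive depth-first search that stops 'limit' steps below 'state'."""
    path = (path or []) + [state]
    if goal_test(state):
        return path
    if limit == 0:
        return None                    # treat this node as having no children
    for s in successors(state):
        result = depth_limited(s, goal_test, successors, limit - 1, path)
        if result is not None:
            return result
    return None

def iterative_deepening(initial, goal_test, successors, max_limit=50):
    for limit in range(max_limit + 1):
        result = depth_limited(initial, goal_test, successors, limit)
        if result is not None:
            return result
    return None

graph = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F", "G"],
         "D": [], "E": [], "F": [], "G": []}
print(iterative_deepening("A", lambda s: s == "F", lambda s: graph[s]))
```

The shallow levels are re-generated on every iteration, but since the deepest level dominates an exponentially growing tree, the repeated work changes the time complexity by only a constant factor.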

An important complication of the search process is the possibility of expanding states that have already been expanded before. Because of this, a finite state space may yield an infinite search tree, and a solvable problem may become practically unsolvable. Detecting already-expanded states usually requires storing all the states that have been encountered; one needs to do this even in depth-first search. On the other hand, pruning repeated states may lead to missing an optimal path.

[Figure: a repeated state in the 8-puzzle: moving the blank back and forth regenerates the configuration 7 2 4 / 5 _ 6 / 8 3 1.]

Searching with Partial Information

Above we (unrealistically) assumed that the environment is fully observable and deterministic, and moreover that the agent knows what the effects of each action are. The agent can therefore calculate exactly which state results from any sequence of actions, and it always knows which state it is in; its percepts provide no new information after each action. In a more realistic situation the agent's knowledge of states and actions is incomplete. If the agent has no sensors at all, then as far as it knows it could be in one of several possible initial states, and each action might therefore lead to one of several possible successor states.

An agent without sensors must thus reason about sets of states that it might be in, rather than about single states. At each instant the agent has a belief of which states it might be in; hence the search happens in the power set of the state space, which for a state space of n states contains 2^n belief states. A solution is now a path that leads to a belief state all of whose members are goal states. If the environment is partially observable, or if the effects of actions are uncertain, then each new action yields new information, and every possible contingency that might arise during execution needs to be considered.

The cause of uncertainty may also be another agent, an adversary. When the states of the environment and the effects of actions are uncertain, the agent has to explore its environment to gather information. In a partially observable world one cannot determine a fixed action sequence in advance, but needs to condition actions on future percepts. As the agent can gather new knowledge through its actions, it is often not useful to plan for every possible situation in advance; rather, it is better to interleave search and execution.
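Sensorless reasoning over belief states can be made concrete with a small sketch. The two-square vacuum world, the state encoding, and the fixed plan below are illustrative assumptions:

```python
# Sensorless (conformant) reasoning: a belief state is a set of world
# states, and an action maps it to the set of per-state successors.

def predict(belief, action, transition):
    """Apply an action to every state the agent might be in."""
    return frozenset(transition(s, action) for s in belief)

# Toy deterministic transition model; a state is (location, dirt_A, dirt_B).
def transition(state, action):
    loc, a, b = state
    if action == "Suck":
        return (loc, False, b) if loc == "A" else (loc, a, False)
    if action == "Right":
        return ("B", a, b)
    if action == "Left":
        return ("A", a, b)
    return state

# Initially the agent knows nothing: any of the 8 states is possible.
belief = frozenset((loc, a, b) for loc in "AB"
                   for a in (True, False) for b in (True, False))
for action in ["Left", "Suck", "Right", "Suck"]:
    belief = predict(belief, action, transition)

# The fixed plan Left-Suck-Right-Suck reaches a belief state all of whose
# members are goal states (both squares clean), despite having no sensors.
print(all(not a and not b for _, a, b in belief))  # True
```

The belief state shrinks from 8 candidate states to a single one, which is exactly the "path to a belief state all of whose members are goal states" described above.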