Principles of AI Planning

Principles of AI Planning. Dr. Jussi Rintanen, Albert-Ludwigs-Universität Freiburg, summer term 2005.

Course: Principles of AI Planning. Lecturer: Dr Jussi Rintanen (rintanen@informatik.uni-freiburg.de). Lectures: Monday 2-4pm and Wednesday 2-3pm in SR 101-01-009/13; no lecture on May 16 & 18 (Pentecost). Web page: www.informatik.uni-freiburg.de/~ki/teaching/ss05/aip/. Complete lecture notes are made available on the web page as the course proceeds.

Exercises and examination. Exercises assistant: Marco Ragni (ragni@informatik.uni-freiburg.de); exercise session Wednesday 3pm after the lecture (not on May 18: Pentecost). Assignments are given out on Wednesday and returned on Monday. Examination: takes place either in July or in September (exact date to be determined). Grade: 0.85 × exam + 0.15 × exercises.

What is planning? Intelligent decision making: what actions to take?
- a general-purpose problem representation
- algorithms for solving any problem expressible in the representation
- application areas: high-level planning for intelligent robots; autonomous systems (NASA Deep Space One, ...); problem solving (single-agent games like Rubik's cube)
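
To make "general-purpose problem representation" concrete, here is a minimal sketch of a planning problem in a STRIPS-style encoding: facts, actions with preconditions and effects, an initial state, and a goal. All names are invented for illustration.

```python
from typing import NamedTuple

class Action(NamedTuple):
    name: str
    precond: frozenset  # facts that must hold before the action
    add: frozenset      # facts the action makes true
    delete: frozenset   # facts the action makes false

# Toy blocks-world fragment: put the green block onto the blue one.
actions = [Action("move-green-onto-blue",
                  precond=frozenset({"clear(green)", "clear(blue)"}),
                  add=frozenset({"on(green,blue)"}),
                  delete=frozenset({"clear(blue)", "on(green,table)"}))]

init = frozenset({"clear(green)", "clear(blue)", "on(green,table)"})
goal = frozenset({"on(green,blue)"})

def applicable(state, a):
    return a.precond <= state        # all preconditions hold

def apply_action(state, a):
    return (state - a.delete) | a.add

state = init
for a in actions:
    if applicable(state, a):
        state = apply_action(state, a)
print(goal <= state)  # True: the one-step plan achieves the goal
```

Any problem expressible in such a representation defines a transition graph over states, which is what the generic algorithms of the course operate on.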

Why is planning difficult? Solutions to the simplest planning problems are paths from an initial state to a goal state in the transition graph, efficiently solvable e.g. by Dijkstra's algorithm in O(n log n) time. Why don't we solve all planning problems this way? State spaces may be huge: 10^9, 10^12, 10^15, ... states. Constructing the transition graph and then running e.g. Dijkstra's algorithm is not feasible! Planning algorithms try to avoid constructing the whole graph; they are often, but not guaranteed to be, more efficient than the obvious solution method of constructing the transition graph and running e.g. Dijkstra's algorithm.

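To make the baseline concrete, here is a minimal sketch of the obvious solution method the slide refers to: build an explicit transition graph and run Dijkstra's algorithm. The toy graph is invented for illustration; the point is that this only works while the whole graph fits in memory, which it does not for 10^9 or more states.

```python
import heapq

def dijkstra(graph, init, goals):
    """Cheapest action sequence from init to any goal state.
    graph: dict mapping state -> list of (action, successor, cost)."""
    dist = {init: 0}
    parent = {}                       # state -> (predecessor, action)
    queue = [(0, init)]
    while queue:
        d, s = heapq.heappop(queue)
        if s in goals:
            plan = []                 # reconstruct plan via parent pointers
            while s in parent:
                s, a = parent[s]
                plan.append(a)
            return list(reversed(plan)), d
        if d > dist[s]:
            continue                  # stale queue entry
        for action, succ, cost in graph.get(s, []):
            if d + cost < dist.get(succ, float("inf")):
                dist[succ] = d + cost
                parent[succ] = (s, action)
                heapq.heappush(queue, (dist[succ], succ))
    return None                       # no plan exists

# Toy transition graph: trivial here, infeasible with 10**9 states.
graph = {"s0": [("a", "s1", 1), ("b", "s2", 4)],
         "s1": [("c", "s2", 1)]}
print(dijkstra(graph, "s0", {"s2"}))  # (['a', 'c'], 2)
```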

Different classes of problems. The problems are classified along four dimensions:
- actions: deterministic / nondeterministic
- probabilities: no / yes
- observability: full / partial
- horizon: finite / infinite
The main classes are:
1. classical planning
2. conditional planning with full/partial observability
3. Markov decision processes (MDP)
4. partially observable MDPs (POMDP)
One reading of how the dimensions map to the classes is sketched below.

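The following sketch records one common reading of how the first three dimensions combine into the four classes; the slide does not spell the mapping out explicitly, so treat it as an illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ProblemClass:
    nondeterministic: bool  # actions: deterministic vs. nondeterministic
    probabilities: bool     # are transition probabilities part of the model?
    observability: str      # "full" or "partial"

CLASSES = {
    "classical planning":
        ProblemClass(nondeterministic=False, probabilities=False, observability="full"),
    "conditional planning":
        ProblemClass(nondeterministic=True, probabilities=False, observability="partial"),  # or "full"
    "MDP":
        ProblemClass(nondeterministic=True, probabilities=True, observability="full"),
    "POMDP":
        ProblemClass(nondeterministic=True, probabilities=True, observability="partial"),
}
```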

Properties of the world: nondeterminism.
Deterministic world/actions: the action and the current state uniquely determine the successor state.
Nondeterministic world/actions: for an action and a current state there may be several successor states.
Analogy: deterministic versus nondeterministic automata.

Example: moving objects with an unreliable robotic hand: move the green block onto the blue block. [Figure: the two possible outcome states, with probabilities p = 0.1 and p = 0.9.]
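
A sketch of how the action models differ in the type of their transition function: deterministic actions yield one successor, nondeterministic actions a set of successors, probabilistic actions a distribution over successors. The state names and probabilities follow the block example above; everything else is invented.

```python
import random

# Deterministic: each (state, action) has exactly one successor.
def det_step(state, action):
    successors = {("A", "go"): "B", ("B", "go"): "C"}
    return successors[(state, action)]

# Nondeterministic: a *set* of possible successors, no probabilities.
def nondet_step(state, action):
    # The unreliable hand: the block ends up on blue, or drops back.
    return {"green-on-blue", "green-on-table"}

# Probabilistic (MDP-style): a distribution over successors,
# with the figure's probabilities 0.9 (success) and 0.1 (failure).
def prob_step(state, action):
    return {"green-on-blue": 0.9, "green-on-table": 0.1}

# Sampling one probabilistic outcome:
dist = prob_step("initial", "move-green-onto-blue")
outcome = random.choices(list(dist), weights=dist.values())[0]
```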

Properties of the world: observability.
Full observability: observations/sensing allow the agent to determine the current state of the world uniquely.
Partial observability: observations/sensing determine the current state of the world only partially: we only know that the current state is one of several possible ones.
Consequence: it is necessary to represent the knowledge an agent has.
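
Under partial observability this knowledge is naturally represented as a belief state: the set of world states the agent considers possible. A toy sketch with invented state names:

```python
# Belief state: the set of world states the agent considers possible.
belief = {"room1", "room2", "room3"}

# An observation is consistent with only some states; making it
# shrinks the belief state. Under full observability every belief
# state would be a singleton.
def observe(belief, consistent_states):
    return belief & consistent_states

# E.g. a sensor reading that can occur only in room1 or room2:
belief = observe(belief, {"room1", "room2"})
print(belief)  # {'room1', 'room2'}: still not unique, i.e. partial observability
```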

What difference does observability make? [Figure: a scenario with two sensors, Camera A and Camera B, and a goal location.]

Different objectives
1. Reach a goal state. Example: earn 500 euro.
2. Stay in goal states indefinitely (infinite horizon). Example: never allow the bank account balance to be negative.
3. Maximize the probability of reaching a goal state. Example: to be able to finance buying a house by 2015, study hard and save money.
4. Collect the maximal expected rewards / minimal expected costs (infinite horizon). Example: maximize your future income.
5. ...

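Objective 4 is the classic infinite-horizon, discounted MDP setting; the textbook method for computing a policy with maximal expected reward is value iteration. A minimal sketch on an invented two-state MDP:

```python
# Value iteration for maximizing expected discounted reward (objective 4).
# transitions[s][a] = list of (probability, successor, reward).
transitions = {
    "poor": {"work":  [(1.0, "rich", 10.0)],
             "idle":  [(1.0, "poor", 0.0)]},
    "rich": {"spend": [(0.5, "poor", 5.0), (0.5, "rich", 5.0)]},
}
gamma = 0.9  # discount factor for future rewards

V = {s: 0.0 for s in transitions}
for _ in range(1000):  # iterate until (approximately) converged
    V = {s: max(sum(p * (r + gamma * V[s2]) for p, s2, r in outs)
                for outs in transitions[s].values())
         for s in transitions}

# Greedy policy with respect to the converged values:
policy = {s: max(transitions[s],
                 key=lambda a: sum(p * (r + gamma * V[s2])
                                   for p, s2, r in transitions[s][a]))
          for s in transitions}
print(policy)  # {'poor': 'work', 'rich': 'spend'}
```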

Relation to games and game theory. Game theory addresses decision making in a multi-agent setting: assuming that the other agents are intelligent, what do I have to do to achieve my goals? Game theory is related to multi-agent planning; in this course we concentrate on single-agent planning. In certain special cases our techniques are applicable to multi-agent planning: finding a winning strategy of a game (example: chess). In this case it is not necessary to distinguish between an intelligent opponent and a randomly behaving opponent. Game theory in general is about optimal strategies, which do not necessarily guarantee winning. For example, card games like poker do not have a winning strategy.


Prerequisites of the course
1. basics of AI (you have attended an introductory course on AI)
2. basics of propositional logic

What do you learn in this course?
1. Classification of different problems into different classes:
   1. classification according to observability, nondeterminism, goal objectives, ...
   2. complexity
2. Techniques for solving different problem classes:
   1. algorithms based on heuristic search
   2. algorithms based on satisfiability testing (SAT)
   3. algorithms based on exhaustive search with logic-based data structures
Many of these techniques are applicable to problems outside AI as well.