10. Machine Learning in Games

1 Machine Learning and Data Mining
10. Machine Learning in Games
Luc De Raedt
Thanks to Johannes Fuernkranz for his slides

2 Contents
- Game playing
- What can machine learning do?
- What is (still) hard?
- Various types of games
  - Board games
  - Card games
  - Real-time games
- Some historical developments

3 Why Games?
- Games are an ideal environment for testing AI / ML systems
- Progress / performance can easily be measured
- The environment can easily be controlled

4 Machine Learning for Game Playing
- A long history, almost as old as AI itself
- Arthur Samuel
  - Playing checkers (draughts), late 1950s to early 1960s
  - Several interesting ideas and techniques
- Now: Chinook (without learning) is world champion

5 State of the art
- Solved: Tic-tac-toe, Connect Four, Go-Moku
- Endgames: chess (5 pieces), checkers (8 pieces)
- World-champion level: chess, checkers, backgammon, Scrabble, Othello
- Humans still much better: Go, Shogi, Bridge, Poker

6 ML in games
- Learning the evaluation function
  - E.g., for minimax
  - Essentially reinforcement learning
- Discovering patterns
  - Discover characteristic / winning patterns from databases
- Modelling the opponent
  - Given the optimal strategy, find a strategy that better fits the opponent

7 MENACE (Michie, 1963)

8 MENACE (Michie, 1963)
- Learns Tic-Tac-Toe
- 287 boxes (one for every board)
- Beads in 9 colors (one for every position)
- Algorithm:
  - Choose the box that matches the current position
  - Draw a bead from the box
  - Play the corresponding move
- Learning:
  - Lost game -> the drawn beads are withheld (negative reinforcement)
  - Won game -> add an extra bead to each box from which a bead was drawn (positive reinforcement)
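The box-and-bead procedure on this slide can be sketched in a few lines of Python. This is a minimal illustration, not Michie's original design: the number of starting beads (`INITIAL_BEADS`), the refill of an emptied box, and the handling of draws are assumptions that the slides do not specify, and boxes are created lazily instead of being enumerated up front.

```python
import random

INITIAL_BEADS = 2  # assumption: starting beads per legal move (not from the slides)


def legal_moves(board):
    """Indices of empty squares; board is a 9-character string of 'X', 'O', ' '."""
    return [i for i, cell in enumerate(board) if cell == " "]


class Menace:
    def __init__(self, rng=None):
        self.rng = rng if rng is not None else random.Random()
        self.boxes = {}   # board string -> {move index: bead count}
        self.drawn = []   # (board, move) pairs drawn during the current game

    def choose_move(self, board):
        """Pick a move for a board that still has at least one empty square."""
        box = self.boxes.setdefault(
            board, {m: INITIAL_BEADS for m in legal_moves(board)}
        )
        if sum(box.values()) == 0:
            # Assumption: an emptied box is refilled so play can continue.
            for m in box:
                box[m] = INITIAL_BEADS
        moves = list(box)
        # Draw a bead with probability proportional to its count.
        move = self.rng.choices(moves, weights=[box[m] for m in moves])[0]
        box[move] -= 1  # the bead leaves the box until the game ends
        self.drawn.append((board, move))
        return move

    def learn(self, outcome):
        """outcome is 'won', 'lost' or 'draw' (draw handling is an assumption)."""
        for board, move in self.drawn:
            if outcome == "won":
                self.boxes[board][move] += 2  # bead returned plus one extra
            elif outcome == "draw":
                self.boxes[board][move] += 1  # bead returned unchanged
            # 'lost': the drawn bead is withheld
        self.drawn.clear()
```

After a won game every move that was played has one bead more than before, so it is drawn more often; after a lost game it has one bead fewer.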

9 [Figure: MENACE example with X to move: choose box, select bead, take corresponding move]

10 Arthur Samuel's Checkers Player
- Rote learning
  - Learning by heart (memorizing)
- Minimax with alpha-beta pruning
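As a reminder of what minimax with alpha-beta pruning does, here is a minimal sketch on an explicit game tree. It is a generic textbook version, not Samuel's program: the tree encoding (nested lists with numeric leaves) and the `visited` argument used to record which leaves were evaluated are assumptions made for the illustration.

```python
import math


def alphabeta(node, maximizing=True, alpha=-math.inf, beta=math.inf, visited=None):
    """Return the minimax value of node.

    node is either a number (leaf evaluation) or a list of child nodes.
    If visited is a list, every evaluated leaf is appended to it.
    """
    if not isinstance(node, list):
        if visited is not None:
            visited.append(node)
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta, visited))
            alpha = max(alpha, value)
            if alpha >= beta:
                break  # the minimizing player will avoid this branch
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta, visited))
        beta = min(beta, value)
        if alpha >= beta:
            break  # the maximizing player already has a better option
    return value


tree = [[3, 5], [2, 9], [0, 1]]   # max at the root, min one level below
leaves = []
print(alphabeta(tree, visited=leaves))  # 3
print(leaves)                           # [3, 5, 2, 0]; leaves 9 and 1 are pruned
```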

11 Minimax Search / KnightCap

12 Temporal difference learning
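The core of temporal difference learning is the update V(s) <- V(s) + alpha * (r + gamma * V(s') - V(s)), which moves the value of a state towards the reward plus the value of its successor. The sketch below applies this TD(0) rule to a small random walk; the task, the step size, and all function names are assumptions chosen for illustration and are not taken from the slides.

```python
import random


def td0_update(value, reward, next_value, alpha=0.1, gamma=1.0):
    """One TD(0) step: move value towards reward + gamma * next_value."""
    return value + alpha * (reward + gamma * next_value - value)


def td0_random_walk(episodes=2000, alpha=0.05, n_states=5, seed=0):
    """Estimate state values for a symmetric random walk.

    States 0..n_states-1 lie in a row. Stepping off the right end gives
    reward 1, stepping off the left end gives reward 0, and both end
    the episode. The true value of state i is (i + 1) / (n_states + 1).
    """
    rng = random.Random(seed)
    values = [0.5] * n_states
    for _ in range(episodes):
        s = n_states // 2
        while True:
            s_next = s + rng.choice((-1, 1))
            if s_next < 0:                      # left terminal
                values[s] = td0_update(values[s], 0.0, 0.0, alpha)
                break
            if s_next >= n_states:              # right terminal
                values[s] = td0_update(values[s], 1.0, 0.0, alpha)
                break
            values[s] = td0_update(values[s], 0.0, values[s_next], alpha)
            s = s_next
    return values
```

No game outcome is needed before a state is updated: each estimate is corrected from the next estimate, which is what makes the method usable during play.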

13 Backgammon
- Elements of chance
- TD-Gammon (Tesauro)
  - Plays at a very high level
  - Changed the strategies used by human players
- Why does it work?
  - Deep search does not seem to be very useful (due to the random aspects)
  - Situations can be represented compactly using a neural net and a reasonable set of features

14 KnightCap (Baxter et al. 2000)
- Learns chess
  - From 1650 Elo (beginner) to 2150 Elo (master player) in about 300 Internet games
- Improvements with respect to TD-Gammon:
  - Integration of TD learning with search
  - Training against real opponents instead of against itself
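To give a feel for what a gain from 1650 to 2150 Elo means, the standard Elo expected-score formula can be evaluated directly. The formula is the usual one for the Elo rating system; it does not appear on the slides, and the function name is chosen for this illustration.

```python
def expected_score(rating_a, rating_b):
    """Expected score (win = 1, draw = 0.5, loss = 0) of player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


# A 2150-rated player against a 1650-rated player (500 points difference):
print(round(expected_score(2150, 1650), 3))  # 0.947
```

Under this model the trained program would be expected to score roughly 95% against its initial version.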

15 Discovering patterns
- Endgame databases
  - Enormous endgame databases exist for certain combinations of pieces
  - Optimal moves are known (brute force)
  - Known whether positions are won, lost, or drawn, and in how many moves
- Can they be compressed?
  - Are rules + exceptions more compact than the database?
- Can they be turned into simple rules?
  - Can we turn complex optimal strategies into simple but effective ones?
  - Which properties of boards should be taken into account?
  - Relational representations / engineering
  - E.g., Quinlan, Alan Shapiro, Fuernkranz, ...

16 KRK (king and rook versus king): simplest endgame
- Positions won in 0-16 moves
- 2796 different positions
- 18 classes
- Learning classification rules
  - Knowledge, relations
  - 1457 rules, 1003 exceptions
  - Not much gained

17 Relational / Logical representations
- krk(-1,d,4,h,5,g,5)
- Use information such as
  - samediagonal
  - samerow
  - samecolumn
  - attacks( )
  - Etc.
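The relations named on this slide are easy to compute from square coordinates. The sketch below assumes that the arguments of `krk(-1,d,4,h,5,g,5)` are the depth of win followed by file and rank of the white king, white rook, and black king, with -1 standing for a draw; the slide does not spell this out. The helper names (`square`, `adjacent`, `rook_attacks`) are hypothetical.

```python
FILES = "abcdefgh"


def square(file, rank):
    """Convert e.g. ('d', 4) to numeric coordinates (4, 4)."""
    return (FILES.index(file) + 1, int(rank))


def same_row(a, b):
    return a[1] == b[1]


def same_column(a, b):
    return a[0] == b[0]


def same_diagonal(a, b):
    return abs(a[0] - b[0]) == abs(a[1] - b[1])


def adjacent(a, b):
    """True if a king on a could step onto b."""
    return a != b and max(abs(a[0] - b[0]), abs(a[1] - b[1])) == 1


def rook_attacks(rook, target, blocker):
    """True if the rook attacks target and blocker does not stand in between."""
    if same_row(rook, target):
        lo, hi = sorted((rook[0], target[0]))
        return not (same_row(rook, blocker) and lo < blocker[0] < hi)
    if same_column(rook, target):
        lo, hi = sorted((rook[1], target[1]))
        return not (same_column(rook, blocker) and lo < blocker[1] < hi)
    return False


# The position from the slide, under the assumed argument order.
wk, wr, bk = square("d", 4), square("h", 5), square("g", 5)
features = {
    "samerow(wr, bk)": same_row(wr, bk),
    "samecolumn(wr, bk)": same_column(wr, bk),
    "samediagonal(wk, bk)": same_diagonal(wk, bk),
    "attacks(wr, bk)": rook_attacks(wr, bk, wk),
    "adjacent(bk, wr)": adjacent(bk, wr),
    "adjacent(wk, wr)": adjacent(wk, wr),
}
```

With this reading the black king stands next to an undefended rook, which would fit a drawn position, so a rule learner working with such relations can express the outcome far more compactly than with raw coordinates.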

18 Discovering strategies
- Endgames are solved but hard to understand
  - Hard even for grandmasters (KQKR)
  - Many books have been written on endgames
- Goal
  - Find easy-to-understand strategies
  - Perhaps not optimal, but easy to recall and follow

19 Difficult games for computers
- Go?
  - Too many possible moves
  - Too deep a search would be necessary
  - Intractable (a big prize to be won)
- What about endgames?
  - Go endgames (simplified) have been considered (e.g., Jan Ramon)

20 Modelling the opponent
- Key problem in games such as poker, bridge, ...
- For simple games such as rock-paper-scissors, the optimal strategy is known (Nash equilibrium)
  - Optimal: play at random
  - But not optimal against a player who always plays rock
- Modelling the opponent
  - Trying to predict the opponent's move
  - Or which move the opponent expects you to play
  - Key to success for some games
  - Cf. poker (Jonathan Schaeffer)
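The point about the player who always plays rock can be made concrete with a very simple opponent model for rock-paper-scissors. This is a hedged sketch of one possible approach (frequency counting); the class and method names are hypothetical, and real opponent models for poker or bridge are far more involved.

```python
import random
from collections import Counter

MOVES = ("rock", "paper", "scissors")
# COUNTER[m] is the move that beats m.
COUNTER = {"rock": "paper", "paper": "scissors", "scissors": "rock"}


class FrequencyModel:
    """Predict the opponent's next move as their most frequent past move."""

    def __init__(self):
        self.counts = Counter()

    def observe(self, opponent_move):
        self.counts[opponent_move] += 1

    def predict(self):
        if not self.counts:
            return None  # nothing observed yet
        # Ties are broken by the order of MOVES.
        return max(MOVES, key=lambda m: self.counts[m])

    def best_response(self, rng=random):
        predicted = self.predict()
        if predicted is None:
            # Without information, fall back to the random equilibrium strategy.
            return rng.choice(MOVES)
        return COUNTER[predicted]


model = FrequencyModel()
for _ in range(5):
    model.observe("rock")
print(model.best_response())  # paper
```

Against the always-rock player this model wins every round once it has seen a single move, whereas the random equilibrium strategy only wins a third of the time. The price is that a model-based player becomes predictable itself, which is why the equilibrium strategy remains the safe default.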

21 Other types of games
- Adventure games, interactive games, current computer games
- Let's look at some examples

22 Digger (learning to survive)
- A key problem: representing the states; the use of relations is necessary

23 Real-time games
- RoboCup
  - Components can be learned
  - Using RL, e.g., the goalie
- How to tackle those?
- Problems
  - Degrees of freedom
  - Varying number of objects
  - Continuous positions

24 Learning to fly
- Work by Claude Sammut et al.
- Behavioural cloning
  - Trying to imitate the player
- Reinforcement learning
- Layered learning / bootstrapping

25 Financial Games
- Predicting exchange rates
  - Daimler-Chrysler
- Predicting the stock market
- Many models
- Time series!

26 Games and ML
- A natural and challenging environment
- Several successes, a lot still to do
- Ideal topic for a thesis / student research project (Studienarbeit)
- Merry Christmas and Happy New Year!!!
