Machine Learning and Data Mining 10. Machine Learning in Games Luc De Raedt Thanks to Johannes Fuernkranz for his slides
Contents Game playing What can machine learning do? What is (still) hard? Various types of games Board games Card games Real-time games Some historical developments
Why Games? Games - ideal environment to test AI / ML systems Progress / performance can easily be measured Environment can easily be controlled
Machine Learning for Game Playing A long history, almost as old as AI itself. Arthur Samuel: playing checkers (late 1950s, early 1960s). Several interesting ideas and techniques. Today, Chinook (without learning) is world champion.
State of the art Solved: Tic-Tac-Toe, Connect Four, Go-Moku. Endgames: chess (5 pieces), checkers (8). World-champion level: chess, checkers, backgammon, Scrabble, Othello. Humans still much better: Go, Shogi, Bridge, Poker.
ML in games Learning the evaluation function (e.g., for minimax) - essentially reinforcement learning. Discovering patterns - from databases, discover characteristic / winning patterns. Modelling the opponent - given an optimal strategy, find a strategy that better fits the opponent.
MENACE (Michie, 1963)
Learns Tic-Tac-Toe. 287 boxes (one for every board position), 9 colors (one for every square). Algorithm: choose the box matching the position, draw a pearl from the box, make the corresponding move. Learning: lost game -> the drawn pearls are kept out of the boxes (negative reinforcement); won game -> an extra pearl is added to each box from which a pearl was drawn (positive reinforcement).
[Board diagram: X to move. Choose the box matching the position, select a pearl, take the corresponding move.]
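The MENACE scheme above can be sketched in code. This is a minimal, hedged sketch (not Michie's matchbox implementation): boards are 9-tuples of characters, a "box" is a list of move indices standing in for the colored pearls, and the class name `Menace` and its methods are illustrative choices.

```python
import random

class Menace:
    """Sketch of MENACE-style matchbox learning for Tic-Tac-Toe."""

    def __init__(self):
        self.boxes = {}    # board (9-tuple) -> list of move indices ("pearls")
        self.history = []  # (board, move) pairs drawn during the current game

    def choose(self, board):
        # Lazily fill an empty or missing box with one pearl per empty square.
        if board not in self.boxes or not self.boxes[board]:
            self.boxes[board] = [i for i, c in enumerate(board) if c == ' ']
        move = random.choice(self.boxes[board])
        self.history.append((board, move))
        return move

    def learn(self, won):
        for board, move in self.history:
            if won:
                # Positive reinforcement: add an extra pearl of the chosen color.
                self.boxes[board].append(move)
            elif move in self.boxes[board]:
                # Negative reinforcement: the drawn pearl stays out of the box.
                self.boxes[board].remove(move)
        self.history = []
```

Over many games, moves that led to wins accumulate pearls and are drawn more often, while losing moves die out.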
Arthur Samuel's Checkers Player Rote learning: learning by heart (memorizing). Minimax - alpha-beta.
Minimax Search / KnightCap
Temporal difference learning
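The core of temporal difference learning for evaluation functions is the TD(0) update, V(s) <- V(s) + alpha * (r + gamma * V(s') - V(s)): each position's value is nudged toward the reward plus the value of the next position. A minimal tabular sketch (TD-Gammon uses a neural network instead of a table, and the state names here are made up):

```python
from collections import defaultdict

def td0_update(V, s, reward, s_next, alpha=0.1, gamma=1.0, terminal=False):
    """One TD(0) step: move V[s] toward reward + gamma * V[s_next]."""
    target = reward + (0.0 if terminal else gamma * V[s_next])
    V[s] += alpha * (target - V[s])
    return V[s]

# Usage: sweep over one observed game trajectory (final reward 1 = win).
V = defaultdict(float)
trajectory = [('s0', 0, 's1'), ('s1', 0, 's2'), ('s2', 1, None)]
for s, r, s_next in trajectory:
    td0_update(V, s, r, s_next, terminal=s_next is None)
```

After one game only the final position moves (to 0.1 here); over many games the win signal propagates backwards to earlier positions.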
Backgammon Elements of chance. TD-Gammon (Tesauro): very high level of play; even led to changes in the strategies of human players. Why does it work? Deep search does not seem very useful (due to the random aspects), and situations can be compactly represented using a neural net and a reasonable set of features.
KnightCap (Baxter et al. 2000) Learns chess. From 1650 Elo (beginner) to 2150 Elo (master level) in ca. 300 Internet games. Improvements w.r.t. TD-Gammon: integration of TD-learning with search; training against real opponents instead of against itself.
Discovering patterns Database endgames Enormous endgame databases exist for certain combinations of pieces: optimal moves are known (computed by brute force), and it is known whether positions are won, lost, or drawn, and in how many moves. Can they be compressed - are rules + exceptions more compact than the database? Can they be turned into simple rules? Can we turn complex optimal strategies into simple but effective ones? Which properties of boards should be taken into account? Relational representations / feature engineering (e.g., Quinlan, Alan Shapiro, Fuernkranz).
KRK: the simplest endgame. 25620 positions, won in 0-16 moves; 2796 different positions, 18 classes. Learning classification rules (with knowledge, relations): 1457 rules, 1003 exceptions - not much gained.
Relational / Logical representations krk(-1,d,4,h,5,g,5) Use information such as samediagonal, samerow, samecolumn, attacks( ), etc.
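Predicates like those above are easy to compute from square coordinates. A hedged sketch, assuming squares are represented as (file, rank) pairs such as ('d', 4); the function names mirror the slide's predicates, and `king_attacks` is an illustrative instance of attacks( ):

```python
def samerow(a, b):
    """Two squares share a rank, e.g. ('d', 4) and ('h', 4)."""
    return a[1] == b[1]

def samecolumn(a, b):
    """Two squares share a file, e.g. ('d', 4) and ('d', 8)."""
    return a[0] == b[0]

def samediagonal(a, b):
    """Distinct squares on a common diagonal: equal file and rank distance."""
    df = abs(ord(a[0]) - ord(b[0]))
    dr = abs(a[1] - b[1])
    return df == dr and df != 0

def king_attacks(a, b):
    """A king on square a attacks square b iff the squares are adjacent."""
    df = abs(ord(a[0]) - ord(b[0]))
    dr = abs(a[1] - b[1])
    return max(df, dr) == 1
```

A rule learner can then use these relations as background knowledge instead of raw coordinates, which is what makes the KRK rules compact and human-readable.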
Discovering strategies Endgames are solved but hard to understand - hard even for grandmasters (e.g., KQKR); many books have been written on endgames. Goal: find easy-to-understand strategies - perhaps not optimal, but easy to recall and follow.
Difficult games for computers Go? Too many possible moves; the necessary search would be too deep - intractable (a big award to be gained). What about endgames? Simplified Go endgames have been considered (e.g., by Jan Ramon).
Modelling the opponent Key problem in games such as poker, bridge, ... For simple games, the optimal strategy is known (Nash equilibrium). Optimal: play randomly - but that is not optimal against a player who always plays rock. Modelling the opponent: trying to predict the opponent's move, or which move the opponent thinks you will play. Key to success for some games, cf. poker (Jonathan Schaeffer).
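The rock-paper-scissors point above can be made concrete with a minimal frequency-based opponent model (a sketch; the class and method names are illustrative): the Nash strategy plays uniformly at random and only breaks even, while counting the opponent's past moves and countering the most frequent one exploits any biased player.

```python
import random
from collections import Counter

# Which move beats which in rock-paper-scissors.
BEATS = {'rock': 'paper', 'paper': 'scissors', 'scissors': 'rock'}

class FrequencyModel:
    """Predict the opponent's most frequent move and play its counter."""

    def __init__(self):
        self.counts = Counter()

    def observe(self, opponent_move):
        self.counts[opponent_move] += 1

    def best_response(self):
        if not self.counts:
            # No data yet: fall back to the Nash strategy (uniform random).
            return random.choice(list(BEATS))
        predicted = self.counts.most_common(1)[0][0]
        return BEATS[predicted]
```

Against the "always rock" player, the model converges to playing paper every round and wins almost every game - strictly better than the Nash strategy's expected draw.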
Other types of games Adventure games, interactive games, current computer games. Let's look at some examples.
Digger (learning to survive) A key problem: representing the states - the use of relations is necessary.
Real-time games RoboCup. Components can be learned using RL, e.g. the goalie. How to tackle these? Problems: degrees of freedom, varying number of objects, continuous positions.
Learning to fly Work by Claude Sammut et al. Behavioural cloning Trying to imitate the player Reinforcement learning Layered learning / bootstrapping
Financial Games Predicting exchange rates Daimler-Chrysler Predicting the stock market Many models Time series!
Games and ML A natural and challenging environment. Several successes, a lot still to do. Ideal topic for a thesis / study project (Studienarbeit). Merry Christmas and Happy New Year!!!