Knowledge and Strategy-based Computer Player for Texas Hold'em Poker


Ankur Chopra
Master of Science
School of Informatics
University of Edinburgh
2006

Abstract

The field of Imperfect Information Games has interested researchers for many years, yet it has failed to produce computer players that are competitive at the master level in some of the more complex card games. This project studies the game of Poker and presents two computer poker players, Anki V1 and Anki V2, as solutions to the gaming problem. These players, along with a few generic ones, were created using methods ranging from Expert Systems to Simulation and Enumeration. Anki V1 and Anki V2 were tested against a range of hard-coded computer players and a variety of human players, leading to the conclusion that Anki V2 displays behaviour at the intermediate level of human players. Finally, many interesting conclusions regarding poker strategies and human heuristics are observed and presented in this thesis.

Acknowledgments

I would like to thank Dr. Jessica Chen-Burger for her overwhelming support and help throughout the life-cycle of this project, and for the late nights she spent playing my Poker Players. I would also like to thank Mr. Richard Carter for his insight into the workings of some of the Poker players, and all the authors of the research quoted in my bibliography, especially the creators of Gala, Loki, Poki and PsOpti. I would also like to thank my parents, who have always been there for me and who inspire me every step of the way. And finally, I would like to acknowledge the calming contribution of my lab-fellows, without whom completing this dissertation couldn't have been nearly this much fun.

Declaration

I declare that this thesis was composed by myself, that the work contained herein is my own except where explicitly stated otherwise in the text, and that this work has not been submitted for any other degree or professional qualification except as specified. (Ankur Chopra)

Table of Contents

Chapter 1 Introduction
Chapter 2 Literature Review: Imperfect Information games; Poker history; Gala; Loki; Poki; PsOpti
Chapter 3 Playing Poker: Basic rules and aim of tournament; Sequence of each game (specific to Texas Hold'em); Betting Rounds; Winning combinations; Basic player/strategy types; Advanced strategies and Poker complexity; Abstraction Techniques of 2-Person Bet-Limit Poker
Chapter 4 Design and Methodology: Choice of Prolog General Incorporation of Rules Basic Two-Human-Player Poker General Strategic Behavior Anki V Strategy based player Overview of functioning and method Probability realisation of all possible starting hands Grouping form of strategy Similarities and differences from human beings

4.4 - Anki V A Randomised Rational Strategy Player: Statistical Method vs. Random Generator Formulas and Evaluation Pseudo games and Winning Potential Calculation of Probability Triples Betting Strategies, Randomised Numbers and Enumeration
Chapter 5 Testing and Evaluation: System Test White and Black Box Testing Random -1 player's Evaluation Evaluation of Anki V Anki V1 vs. Computer players Anki V1's Evaluation against Human Players Evaluation of Anki - V Anki - V2 vs. Computer players Anki V2's Evaluation against vs. Human Anki V1 vs. Anki V Direct Anki Comparison Anki Comparison in Human and Random Players Anki and the Previous Research
Chapter 6 Conclusion and Future Work: General conclusions Conclusions of Anki V1 and Anki V Conclusions of Poker Game and Betting Strategies Future Work The Anki Poker Teaching Tool Testing and Extensions to project Resource based extensions to project
Bibliography
Appendix A Program Code

Chapter 1 Introduction

Poker has recently become one of the most popular games in the gaming community, with many online poker rooms and programs. Despite the high level of human interest in this game, the computer programs and existing AI for it are still in their infancy. A very interesting and influential paper on Imperfect Information Games was written in this field in 1995 by Daphne Koller [1]; after more than a decade of research and advancements, the best Poker programs are still known to lose regularly to master level players.

The concept of Imperfect Information Games makes this field and project very significant. The AI community has made great strides in creating world champion level players for Perfect Information Games, e.g. chess, backgammon, etc. The approach to these games is very different from that for Imperfect Information Games, because in a Perfect Information Game all players can view the entire state of the game or gaming world at any point of time. This allows all the required information to be coded into the AI player's game design. On the other hand, extensive work done in the Imperfect Information field has only yielded either theoretical solutions or restricted Poker players. The statistically best Poker player, PsOpti, is considered better than most human players, but is still below the master level. PsOpti also only works with a restricted form of Texas Hold'em Poker, a variant of poker, optimised for 2 players.

Imperfect Information Games are considered hard because they require players, human and AI, to cope with uncertainty, take risks and deploy strategies. Also, in such games there is no concept of a strict strategy, whereby a method could be found which would always lead to the optimal result. A strict strategy in most Imperfect Information Games, including poker, is considered suboptimal, as it provides additional information to the opponent.
Thus, in addition to playing well with good strategies, an aspect of randomness needs to be incorporated into the player's strategy. It is sometimes beneficial to play a bad game, to encourage the opponent to bet more in future games, and thus win more in the long run. This project aimed to create Texas Hold'em Poker players in Prolog based on the concepts of Knowledge Base and Strategy. Through its exploration of Imperfect Information Games, the project offers useful insight to the AI gaming community. Logic is one of the best languages to

represent a Knowledge Base, and thus Prolog has been chosen as the language for implementing this project. A variety of computer Poker players were created, including Random 1, Random 2, Anki V1 and Anki V2. The players created in this project are tested against many pre-coded Strict Strategy computer players, and also against various categories of Human Players, ranging from Beginner to Advanced. A comparison to previous research has also been done, all of which is documented and presented in the form of results and conclusions.

The next chapter, Chapter 2, presents the Literature Review on the subjects of Imperfect Information Games and previous Computer Poker Player solutions to the Poker gaming problem. It also provides in-depth information about three of the best and most well-documented players in the field, i.e. Loki, Poki and PsOpti.

Chapter 3, Playing Poker, deals with the Poker game and Texas Hold'em in particular. The game, its rules, sequence of play and winning or losing conditions are presented along with the basic and complex strategies used by Human Players during the game. The abstraction techniques utilised in this program, i.e. those of a Bet-Limit Two-Player Texas Hold'em Poker Player, are explained in Section 3.7.

Chapter 4, Methodology, presents the design and structure of the system and the computer players created. The system architecture is presented along with the strategies and specifications of Random 1, Random 2, Anki V1 and Anki V2.

The testing documentation, results and evaluation of Anki V1 and Anki V2 are presented in Chapter 5, Testing and Evaluation. Both these players are compared against the strict or random strategy players explained in Chapter 4, against Human Players and also against each other. They are finally compared to the previous research and Computer Poker Players created in the field. The project is summed up in the final chapter, Chapter 6, Conclusions and Future Work.

Chapter 2 Literature Review

Game theory has been a very prominent area of research since the 1940s. It has been utilised and successfully applied to various other fields such as "...inspection of nuclear facilities, models of biological evolution, the FCC auctions for radio and television bandwidth, and much more." [1] It has also helped accomplish big advances in the AI gaming community. AI player capabilities have reached an extent whereby games such as Chess and Backgammon are no longer under human dominance. Computer players compete against grandmasters in such games and win on a very frequent basis. Other games with fewer strategic demands can no longer be won by humans at all when playing against a competent computer player.

It is, however, found that AI dominance exists only in one section of the gaming field. The games mentioned above and others like them are called Perfect Information Games. These are games in which the entire state of the game is visible to all players; that is, all players of the game have equal knowledge. This complete knowledge of the game state allows brute-search algorithms to compute scenarios and possible future moves. Success has been achieved in this field through improvement in the speed of searches; for example, Deep Blue searched over 250 million chess positions a second to allow it to make the optimal move.

The other section of the gaming field lies in Imperfect Information Games. Examples of these are Bridge, Poker and RoShamBo (also known as Stone-Paper-Scissors). These are games that only allow partial world knowledge to be available to players, and require them to base decisions on that uncertain knowledge. In the case of poker, or other similar card games, imperfect information is created by the hidden cards that the opponent holds. Similarly, as your cards are hidden, your opponent too has imperfect information.
"It is well-known in game theory that the notion of a strategy is necessarily different for games with imperfect information." [1]
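RoShamBo, mentioned above, gives a compact way to see why strategy behaves differently under imperfect information. The sketch below is only an illustration, not something taken from [1] or from this project: the names `payoff` and `best_response_value` are hypothetical, and payoffs are assumed to be +1 for a win, 0 for a draw and -1 for a loss. It computes the best expected payoff an opponent can achieve once a player's strategy is known.

```python
# Illustrative sketch (assumed payoffs: win +1, draw 0, loss -1).
MOVES = ("rock", "paper", "scissors")
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}


def payoff(mine, theirs):
    """Payoff to the player choosing `mine` against `theirs`."""
    if mine == theirs:
        return 0
    return 1 if BEATS[mine] == theirs else -1


def best_response_value(opponent_mix):
    """Highest expected payoff obtainable against a known mixed strategy.

    `opponent_mix` maps each move to the probability that it is played.
    """
    return max(
        sum(prob * payoff(mine, theirs) for theirs, prob in opponent_mix.items())
        for mine in MOVES
    )


always_rock = {"rock": 1.0, "paper": 0.0, "scissors": 0.0}
uniform = {move: 1 / 3 for move in MOVES}

exploit_fixed = best_response_value(always_rock)   # the fixed strategy is fully exploited
exploit_uniform = best_response_value(uniform)     # the randomised strategy gives nothing away
```

Under these assumptions a player who always plays the same move can be beaten every time once the opponent knows it, whereas the uniformly random player cannot be exploited at all.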

Perfect Information Games have a well-defined optimal move, i.e. there is always a move which is at least as good as any other move. In addition to this, if the opponent gains knowledge of this move, it would make no difference to the current play, as it is, by definition, the optimal move. This property is exploited by search algorithms to find that optimal move in the game tree.

2.1 - Imperfect Information games

"Although game-tree search works well in perfect-information games, there are problems in trying to use it for imperfect information games such as bridge (and poker). The lack of knowledge about the opponents' possible moves gives the game tree a very large branching factor, making the game tree search infeasible." [4]

Imperfect Information Games are played with a constant knowledge-gap between players, with both having partial knowledge of the game state. Unlike Perfect Information Games, a deterministic strategy cannot be utilised in these games, as that would give the opponent the advantage of being able to guess the complete state of the game. For this reason, optimal strategies in Imperfect Information Games can be described as a random combination of strategies with optimal evaluation of the partially known game state.

"Kuhn ([19]) has shown for a simplified poker game that the optimal strategy does, indeed, use randomization." [1]

2.2 - Poker history

It is evident from the previous citation that people have been interested in computer poker players for a very long time. Apart from Kuhn, illustrious mathematicians and economists have also worked on theoretical and practical solutions for poker. This research field dates back to John von Neumann and John Nash, who used simplified poker to illustrate fundamental principles [16][17][18]. With such a long history of focus on Imperfect Information Games, especially Poker, it would be expected that the current computer players would at least be of a commendable standard. However, this is not the case.
The most prolific players have all been created under the same umbrella of Darse Billings' work. Also, the best full-game Poker players are of intermediate quality at best. The

creators of these programs confess to this fact in [10][13]. Their latest player, PsOpti, is shown to perform well at the master level, but only after a variety of changes to game-play, for example restricting the game to two-person play. Most games of poker see between 8 and 10 players playing a single game (full-game Poker). There is now a need for Imperfect Information Game computer players to catch up with their Perfect Information Game counterparts and offer a similar challenge to the human gaming community. This is the purpose for which this project was proposed and evaluated.

Over the previous two decades, a variety of solutions have been offered to the problem of creating a computer Poker player. These solutions can be sub-divided into the following four categories: Theoretical, Expert Systems, Simulation and Enumeration and, finally, Game-Theoretic. These terms are explained in detail below, along with a case study of the most prominent example of each.

2.3 - Gala

Gala was created by Daphne Koller and Avi Pfeffer, and first documented in [1]. It offers a theoretical solution to a restricted form of Poker. Once again, a two player system was evaluated, to which Gala, a logic programming language, was applied. Even though it was only a theoretical solution, Gala wrote down the rules and constraints of Poker play in pseudo code form for the first time. This allowed future programmers, such as Darse Billings and the author of this project, to gain insight into the sequential methods and workings of computer Poker players. Darse Billings, one of the major contributors to modern computer Poker players, frequently refers to both [1] and [5], the works of Koller and Pfeffer. Figure 1 shows a small portion of the code designed for the Poker player. Gala is "... a knowledge representation language that allows a clear and concise specification of imperfect information games." [1]

In addition to the language, Gala also provided algorithms that prevented exponential growth of the game-trees with Imperfect Information.

Figure 1. An abbreviated Gala description of Poker, from [1]

However, as the game-trees were still found to be proportional to the unknown states in the game, the algorithm was found to be impractical for application to full-scale poker. The authors concluded, "... we are nowhere close to being able to solve huge games such as full-scale poker, and it is unlikely that we will ever be able to do so" [1]

This statement of Koller and Pfeffer has been repeated many times in the various papers by Darse Billings, e.g. [10].

2.4 - Loki

Loki was one of the first ever full-game computer poker players. It was developed by Darse Billings et al., and is documented in [11]. It was created in two stages. Both versions have an expert knowledge and rule-based engine hard-coded into them, i.e. the experience of Darse Billings as a master poker player facilitated the formation of a form of game tree. This game tree consisted of scenarios and strategies suggested by the expert, and thus the game engine could brute-search through these strategies to obtain a near-optimal or random solution.

The first version of the program relied solely on this expert knowledge to play against other players. It is arguable whether this was truly an AI player or just the code representation of a particular master's play. Strategies were given priority rankings and were randomly selected using weights, which allowed the player's behaviour to be relatively random. The first version, created in 1998, was shown to perform well solely against beginners, as even with the expert system, certain situations resulted in near-deterministic behaviour. As discussed earlier, any form of fixed strategy can be used by an opponent to their advantage, and this occurred with Loki when it played against more experienced players. Figure 2 shows the basic Loki Architecture, and its functions.
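The weighted random selection of strategies described above can be sketched as follows. This is a minimal illustration and not Loki's actual code: the strategy names and priority weights are hypothetical, and the standard-library call `random.choices` is assumed as the selection mechanism.

```python
import random

# Hypothetical strategies with expert-assigned priority weights (not from Loki).
PRIORITIES = {"value_bet": 5, "slow_play": 2, "bluff": 1}


def choose_strategy(weighted_strategies, rng=random):
    """Pick one strategy at random, with probability proportional to its weight."""
    names = list(weighted_strategies)
    weights = [weighted_strategies[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


if __name__ == "__main__":
    print(choose_strategy(PRIORITIES))
```

With these assumed weights the highest-priority strategy is chosen most often, but not always, which is what keeps the player from being fully deterministic. If one weight dominates the others, the behaviour again becomes close to fixed, which matches the weakness noted for the first version of Loki.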

Figure 2. Loki Architecture

The second version, updated in 1999, added more computing concepts to the player. It still, however, kept the core expert engine:

"Loki uses expert knowledge in four places:
1. Pre-computed tables of expected income rates are used to evaluate its hand before the pre-flop, and to assign initial weight probabilities for the various possible opponent hands.
2. The Opponent Modeler applies re-weighting rules to modify the opponent hand weights based on the previous weights, new cards on the board, opponent betting actions, and other contextual information.
3. The Hand Evaluator uses enumeration techniques to compute hand strength and hand potential for any hand based on the game state and the opponent model.
4. The Bettor uses a set of expert-defined rules and a hand assessment provided by the Hand Evaluator for each betting decision: fold, call/check or raise/bet." [13]

In the above statement, there is a mention of 'hand strength' and 'hand potential'. Hand Strength is the probability that the cards held by a player are the best possible cards in the game. On the other

hand, Hand Potential refers to the probability of the current cards becoming the best cards in the game in the future. These two terms together form the basis of a Hand Evaluator.

Major changes were made to the hand evaluator. Figure 3 shows the transformation of Loki from one form to another, and its reduction of dependence on expert knowledge. Another major update was the introduction of simulation to determine the best future action. This is explained in detail under the next sub-heading, Poki, as the transformation from Loki 2 to Poki was almost immediate.

Figure 3. Transformation from Loki 1 to Loki 2

Loki offered some useful insight into the design of the player created in this project. The most prominent of these was Bucketing, which was incorporated in Loki 2 and is present in Anki V1. Bucketing is a method of abstraction whereby scenarios or game states are bunched together in groups. These groups are based on the expected reaction of the player; for example, in poker, hands of King-King and Ace-Ace can be grouped together as they are nearly as good as each other. More poker glossary and information is available in Chapter 3.

Another important concept introduced in Loki 2 was the usage of Triples to represent probability values for fold, bet and raise. The definition of these terms can be found in Chapter 3, but they are basically the options available to a player during a betting round of a Poker game. This concept has also been incorporated into Anki V2.
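A probability triple can be turned into a concrete betting action by drawing a single random number. The sketch below assumes the triple is ordered (fold, bet, raise) and sums to 1; the function name `sample_action` and the example values are hypothetical, and this is not presented as the method used by Loki 2 or Anki V2.

```python
import random


def sample_action(triple, rng=random):
    """Choose 'fold', 'bet' or 'raise' according to a probability triple.

    `triple` is assumed to be (p_fold, p_bet, p_raise) with the three
    non-negative values summing to 1.
    """
    fold_p, bet_p, raise_p = triple
    if min(triple) < 0 or abs(sum(triple) - 1.0) > 1e-9:
        raise ValueError("triple must be non-negative and sum to 1")
    roll = rng.random()          # uniform in [0, 1)
    if roll < fold_p:
        return "fold"
    if roll < fold_p + bet_p:
        return "bet"
    return "raise"


if __name__ == "__main__":
    # Hypothetical triple for a fairly strong hand: rarely fold, often raise.
    print(sample_action((0.05, 0.35, 0.60)))
```

Because the action is sampled rather than fixed, the same hand in the same situation does not always produce the same decision, which is the randomness that Section 2.1 argues is required.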

2.5 - Poki

Poki, the step up from Loki, introduced the strategy of Simulation and Enumeration to computer poker players. It was created, once again, by Darse Billings et al. [10]. Poki is also currently the best full-game player of Texas Hold'em Poker, the variant of poker under consideration in this project. Once again, more information regarding Texas Hold'em is available in Chapter 3.

"Poki supports a simulation-based betting strategy. It consists of playing out many likely scenarios, keeping track of how much each decision will win or lose. Every time it faces a decision, Poki invokes the Simulator to get an estimate of the expected value (EV) of each betting action.... A single trial consists of playing out the hand from the current state of the game through to the end. Many trials produce a full information simulation." [10]

Enumeration, on the other hand, refers to the updating of the current belief or partial information state with data received through interaction with the outside world. In the case of any card game, this can be implemented by increasing the apparent chance that the opponent holds good cards if the opponent continues to bet excessive money during the game. This requires some form of opponent modeler, which tries to guess an opponent's hand or strategy from the opponent's actions.

Figure 4 shows the architecture of Poki. It can be seen that it is very similar to that of Loki 2 in Figure 3. The most significant upgrade, which resulted in a different name, concerned the opponent modeling.
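The belief update described above can be illustrated with a simple multiplicative re-weighting. This is a sketch under stated assumptions, not Poki's re-weighting rules: the hand classes, the prior weights and the likelihood of a raise for each class are all hypothetical numbers, and the function name `reweight` is introduced only for this example.

```python
def reweight(weights, likelihood):
    """Scale each hand-class weight by how likely the observed action is
    for that class, then normalise so the weights sum to 1."""
    updated = {hand: w * likelihood[hand] for hand, w in weights.items()}
    total = sum(updated.values())
    if total == 0:
        raise ValueError("observed action is impossible under every hand class")
    return {hand: w / total for hand, w in updated.items()}


# Hypothetical prior belief about the opponent's hand class.
prior = {"strong": 0.2, "medium": 0.3, "weak": 0.5}
# Hypothetical chance that the opponent raises when holding each class.
raise_likelihood = {"strong": 0.8, "medium": 0.4, "weak": 0.1}

after_one_raise = reweight(prior, raise_likelihood)
after_two_raises = reweight(after_one_raise, raise_likelihood)
```

With these assumed numbers, each observed raise shifts weight towards the strong class, which corresponds to the idea that continued heavy betting increases the apparent chance of good cards with the opponent.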

Figure 4. Poki's Architecture

In spite of the above upgrades to the AI player, Poki was still found to be inadequate in certain situations: "The program's weaknesses are easily exploited by strong players, especially in the 2-player game." [2] So the Computer Poker Community managed to increase their level of play, but they were still below par in comparison to good human players.

Anki V2, the system developed in this thesis, also uses the concepts of Simulation and Enumeration, in addition to a Triple Generator, to decide on its actions during a betting round. This is further explained in Chapter 4, Design and Methodology.

2.6 - PsOpti

PsOpti is the most recent player in the market, created once again by Darse Billings et al. in 2003 [2]. It is a player based on game-theoretic strategies, using concepts of game theory. Ironically, the approach is quite similar to Gala, but only at the most fundamental level. PsOpti is the first game-theoretic AI player that was successfully tested against a master poker player, and it was usually found to compete more closely with the master than any of the previous computer players.

"A game-theoretic solution produces a sophisticated strategy that is well-balanced in all respects. It is also safe and "robust", because it is guaranteed to obtain the theoretical (optimal) value of the game against *any* other strategy." [1]

This value allows the player to play a pseudo-optimal game throughout its tournament. As it obtains the value against any strategy, this also holds true for randomised, aggressive and bluffing gameplay, all of which exist to a very high extent in master level poker play. Figure 5 shows PsOpti's performance against a master level player, 'the count'. The unit of measure used here is 'small bets won', which in this case can basically be read as tens of dollars.

Figure 5. the count's performance against PsOpti

However, this player is severely restricted in its game play.

"... abstraction techniques are combined to represent the game of 2-player Hold'em, having size O(10^18), using closely related models each having size O(10^7)." [2]

Some of these abstraction techniques remove the possibility of PsOpti being utilised in the full-game field in its current state. For example, one of the abstractions removes one of the betting rounds from the game, and another restricts the play to a two-player game. A two-player game can be viewed as a simplified version of a full game, wherein up to ten people can be playing at the same time.

This can still be seen as a great beginning, as the abstractions are mostly computational reductions, and such players can easily be scaled up once they have proven their might.

"The drawback is that this type of strategy is fixed -- it can't adapt to the style of a particular opponent. Although it will break even against any opponent, it might only win at a slow rate against a weak or predictable player." [1]

Another manner of interpreting this statement is that the game-theoretic player evaluates its current state and the state of the game to try to obtain a near-optimal strategy. It has no opponent modeling and thus cannot take advantage of human inexperience or mistakes. Opponent Modeling cannot be expressed in the same manner as a game-theoretic approach; thus the creators of PsOpti are currently looking into a player which plays game-theoretically until it develops enough knowledge about an opponent to switch to more sub-optimal strategies. These are strategies which do not necessarily give the best result against an optimal player, but are the expertly determined best response to the manner in which the opponent is playing.

Many of the abstractions used in PsOpti's creation are also being used to create the program of this project. This is because these abstractions offer a smaller set of constraints to satisfy, which is valuable in a time-constrained project such as this. Also, PsOpti's creation shows that these abstractions are mere computational relaxations of the rules, and offer a faster cycle of design, prototyping and results which are still relevant to full-game analysis. The next chapter introduces the game of poker, Texas Hold'em, and its strategies and complexity.

Chapter 3 Playing Poker

3.1 - Basic rules and aim of tournament

Poker is a game played with a normal deck of 52 cards. The cards are divided into four sets of thirteen cards, with each set having a distinctive symbol. These sets are called suits, and are named spades, clubs, diamonds and hearts. The first two suits are usually represented in black, and the latter two in red. Each suit has cards from two to ten, followed by Jack, Queen, King and Ace, in order of value.

The aim of a poker game is to obtain the best possible pattern with a 'hand' of 5 cards. 'Hand' has two definitions in poker: it refers to the final set of cards available to a player to form his/her pattern, and also to the initial cards that are provided to each player and are hidden from the rest. This project will use the word in both contexts, but the difference can be found through the use of the words 'final hand' and 'initial hand'.

A game in Poker is defined as the entire sequence of play from receiving the initial hand to the point where someone 'folds' (forfeits the current money in the pot) or a showdown determines which of the players has the strongest final hand. A game starts with a couple of blinds, which are small amounts of money put into the game by two players without looking at their individual initial hands. This allows each game to be worth at least some amount of money, and encourages more aggressive strategies by players. The game ends with the last remaining player, or the strongest player, taking all the money in the 'pot' after the various betting rounds.

A poker tournament consists of various players who start with equal money. As the games proceed, any player who runs out of money is said to have been eliminated and leaves the table. The last person left at the table, holding all the money, is the winner of that tournament. The basic aim of poker can be described as "Win as much as you can, but when you are about to lose, lose as little as you can."
Quite clearly the statement portrays the need for evaluation and deception.

3.2 - Sequence of each game (specific to Texas Hold'em)

"... the game of Texas Hold'em, the poker variation used to determine the world champion in the annual World Series of Poker. Hold'em is generally considered to be the most strategically complex poker variant that is widely played in casinos and card clubs. It is also convenient because it has particularly simple rules and logistics." [10]

The popularity of Poker has already been discussed in Chapter 1, so the choice of Texas Hold'em should be obvious, as it is seen as the most popular form of the game. Texas Hold'em starts with all the players receiving a two-card initial hand. These cards are hidden from all the players bar the one to whom they are dealt. With deception or the future potential of the cards in mind, a betting round is held. The exact semantics of a betting round are explained in the next sub-section. At the end of the betting round, three community cards are dealt face-up. In computer play, this is represented by making the knowledge of the three cards available to both players. These three cards are called the 'Flop'.

The Flop is followed by another round of betting, at the end of which another community card is shown. This card is known as the 'Turn'. This is followed by another round of betting and then another community card, called the 'River', is revealed. The final betting round takes place afterwards, and if more than one person remains till the end, the seven cards are checked for the best hand of five. The winner takes all the money collected in the betting rounds of that game.

3.3 - Betting Rounds

The aim of the betting round is to collect equal money from all players before proceeding to the next stage, be it the revealing of community cards or a showdown. The options available to a player at any point in the round are fold and bet. Folding results in immediate forfeit of the money in the pot.
Bad as it may sound, it is usually better to fold a hand which you are quite certain would not win, rather than lose money on it. Betting puts in a 'bet amount' into the pot. In this project this bet-amount is fixed at 10 units of money. There is another option available to the player, i.e. to check. Checking can be done when

there has been no money put down at the start of a betting round. So, at the start of any betting round, the person with the first turn has the option to check, and after that the second person has the option to check, bet or fold. The final option available to a player is that of a raise. Raising puts 20 units of money into the pot and is almost the opposite of checking, as it is allowed only when there has already been a bet on the table. Also, raising is restricted to a maximum of three times per player.

Betting rounds can follow many patterns from the choices available above. For a two player game, an example is both players checking, in which case neither player puts any money into the pot. Another is that of both players betting, whereby both put in 10 units each. Certain complex betting rounds can also occur, such as check, bet, raise, raise, bet. Here, the first player checked, only to have the second player bet (10 units). The first player now has the option of betting, folding or raising, and chooses the last (20 units). The second player re-raises (a further 20 units, 30 in total) and finally the first player bets (a further 10 units) to bring the contribution of both players at the end of the betting round to 30 units each. A betting round terminates as soon as a person folds, in which case the game ends, or when a person matches the opponent's contribution to the pot by checking or betting. Raising is usually used to 'raise the stakes'.

3.4 - Winning combinations

The strength of a winning five-card hand is determined by the table shown in Figure 6. The topmost 'Royal Flush' is considered the best hand in the game, whereas a 'High Card' is the worst. When determining a winner at the showdown, the player with the pattern which is highest on the table in Figure 6 wins.
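The showdown rule can be sketched in a few lines. The code below is an illustration only, not the Prolog evaluator of this project: it assumes the standard poker ordering from Royal Flush down to High Card for the table in Figure 6, and an assumed card encoding of rank followed by suit (for example "Kh" for the King of hearts). The function names `rank_five`, `best_hand` and `showdown` are hypothetical.

```python
from collections import Counter
from itertools import combinations

RANKS = "23456789TJQKA"  # assumed encoding: rank then suit, e.g. "Kh", "Tc"


def rank_five(cards):
    """Score a five-card hand as (category, tie-break values); higher is better."""
    values = sorted((RANKS.index(card[0]) + 2 for card in cards), reverse=True)
    counts = Counter(values)
    # Most frequent value first, then higher value: J J J J K -> [11, 13]
    ordered = sorted(counts, key=lambda v: (counts[v], v), reverse=True)
    shape = sorted(counts.values(), reverse=True)
    flush = len({card[1] for card in cards}) == 1
    straight = len(counts) == 5 and values[0] - values[4] == 4
    if values == [14, 5, 4, 3, 2]:  # A-2-3-4-5: the ace plays low
        straight, ordered = True, [5, 4, 3, 2, 1]
    if straight and flush:
        category = 8  # straight flush; the ace-high case is the Royal Flush
    elif shape == [4, 1]:
        category = 7  # four of a kind
    elif shape == [3, 2]:
        category = 6  # full house
    elif flush:
        category = 5
    elif straight:
        category = 4
    elif shape == [3, 1, 1]:
        category = 3  # three of a kind
    elif shape == [2, 2, 1]:
        category = 2  # two pair
    elif shape == [2, 1, 1, 1]:
        category = 1  # one pair
    else:
        category = 0  # high card
    return (category, ordered)


def best_hand(hole, board):
    """Best five-card score from a player's hole cards plus the community cards."""
    return max(rank_five(five) for five in combinations(hole + board, 5))


def showdown(hole1, hole2, board):
    """Return 1 or 2 for the winning player, or 0 when the pot is split."""
    first, second = best_hand(hole1, board), best_hand(hole2, board)
    if first == second:
        return 0
    return 1 if first > second else 2
```

Scoring each hand as a tuple means that two hands in the same category are separated by their remaining card values, and only the best five of the seven available cards are ever compared.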

Figure 6. Winning combinations in Texas Hold'em Poker

Certain circumstances require further evaluation to find the winner of a hand. In such a case, the entire best-five hand is considered. For example, if by some strange luck the community cards are the four Jacks and an 8, then the best five-card hand for each player is going to be the four Jacks, as a royal or straight flush is not possible, followed by the next highest card. It is this card that determines the winner of the game. If Player 1 has King-Four and Player 2 has Queen-Ten, then the winner is Player 1, as his 'Four Jacks with King kicker' beats Player 2's 'Four Jacks with Queen kicker'. As in this Four of a Kind example, kickers decide between otherwise equal patterns; sometimes the second or third kicker is needed to determine the winner, but never beyond the best-five hand of a player.

Draws occur in poker usually when both players 'play the board', i.e. the best-five hand for both players is actually the community cards. In this case, as the ranking of all five cards is the same for both players, the pot is split equally between the two players.

3.5 - Basic player/strategy types

There are several different ways to categorise the playing style of a particular player. When considering the ratio of raises to calls a player may be classified as aggressive, moderate or passive (conservative). Aggressive means that the player

24 frequently bets or raises rather than checking or calling (more than the norm), while passive means the opposite. Another simple set of categories is loose, moderate and tight. A tight player will play fewer hands than the norm, and tend to fold in marginal situations, while a loose player is the opposite. Players may be classified as differently for pre-flop and post-flop play [3] Pre-flop refers to the betting round held before the flop is revealed, whereas post-flop refers to the rest of the game after the display of the flop. A differentiation is usually made between these stages as the amount of information change is very great. Initially, each player has knowledge of 2 of the 4 cards that have been dealt, and after the flop, each player has knowledge of 5 of the 7 cards in play. The different strategies mentioned above also result in a specific form of reaction from the opponent. Games against aggressive or loose players would be worth a lot more, and at the same time, raising by an aggressive player may not always mean that the player has a good hand. On the contrary, raising by a tight or conservative player should be considered with greater concern. Apart from these basic player types, there are also known to exist many complex strategies which make the game of Poker both interesting and exceedingly hard Advanced strategies and Poker complexity Poker is a complex game, with many different aspects, from mathematics and hidden information to human psychology and motivation. To master the game, a player must handle all of them at least adequately, and excel in most. [10] This is the task being handed to a computer player to successfully excel in the game of Poker. Quite clearly, this is a huge task, and it probably would not be accomplished in the next few years. Mostly this is due to the inability of computing players to have a gut instinct or the ability to rapidly change one's strategy to combat another's. 
This topic is discussed further in Future Work in Chapter 6.

Computer players have to deal with many advanced strategies such as check-raising, whereby a false impression is given to the opponent by checking on a good hand in order to raise when the turn comes back to the player. This would usually cause the opponent to at least respond with a bet, and thereby allow the current player to extract more money from the opponent. In addition, a check-raise may be made purely to scare an opponent, as it is a well-known aggressive strategy.

The best player to date, PsOpti, incorporates no opponent modeling. It is important to understand that a game-theoretic optimal player is, in principle, not designed to win. Its purpose is to not lose. An implicit assumption is that the opponent is also playing optimally, and nothing can be gained by observing the opponent for patterns or weaknesses. [2]

It is in the face of these challenges that this project hopes to show a brave front and reach some conclusions that may assist in furthering the field of AI poker players.

Abstraction Techniques of 2-Person Bet-Limit Poker

The form of poker being considered for design and evaluation is two-player, bet-limit Texas Hold'em Poker. These abstractions have been made because they allow a player to be built under the required time constraints, and yet yield conclusions which are applicable to full-scale poker players. Bet-limit poker is a form of poker whereby each bet is of a set size, i.e. 10 money units in this case. Texas Hold'em tournaments are usually held as no-limit poker, whereby a player can commit all of his/her money on the first bet of the first game itself. Bet-limit poker provides both a regulated game and a beginner level of understanding, which can be matched by the scope of this project.

Chapter 4 Design and Methodology

4.1 Choice of Prolog

The first decision taken for this thesis was the choice of programming base. The time available for the project and the heuristic constraints required a language that expressed a knowledge base and rule-based reasoning in a clear and concise manner, and yet had the ability to produce a quick design-to-result cycle. For this reason, Logic Programming was chosen as the fundamental base upon which the program would be created. This decision does come at a cost to the computational efficiency of the final result. However, Prolog's ability to produce prototypes and results much faster makes it an ideal choice for this thesis. Rewriting the program to allow interoperability through platform-independent languages such as Java is discussed in Chapter 6, Future Work and Conclusion.

4.2 General Incorporation of Rules

The primary design of the thesis program required a structure, which was provided through insight into the Gala system. Detailed points of the Poker rules were also discussed with the people mentioned in the Acknowledgments section to resolve certain discrepancies in the system.

Like all popular games, Texas Hold'em has a variety of different rules, which are used by different organisations in their tournaments. For example, the first step in any game is the small and big blinds, whereby the first and second players put in money equivalent to half the betting amount and the full betting amount respectively, without looking at their cards. This allows each game to be worth at least some amount, and thereby more interesting. This rule is seen in major multi-player tournaments, where the chance to be the first or the last in a betting round circulates and thus allows every player to be at an advantage or disadvantage at some point in the game. This does not hold true for two-player poker, as the significant pro-con scenario of a tournament is lost. For example, every player that is required to put in a big blind also gets to be the final player to bet in that betting round. Thus the pros and cons balance each other out, a system which would be redundant at a smaller table of two-player poker. Also, the program designed for this project has very limited opponent modeling, which makes changing the sequence of play unnecessary, as a player's strategy does not affect the betting decision of the AI Player. For this reason, the blind system has been replaced with an ante. An ante is a fixed amount of money contributed by each player at the start of each game, for the same reason as blinds. It is worth one bet amount, i.e. 10 units of money in this case.

Figure 7 shows a clear view of the playing sequence of each game, in the form of cards and Betting Rounds. The Start represents the submission of the ante, which is followed by the individual hands being dealt. The next step on the board is the revealing of the Flop. However, there is a betting round, known as the Pre-Betting Round, that is played before the Flop is revealed. Similarly, the Post-Flop and Post-Turn Betting Rounds sit between the Flop and the Turn, and the Turn and the River, respectively. The game continues on to the Post-River Betting Round, which is sometimes also called the Final Betting Round. The final step, the Showdown, represents the determination of the winner(s) of the game and the distribution of the pot to that player or those players.

Figure 7. Sequence of Texas Hold'em Play

Poker tournaments also have a variety of rules to decide the winning pattern at a showdown, or the final comparison of players' cards. Certain organisations do not recognise the role of kickers in the winning hand, and are restricted only to the basic pattern. For instance, in the example used in Section 3.4 comparing Four Jacks with a King kicker and Four Jacks with a Queen kicker, these organisations would regard the result as a draw, as the basic pattern is Four-of-a-Kind and both players have it in the form of Jacks. The kicker plays no role in the decision. The program created for this thesis follows the more generally accepted rule of recognising the role of a kicker. This is due to both the rule's popularity and the extensive strategy management required once this rule is added. The cards available to both players, i.e. the community cards, can sometimes form a very good hand by themselves. But even in such a case, this rule creates the need for additional strategy, as kickers can be used to determine the outcome.

Basic Two-Human-Player Poker

The next step after the creation of a framework like the one described in the above section is the formation of players. The most basic form of play was implemented first, i.e. Human vs. Human. Clearly, this required no knowledge base or strategy on behalf of the computer in terms of gameplay. The primary purpose of the Human vs. Human mode was to brute-force check the stability and rules of the Poker framework. A user-friendly Prolog interface was created to allow the users to play multiple games. Section 3.3 discusses the various betting choices available to each player, most of them only under certain circumstances. The choices of checking and raising were encoded along with their necessary rules and restrictions. A recursive error checker was also added to return the game state to its previous form in the case of an invalid entry from a human player. Finally, at the end of each game, if no player has folded, the program displays the cards of both players and specifies the winning pattern and player (the showdown scenario). In the case of a player folding, there is no showdown and the cards are scratched, i.e. neither player gets to see the other's cards.
This follows the working of real poker, whereby a player can only see the opponent's cards if both players play all the way to the end.

General Strategic Behavior

Basic player strategies have been discussed in Section 3.5, and keywords such as aggressive, conservative, tight and loose will be used very frequently in this thesis. These keywords describe the playing strategies of a poker player, and the basic aim of such a player is to randomise these strategies as much as possible. Sticking to just one of the strategies can allow the opponent to recognise it and respond accordingly. The first step towards finding a near-optimal poker strategy is to create poker players which utilise these basic strategies strictly or randomly, and then to compare their performances. The comparison is given in Chapter 5, Testing and Evaluation, and the details of the players created are provided below. Most of the previous research in the field also points towards this form of project cycle, whereby known bad or random players are created, and newer versions of the actual AI players are played against these players to obtain performance data. [2]

The single most important decision in Poker can be stated as knowing when to play your hand. The keywords associated with this decision are tight and loose, where tight signifies playing fewer hands and loose signifies a more liberal approach. Based on these keywords, two players were created. The first type of player randomly chose between acting loose and tight when it was its turn to act. This player was named 'Random - 1'. In addition to having a randomised betting strategy, it also had a randomised strategy to determine its aggressiveness. So, in theory, Random - 1 was a completely random player, which could fold, bet, check or raise at any time, the final two options being restricted to the relevant circumstances. The second type of player chose a randomised aggression but had a strictly loose policy. It played every hand that it was dealt, randomly choosing how much money it wanted to bet on it. This player is called 'Random - 2'.
As its policy forces it to be completely loose, it never folds a hand, and it bets and raises quite frequently. Thus, it is also seen by humans as a more aggressive player.
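The two generic players can be sketched in a few lines. The following Python sketch is an illustration only, not the thesis's Prolog implementation: the function names, the encoding of the betting rules in `legal_actions`, and the uniform choice among the available actions are all assumptions made for this example.

```python
import random

MAX_RAISES = 3  # per player, as in the betting rules described earlier

def legal_actions(bet_on_table, raises_used):
    """List the actions open to a player at a decision point (assumed encoding)."""
    if not bet_on_table:
        # Nothing to match yet: checking is free, betting opens the round.
        return ["check", "bet", "fold"]
    actions = ["bet", "fold"]  # here 'bet' matches the opponent's money
    if raises_used < MAX_RAISES:
        actions.append("raise")  # raising needs an existing bet on the table
    return actions

def random_1(bet_on_table, raises_used, rng=random):
    """Random - 1: any legal action, including folding."""
    return rng.choice(legal_actions(bet_on_table, raises_used))

def random_2(bet_on_table, raises_used, rng=random):
    """Random - 2: strictly loose, so folding is never considered."""
    playable = [a for a in legal_actions(bet_on_table, raises_used) if a != "fold"]
    return rng.choice(playable)

print(random_1(bet_on_table=True, raises_used=0))
print(random_2(bet_on_table=False, raises_used=0))
```

Filtering out 'fold' is what makes Random - 2 loose; its aggression is still random because it chooses freely among the remaining actions.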

A third type of player with a completely tight strategy was not required, as such a player would always fold or check, and would thus lose against any other player which chose to bet even once during the whole game.

Anki V1

Anki V1 is best introduced through its architectural diagram. Figure 8 shows an overview of Anki V1 along with all its functions and workings. These are further explained in the later subsections of this chapter.

Figure 8. Anki - V1's Architectural Overview

Strategy based player

After the creation of the generic Random - 1 and Random - 2, the need for a more intelligent player arose. The secret to a player with human-level intelligence may lie in the strategies used by human beings to decide how to bet. As mentioned in the previous section, one of the most important decisions is knowing when to play one's hand, and this player tries to offer a solution to that problem. It evaluates its hand before every betting round using domain knowledge and uses that evaluation to come up with a betting strategy.

The evaluation before each betting round is done by looking at the 'type of hand' that the player has. 'Type of hand' here refers to both the current hand strength and the future potential to have a strong hand. This is done by grouping together similar 'types of hands'. There are check, bet and raise 'buckets' or groups, and by matching the current hand to a group, the final betting strategy is decided.

Overview of functioning and method

The most important method of abstraction for the computation of our pseudo-optimal strategies is called bucketing. This is an extension of the natural and intuitive concept that has been applied many times in previous research [22][23][14]. The set of all possible hands is partitioned into equivalence classes (called buckets or bins). A many-to-one mapping function determines which hands will be grouped together. Ideally, the hands should be grouped according to strategic similarity, meaning that they can all be played in a similar manner without much loss in EV (Expected Value). [2]

Before the working of the code and the player can be understood, the method by which the above-mentioned buckets have been created needs to be explained. The most difficult time to correctly judge a betting strategy is the pre-flop betting round. This is when the player has the smallest share of information available to him/her, only 50%, as only two of four cards are visible, compared to the end, where 7 of the 9 cards are known to the player. However, playing strategies for the initial hand can still be determined, through simulation and expert knowledge. [3] provides an extensive table comparing the performance of opening hands. However, it was designed specifically for Loki, and thus uses certain unspecified changes and heuristics. A similar table specific to this project needed to be created, with estimated performance values of opening hands.
These performance values were used to determine the bucket or group to which a hand would be assigned. More details regarding this simulation are given in the next subsection.

After the pre-flop has been dealt with, the post-flop also needs some extensive strategies. Groups in this case were decided on the basis of an optimistic hand potential. Optimistic here can be defined as a fairly loose strategy that looks for patterns and hopes to complete them in the future. All the possible winning patterns, such as sequences, flushes or pairs, are examined along with their completeness to determine the group for a particular state of a hand. It is important to note that unlike the pre-flop hand, the post-flop works on patterns rather than on actual cards. For example, simulation determines the playing value of each possible initial hand, i.e. all 2,652 of them, yet the post-flop hands are taken on the basis of pattern instead of individual values. A sequence of 3,4,5,6,7 would be treated in this system exactly the same as a sequence of 4,5,6,7,8, as they both follow the same pattern. The sequence pattern is divided into low and high number sequences, and is not judged on the exact value of the cards. This is because of the information explosion of the game state. For example, leading up to the final betting round, a player could have 6.74 * combinations of cards available. This number is clearly too large to allow individual analysis of each hand.

Aggressive strategies are commonplace among expert human players. One of the well-known strategies of master play is to use aggressive play intelligently to increase the doubt in an opponent's mind about whether a player is lucky or bluffing. Always staying aggressive is obviously not considered wise, as it discloses the player's strategy, but generally aggressive behaviour is accepted at the master level. This reasoning has led to a tight aggressive showdown player.

The above categorisation requires more explanation. The tightness refers to the player's decision to play to the end only if it feels it has at least some form of a pattern available to it, i.e. it does not play to the end with a 'High Card'. The aggressive nature is apparent from the grouping tables presented later in this chapter, which show that anything above a pair is automatically put into the raising group, and is thus given the maximum aggression.
Another important point is that the evaluation of the betting strategy for Anki V1 is done before each betting round, and the decision is maintained throughout that betting round, irrespective of the strategy of the opponent. It should also be clear from the above explanation that Anki V1 does not bluff. It works with pure evaluation strategies, re-evaluated for each betting round. Overall it can be termed a 'Tight Aggressive Player'.

Probability realisation of all possible starting hands

As discussed before, the pre-flop strategy evaluator looks at all the possible hands to determine their performance. Performance can be measured in many ways. Darse Billings uses sb/hds, i.e. small bets won per hand. In a tournament game, the size of the smallest bet allowed keeps increasing, and thus the winnings are described by small bets won per hand, which can remain constant over the tournament. If money per hand were considered, later games would be given undue additional weight due to their larger small bets. For the purposes of this project, a more generic definition has been implemented. Performance is described as the probability of winning a game with a particular hand. An exhaustive method to determine the probability of winning is not possible, simply due to the large game state that results, O(10^11). As a result, a combination of simulation and the previously defined optimistic hand potential is used to determine the various betting strategy groups.

Figure 9 shows, in tabular form, the result of playing any particular starting hand. This table was created by the author of this thesis, and the exact manner of its creation is explained below the figure. The number of distinct hands to be observed is reduced when the need for specific suits is abstracted away. A hand can either be suited, i.e. both cards are of the same suit, or unsuited, i.e. of different suits. Either way, the exact suit is unimportant: a suited 'spade' hand is considered on a par with a suited 'diamond' hand, and the same is true for all other combinations.

Figure 9. Probability of victory of initial hands (table indexed by card rank: A = Ace, K = King, Q = Queen, J = Jack, T = Ten; key: Unsuited, Suited, Pair)
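The effect of abstracting away specific suits can be checked by enumeration. The following Python sketch is a hedged illustration (the card encoding and the helper name `hand_class` are assumptions for this example). It counts unordered two-card hands, so it sees 1,326 hands, exactly half of the 2,652 ordered hands quoted earlier, and groups them into pair, suited and unsuited classes.

```python
from collections import Counter
from itertools import combinations

RANKS = "AKQJT98765432"
SUITS = "cdhs"
deck = [(rank, suit) for rank in RANKS for suit in SUITS]

def hand_class(card1, card2):
    """Map a concrete two-card hand to its suit-independent class."""
    (r1, s1), (r2, s2) = card1, card2
    high, low = sorted((r1, r2), key=RANKS.index)  # higher rank first
    if r1 == r2:
        return (high, low, "pair")
    return (high, low, "suited" if s1 == s2 else "unsuited")

# Count how many concrete hands fall into each abstract class.
classes = Counter(hand_class(a, b) for a, b in combinations(deck, 2))

print(sum(classes.values()))            # 1326 unordered starting hands
print(len(classes))                     # 169 classes after suit abstraction
print(classes[("A", "K", "suited")])    # 4
print(classes[("A", "K", "unsuited")])  # 12
print(classes[("A", "A", "pair")])      # 6
```

The table therefore needs only 169 cells, although the cells do not all occur equally often in play.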

Each of the numbers given above is obtained in the same manner:
1. The specified hand of the player is used along with a random hand for the opponent.
2. A random flop, turn and river are generated.
3. The final hands of both the player and the opponent are compared to decide the victor.
4. Steps 1 to 3 are repeated for 5000 games.
5. Finally, the percentage of times that the player won is calculated by dividing the victories by the total number of games, i.e. 5000 in this case.

The approach followed above is similar to the one used to create the table in [3]. However, the use of heuristics and strategies has been removed. This is because the calculation is used to find the Winning Potential, and not the sb/hds value that Loki uses. This eases both the creation and the use of the data in the table.

The buckets or groups of the pre-flop strategies are decided on the basis of the numbers in Figure 9 and optimistic potential. The latter term forces any suited or sequenced hand to be played irrespective of the number that was actually obtained from the table above. For example, even though a 2-3 unsuited only has a 31.62% chance of winning at the end, it is still played as a bet, as it may result in a high winner in the form of a sequence. The exact grouping of the pre-flop strategies is provided in Figure 10. The hands are described in plain English, followed by Figure 11, which is a modified form of Figure 9 and shows the grouping explicitly over all the initial hands in the game.

Figure 10. Grouping of Pre-Flop Strategies
If possible Raise, otherwise Bet: High Pair (i.e. 9-9 or higher); High Suited Seq (i.e suited or higher)
Bet: Any pair, suited or sequenced cards; High cards (i.e. both cards above 8)
If possible Check, otherwise Fold: Everything Else
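The five-step estimate described above can be sketched as a small Monte Carlo simulation. The Python below is an illustrative reconstruction under stated assumptions, not the thesis's Prolog code: the card encoding, the hand evaluator `score5` and the function names are all hypothetical, and drawn games are simply not counted as victories, following Step 5.

```python
import random
from collections import Counter
from itertools import combinations

RANKS = "23456789TJQKA"
DECK = [rank + suit for rank in RANKS for suit in "cdhs"]

def score5(cards):
    """Return a comparable (category, tie-breakers) tuple for five cards."""
    ranks = sorted((RANKS.index(c[0]) for c in cards), reverse=True)
    counts = Counter(ranks)
    # Ranks ordered by multiplicity, then by value, e.g. J J J J K -> [J, K].
    ordered = sorted(counts, key=lambda r: (counts[r], r), reverse=True)
    shape = sorted(counts.values(), reverse=True)
    flush = len({c[1] for c in cards}) == 1
    straight = len(counts) == 5 and ranks[0] - ranks[4] == 4
    if ranks == [12, 3, 2, 1, 0]:  # A-2-3-4-5, where the ace plays low
        straight, ordered = True, [3, 2, 1, 0, -1]
    if straight and flush:
        category = 8  # straight flush; the royal flush is its highest case
    elif shape == [4, 1]:
        category = 7
    elif shape == [3, 2]:
        category = 6
    elif flush:
        category = 5
    elif straight:
        category = 4
    elif shape == [3, 1, 1]:
        category = 3
    elif shape == [2, 2, 1]:
        category = 2
    elif shape == [2, 1, 1, 1]:
        category = 1
    else:
        category = 0
    return (category, ordered)

def best_hand(seven_cards):
    """Best five-card score obtainable from hole cards plus the board."""
    return max(score5(five) for five in combinations(seven_cards, 5))

def win_percentage(hole, games=5000, rng=None):
    """Steps 1 to 5: random opponent and board, count outright victories."""
    rng = rng or random.Random()
    unseen = [c for c in DECK if c not in hole]
    wins = 0
    for _ in range(games):
        draw = rng.sample(unseen, 7)
        opponent, board = draw[:2], draw[2:]
        if best_hand(list(hole) + board) > best_hand(opponent + board):
            wins += 1
    return 100.0 * wins / games

print(win_percentage(("2c", "3d"), games=2000))
```

Because the estimate is a sample, repeated runs give slightly different percentages, which is the noise the thesis discusses later when comparing simulation with exact statistics.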

Figure 11. A modified version of Figure 9 to show grouping of initial hands (table indexed by card rank: A = Ace, K = King, Q = Queen, J = Jack, T = Ten; key: Raise, Bet, Check)

It may seem from Figure 11 that the majority of the hands are bet, with a comparatively smaller set being checked or folded. However, this is not the case. Each suited cell in the table above can exist in four different forms, i.e. through the four suits of the game. On the other hand, each non-suited cell exists 12 times in the actual game, through all non-suited combinations of the four suits.

Grouping form of strategy

Figure 12 provides the rest of the betting strategies used to group hands in later stages of the game. Post-Flop refers to the betting round right after the Flop, and similarly for Post-Turn and Post-River. Post-River is also the final betting round of the game.

Figure 12. Buckets of betting strategy at various stages

Post-Flop
If possible Raise, else Bet: Flush, or one card away from it; Seq, or 1 away from a High Seq (more than 9); Three or Four of a kind, or Full House; Two pair, with at least one of the numbers in hand; High Pair (i.e or higher)
Bet: 2 cards away from Flush or High Seq; One away from a Seq; All other pairs; Pairs in board with high cards in hand; Both cards higher than a 10
If possible Check, else Fold: Everything Else

Post-Turn
If possible Raise, else Bet: Three or Four of a kind, or Full House; Flush, Seq, or High Pair; Two pair with both numbers in hand
Bet: One away from Flush or Seq; Pairs in Board, or just one in hand; High cards
If possible Check, else Fold: Everything Else

Post-River
If possible Raise, else Bet: Any winning pattern better than a low pair
Bet: Low pair and High Cards
If possible Check, else Fold: Everything Else
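The column headings of Figures 10 and 12 encode a fallback rule: a hand in the raise bucket bets when raising is not allowed, and a hand in the check bucket folds when checking is not allowed. A minimal Python sketch of that rule follows. The function name `resolve_action` and the bucket labels are assumptions for this example, and how a hand is assigned to a bucket is deliberately left out.

```python
def resolve_action(bucket, can_raise, can_check):
    """Turn a strategy bucket into a concrete action under the betting rules."""
    if bucket == "raise":
        return "raise" if can_raise else "bet"   # If possible Raise, else Bet
    if bucket == "bet":
        return "bet"
    if bucket == "check":
        return "check" if can_check else "fold"  # If possible Check, else Fold
    raise ValueError(f"unknown bucket: {bucket!r}")

# Facing a bet with all raises used up: no raise and no check are available.
print(resolve_action("raise", can_raise=False, can_check=False))  # bet
print(resolve_action("check", can_raise=False, can_check=False))  # fold
```

Since Anki V1 fixes its bucket before the betting round, only the two legality flags change from one decision point to the next within a round.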

As can be seen from the figure, the player continues with its optimistic policy, whereby any chance of a flush or sequence is not abandoned. The headings of the columns of Figures 10 and 12 also show the exact strategies being utilised by the player. For example, it is not always possible to raise in a given situation, in which case the player decides to bet. Similarly, when the player is unable to check, it folds.

Similarities and differences from human beings

There are many similarities between the workings of Anki V1 and those of a human being. One of the major ones is Bucketing. Human players tend to form pre-defined groups in their minds that allow them to act in the face of familiar situations. For example, a player with K-Q suited would probably behave much as when given J-Q suited. There are at most three betting choices available to a player at any given time, and a human player is required to play to the merit of the cards. Pre-flop strategies are usually quite strict among intermediate or beginner level players. Lower-level human players decide beforehand which kinds of cards they would play, and which ones they would usually fold. This method is similar to the one utilised in Anki V1. Post-flop bucketing in Anki V1 resembles human play even more, as the player works by matching the best patterns that it can find. Human players also tend to look for flushes, sequences, etc., before looking at the exact cards or suits available.

Apart from these similarities, there are also a number of respects in which Anki V1 differs from human players. Human players do not have the capability to create a probability table similar to the one shown in Figure 9. Instead, they rely on instinct and experience. Anki V1 can benefit from its higher computation power by using such a table to tune its betting strategies.
A definite advantage that humans hold over Anki V1 is opponent modeling and, to some extent, the randomisation of their strategies. This plays a great part in the final result of Anki V1's play against humans, more of which is described in Chapter 5, Testing and Evaluation. Anki V1 plays a completely evaluated strategy game, and thus lacks the randomisation discussed in Chapter 2, Literature Review. Thus, following on from hand evaluation, which is handled very well in Anki V1, the introduction of a randomised betting strategy is required. This leads to the creation of Anki V2.

Anki V2

Figure 13. Anki - V2's Architectural Overview

Figure 13 shows the emphasis on the creation of a randomised betting strategy, along with the ability to tune the randomisation through methods such as adjusting the Tightness Threshold. More on the player is discussed below.

A Randomised Rational Strategy Player

Following the expert-rule based approach of Anki V1, a player was required that followed a more Simulation and Enumeration based form of strategy build-up. An AI player has higher computation power than a normal human player, so the computer needs to be given the opportunity to utilise this power to better itself at Poker.

Simulation and Enumeration allows real-time strategy build-up, and limited reaction to the opponent's responses in a game. This new player is called Anki V2. It primarily uses the concept of Probability Triples.

A probability triple is an ordered triple of values, PT = [f, c, r], such that f + c + r = 1.0, representing the probability distribution that the next betting action in a given context is a fold (check), call (bet), or raise, respectively. [13]

The probability triples allow the program to create a controlled randomised strategy over a betting round, as compared to Anki V1, which worked with a strict strategy over the whole round. Each of the numbers in the Triple represents the individual probability of a certain action. A significant difference between the quotation given above and the implementation in this project lies in the range of the numbers used. Poki and Loki used a range between 0.0 and 1.0 to express the Triples, whereas this thesis utilises all the real numbers between 0.0 and 100.0. This makes no major change to the expressive power of the program, but allows a percentage output for each of the Triple values.

Statistical Method vs. Random Generator

The exact working of Anki V2 is explained in sequence below. This provides an overview of the whole of Figure 13 seen previously:
1. Like Anki V1 for the initial hand, Anki V2 plays simulated games with randomised values for the flop, turn, river and opponent hands. In the end, it receives a winning percentage. The games played are called 'pseudo games' and the winning percentage is called the WP or Winning Potential.
2. Using this WP, and some pre-defined formulas, the player creates a Probability Triple for the next betting round.
3. During the course of the betting round, at every decision point, a random real number between 0 and 100 is generated and compared against the probabilities of the Probability Triple to decide on the betting action.
4. The randomised betting action is adjusted using the Tightness setting provided, to prevent 'silly' decisions by the player.

5. At the end of the betting round, when the next set of community cards is shown, the pseudo games are played again, including the information that is now available. For example, if the flop and turn have been revealed, Steps 1 through 4 are repeated, but Step 1 only generates random rivers and opponent hands, and uses the flop and turn information to get a more exact WP.

The exact formulas and values used to calculate all the data mentioned above are given in the section on formulas and evaluation below. However, it is important to understand the need for this simulation method as against a more mathematical statistical model. One of the major drawbacks of simulation is that it is essentially a form of approximation. Experts have documented that luck or a strange coincidence of events can affect a hand's performance to make it seem better or worse than its actual value [10]. This phenomenon can show its effects for a couple of thousand hands at a time. For this reason, even though extensive simulation can be considered a very good approximation, it is always exactly that, and can unknowingly contain high levels of noise.

The other option that can be considered is that of statistics. A statistical method to find the Winning Potential of a given hand would consist of finding all the scenarios under which the current hand is stronger than the opponent's. This is clearly a more exact method of hand evaluation. This method is also clearly feasible in the final betting round, when the final state or pattern of a player is known. The amount of unknown information is then quite small, i.e. only the opponent's two-card hand. Thus an exact statistical model of the hands better than the player's can be generated. This method, however, suffers a near-exponential blow-up when the first betting round is considered. The number of possibilities for future cards has been discussed earlier.
Grouping these hands into those which are better, equivalent or worse is clearly a mammoth task, of a kind for which there is neither the computational power nor the time. This is further supported by the fact that no computer Poker player created to date has tried to obtain the exact statistical evaluation of a game state.
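The contrast between the final and the first betting round can be made concrete with a small count. The Python sketch below is an illustration under stated assumptions: it counts unordered combinations from a 52-card deck for a player whose own two cards are known, so its totals need not match figures quoted elsewhere in this thesis, which may be counted differently.

```python
from math import comb

# Final betting round: the player sees 2 hole cards and 5 community cards,
# so the opponent's hand is drawn from the remaining 45 cards.
river_opponent_hands = comb(45, 2)

# First betting round: only the player's 2 hole cards are known.
preflop_opponent_hands = comb(50, 2)   # opponent's two cards
preflop_boards = comb(48, 5)           # the five community cards to come
preflop_scenarios = preflop_opponent_hands * preflop_boards

print(river_opponent_hands)  # 990
print(preflop_scenarios)     # 2097572400
```

Under this counting, an exact comparison at the final round involves fewer than a thousand opponent hands, whereas the first round involves roughly two billion scenarios for each starting hand.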

Another major reason for the choice of simulation is the fact that it mimics human play. It offers both optimistic and pessimistic viewpoints at times, quite like a human player. This allows a more random strategy than a strict statistical model would provide.

Formulas and Evaluation

A variety of formulas and numbers have been utilised in the formation of Anki V2. These features need further explanation, both for their function in the program and for their justification. The exact formation of the Probability Triple is also explained, as is the working of the betting action evaluator.

Pseudo games and Winning Potential

Each pre-betting round evaluation begins with the simulation of 1000 pseudo games, at the end of which the numbers of games that the player won, drew or lost are reported back. These numbers are then used to determine a Winning Potential. The WP is then adjusted using enumeration, which is explained separately. Finally, the WP is converted into a Probability Triple. Figure 14 shows the part of the program code which relates to this exact sequence of work, along with a representation of the formula used to calculate WP from the data of the pseudo games.

    play_pseudo_game(...),
    WP is ((W + (D / 2)) / (W + D + L)) * 100,
    WP1 is WP + N,
    assign_str(WP1, X),
    ...

$$\text{Winning Potential} = \frac{\text{Win} + \frac{\text{Draw}}{2}}{\text{Win} + \text{Draw} + \text{Lose}} \times 100$$

Figure 14. Sequence of evaluation of Winning Potential

The 'X' written in the final line of the Prolog code is the Probability Triple that is generated. The method 'assign_str' is explained under Calculation of Probability Triples. The first justification concerns the 1000 pseudo games being played. This number is due to the time constraint specified by Darse Billings, that a player should not ponder over a decision for more than two seconds. With the available computational power, 1000 pseudo games were found to require 1 to 2 seconds of computation time.
Higher computation power would allow more pseudo games, and thus provide a better approximation, but under the current restrictions this is the best that the game can offer.
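As an illustration only, the Winning Potential formula of Figure 14 can be sketched in Python rather than in the Prolog of the thesis. The names `tally` and `winning_potential` and the outcome labels 'win', 'draw' and 'lose' are assumptions made for this sketch; the simulation of the pseudo games themselves is not reproduced.

```python
from collections import Counter


def tally(outcomes):
    """Count wins, draws and losses from an iterable of outcome labels."""
    counts = Counter(outcomes)
    return counts["win"], counts["draw"], counts["lose"]


def winning_potential(wins, draws, losses):
    """Winning Potential in percent, with a drawn game counted as half a win."""
    total = wins + draws + losses
    if total == 0:
        raise ValueError("at least one pseudo game is required")
    return (wins + draws / 2) / total * 100


# Example: 1000 pseudo games with 600 wins, 100 draws and 300 losses.
example_wp = winning_potential(600, 100, 300)
```

With 600 wins, 100 draws and 300 losses the sketch gives a WP of 65, whereas ignoring the drawn games would give 60.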

Another formula created by the author concerns the calculation of the Winning Potential. Previous papers explain Positive Potential as the winning probability over the total number of games. Drawn games are not discussed, and definitely need to be addressed, especially in the case of a smaller number of pseudo games such as this. The decision was taken to give half importance to drawn games, as they offer half the return of a normal winning game.

Calculation of Probability Triples

Winning Potential is converted to Probability Triples using a method 'assign_str'. This method uses the value of the Winning Potential to return percentage values for the next betting round. There are three sub-sets that have been introduced in the method:

1. When the WP is less than or equal to 50: The chances of raising are quite low. In addition to this, at 0 WP, the chances of folding are at their maximum, with a low betting potential. As the WP increases, so do the chances of betting.
2. When the WP is greater than 50 and less than or equal to 75: Chances of raising are higher, with betting getting a major bonus, and checking reducing significantly.
3. When the WP is greater than 75: Checking has a low probability, with the chance of raising slowly increasing and betting reaching a moderate level.

Figure 15 shows the formulas used to portray the above three points. More explanation regarding the numbers follows the figure.

[Figure 15 table: columns Check, Bet and Raise; rows WinPot <= 50, WinPot <= 75 and WinPot > 75. The cell formulas are not recoverable from this transcription.]

Figure 15. Formulas used to calculate Probability Triples, given WinPot

The formulas arise by firstly looking at the most basic premises, i.e. the Probability Triples required at the crucial points of 0, 50 and 100% probability. It is obvious that at no point should the value of any of these be 0, as that would lead to a strict strategy, and would thus be recognisable by humans. It was decided that the value of betting or raising should not fall below

10%, as this allows substantial bluffing power to the player even in the case of bad hands. As a result, the Probability Triple for 0% WP becomes [80, 10, 10] in the form [Check, Bet, Raise].

Human players can be found to change their strategy rather dramatically with the knowledge that the current hand has more than 50% or 75% Winning Potential. This strategy change has been dulled to a large extent in the program, to allow a more gradual change relating to the exact Winning Potential. A hand with exactly 50% WP has the Triple [45, 45, 10], with a jump to [40, 40, 20] for a hand with a WP just above 50 but tending to it. From this point on, the checking power falls at a greater speed, while the betting power builds up. The final change occurs at 75% WP, at which point the need for a greater raise probability is introduced. A WP of exactly 75% has the Triple [15, 65, 20], and a WP just over 75, but tending to it, has the Triple [15, 30, 55]. This may seem like a dramatic jump, but in practice it is quite mellow due to the inability to raise in most situations. In these situations, the player simply bets, thereby reducing the apparent jump in probability. This is discussed further in the next subsection.

Betting Strategies, Randomised Numbers and Enumeration

This section deals with what happens inside a betting round. The Anki V2 Player has a Probability Triple available to it, guiding its future moves; however, these moves still need to be monitored and executed.

"The choice (of betting action) can be made by generating a random number, allowing the program to vary its play, even in identical situations. This is analogous to a mixed strategy in game theory, but the probability triple implicitly contains contextual information resulting in better informed decisions which, on average, can outperform a game theoretic approach." [13]

Figure 16 provides a Prolog code excerpt from the actual program that details the working of Anki V2's betting turn.
The random number generated is a real number between 0 and 100. This allows Triples with decimal values to be treated correctly, rather than being rounded off to the nearest integer.
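The mapping from Winning Potential to a Probability Triple can be illustrated with a short Python sketch (Python is used here instead of the Prolog of the thesis). It is not the 'assign_str' method itself: it simply interpolates linearly between the Triples quoted in the text at 0, 50 and 75% WP, which is an assumption, and it holds the Triple at [15, 30, 55] above 75% WP because the thesis formula for that band is not available here. The name `assign_triple` is hypothetical.

```python
def assign_triple(wp):
    """Map a Winning Potential (0-100) to a (check, bet, raise) triple in percent."""
    if not 0 <= wp <= 100:
        raise ValueError("wp must lie between 0 and 100")
    if wp <= 50:
        # Anchors from the text: [80, 10, 10] at 0% WP and [45, 45, 10] at 50% WP.
        lo, hi, t = (80.0, 10.0, 10.0), (45.0, 45.0, 10.0), wp / 50
    elif wp <= 75:
        # Anchors from the text: [40, 40, 20] just above 50% and [15, 65, 20] at 75%.
        lo, hi, t = (40.0, 40.0, 20.0), (15.0, 65.0, 20.0), (wp - 50) / 25
    else:
        # Assumption: held constant at the Triple quoted for a WP just above 75%.
        return (15.0, 30.0, 55.0)
    # Linear interpolation between the two anchors (an assumption of this sketch).
    return tuple(a + (b - a) * t for a, b in zip(lo, hi))
```

Under these assumptions a WP of 60 maps to roughly [30, 50, 20], and every Triple sums to 100.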

    random_float(...),      % Random Number Generation
    choose_rel_str(...),    % Strategy Selection
    change_win(...),        % Enumeration Step
    exact_str(...),         % Strategy Refinement
    eval(...).              % Strategy Implemented

Figure 16. Anki V2 Betting Rounds

The Strategy Selection shown in Figure 16 converts the random number into one of the betting strategies. It is done in the following manner: if the random number is less than the probability of a check, the result is a Check. Otherwise, if the number lies between the probability of a check and the sum of the probabilities of a check and a raise, the result is a Raise. In all other cases, the result is a Bet. For example, given the Triple [40, 35, 25], random numbers up to 40 would lead to Checks, numbers above 40 and up to 65 (40 + 25) would lead to Raises, and the remaining numbers would lead to Bets.

The Enumeration step involves looking at the previous action of the opponent and slightly changing the odds to represent the game state more accurately.

"... we have specific information that can be used to bias the selection of (opponent's) cards. For example, a player who has been raising the stakes is more likely to have a strong hand than a player who has just called every bet." [1]

Thus, both the actions of checking and raising by an opponent offer information regarding the opponent's hand. It can generally be assumed that a raise results from a good hand, and a check from a much worse one. Using this data, the Winning Potential can receive small tweaks to better represent this information. It has been decided that upon each of the opponent's checks, the WP is incremented by 2, implying that the player now has a 2% greater chance of winning, as the opponent seems to have a bad hand. Similarly, an opponent's raise decrements the value of the WP by 2. This enumerated value is usually transferred between the betting rounds, with the exception of the pre-flop to post-flop change.
In this case, the WP of a hand is found to change so drastically across the different possible flops that any notion of enumeration over this stage would be redundant. Also, the value of 2 was chosen to allow a maximum of 10% change in the value created by the Simulator. This is a value chosen by the author, and thus, further improvements can

be made to this value through expert external input. Only the experts of the field can truly shed light on the importance of a person's raising and checking. For the sake of a controlled experiment, it is not possible to keep this number too high, as that may result in radical changes to strategy.

Finally, before the final action is played, the probabilistic random action is refined. This is done in order to prevent the player from making 'silly' decisions, and also to implement some of the betting rules. It is at this stage that a raise is converted to a bet if there is no money on the table, i.e. there have been no bets in the round, and hence the raise is invalid. More importantly, it is also at this step that a restriction is forced upon the player never to fold a hand with a Winning Potential higher than a particular percentage. If the action chosen is a check, and the play requires either a bet or a fold, this refinement is used to decide on the correct betting action. The number chosen for this non-folding refinement in most of the experimentation is 60, i.e. Anki V2 never folds any hand in which it believes it has a higher than 60% chance of winning. This number was chosen because initial experimentation found that it gave the strategy a tightness similar to that of Anki V1, thereby levelling their playing field during human testing. Experimentation with this number and with the Triple Creation Formulas is discussed in the next chapter.
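The selection, enumeration and refinement steps described above can be put together in a small Python sketch (again Python rather than the Prolog of the thesis). All function names are hypothetical. Two readings of the text are assumed: that the 'maximum of 10% change' is a cap on the accumulated adjustment, and that a check chosen when a bet must be answered becomes a fold unless the WP exceeds the non-folding threshold.

```python
import random

ENUM_STEP = 2.0      # WP change per observed opponent action (value from the text)
ENUM_CAP = 10.0      # assumed cap on the total adjustment, read from "maximum of 10%"
NO_FOLD_WP = 60.0    # never fold above this WP (value from the text)


def choose_action(triple, roll):
    """Turn a roll between 0 and 100 into an action, in the order Check, Raise, Bet."""
    check, _bet, raise_ = triple
    if roll < check:
        return "check"
    if roll < check + raise_:
        return "raise"
    return "bet"


def adjust_wp(wp, opponent_actions):
    """Add 2 per opponent check and subtract 2 per opponent raise, within the cap."""
    delta = 0.0
    for action in opponent_actions:
        if action == "check":
            delta += ENUM_STEP
        elif action == "raise":
            delta -= ENUM_STEP
    delta = max(-ENUM_CAP, min(ENUM_CAP, delta))
    return max(0.0, min(100.0, wp + delta))


def refine(action, wp, bet_on_table):
    """Apply the refinement rules before an action is played."""
    if action == "raise" and not bet_on_table:
        return "bet"  # a raise is invalid when nobody has bet in this round
    if action == "check" and bet_on_table:
        # A check is not possible here, so the player must match the bet or fold.
        return "bet" if wp > NO_FOLD_WP else "fold"
    return action


def betting_turn(triple, wp, bet_on_table, rng=random):
    """Draw a random number, select an action and refine it."""
    roll = rng.uniform(0.0, 100.0)
    return refine(choose_action(triple, roll), wp, bet_on_table)
```

For the Triple [40, 35, 25], a roll of 10 gives a Check, a roll of 50 a Raise and a roll of 80 a Bet; with no bet on the table the Raise is then refined to a Bet.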

Chapter 5 Testing and Evaluation

This chapter tests and evaluates both the system and all the computer players described in Chapter 4, Methodology. Firstly, the system design and architecture are tested using the white box method. This is followed by black box result checking of the system. Finally, all the players, i.e. Random 1, Random 2, Anki V1 and Anki V2, are evaluated, with most emphasis on the latter two. The results and evaluation are also compared to the previous research on Loki, Poki, etc. discussed in Chapter 2, Literature Review.

System Test: White and Black Box Testing

The first batch of experimentation and testing that needed to be performed on the program concerns its completeness and soundness. Its stability needed to be demonstrated, to justify any results obtained from it later on. The architecture of the program was put through brute-force, worst-case scenario tests to try to establish its soundness. These tests were conducted on the Human vs. Human specification, so that each step of the program could be monitored and observed. The following tests were conducted and found to complete successfully:

1. The program started up without any errors and provided completely random opening hands to both the players of the game. Also, absolutely no repetition or pattern in the cards was found over a number of hand requests.
2. The program was found to provide the Human Player with all the necessary game state information, including the cards he/she held, the community cards and the financial state of the game. All the information was found to be accurate. An example screen shot of the program is provided in Figure 17.
3. The betting options were found to adhere to their respective constraints, along with allowing the player to re-play the last move in case an erroneous choice was entered, e.g. raising when it is not permitted.
4. The betting rounds were found to progress in the manner required, and ended upon equal commitment of monies from both players, i.e.
in the cases of two checks, two bets or a raise followed by a bet.

5. Each of the player's actions, such as betting or checking, was displayed clearly, with no information of the opponent available to a player.
6. Folding sets the game into a quick end mode, whereby all the community card displays and betting rounds are bypassed to reach the end. The cards of each player are not displayed on the board either.
7. Finally, the situation in which a player runs out of money is addressed. The program was found to display the required community cards and hurry to the showdown without any more betting requests.

Figure 17. Screen shot of normal gameplay in the Prolog window

Following the successful completion of the above mentioned tests, the winning pattern evaluator also needed to be tested. Strict rules are available for this section of the program, and their stability and correct implementation needed to be demonstrated. The winning pattern finder was put through a series of extreme case scenarios along with regular tests to make sure it worked in the least probable draw situations. For example, the evaluator was asked to find the winner in the case of the Fourth Kicker for a High Card, i.e. the 5th card of the hand. At the same time, a draw result

was tested by providing the tester with identical hands that differed only on the 6th most important card. The following tests were completed on the winning pattern evaluator:

1. The correct winner was identified using the priority table explained in Section 3.4. For example, a Full House won over a Flush, etc.
2. In the case that the patterns of both players were found to match, the owner of the highest ranking card of the pattern was chosen as the winner. For example, 'Three Jacks' won over 'Three 8s'.
3. The comparison of hands with similar patterns was restricted to the correct number of kickers. For example, there are 2 kickers in 'Three-of-a-Kind', but none in a Sequence.
4. In the case of the best 5-card hand of each player being the same, or of similar importance, the result was announced as a Draw.
5. The correct amount of money was allotted to the players at the end of the pattern evaluation, i.e. the winner got all the money in the pot, or the money was divided between the players in the case of a draw.
6. All the information concerning the cards being played, the winning pattern type, the winning player and the new financial state of the game is displayed. A screen shot of a final result is provided in Figure 18.

Figure 18. Screen shot of a showdown (end-game scenario) with Winning Evaluator

As mentioned previously, the above tests were conducted on all of the patterns individually, and were created with the specific purpose of checking the most computational and in-depth rule scenarios of each of the winning patterns. All the cards were created by the author and then tested individually, as each winning pattern needed to deal with a different formation of cards. Certain tests revealed errors in the coding, in which case the error was corrected and the entire testing cycle was repeated.

The tests above comprised more than 2000 hands; however, they were created specially to check specific features or components of the program. There was a need to test the program in its entirety. For this reason, the following four 'Strict' players were created: Always-Checks1 (else Folds), Always-Checks2 (else Bets), Always-Bets and Always-Raises (else Bets). These players were made to play 1000 games against each other, and the entire decision and game state of each of the games was recorded. This transcript of the 6000 games was checked manually by the author to confirm the stability and soundness of the program created. All the decisions were found to be correct according to the Poker rules discussed in previous chapters. These players continue to exist in the code given in Appendix A, but future testing was preferred on players such as Random 1 and Random 2, as they provided more varied results.
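The figure of 6000 games is consistent with every distinct pairing of the four Strict players meeting once for 1000 games. The short Python sketch below works through that arithmetic; the one-meeting-per-pair schedule is an assumption inferred from the totals, not something the text states explicitly.

```python
from itertools import combinations

STRICT_PLAYERS = ["Always-Checks1", "Always-Checks2", "Always-Bets", "Always-Raises"]
GAMES_PER_PAIRING = 1000

# Every unordered pair of distinct players, with no self-play (assumed schedule).
pairings = list(combinations(STRICT_PLAYERS, 2))
total_games = len(pairings) * GAMES_PER_PAIRING
```

Four players give six pairings, and six pairings of 1000 games give the 6000 games of the transcript.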

5.2 - Random 1 Player's Evaluation

As expressed earlier, the main phase of computer player testing began with the introduction of Random 1 and Random 2, also called the Random Player and the Non-Folding Random Player respectively. This sub-section explains the bad performance of Random 1, and the reason why it was not tested to as much depth as Random 2. It also has certain implications for the workings and required strategies of future computer players.

Random 1 was found to play appallingly badly against both Random 2 and Anki V1, given the same test conditions. All the experiments consisted of 10,000 tournaments. Under these conditions both Random 2 and Anki V1 were found to beat Random 1 in all of the 10,000 tournaments. This clearly shows the flawed strategy of the player. This finding leads to the first major conclusion of the thesis: randomisation is required over the strategies and meta-strategies that influence individual betting actions, i.e. the decision to be aggressive, loose, etc. at any one time. This is especially evident in the Random 1 vs. Random 2 experiments, where the high folding rate of Random 1 forces it to bow out of the competition too often and too early, and thereby lose tournaments quickly. Random 1 performs slightly better against Anki V1 because Anki V1 assumes that Random 1 is a rational player, and thus if Random 1 bets, and Anki V1 has really bad cards, it chooses to fold. But once again, the sheer volume of folding by Random 1 leads to its eventual downfall. Figure 19 provides more insight into the exact statistics of the two experiments.

[Figure 19 chart: Percentage of Victory (% Tournaments Won and % Games Won) for the Random 2 Player and Anki V1.]

Figure 19. Player Performance when playing against Random 1

Random 1 was found to play a total of 506,026 and 860,978 games against Random 2 and Anki V1 respectively, and was beaten by both opponent players in all the tournaments. The game-winning percentages of the latter players are also shown in Figure 19. As a consequence of these results, the Random 1 player was abandoned. Future testing is conducted through self-play or through play against Random 2, with only occasional comparison to Random 1.

Evaluation of Anki V1

The evaluation of Anki V1 is done in two broad categories: playing against preprogrammed players for evolutionary and basic results, and playing against humans for more advanced results and final evaluation. Each category of tests is presented in more detail in the subsections below.

Anki V1 vs. Computer players

The first test that Anki V1 faced was against the Random 2 player. The Random 2 player's strategy, however static or randomised, was found to be very aggressive and quite close to the

strategy that an expert would recommend for master play. The static aspect of this player is not a disadvantage to it either, as the opponent, Anki V1, does not have opponent modelling.

In addition to testing the performance of Anki V1 against the pre-defined computer player, it is also imperative that Anki V1 demonstrates an increase in its performance as it develops. Anki V1 is created from four different betting strategy/evaluation components: pre-flop evaluation, post-flop evaluation, post-turn evaluation and final evaluation (post-river). It needs to be shown that the introduction of each one of these components adds value to the player as a whole.

For the purpose of these experiments, the Anki V1 with only pre-flop evaluation was named Start-Eval Anki. The next upgrade, with both pre-flop and post-flop evaluations, is called Flop-Eval Anki. The addition of post-turn evaluation leads to Turn-Eval Anki and finally, all four evaluations come together to be called Final-Eval Anki.

Each experiment between the players consisted of 10,000 tournaments. This was done in view of the fact that previous research has shown that up to a couple of thousand games can be affected by the good or bad luck of a player [10]. Thus, to make the statistical result more accurate, and assuming at least 100 hands per tournament, 10,000 tournaments provide a million games. This gives a result that is largely free from the luck factor. All the results were checked to confirm that at least a million games had been played, and this was found to be true.

Figure 21 shows the performance of the improving Anki V1 against Random 2. As can be seen, each improvement is found to benefit the performance.

[Figure 21 chart: No. of Tournaments Won by Start-Eval Anki, Flop-Eval Anki, Turn-Eval Anki and Final-Eval Anki (Anki V1 Player with accumulated heuristics).]

Figure 21. Anki V1's performance against Random 2 Player improves as more heuristics are added, over an average of 2.8 million games for each evaluation

The figure above has a striking quality, in that there is a major improvement from Start-Eval Anki to Flop-Eval Anki. There is also a noticeable, but not major, improvement between the last three players. Both these observations can be explained through the concept of game state information. In the first case, the evaluator has 50% of the information (2 of the 4 cards) available to it. The decision based on this information is thus seriously flawed, which leads to the folding of hands with good potential and the protection of bad final hands in the Start-Eval Anki Player. In comparison to this, the information jump is very substantial in the next round, from 50% to 71.4%, as 5 of the 7 cards are now visible to the player. This allows the player to progress more intelligently.

A partial, but small, Information Gain is also responsible for the slow growth in the latter three forms of Anki V1. The percentages of information available to Flop-Eval, Turn-Eval and Final-Eval Anki are 71.4%, 75% and 77.8% respectively, none of which is much of an increase. As the three players discussed above do not improve their knowledge of the world by a large percentage, their relative intelligence improves only a little (just one more card each time).
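The information percentages quoted above can be reproduced with a few lines of Python. The sketch assumes that the denominator is the number of cards visible to the player plus the opponent's two hidden hole cards, which matches the '2 of the 4' and '5 of the 7' given in the text; the name `information_share` is hypothetical.

```python
OPPONENT_HOLE_CARDS = 2  # the only cards in play that the player never sees


def information_share(visible_cards):
    """Percentage of the cards in play that are visible to the player."""
    return 100 * visible_cards / (visible_cards + OPPONENT_HOLE_CARDS)


# Cards visible to the player at each stage: own two cards plus community cards.
STAGES = {"pre-flop": 2, "flop": 5, "turn": 6, "river": 7}
shares = {stage: round(information_share(n), 1) for stage, n in STAGES.items()}
```

Under this assumption the shares come out as 50.0, 71.4, 75.0 and 77.8 percent, so the flop adds over 21 percentage points while the turn and river add less than 4 each.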

Apart from the tournament victories of Anki V1, the various players also need to be measured for their profitability and their efficiency. Figure 21 shows the increase in earnings of the players, whereas Figure 22 shows the relative increase in game winnings.

[Figure 21 chart: Money won per game played by Start-Eval Anki, Flop-Eval Anki, Turn-Eval Anki and Final-Eval Anki (Anki V1 Player).]

Figure 21. Profitability of Anki V1 Players increases, as it adds heuristics to its play.

Figure 21 shows the increase in the profits of the various players. For example, whereas Start-Eval Anki is found to lose 2.52 units of money with every game, Final-Eval Anki wins 9.21 units on average for every game that is played. This shows the increasing intelligence and playing ability of each player.

Figure 22 sums up Anki V1's performance against the random player by showing the comparison between the winnings of tournaments and games. Unlike tournament victories, which describe a player's performance directly, the fewer games a player wins while improving its tournament play, the better the player. This is because the player simultaneously improves both tournament play and overall profitability. It knows better when to play and which hands it should play.

[Figure 22 chart: Percentage of Victory (% Tournaments Won and % Games Won) for Start-Eval Anki, Flop-Eval Anki, Turn-Eval Anki and Final-Eval Anki (Anki V1 Player).]

Figure 22. Tournaments and Games won by Anki V1 playing against Random 2 show that the efficiency of the player is improving.

Figure 22 shows how the increase in the number of tournament wins does not seem to affect the percentage of games being played and won by Anki V1, i.e. the percentage of games won remains constant. This is a positive sign for the later versions of Anki V1, as it shows that the players are becoming more efficient in winning tournaments. Their improvement is shown by the increase in tournament wins, and their intelligence by the stability of the percentage of game victories.

Anki V1's Evaluation against Human Players

Anki V1 was played against three kinds of players: beginners, intermediate and advanced. Beginners are newcomers to the game, i.e. people who have never played poker before. One of the stated aims of this project is also to investigate the formation of a Poker player that teaches beginners, and for the same reason, it also needs to be able to play well against them.

Intermediate Players are either infrequent players of Poker with detailed knowledge of the game, or beginner players with knowledge of the functioning of the program. Due to resource and time

constraints of the project, the test base for the project was restricted to a close community, and thus certain members of the community had additional information available to them, which helped them develop a strategy against Anki V1. For this reason, they have been placed in the category above their actual class.

Finally, advanced players are either people with frequent exposure to the game in tournament play (with real money, online or in cash form), or intermediate players with knowledge of the player's capabilities.

Once again, due to the given constraints, the experiments were held on a smaller scale than is ideal. However, at least three individuals were gathered from each of the prescribed categories and were asked to play until they either won or lost a tournament. The final result data from all the tournaments was gathered, and sorted once again according to the categories into which the human players had been divided.

Figure 23 provides a brief outlook of Anki V1's performance against the human players. Each point on a performance curve is the cumulative average of Anki V1's money at that point in time, measured here in number of games. The important points in the game are also labelled with their game number.

Figure 23. Anki V1's performance against human players

It can be seen from the figure above that Anki V1 succeeds in its primary objective of beating the beginner player, i.e. the tournament ends with Anki V1 having all of the 2000 money on the

table. The beginners involved in the testing found the player to be quite informative and user friendly; however, they did sometimes require assistance in trying to understand the winning situations with kickers, etc.

Intermediate players finished better off against Anki V1, but only after a good struggle. It can be seen from the graph that Anki V1 managed to get the upper hand very early in the game, while the human players tried to control their losses. About halfway through the graph (marked at game 560), Anki V1 can be seen to lose a lot of money. This was mostly attributed to two of the players having a couple of very big games that went their way around that time. This lucky break allowed the human players to move close to winning; however, as the graph shows, it took a fair amount of commitment to finish off Anki V1. This can also be seen from the fact that it took an average of 1308 hands for the Intermediate Human Players to finally beat the Anki V1 Player.

The general feedback from intermediate players was positive, whereby they felt that the player had a lot to offer if it incorporated a looser or more aggressive form of betting strategy. The general strategy of the Human Intermediate Players became 'bet-first'. They utilised an exceedingly loose strategy, as it led to Anki V1 folding on most occasions. Similarly, closer to the end, the players commented on how they were beginning to trust the tightness of Anki V1, i.e. they folded when they saw Anki V1 fighting hard for its cards. This was an expected result from the intermediate bench, as Anki V1 definitely had the shortcoming of being partially predictable.

It is also clear from Figure 23 how Anki V1 succumbed to the aggressive and loose behaviour of the Advanced Players. Yet, it is against these players that Anki V1 can show its best traits. Anki V1's relatively quick defeat at the hands of the Advanced Players was expected, due to its failure to cause doubts in the opponent's mind.
The advanced players began to trust the computer's tightness strategy from the beginning and used this to their advantage.

Apart from all these well-understood problems, Anki V1 still needed to prove its worth in at least one of the departments for which it was created, i.e. the quality of its evaluation of playing hands. The best quality of human hand evaluation is obviously available from the advanced players. And it is quite clear that, by bluffing and aggressive play, Anki V1 can be beaten without the need for

extensive evaluation knowledge. However, the comparison between the Human and Anki V1 evaluators needs to be done to establish its competence.

In the final result file generated through human play, it was noted that the majority of AI losses were due to folding early on in the game. Each of these resulted in the loss of 10 or 20 units of money, but they were so frequent that they led to Anki V1's downfall. Thus, to properly estimate the power of Anki V1's evaluator, these smaller values need to be progressively removed. Figure 24 provides two indexes to measure Anki V1's true capabilities. The indexes are grouped by 'Bet Placed By Anki V1', which is the minimum amount committed by Anki V1 in the game. Thus a game with 20+ of Bet Placed By Anki V1 removes the games in which Anki V1 or the human player folded right after a person bet in the first round of betting, i.e. all the games with the value of just 10.

[Figure 24 chart: Relative Performance Index and Money acquired per winning game, plotted against Bet Placed by Anki V1.]

Figure 24. Anki V1 evaluation against advanced human players using a couple of indexes explained in text. Once again, the improvement is visible.

The first index that can be seen slowly rising is the Relative Performance Index. It is calculated by the formula given in Figure 25. As expected, this value is above 1 for Anki V1 from the start. This is because Anki V1 only plays games it believes it will win, so the result per won game for Anki V1 will be higher than that of an opponent who plays aggressively to just win 10 units most

of the time. The Relative Performance Index can also be seen to increase in the graph, which indicates Anki V1's growing dominance in the higher valued games. This shows that Anki V1 wins larger games more often than advanced players.

\[ \text{Relative Performance Index} = \frac{\text{Anki V1's Performance Index}}{\text{Opponent's Performance Index}} \]

\[ \text{Performance Index} = \frac{\text{Money won from game type}}{\text{Games won in game type}} \]

Figure 25. Formulas used to calculate Relative Performance Index

The second index concerns the Money acquired per Winning Game. This is calculated by the formula given in Figure 26. As expected, this value starts off in the negative. This is due to the high majority of 10 valued games that Anki V1 loses, which also ultimately costs Anki V1 the tournament. However, Anki V1's true brilliance is shown when the 10 valued games are removed. Instantly, the value jumps from money being lost between each victory by Anki V1 to 9.88 money being won with every game. In addition to the increase, Anki V1 can be seen to perform even better as the values of games increase, and more money is at stake.

\[ \text{Money acquired per winning game} = \frac{\text{Money Anki V1 won from game types} - \text{Money Opponent won from game types}}{\text{Games Anki won in game type}} \]

Figure 26. Formula used to calculate Money acquired per winning game

Evaluation of Anki V2

Like Anki V1, Anki V2 was also played against both computer and human players. Due to the time constraints, Anki V2 was played against Human players with the strict constraints of Tightness and Aggressiveness expressed in Chapter 4, Methodology. The strict constraints and a few slightly modified versions were played against the pre-designed AI Players to further understand the workings of strategy and evaluation in a successful Poker Player. More

on the actual experiments and their results is given in the subsections below, with certain aspects of the research expressed in Chapter 6, Conclusions and Future Work.

Anki V2 vs. Computer players

Tests for Anki V2 start from the easiest opponent, i.e. the Completely Random Player. Anki V1 has set the bar for most of the test results of Anki V2, and thus Anki V2 would be expected to win all the tournaments against this player, like Anki V1.

The amount of result data, however, is more restricted in this case. Anki V1 could play 10,000 tournaments in an hour and was thus given the ideal figure of 10,000 tournaments. On the other hand, Anki V2 takes 1 to 2 seconds for each evaluation that it creates. This introduces a massive time lag into the system and, given the time constraints of the project, it is unreasonable to play 10,000 tournaments. Instead, the figure has been reduced to 100 tournaments. This figure may seem very small, but it is the best compromise that can be considered. 100 tournaments take about a day to finish, and provide an average of 25,000 games, which is more than twenty times the recommended amount required to get rid of the luck factor [10].

Figure 27 shows a graphical representation of the Anki V2 vs. Completely Random Player tournaments. As can be seen, Anki V2 passes the first test and easily wins all its tournaments against Random 1.

[Figure 27 chart: % Victory (% Tournaments Won and % Games Won) for Anki V2.]

Figure 27. Anki V2's performance against Completely Random Player

The next test for Anki V2 comes from Random 2, or the Non-Folding Random Player. This should be a true test for Anki V2, as Anki V2 is quite a moderate player in terms of tightness, and tends to fluctuate considerably between tightness and looseness. This is mostly due to the incorporation of the Simulation engine, which allows some deliberate randomness to creep into the system.

Figure 28 shows the result of the first test conducted between Anki V2 and Random 2. As can be seen, it was a complete failure. This was mostly due to Anki V2's inability to sustain a strategy. It is found to abandon reasonably good cards in which it has invested a lot of money just because their chance of winning falls below the specified threshold. Unlike Anki V1, Anki V2 saw a lot of hands folded in the later stages of the game, after they had already been deemed worth playing. Betting round memory was found to be inadequate, and the need for game memory was recognised.

Figure 28. Original Anki V2's failure against Random 2 (% Victory, shown as % Tournaments Won and % Games Won)

Anki V2 still comes out superior to Anki V1 and Random 2, through the capabilities incorporated into the Player to make it more like a learning human being. A human being can change his/her strategy based on the opponent, and if this power is imparted to the Player, then the Player would only need expert knowledge. Anki V2 has the ability to change its strategy by simply changing a couple of numbers in the program, a task that the Player would hopefully do by itself in much later versions. A human being can see the operation of Random 2, and quite clearly, the current strategy operated by Anki V2 is not sufficient to battle it. Random 2 is very loose and aggressive, and thus requires a loose evaluating player to beat it as well. Where Anki V1 was willing to battle to the showdown with a simple pair, Anki V2 is too tight, and can sometimes even fold a low Flush. Quite clearly, this is not acceptable. A change needed to be made to the Tightness setting of the Player, the pre-defined number against which the Winning Potential of the Player's hand is compared. Chapter 4, Methodology, described how the setting of 60 was chosen to limit Anki V2's 'silly' actions, i.e. to prevent it from folding anything that is too good by mistake. To make the player looser, this number needed to be lowered. Figure 29 shows the results of the experiments that did exactly this.
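The effect of lowering the Tightness setting can be illustrated with a minimal Python sketch. It is not the thesis's Prolog code: the function names, the 0 to 100 scale for Winning Potential and the simple "greater than or equal" comparison are assumptions made for illustration, and the sketch ignores the randomised element of Anki V2's betting.

```python
def is_play_worthy(winning_potential: float, threshold: float = 60.0) -> bool:
    """Hypothetical tightness check: keep playing a hand only if its
    Winning Potential (assumed to lie on a 0-100 scale) reaches the threshold."""
    return winning_potential >= threshold


def hands_played(potentials, threshold):
    """Return the Winning Potentials of the hands that survive the check."""
    return [wp for wp in potentials if is_play_worthy(wp, threshold)]


if __name__ == "__main__":
    # Illustrative Winning Potentials, not data from the experiments.
    sample = [25.0, 45.0, 58.0, 72.0]
    for threshold in (60.0, 30.0):
        print(threshold, hands_played(sample, threshold))
```

With these made-up values, a threshold of 60 keeps only the strongest hand, while a threshold of 30 keeps three of the four, which is the sense in which a lower setting makes the player looser.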

Figure 29. Increasing performance of Anki V2 as the Looseness of the player is increased, i.e. as the Winning Potential Threshold is lowered (chart titled "Loose and Tight Anki V2": % Victory, shown as % Tournaments Won and % Games Won, against Winning Potential Threshold (Looseness of Betting Strategy))

The above figure gives the desired result, and thus, through a specific experimental mechanism, it can be seen that Anki V2 can easily beat Random 2 provided it is supplied with the right modifications. It is this ability of Anki V2 to easily change its strategy that will be of further use in the development of future versions, more of which is discussed in Chapter 6, Conclusions and Future Work. Section 5.5 also compares the results received above to those of Anki V1. Time constraints on the project prevented further experimentation with Tightness settings of 20 and 10. However, it is expected that Anki V2 would suffer from such low settings, as they would make it almost equivalent to the Non-Folding Random Player itself, and thus level the playing field.

Anki V2's Evaluation vs. Human

Once again, the experiments on the program were conducted on the three broad categories of humans labelled Beginner, Intermediate and Advanced. The complex and randomised strategy base of Anki V2 removed the possibility of anyone understanding the player beforehand; thus the category upgrade that certain people had received in the Anki V1 experiments is now no

longer necessary. There is, however, another issue at hand: those beginners who participated in both the Anki V1 and Anki V2 experiments bring with them knowledge and strategy to deal with the poker player. Also, as this strategy closely resembles that of Random 2, and the original Anki V2's incompetence against it has been seen, these Human Players were considered Intermediate. In addition, it is imperative to understand that due to the time and resource constraints of this project, only limited users and time were available for a single batch of experiments. Thus, the original value of 60 for Tightness was retained, and so were all the values of Aggressive and Conservative behaviour. Figures 30 and 31 show the results of the experiments conducted with Human Players. Figure 30 allows a closer look at Beginner and Advanced Players, which are not too clear in Figure 31. Also, once again, there were at least 3 persons present per playing group.

Figure 30. Anki V2's Performance playing against Beginner and Advanced Humans

Figure 31. Anki V2's Performance playing against Human Players

Overall, Anki V2 was found to perform satisfactorily. Its exact comparison to Anki V1 is given in the section below. There was a lot of positive feedback from the Human players, especially from the Advanced Players, who seemed to find the player quite good. Against the Beginner players, Anki V2 was found to be quite superior; however, two of the three beginners commented on the player's good luck at the time. Even if the true performance of Anki V2 is actually lower at the current setting, it can be expected that through strategy modification and experimentation, a player can be created that is better suited to fight beginners. The player lost, as expected, to the Advanced Players, once again not putting up much of a fight. Once again, due to time constraints, modified versions of the player which were more loose and aggressive were not tried, but they would have offered some insight into the playing capabilities of the master level players, and a chance to try to imitate them. The intermediate players took a long time to understand the working of Anki V2 (up to 1686 games), but were once again able to exploit the Tightness in pre-flop and the inability to stick to a strategy for the whole game. Anki V2 was found to fold often after the turn or the river because the probability of winning dropped and it chose a 'bad' random number. This flaw

encouraged the Humans to be even more aggressive and loose, allowing them to once again imitate the Random 2 Player.

Anki V1 vs. Anki V2

The most important results also lie in the comparison between the two players Anki V1 and Anki V2. This comparison is two-fold. Firstly, there is the direct play, in which the two Players Anki V1 and Anki V2 play against each other. Secondly, the players are compared on their performances against the Human and Random 2 Players.

Direct Anki Comparison

The two players faced each other in a direct competition similar to the experiments performed on them with Random 2, i.e. 100 tournaments. Figure 32 shows the result of the first series of experiments, playing Anki V1 against the original Anki V2.

Figure 32. Original Anki V2 playing against Anki V1 (Anki V2's Performance: % Victory, shown as % Tournaments Won and % Games Won)

It is visible from the figure that Anki V2 performed on a par with Anki V1, so there is no apparent increase in performance. However, similar to Anki V2's fight against Random 2, the most

powerful tool that Anki V2 possesses is the power to adapt or change strategies to fight its opponent. Anki V1 offers a high tightness scale, where it only plays the games that it is sure it would do well in. Thus, the ideal strategy against such a player is either to play according to the opponent's play, i.e. use opponent modeling, or to increase the aggressiveness and thereby bet more often. The latter scenario would allow the player to maintain its tightness strategy and yet behave more appropriately against the opponent. Figure 33 shows Anki V2's changing performance with the change in betting aggressiveness.

Figure 33. Changing Anki V2 to fight Anki V1, and succeeding (Anki V2 Players' Performance: % Victory, shown as % Tournaments Won and % Games Won, for Conservative Anki, Original Anki and Aggressive Anki)

It is clear once again from Figure 33 that the abilities of Anki V2 give it clear superiority over Anki V1. Conservativeness is introduced into the player by changing the base values of the probability triple formation, making it more likely to check or bet than to raise. Similarly, aggressiveness is incorporated by reducing the chances of checking.

Anki Comparison in Human and Random Players

The first comparison lies in the performance of the players against Random 2, the Non-Folding Random Player. It is clear that the Original Anki V2 fails in comparison, as it fails to win even a single tournament against Random 2. But it is best to compare the top performer of Anki V1, i.e. the final Player, against the best performance of Anki V2, with a Tightness Index of 30, as discussed in the section on Anki V2 vs. Computer players. Figure 34 shows this comparison.

Figure 34. Comparison of Anki V1's and Anki V2's performance against Random 2 (Performance Index: % Victory, shown as % Tournaments Won and % Games Won, for Anki V1 and Anki V2)

It is clear from the above figure that Anki V2 outperforms Anki V1, though both of them put in an excellent performance, beating the Random 2 Player more than 90% of the time in all cases. The next comparison lies in the players' play against Beginner Human Players. Figure 35 below shows the two curves presented in Figures 23 and 31, so that their contrast is more visible.

Figure 35. Comparison of Anki V1 and Anki V2 against Beginner Humans

As is clear from the figure above, Anki V2 beats the human players slightly faster than Anki V1, and all this with the settings of the Original Anki V2 engine. A more adaptive or experimentally enhanced Anki V2 Player could easily outdo Beginner Players, thereby proving its dominance. Figure 36 shows the Intermediate Human Player comparison for Anki V1 and Anki V2. Both Players were found to lose in the long run, but the major comparison lies in the delay of loss. The longer the computer player lasts, the more chance it has to recover or understand the opponent.

Figure 36. Comparison of Anki V1 and Anki V2 against Intermediate Humans

It is clear from the figure that Anki V2 takes longer to lose against the Intermediate players. Thus, Anki V2 proves its improvement over Anki V1, even without the necessary adaptation to Human competition. Finally, the comparison lies in the play against Advanced Players. Figure 37 shows this comparison.

Figure 37. Comparison of Anki V1 and Anki V2 against Advanced Humans

It is in this figure that we see some disappointing results, as Anki V2 loses faster than Anki V1. However, once again, further experiments with a more aggressive and intelligently tight version of Anki V2 would offer a better insight into the true power of Anki V2, as it would be expected to win more and in a more efficient manner. Also, apart from the numerical results, all the advanced players who had played Anki V1 as well commented on the uncertainty of Anki V2. One of the Advanced Players had the following insight to offer:

"Anki V2 has clearly introduced randomisation; this is forcing me to concentrate harder and play a smarter game. I am less likely to blindly bet pre-flop, am forced to fold later in the game and don't understand the player's hand till after the turn. Anki V1 offered the same card information at every betting round (which Anki V2 doesn't)." [24]

The advanced players were surprised to find that they had beaten Anki V2 faster than Anki V1. They commented on how they found Anki V2 a much harder player, and thought that the only reason for a faster tournament was that each game became worth a lot more due to Anki V2's randomised betting strategy even in the face of bad hands. It is also for this reason that the players treated Anki V2 with more respect, as it fought harder than Anki V1.

Anki and the Previous Research

Some of the results that were observed in the sections given above are consistent with many of the

observations received from the result sets of Loki, Poki and PsOpti. This supports the consistency of the program and the results. Most of the similarities lie in the patterns observed playing against both Intermediate and Advanced Players. Figure 5 from Chapter 2, Literature Review, is reproduced below as Figure 38. The figure will be discussed in detail, taking comments made about it from [2], and comparing it to the results obtained in the evaluation of Anki V1 and Anki V2 against all categories of Human Players. The play in Figure 38 corresponds to PsOpti, the best two-person Poker player available, against a Master level player known as 'the count'.

Figure 38. 'The count's' (Human Player's) performance against PsOpti, same as Figure 5.

Starting from the left-most point on the graph above, it can be seen that PsOpti won the first couple of hands. This is the learning period, during which human opponents try to understand each other and take a few risks just to see the opponent's reactions. PsOpti does not incorporate any opponent modeling, similar to both the Anki versions created in this project, and thus starts to play its near-optimal strategies from the beginning, oblivious to the workings of its opponent. Exactly the same behaviour can also be seen in Anki V1's play against all beginner, intermediate and advanced players. This can be confirmed from Figures 35, 36 and 37 by observing the play of Anki V1. Anki V2 has the opposite result, but this is due to player experience against Anki V1; this is also discussed later in this section.

'Luck plays a huge factor in poker'; this is a statement that has been repeated many times in both this thesis and in many previous research papers. This can be seen in Figure 38, at both the sharp drops, at around 2500 and 5500 hands played. These are the sharpest drops in the graph, and have been attributed to luck in [3]. Similar trends can also be seen in Figure 36, in the plays of both Anki V1 and Anki V2 against Intermediate Human Players. These games against the Intermediate Players are long enough for certain luck factors to show themselves, and they do. It can also be noted that these luck factors introduce the steepest curves seen in any of the analysis. Thus, the factor of luck cannot be denied in Poker play. Another observation made in both PsOpti and the versions of Anki is that of the 'Blink Factor', or the fact that humans get tired, but machines never do. The first fall of 'the count' at around 2500 hands was due to luck, but he continued to perform badly, and he stopped after a while saying that he was tired and wanted to retire for the day. He came back the next day and started to recover. Once again, the Anki analysis that best shows this result is the long tournaments Anki V2 played against Intermediate Human Players. At least half of the Intermediate players complained of fatigue at this point in the game, and commented on how they 'took a short break to gather their senses' before continuing once again. This 'Blink Factor' is something a future version of Anki could be taught to exploit, as Human players' ability seems to take a serious fall during this time. Lastly, an aspect of memory that needs to be developed, and that human beings already utilise in their strategies, is that of 'remembering the best strategy to compete against an opponent'. Neither PsOpti nor the versions of Anki try to remember the strategies that helped them win against the opponent, and this makes a difference to the output.
After the 'Blink Factor' experienced by 'the count', he bounces back after around 3800 hands. This is because, even after a day's break, he remembers the strategies he had played earlier to good effect, and utilises these strategies once again to increase his performance. This phenomenon can be seen in Figures 36 and 37 for Anki V2's plays against Intermediate and Advanced players. Anki V1 was seen to perform well right at the start, as was noted at the start of this discussion, but Anki V2 starts off quite poorly. This is simply because the Human Players attack Anki V2 with a strategy similar to the one they learned playing against Anki V1, and even though Anki V2 is a better player, that strategy is found to work well against it as well.

The final discussion concerning memory can be seen as a form of opponent modeling, but it is not. Opponent modeling requires extensive calculation of the opponent's behaviour, and the use of expert systems to decide on battling strategies. Memory is simpler, and only needs to increase the probability of playing strategies that consistently lead to positive results. Finally, the most relevant observation that can be made from the play of PsOpti against 'the count' and that of Anki V2 against Intermediate Human Players is that the similarities between them are uncanny. They play against master and intermediate level players respectively, and produce very similar results. PsOpti claims to be a sub-master level player, whereby it can play two-player tournaments against most humans and win, losing only in the case of master level players. Similarly, Anki V2 can make a claim to be at least a sub-intermediate level player. Its dominance against beginners has already been shown, and if anything, the performance of Anki V2 can be expected to increase with a different Aggressive and Tightness strategy against Intermediate-level Human Players. This feature also gives both Anki V1 and Anki V2 the ability to be excellent learning tools, as they offer a form of Level 1 and Level 2 of Playing Capabilities to teach a Human Player the strategies and manner of playing Poker. More on the subject is discussed in Chapter 6, Conclusions and Future Work.
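The simple form of memory described in this discussion, raising the probability of strategies that have led to positive results, could be sketched as follows. This is a hypothetical Python illustration, not part of Anki V1 or Anki V2: the class name, the strategy labels and the multiplicative `boost` factor are all assumptions.

```python
import random


class StrategyMemory:
    """Hypothetical strategy memory: every strategy starts with equal weight,
    and a strategy's weight grows each time it leads to a win."""

    def __init__(self, strategies, boost=1.2):
        self.weights = {name: 1.0 for name in strategies}
        self.boost = boost  # assumed multiplicative reward for a win

    def record_result(self, strategy, won):
        # Only wins change the weights; losses leave them untouched.
        if won:
            self.weights[strategy] *= self.boost

    def probabilities(self):
        total = sum(self.weights.values())
        return {name: w / total for name, w in self.weights.items()}

    def choose(self, rng):
        # Sample a strategy in proportion to its remembered weight.
        names = list(self.weights)
        return rng.choices(names, weights=[self.weights[n] for n in names])[0]


if __name__ == "__main__":
    memory = StrategyMemory(["tight-aggressive", "loose-aggressive"])
    for _ in range(5):
        memory.record_result("loose-aggressive", won=True)
    print(memory.probabilities())
    print(memory.choose(random.Random(0)))
```

No model of the opponent is built here; the player only keeps a running preference for whatever has worked, which is the distinction from opponent modeling drawn above.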

Chapter 6 Conclusion and Future Work

General conclusions

Four computer players, i.e. Random 1, Random 2, Anki V1 and Anki V2, along with initial strict strategy players such as Always-Checks1, Always-Checks2, Always-Bets and Always-Raises, were created by the author of this thesis. The project succeeded in its goal to make Knowledge and Strategy based Texas Hold'em Poker Players in Prolog. The players created, Anki V1 and Anki V2, were of good quality and provided many positive results. They were created using different approaches, which were well documented, and have been presented in this thesis. The documentation also verifies the stability and soundness of the program, which allows the results to be considered reliable. Through these results, the quality of Anki V1 and Anki V2 can be judged to be good, as they both appear to be sub-intermediate level, with the possibility of Anki V2 excelling given the correct settings. Both Anki V1 and Anki V2 were evaluated against Random 1, a completely random player; Random 2, a non-folding random player; and three categories of Human Players: Beginner, Intermediate and Advanced. The project followed the life-cycle shown in Figure 39. The author's previous knowledge and extensive reading provided a basis for Human Heuristics. These heuristics were formulated into Computer/Machine Heuristics. Experimentation led to many lessons being learned and fed back to improve the Human Heuristics, which led to further formulation of Computer Heuristics.

Figure 39. Life-cycle of the Project leading to a better formulation of Human Heuristics

6.2 - Conclusions of Anki V1 and Anki V2

Anki V1 used a Knowledge Base and Rule-Based Hand Evaluation technique to convert each hand into a form of pattern. These patterns were grouped together using a technique known as Bucketing, which was found to be very effective, and similar to the manner in which the human brain treats its poker hands. These buckets were assigned scores based on their potential to form winning hands, and further, these scores were converted into strategies which changed over the game but remained strict over a betting round. Anki V1 played very well against Beginner Human Players and won all its tournaments. It fought hard against the Intermediate Human Players, but lost eventually, and finally, lost comprehensively against Advanced Players. The main reason behind its defeat was found to be the predictable tight strategy that it followed, more on which is discussed in Section 6.3. Anki V2 offered a Simulation and Enumeration based approach to the two-player Poker Playing scenario. It calculated its Winning Potential by playing simulated games with its current hand and other available knowledge. This Winning Potential was converted into Probability Triples through formulas that control the player's Aggressive and Conservative behaviours. These probability triples remained static over a betting round, but were used to create a controlled randomised betting action every time the player was faced with a decision. The decisions were slightly altered through the Winning Potential Threshold that controlled the player's Tightness and Looseness behaviours. Finally, the player recorded the opponent's actions in the game, and used enumeration to make small changes to the Winning Potential to reflect it more accurately. Anki V2 was found to perform even better in most respects than Anki V1 through its ability to tweak the Aggressiveness and Tightness of its strategies.
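The step from Winning Potential to a randomised betting action can be sketched in Python as below. This is only an illustration under stated assumptions: the triple is taken to be (check, bet, raise), the base weights and the `aggression` parameter in [-1, 1] are invented for the example, and the actual formulas of Anki V2 (described in Chapter 4 and coded in Prolog) are not reproduced here.

```python
import random

ACTIONS = ("check", "bet", "raise")


def probability_triple(winning_potential, aggression=0.0):
    """Hypothetical mapping from Winning Potential (assumed 0-100) to a
    (check, bet, raise) probability triple. Positive aggression lowers the
    chance of checking and raises the chance of raising; negative aggression
    does the opposite."""
    strength = winning_potential / 100.0
    check = (1.0 - strength) * (1.0 - aggression)
    bet = 0.5  # fixed base weight, so the total is never zero
    raise_ = strength * (1.0 + aggression)
    total = check + bet + raise_
    return (check / total, bet / total, raise_ / total)


def choose_action(triple, rng):
    """Draw one betting action at random according to the triple."""
    return rng.choices(ACTIONS, weights=triple)[0]


if __name__ == "__main__":
    rng = random.Random(42)
    # The triple is computed once and reused for every decision in the round.
    triple = probability_triple(70.0, aggression=0.5)
    print(triple)
    print([choose_action(triple, rng) for _ in range(5)])
```

Holding the triple fixed while sampling a fresh action at each decision mirrors the description above: static within a betting round, yet unpredictable to the opponent.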
This advantage was apparent when Anki V2 played against Random 2 and Anki V1 in tournament play, as discussed in Section 5.5. Anki V2 beats all the Beginner Human Players, slightly better than Anki V1. It also loses against Intermediate players, but after a longer struggle than Anki V1. This is mostly due to its randomised betting action strategies, which make it harder to figure out. Finally, Anki V2 loses against Advanced Players very quickly, and worse than Anki V1. This is attributed to its moderate bluffing strategy, which changes in between betting rounds. Once again, this is discussed further in Section 6.3. Anki V2 can be expected to play even better and defeat Intermediate

Players if enough experiments are carried out with humans to tune its Aggressive and Tightness Behaviours against those of the Intermediate Human Players. Comparison of results obtained from PsOpti's play against the master level human player, 'the count', and that of Anki V2 against Intermediate Level Human Players revealed that Anki V2 was at least just below Intermediate Level in its play. This, coupled with the fact that Anki V2 can be further tweaked in its strategies to play Intermediate Level Players even better, means that it can possibly claim to be Intermediate Level with the right adjustments.

Conclusions of Poker Game and Betting Strategies

The creation and evaluation of Poker Players Anki V1 and Anki V2 have revealed a lot of interesting results and conclusions relating to human strategies. As discussed earlier, one category of strategies concerns the relative number of hands played, which is described as Tight, Moderate or Loose. Anki V1 was found to be very tight, as it only played the hands it had a very good chance of winning, and Anki V2 was found to range from tight to moderate depending on its randomised betting strategy. The second category of player strategies concerns the amount bet per game. Both Anki V1 and Anki V2 were very aggressive. This was a good choice, as both players showed good winnings in the games that they were confident of winning, and thus the aggressive attitude managed to take more money than usual away from the Human Players they played against. The extremely tight strategy of Anki V1 was defeated by the loose strategies of Human Players. Beginners lost their games due to bad plays, and never sticking to a strategy. They also folded early in the game without trying to figure out the hand that Anki V1 might have. Thus, a tight strategy against a very tight intelligent player does not work.
Both advanced and intermediate human players eventually defeated Anki V1 by adopting loose strategies and trusting the player. Thus, a predictably tight player can be defeated by a reactive loose player. The human players had to be reactive loose players, i.e. react to the strategy of Anki V1 and fold when Anki V1 raised multiple times. The need for the reactive aspect can be seen by looking at the play of Anki V1 against Random 2. Random 2 was a very loose player and it lost 93% of the tournaments against Anki V1. It can thus be concluded that a tight aggressive player

will beat a very loose player. The need for aggressiveness follows from the reasoning that a tight player needs to make up for the money lost in folding numerous games by winning large amounts in the games that it plays to the end. Anki V2 offered a much better test-bed for strategy manipulation, as its aggressive and tightness strategies could be tweaked to give the desired combined strategy. Anki V2's victory against Beginners was for the same reason as that of Anki V1, but its longer struggle against Intermediate Human Players arose from its unpredictability and randomised betting strategy, which forced the human players to observe the player for longer to figure out a strategy to defeat it. Thus, a controlled randomised strategy is a much better option against human players, as it adds confusion to a human player's mind. Anki V2's quick defeat against Advanced players was also one of the project's biggest lessons. Even though it is known that Anki V2 could be tweaked to perform better against that category of human players, the conclusion drawn from the experiment was that betting strategies need to be remembered over betting rounds. This follows from the fact that most of the Advanced Players commented on Anki V2's excellent unpredictability, but poor choice to fold a bluffed hand after committing lots of money to the pot. Anki V2 also lost badly against the Random 2 player; this was because an unsure moderate (tightness index) player will lose to a completely loose player. However, when the moderate player is converted into an intelligent loose player, it can perform wonders, as the intelligent loose player was found to beat the completely loose player in over 98% of the tournaments played. 'Intelligently loose' refers to the fact that Anki V2 chose to fold the worst of its hands. Finally, the last conclusion was obtained through the play of Anki V1 against Anki V2. They were both created equal, and they performed as such.
But as usual, Anki V2's adaptive capabilities allowed it to gain an upper hand through modifications. In the case of equal tightness, the more aggressive player is found to win. This was proven through the experimentation of making Anki V2 more aggressive.

6.4 - Future Work

The Anki Poker Teaching Tool

The quality of Anki V1 and Anki V2 allows these players to be excellent tools for teaching beginner and intermediate players simple strategies of tightness and aggressiveness. These players have already been fully coded, with the Prolog code provided in Appendix A, and thus only need small additions to form teaching support. Both players have also proven their stability through both documentation and extensive play against computer and human players. Anki V1 plays against beginner players really well, and offers a great future as a tool for teaching poker to new human poker players. This is because it plays just above their level, and is based completely on rules and strict strategy. Each action of Anki V1 is justified through the hand strength or the hand potential, and it can use this information to suggest possible moves and provide support to beginner level human players. Human players at this level were found to have most trouble understanding the concepts of poker, and getting to grips with the winning patterns and simple loose and tight strategies. All of these aspects can be expressed through an Anki V1 based help engine. Anki V2 offers a slightly higher level teaching tool, as it encompasses not just tightness and looseness but also aggressive and conservative strategies. On top of that, it is also fully customisable, which allows a person to set the playing strength of Anki V2. Moreover, the complete control over Anki V2's strategy allows people learning poker to learn to play against Aggressive, Conservative, Tight, Loose and Moderate players, all using the same program. This way, users can experience play against particular strategies, and learn to play better. The strategy maker itself can be randomised, to give the user an Intermediate level playing platform. Once again, like Anki V1, coding of the actual methods has already been done.
Anki V2's learning tool can also provide help to its users, by giving them a good estimate of the winning probability of their current hand, thereby allowing them to learn the value of any kind of hand.

Testing and Extensions to project

The previous section discussed one of the possible applications of Anki V1 and Anki V2. There are also certain additions that can be made to the project, to both improve it and realise its full potential. Most of these additions involve human testing with the various settings of Anki V2. There is very little code to be added, but the input of experts to help provide better enumeration is explored further in the section on resource based extensions. Anki V1 has a static hand evaluation technique, and thus all of its groups and its scoring system were put to a complete test when it was played against both pre-coded players and Human players. The extension that can be considered for this player involves an increase in scoring categories for buckets and additional strategies. The current buckets are based on the patterns that were observed in Anki V1's hand. The scoring system only involves the numbers 0 to 4, in which 0 signifies the strategy of 'check else fold', 1 to 2 allocate betting, 3 allocates raising and 4 signifies an excellent finished pattern which does not need to be re-evaluated. The scoring system can be extended to provide support for additional strategies, some of which are present in Poki [10], like 'check else bet', 'bet if opponent checks, else check', etc. Anki V2, being the second player that was created, did not get the full extensive testing it deserved to show its abilities against human players. It showed how, by changing its internal strategies, it could perform much better against static strategy players. Using the same reasoning, and an automated strategy modifier, it could force its opponents to frequently change their strategy against Anki V2 as well, thereby increasing its performance. Also, this would allow a much more extensive set of conclusions, similar to those documented in the conclusions on poker game and betting strategies.

Resource based extensions to project

One of the major aspects missing from this project was that of expert systems.
All previous players such as Loki, Poki and PsOpti had expert input available from a master level player, i.e. Darse Billings. The author of this project started as an Intermediate Human Player and can now be considered Advanced at best. Darse Billings justifies a lot of his heuristics and the expert systems of Loki through past experiences [13], which are not expressed in the research in the extensive detail required for the coding of a similar Poker Player. Thus, the availability of

input from a high-level Poker Player can offer a lot to the expert systems of Anki V1 and the strategy formation of Anki V2. The final help of an expert lies in the enumeration strategy of Anki V2, which can only be confirmed by master level players due to the strategy's complexity and its presence mostly in master-level tournaments. Anki V1 and Anki V2 also both estimate the Winning Potential, through bucketing and simulation respectively. Higher computation power and a time-efficient platform can help overcome the exponential blow-up of the exact statistical method of Winning Potential calculation. This can help future computer players to have more exact strategies and assign near-optimal values to their hands at any point in the game. Another possible extension to the program is its migration to a platform-independent language such as Java, or the incorporation of a Java interface to the already existing Prolog code. This would allow the users of the program to test it at their convenience, and provide the possibility of publishing it online. Anki V1 and Anki V2 are very good players in their own right, but they form an even better foundation for future work to understand human behaviour and replicate it in the Poker Playing community. As the results from this project show, the day when computer players for Imperfect Information Games compete well with Master level players in full size tournaments is definitely close at hand.

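The 0 to 4 scoring system of Anki V1 described in this chapter can be sketched as a small lookup table. This is a minimal illustration under two assumptions: a finished pattern (score 4) is played like score 3, and the betting rules are reduced to two inputs, the amount needed to continue and whether a raise is currently allowed. All predicate and strategy names below are hypothetical; they only loosely mirror better/2 and the betting_round clauses of Appendix A. The check_else_bet clauses show how one of the additional Poki-style strategies could be added without touching the existing ones.

```prolog
% strategy_for_score(+Score, -Strategy): Anki V1 scores 0..4 mapped to a strategy.
strategy_for_score(0, check_else_fold).
strategy_for_score(1, bet).
strategy_for_score(2, bet).
strategy_for_score(3, raise_else_bet).
strategy_for_score(4, raise_else_bet).   % assumption: finished pattern played like score 3

% action(+Strategy, +ToCall, +RaiseAllowed, -Action)
% ToCall is the amount needed to stay in the hand; RaiseAllowed is yes or no.
action(check_else_fold, 0, _, check).
action(check_else_fold, ToCall, _, fold) :- ToCall > 0.
action(bet, _, _, bet).
action(raise_else_bet, _, yes, raise).
action(raise_else_bet, _, no, bet).
% Hypothetical extension: one of the extra strategies mentioned for Poki.
action(check_else_bet, 0, _, check).
action(check_else_bet, ToCall, _, bet) :- ToCall > 0.

% decide(+Score, +ToCall, +RaiseAllowed, -Action): score to concrete action.
decide(Score, ToCall, RaiseAllowed, Action) :-
    strategy_for_score(Score, Strategy),
    action(Strategy, ToCall, RaiseAllowed, Action).
```

For example, decide(0, 10, yes, A) yields A = fold, and decide(3, 10, yes, A) yields A = raise. Extending the scoring range would then amount to adding strategy_for_score/2 facts that point at new strategies.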
Bibliography

1. D. Koller, A. Pfeffer; Generating and Solving Imperfect Information Games. IJCAI 1995.
2. D. Billings, N. Burch, A. Davidson, R. Holte, J. Schaeffer, T. Schauenberg, D. Szafron; Approximating Game-Theoretic Optimal Strategies for Full-scale Poker. IJCAI 2003.
3. D. Papp; Dealing with imperfect information in poker. Master's thesis, Department of Computing Science, University of Alberta.
4. S. J. J. Smith and D. S. Nau; Strategic planning for imperfect-information games. In Working Notes of the AAAI Fall Symposium on Games: Planning and Learning.
5. D. Koller, A. Pfeffer; Representations and Solutions for Game-Theoretic Problems. Artif. Intell. 94(1-2) (1997).
6. S. A. Gordon; A comparison between probabilistic search and weighted heuristics in a game with incomplete information, in: AAAI Fall
7. J. R. S. Blair, D. Mutchler and C. Liu; Games with imperfect information. In Proceedings of the AAAI Fall Symposium on Games: Planning and Learning (1993).
8. D. Koller, N. Megiddo, B. von Stengel; Fast algorithms for finding randomized strategies in game trees. STOC 1994 (1994).
9. M. van Lent and D. Mutchler; A pruning algorithm for imperfect information games. In Proceedings of the AAAI Fall Symposium on Games: Planning and Learning (1993).
10. D. Billings, A. Davidson, J. Schaeffer, D. Szafron; The challenge of poker. Artif. Intell. 134(1-2) (2002).
11. J. Schaeffer, D. Billings, L. Peña, D. Szafron; Learning to Play Strong Poker. In Proceedings of the Sixteenth International Conference on Machine Learning (ICML-99), J. Stefan Institute, Slovenia (Invited Paper).
12. J. Cassidy; The Last Round of Betting in Poker. The American Mathematical Monthly, Vol. 105, No. 9 (Nov., 1998).
13. D. Billings, L. P. Castillo, J. Schaeffer, D. Szafron; Using Probabilistic Knowledge and Simulation to Play Poker. AAAI/IAAI 1999.
14. J. Shi, M. L. Littman; Abstraction Methods for Game Theoretic Poker. Computers and Games 2000.
15. A. Junghanns, J. Schaeffer; Search Versus Knowledge in Game-Playing Programs Revisited. IJCAI (1) 1997.
16. J. F. Nash; Non-cooperative games. Ann. Math. 54 (1951).
17. J. F. Nash, L. S. Shapley; A simple three-person poker game. Contributions to the Theory of Games 1 (1950).
18. J. von Neumann, O. Morgenstern; The Theory of Games and Economic Behavior, 2nd Edition, Princeton University Press, NJ.
19. H. W. Kuhn; A simplified two-person poker. Contributions to the Theory of Games 1 (1950).
20. N. Findler; Studies in machine cognition using the game of poker. Communications of the ACM 20(4) (1977).
21. C. Cheng; Recognizing poker hands with genetic programming and restricted iteration. Genetic Algorithms and Genetic Programming at Stanford, J. Koza (editor), Stanford, California (1997).
22. D. Sklansky and M. Malmuth; Texas Hold'em for the Advanced Player, Two Plus Two Publishing, 2nd edition.
23. K. Takusagawa; Nash equilibrium of Texas Hold'em poker. Undergraduate thesis, Computer Science, Stanford University.
24. Personal correspondence with Human Players.

Appendix A

Program Code

% This program is the one that was used to test Human Play against Anki V2. It still contains all the
% different players such as Anki V1, Random 1, etc. within it.

:- use_module(library(system)).

% 1-13 for Ace, 2, 3,... Jack, Queen and King
% 1-4 for Clubs, Diamonds, Hearts and Spades.

% Strategy 1 - Always bets.
% Strategy 2 - Always raises when possible, otherwise bets.
% Strategy 3 - Random choice including folding.
% Strategy 4 - Random choice without folding.
% Strategy 5 - Uses initial eval to decide fold(check) or bet.
% Strategy 6 - Uses initial eval to decide fold(check) or raise.
% Strategy 7 - Uses initial eval to decide fold(check) or bet or raise.
% Strategy 8 - Uses initial eval to decide Strategy 3 or 4.
%

:- dynamic seed/1.

seed( ).

:- nl, nl,
   write('************ Welcome to Texas Hold Them ***************'), nl,
   write('* Each card is represented in a tuple *'), nl,
   write('* (Card-number,Card-suit) as denoted below: *'), nl,
   write('* *'), nl,
   write('* 1-13 for Ace, 2, 3,... Jack, Queen and King; *'), nl,
   write('* 1-4 for Clubs, Diamonds, Hearts and Spades. *'), nl,
   write('*******************************************************'), nl,
   write('Please give seed : '), read(X),
   retract(seed(_)), assert(seed(X)), nl,
   write('*******************************************************'), nl,
   write('Please write play. to start game!'), nl,
   write('*******************************************************'), nl.

% rand(R) generates random real number R in the range [0,1)
rand(R) :-
    retract(seed(S)),
    N is (S * ) mod ,
    assert(seed(N)),
    R is N / 

% random(R,M) generates random integer R in the range 0..M-1
random(R,M) :-
    rand(RR),

    R is integer(M * RR).

random_float(R,M) :-
    rand(RR),
    R is (M * RR).

play:-
    open('ac_poker_result.pl', append, Stream),
    play_poker(X, (0,0,0,0,0,0,0), Stream),
    (X = 0 -> true ; play(X, Stream)),
    nl, write('*******************************************************'),
    nl, write('Please write play. again to play more games.'),
    nl, write('*******************************************************'), nl,
    close(Stream).

play(X, Stream):-
    play_poker(Y, X, Stream),
    (Y = 0 -> true ; play(Y, Stream)).

play_poker(X, N1, Stream):-
    initialise_game(A),
    set_players(A, B, C, , ),
    play_game(A, B, C, X, 0, N1, Stream),!.
play_poker(X, Y, Z, 'n', N, N1, Stream):-
    end_game_check(Y, Z, X, N, N1, Stream),!.
play_poker(X, Y, Z, 'y', N, N1, Stream):-
    initialise_game(A),
    set_players(A, B, C, Y, Z),
    play_game(A, B, C, X, N, N1, Stream),!.
play_poker(X, Y, Z, _, N, N1, Stream):-
    nl, write('Kindly choose from the given options of y or n!'),
    nl, write('Do you wish to play another game? '),
    read(A),
    play_poker(X, Y, Z, A, N, N1, Stream).

initialise_game(X) :-
    game_hand(X).

game_hand(X):-
    run_rand(X, [], 4),!.

set_players(A, B, C, D, E) :-
    A = [M,N|P],
    B = (1, [M,N], D),
    C = (2, P, E).

end_game_check(B1, C1, X, N, (_, _, _, _, N6, N7, N9), Stream) :-
    nl, write('*******************************************************'),
    nl, write('The tournament has ended. Player 1 has '), write(B1),
    write(' money, and Player 2 has '), write(C1), write('.'), nl,
    nl(Stream), write(Stream, 'The game has ended. Player 1 has '), write(Stream, B1),

    write(Stream, ' money, and Player 2 has '), write(Stream, C1), write(Stream, '.'), nl(Stream),
    X = 0,
    (B1 > C1 -> write('Player 1 wins the tournament!') ; write('Player 2 wins the tournament!')),
    write('There were '), write(N), write(' games played.'), nl,
    write('Player 1 won '), write(N6), nl,
    write('Player 2 won '), write(N7), nl,
    write('Drawn games were '), write(N9),
    write(Stream, 'There were '), write(Stream, N), write(Stream, ' games played.'),
    nl(Stream), write(Stream, 'Player 1 won '), write(Stream, N6),
    nl(Stream), write(Stream, 'Player 2 won '), write(Stream, N7),
    nl(Stream), write(Stream, 'Drawn games were '), write(Stream, N9),!.

play_game(_, (_,_,0.0), (_,_,C1), X, N, (N1, N2, N3, T, N6, N7, N9), Stream) :-
    T1 is T + N, N4 is N3 + 1, N5 is N1 + 1,
    nl, write('*******************************************************'),
    nl, write('The tournament has ended. Player 1 has 0 money, and Player 2 has '), write(C1), write('.'),
    nl, write('Player 2 wins the tournament! '),
    write('There were '), write(N), write(' games played.'),!,
    (N5 > 0 ->
        X = 0, write(N5), write(' tournaments played.'), nl,
        write('Total games played: '), write(T1), nl,
        write('Player 1 won '), write(N2), write(' and '), write(N6), nl,
        write('Player 2 won '), write(N4), write(' and '), write(N7), nl,
        write('Drawn games were '), write(N9),
        nl(Stream), write(Stream, 'Total games played: '), write(Stream, T1),
        nl(Stream), write(Stream, 'Player 1 won '), write(Stream, N6),
        nl(Stream), write(Stream, 'Player 2 won '), write(Stream, N7),
        nl(Stream), write(Stream, 'Drawn games were '), write(Stream, N9)
    ;   X = (N5, N2, N4, T1, N6, N7, N9)).
play_game(_, (_,_,5.0), (_,_,C1), X, N, (N1, N2, N3, T, N6, N7, N9), Stream) :-
    T1 is T + N, N4 is N3 + 1, N5 is N1 + 1,
    nl, write('*******************************************************'),
    nl, write('The tournament has ended. Player 1 has 5 money, and Player 2 has '), write(C1), write('.'),
    nl, write('Player 2 wins the tournament! '),
    write('There were '), write(N), write(' games played.'),!,
    (N5 > 0 ->
        X = 0, write(N5), write(' tournaments played.'), nl,
        write('Total games played: '), write(T1), nl,
        write('Player 1 won '), write(N2), write(' and '), write(N6), nl,
        write('Player 2 won '), write(N4), write(' and '), write(N7), nl,
        write('Drawn games were '), write(N9),
        nl(Stream), write(Stream, 'Total games played: '), write(Stream, T1),
        nl(Stream), write(Stream, 'Player 1 won '), write(Stream, N6),
        nl(Stream), write(Stream, 'Player 2 won '), write(Stream, N7),
        nl(Stream), write(Stream, 'Drawn games were '), write(Stream, N9)
    ;   X = (N5, N2, N4, T1, N6, N7, N9)).
play_game(_, (_,_,B1), (_,_,0.0), X, N, (N1, N2, N3, T, N6, N7, N9), Stream) :-

    T1 is T + N, N4 is N2 + 1, N5 is N1 + 1,
    nl, write('*******************************************************'),
    nl, write('The tournament has ended. Player 2 has 0 money, and Player 1 has '), write(B1), write('.'),
    nl, write('Player 1 wins the tournament! '),
    write('There were '), write(N), write(' games played.'),!,
    (N5 > 0 ->
        X = 0, write(N5), write(' tournaments played.'), nl,
        write('Total games played: '), write(T1), nl,
        write('Player 1 won '), write(N4), write(' and '), write(N6), nl,
        write('Player 2 won '), write(N3), write(' and '), write(N7), nl,
        write('Drawn games were '), write(N9),
        nl(Stream), write(Stream, 'Total games played: '), write(Stream, T1),
        nl(Stream), write(Stream, 'Player 1 won '), write(Stream, N6),
        nl(Stream), write(Stream, 'Player 2 won '), write(Stream, N7),
        nl(Stream), write(Stream, 'Drawn games were '), write(Stream, N9)
    ;   X = (N5, N4, N3, T1, N6, N7, N9)).
play_game(_, (_,_,B1), (_,_,5.0), X, N, (N1, N2, N3, T, N6, N7, N9), Stream) :-
    T1 is T + N, N4 is N2 + 1, N5 is N1 + 1,
    nl, write('*******************************************************'),
    nl, write('The tournament has ended. Player 2 has 5 money, and Player 1 has '), write(B1), write('.'),
    nl, write('Player 1 wins the tournament! '),
    write('There were '), write(N), write(' games played.'),!,
    (N5 > 0 ->
        X = 0, write(N5), write(' tournaments played.'), nl,
        write('Total games played.'), write(T1), nl,
        write('Player 1 won '), write(N4), write(' and '), write(N6), nl,
        write('Player 2 won '), write(N3), write(' and '), write(N7), nl,
        write('Drawn games were '), write(N9),
        nl(Stream), write(Stream, 'Total games played.'), write(Stream, T1),
        nl(Stream), write(Stream, 'Player 1 won '), write(Stream, N6),
        nl(Stream), write(Stream, 'Player 2 won '), write(Stream, N7),
        nl(Stream), write(Stream, 'Drawn games were '), write(Stream, N9)
    ;   X = (N5, N4, N3, T1, N6, N7, N9)).
play_game(A, (1,B,B0), (2,C,C0), X, N, N2, Stream) :-
    N1 is N + 1, B1 is B0-10, C1 is C0-10,
    nl, write('*******************************************************'),
    nl, write('Your hand (personal 2 cards) is : '), write(B),
    nl, write('*******************************************************'), nl,
    nl(Stream), write(Stream, 'Player 1 has cards : '), write(Stream, B),
    nl(Stream), write(Stream, 'Player 2 has cards : '), write(Stream, C),
    % First round evaluation and betting.
    eval_start_good(C, Good1),

    write(Stream, Good1),
    %eval_start(C, E),
    %better(E, _),!,
    betting_round(1, B, C, [], B1, C1, 20.0, B2, C2, R1, 'c', P, 0, 0, Good1, Stream, 0, _),
    % Second Round evaluation and betting.
    get_flop(A, Y, T1, P, Stream),
    eval_flop_good(C, Y, Good2, T1),
    write(Stream, Good2),
    append(C, Y, Cf),
    ace_it(Cf, Cf1),
    q_sort(Cf1, [], Cf2),
    %setof(E2, eval_flop(C, Cf2, 2, E2), E3),
    %get_best_str(E3, E4),
    %better(E4, E5),!,
    betting_round(1, B, C, Y, B2, C2, R1, B3, C3, R2, P, P1, 0, 0, Good2, Stream, 0, Chan),
    write(Stream, Chan),
    % Third round evaluation and betting.
    get_turn(T1, D, T2, P1, Stream),
    eval_turn_good(C, Y, [D], Good3, T2, Chan),
    write(Stream, Good3),
    append([D], Y, F),
    insert_card(D, Cf2, Ct),!,
    %setof(E6, eval_turn(C, Ct, 2, E6), E7),
    %get_best_str(E7, E8),
    %better(E8, E9),!,
    betting_round(1, B, C, F, B3, C3, R2, B4, C4, R3, P1, P2, 0, 0, Good3, Stream, Chan, Chan1),
    write(Stream, Chan1),
    % Final round of betting
    get_river(T2, G, P2, Stream),
    eval_river_good(C, Y, [D], [G], Good4, Chan1),
    write(Stream, Good4),
    append([G], F, H),
    insert_card(G, Ct, Cr),!,
    %setof(E10, eval_river(C, Cr, 2, E10), E11),
    %get_best_str(E11, E12),
    %better(E12, E13),!,
    betting_round(1, B, C, H, B4, C4, R3, B5, C5, R4, P2, P3, 0, 0, Good4, Stream, 0, _),
    % Final evaluation - Complete!
    final_eval(B, C, H, Cr, B5, C5, R4, B6, C6, P3, P4, Stream),
    game_add(P4, N2, (N3,N4,N5,N6,N7,N8,N9)),
    nl, write('*******************************************************'), nl,
    (P4 = 3 ->
        write('It is a draw!'), write(Stream, 'It is a draw!')
    ;   write('Player '), write(P4), write(' has won!'),
        write(Stream, 'Player '), write(Stream, P4), write(Stream, ' has won!')),
    nl, write('Player 1 now has '), write(B6), write(' money.'),
    nl, write('Player 2 now has '), write(C6), write(' money.'), nl,

    write('*******************************************************'), nl,
    nl(Stream), write(Stream, ' ; ; ; '), write(Stream, N1),
    write(Stream, ' ; '), write(Stream, N7), write(Stream, ' ; '), write(Stream, N8),
    write(Stream, ' ; '), write(Stream, N9), write(Stream, ' ; '), write(Stream, B6),
    write(Stream, ' ; '), write(Stream, C6),
    nl, nl, nl, write('Do you wish to play another hand? Type y for yes and n for no : '),
    read(X1),
    play_poker(X, B6, C6, X1, N1, (N3,N4,N5,N6,N7,N8,N9), Stream).

insert_card([], Ct, Ct).
insert_card((1,S), X, [(14,S)|T]):-
    append(X, [(1,S)], T).
insert_card((A,S), [], [(A,S)]).
insert_card((A,S), [(A1,S1)|T], [(A,S),(A1,S1)|T]):-
    A > A1.
insert_card(X, [H|T], [H|T1]):-
    insert_card(X, T, T1).

betting_round(A, _, _, _, B1, C1, R1, B1, C1, R1, 'f', A, _, _, _, _, _, 0).
betting_round(_, _, _, _, B1, C1, R1, B1, C1, R1, 1, 1, _, _, _, _, _, 0).
betting_round(_, _, _, _, B1, C1, R1, B1, C1, R1, 2, 2, _, _, _, _, _, 0).
betting_round(_, _, _, _, B1, C1, R1, B1, C1, R1, 'm', 'm', _, _, _, _, _, 0).
betting_round(A, _, _, _, 0.0, C1, R1, 0.0, C1, R1, _, 'm', _, _, _, Stream, _, 0):-
    nl, write('Player '), write(A), write(' has finished his/her money'),
    nl(Stream), write(Stream, 'Player '), write(Stream, A), write(Stream, ' has finished his/her money').
%betting_round(A, _, _, _, B1, 0.0, R1, B1, 0.0, R1, _, 'm', _, _, _, _):-
%    change_over(A, 'm', B),
%    nl, write('Player '), write(B), write(' has finished his/her money').
betting_round(A, _, _, _, 5.0, C1, R1, 5.0, C1, R1, _, 'm', _, _, _, Stream, _, 0):-
    nl, write('Player '), write(A), write(' has insufficient money'),
    nl(Stream), write(Stream, 'Player '), write(Stream, A), write(Stream, ' has insufficient money').
%betting_round(A, _, _, _, B1, 5.0, R1, B1, 5.0, R1, _, 'm', _, _, _, _):-
%    change_over(A, 'm', B),
%    nl, write('Player '), write(B), write(' has insufficient money').
betting_round(_, _, _, _, B1, C1, R1, B1, C1, R1, 'b2', 'c', _, _, _, _, C, C).
betting_round(_, _, _, _, B1, C1, R1, B1, C1, R1, 'c3', 'c', _, _, _, _, C, C).
betting_round(1, B, C, D, B1, C1, R1, B2, C2, R2, P, P1, N, M, E, Stream, Chan, Chan1):-
    nl, nl, nl, write('*******************************************************'),
    nl, write('Player 1'), nl,
    nl, write('The current state of the poker game is as follows :'),
    nl, write('Your cards are '), write(B), write('.'),
    nl, write('The cards on the table are '), write(D), write('.'),
    nl, write('Your money is '), write(B1), write('.'),
    nl, write('Your opponent has '), write(C1), write(' money.'),
    nl, write('The money currently in the pot is '), write(R1), write('.'), nl,
    (M > 0 -> write('You need to bet a minimum of '), write(M), write(' to continue.') ; true),
    nl, write('Please choose one of the following options for betting,'),
    nl, write('f - fold, b - bet'),
    ((M > 0, N < 6, B1 >= 20, C1 >= 10) -> write(', r - raise') ; write('')),
    (M = 0 -> write(', c - check : ') ; write(' : ')),
    read(X),
    eval(X, B1, C1, R1, B3, R3, M, M1, 1, N, X1, Stream),!,
    change_over(1, X1, A1),
    change_over(P, X1, P2),

    change_over_num(N, X1, N1),
    next_betting_round(A1, B, C, D, B3, C1, R3, B2, C2, R2, P2, P1, N1, M1, X1, E, Stream, Chan, Chan1).

%Always raises, but in other situations, bets from below
betting_round(2, B, C, D, B1, C1, R1, B2, C2, R2, P, P1, N, M, 2, Stream, Chan, Chan1):-
    M > 0, N < 6, B1 >= 20, C1 >= 10,
    eval('r', B1, C1, R1, B3, R3, M, M1, 2, N, X1, Stream),!,
    change_over(2, X1, A1),
    change_over(P, X1, P2),
    change_over_num(N, X1, N1),
    next_betting_round(A1, B, C, D, B3, C1, R3, B2, C2, R2, P2, P1, N1, M1, X1, 2, Stream, Chan, Chan1).

%Always checks otherwise bets or folds from below
betting_round(2, B, C, D, B1, C1, R1, B2, C2, R2, P, P1, N, M, 0, Stream, Chan, Chan1):-
    M = 0,
    eval('c', B1, C1, R1, B3, R3, M, M1, 2, N, X1, Stream),!,
    change_over(2, X1, A1),
    change_over(P, X1, P2),
    change_over_num(N, X1, N1),
    next_betting_round(A1, B, C, D, B3, C1, R3, B2, C2, R2, P2, P1, N1, M1, X1, 0, Stream, Chan, Chan1).
betting_round(2, B, C, D, B1, C1, R1, B2, C2, R2, P, P1, N, M, 0, Stream, Chan, Chan1):-
    eval('f', B1, C1, R1, B3, R3, M, M1, 2, N, X1, Stream),!,
    change_over(2, X1, A1),
    change_over(P, X1, P2),
    change_over_num(N, X1, N1),
    next_betting_round(A1, B, C, D, B3, C1, R3, B2, C2, R2, P2, P1, N1, M1, X1, 0, Stream, Chan, Chan1).

%Strategy player
betting_round(2, B, C, D, B1, C1, R1, B2, C2, R2, P, P1, N, M, (E1, E2, E3, Win), Stream, Chan, Chan1):-
    random_float(E4, 100),
    write(Stream, E4),
    choose_rel_str(E4, E1, E2, E3, S1),!,
    change_win(Win, Win1, P, Chan, Chan2),
    exact_str(S1, Win1, S2, M, N, B1, C1),!,
    eval(S2, B1, C1, R1, B3, R3, M, M1, 2, N, X1, Stream),!,
    change_over(2, X1, A1),
    change_over(P, X1, P2),
    change_over_num(N, X1, N1),
    next_betting_round(A1, B, C, D, B3, C1, R3, B2, C2, R2, P2, P1, N1, M1, X1, (E1,E2,E3,Win1), Stream, Chan2, Chan1).

%Always bets
betting_round(2, B, C, D, B1, C1, R1, B2, C2, R2, P, P1, N, M, E, Stream, Chan, Chan1):-
    eval('b', B1, C1, R1, B3, R3, M, M1, 2, N, X1, Stream),!,

    change_over(2, X1, A1),
    change_over(P, X1, P2),
    change_over_num(N, X1, N1),
    next_betting_round(A1, B, C, D, B3, C1, R3, B2, C2, R2, P2, P1, N1, M1, X1, E, Stream, Chan, Chan1).

%Random strategy chooser with folding
%betting_round(2, B, C, D, B1, C1, R1, B2, C2, R2, P, P1, N, M, E, Stream, Chan, Chan1):-
%    (M = 0 -> Y1 is 1 ; Y1 is 2),
%    ((M > 0, N < 6, B1 >= 20, C1 >= 10) -> Y2 is 5 ; Y2 is 4),
%    ret_rand(Y1, Y2, X),
%    eval(X, B1, C1, R1, B3, R3, M, M1, 2, N, X1, Stream),!,
%    change_over(2, X1, A1),
%    change_over(P, X1, P2),
%    change_over_num(N, X1, N1),
%    next_betting_round(A1, B, C, D, B3, C1, R3, B2, C2, R2, P2, P1, N1, M1, X1, E, Stream, Chan, Chan1).

%Random strategy chooser without folding
%betting_round(2, B, C, D, B1, C1, R1, B2, C2, R2, P, P1, N, M, E, Stream, Chan, Chan1):-
%    (M = 0 -> Y1 is 5 ; Y1 is 6),
%    ((M > 0, N < 6, B1 >= 20, C1 >= 10) -> Y2 is 8 ; Y2 is 7),
%    ret_rand(Y1, Y2, X),
%    eval(X, B1, C1, R1, B3, R3, M, M1, 2, N, X1, Stream),!,
%    change_over(2, X1, A1),
%    change_over(P, X1, P2),
%    change_over_num(N, X1, N1),
%    next_betting_round(A1, B, C, D, B3, C1, R3, B2, C2, R2, P2, P1, N1, M1, X1, E, Stream, Chan, Chan1).

next_betting_round(A, B, C, D, B1, C1, R1, B2, C2, R2, P, P1, N, M, 's', E, Stream, Chan, Chan1):-
    betting_round(A, B, C, D, B1, C1, R1, B2, C2, R2, P, P1, N, M, E, Stream, Chan, Chan1),!.
next_betting_round(A, B, C, D, B1, C1, R1, B2, C2, R2, P, P1, N, M, _, E, Stream, Chan, Chan1):-
    betting_round(A, C, B, D, C1, B1, R1, C2, B2, R2, P, P1, N, M, E, Stream, Chan, Chan1).

choose_rel_str(X, C, _, _, 'c'):-
    X =< C.
choose_rel_str(X, C, _, R, 'r'):-
    C1 is C + R,
    X =< C1.
choose_rel_str(_, _, _, _, 'b').

exact_str('b', _, 'b', _, _, _, _).
exact_str('r', _, 'r', M, N, B1, C1):-
    M > 0, N < 6, B1 >= 20, C1 >= 10.
exact_str('r', _, 'b', _, _, _, _).
exact_str('c', _, 'c', M, _, _, _):-
    M = 0.

exact_str('c', Win, 'f', M, _, _, _):-
    M > 0, Win < 60.
exact_str('c', _, 'b', _, _, _, _).

change_win(Win, Win1, 'r', Chan, Chan1):-
    Win1 is Win - 2,
    Chan1 is Chan - 2.
change_win(Win, Win1, 'c2', Chan, Chan1):-
    Win1 is Win + 2,
    Chan1 is Chan + 2.
change_win(Win, Win, _, Chan, Chan).

better(0, 0).
%For the random strategies
%better(_,1).
better(E, 1):-
    E > 0, E < 3.
better(E, 2):-
    E > 2, E < 5.

get_best_str([4|_], 4).
get_best_str([H], H).
get_best_str([H|T], E):-
    get_best_str(T, E1),
    (H > E1 -> E = H ; E = E1).

get_flop(_, _, [], 1, _).
get_flop(_, _, [], 2, _).
get_flop(X, Y, Z, _, Stream) :-
    run_rand([A,B,C|X], X, 3),
    Y = [A,B,C],
    nl, nl, write('*******************************************************'),
    nl, write('The flop (first three of five community cards) is : '), write(Y),
    nl, write('*******************************************************'), nl,
    nl(Stream), nl(Stream), write(Stream, 'The flop is : '), write(Stream, Y),
    append(X,Y,Z).

get_turn(_, _, [], 1, _).
get_turn(_, _, [], 2, _).
get_turn(Z, D, E, _, Stream) :-
    run_rand([D|Z], Z, 1),
    nl, nl, write('*******************************************************'),
    nl, write('The turn (fourth of five community cards) is : '), write(D),
    nl, write('*******************************************************'), nl,
    nl(Stream), nl(Stream), write(Stream, 'The turn is : '),

    write(Stream, D),
    E = [D|Z].

get_river(_, [], 1, _).
get_river(_, [], 2, _).
get_river(Z, D, _, Stream) :-
    run_rand([D|Z], Z, 1),
    nl, nl, write('*******************************************************'),
    nl, write('The river (final community card) is : '), write(D),
    nl, write('*******************************************************'), nl,
    nl(Stream), nl(Stream), write(Stream, 'The river is : '), write(Stream, D).

eval('c', B1, _, R1, B1, R1, M, 0, A, _, 'c', Stream):-
    M = 0,
    nl, nl, write('Player '), write(A), write(' has checked.'),
    nl(Stream), write(Stream, 'Player '), write(Stream, A), write(Stream, ' has checked.').
eval('c', B1, _, R1, B1, R1, M, M, _, _, 's', _):-
    nl, nl, write('Did I say that you could check? Did I? Huh? Huh? Go up and check...'),
    nl, write('I did not, did I? So, please answer correctly.').
eval('b', B1, _, R1, B2, R2, _, 10, A, _, 'b', Stream):-
    B2 is B1-10,
    R2 is R1 + 10,
    nl, nl, write('Player '), write(A), write(' has bet.'),
    nl(Stream), write(Stream, 'Player '), write(Stream, A), write(Stream, ' has bet.').
eval('r', B1, C1, R1, B2, R2, M, 10, A, N, 'r', Stream):-
    M > 0, N < 6, B1 >= 20, C1 >= 10,
    B2 is B1-20,
    R2 is R1 + 20,
    nl, nl, write('Player '), write(A), write(' has raised.'),
    nl(Stream), write(Stream, 'Player '), write(Stream, A), write(Stream, ' has raised.').
eval('r', B1, _, R1, B1, R1, M, M, _, _, 's', _):-
    nl, nl, write('Did I say that you could raise? Did I? Huh? Huh? Go up and check...'),
    nl, write('I did not, did I? So, please answer correctly.').
eval('f', B1, _, R1, B1, R1, _, _, A, _, 'f', Stream):-
    write('Player '), write(A), write(' has folded!'),
    write(Stream, 'Player '), write(Stream, A), write(Stream, ' has folded!').
eval(_, B1, _, R1, B1, R1, M, M, _, _, 's', _):-
    nl, nl, write('Did I say that you could write that? Did I? Huh? Huh? Go up and check...'),
    nl, write('I did not, did I? So, please answer correctly.').

change_over(1, 's', 1).
change_over(1, _, 2).
change_over(2, 's', 2).
change_over(2, _, 1).
change_over(P, 's', P).
change_over('c', 'c', 'c2').
change_over('c2', 'c', 'c3').

change_over('b', 'b', 'b2').
change_over('c', 'b', 'b').
change_over('c2', 'b', 'b').
change_over('r', 'b', 'b2').
change_over(_, 'r', 'r').
change_over(_, 'f', 'f').

change_over_num(N, 's', N).
change_over_num(N, _, N1):-
    N1 is N + 1.

final_eval(_, _, _, _, B1, C1, R, B2, C1, 1, 1, _):-
    B2 is B1 + R.
final_eval(_, _, _, _, B1, C1, R, B1, C2, 2, 2, _):-
    C2 is C1 + R.
final_eval(_, _, _, _, B1, C1, R, B2, C2, 3, 3, _):-
    R1 is R / 2,
    C2 is C1 + R1,
    B2 is B1 + R1.
final_eval(B, C, D, C3, B1, C1, R, B2, C2, _, X, Stream):-
    nl, nl, nl, write('*******************************************************'),
    nl, write('Player 1 has the following cards : '), write(B),
    nl, write('Player 2 has the following cards : '), write(C),
    nl, write('The following cards are on the table : '), write(D), nl, nl,
    nl(Stream), nl(Stream), write(Stream, 'Player 1 has the following cards : '), write(Stream, B),
    nl(Stream), write(Stream, 'Player 2 has the following cards : '), write(Stream, C),
    nl(Stream), write(Stream, 'The following cards are on the table : '), write(Stream, D), nl(Stream),
    append(B, D, B3),
    ace_it(B3, B4),
    q_sort(B4, [], B5),
    winner_eval(B5, C3, X, Win),
    final_eval(_, _, _, _, B1, C1, R, B2, C2, X, X, _),!,
    write('*******************************************************'), nl,
    write(Win),
    write(Stream, Win).

winner_eval(B, C, X, 'Royal Flush'):-
    royal_flush(B, 0, A1, 0, 1, Y),
    royal_flush(C, A1, A2, 0, 2, Z),
    (A2 = 0 -> fail ; true),
    (A2 = 3 -> kick_eval(Y, Z, X, [4,_,_]) ; X is A2).
winner_eval(B, C, X, '4 of a kind'):-
    multiple_kind(B, 0, A1, 1, Y, 0, 3, 15),
    multiple_kind(C, A1, A2, 2, Z, 0, 3, 15),
    (A2 = 0 -> fail ; true),
    (A2 = 3 -> kick_eval(Y, Z, X1, [4,_,_]) ; X1 is A2),

    (X1 = 3 -> get_kicker(B, C, Y, X, 4) ; X is X1).
winner_eval(B, C, X, 'Full House'):-
    multiple_kind(B, 0, A1, 2, Y1, 0, 2, 15),
    (A1 = 0 -> A2 = 0 ; multiple_kind(B, A1, A2, 1, Y2, 0, 1, Y1)),
    multiple_kind(C, 0, C1, 2, Z1, 0, 2, 15),
    (C1 = 0 -> C2 = 0 ; multiple_kind(C, C1, C2, 1, Z2, 0, 1, Z1)),
    full_house_eval(A2, C2, A3),!,
    (A3 = 3 -> kick_eval(Y1, Z1, X1, [4,_,_]) ; X1 is A3),
    (X1 = 3 -> kick_eval(Y2, Z2, X, [4,_,_]) ; X is X1).
winner_eval(B, C, X, 'Flush'):-
    flusher(B, 0, A1, 1, Y, 3),
    flusher(C, A1, A2, 2, _, 3),
    (A2 = 0 -> fail ; true),
    (A2 = 3 -> flush_decide(Y, B, C, X, 0) ; X is A2).
winner_eval(B, C, X, 'Sequence'):-
    sequen(B, 0, A1, 0, 1, Y, 4),
    sequen(C, A1, A2, 0, 2, Z, 4),
    (A2 = 0 -> fail ; true),
    (A2 = 3 -> kick_eval(Y, Z, X, [4,_,_]) ; X is A2).
winner_eval(B, C, X, '3 of a kind'):-
    multiple_kind(B, 0, A1, 1, Y, 0, 2, 15),
    multiple_kind(C, A1, A2, 2, Z, 0, 2, 15),
    (A2 = 0 -> fail ; true),
    (A2 = 3 -> kick_eval(Y, Z, X1, [4,_,_]) ; X1 is A2),
    (X1 = 3 -> get_kicker(B, C, Y, X, 3) ; X is X1).
winner_eval(B, C, X, '2 pair'):-
    multiple_kind(B, 0, A1, 2, Y1, 0, 1, 15),
    (A1 = 0 -> A2 = 0 ; multiple_kind(B, A1, A2, 1, Y2, 0, 1, Y1)),
    multiple_kind(C, 0, C1, 2, Z1, 0, 1, 15),
    (C1 = 0 -> C2 = 0 ; multiple_kind(C, C1, C2, 1, Z2, 0, 1, Z1)),
    full_house_eval(A2, C2, A3),!,
    (A3 = 3 ->
        (greater(Y1, Y2, Y11, Y22), greater(Z1, Z2, Z11, Z22), kick_eval(Y11, Z11, X1, [4,_,_]))
    ;   X1 is A3),
    (X1 = 3 -> kick_eval(Y22, Z22, X2, [4,_,_]) ; X2 is X1),
    (X2 = 3 -> get_kicker(B, C, Y1, Y2, X, _) ; X is X2).
winner_eval(B, C, X, 'A pair'):-
    multiple_kind(B, 0, A1, 1, Y, 0, 1, 15),
    multiple_kind(C, A1, A2, 2, Z, 0, 1, 15),
    (A2 = 0 -> fail ; true),
    (A2 = 3 -> kick_eval(Y, Z, X1, [4,_,_]) ; X1 is A2),
    (X1 = 3 -> get_kicker(B, C, Y, X, 2) ; X is X1).
winner_eval(B, C, X, 'High Card'):-
    get_first(B, A1),
    get_first(C, A2),
    kick_eval(A1, A2, X1, [4, _, _]),
    (X1 = 3 -> get_kicker(B, C, A1, X, 1) ; X is X1).

royal_flush([], A, A, _, _, []).
royal_flush([(H, _)|_], A, A1, 4, Y, H):-

    A1 is A + Y.
royal_flush([(H1, T1)|T], A, A1, C, Y, Z):-
    H2 is H1-1,
    member((H2, T1), T),
    C1 is C + 1,
    royal_flush([(H2, T1)|T], A, A1, C1, Y, Z),!.
royal_flush([_|T], A, A1, _, Y, Z):-
    royal_flush(T, A, A1, 0, Y, Z),!.

multiple_kind([(H, _)|_], A, A1, Y, H, X, X, N):-
    \+ N = H,
    \+ H = 1,
    A1 is A + Y,!.
multiple_kind([_], A, A, _, [], _, _, _).
multiple_kind([(H, _), (H, _)|T], A, A1, Y, Z, _, X, H):-
    multiple_kind(T, A, A1, Y, Z, 0, X, H),!.
multiple_kind([(H, _), (H, _)|T], A, A1, Y, Z, C, X, N):-
    C1 is C + 1,
    multiple_kind([(H, _)|T], A, A1, Y, Z, C1, X, N),!.
multiple_kind([_|T], A, A1, Y, Z, _, X, N):-
    multiple_kind(T, A, A1, Y, Z, 0, X, N),!.

flusher([], A, A, _, [], _).
flusher([(_, S)|T], A, A1, Y, Z, N):-
    count_and_rem_s((_, S), T, T1, C),
    (C > N -> ( A1 is A + Y, Z is S) ; ( flusher(T1, A, A1, Y, Z, N))),!.

sequen([], A, A, _, _, [],_).
sequen([(H, _)|_], A, A1, X, Y, H, X):-
    A1 is A + Y.
sequen([(H1, _)|T], A, A1, C, Y, Z, X):-
    H2 is H1-1,
    member((H2, _), T),
    C1 is C + 1,
    sequen([(H2, _)|T], A, A1, C1, Y, Z, X),!.
sequen([_|T], A, A1, _, Y, Z, X):-
    sequen(T, A, A1, 0, Y, Z, X),!.

full_house_eval(3, 3, 3).
full_house_eval(3, _, 1).
full_house_eval(_, 3, 2).

flush_decide(Y, B, C, X, 4):-
    get_max_s(B, Y, A1),
    get_max_s(C, Y, A2),
    kick_eval(A1, A2, X, [4,_,_]),!.
flush_decide(Y, B, C, X, N):-
    get_max_s(B, Y, A1),

    get_max_s(C, Y, A2),
    kick_eval(A1, A2, X, [Y, B, C, N]).

get_kicker(B, C, Y, X, 4):-
    count_and_rem_c((Y, _), B, B1, _),
    count_and_rem_c((Y, _), C, C1, _),
    get_first(B1, A1),
    get_first(C1, A2),
    kick_eval(A1, A2, X, [4,_,_]).
get_kicker(B, C, Y, X, N):-
    count_and_rem_c((Y, _), B, B1, _),
    count_and_rem_c((Y, _), C, C1, _),
    get_first(B1, A1),
    get_first(C1, A2),
    kick_eval(A1, A2, X, [N, B1, C1]).
get_kicker(B, C, Y1, Y2, X, _):-
    count_and_rem_c((Y1,_), B, B1, _),
    count_and_rem_c((Y1,_), C, C1, _),
    get_kicker(B1, C1, Y2, X, 4).

kick_eval(A1, A2, 1, _):-
    A1 > A2.
kick_eval(A1, A2, 2, _):-
    A2 > A1.
kick_eval(A1, A2, 3, [4,_,_]):-
    A1 = A2.
kick_eval(A1, A2, X, [N,B,C]):-
    A1 = A2,
    rem_one((A1,_), B, B1),
    rem_one((A1,_), C, C1),
    get_first(B1, A11),
    get_first(C1, A22),
    N1 is N + 1,
    kick_eval(A11, A22, X, [N1,B1,C1]).
kick_eval(A1, A2, X, [Y, B, C, N]):-
    A1 = A2,
    rem_one((A1,Y), B, B1),
    rem_one((A1,Y), C, C1),
    N1 is N + 1,
    flush_decide(Y, B1, C1, X, N1).

ret_rand(Y1, Y2, X):-
    Y3 is Y2 - Y1,
    random(Y4, Y3),
    Y is Y4 + Y1,
    ret_rand(Y, X).

ret_rand(1, 'c').
ret_rand(2, 'f').
ret_rand(3, 'b').
ret_rand(4, 'r').
ret_rand(5, 'c').
ret_rand(6, 'b').
ret_rand(7, 'r').

run_rand(X, X, 0).
run_rand(H, X, N) :-
    random(Y1, 13),
    Y is Y1 + 1,
    random(Z1, 4),
    Z is Z1 + 1,
    B is N - 1,
    (member((Y,Z),X) -> run_rand(H, X, N) ; run_rand(H, [(Y,Z)|X], B)).

eval_start(X, A) :-
    paired(X, A),!.
eval_start(X, A) :-
    suited(X, A1),
    sequenced(X, A1, A),
    A > 0,!.
eval_start(X, A) :-
    high(X, A, 8).

suited([(_,A),(_,A)], 1).
suited(_, 0).

paired([(A,_), (A,_)], X):-
    ((A > 9 ; A = 1) -> X is 3 ; X is 2).

sequenced([(A,_), (B,_)], X, X1) :-
    (B is A + 1 ; B is A - 1 ; A = 1, B = 13 ; B = 1, A = 13),
    greater(A, B, _, B1),
    (B1 > 9 -> X1 is X + 2 ; X1 is X + 1).
sequenced(_, A, A).

high([(A, _), (B, _)], 1, N):-
    (A = 1 ; A > N),
    (B = 1 ; B > N).
high(_,0, _).

eval_flop(_, _, 4, 4).
eval_flop(_, X, _, E):-
    sequen(X, 1, 2, 0, 1, A, 2),
    E1 is 0,
    (A > 9 -> A1 = A ; A1 is A - 1),
    (member((A1,_),X) -> E2 is E1 + 2 ; E2 = E1),
    A2 is A1-1,
    (member((A2,_),X) -> E is E2 + 2 ; E = E2).
eval_flop(B, X, _, E):-
    eval_flusher(X, E1, 1, A),
    member((_,A), B),
    (E1 = 2 -> B = [(_,A),(_,A)], E is E1 ; E is E1).

eval_flop(_, X, _, E):-
    multiple_kind(X, 0, 2, 2, Y1, 0, A1, 15),
    multiple_kind(X, 0, 2, 2, _, 0, A2, Y1),
    A1 > 0, A1 < 3, A2 > 0, A2 < 3,
    E is 3.
eval_flop(B, X, _, E):-
    eval_multiple(X, E1, A),
    (member((A,_),B) -> E = E1 ; high(B, E, 9)).
eval_flop(B, _, _, 1):-
    high(B, 1, 10).
eval_flop(_, _, _, 0).

eval_turn(_, _, 4, 4).
eval_turn(_, X, _, E):-
    sequen(X, 1, 2, 0, 1, A, 2),
    E1 is 0,
    A1 is A - 1,
    (member((A1,_),X) -> E2 is E1 + 2 ; E2 = E1),
    A2 is A1-1,
    (member((A2,_),X) -> E is E2 + 2 ; E = E2).
eval_turn(B, X, _, E):-
    eval_flusher(X, E1, 2, A),
    (E1 = 3 -> E = 2 ; E = 4),
    member((_,A), B).
eval_turn(B, X, _, E):-
    multiple_kind(X, 0, 2, 2, Y1, 0, A1, 15),
    multiple_kind(X, 0, 2, 2, Y2, 0, A2, Y1),
    A1 > 0, A1 < 4, A2 > 0, A2 < 4,
    (member((Y1,_),B) -> E1 is 2 ; E1 is 0),
    (member((Y2,_),B) -> E is E1 + 2 ; E is E1).
eval_turn(B, X, _, E):-
    eval_multiple(X, E1, A),
    (member((A,_),B) -> E = E1 ; high(B, E, 10)).
eval_turn(B, _, _, 1):-
    high(B, 1, 10).
eval_turn(_, _, _, 0).

eval_river(_, _, 4, 4).
eval_river(_, X, _, 4):-
    sequen(X, 1, 2, 0, 1, _, 4).
eval_river(B, X, _, 4):-
    eval_flusher(X, _, 3, A),
    member((_,A), B).
eval_river(B, X, _, 4):-
    multiple_kind(X, 0, 2, 2, Y1, 0, A1, 15),

    multiple_kind(X, 0, 2, 2, Y2, 0, A2, Y1),
    A1 > 0, A1 < 4, A2 > 0, A2 < 4,
    (member((Y1,_),B) ; member((Y2,_),B)).
eval_river(B, X, _, E):-
    eval_multiple(X, E1, A),
    (member((A,_),B) -> E = E1 ; high(B, E, 10)).
eval_river(B, _, _, 1):-
    high(B, 1, 10).
eval_river(_, _, _, 0).

eval_flusher([_,_], 0, _, 0).
eval_flusher([(_,S)|T], E, N, S1):-
    count_and_rem_s((_,S), T, T1, A),
    (A > N -> E = A, S1 = S ; eval_flusher(T1, E, N, S1)).

eval_multiple([(A,_),(A,_),(A,_),(A,_)|_], 4, A).
eval_multiple([(A,_),(A,_),(A,_)|_], 4, A).
eval_multiple([(A,_),(A,_)|_], E, A):-
    (A > 9 -> E is 3 ; E is 2).
eval_multiple([_|T], E, A):-
    eval_multiple(T, E, A).

eval_start_good(B, X):-
    good_eval(B, [], [], [], X, 0).

eval_flop_good(_, _, (0, 0, 0, 0), []).
eval_flop_good(B, F, X, _):-
    good_eval(B, F, [], [], X, 0).

eval_turn_good(_, _, _, (0, 0, 0, 0), [], _).
eval_turn_good(B, F, T, X, _, Chan):-
    good_eval(B, F, T, [], X, Chan).

eval_river_good(_, _, _, [[]], (0, 0, 0, 0), _).
eval_river_good(B, F, T, R, X, Chan):-
    good_eval(B, F, T, R, X, Chan).

good_eval(B, F, T, R, X, N):-
    play_pseudo_game(B, F, T, R, W, L, D, 0),
    Win is ((W + (D / 2)) / (W + D + L)) * 100,
    Win1 is Win + N,
    assign_str(Win1, X),!.

play_pseudo_game(_, _, _, _, 0, 0, 0, 1000).
play_pseudo_game(B, F, T, R, W, L, D, N):-
    N1 is N + 1,
    append(B, F, T, R, X),
    run_rand([C1, C2|X], X, 2),
    get_pseudo_flop(F, [C1, C2|X], F1, X1),!,
    get_pseudo_turn(T, X1, T1, X2),!,
    get_pseudo_river(R, X2, R1),!,

    final_pseudo_eval(B, [C1, C2], F1, T1, R1, Y), !,
    play_pseudo_game(B, F, T, R, W1, L1, D1, N1),
    adder(W1, L1, D1, Y, W, L, D).

get_pseudo_flop([], X, [A, B, C], [A, B, C|X]) :- run_rand([A, B, C|X], X, 3).
get_pseudo_flop(F, X, F, X).
get_pseudo_turn([], X, [T], [T|X]) :- run_rand([T|X], X, 1).
get_pseudo_turn(T, X, T, X).
get_pseudo_river([], X, [R]) :- run_rand([R|X], X, 1).
get_pseudo_river(R, _, R).

final_pseudo_eval(B, C, F, T, R, X) :-
    append(B, F, T, R, B1),
    append(C, F, T, R, C1),
    ace_it(B1, B2),
    ace_it(C1, C2),
    q_sort(B2, [], B3),
    q_sort(C2, [], C3),
    winner_eval(B3, C3, X, _), !.

adder(W1, L1, D1, 1, W, L1, D1) :- W is W1 + 1.
adder(W1, L1, D1, 2, W1, L, D1) :- L is L1 + 1.
adder(W1, L1, D1, 3, W1, L1, D) :- D is D1 + 1.

append(B, [], [], [], B).
append(B, [M, N, O], [], [], [M, N, O|B]).
append(B, [M, N, O], [P], [], [M, N, O, P|B]).
append(B, [M, N, O], [P], [Q], [M, N, O, P, Q|B]).

assign_str(Win, (C, B, R, Win)) :-
    Win =< 50,
    C is (80 - (Win / 2)),
    B is ((Win / 2) + 10),
    R is 10.
assign_str(Win, (C, B, R, Win)) :-
    Win =< 75,
    C is (90 - Win),
    B is (Win - 10),
    R is 20.
assign_str(Win, (C, B, R, Win)) :-
    Win > 75,
    C is 15,
    B is Win,
    R is Win   % the rest of this expression is cut off in the source

get_first([(H, _)|_], H).
get_last([(H, _)], H).
get_last([_|T], X) :- get_last(T, X).
get_max([(H,_)], H).
get_max([(H, _)|T], X) :-
    get_max(T, Z),
    ( H > Z -> X is H ; X is Z ).
get_max_s([(H,S)|_], S, H).
get_max_s([_|T], S, X) :- get_max_s(T, S, X).

count_and_rem_c(_, [], [], 0).
count_and_rem_c((X, _), [(X, _)|T], T1, A) :-
    count_and_rem_c((X, _), T, T1, A1),
    A is A1 + 1, !.
count_and_rem_c(X, [H|T], [H|T1], A) :-
    count_and_rem_c(X, T, T1, A), !.

count_and_rem_s(_, [], [], 0).
count_and_rem_s(X, [(1, _)|T], T1, A) :-
    count_and_rem_s(X, T, T1, A).
count_and_rem_s((_, X), [(_, X)|T], T1, A) :-
    count_and_rem_s((_, X), T, T1, A1),
    A is A1 + 1, !.
count_and_rem_s(X, [H|T], [H|T1], A) :-
    count_and_rem_s(X, T, T1, A), !.

rem_one(H, [H|T], T).
rem_one(H, [H1|T], [H1|X]) :- rem_one(H, T, X).

ace_it([], []).
ace_it([(1,S)|T], [(1,S),(14,S)|T1]) :- ace_it(T, T1).
ace_it([H|T], [H|T1]) :- ace_it(T, T1).

game_add(1, (A,B,C,D,N,E,F), (A,B,C,D,N1,E,F)) :- N1 is N + 1.
game_add(2, (A,B,C,D,E,N,F), (A,B,C,D,E,N1,F)) :- N1 is N + 1.
game_add(3, (A,B,C,D,E,F,N), (A,B,C,D,E,F,N1)) :- N1 is N + 1.

q_sort([], Acc, Acc).
q_sort([H|T], Acc, Sorted) :-
    pivoting(H, T, L1, L2),
    q_sort(L1, Acc, Sorted1),
    q_sort(L2, [H|Sorted1], Sorted).

pivoting(_, [], [], []).

pivoting((H,S), [(X,S1)|T], [(X,S1)|L], G) :- X =< H, pivoting((H,S), T, L, G).
pivoting((H,S), [(X,S1)|T], L, [(X,S1)|G]) :- X > H, pivoting((H,S), T, L, G).

greater(A, B, A, B) :- A > B.
greater(A, B, B, A) :- A < B.

member_rem(X, [X|T], T).
member_rem(X, [H|T], [H|T1]) :- member_rem(X, T, T1).

member(X, [X|_]).
member(X, [_|T]) :- member(X, T).

append([], X, X).
append([H|T], X, [H|Y]) :- append(T, X, Y).
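
The arithmetic in good_eval and assign_str can be tried on its own, without the card simulation. The following is a minimal sketch, not part of the listing. The names win_percentage/4 and strength_weights/2 are hypothetical, introduced here only so that they do not clash with the predicates above. The first applies the good_eval formula, counting each draw as half a win. The second copies only the two assign_str clauses that appear in full in the listing, so it deliberately fails for a win percentage above 75.

```prolog
% win_percentage(+W, +L, +D, -Win): hypothetical helper.
% Same formula as in good_eval: a draw counts as half a win.
win_percentage(W, L, D, Win) :-
    Total is W + L + D,
    Total > 0,                          % guard added for this sketch only
    Win is ((W + (D / 2)) / Total) * 100.

% strength_weights(+Win, -Weights): hypothetical copy of the two
% assign_str clauses that are complete in the listing.
strength_weights(Win, (C, B, R, Win)) :-
    Win =< 50,
    C is (80 - (Win / 2)),
    B is ((Win / 2) + 10),
    R is 10.
strength_weights(Win, (C, B, R, Win)) :-
    Win =< 75,
    C is (90 - Win),
    B is (Win - 10),
    R is 20.

% Example query: 400 wins, 400 losses and 200 draws out of 1000 games.
% once/1 keeps the first solution, as the cut in good_eval does.
% ?- win_percentage(400, 400, 200, Win),
%    once(strength_weights(Win, Weights)).
```

With these counts the win percentage is 50, so the first clause applies. In both copied clauses the three weights C, B and R sum to 100 for any value of Win, since the Win terms cancel.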