Artificial Intelligence and Poker
Two Carnegie Mellon University (CMU) researchers, computer science professor Tuomas Sandholm and PhD student Noam Brown, have developed an AI called Libratus that recently beat some of the world’s best human players at no-limit Texas Hold ’em, a version of poker in which players may bet any amount whenever it is their turn to bet. No machine had previously beaten top humans at this unusually complex card game. AI systems have beaten the best players at checkers, chess, Othello, and the strategy game Go, but no-limit Hold ’em poses a different obstacle: poker is an ‘imperfect information’ game. Because many of the cards are hidden, luck plays a part alongside skill.
Libratus is not the only AI attempting to tackle poker and, by extension, real-world problems involving incomplete information. A rival team of researchers at the University of Alberta published a paper reporting that their AI, DeepStack, had already beaten good human poker players using a markedly different strategy. (As of Feb 2017, the paper had yet to be peer-reviewed.)
DeepStack uses deep neural networks to mimic human intuition, in a design similar to that of AlphaGo, Google’s Go-playing AI. Go is complex, but like chess it is a game of perfect information: both players can see the entire board.
Texas Hold ’em, by contrast, is a card game with imperfect information. Each player is dealt two “hole” cards that only that player can see. Three communal cards are then dealt face up on the table, followed in turn by a fourth and a fifth card. Players can place bets after each stage of the deal, and in no-limit Texas Hold ’em, they can bet as much as they want at any stage.
The aim in poker is to win as much money as possible, not necessarily to win every hand. As the game progresses, it becomes a competition in which players try to guess the cards their opponents are holding, based not only on the most recent bet but on all the bets made by all the players over the course of the match. Players also bluff outright.
That’s why it is so tough for AIs to play poker. Libratus, running its calculations on a supercomputer at the Pittsburgh Supercomputing Center, has one big advantage over humans: in seconds, it can ‘play’ through tens of thousands of different scenarios of the game and use the results to decide on the best play.
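The idea of ‘playing’ many scenarios can be illustrated with a toy example. This is only a minimal sketch and not Libratus’ actual algorithm, which the article does not describe. It assumes a made-up one-card game in which each player holds a single hidden card and the higher rank wins. The function name estimate_win_rate, the number of trials and the seed are all hypothetical choices for illustration.

```python
import random

RANKS = list(range(2, 15))  # 2..14, where 14 stands for the ace
DECK = [(rank, suit) for rank in RANKS for suit in "cdhs"]


def estimate_win_rate(my_card, trials=20000, seed=0):
    """Estimate how often my_card beats one randomly dealt hidden card.

    Toy assumption: one card per player, higher rank wins, and a tie
    counts as half a win. Real poker hands are far more complicated.
    """
    rng = random.Random(seed)
    # The opponent's card is hidden, so sample it from the cards we cannot see.
    unseen = [card for card in DECK if card != my_card]
    score = 0.0
    for _ in range(trials):
        opponent = rng.choice(unseen)
        if my_card[0] > opponent[0]:
            score += 1.0
        elif my_card[0] == opponent[0]:
            score += 0.5
    return score / trials


if __name__ == "__main__":
    for card in [(14, "s"), (8, "h"), (2, "c")]:
        print(card, round(estimate_win_rate(card), 3))
```

In this toy game the exact answer for an ace is 49.5/51, or about 0.97, so the sampled estimate should land close to that. The point is only that sampling many possible hidden cards turns uncertainty about the opponent’s holding into a number a program can act on.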
DeepStack uses a different method. It doesn’t necessarily look all the way to the end of each possible scenario. Instead, it uses a neural network to guess the results of each play. The Alberta team ‘trained’ the DeepStack neural net on thousands of poker situations, looking at not just the cards but also the bets. In this way, the neural network ‘learns’ to judge which bets will be more successful. It does not calculate every possible outcome of every hand, but makes a fast estimate instead. DeepStack beat a number of fairly good players, but has some way to go to beat top players. In the meantime, Libratus has captured the attention of the poker world.
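The contrast between looking to the end of every scenario and stopping early with a guess can be sketched on a tiny, invented decision tree. This is not DeepStack’s method: the tree, the payoffs and the "guess" values (standing in for a trained neural network’s output) are all hypothetical, and the simple alternating search used here ignores hidden cards entirely, which a real poker program cannot do.

```python
# Leaves are payoffs to the player moving at the root. Internal nodes carry a
# "guess", a stand-in for what a trained network might predict for that spot.
TREE = {
    "guess": 0.0,
    "children": {
        "fold": -1.0,
        "call": {"guess": 0.5, "children": {"check": 2.0, "bet": -2.0}},
        "raise": {"guess": 1.5, "children": {"fold": 1.0, "call": 3.0}},
    },
}


def value(node, depth, maximizing):
    """Value of a node, searching at most `depth` more moves ahead."""
    if not isinstance(node, dict):
        return float(node)  # reached the end of the hand: exact payoff
    if depth == 0:
        return node["guess"]  # stop early and trust the estimate
    child_values = [
        value(child, depth - 1, not maximizing)
        for child in node["children"].values()
    ]
    return max(child_values) if maximizing else min(child_values)


def best_action(root, depth):
    """Score each root action with a lookahead of `depth` moves (depth >= 1)."""
    scores = {
        action: value(child, depth - 1, False)  # the opponent moves next
        for action, child in root["children"].items()
    }
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    print(best_action(TREE, 1))  # shallow search that relies on the guesses
    print(best_action(TREE, 2))  # plays every line out to the end
```

With a depth of 1 the search never reaches the end of the longer lines and relies on the guesses. With a depth of 2 it plays every line out. In this invented tree both pick the same action, but they score the options differently, which shows the trade-off: the shallow search is cheaper, and its quality depends on how good the estimates are.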
Libratus won over $US1.7 million against four of the world’s top professional poker players in a 20-day poker tournament that ended on 31 Jan 2017.
“The best AI’s ability to do strategic reasoning with imperfect information has now surpassed that of the best humans,” said Prof Sandholm.
One of the main improvements over Claudico, an earlier AI also from CMU, was Libratus’ ability to bluff.
Dong Kim, one of the four top poker players who took part in the tournament, had also played in a similar tournament against Claudico in 2015.
About the challenge from Libratus, Kim said, “It was about half way through the challenge. I knew we wouldn’t come back. It had less bugs in the algorithm. We just ran over Claudico, bluffed it everywhere, but this time I felt like it was the other way around.”
Against Claudico, the human players had won $US700,000 over 80,000 hands, beating it on almost every day of the tournament. Against Libratus, however, they came out ahead on only five of the 20 days.