Poker is a game that requires not only quick, smart decisions but also strong control over one's emotions. Humans have long excelled at it, but for the very first time an AI program has outwitted a whole table of poker pros with unseen moves and strategies.
Researchers from Facebook and Carnegie Mellon University (CMU) have created a poker-playing bot called Pluribus that has defeated some of the world’s top players in a series of games of six-person no-limit Texas Hold ‘em poker.
Pluribus played poker over 12 days and 10,000 hands against 12 pros in two different settings. In the first, the AI played 5,000 hands against five human players and consistently won more than its opponents.
In the second, five copies of the AI played against one human player, and the bot again emerged victorious. Pluribus won an average of $5 per hand, with hourly winnings of around $1,000.
The players Pluribus defeated were not easy opponents; each player involved in the games had previously won more than a million dollars at the poker table.
The team of 12 included Darren Elias, who holds the record for most World Poker Tour titles, and Chris “Jesus” Ferguson, who has won six World Series of Poker titles.
“Pluribus is a very hard opponent to play against. It’s really hard to pin him down on any kind of hand. It’s very tough to beat that bot,” said Chris Ferguson.
Pluribus is a very special kind of AI: it is the first AI program to beat human players at a game with more than two players.
Other AIs, such as DeepMind’s Go-playing bots, have shown themselves to be unbeatable in two-player zero-sum matches. The game of Go has more possible board combinations than there are atoms in the observable universe, making it a huge challenge for an AI to map out what move to make next.
But in Go all the information is visible to both players, and the game has only two possible outcomes for a player: win or lose. This makes it easier, in some senses, to train an AI on.
In poker, not only is the information needed to win hidden from players (making it what’s known as an “imperfect-information game”), the game also involves multiple players and complex victory outcomes.
By solving multiplayer poker, Pluribus lays the foundation for future AIs to tackle complex problems of this sort, says Noam Brown. He thinks this success is a step toward applications such as automated negotiations, better fraud detection, and self-driving cars.
Pluribus is an updated version of the team's previous model, Libratus, which beat human professionals at two-player Texas Hold ‘em in 2017.
After Libratus’s success, the team decided to increase the number of opponents to five, greatly increasing the complexity the system had to handle.
To tackle six-player poker, Brown and Sandholm radically overhauled Libratus’s search algorithm. Most game-playing AIs search forward through decision trees for the best move to make in a given situation. Libratus searched all the way to the end of a game before choosing an action.
But the complexity introduced by extra players makes this tactic impractical. Poker requires reasoning with hidden information: players must work out a strategy by considering what cards their opponents might have and what opponents might guess about their hand based on the previous betting.
More players also make choosing an action at any given moment more difficult, because it involves assessing a larger number of possibilities. The key breakthrough was developing a method that allows Pluribus to make good choices after looking ahead only a few moves rather than to the end of the game.
Pluribus teaches itself from scratch using a form of reinforcement learning similar to that used by DeepMind’s Go AI, AlphaZero. It starts off playing poker randomly and improves as it works out which actions win more money.
After each hand, it looks back at how it played and checks whether it would have made more money with different actions, such as raising rather than sticking to a bet. If the alternatives lead to better outcomes, it will be more likely to choose them in the future.
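This look-back-and-adjust loop is the idea behind regret matching, the core of the counterfactual regret minimization family of algorithms on which Pluribus's self-play training is based. The sketch below is a deliberately simplified illustration of that one idea, not Pluribus's actual training code; the action set and payoff numbers are invented for demonstration.

```python
from collections import defaultdict

# Toy regret matching: after each hand, compare the payoff of each possible
# action with the expected payoff of the current strategy, accumulate the
# difference as "regret", and shift future probability toward actions with
# positive accumulated regret. Payoffs here are invented for illustration.

ACTIONS = ["fold", "call", "raise"]
regret = defaultdict(float)

def strategy():
    """Action probabilities proportional to positive regret (uniform if none)."""
    pos = {a: max(regret[a], 0.0) for a in ACTIONS}
    total = sum(pos.values())
    if total == 0:
        return {a: 1.0 / len(ACTIONS) for a in ACTIONS}
    return {a: pos[a] / total for a in ACTIONS}

def update(payoffs):
    """payoffs: hypothetical value each action would have earned this hand."""
    current = strategy()
    expected = sum(current[a] * payoffs[a] for a in ACTIONS)
    for a in ACTIONS:
        regret[a] += payoffs[a] - expected

# Simulate many hands in which raising happens to be the best action.
for _ in range(1000):
    update({"fold": 0.0, "call": 1.0, "raise": 2.0})

print(strategy())  # probability mass concentrates on "raise"
```

In the real system this update happens over trillions of self-play hands and across enormous numbers of game states, but the direction of the adjustment is the same: actions that would have won more money become more likely.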
By playing trillions of hands of poker against itself, Pluribus created a basic strategy that it draws on in matches. At each decision point, it compares the state of the game with its blueprint and searches a few moves ahead to see how the action played out.
It then decides whether it can improve on it. And because it taught itself to play without human input, the AI settled on a few strategies that human players tend not to use.
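The combination described above, a precomputed blueprint plus a shallow lookahead, can be sketched as a depth-limited tree search that scores unexplored positions with a blueprint estimate instead of playing the hand to the end. Everything below (the tiny game tree, the payoffs, the stubbed blueprint value) is an invented placeholder for illustration, not Pluribus's real search.

```python
# Minimal sketch of depth-limited lookahead: expand only `depth` moves,
# then score the frontier with a value from a precomputed "blueprint"
# strategy rather than searching to the end of the game. For simplicity
# this treats the search as single-agent (plain max), ignoring how real
# poker search must reason about opponents' hidden cards and responses.

# Each node maps an action to either a child node (dict) or a terminal payoff.
TREE = {
    "raise": {"call": 3.0, "fold": 1.0},
    "call":  {"raise": {"call": -2.0, "fold": 2.0}, "check": 0.5},
    "fold":  -1.0,
}

def blueprint_value(node):
    """Rough value of a non-terminal state, learned offline (stubbed here)."""
    return 0.0

def best_value(node, depth):
    """Value of the best action, looking at most `depth` moves ahead."""
    if not isinstance(node, dict):   # terminal payoff: the hand is over
        return node
    if depth == 0:                   # frontier reached: fall back on blueprint
        return blueprint_value(node)
    return max(best_value(child, depth - 1) for child in node.values())

print(best_value(TREE, depth=2))  # prints 3.0 (raise, then opponent calls)
```

The point of the cutoff is cost: the search never pays for the full tree below the frontier, which is what makes looking ahead affordable when extra players blow up the number of possibilities.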
Brown said just 20 hours of training was needed to bring the AI up to the level of a world-beating poker professional.
During the game, Pluribus was remarkably good at bluffing its opponents, with the pros who played against it praising its relentless consistency and the way it squeezed profits out of relatively thin hands.
Pluribus adopted some surprising strategies, including “donk betting,” or ending one round with a call but then starting the next round with a bet.
It was predictably unpredictable, a fantastic quality in a poker player. And it did it just by playing cards; there’s no element of machine vision or facial recognition incorporated into Pluribus to spot tells.
“It is mind-blowing,” says Tuomas Sandholm, a professor at Carnegie Mellon who helped develop Pluribus. “I didn’t think we were anywhere close. It was only about a year ago that I started to believe.”
Another remarkable feature of this poker-playing AI is its cost and efficiency. Pluribus needs only about $150 worth of cloud computing resources to work, and it runs on just two central processing units (CPUs). By comparison, DeepMind’s original Go bot used nearly 2,000 CPUs, and Pluribus’s predecessor, Libratus, used 100.
As part of its announcement of the new technology, Facebook quoted several human poker champions who had been invited to play against the AI.
“With six players, it is so much more complicated that you can’t search to the end of the game,” says Brown, who is now part of the Facebook Artificial Intelligence Research (FAIR) group. “The algorithm is the key.”