Artificial intelligence (AI) programs have finally proven they “know when to hold ’em, and when to fold ’em.” AIs have long dominated games such as chess, and last year conquered Go, but they have made relatively lousy poker players. With DeepStack, researchers have broken that losing streak by combining new algorithms with deep machine learning, a form of computer science that in some ways mimics the human brain and allows machines to teach themselves.
“It’s a … scalable approach to dealing with [complex information] that could quickly make a very good decision even better than people,” said Murray Campbell, a senior researcher at IBM in Armonk, New York.
More recently, scientists showed that Libratus, the artificial intelligence that defeated four top professional poker players this year, uses a three-pronged approach to master a game with more decision points than there are atoms in the universe. In a case study published in the journal Science, researchers from Carnegie Mellon University in the US detailed how their AI achieved superhuman performance by breaking the game into computationally manageable parts and, based on its opponents’ game play, fixing potential weaknesses in its strategy during the competition.
Poker players contend with hidden information – what cards their opponents hold and whether an opponent is bluffing. In a 20-day competition involving 120,000 hands at Rivers Casino in Pittsburgh in January, Libratus became the first AI to defeat top human players at heads-up no-limit Texas Hold’em poker – the primary benchmark and long-standing challenge problem for imperfect-information game-solving by AIs. It amassed more than USD 1.8 million in chips.
According to the researchers, “The techniques in Libratus do not use expert domain knowledge or human data and are not specific to poker. Thus, they apply to a host of imperfect-information games.”
Libratus includes three main modules. The first computes an abstraction of the game that is smaller and easier to solve than the full game, which has about 10^161 decision points. From this abstraction it computes a detailed strategy for the early rounds of Texas Hold’em and a coarser strategy for the later rounds; together, these form the blueprint strategy.
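Blueprint strategies of this kind are typically computed by iterative self-play algorithms in the counterfactual regret minimization (CFR) family, whose core update is regret matching. The sketch below is a minimal illustration of that update on a toy two-action zero-sum matrix game, not poker; the payoff matrix and function names are illustrative and are not taken from Libratus.

```python
def regret_matching(cum_regret):
    """Play each action in proportion to its positive cumulative regret;
    fall back to uniform when no action has positive regret."""
    positives = [max(r, 0.0) for r in cum_regret]
    total = sum(positives)
    n = len(cum_regret)
    return [p / total for p in positives] if total > 0 else [1.0 / n] * n

def solve(payoff, iters=200000):
    """Self-play regret matching on a two-player zero-sum matrix game.
    Returns the row player's average strategy, which converges toward
    a Nash equilibrium of the matrix game."""
    n, m = len(payoff), len(payoff[0])
    row_regret, col_regret = [0.0] * n, [0.0] * m
    row_sum = [0.0] * n
    for _ in range(iters):
        s = regret_matching(row_regret)
        o = regret_matching(col_regret)
        # Expected utility of each action against the other player's mix.
        row_util = [sum(payoff[a][b] * o[b] for b in range(m)) for a in range(n)]
        col_util = [-sum(payoff[a][b] * s[a] for a in range(n)) for b in range(m)]
        row_ev = sum(s[a] * row_util[a] for a in range(n))
        col_ev = sum(o[b] * col_util[b] for b in range(m))
        for a in range(n):
            row_regret[a] += row_util[a] - row_ev
            row_sum[a] += s[a]
        for b in range(m):
            col_regret[b] += col_util[b] - col_ev
    total = sum(row_sum)
    return [x / total for x in row_sum]

# Toy game whose row-player equilibrium mixes 40% / 60%.
avg = solve([[2, -1], [-1, 1]])
```

Full-scale poker requires sampling (Monte Carlo CFR) and abstraction on top of this update, but the regret-matching core is the same idea.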
The study notes, “The techniques developed are largely domain independent and can thus be applied to other strategic imperfect-information interactions, including non-recreational applications.”
The second module constructs a new, finer-grained abstraction based on the state of play and computes a strategy for the current sub-game in real time, using the blueprint strategy for guidance so that strategies stay balanced across different sub-games – what the researchers call safe sub-game solving. The third module is designed to improve the blueprint strategy as the competition proceeds. Typically, AIs use machine learning to find mistakes in the opponent’s strategy and exploit them.
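The “safe” criterion can be illustrated with a toy check: a refined sub-game strategy is adopted only if it does not give the opponent a higher best-response value than the blueprint already conceded. The Python sketch below works over a small zero-sum payoff matrix; the matrix, values, and function names are hypothetical and much simpler than Libratus’ actual sub-game construction.

```python
def best_response_value(payoff, strategy):
    """Opponent's best-response value against a fixed row strategy in a
    zero-sum matrix sub-game (the opponent receives -payoff)."""
    n, m = len(payoff), len(payoff[0])
    return max(
        sum(-payoff[a][b] * strategy[a] for a in range(n))
        for b in range(m)
    )

def safe_refine(payoff, blueprint, candidate):
    """Adopt the candidate refinement only if it does not hand the
    opponent more value than the blueprint strategy already conceded;
    otherwise keep the blueprint."""
    if best_response_value(payoff, candidate) <= best_response_value(payoff, blueprint) + 1e-12:
        return candidate
    return blueprint

# A refinement that concedes less is adopted; an unsafe one is rejected.
toy = [[2, -1], [-1, 1]]
adopted = safe_refine(toy, [0.5, 0.5], [0.4, 0.6])   # safer -> adopted
rejected = safe_refine(toy, [0.5, 0.5], [1.0, 0.0])  # exploitable -> kept blueprint
```

The real system solves a constrained “gadget game” rather than comparing two fixed strategies, but the safety condition being enforced is the same.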
However, attempting to exploit an opponent opens the AI itself to exploitation if the opponent shifts strategy, said Tuomas Sandholm of Carnegie Mellon.
Instead, Libratus’ self-improver module analyses opponents’ bet sizes to detect potential holes in the blueprint strategy; it then adds the missing decision branches, computes strategies for them, and folds those into the blueprint. “Due to the ubiquity of hidden information in real-world strategic interactions, we believe the paradigm introduced in Libratus will be critical to the future growth and widespread application of AI,” the researchers said.
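As a rough illustration of the branch-detection idea, the sketch below tallies opponent bet sizes (as fractions of the pot) that fall far from every size in the current abstraction; sizes seen often enough off-tree become candidates for new blueprint branches. All thresholds, names, and numbers here are hypothetical, not Libratus’ actual parameters.

```python
from collections import Counter

def find_missing_branches(observed_bets, abstraction_bets, min_count=10, tol=0.05):
    """Count opponent bet sizes (pot fractions) that are far from every
    bet size in the current abstraction; return the frequent off-tree
    sizes as candidates for new decision branches."""
    off_tree = Counter()
    for bet in observed_bets:
        if all(abs(bet - a) > tol for a in abstraction_bets):
            off_tree[round(bet, 2)] += 1
    return [size for size, count in off_tree.items() if count >= min_count]

# The abstraction knows half-pot, pot, and 2x-pot bets; a 0.75-pot bet
# that the opponent uses repeatedly is flagged as a missing branch.
candidates = find_missing_branches(
    [0.75] * 12 + [0.5] * 5,
    abstraction_bets=[0.5, 1.0, 2.0],
)
```

In the described system, each flagged size would then get its own strategy computed (as for the original branches) before being merged into the blueprint.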