PC Pro

The poker supercomputer that could be destined for greater things.

- BARRY COLLINS

In the good old days at PC Pro, we used to have after-work poker tournaments in the office. A crate of beers, enough salted snacks to grit half of London during a cold snap, and a table of very mixed abilities. With all due disrespect to colleagues past and present, I hated those games. Why did I loathe these evening sessions of Texas Hold’em? Because I’m spectacularly fond of winning, and these games were impossible to read. Half the table had no idea how to play the game, no sense of when to call or fold, no concept of strategy. I recall bombing out of a tournament early, when my pair of queens was beaten by someone who had gone all-in on a four and a two – one of the weakest hands possible. It wasn’t a bluff, they’d just got bored and shoved their chips in, fluking out when the final five community cards delivered them a straight (2-3-4-5-6) that beat my three queens.

Now, it turns out my erstwhile colleague was actually something of a poker genius. Recently, four of the world’s best poker players got their proverbial backsides handed to them by a computer, Libratus, which comprehensively thrashed them over a mammoth 20-day tournament. How did the supercomputer take the shirt off their backs? By becoming utterly unpredictable.

Libratus didn’t learn poker the usual way – by watching other people play and picking up the mores and strategy as it goes along. Nor did it take the normal supercomputer approach of studying moves from millions of previously played games and working out the best possible play in every scenario. Instead, its masters from Carnegie Mellon University taught the computer the rules of Texas Hold’em and then let it work out how to play the game by itself.

Using an algorithm called “counterfactual regret minimisation” (which, by the way, is the title of my forthcoming autobiography), Libratus taught itself by first playing hands at random, refining its skills by playing trillions of hands against itself. It wasn’t influenced by how humans play the game – it devised its own strategy. “We give the AI a description of the game. We don’t tell it how to play,” Noam Brown, one of the students who built Libratus, told Wired. “It develops a strategy completely independently from human play, and it can be very different from the way humans play the game.”
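
For the curious, here is a toy sketch of "regret matching", the simple idea that counterfactual regret minimisation builds on. To be clear, this is not Libratus' code and it is nowhere near poker: it assumes two players of rock-paper-scissors who know only the payoffs and keep score of how much they regret not having chosen each move. The names `PAYOFF`, `strategy_from_regrets` and `self_play` are made up for illustration, as is the starting bias towards rock, and it works from expected payoffs rather than dealing random hands so that the result is repeatable.

```python
# Toy regret matching in self-play on rock-paper-scissors.
# Illustrative only: a sketch of the idea behind CFR, not Libratus itself.

ACTIONS = ("rock", "paper", "scissors")

# PAYOFF[a][b] is the payoff to a player choosing action a against action b.
PAYOFF = (
    (0, -1, 1),   # rock     vs rock, paper, scissors
    (1, 0, -1),   # paper
    (-1, 1, 0),   # scissors
)


def strategy_from_regrets(regrets):
    """Play each action in proportion to its positive accumulated regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total == 0:
        # No regrets yet: fall back to choosing uniformly at random.
        return [1.0 / len(regrets)] * len(regrets)
    return [p / total for p in positive]


def self_play(iterations=10_000):
    """Return each player's average strategy after repeated self-play."""
    # Hypothetical starting bias: player 0 begins with a fondness for rock.
    regrets = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    strategy_sums = [[0.0] * 3, [0.0] * 3]

    for _ in range(iterations):
        # Both players fix their strategies before either one updates.
        strategies = [strategy_from_regrets(r) for r in regrets]
        for player in (0, 1):
            mine, theirs = strategies[player], strategies[1 - player]
            # Expected payoff of each pure action against the opponent's mix.
            action_values = [
                sum(PAYOFF[a][b] * theirs[b] for b in range(3))
                for a in range(3)
            ]
            # Expected payoff of the mix the player actually used.
            expected = sum(mine[a] * action_values[a] for a in range(3))
            for a in range(3):
                # Regret: how much better action a would have done.
                regrets[player][a] += action_values[a] - expected
                strategy_sums[player][a] += mine[a]

    return [[s / iterations for s in sums] for sums in strategy_sums]


if __name__ == "__main__":
    for player, average in enumerate(self_play()):
        print(player, dict(zip(ACTIONS, (round(p, 3) for p in average))))
```

Run it and both players' average strategies should drift towards playing each move about a third of the time, which is the unexploitable way to play rock-paper-scissors. Nobody told the program that; it fell out of the regrets, which is the point Brown is making above.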

Indeed, it seems Libratus’ strategy was to bet often and heavily, even if – like my PC Pro colleague of yesteryear – it didn’t have the cards to warrant such aggression. The pros didn’t know how to read the computer, and even if by the end of the day they had begun to spot patterns in Libratus’ betting, they were back to square one when play resumed the next morning. Each night, a meta-algorithm analysed which patterns of play the pros had identified and the top three were patched by the time they sat back round the baize again. Libratus wasn’t trying to find holes in its opponents’ strategy, or look for the “tells” that human players rely on to decide if a player is bluffing; it was making sure there were no chinks in its own armour.

In a Reddit “Ask Me Anything” (AMA) session, the poker pros seemed absolutely crestfallen. “Once you face Libratus, there’s nothing worse any human could ever do to you,” said one of the beaten players, Jason Les, who’s won over $1.5 million in tournaments over the years. “We’re seeing the bot play like a strong human player, but also putting way more pressure on us than any human can correctly.”

The pros even admitted that they would be trying to emulate Libratus’ style of play in the future. “We are definitely going to start overbetting more frequently,” said Les and fellow pro Dong Kim. But before you shoot off down the casino and throw your pension pot at a succession of weak hands, the players cautioned: “It takes a lot of studying to figure out the right way to do it, though. The moment you’re somewhat imbalanced there (bluffing too much, or bluffing too little) then you’re making a huge mistake.”

And that’s the most terrifying thing of all: Carnegie Mellon just built a machine that can out-bluff humans. Hold’em poker is not like chess or Go, the other games in which supercomputers have famously outwitted the world’s best players: the AI must make decisions without seeing all of the cards, without knowing what might happen. Carnegie Mellon has developed a branch of AI that’s adept at making decisions with partial information, and Lord knows what implications that might have for stock trading, or negotiating with terrorists, or any other scenario that requires you to call someone’s bluff.

Then it struck me: the similarity between Libratus and the 45th president of the United States. Both overcame opponents with very aggressive tactics, and both relied on an enormous degree of bluff. If the Democrats want to retake the White House in 2020, I suggest they speak to Carnegie Mellon. After all, President Libratus has a nice ring to it.

Barry Collins is a former editor of PC Pro and campaign manager for Libratus 2020. Tweet your support at @bazzacollins