National Post

AI godfathers have given Canada a lead they don’t want to lose

- JAMES MCLEOD

The centrepiece of Canada’s innovation strategy is the $950-million “supercluster” initiative. The goal, according to the federal government, is for companies of all sizes, academia and the non-profit sector to collaborate on new technologies, to spur economic growth and create jobs. As part of the Innovation Nation series, the Financial Post is taking an in-depth look at each of the five regional projects and will provide continuing coverage of their progress.

Not too long ago, neural nets were deeply uncool.

Researchers who believed in the usefulness of such computer programming were “outcasts in their own departments” at universities, Geoffrey Hinton recalls, treated like misguided eccentrics at best, and outright heretics at worst.

Hinton was one of those heretics for decades, a computer science professor at the University of Toronto playing around with neural networks. His eyes light up and he leans forward as he tells stories about the old days, when breakthrough research papers would be rejected from scientific conferences because their contents were deemed too radical.

“People like me and Yann and Yoshua thought this is just going to blow everything away. And when we were uncautious, we said so,” he said. “It was heresy.”

Yann is Yann LeCun, now Facebook Inc.’s head of AI research and a professor at New York University, while Yoshua Bengio is a professor at the University of Montreal, director of the Mila-Quebec Artificial Intelligence Institute and founder of Element AI, which raised $135 million in venture capital in 2017. Its next investment round could lead to unicorn status — a valuation of more than $1 billion — Bloomberg reported last year.

Hinton, meanwhile, is still a professor (albeit emeritus) at the University of Toronto, but at 71 years old, he works at Google LLC’s Toronto office as part of the AI team. At the end of 2018, he was appointed to the Order of Canada.

Though Bengio’s research receives 131 academic citations every day, according to Google Scholar data — Hinton isn’t far behind at 127 daily citations and LeCun gets 62 — it’s Hinton who is considered something of an unlikely rock star in the world of artificial intelligence: a British septuagenarian with a wry sense of humour and a back problem that means he absolutely never sits down.

All three of them are often collectively referred to as the “godfathers” of modern artificial intelligence, and the fact that two of them — Hinton and Bengio — are Canadian puts the country on the leading edge of this technology, with both Toronto and Montreal considered globally significant centres driving billions of dollars in research and engineering investment.

The two Canadians arguably represent the biggest innovation home runs in a generation or more, but they play in a field that many don’t understand, and some of those who do are questioning whether the lead they helped create is slipping away.

One of the problems in understanding artificial intelligence is that marketing departments have co-opted the term, diluting it to the point of being nearly meaningless.

The real juice in modern AI is deep learning, which is a computer programming technique that relies on artificial neural networks modelled to mimic the synapses and neurons in a human brain. The basic idea is that a neural network has an input layer that takes in data and an output layer that delivers an answer. In between, there are many hidden layers, which is why it’s called deep learning.
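For readers who want to see the shape of the idea, here is a minimal sketch in Python; the layer sizes and numbers are invented purely for illustration, and this tiny network only passes data forward without learning anything yet.

import numpy as np

# A toy version of the layered structure described above, with made-up
# sizes: 4 input neurons, two hidden layers of 8, and 2 output neurons.
rng = np.random.default_rng(0)

def connections(n_in, n_out):
    # Every neuron in one layer is linked to every neuron in the next;
    # each link has a strength (a "weight"), here initialized at random.
    return rng.normal(scale=0.5, size=(n_in, n_out))

weights = [connections(4, 8), connections(8, 8), connections(8, 2)]

def forward(inputs):
    # Data flows from the input layer through the hidden layers to the
    # output; how strongly each neuron "fires" depends on the weights.
    activity = inputs
    for w in weights:
        activity = np.tanh(activity @ w)
    return activity  # two numbers, e.g. scores for "tree" and "no tree"

print(forward(np.array([0.2, 0.7, 0.1, 0.9])))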

Say you’re trying to teach a neural network to recognize images of trees and you’ve got a million photos, some of which have trees and some don’t.

Each neuron on the input layer might receive data about one pixel from the photo, which means the input layer could have millions of neurons. The output layer would only have two neurons, labelled “tree” and “no tree.”

In between, there is a whole bunch of hidden layers with a web of connections linking up neurons from one layer to the next, and — this is the really critical part — how strong each connection is determines whether the message is passed from one layer to the next.

As a picture goes into the system, the neurons start firing and passing their excitement along through the network until either the “tree” or “no tree” neuron gets excited at the output layer.

At first, the answers from the neural network will just be random guesses, but if the connections that lead to the correct answer are strengthened, then the neural network will over time get really good at guessing whether there’s a tree in the photo.

For example, to find a tree, you might look for a green shape above a brown shape, a weak rule that is good unless you’re looking at a tree in autumn or a birch tree, which has a mostly white trunk. People understand the physical world by learning such rules, exceptions and patterns, and a neural network can, too, by repeating the process thousands or millions of times.
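That guess-and-adjust loop can be shown with a toy example. The sketch below trains a single layer of connections with the classic perceptron update rule, a simple ancestor of modern deep learning; the “photos” are random pixel vectors and the “tree” labels come from an invented stand-in rule, not real data.

import numpy as np

# Toy training loop: strengthen connections that lead to the right
# answer, weaken those that lead to the wrong one (perceptron rule).
rng = np.random.default_rng(1)

X = rng.random((500, 10))                  # 500 fake "photos" of 10 pixels each
y = (X.mean(axis=1) > 0.5).astype(float)   # invented stand-in for "tree"/"no tree"

w = np.zeros(10)   # connection strengths, all zero to start: pure guessing
b = 0.0
lr = 0.1           # how much to adjust after each wrong guess

for epoch in range(20):                    # show the same photos many times
    for pixels, label in zip(X, y):
        guess = 1.0 if pixels @ w + b > 0 else 0.0
        error = label - guess              # zero when the guess was right
        w += lr * error * pixels           # nudge each connection by its
        b += lr * error                    # contribution to the mistake

correct = [(1.0 if p @ w + b > 0 else 0.0) == t for p, t in zip(X, y)]
print(f"accuracy after training: {np.mean(correct):.0%}")

Deep networks replace this single-layer rule with adjustments propagated back through many layers, but the rhythm is the same: guess, compare, adjust.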

Of course, this is a gross simplification of the process. Real deep learning systems build on this structure by using complex mathematics and structural techniques such as “convolutional” neural networks or “generative adversarial networks” to produce results.

But at its core, deep learning is all about neural networks getting smarter by making guesses and learning from their successes and failures.

This is why there’s so much excitement about deep learning. Neural networks could be used for any task you can structure so that a computer receives information — a picture, video, search query, job application, financial report — makes a judgment on what to do, and then receives feedback about whether it came up with the right answer.

These basic concepts behind neural networks are not new. One of the key papers Hinton co-authored on the topic was published in 1986.

“What really changed was the amount of computation and the amount of data, and there was no real way of knowing back then,” he said.

“It really hinged on having a lot of data to find all these weak rules, and having a lot of processing power to learn them all. And we didn’t know how much data and how much processing power, and we were off by a factor of about a million, so of course it didn’t work very well.”

Despite flashes of promise, neural networks were widely considered to be a dead end for most of the 1980s, 1990s and even well into the 2000s.

“The neural network community consisted of, like, I don’t know, Geoff, Yann, Yoshua and a handful of students and a handful of postdocs,” said Roland Memisevic, who studied under Hinton, was a professor at the University of Montreal alongside Bengio, and is now the founder and chief executive of TwentyBN, a startup trying to bring deep learning to video-based chatbots.

In interviews, Hinton, Bengio, LeCun and several other people involved at the time all mentioned Canadian Institute for Advanced Research (CIFAR) grants as a key reason why neural network expertise stayed in Canada during the decades when nearly nobody believed in the technology.

“It enabled Yoshua and Yann to do their thing, despite the headwind they were getting. This clearly adds to the Canadianness of AI,” Memisevic said. “There was this agency that just gave them the funds they needed so they could do their weird stuff that nobody else believed in.”

LeCun fondly remembers little gatherings where a small clutch of researchers could bounce ideas off each other and refine their thinking.

“We needed a safe space, and that safe space was provided by Canada,” he said. “Canadian universities were smart enough to actually hire Geoff and Yoshua and basically trust them to do the right thing, even though they were working on things that were not very popular at the time. This is unique.”

Depending on who you ask, the breakthroughs for neural networks came sometime between about 2007 and 2012, but the world didn’t really take notice until the 2012 ImageNet competition, in which entrants built computer programs to recognize images from a data set of 15 million pictures spanning 22,000 different categories of objects.

A neural network designed by one of Hinton’s students blew the competitio­n out of the water, achieving a 15.3 per cent error rate — the second-best entry was 26.2 per cent.

The message sent to the computer science world was crystal clear: Neural networks actually work.

Money has since flooded into research and commercialization, with Hinton, Bengio and LeCun being treated as visionary geniuses.

The biggest companies in the world — Amazon.com Inc., Apple Inc., Facebook, Google, Microsoft Corp., Samsung and Uber Technologies Inc. — have enthusiastically embraced neural networks, and a whole generation of startups has started applying the technology to everything from medicine to self-driving cars.

So far, deep learning has been extremely good at things such as image recognition and natural language processing — speech recognition and translation — areas where there is a lot of data to work with, but it’s unstructured data in the form of pictures or text, instead of numbers in a database.

What makes deep learning so radically different from previous computer programming is that it doesn’t rely on firm rules and rigid structures. Nobody teaches a neural network how to make decisions. You just tell it whether it’s guessed right or wrong, again and again, and it gets smarter.

Neural networks are already everywhere — in phones, video games and many of the internet services used every day — though deep learning believers say we’re really only scratching the surface.

But despite Canada’s early edge in AI, it’s entirely possible the country will get left in the dust as the gold rush to commercialize goes global.

For example, access to AI talent was reportedly one of the key factors in Amazon’s decision on where to locate its HQ2. Ultimately, the company chose Virginia, though Toronto was among the 20 finalists.

Moreover, China’s rapacious entrepreneurial spirit, combined with access to massive troves of data from the surveillance state and an unmatched enthusiasm for AI, could turn the Middle Kingdom into the dominant player, argues Kai-Fu Lee in AI Superpowers, his highly influential book.

“We are only in the game because of the great bench strength of our researchers, but we are just hanging on by our fingernails because we are so far behind in basically every area of commercialization,” said Ajay Agrawal, University of Toronto economics professor and co-author of Prediction Machines, a widely read book on the economic implications of AI.

Undaunted, Bengio is trying to maintain Montreal’s position as an AI power, partly by using his own celebrity in the field.

Unlike Hinton, who is most excited to talk about the history and the concepts underpinning deep learning, Bengio loves to talk about policy implications.

Bengio said AI is going to create enormous value and transform the world, but managing that transition, even with retraining and education, will be difficult and expensive.

“AI and automation are going to potentially create misery in people who are going to lose their jobs. There is going to be a fast transition and there is going to be a social cost, and who’s going to pay for this?” he said.

“So how do we make sure that the Canadian government gets that wealth? If all of the growth happens from companies that are headquartered in Silicon Valley or Beijing, well, I’m sorry, we’re just going to be buying those products and not getting any of that wealth.”

At Mila, the Montreal-based AI institute, Bengio is using his star power to drive industry partnerships with giants such as Facebook, Google and Samsung, to name a few.

They want access to AI talent, and talent surrounds Bengio, because he’s one of the AI godfathers.

The federal government also situated the AI-Powered Supply Chain Supercluster in Montreal as part of its $950-million innovation supercluster strategy.

“The big AI boom in Montreal is due to Yoshua. Period. Montreal is positioning itself as the leader in AI and so on, and Toronto is doing that in its own way,” Memisevic said.

Toronto has the AI-focused Vector Institute and Alberta has a notable AI community, too, but Montreal arguably has the stronger foundation, and the research generated there is closely watched by the industry in the rest of the world.

“It took a little longer in Toronto, because Geoff himself is not as much of an organizer as Yoshua is,” LeCun said.

Both Hinton and Bengio are a little bashful about their celebrity status in the deep learning community, but both say Canada can benefit from it, partly by drawing in the next generation of academics to carry deep learning research forward.

“If you look at the main impact I’ve had, it’s been by my students,” Hinton said.

“I believed in this stuff strongly and long before most other academics. I got the best students. I got the students who had the good intuition to say this is where the future is.

“I got the best ones, and they did amazing work … We’ve still got some very, very good people here.”

NATIONAL POST PHOTO ILLUSTRATION: Geoffrey Hinton, left, Yoshua Bengio, centre, and Yann LeCun are often collectively referred to as the “godfathers” of modern artificial intelligence.
