THE RISE OF THE LEARNING MACHINES
Machine learning is much more than a set of clever algorithms; it’s changing everything
Machine learning is absolutely everywhere now; it’s the computing zeitgeist. Digital assistants use it to translate speech. Facebook uses it to automatically tag images (DeepFace has a claimed accuracy of 97.25 percent). Uber uses it to calculate arrival times and pickup locations. Amazon, Spotify, and Netflix use it to select the recommendations they serve to you. Google introduced RankBrain to its search engine in 2015 to untangle query semantics; Gmail has its Smart Reply service; and last year, Google Maps started using machine learning to extract street names and house numbers from images.
These are just the tip of the iceberg. Behind the scenes, big business has taken machine learning to heart. Financial institutions use it to track market trends, calculate risks, and delve into the streams of financial data. The race to the self-driving car is being powered by machine learning. Any organization that is trying to see patterns in the untidy mass of data humans create can make some use of machine learning, from the FBI to medical research and insurance. Currently, the five biggest companies by market value in the world are Apple, Amazon, Alphabet (Google), Microsoft, and Facebook. Every one of these has made massive investments in machine learning, and has access to the kind of big data its effective use requires.
One thing to remember is that machine learning and artificial intelligence are two different things. Artificial intelligence is a device, program, or system that appears “smart,” one that can act and react in a quasi-human way. How this is achieved is a separate thing. Machine learning is a method that AI can employ to appear smart but, more importantly, to learn as it operates. It is this ability to adapt that is causing the revolution.
PROGRAMING VERSUS LEARNING
A computer program is a set of instructions, all neat and logical. Given an input, you can trace the data through the program’s algorithms, and accurately predict the output. This is the traditional way that computers work: inflexible, and quite unlike the real world. Each task requires the program to be written to deal with that specific task.
Machine learning is a way to create a program that apes the human brain, rather than a calculating engine. Instead of trying to program for every eventuality, every task, every type of data, you create a machine that thinks more like a human. Enter the concept of the neural network: a system designed to mimic the fluid way the human brain creates and changes its internal connections as it learns.
It all sounds wonderfully modern and fresh, only it isn’t. Like many programing concepts, it’s actually not that new. The practical capabilities of machines have long lagged behind ideas. The first recognizable computer algorithms were written around 1837, at a time when hardware was an unfinished mass of cogs and gears.
Machine learning isn’t quite that old; the idea is generally credited to Arthur Samuel, who coined the term “machine learning” in 1959, and helped write a checkers-playing program that learned from its mistakes. Its roots are deeper still, though. Back in 1943, neurophysiologist Warren McCulloch and mathematician Walter Pitts described how a neural network operates, and modeled one using electrical circuits. In 1949, the psychologist Donald Hebb gave us Hebb’s Law: “Neurons that fire together, wire together,” fundamental to the way internal connections are reinforced by learning.
Progress was slow, mainly due to the technical limitations of available hardware, and the lack of key algorithms, such as backpropagation (used to calculate weights, more later). Despite decades of research and theory, it took until the 1990s before we had widespread usable machine learning programs, and it took until the 2010s before large neural networks were feasible.
A neural network consists of a set of input and output units. Between these is a grid of artificial neurons, called nodes. These are arranged into layers, the output from one layer feeding into the next. Connections are assigned a weight, a level of influence, which changes as the network learns. Each node is fed data, performs an operation, and sends the result on, according to the weight.
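To make the layer-by-layer flow concrete, here is a minimal sketch of data passing through a tiny network. The sigmoid squashing function, the bias terms, and every number in the hypothetical `layers` list are assumptions chosen for illustration; real networks pick these differently and learn the weights rather than having them typed in.

```python
import math

def sigmoid(x):
    """Squash any number into the range 0..1 (an assumed choice of node operation)."""
    return 1.0 / (1.0 + math.exp(-x))

def layer_forward(inputs, weights, biases):
    """One layer: each node sums its weighted inputs, adds a bias, applies sigmoid."""
    return [
        sigmoid(sum(w * x for w, x in zip(node_weights, inputs)) + bias)
        for node_weights, bias in zip(weights, biases)
    ]

def network_forward(inputs, layers):
    """Feed the output of each layer into the next, as described in the text."""
    activations = inputs
    for weights, biases in layers:
        activations = layer_forward(activations, weights, biases)
    return activations

# Hypothetical network: 2 inputs -> 2 hidden nodes -> 1 output node.
# Each inner list holds the connection weights feeding one node.
layers = [
    ([[0.5, -0.4], [0.3, 0.8]], [0.0, -0.1]),  # hidden layer
    ([[1.2, -0.7]], [0.05]),                    # output layer
]

output = network_forward([1.0, 0.0], layers)
print(output)
```

Learning, in this picture, is nothing more than nudging the numbers inside `layers` until the output is what you want.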
A neural network needs teaching (OK, not all of them, more later). A common use is pattern recognition in images. So, for example, if you feed your network an image, labeled as “contains cat,” the network knows the desired output, so the configuration of the node weighting is simple. Then you feed in more images: black cats, ginger cats, running cats, sleeping cats, partial images, and images without cats. Each time, the connection weighting needs to be adjusted, so that the correct output is maintained, not only for the current image, but for all previous ones, too. As the network learns, the weight of each connection is established between the nodes.
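The adjust-and-repeat loop described above can be sketched with a single artificial neuron and the classic perceptron learning rule. This is a deliberately simpler stand-in for the backpropagation used in real networks. The two input features (`pointed_ears`, `whiskers`) and the four labeled examples are hypothetical; a real system works from raw pixels.

```python
def predict(weights, bias, features):
    """Return 1 ("contains cat") if the weighted sum is positive, else 0."""
    total = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if total > 0 else 0

def train(examples, epochs=20, rate=0.5):
    """Perceptron rule: after each wrong answer, nudge the weights toward the label."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):  # repeated passes keep earlier examples correct too
        for features, label in examples:
            error = label - predict(weights, bias, features)
            weights = [w + rate * error * x for w, x in zip(weights, features)]
            bias += rate * error
    return weights, bias

# Hypothetical labeled data: ([pointed_ears, whiskers], contains_cat)
examples = [
    ([1, 1], 1),
    ([1, 0], 0),
    ([0, 1], 0),
    ([0, 0], 0),
]

weights, bias = train(examples)
for features, label in examples:
    print(features, "labeled", label, "predicted", predict(weights, bias, features))
```

Note that the loop revisits every example on every pass; that is how the correct output is kept for previous inputs as well as the current one.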
A system of pattern recognition now emerges within the network—it will learn the outline of a cat, the position of its eyes and ears, and so forth. Eventually, you will be able to feed in an image without revealing the correct output, and the system will correctly see, or not see, a cat. This is the classification model; it attempts to predict one or more fixed outcomes. The regression model is similar, but the output is a continuous variable, say a dollar value, or floating point number.
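For contrast with a yes-or-no classifier, here is about the simplest regression model there is: a straight line fitted by least squares, with a continuous dollar value as its output. It is not a neural network, and the floor areas and prices are invented for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical training data: floor area in square meters -> price in dollars
areas = [50, 80, 100, 120]
prices = [70_000, 100_000, 120_000, 140_000]

slope, intercept = fit_line(areas, prices)

def predict_price(area):
    """The output is a continuous number, not one of a fixed set of labels."""
    return slope * area + intercept

print(predict_price(90))
```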
The magic of a neural network is that the internal process is hidden. You may have created it, but once it is trained to a task, you can’t readily tell what the weights at each node represent. It may be using the shape of the cat’s eyes to identify cats, or it may not; you don’t really know exactly how it achieves the results it does, which is a long way from the fixed logic of a traditional program.
There is an obvious drawback: It is fallible. There is a level of accuracy, and there is never the guarantee of 100 percent; an unusual concept for a computer program, where output errors are seen as bugs, errors in the code logic that can be corrected. A neural network, in becoming more human, has gained our ability to err, too. This isn’t too much of an issue when we are tagging pictures of cats for Facebook, but is something to bear in mind when designing a system for a self-driving car.
There are other issues, too. It’s sometimes difficult to work out why one result outranks another, as so much of it is hidden in the internal weighting. This can also make it difficult to fine-tune. There has reportedly been much internal discussion at Google over the relative merits of machine learning over its rivals for ranking search results and advert targeting—it doesn’t lend itself to a quick tweak when you want to push one result over another.
A simple neural network can run on a handful of nodes. Detailed work requires something a little larger. Facebook’s DeepFace runs on nine layers and has 120 million connection weights. For contrast, the human brain is widely quoted as having 100 billion neurons, although a recent, and rather grisly, experiment involving sampling a human brain soup revealed a figure of 86 billion. Each is connected to somewhere between 1,000 and 10,000 other neurons (nobody is really sure; 7,000 is often cited as a fair guess). This puts the number of connections into the trillions, more than there are stars in the Milky Way. Wow. The largest artificial systems built so far touch one billion connections, and these have been short-lived research projects. We still have a long way to go.
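The arithmetic behind “into the trillions” is easy to check. This quick sketch uses only the figures quoted above and treats each neuron’s links as separate connections, which is a simplifying assumption.

```python
NEURONS = 86_000_000_000               # the 86 billion figure quoted above
LINKS_PER_NEURON = {"low": 1_000, "typical": 7_000, "high": 10_000}

DEEPFACE_WEIGHTS = 120_000_000         # DeepFace's connection weights
LARGEST_ARTIFICIAL = 1_000_000_000     # "touch one billion connections"

# Total connections for each estimate of links per neuron
brain_connections = {
    name: NEURONS * links for name, links in LINKS_PER_NEURON.items()
}

for name, total in brain_connections.items():
    print(f"{name:>8}: {total:,} connections")

typical = brain_connections["typical"]
print(f"Brain vs. DeepFace: about {typical // DEEPFACE_WEIGHTS:,} times larger")
print(f"Brain vs. largest artificial system: {typical // LARGEST_ARTIFICIAL:,} times larger")
```

Even the low estimate works out to 86 trillion connections; the often-cited 7,000 figure gives 602 trillion.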
As you might have guessed, we’ve stuck to the basics here. Machine learning is not all neural networks, for starters; support vector machines are another popular method, which are trained in a similar way, but use a different mathematical model internally. These are simpler, don’t require huge amounts of computational power or big data sets, and the internal workings are more open to examination. But they don’t have the power or scale of a neural network.
Machine learning is a subject that becomes complicated very quickly as you delve deeper. We can’t even begin to list the basic machine learning methodologies; there are over 50. These use a whole slew of statistical analysis tools, decision tree algorithms, dimension reduction, regression analysis, and heaps more. This is high-grade math.
As well as the supervised learning outlined, there are semi-supervised systems, which use a minimum of labeled data, and fully unsupervised systems. These work with no labels at all; you just feed in raw data, and let the algorithms go to work. From these emerge patterns of clusters and associations that might not be obvious any other way.
You can only train a system if you know the output criteria you’re looking for. If you don’t know you are looking for cats, you can’t train a system to find them. Unsupervised systems are useful for creating data labels, which can then be fed back into supervised systems; for example, finding a cluster of images that appear to contain the same object. They are also good at finding anomalies in data, ideal for security systems looking for signs of fraud or hacking when you have no idea where or how the attempts are going to be made.
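As a rough illustration of label-free anomaly hunting, the sketch below flags values that sit far from the rest of the data, using nothing more than the mean and standard deviation. Real fraud detection is far more sophisticated; the transaction amounts and the 2.5 threshold are assumptions picked for the example.

```python
from statistics import mean, pstdev

def find_anomalies(values, threshold=2.5):
    """Flag values more than `threshold` standard deviations from the mean.

    No labels are involved: the data alone defines what "normal" looks like.
    """
    centre = mean(values)
    spread = pstdev(values)
    if spread == 0:
        return []  # every value is identical, so nothing stands out
    return [v for v in values if abs(v - centre) / spread > threshold]

# Hypothetical card transactions in dollars; nobody has marked any as fraud
amounts = [20, 22, 19, 21, 23, 20, 18, 22, 21, 19, 500]

flagged = find_anomalies(amounts)
print(flagged)
```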
Another much-used machine learning buzzword is deep learning, essentially just used to describe large and multi-layered neural networks. In image recognition systems, for example, layers may be used to divide images into areas or blocks that may be objects; the next layer may try to define edges; and further layers identify specific shapes, ending with a trainable output. The more layers, the greater the sophistication, as the input is broken down into an increasingly abstract representation of the data. Simple neural networks may only have a few layers; a deep learning system can run to three figures. They scale well, but do require significant resources.
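To give a feel for what an edge-finding layer computes, here is a sketch that slides a small filter across a tiny image (the cross-correlation that convolutional layers perform). The hand-written filter is a classic Prewitt-style vertical-edge detector, used here as an assumption; in a trained network, the filter values are learned rather than typed in.

```python
def apply_filter(image, kernel):
    """Slide the kernel over the image and sum the element-wise products."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    result = []
    for r in range(len(image) - k_rows + 1):
        row = []
        for c in range(len(image[0]) - k_cols + 1):
            row.append(sum(
                kernel[i][j] * image[r + i][c + j]
                for i in range(k_rows)
                for j in range(k_cols)
            ))
        result.append(row)
    return result

# Hypothetical 4x6 grayscale image: dark on the left, bright on the right
image = [[0, 0, 0, 1, 1, 1] for _ in range(4)]

# Responds where brightness increases from left to right
vertical_edge = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

edges = apply_filter(image, vertical_edge)
for row in edges:
    print(row)
```

The output is zero over the flat regions and non-zero only where the dark-to-bright boundary falls, which is the kind of intermediate representation a later layer can build shapes from.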
What machine learning needs to thrive is access to a lot of data. This has now been provided, thanks to the Internet, by us. We’ve typed in untold search requests, emails, and blogs, uploaded millions upon millions of images and videos, created purchase histories, travel histories, we’ve shared things we like, what we’ve seen, heard, and read, and more. This is big data, a set large enough to reveal underlying patterns, associations, and behaviors. We’ve been feeding data into the Internet for years now, and an awful lot of that is sitting in data farms ripe for processing.
The other thing it requires is processing power. The GPU proved to be just what the systems needed for the simple but repetitive operations, and now we have dedicated hardware from Google and IBM, with Intel and others to follow (see box below). Wireless Internet is in our homes, and the hardware required to connect devices is cheap and plentiful. Add these factors together, and we have the perfect storm for an explosion in machine learning.
We have reached the point where it has become easy and relatively cheap to add machine learning voice control or gesture recognition to something as mundane as your television. Modern sets are already connected to your wireless hub; it’s a simple matter to route your voice command or images to a server running one of the popular machine learning frameworks; there are dozens, including Google’s TensorFlow and Amazon’s Machine Learning. Here, a neural network quickly translates your command and routes it back to your television. And as if by magic, you can instruct your television to turn down the volume by voice or movement, rather than be bothered to press a button. Ten years ago, this would have been difficult and impressive; now, it’s nothing unusual.
We’ve become acclimatized to virtual or digital assistants quickly, too. These have only been on the general market since 2011, the first being Apple’s Siri on the iPhone 4S. Last year, the market for smart speakers grew by over 100 percent, and they were Amazon’s best-selling item. By the end of this year, it’s estimated that there will be over 100 million in people’s homes, and by the end of 2020, there will be over 225 million, all listening for the magic words.
These digital assistants won’t stay in our homes; there are plans to bring them into your car, and your office, as well as build them into other devices—a refrigerator you can talk to, anyone? These assistants won’t just talk to you, either. This year, Google demonstrated Google Duplex, which will make phone calls for you. Currently, it can only cope with mundane tasks, such as booking a reservation at a restaurant. The system has a natural-sounding voice, and even adds “ums” and “ers.” Add a neural network system to answer calls, and it won’t be long before we can have our digital assistant call their digital assistant.
One of the biggest, and most public, applications of machine learning in the real world is the self-driving car. This little project is currently soaking up billions of dollars of research funding, and has a dozen or more of the world’s biggest firms chasing it, from giants such as Google and traditional car companies such as Ford, to a new generation of firms, including Tesla and Uber. It is a huge test of the technology, and results have been generally positive, although not without mishap.
The kinds of things that can throw a system are myriad, and often unexpected. While testing its system in Australia, Volvo found that it was confused by kangaroos, unable to work out which way they were going. Apparently, the hopping was registered as both moving toward and away from the vehicle. In India, Tata’s system is struggling to cope with the strange variety of mixed traffic on the roads. Auto-rickshaws have proved a special problem, as they are often decorated and customized to such a level that they become unidentifiable.
Despite technical hurdles, the race to autonomous cars appears unstoppable. There are dozens of test programs on public roads, and Ford plans to launch a fully autonomous car by 2021, “in scale.” Early cars will be expensive, and probably rented out for use as a taxi service, rather than personal transport. They also look likely to be initially limited to carefully mapped locations. We can be sure of a fuss (crashes by self-driving cars still make the news even now, and the legal implications are interesting), but they are coming to a road near you soon.
Machine learning will soon be endemic, humming away in the background, helping to run virtually every institution with a network connection, banks and governments, down to bots in our games, and assistants that suggest we order a pizza because an algorithm has calculated that our behavior model indicates this as a likely response to the four-hour gaming session it detected. You never know, perhaps one day we will have a version of Windows that can detect and correct bugs as it operates.
A brave new world? Not a flattering analogy perhaps, but there is a hint of dystopia here. Alongside patently worthy applications, such as medical research, and useful tools, there is going to be a mountain of marketing. Our digital footprint is going to be sifted and sorted, as companies look for patterns and clusters, trying to predict what we want, where we want it, and when we want it. You are the customer, but you will also become the product.
Our rights to online privacy are still being hammered out, too. How much of your life do you want processed by deep learning machines? Because there is a whole generation growing up that sees a smartphone as an essential item, and appears happy to share a detailed view of their lives online. The ramifications for social media are already becoming clear, as increasingly sophisticated bots learn to mimic real people. Twitter and Facebook are already deleting millions of accounts a year. Governments and organizations have always practiced social manipulation in one form or another, from slogans and billboards upward. Integrated machine learning systems bring a whole new armory into play.
At some point, boundaries will have to be drawn. Some of them rather important. It looks as though we are going to have to accept that people will occasionally be run down by autonomous cars, but what about being shot by an autonomous military machine? These are already out there (see boxout below). Are we getting too dark now? Probably. Like any tool we invent, it is up to us how it is used. However “smart” machines become, they are still kinda stupid, too. A three-year-old child doesn’t mistake a picture of a cake for one of a dog, an error that even the best image recognition system can make.
Many current machine learning projects have a narrow field. They are aimed at specific problems with limited goals: improving search engine results or guiding a car through traffic. The next step is a more generalized approach, where common sense and an understanding of human behavior and interaction are the target. Machines that don’t just talk to order food for us, but can carry out a conversation that you can enjoy, too.
The world of human cognitive development is providing clues. We learn not only through trial and error, but guided by a set of internal instincts, nature, and nurture. Among the researchers in this new field are Intelligence Quest, a group based at MIT, and the Allen Institute for Artificial Intelligence. The next generation of machine learning systems should be able to react to the chaotic and unpredictable human world with more of that most human of qualities: common sense. So when it sees a cake shaped like a dog, it won’t be fooled.
Soon, we may look back at the “dumb” computers of the recent past, with their inflexible logic, with some amusement. In the 1980s, John Gage of Sun Microsystems coined the phrase, “The network is the computer.” He was right, but it took years to sink in. Today, we view any unconnected machine as crippled. Perhaps we could now add one word to that phrase: “The neural network is the computer.”
Nvidia trained this deep learning image noise reduction system with 50,000 images. The most impressive part is that it was trained only using corrupted images.
This little bird was generated by a Microsoft research project called the Drawing Bot. It was created from a text caption: “This bird is red with white and has a very short beak.”
This stop sign, created by researchers at the University of Washington, fools self-driving cars; the added stickers are designed to throw the image recognition off balance.
A team at MIT has created RF-Pose, a system that uses wireless signals to track people, even through walls. It’s trained using a camera and radio signals.
If you train a neural network to identify dogs, then process the same image 50 times, this is the result; here thanks to Google’s DeepDream.
Feed thousands of images into a neural network, and you can create lifelike, but artificial, images. These were generated by Nvidia using a GAN, Generative Adversarial Network.
Adobe’s upcoming Cloak feature can remove selected elements from videos. No more painful frame-by-frame editing required, and the results are impressive.
The Super aEgis II Sentry gun can operate autonomously, tracking targets up to 2km.
These comical cars have been autonomously trundling around the Google campus for a while. Despite the controlled conditions and low speeds, testing has not gone without incident.
A self-driving Jaguar I-Pace car from Waymo (part of Alphabet); these are starting tests in San Francisco.
Using machine learning to study all of Rembrandt’s paintings, a team in the Netherlands generated this: “The Next Rembrandt.” It’s completely new, but unmistakably in his style.