A CHIP-DESIGN GURU AND THE UNFATHOMABLE FUTURE OF AI CPU ARCHITECT AND TECH LEGEND JIM KELLER ON MACHINE LEARNING
Is there anybody involved in chip design with a better résumé than Jim Keller? He’s the brains behind not only AMD’s first CPU architecture to really sock it to Intel (that’ll be Hammer back in the 2000s), but also the driving force that led to Zen and AMD’s renaissance in the late 2010s. He sired Tesla’s Autopilot chip, set Apple on the path to building arguably the most efficient client CPUs available today, and more recently had a stint at Intel.
Now Keller is heading a start-up, Tenstorrent, specializing in custom chips for accelerating AI workloads. That fact alone is enough to lend serious credibility to the field of AI and machine learning. According to Keller, there are now three kinds of computers. First there were CPUs. Then came GPUs. Now, the AI computer has arrived.
“In the beginning,” Keller explains of the development of computers, “there was scalar math, like A equals B plus C times D. With a small number of transistors, that’s all the math you could do.” As the transistor count of computer chips grew exponentially, so did the complexity of the math that was possible: first vector math, then matrix multiplication. Today, that complexity is pushing chip design in a new direction.
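Keller's progression can be sketched in a few lines of plain Python. This is purely illustrative (the function names are ours, not tied to any real chip): each step applies the same kind of arithmetic to progressively larger structures.

```python
def scalar_op(b, c, d):
    # Scalar math: one result per operation, i.e. A = B + C * D
    return b + c * d

def vector_op(b, c, d):
    # Vector math: the same operation applied across whole arrays at once
    return [bi + ci * di for bi, ci, di in zip(b, c, d)]

def matmul(x, y):
    # Matrix multiplication: the workhorse of modern neural-network workloads
    rows, inner, cols = len(x), len(y), len(y[0])
    return [[sum(x[i][k] * y[k][j] for k in range(inner)) for j in range(cols)]
            for i in range(rows)]
```

Each level does vastly more arithmetic per "operation" than the last, which is why it only becomes practical as transistor budgets grow.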
“As we get even more transistors, you want to take big operations and break them up. If you make your matrix multiplier too big, you begin to waste energy. So you build this optimal size block that’s not too small, like a thread in a GPU, but not too big, like covering the whole chip with one matrix multiplier.” The result, according to Keller, is an array of medium-sized processors where “medium” means a processor capable of four tera operations per second. What Keller is describing is the next step in AI computing from specialized blocks that accelerate matrix math, like Nvidia’s Tensor cores, to a new type of chip that accelerates what he calls “graph” computing.
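The idea of breaking one big operation into not-too-small, not-too-big blocks is essentially tiled matrix multiplication. Here is a minimal sketch (the tile size of 2 is arbitrary; in Keller's world each tile would map onto one of those "medium-sized" processing blocks rather than a loop iteration):

```python
def tiled_matmul(x, y, tile=2):
    # Multiply two n-by-n matrices by walking the work in tile-sized blocks,
    # the way an array of medium-sized cores might divide it up.
    n = len(x)
    out = [[0.0] * n for _ in range(n)]
    for i0 in range(0, n, tile):
        for j0 in range(0, n, tile):
            for k0 in range(0, n, tile):
                # Each (i0, j0, k0) block is a small, self-contained chunk
                # of the overall multiply
                for i in range(i0, min(i0 + tile, n)):
                    for j in range(j0, min(j0 + tile, n)):
                        for k in range(k0, min(k0 + tile, n)):
                            out[i][j] += x[i][k] * y[k][j]
    return out
```

The result is identical to a monolithic multiply; what changes is that the work now decomposes into uniform blocks that can be sized to the sweet spot Keller describes.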
The other piece of the puzzle is software, an element Keller clearly views as just as important as the hardware, and which he refers to as Software 2.0. “The first time I heard the term Software 2.0, it was coined by Andrej Karpathy, who is the director of AI and Autopilot at Tesla. His idea was that we’re going from a world where you write programs to modify data to one where you build neural networks and then program them with data to do the things you want. So modern computers are programmed with data. It means a very different way of thinking about programming in the many areas where AI has had so much success. I think in the Software 2.0 future, 90 percent of computing will be done that way,” he says.
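The contrast can be shown in the smallest possible form. This is a hedged illustration of the idea, not Karpathy's own formulation: in "1.0" the programmer writes the rule by hand; in "2.0" the program is a parameter, and the data pushes it toward the same rule.

```python
def software_1_0(x):
    # Software 1.0: the programmer writes the rule explicitly
    return 3 * x

def software_2_0(data, steps=200, lr=0.01):
    # Software 2.0: the "program" is this weight; the data writes it
    w = 0.0
    for _ in range(steps):
        for x, y in data:
            # Gradient step on squared error (w*x - y)**2
            w -= lr * 2 * (w * x - y) * x
    return w

examples = [(x, 3 * x) for x in range(1, 5)]
w = software_2_0(examples)  # should converge near 3.0
```

Both end up computing the same function, but only the first one was ever written down by a person, which is exactly the shift Keller is pointing at.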
The most fascinating of Keller’s prognostications is the future unfathomability of AI computing. “If you go look inside a neural network that’s been trained and ask why this particular element is given, say, the weight 0.0015843, nobody knows,” Keller explains. “There are phenomena now where you have a machine learning system with inputs and outputs. There’s an AI black box in between the inputs and outputs, but if you look in the box you can’t tell what it means. You don’t understand the math or the circuits underneath.”
Keller concedes that the full complexity of a high-end AMD, Intel, or Apple chip is already almost unfathomable, but if you dig into any given detail, someone knows what it does. If Keller is correct, it won’t be long before 90 percent of computing is beyond the wit of man. Whether it’s intelligent is another matter.