THE AI EVOLUTION
How artificial intelligence is changing everything
IT’S THE NEXT BIG WAVE of computing. So says Intel of artificial intelligence (AI). If anything, that’s underselling it. According to some evangelists, AI is the key to unlocking almost every major problem humanity faces, from a cure for cancer to limitless clean energy. Less optimistic observers, including philosopher and neuroscientist Sam Harris, see AI as one of the most pressing existential threats to the survival of mankind. Either way, as Dr. Emmett Brown would put it, it’s pretty heavy.
Even a brief analysis of the implications of AI quickly takes on epic proportions. Back at the more practical end of the epistemological scale, getting a grasp on current commercial implementations of AI can be equally baffling. Machine learning, deep learning, neural networks, Tensor cores—keeping track of the processes and hardware associated with AI, not to mention the jargon, is a full-time job.
So, the long-term impact of AI may be anyone’s guess. But, in the here and now, there are plenty of questions that can at least begin to be addressed. What is AI in computing terms? What is it used for today? What kind of practical hardware is involved, and how are AI workflows processed? And does it add up to anything that you should care about as a computing enthusiast? Or is it just a tool for Big Tech to bolster their balance sheets, or the greatest paradigm shift in the history of computing?
BUZZWORDS AND BULLCRAP
What, exactly, is artificial intelligence, or AI? According to the hype, AI won’t just radically revolutionize computing. Eventually it will alter almost every aspect of human life. Right now, however, defining AI and determining how relevant it really is in day-to-day computing is not so easy.
Put another way, we can all agree that when, for instance, the self-driving car is cracked, it’ll have a huge impact on the way we live. But more immediately, when a chip maker bigs up the “AI” abilities of its new CPU or GPU, does that actually mean much beyond the marketing? Whether it’s a graphics card or a smartphone chip, is the addition of “AI” fundamentally different from the usual generational improvements in computing performance?
Taken in its broadest sense, AI is any form of intelligence exhibited by machines. The meaning of “intelligence” obviously poses philosophical problems, but that aside, it’s a pretty straightforward concept. Drill down into the specifics, however, and it all gets a lot more complicated. How do you determine that any given computational process or algorithm qualifies as artificial intelligence?
WHAT DEFINES AI?
One way to define AI is the ability to adapt and improvise. If a given process or algorithm can’t do that to some degree, it’s not AI. Another common theme is the combination of large amounts of data with the absence of explicit programming. In simple terms, AI entails a system with an assigned task or a desired output, and a large set of data to sort through, but the precise parameters under which the data is processed aren’t defined. Instead, the algorithms are designed to spot patterns and statistical relationships, and learn in a trial-and-error fashion. This is otherwise known as machine learning, and it’s usually what is meant when the term AI is used in a commercial computing context.
A good example of how this works in practice is natural language processing. A non-AI approach would involve meticulously coding the specific rules, syntax, grammar, and vocabulary of a given language. With machine learning, the algorithmic rules are much less specific and all about pattern spotting, while the system is fed huge amounts of sample data from which patterns eventually emerge.
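To make the contrast concrete, here’s a minimal Python sketch of both approaches to a trivial language task. It is purely illustrative: the word lists, training sentences, and use of scikit-learn are assumptions for the example, not anything from a production system.

```python
# Toy contrast between hand-coded rules and machine learning; all data is made up.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Non-AI approach: the behavior is spelled out explicitly by the programmer.
def rule_based_sentiment(text):
    positive, negative = {"great", "love", "excellent"}, {"awful", "hate", "broken"}
    words = set(text.lower().split())
    return "positive" if len(words & positive) >= len(words & negative) else "negative"

# Machine learning approach: no rules, just labeled examples to learn patterns from.
train_texts = ["I love this GPU", "excellent value", "great performance",
               "awful drivers", "I hate the noise", "arrived broken"]
train_labels = ["positive", "positive", "positive", "negative", "negative", "negative"]

model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(train_texts, train_labels)

print(rule_based_sentiment("the fan is awful"))    # follows the hand-written rules
print(model.predict(["great card, love it"])[0])   # follows learned statistics
```

The point isn’t the accuracy of either toy, it’s that the second version never sees a rule about language; it only sees examples.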
GPT-3
Generative Pre-trained Transformer 3 (GPT-3), developed by San Francisco-based OpenAI and released in 2020, is just such a machine learning natural language system. It was trained using billions of English language articles harvested from the web. GPT-3 arrived to much acclaim, with The New York Times declaring it “terrifyingly good” at writing and reading. In fact, GPT-3 was so impressive, Microsoft opted to acquire an exclusive license in order to use the technology to develop and deliver advanced AI-powered natural language solutions, the first of which is a tool that converts text into Microsoft Power Fx code, a programming language used for database queries and derived from Microsoft Excel formulas.
GPT-3 goes beyond the usual question-and-answer Turing Test tricks. It can do things such as build web layouts using JSX code in response to natural language requests. In other words, you type something like “build me a page with a table showing the top 20 countries listed by GDP and put a relevant title at the top,” and GPT-3 can do just that. It showcases both the ability to adapt and a system that’s based on processing data rather than intricate hand-coded rules.
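For a sense of how that looks in practice, below is a rough sketch of how a developer might have sent such a request to GPT-3 through OpenAI’s Python client of that era. The engine name, prompt, and parameters are illustrative assumptions, not an excerpt from any real integration, and the output depends entirely on the model.

```python
# Hypothetical 2020-era GPT-3 completion request; engine, prompt, and settings
# are illustrative only.
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

prompt = (
    "Write JSX for a page with a table showing the top 20 countries listed by GDP, "
    "with a relevant title at the top.\n\nJSX:\n"
)

response = openai.Completion.create(
    engine="davinci",   # a base GPT-3 engine offered at launch
    prompt=prompt,
    max_tokens=400,
    temperature=0.2,    # low temperature for more deterministic code-like output
)

print(response.choices[0].text)  # the generated JSX, ready to inspect
```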
However, GPT-3 is also a case study in the limitations of machine learning. Indeed, it’s debatable whether GPT-3 is
actually intelligent at all. Arguably, it could all be considered something of a digital conjuring trick. That’s because GPT-3 and its machine-learning natural language brethren do not understand language—or, ultimately, anything else—at all. Instead, everything they do is simply based on statistics and patterns.
By analyzing enormous quantities of written text, statistical rules that output plausible responses to natural language queries can be created without any need for what, on a human level, would be classed as understanding. And that, essentially, is the basis for most—if not all—machine learning, whether it’s applied to the problem of natural language processing and voice assistants, self-driving cars, facial recognition, or recommending products and content to customers and consumers. It’s just pattern spotting on an epic scale. From Amazon’s Alexa to Tesla’s Autopilot, the fundamental approach is the same. You can find out more about the limitations of existing AI systems in the boxout on page 43, but if we’ve established the rough parameters of AI, the
next question is how it’s implemented and what kind of hardware is required.
HARDWARE REQUIREMENTS
At one end of the scale is what you might call AI infrastructure, the supercomputers and cloud installations used to do the epic data analysis and generate the pattern maps or models. At the other? Client devices that use those models in practical applications.
It’s the latter that’s of most interest to us as PC enthusiasts. Put another way, the question this all begs is whether AI is a clearly identifiable computing paradigm that benefits from dedicated hardware in client devices such as PCs and phones. Does it mean anything to have something akin to “AI compute cores” or specific parts of a chip expressly designed to accelerate AI applications? Or is AI on the client level largely marketing spin?
For sure, chips from the big players in computing are increasingly sold on the strength of their AI prowess. The most obvious example on the PC is Nvidia’s graphics chips. For several generations now, Nvidia’s GPUs have claimed specific AI capabilities thanks to a feature known as a Tensor core. First seen in Nvidia’s Volta GPUs for enterprise applications and then in the Turing family of gaming GPUs, sold as the GeForce RTX 20-series, Nvidia’s Tensor cores are now in their third generation in the Ampere GPUs, otherwise known as the GeForce RTX 30-series.
DLSS
The most familiar application for
Nvidia’s Tensor cores is its DLSS, or Deep Learning Super Sampling technology. It’s a graphics upscaling implementation designed to deliver higher image quality while also boosting frame rates. The basic idea is to render a game engine at a lower resolution, say 1080p, to achieve higher frame rates, and then use DLSS to upscale the image to mimic the visual quality of a higher resolution, such as 4K.
The basic idea of upscaling isn’t new. But instead of conventional spatial upscaling, DLSS is a temporal algorithm that compares the current frame to the previous frame, generating motion vectors that can be used to enhance image quality during upscaling. As it happens, DLSS is also a neat example of how AI infrastructure interfaces with client hardware.
DLSS is based on a supercomputer-powered per-game training process that compares millions of high-quality reference frames rendered at 16K (15,360 by 8,640 pixels) to scaled outputs, incrementally generating improved algorithms. Once the “training” process is complete, the DLSS scaling model for a given game is included in Nvidia’s graphics drivers, then accelerated by the Tensor cores in Nvidia RTX graphics chips, allowing DLSS to be applied in real time to a running game engine.
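To illustrate the temporal idea in the plainest possible terms, here’s a toy numpy sketch of motion-vector-guided upscaling. This is not Nvidia’s network: the nearest-neighbor scaler, the fixed blend weight, and the integer motion vectors are all simplifying assumptions that DLSS replaces with a trained model running on Tensor cores.

```python
# Toy sketch of temporal upscaling; not Nvidia's actual DLSS implementation.
import numpy as np

def spatial_upscale(frame, scale):
    # Nearest-neighbor upscale, standing in for any conventional scaler.
    return np.repeat(np.repeat(frame, scale, axis=0), scale, axis=1)

def warp(prev_hi, motion):
    # Re-project the previous high-res frame along per-pixel motion vectors.
    h, w = prev_hi.shape[:2]
    ys, xs = np.indices((h, w))
    src_y = np.clip(ys - motion[..., 1], 0, h - 1)
    src_x = np.clip(xs - motion[..., 0], 0, w - 1)
    return prev_hi[src_y, src_x]

def temporal_upscale(lo_frame, prev_hi, motion, scale=2, blend=0.8):
    # Blend re-projected history with a spatial upscale of the current frame.
    # In DLSS the per-pixel blending is decided by the trained model.
    current = spatial_upscale(lo_frame, scale)
    history = warp(prev_hi, motion)
    return blend * history + (1.0 - blend) * current

lo = np.random.rand(540, 960, 3)             # current frame at half resolution
prev = np.random.rand(1080, 1920, 3)         # previous full-resolution output
mv = np.zeros((1080, 1920, 2), dtype=int)    # zero motion: a static scene
print(temporal_upscale(lo, prev, mv).shape)  # (1080, 1920, 3)
```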
And you know what? At its best, DLSS does things that other upscalers simply cannot achieve. Nvidia’s arch-rival AMD recently released a more conventional non-AI alternative to DLSS, and while it works well enough, it lacks the magic of DLSS. Nvidia’s AI upscaler isn’t flawless, but it comes closer to convincingly mimicking native resolution rendering than anything else.
WHAT’S IN A TENSOR?
So, what exactly are Nvidia’s Tensor cores and how do they differ from more familiar circuitry like pixel shaders in GPUs or integer and floating point units in CPUs? Is there really something specific about them that justifies the “artificial intelligence” angle?
In fundamental computational terms, Tensor cores accelerate matrix operations by performing mixed-precision matrix multiply and accumulate calculations in a single operation. These kinds of calculations are mathematically straightforward, but computationally intensive on general-purpose computing architectures, such as CPU cores. The “Tensor” branding, for the record, is an appropriation of the mathematical term of the same name that refers to containers of data of which matrices
are a subset, and also includes scalars and vectors. A scalar can be thought of as a zero-dimensional tensor composed of a single number, a vector is a one-dimensional tensor composed of a single column of numbers, and a matrix is a two-dimensional tensor made up of both rows and columns of numbers.
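In numpy terms, the operation a Tensor core accelerates might be sketched like this. The 4x4 tile size and random data are purely illustrative, and real hardware fuses the whole thing into a single step rather than separate calls.

```python
# Sketch of the mixed-precision multiply-accumulate a Tensor core performs:
# D = A x B + C, with FP16 multiply inputs and FP32 accumulation.
import numpy as np

A = np.random.rand(4, 4).astype(np.float16)   # half-precision input tile
B = np.random.rand(4, 4).astype(np.float16)   # half-precision input tile
C = np.random.rand(4, 4).astype(np.float32)   # full-precision accumulator

D = A.astype(np.float32) @ B.astype(np.float32) + C
print(D.dtype, D.shape)                       # float32 (4, 4)

# Scalars, vectors, and matrices are simply tensors of increasing rank:
scalar = np.array(3.0)          # rank 0: a single number
vector = np.array([1.0, 2.0])   # rank 1: a single column of numbers
matrix = np.eye(2)              # rank 2: rows and columns of numbers
print(scalar.ndim, vector.ndim, matrix.ndim)  # 0 1 2
```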
Modern CPUs do have floating point units with SIMD or single instruction multiple data extensions designed, among other things, to accelerate calculations involving matrices. But that hardware remains multipurpose rather than purely dedicated to matrix math. Highly parallelized GPUs are even better at crunching matrices, but even they employ broadly programmable units, such as Nvidia’s CUDA cores, that are not narrowly and exclusively designed for matrix math.
Except, of course, Nvidia’s Tensor cores. Nvidia’s GV100 chip, as found in the Titan V graphics card, was its first GPU with Tensor cores—640 of them to be precise, rated at 110 TFLOPS. By way of comparison, the GV100 also has 5,120 of Nvidia’s conventional CUDA cores, which can also do matrix math. The combined computational output of those CUDA cores comes to just 27.6 TFLOPS, despite their being far more numerous and eating up the majority of the die space on the GV100 chip. When it comes to matrix math, then, each Tensor core delivers well over an order of magnitude more throughput than a CUDA core.
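Working those published numbers through gives a rough sense of the per-core gap. This is a back-of-envelope calculation using only the figures quoted above:

```python
# Back-of-envelope per-core comparison using the GV100 figures quoted above.
tensor_tflops, tensor_cores = 110.0, 640
cuda_tflops, cuda_cores = 27.6, 5120

per_tensor_core = tensor_tflops / tensor_cores    # ~0.17 TFLOPS per Tensor core
per_cuda_core = cuda_tflops / cuda_cores          # ~0.005 TFLOPS per CUDA core
print(round(per_tensor_core / per_cuda_core, 1))  # roughly 32x at matrix math
```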
What’s more, the matrix math performance of Nvidia GPUs is growing fast. At one end of the scale, the GeForce RTX 3060, currently the lowest-end RTX 30 series GPU available, roughly matches the GV100 in terms of Tensor core performance. Meanwhile, Nvidia’s latest high-end enterprise GPU, the A100, has around triple the Tensor core performance. As it happens, the A100 chip is the basis for Nvidia’s Perlmutter supercomputer, claimed to be the world’s fastest AI machine. Created in
partnership with HPE, it packs no fewer than 6,159 A100 GPUs, delivering nearly four exaflops of AI compute power, a number so huge that it’s difficult to, well, compute.
INTEL, GOOGLE, AND APPLE
Anyway, AI-specific logic in chips from Nvidia’s competitors is also all about matrix math. Intel’s upcoming Sapphire Rapids Xeon processor for servers, for instance, packs what Intel is describing as a new “built-in” AI acceleration engine called Intel
Advanced Matrix Extensions, or AMX for short. For now, Intel isn’t giving away much by way of specifics. But it’s thought that AMX essentially amounts to a matrix math overlay on top of Intel’s existing AVX-512 vector engines.
Coincidentally, Apple has its own matrix-optimized AI blocks in its in-house chips, beginning with the A13 SoC in the iPhone 11, and revised for the new A14 and M1 chips, the latter found in the latest MacBook Air, entry-level MacBook Pro, Mac Mini, and iMac. Those blocks are also dubbed AMX, although in this case that refers to “Apple Matrix,” as opposed to Intel’s Advanced Matrix Extensions. Actually, Apple’s chips confusingly have yet another AI-accelerating block, known as the Neural Engine. For more on that and the broader concept of neural networks, head for the boxout on the right.
Meanwhile, Google’s TPU, or Tensor Processing Unit, which is a whole chip dedicated to matrix or tensor math, is now on its fourth generation, having debuted in 2015. Granted, the Google TPU is not something you’ll find in your PC or phone, but it’s another example of how the computing industry is converging around a common approach to AI acceleration, and confirmation that AI is a meaningful computing paradigm, not merely a marketing term. Google says those TPUs, incidentally, were used to identify and process all the text appearing in its entire StreetView database in Google Maps in under five days.
Overall, AI presents an intriguing dichotomy. On the one hand, there is no doubting the huge impact it is having on computing and in turn the way we live our lives. AI and machine learning can do things no other computing paradigm can match. On the other, it’s hard to dispel the sense that it’s misdescribed. Machines can certainly learn to do some seemingly clever things, but it’s debatable if there’s any true intelligence involved. If it isn’t, if all this really is purely pattern spotting, AI may be destined to remain siloed into narrow if ever more numerous competencies, the full promise and peril of a more generalized artificial intelligence tantalizingly out of reach.