Tech Advisor

AMD Radeon Vega

5 things you need to know about AMD’s cutting-edge graphics cards

AMD’s latest cards bring graphics and memory closer together, reveals Brad Chacos


“Wait for Vega.” For the past six months that’s been the message from the Radeon faithful, as Nvidia’s beastly GeForce GTX 1070 and GTX 1080 stomped above AMD’s Radeon RX 400-series graphics cards.

While Nvidia’s powerful new 16nm Pascal GPU architecture scales all the way from the lowly £150 GTX 1050 to the mighty £1,300 GTX Titan X, AMD’s 14nm Polaris graphics are designed for more mainstream video cards, and the flagship Radeon RX 480 is no match for Nvidia’s higher-end brawlers. Thus ‘Wait for Vega’ has become the rallying cry for AMD supporters with a thirst for face-melting gameplay – Vega being the code name of the new enthusiast-class 14nm Radeon graphics architecture teased on AMD road maps for early 2017.

Unfortunately, the wait will continue, as the new architecture won’t appear in shipping cards until sometime later in the first half of 2017. But at CES, Vega is becoming more than a mere code name: AMD is finally revealing some technical teases for Radeon’s performance-focused response to Nvidia’s titans, including how the new GPU intertwines graphics performance and memory architectures in radical new ways.

Before we dive in too deeply, here’s a high-level overview of the Vega technical architecture preview. All those words will become meaningful in time. Let’s start with what you want to hear about first.

1. Speed demon

In a preview shown to journalists and analysts in December, AMD played 2016’s sublime Doom on an early Radeon Vega 10 graphics card with everything cranked to Ultra at 4K resolution. The game scales like a champ, but that’s hell on any graphics card: even the GTX 1080 can’t hit a 60 frames per second (fps) average at those settings, per Techspot (tinyurl.com/gvt9kxe). Radeon Vega, meanwhile, floated between 60 and 70fps. Sure, it was running Vulkan – a graphics API that favours Radeon cards in Doom – rather than DirectX 11. But, hot damn, the demo was impressive.

A couple of other sightings in recent months confirm Vega’s speed. At the New Horizon livestream that introduced AMD’s Ryzen CPU to the world, the company showed Star Wars: Battlefront running on a PC that pairs Ryzen with Vega. The duo maxed out the 4K monitor’s 60Hz speed with everything cranked to Ultra. The GTX 1080, on the other hand, hits just shy of 50fps, Techspot’s testing shows (tinyurl.com/hbf6fad).

Meanwhile, a since-deleted leak in the Ashes of the Singularity database in early December showed a GPU with the Device ID ‘687F:C1’ surpassing many GTX 1080s in benchmark results. Here’s the twist: the Device ID shown in the frame rate overlay during AMD’s recent Vega preview with Doom confirmed that Vega 10 is indeed 687F:C1. These numbers come with all sorts of caveats: Vega 10 isn’t in its final form yet, we don’t know whether the graphics card AMD teased is Vega’s beefiest incarnation, all three of those benchmarked games heavily favour Radeon, and so on.

But all that said, Vega certainly looks competitive on the graphics performance front, partly because AMD designed Vega to work smarter, not just harder. “Moving the right data at the right time and working on it the right way,” was a major goal for the team, according to Mike Mantor, an AMD corporate fellow focused on graphics and parallel compute architecture – and a large part of that stems from tying graphics processing more closely with Vega’s radical memory design.

2. All about memory

When it comes to onboard memory, Vega is downright revolutionary – just like its predecessor. AMD’s current high-end graphics cards, the Radeon Fury series, brought cutting-edge high-bandwidth memory to the world. Vega carries on the torch with improved next-gen HBM2, bolstered by a new ‘high-bandwidth cache controller’ introduced by AMD.

Technical constraints limited the first generation of HBM to a mere 4GB of capacity, which in turn capped the Fury series at 4GB of onboard RAM. Thankfully, HBM’s raw speed hid that flaw in the vast majority of games, but now HBM2 tosses those shackles aside. AMD hasn’t officially confirmed Vega’s capacity, but the overlay during the Doom demo revealed that particular graphics card packed 8GB of RAM. And that super-fast RAM is getting even faster, with AMD’s Joe Macri stating that HBM2 offers twice the bandwidth per pin of HBM1.

But as it turns out, HBM was just the beginning. “It’s an evolutionary technology we can take through time, make it bigger, faster, make all these key improvements,” said Macri, a driving force behind HBM’s creation. Vega builds on HBM’s shoulders with the introduction of a new high-bandwidth cache and high-bandwidth cache controller, which combine to form what Radeon boss Raja Koduri calls “the world’s most scalable GPU memory architecture”.

AMD crafted Vega’s high-bandwidth memory architecture to help propel memory design forward in a world where sheer graphics performance keeps improving by leaps and bounds, but memory capacities and capabilities have remained relatively static. The HB cache replaces the graphics card’s traditional frame buffer, while the HB cache controller provides fine-grained control over data and supports a whopping 512TB – not gigabytes, terabytes – of virtual address space. Vega’s HBM design can expand graphics memory beyond onboard RAM to a more heterogeneous memory system capable of managing several memory sources at once.
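To put the 512TB figure in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes binary units (1TB = 2^40 bytes), which AMD did not specify, and it borrows the 8GB capacity seen on the Doom demo card purely for comparison. The function name is our own, not anything from AMD.

```python
# Back-of-the-envelope arithmetic, assuming binary units (an assumption;
# AMD did not say whether it means 10^12 or 2^40 bytes per terabyte).
GIB = 2**30
TIB = 2**40


def address_bits(size_bytes: int) -> int:
    """Smallest number of address bits that can index size_bytes bytes."""
    return (size_bytes - 1).bit_length()


virtual_space = 512 * TIB   # address space quoted for the HB cache controller
onboard_ram = 8 * GIB       # capacity shown in the Doom demo overlay

bits_needed = address_bits(virtual_space)
times_larger = virtual_space // onboard_ram

print(f"Address bits needed: {bits_needed}")
print(f"Virtual space is {times_larger:,}x the demo card's onboard RAM")
```

Under those assumptions the address space works out to 49 bits, tens of thousands of times larger than the memory physically on the card, which is why the controller is pitched at managing several memory sources rather than just onboard RAM.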

That’s likely to make its biggest impact in professional applications, such as the new Radeon Instinct line-up or the cutting-edge Radeon Pro SSG card that grafts high-capacity NAND memory directly to its graphics processor. “This will allow us to connect terabytes of memory to the GPU,” David Watters, AMD’s head of Industry Alliances, told our colleagues at PCWorld when the Radeon Pro SSG was revealed, and this new cache and controller architecture designed for HBM’s high speeds should supercharge those capabilities even more.

To drive the potential benefits home, AMD revealed a photorealistic recreation of Macri’s home living room. The 600GB scene normally takes hours to render, but the combination of Vega’s prowess and the new HBM2 architecture pumps it out in mere minutes. AMD even allowed journalists to move the camera around the room in real-time, albeit somewhat sluggishly. It was an eye-opening demo.

Koduri stressed that games can also benefit from the high-bandwidth cache controller’s fine-grained, dynamic data management, citing Witcher 3 and Fallout 4, each of which actually uses less than half the memory it allocates when running at 4K resolution. “And those are well-optimised games,” he said. Memory demands are only getting greedier in high-profile games, and doubly so at bleeding-edge resolutions. Here’s hoping that the HB cache’s finer controls paired with HBM’s sheer speed alleviate that somewhat.

AMD also says that future generations of games could take advantage of the high-bandwidth memory design to upload large data sets directly to the graphics processor, rather than managing them with the more hands-on approach used today.

3. Efficient pipeline management

The way graphics cards render games isn’t very efficient. Case in point: a scene from Deus Ex: Mankind Divided. It packs in a whopping 220 million polygons, according to Koduri, but only two million or so are actually visible to the player. Enter Vega’s new programmable geometry pipeline.

Rendering a scene is a multi-step process, with graphics cards processing vertex shaders before passing the information on to the geometry engine for additional work. Vega speeds things up with the help of primitive shaders that identify the polygons that aren’t visible to players so the geometry engine doesn’t waste time on them. Vega also blazes through information at over twice the peak throughput of its predecessors, and includes a new ‘Intelligent Workgroup Distributor’ to improve task load balancing from the very beginning of the pipeline.
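AMD hasn’t detailed how primitive shaders decide what to throw away, so the Python sketch below is only an illustration of the general idea of discarding invisible polygons early. It uses two textbook tests, back-face culling and off-screen rejection, on 2D screen-space triangles. The function names and the convention that counter-clockwise triangles face the viewer are our assumptions, not AMD’s.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Triangle = Tuple[Point, Point, Point]


def signed_area(tri: Triangle) -> float:
    """Signed area of a screen-space triangle (positive if counter-clockwise)."""
    (x0, y0), (x1, y1), (x2, y2) = tri
    return 0.5 * ((x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0))


def is_visible(tri: Triangle, width: int, height: int) -> bool:
    """Illustrative visibility test; not AMD's actual primitive shader logic."""
    # Assumption: counter-clockwise winding means the triangle faces the viewer.
    # Zero area means the triangle is degenerate and covers no pixels.
    if signed_area(tri) <= 0:
        return False
    xs = [p[0] for p in tri]
    ys = [p[1] for p in tri]
    # Reject triangles whose bounding box lies entirely outside the screen.
    if max(xs) < 0 or min(xs) >= width or max(ys) < 0 or min(ys) >= height:
        return False
    return True


def cull(triangles: List[Triangle], width: int, height: int) -> List[Triangle]:
    """Keep only the triangles worth passing to later pipeline stages."""
    return [t for t in triangles if is_visible(t, width, height)]


scene = [
    ((0, 0), (10, 0), (0, 10)),            # front-facing and on screen
    ((0, 0), (0, 10), (10, 0)),            # back-facing
    ((-30, -30), (-20, -30), (-30, -20)),  # entirely off screen
]
survivors = cull(scene, width=1920, height=1080)
print(f"{len(survivors)} of {len(scene)} triangles survive culling")
```

The point of doing this early is the same one Koduri makes with Deus Ex: every polygon rejected here is one the later, more expensive stages never have to touch.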


These tweaks drive home how AMD’s infiltration in consoles can benefit PC gamers, too. The inspiration for the load balancing tweaks comes from console developers used to working “closer to the metal” than PC developers, who highlighted it as a potential area for improvement for AMD, Raja Koduri explained.

4. Right task, right time

AMD designed Vega to “smartly schedule past the work that doesn’t have to be done,” according to Mike Mantor. The final tidbits made public by the company drive that home. Vega continues AMD’s multiyear push to reduce memory bandwidth consumption. Its next-gen pixel engine includes a ‘draw stream binning rasterizer’ that improves performance and saves power by teaming with the high-bandwidth cache controller to more efficiently process a scene. After the geometry engine performs its (already reduced amount of) work, Vega identifies overlapping pixels that won’t be seen by the user and thus don’t need to be rendered. The GPU then discards those pixels rather than wasting time rendering them. The draw stream binning rasterizer’s design “lets us visit a pixel to be rendered only once,” reveals Mantor.
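The “visit a pixel only once” idea can be sketched in a few lines of Python. This is a toy model, not AMD’s design: it uses axis-aligned rectangles in place of real draw calls, a hypothetical 4-pixel tile size, and made-up function names. It sorts the rectangles into screen tiles (bins), works out which one is nearest at each pixel, and only then shades, so hidden pixels are never shaded at all.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Tuple

TILE = 4  # hypothetical tile (bin) size in pixels


class Rect(NamedTuple):
    """Axis-aligned stand-in for a draw call; x1 and y1 are exclusive."""
    x0: int
    y0: int
    x1: int
    y1: int
    depth: float  # smaller means closer to the viewer
    name: str


def bin_rects(rects: List[Rect], width: int, height: int) -> Dict[Tuple[int, int], List[Rect]]:
    """Assign each rectangle to every screen tile it overlaps."""
    bins: Dict[Tuple[int, int], List[Rect]] = defaultdict(list)
    for r in rects:
        x0, y0 = max(r.x0, 0), max(r.y0, 0)
        x1, y1 = min(r.x1, width), min(r.y1, height)
        if x0 >= x1 or y0 >= y1:
            continue  # nothing left on screen after clipping
        for ty in range(y0 // TILE, (y1 - 1) // TILE + 1):
            for tx in range(x0 // TILE, (x1 - 1) // TILE + 1):
                bins[(tx, ty)].append(r)
    return bins


def render_binned(rects: List[Rect], width: int, height: int):
    """Resolve visibility per tile first, then shade each covered pixel once."""
    frame: Dict[Tuple[int, int], str] = {}
    shaded = 0
    for (tx, ty), members in bin_rects(rects, width, height).items():
        for y in range(ty * TILE, min((ty + 1) * TILE, height)):
            for x in range(tx * TILE, min((tx + 1) * TILE, width)):
                covering = [r for r in members
                            if r.x0 <= x < r.x1 and r.y0 <= y < r.y1]
                if not covering:
                    continue
                nearest = min(covering, key=lambda r: r.depth)
                frame[(x, y)] = nearest.name  # the one and only shade here
                shaded += 1
    return frame, shaded


def naive_shade_count(rects: List[Rect], width: int, height: int) -> int:
    """Pixels shaded if every rectangle is drawn in full, overlaps and all."""
    total = 0
    for r in rects:
        w = max(0, min(r.x1, width) - max(r.x0, 0))
        h = max(0, min(r.y1, height) - max(r.y0, 0))
        total += w * h
    return total


scene = [Rect(0, 0, 8, 8, 0.5, "far"), Rect(2, 2, 6, 6, 0.1, "near")]
frame, shaded = render_binned(scene, 8, 8)
print(f"binned: {shaded} pixels shaded, naive: {naive_shade_count(scene, 8, 8)}")
```

In this tiny scene the naive approach shades the 16 pixels behind the near rectangle for nothing. Working a tile at a time also keeps the data being touched small, which fits AMD’s stated aim of cutting memory bandwidth consumption.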

The revamped Vega architecture also now feeds render back-ends from the pixel engine into the larger, shared L2 cache, rather than pumping them directly into the memory controller. AMD says that should help improve performance in GPU compute applications that rely on deferred shading.

5. Next-gen compute engine

Finally, AMD teased Vega’s ‘next-gen compute engine’, which is capable of 512 8-bit operations per clock, 256 16-bit operations per clock, or 128 32-bit operations per clock. The 8- and 16-bit ops mostly matter for machine learning, computer vision, and other GPU compute tasks, though Koduri says the 16-bit ops can come in handy for certain gaming tasks that require less stringent accuracy as well. (The AMD-powered PlayStation 4 Pro also supports 256 16-bit operations per clock.)

Coincidentally enough, the Vega NCU can perform two 16-bit ops simultaneously, doubled up and scheduled together. This wasn’t possible in previous AMD GPUs, Koduri says. Vega’s next-gen compute unit has been optimised for the GPU’s higher clock speeds and higher instructions-per-cycle, though AMD declined to disclose the core clock speeds for Vega just yet.
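Those three throughput figures are really one figure viewed at different precisions, and the “doubled up” 16-bit ops are what make the maths work. The Python sketch below checks the arithmetic and shows the packing idea using plain integers. That is a simplification on our part: the helper names are invented, and real GPU 16-bit maths is likely to be floating point rather than the integer lanes used here.

```python
# Operations per clock quoted by AMD, keyed by operand width in bits.
OPS_PER_CLOCK = {8: 512, 16: 256, 32: 128}


def bits_per_clock(width: int) -> int:
    """Total operand bits processed per clock at a given precision."""
    return width * OPS_PER_CLOCK[width]


def pack16(hi: int, lo: int) -> int:
    """Pack two 16-bit values into one 32-bit word."""
    return ((hi & 0xFFFF) << 16) | (lo & 0xFFFF)


def unpack16(word: int):
    """Split a 32-bit word back into its two 16-bit lanes."""
    return (word >> 16) & 0xFFFF, word & 0xFFFF


def packed_add16(a: int, b: int) -> int:
    """Add both 16-bit lanes of two packed words in one step.

    Each lane wraps around independently at 2**16, so a carry out of the
    low lane never spills into the high lane.
    """
    a_hi, a_lo = unpack16(a)
    b_hi, b_lo = unpack16(b)
    return pack16(a_hi + b_hi, a_lo + b_lo)


for width in sorted(OPS_PER_CLOCK):
    print(f"{width:>2}-bit: {bits_per_clock(width)} bits per clock")

result = unpack16(packed_add16(pack16(1, 2), pack16(3, 4)))
print(f"Two 16-bit additions in one packed operation: {result}")
```

Every precision comes out at the same number of bits per clock, which is consistent with narrower values being packed side by side into hardware sized for 32-bit work.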

Waiting for Vega

The wait for Vega continues, but now we have some idea of the ace hidden up the Radeon Technologies Group’s sleeve. These technical teases provide just enough of a glimpse to whet the appetite of graphics enthusiasts while revealing tantalisingly little in the way of hard news relating to consumer-focused Vega graphics cards. (AMD doesn’t want to show its hand to Nvidia too much, after all.) It’s clear that AMD’s attempting some nifty new tricks to improve the efficiency and potential of Vega both in games and professional uses. Details are sure to drip-drop out over the coming months.

Fingers crossed that Vega comes sooner rather than later, however. AMD teased its 14nm Polaris GPU architecture at CES 2016 but failed to actually launch the Radeon RX 480 until the very end of June. Vega has been slapped with a release window sometime in the first half of 2017, so if AMD waits until E3 to launch this new generation of enthusiast-class graphics cards, Nvidia’s beastly GTX 1080 will have already been on the streets for a full year.

Vega looks incredibly intriguing, but even the most diehard Radeon loyalists can only wait for so long to build a new rig, especially with AMD’s much-hyped Ryzen processors launching now.

Vega’s Primitive Shaders

Vega’s high-bandwidth cache and cache controller unlock a world of memory potential

A technical preview of AMD’s Radeon Vega graphics architecture

Vega’s New Compute Unit can perform two 16-bit ops at once
