UNPACKING INTEL’S ROCKET LAKE & BEYOND

After the Rocket come the Alders, Raptors, and Meteors.

2021-07-12 - JARRED WALTON

After gazing into broad wells, we’ve scoured the skies, poked at the kabies (whatever those are), drunk plenty of coffee – in two flavours – then returned to our celestial beginnings with comets. And then there were rockets. If that’s as clear as mud, we’re talking about the past seven generations of Intel desktop processors, all of which have one thing in common: They’re manufactured using a variant of Intel’s 14nm process technology.

Granted, today’s 11th-gen CPUs, aka Rocket Lake, have about as much in common with the 14nm lithography of Broadwell and Skylake as the latest SpaceX vehicles have in common with the space shuttle, but that’s a long time for a company that used to pride itself on alternating between new architectures and die shrinks every year. So, we’re peeling back the heat spreader and digging into the underlying architecture to find out what makes Rocket Lake tick. Or is it tock?

Intel isn’t done yet, either, and much as Broadwell for desktops was quickly swept under the rug to make way for Skylake in 2015, there are promising future CPUs coming down the Intel pipeline. We’ll discuss what we know of Alder Lake and the upcoming 12th-gen CPUs, and even look at what lies ahead. So, latch your helmet, strap in, and let’s shoot for the moon.

Backporting Rocket Lake

To understand Rocket Lake, we need to go back to Ice Lake, Intel’s 10th-gen mobile-only processors officially launched in late 2019. After an aborted launch of the first-generation 10nm Cannon Lake CPUs in 2018 (no, we don’t really count the Core i3-8121U), Intel went back to the drawing board to refine its already-two-years-delayed 10nm process. The result was a reasonably potent architecture, tied to a 10nm process that by all accounts struggled to reach the desired performance levels. Even today, with the server-focused Ice Lake SP launch having just happened, there were clearly some difficulties.

The best and brightest of the Ice Lake laptop chips reached a maximum CPU clock of 4.1GHz. That was the Core i7-1068NG7, a 28W part; the more common Core i7-1065G7 was a 15W configurable TDP chip that topped out at a less impressive 3.9GHz. Perhaps even more concerning, the mobile chips offered a maximum of four cores, with Hyper-Threading allowing eight threads. Back when Skylake was all the rage, that might have sufficed, but the attack of AMD’s Ryzen CPUs made such tame configurations look far less impressive.

While Intel reworked its existing 14nm desktop offerings to eventually reach eight-core and 10-core chips, it continued to struggle with Ice Lake and 10nm. Ultimately, the decision was made to backport the CPU architecture of Ice Lake to 14nm, which comes with all sorts of difficulties. The Sunny Cove CPU cores at the heart of Ice Lake were designed and built around a 10nm node; returning to 14nm would result in much larger chips, not to mention differences in the lithography rules between 10nm and 14nm. Despite the complexity, however, Intel made the decision to proceed.

The result is the Cypress Cove CPU architecture. It takes many elements of Sunny Cove, then tunes things for 14nm, higher core counts, and higher clock speeds. Also, higher power. Outside of HEDT (high-end desktop) chips like the 18-core i9-7980XE, Rocket Lake is one of the most power-hungry chips Intel has ever created.

The CPU cores weren’t the only thing that needed a swift kick in the pants. Intel’s 14nm CPUs previously used Gen9 or Gen9.5 graphics solutions. Back in 2015, Gen9 might have been OK, but it can’t keep up in 2021. Even the Gen11 graphics of Ice Lake were deemed lacking, and Intel stepped up to the most recent Gen12 Xe Graphics for Rocket Lake. That meant backporting Xe Graphics from 10nm Tiger Lake (the sequel to Ice Lake) to 14nm as well. Here, however, some compromise was necessary. While Tiger Lake has up to 96 EUs in the fastest configurations, Rocket Lake is limited to just 32 EUs – it’s a painful but ultimately needed compromise.

The biggest drawback of backporting both the CPU and GPU architectures to an older process is size. Intel hasn’t revealed official die sizes for the resulting chip, but CPU sleuths have wasted little time in delidding and measuring the die size of Rocket Lake. The previous-generation Core i9-10900K (Comet Lake) offered 10 CPU cores and 24 GPU EUs, and measured around 206.1mm². The Core i9-9900K in 2018 was an eight-core chip that measured about 180.3mm², and the six-core i7-8700K from 2017 was just 153.6mm².

In contrast, the i9-11900K returns to an eight-core design but measures 276.4mm² – 34 percent larger than the 10th-gen part, despite having two fewer cores. Even worse, compared to the eight-core 9900K, the new chip is 53 percent larger. Granted, some of that comes from having slightly more GPU oomph, but ultimately this is the price Intel had to pay in choosing to backport.
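To put those measurements side by side, here is a quick sketch of the arithmetic in Python. The die areas and core counts are the figures quoted above (third-party measurements, not official Intel numbers); the helper names are ours, and the per-core figure is only a rough yardstick because it divides the whole die, GPU and all, by the CPU core count.

```python
# Die areas in mm^2 and CPU core counts as quoted in the article.
# These are third-party delidding measurements, not official Intel figures.
DIE_AREA_MM2 = {
    "i7-8700K": 153.6,
    "i9-9900K": 180.3,
    "i9-10900K": 206.1,
    "i9-11900K": 276.4,
}
CPU_CORES = {"i7-8700K": 6, "i9-9900K": 8, "i9-10900K": 10, "i9-11900K": 8}


def percent_larger(new_chip, old_chip):
    """Return how much larger new_chip's die is than old_chip's, in percent."""
    return (DIE_AREA_MM2[new_chip] / DIE_AREA_MM2[old_chip] - 1.0) * 100.0


def area_per_core(chip):
    """Whole-die area divided by CPU core count (includes GPU, cache, and I/O)."""
    return DIE_AREA_MM2[chip] / CPU_CORES[chip]


if __name__ == "__main__":
    for old in ("i9-10900K", "i9-9900K"):
        print(f"11900K vs {old}: {percent_larger('i9-11900K', old):.0f}% larger")
    for chip in DIE_AREA_MM2:
        print(f"{chip}: {area_per_core(chip):.1f} mm^2 of die per CPU core")
```

Dividing by core count makes the cost of the backport plain: the 11900K spends far more silicon per core than any of its 14nm predecessors.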

Cypress Cove CPU architecture

The good news is that backporting means Intel can finally leave behind the old Skylake CPU architecture of 2015. Rocket Lake is the first new desktop architecture from Intel since two years before AMD revealed Ryzen – which feels like an eternity ago. Intel claims Rocket Lake delivers up to 19 percent higher IPC (instructions per cycle) than Skylake, though individual gains will vary. Here’s a rundown of the updates Intel made with Sunny Cove, which is largely the same as Cypress Cove.

First, Intel increased the size of the caches, as well as their efficiency. The L1 data cache went from 32KiB to 48KiB, and increased set associativity from eight-way to 12-way. Set associativity improves the efficiency and hit rates of a cache by allowing it to better handle memory addresses that would otherwise overlap. The L2 cache changed from four-way to eight-way, and can be larger, depending on the product. Most Skylake desktop derivatives used a 256KiB L2 size, though Skylake-X derivatives (HEDT) had a 1MB (and 16-way set associative) cache size per core. Rocket Lake strikes a middle ground with a 512KiB L2 cache size.
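If set associativity sounds abstract, a little arithmetic helps. The sketch below takes the Sunny Cove L1 data cache as 48KiB and 12-way, and assumes a 64-byte cache line (typical for x86, but not stated in this article). It uses a textbook index function; real L1 caches add wrinkles such as virtual indexing that we ignore here, and the function names are purely illustrative.

```python
LINE_BYTES = 64  # assumed cache line size; not taken from the article


def num_sets(size_kib, ways, line_bytes=LINE_BYTES):
    """Number of sets in a set-associative cache of the given size and ways."""
    total_lines, remainder = divmod(size_kib * 1024, line_bytes)
    if remainder or total_lines % ways:
        raise ValueError("size must divide evenly into lines and ways")
    return total_lines // ways


def set_index(address, size_kib, ways, line_bytes=LINE_BYTES):
    """Textbook mapping of a byte address to a set: line number modulo sets."""
    return (address // line_bytes) % num_sets(size_kib, ways, line_bytes)


if __name__ == "__main__":
    skylake_sets = num_sets(32, 8)      # Skylake-style L1D: 32KiB, eight-way
    sunny_cove_sets = num_sets(48, 12)  # Sunny Cove-style L1D: 48KiB, 12-way
    print("Skylake L1D sets:", skylake_sets)
    print("Sunny Cove L1D sets:", sunny_cove_sets)
    # Addresses one "stride" apart land in the same set and compete for its ways.
    stride = sunny_cove_sets * LINE_BYTES
    print("Conflict stride in bytes:", stride)
    print("Sets hit:", {set_index(i * stride, 48, 12) for i in range(12)})
```

Under these assumptions both caches have the same number of sets; the extra capacity comes entirely from the additional ways, so each set can hold 12 overlapping lines before one has to be evicted, rather than eight.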

Along with the caches, various buffers and other elements are larger, too. The micro-op cache handles 2.25K entries, 50 percent more than the old 1.5K UOP cache. The L2 translation lookaside buffer (TLB) also increased from 1,536 to 2,048 entries. The out-of-order execution window size went from Skylake’s 224 to 352 with Ice Lake, in-flight loads improved from 72 to 128 entries, and in-flight stores grew from 56 to 72 entries.
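Those before-and-after numbers are easier to compare as percentages. Here is a small Python sketch using only the figures from the paragraph above; the dictionary layout and function name are our own, and bigger structures don’t translate directly into an equal performance gain.

```python
# (Skylake value, Sunny Cove value) for each structure, as quoted in the article.
# The micro-op cache is expressed in thousands of entries (1.5K -> 2.25K).
STRUCTURES = {
    "micro-op cache (K entries)": (1.5, 2.25),
    "L2 TLB entries": (1536, 2048),
    "out-of-order window": (224, 352),
    "in-flight loads": (72, 128),
    "in-flight stores": (56, 72),
}


def growth_percent(old, new):
    """Percentage increase going from old to new."""
    return (new / old - 1.0) * 100.0


if __name__ == "__main__":
    for name, (old, new) in STRUCTURES.items():
        print(f"{name}: {old} -> {new} (+{growth_percent(old, new):.1f}%)")
```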

The front end was also made wider, going from four to five instruction allocations, and from eight to 10 execution ports. The L1 cache got twice the store bandwidth, there are four AGUs (address generation units) compared to three on Skylake, and twice as many store operations are available per cycle. To help all areas of performance, Intel also made unspecified improvements to the branch prediction.

Finally, Sunny Cove and Cypress Cove added some new instructions. These include AVX-512, DL Boost for AI inference, and double the peak throughput for AES-NI (cryptography) instructions. The cores also support a larger virtual address space for memory (57 bits, up from 48).

Mix all these improvements together, and you get the 19 percent IPC improvement. The platform also has some new features, such as PCIe Gen4 support for the primary graphics slot and first M.2 slot. Official memory speed support now extends to DDR4-3200 (though we’ve been able to reach such speeds as far back as Skylake). And topping things off, the width of the interface between the CPU and PCH (platform controller hub, aka chipset) has been doubled, with an eight-lane DMI 3.0 link in place of a four-lane link.
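For a rough sense of what those platform changes mean in bandwidth terms, here is a back-of-the-envelope sketch. The per-lane transfer rates and the 128b/130b encoding are assumptions drawn from the public PCIe and DMI specifications rather than from this article, the DDR4 figure assumes a standard 64-bit channel in a dual-channel setup, and the results are theoretical peaks per direction, not measured throughput.

```python
# Assumed raw transfer rates in GT/s per lane (from public specs, not the article).
GT_PER_S = {"PCIe Gen3": 8.0, "PCIe Gen4": 16.0, "DMI 3.0": 8.0}
ENCODING_EFFICIENCY = 128 / 130  # assumed 128b/130b line encoding


def link_gb_per_s(kind, lanes):
    """Theoretical peak bandwidth of a serial link in GB/s, per direction."""
    return GT_PER_S[kind] * ENCODING_EFFICIENCY / 8 * lanes


def ddr_gb_per_s(mt_per_s, channels=2, bus_bytes=8):
    """Theoretical peak DRAM bandwidth in GB/s (64-bit channels assumed)."""
    return mt_per_s * bus_bytes * channels / 1000


if __name__ == "__main__":
    print(f"DMI 3.0 x4:    {link_gb_per_s('DMI 3.0', 4):.2f} GB/s")
    print(f"DMI 3.0 x8:    {link_gb_per_s('DMI 3.0', 8):.2f} GB/s")
    print(f"PCIe Gen3 x16: {link_gb_per_s('PCIe Gen3', 16):.2f} GB/s")
    print(f"PCIe Gen4 x16: {link_gb_per_s('PCIe Gen4', 16):.2f} GB/s")
    print(f"DDR4-3200 x2:  {ddr_gb_per_s(3200):.1f} GB/s")
```

The doubling of the chipset link matters mostly when several fast devices hang off the PCH at once, since they all share that one pipe back to the CPU.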

Xe Graphics GPU architecture

Intel hasn’t been quite as forthcoming with changes to the graphics architecture, and in fact Xe Graphics, aka Gen12 Graphics, appears to largely build off the existing Gen11 Ice Lake GPU. That’s still a big step forward from the old Gen9.5 desktop graphics, however.

The most noticeable upgrade in the GPU department is the move from 24 EUs (Execution Units) to 32, a minimum of 33 percent more computational power, but the underlying EUs feature some improvements that should boost performance more. Clock speeds are also higher, by about 100MHz, but that won’t matter much.

The problem is that, even if the new UHD Graphics 750 doubles the performance of UHD Graphics 630, it won’t be sufficient for most games. GPUs like the GTX 1050, which we consider borderline for gaming, are still several times faster than UHD 630, which presumably means at least double the speed of UHD 750.

Thankfully, Intel knows this, and it knows desktop users don’t really care about integrated graphics performance – that’s a way for OEMs and system builders to cheap out on the GPU and keep prices lower. Rather than trying to compete with dedicated GPUs it can’t hope to match, Intel has focused on other capabilities: The graphics engine includes HEVC, VP9, and SCC encoders that support up to 4K60 HDR, and hardware-accelerated AV1 decode support for 4K60.

While the graphics updates won’t necessarily set the gaming world aflame, they should at least prove capable for media-streaming duties.

Can you feel the power?

With the backporting of architectural updates, there’s another issue: power consumption. Intel used to be far ahead of AMD in the realm of CPU power requirements, but it’s been on a downhill slope with desktop CPUs since 2017. The Core i7-8700K and previous two generations of Core i7s all had a maximum TDP of 95W. Overclocking could exceed that, but even at 5.0GHz the i7-8700K tended to land below 150W. That changed with the 9900K, things got worse with the 10900K, and the situation definitely hasn’t improved with the 11900K.

Nominally, the 11900K has an official TDP of 125W, with a PL2 (Power Limit 2) rating of 250W. That means that if the motherboard follows the exact limits, the CPU should be able to use up to 250W for a period of time defined as Tau – 56 seconds, in this case. In practice, though, a lot of enthusiast (Z490/Z590) motherboards thumb their noses at such limits and will run at PL2 indefinitely – or at least until the cooler fails to keep up.
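To see how those limits interact, here is a deliberately simplified Python model. It assumes, as Intel’s datasheets describe it, that Tau acts as the time constant of a running power average rather than a literal stopwatch: the chip may draw up to PL2 while the average sits below PL1 (taken here as the 125W TDP), then falls back to PL1. The function names, the time step, and the idle starting point are all our own assumptions, and real firmware is considerably more involved.

```python
import math


def simulate_boost(pl1_w, pl2_w, tau_s, demand_w, duration_s,
                   dt_s=0.1, start_avg_w=0.0):
    """Return (time, granted_watts) samples under a simplified PL1/PL2/Tau model.

    Assumption: the limiter tracks an exponentially weighted moving average
    of package power with time constant tau_s and compares it against PL1.
    """
    alpha = 1.0 - math.exp(-dt_s / tau_s)  # per-step smoothing factor
    avg = start_avg_w
    trace = []
    for step in range(int(round(duration_s / dt_s))):
        # Allow bursts up to PL2 while the running average is under PL1.
        cap = pl2_w if avg < pl1_w else pl1_w
        granted = min(demand_w, cap)
        avg += alpha * (granted - avg)
        trace.append((step * dt_s, granted))
    return trace


def seconds_above_pl1(trace, pl1_w, dt_s=0.1):
    """Total simulated time during which granted power exceeded PL1."""
    return sum(dt_s for _, watts in trace if watts > pl1_w)


if __name__ == "__main__":
    # "By the book" board: PL1=125W, PL2=250W, Tau=56s, heavy all-core load.
    stock = simulate_boost(125, 250, 56, demand_w=300, duration_s=120)
    print(f"Time above PL1: {seconds_above_pl1(stock, 125):.1f}s")
    # Enthusiast board that effectively raises PL1 to match PL2.
    unlocked = simulate_boost(250, 250, 56, demand_w=300, duration_s=120)
    print("Unlocked board final draw:", unlocked[-1][1], "W")
```

In this model, how long the chip actually sits at PL2 depends on how idle it was beforehand, so the window starting from a cold, idle chip is not simply equal to Tau. Raising PL1 to match PL2, which is effectively what many enthusiast boards do, removes the fallback entirely.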

Certain workloads can even go beyond official PL2 levels, especially applications that make heavy use of AVX-512 instructions. We saw peak power, running “stock,” of up to 300W during stress testing. Despite the larger die size and greater surface area for cooling, that also resulted in temperatures peaking at 100 C. Thankfully, that’s more of a worst-case scenario (Prime95), and it’s the price of backporting to 14nm and then running at 4.8GHz on eight Cypress Cove cores.

Also of note is that Intel has further refined its opportunistic boost algorithms. Rocket Lake includes Turbo Boost 2.0, which allows for frequencies of up to 5.1GHz. Turbo Boost Max 3.0 adds an additional 100MHz on the fastest two cores, up to 5.2GHz. Thermal Velocity Boost allows a single favored core to reach up to 5.3GHz, or all-core clocks of up to 4.8GHz, provided the CPU temperature remains below a threshold of 70 C.

Finally, Adaptive Boost Technology (ABT) feels quite similar to AMD’s Precision Boost Overdrive. It’s dynamic overclocking of sorts, but it doesn’t void your CPU warranty. Unfortunately, ABT is only available on the top Rocket Lake chips – the Core i9-11900K and 11900KF. ABT can kick in as long as the CPU stays below 100 C and works in conjunction with the other boost technologies. It’s often disabled by default in the motherboard BIOS, but there’s no good reason to leave it off as it’s “free” performance.

Failure to Launch

Taken together, all the updates to Rocket Lake sound great, if it weren’t for that pesky 14nm+++ process. The net result ends up being less than impressive. AMD’s Zen 3 architecture leads in most benchmarks, and the real saving grace for Intel is that, considering the chip shortages, it has inventory relatively available.

There’s one redeeming feature in the fact that the Core i7-11700K sports the same eight-core/16-thread configuration as the 11900K, just with slightly lower clocks – and without ABT. But manual overclocking can make up the difference if you want to take that route.

We mentioned that Rocket Lake feels a lot like the Broadwell launch, though, and that’s because Alder Lake should release later this year. That’s right: We’ll get two new desktop CPU architectures this year. But let’s douse your enthusiasm by noting that Alder Lake will shift to a new socket, LGA1700 – no upgrades for existing Comet Lake or Rocket Lake users.

Introducing Alder Lake

Intel hasn’t been quite as forthcoming with architectural details on its Alder Lake processors, but we do know it plans for a complete suite of desktop chips. Alder Lake represents Intel’s first fully featured hybrid x86 processor – well, second if you want to count Lakefield, but this one will be much better.

Hybrid CPU designs aren’t new. ARM has been doing big.LITTLE for a decade now, and it’s pretty much standard fare for modern smartphone processors. Intel isn’t ARM, of course, and x86 cores tend to be quite a bit more complex than even the larger, higher performance ARM cores. Which is perhaps partly why Intel dubs its take on hybrid CPUs “Big-Bigger.”

Lakefield used a single large core (Sunny Cove, the same as in Ice Lake) with four smaller low-power Tremont cores. The catch with Big-Bigger is that all the CPU cores must support the same instruction set. Alder Lake takes what started with Lakefield and catapults it into the latest tech. The high-performance CPU cores use the new Golden Cove architecture, while the smaller cores use the Gracemont architecture. The fastest configuration will support up to eight of each type of core, but that’s just the start. Golden Cove is the all-in-one successor to Sunny Cove, Willow Cove, and Cypress Cove. It supports Hyper-Threading, so just on the “bigger” cores that should match Rocket Lake with an eight-core/16-thread configuration. The Gracemont cores don’t support SMT, but should provide for excellent power savings, giving a final maximum configuration of 16 cores and 24 threads.

There will also be many lower spec variations, with different numbers of big and small cores. That will create some interesting possibilities, and scheduling within the OS and hardware is going to be a critical factor for performance. That’s likely why Intel started with Lakefield, to pave the way for future designs. The lowest configuration is rumored to feature just two big cores (two-core/four-thread), and there will also be a four-core/eight-thread configuration with just four big cores. Models with six or eight big cores will come with zero, two, four, six, or eight little cores.
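The core and thread math for a hybrid chip is easy to lose track of, so here is a short Python sketch that enumerates the rumored line-up. It assumes exactly what the text describes: two threads per big core, one per little core, and the rumored big/little pairings. None of this is confirmed by Intel, and the function names are ours.

```python
def rumored_configs():
    """Return (big_cores, little_cores) pairs for the rumored Alder Lake line-up."""
    configs = [(2, 0), (4, 0)]  # lowest models: big cores only
    for big in (6, 8):
        for little in (0, 2, 4, 6, 8):
            configs.append((big, little))
    return configs


def thread_count(big, little):
    """Big cores run two threads each (Hyper-Threading); little cores run one."""
    return big * 2 + little


if __name__ == "__main__":
    for big, little in rumored_configs():
        cores = big + little
        print(f"{big} big + {little} little = {cores} cores / "
              f"{thread_count(big, little)} threads")
```

A model with eight big and four little cores would expose 20 threads, which hints at why the scheduler needs to know which threads land on which type of core.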

The hybrid design isn’t the only big change with Alder Lake. The more important update is that it will use an updated version of Intel’s 10nm process, which it calls Enhanced SuperFIN. That should dramatically improve on the chip size and power characteristics relative to Rocket Lake. There’s no word yet on clock speeds or IPC, but with Intel currently trailing AMD, we expect it will try to pull out all the stops.

Alder Lake will take other technologies a step beyond AMD, by being the first consumer CPU to support both PCIe Gen5 and DDR5 memory, or at least that’s the current rumour. Those might matter more on servers and workstations, and may not be enabled on all CPUs or motherboards, but we’ll have to wait to find out more.

As mentioned earlier, Alder Lake, or 12th-gen Intel Core, will use a new LGA1700 socket. The new socket is required for both the PCIe Gen5 and DDR5 support, and it could include additional lanes from the CPU to other components as well. One thing that isn’t changing relative to Rocket Lake: the GPU. Alder Lake will use the same Xe Graphics with up to 32 EUs on the desktop chips. The mobile variants will come in a higher performance GT2 configuration with up to 96 EUs, similar to 11th-gen Tiger Lake processors.

Alder Lake should reunite Intel’s desktop and mobile platforms with a single architecture. We’re expecting desktop, mobile, low-power, and even Atom-type ultra-low-power designs. Server models are unlikely any time soon considering Ice Lake SP just launched.

Looking forward

After years of delays, Intel finally has an updated desktop architecture and new chips rolling out, and it’s moving beyond 14nm. All indications are that Intel’s SuperFIN already matches up well with TSMC’s N7 7nm node, so Enhanced SuperFIN ought to be at least reasonably competitive with upcoming N6 and N5 processes. Intel has far more going on than just Rocket Lake and Alder Lake, of course (see “Many More Lakes” above).

Alder Lake isn’t expected until the end of this year, though we suspect Intel will want it out sooner rather than later. There’s no word on TDP or clock speed yet, but we expect it to be far more competitive than what Intel currently offers. If you’re already running an eight-core CPU like the i7-9700K or better, or a Ryzen 7 or higher AMD chip, skipping the Rocket and waiting for Alder Lake looks like a reasonable option.

The next Intel versus AMD showdown looks set for later this year or early 2022. By then, Team Red might have Zen 4 CPUs for us, also with PCIe Gen5 and DDR5 support. Rocket Lake didn’t quite hit the mark, but Alder Lake is promising, with future 7nm options closing the gap. The next year should be far more exciting in the CPU realm than late 2020 and early 2021, at least.

Pretty on the inside: Intel’s 11th-gen Rocket Lake processors are ultimately an exercise in compromise.

Ice Lake launched in 2019 and only offered mobile processors, but it is key to understanding the rationale behind Rocket Lake.

Intel’s Core i9-9900K from 2018 was considerably smaller than today’s 11th-gen i9-11900K.

The Sunny Cove architecture has lent itself to many of the features of Cypress Cove.

Tiger Lake boasts up to 96 EUs in its speediest incarnations, but Rocket Lake has a paltry 32.

Rocket Lake is a power-hungry animal, but not as bad as the 18-core HEDT i9-7980XE.

Meanwhile, the i9-11900K has gone back to just eight cores, but still manages to take up 276.4mm² of space.

Measuring a relatively dainty 206.1mm², the Comet Lake Core i9-10900K squeezed in 10 CPU cores alongside 24 GPU EUs.

Despite Rocket Lake’s multiple updates, it still can’t best AMD’s Zen 3 architecture in most benchmarks.

The Core i7-8700K and previous two generations of Core i7s had a maximum TDP of just 95W – Rocket Lake has a much more voracious appetite.

If you have an eight-core CPU like the i7-9700K or better in your PC, you might want to skip Rocket Lake altogether.
