Making a Mesh of Things

AT THE BEATING HEART of your PC sits the CPU, crunching tasty bits of 0s and 1s in byte-sized chunks. It does so at an extremely fast rate, around four billion cycles per second, but each core can do up to six instructions per cycle, with Skylake-X CPUs co

Maximum PC - QUICKSTART - Jarred Walton

Jarred Walton has been a PC and gaming enthusiast for over 30 years.

Modern CPUs are complex, and the task of routing data between various parts (cache, cores, memory, and I/O controllers) is a critical element of the CPU architecture. For several generations, Intel’s HEDT CPUs have used a ring bus architecture.

Think of it as mass transit for data, running in a loop, with stations where data can get on or off. As core counts increase, dual rings are added, with a buffered switch between them. If data needs to move between rings, it’s like getting off at a transfer station and waiting for the next train: a five-cycle delay, on top of the delay from traversing the rings.

The Broadwell HCC (High Core Count) designs supported up to 24 cores, and while it’s possible to use additional rings for higher core counts, the increased latency limits scalability. With the Skylake-X HCC/XCC designs (6 to 28 cores), Intel is using a new mesh network. Each block (core, memory, I/O, cache, and so on) has a router, with the blocks arranged in a grid. It’s like city blocks, with the router switches at each intersection directing traffic. The goal is improved scalability through lower latency.
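To get a rough feel for why a grid can scale better than a loop, the sketch below compares the average number of hops between stops on a single bidirectional ring and on a 2D mesh. It is a toy model, not Intel's actual design: the function names, the bidirectional ring, the shortest-path (Manhattan) routing, and the grid shapes are all illustrative assumptions, and real latency also depends on clocks, buffering, and contention.

```python
from itertools import product


def ring_avg_hops(n):
    """Average shortest-path hops between distinct stops on a
    bidirectional ring with n stops (toy model)."""
    total = 0
    for a in range(n):
        for b in range(n):
            if a != b:
                d = abs(a - b)
                total += min(d, n - d)  # go whichever way round is shorter
    return total / (n * (n - 1))


def mesh_avg_hops(rows, cols):
    """Average Manhattan distance between distinct nodes of a
    rows x cols grid (toy model of a mesh with one router per block)."""
    nodes = list(product(range(rows), range(cols)))
    total = 0
    for r1, c1 in nodes:
        for r2, c2 in nodes:
            total += abs(r1 - r2) + abs(c1 - c2)  # zero when nodes coincide
    n = len(nodes)
    return total / (n * (n - 1))


if __name__ == "__main__":
    # Hypothetical grid shapes chosen only to show the trend.
    for rows, cols in [(3, 4), (4, 6), (6, 6)]:
        n = rows * cols
        print(f"{n:2d} stops: ring {ring_avg_hops(n):.2f} hops, "
              f"{rows}x{cols} mesh {mesh_avg_hops(rows, cols):.2f} hops")
```

In this model the gap widens as the stop count grows: with 36 stops the ring averages a little over 9 hops, while a 6x6 mesh averages 4. Hop count is only one ingredient of latency, so a small chip can still come out slower on a mesh.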

Despite all the talk of the mesh lowering latency, reducing power use, and improving scalability, in testing it’s not all sunshine and roses. Comparing the 10-core Broadwell-E i7-6950X to the 10-core Skylake-X i9-7900X, inter-core communication latencies have increased from 80ns to 100ns, a 25 percent jump. The real-world performance impact is nowhere near that large, though, and higher per-core performance does compensate.
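A little arithmetic puts those latency figures in perspective. The sketch below computes the relative increase from 80ns to 100ns and converts both values to clock cycles. The 4.0GHz clock is an assumption borrowed from the "around four billion cycles per second" figure at the top of the article, and the function names are hypothetical.

```python
def percent_increase(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100


def ns_to_cycles(latency_ns, clock_ghz):
    """Clock cycles that elapse during latency_ns at clock_ghz.
    One GHz is one cycle per nanosecond, so this is a plain product."""
    return latency_ns * clock_ghz


if __name__ == "__main__":
    ASSUMED_CLOCK_GHZ = 4.0  # assumption, not a measured core clock
    ring_ns, mesh_ns = 80, 100  # inter-core latencies quoted in the article
    print(f"increase: {percent_increase(ring_ns, mesh_ns):.0f}%")
    print(f"ring: {ns_to_cycles(ring_ns, ASSUMED_CLOCK_GHZ):.0f} cycles, "
          f"mesh: {ns_to_cycles(mesh_ns, ASSUMED_CLOCK_GHZ):.0f} cycles")
```

At that assumed clock, the extra 20ns is 80 cycles of waiting per inter-core round trip, which only hurts workloads where cores talk to each other often.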

More concerning in my testing of Skylake-X CPUs is that power draw has gone way up from Broadwell-E. At stock, power draw isn’t too bad, but all X299 motherboards I’ve tested auto-overclock. Intel rates the i9-7900X for all-core turbo of 4.0GHz, with a maximum turbo of 4.3GHz (or 4.5GHz via Turbo Boost Max 3.0), but the base clock is only 3.3GHz.

That base clock is what the CPU is guaranteed to achieve without exceeding TDP, and it’s up to the mobo firmware to keep things in check. If power use goes over TDP, clock speeds should drop, but boards are being more aggressive. Some run all cores on the i9-7900X at 4.0GHz, no matter what, and others default to 4.3GHz and even 4.5GHz. Power use scales rapidly, but the real problems start to show up on the i9-7960X and i9-7980XE.

The TDP is 165W for both, and they went well over that. System power use in Cinebench R15 is around 350W, with 50-100W going to other components, so the CPUs use over 200W. Push clock speeds to 4.0-4.4GHz on all cores, and it goes to over 500W. Overclocker der8auer took things to the next level with liquid nitrogen on the i9-7980XE, and got all 18 cores to 6.1GHz, using over 1,000W just for the CPU.
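The CPU power estimate above is simple subtraction, and it can be sketched as follows. The 350W system figure, the 50-100W for other components, and the 165W TDP come from the article; the function name and the optional PSU efficiency factor are assumptions added for illustration, since the article does not say whether the system figure was measured at the wall.

```python
TDP_W = 165  # rated TDP for the i9-7960X and i9-7980XE, per the article


def cpu_power_range(system_w, other_low_w, other_high_w, psu_efficiency=1.0):
    """Return a (low, high) estimate of CPU power in watts.

    psu_efficiency is an assumption: if system_w was read at the wall,
    only that fraction of it reaches the components.
    """
    delivered = system_w * psu_efficiency
    return delivered - other_high_w, delivered - other_low_w


if __name__ == "__main__":
    for eff in (1.0, 0.9):  # 0.9 is a hypothetical PSU efficiency
        low, high = cpu_power_range(350, 50, 100, eff)
        print(f"PSU efficiency {eff:.0%}: CPU roughly {low:.0f}-{high:.0f}W, "
              f"{low - TDP_W:.0f}-{high - TDP_W:.0f}W over TDP")
```

Even with the more conservative assumption, the low end of the estimate stays above 200W, which matches the article's claim.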

These are amazingly fast CPUs, but we’re hitting the limits of 14nm. Mesh topology may pave the way for more cores, but even though it has two fewer cores, the i9-7960X is only 3-5 percent slower than the i9-7980XE. If Intel made a 24-core CPU, without 10nm it’s unlikely to deliver a significant boost in performance without an equivalent increase in power use. Moore’s Law is dead, right when we need it most.
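To see how little the two extra cores buy, the sketch below turns the figures above into a rough payoff ratio: how much of the added core count shows up as added performance. The core counts (16 and 18) and the 3-5 percent gap come from the article; the function name and the way the ratio is defined are assumptions made for illustration.

```python
def extra_core_payoff(cores_small, cores_big, slower_fraction):
    """Fraction of the added core count that appears as added performance.

    slower_fraction is how much slower the smaller chip is (0.03 = 3%).
    A value of 1.0 would mean perfect scaling with core count.
    """
    core_gain = cores_big / cores_small - 1      # 18 vs 16 cores: 12.5% more
    perf_gain = 1 / (1 - slower_fraction) - 1    # speedup of the bigger chip
    return perf_gain / core_gain


if __name__ == "__main__":
    for slower in (0.03, 0.05):
        payoff = extra_core_payoff(16, 18, slower)
        print(f"{slower:.0%} slower: {payoff:.0%} of the extra cores pay off")
```

By this measure, only about 25 to 42 percent of the additional cores translate into extra throughput, presumably because power limits pull clocks down as cores are added.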

Skylake-X uses a mesh topology to allow scaling to higher core counts.
