THE ART OF ZEN

Any high-level overview of AMD’s Ryzen processors will mention a set of five technologies collectively known as Sense MI. These include Pure Power, Precision Boost, eXtended Frequency Range (XFR), Neural Net Prediction and Smart Prefetch. A brief introduc

HWM (Malaysia) - LEARN - by Zachary Chan

Pure Power and Precision Boost

Pure Power and Precision Boost control the voltage and operating frequency of the Ryzen processor. The concept isn’t new: every modern processor varies its voltage and clock frequency within some range to optimize performance-per-watt for the workload at hand.

What makes Ryzen different, however, is how fine-grained this control has become. AMD claims Ryzen has more than a thousand embedded sensors distributed per core complex that continuously analyze power, operating temperature and frequency. Ryzen processors won’t just respond to varying workload needs more quickly (in as little as 1ms), but with finer granularity as well (25MHz adjustments, compared with the 100MHz steps of past generations). The combined effect of Pure Power and Precision Boost is to better optimize the performance-per-watt of a Ryzen CPU at any workload, with the goal of reducing power consumption while maintaining performance similar to previous architectures.
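To see why the step size matters, here is a minimal Python sketch that rounds a requested clock down to the nearest available step. The function name and the 3,780MHz target are illustrative assumptions, not AMD's actual firmware logic; only the 25MHz and 100MHz step sizes come from the text above.

```python
def quantize_clock(target_mhz: float, step_mhz: int) -> int:
    """Round a requested clock down to the nearest multiple of step_mhz."""
    if step_mhz <= 0:
        raise ValueError("step_mhz must be positive")
    return int(target_mhz // step_mhz) * step_mhz


# Hypothetical clock that the power and thermal budget would allow right now.
target = 3780

fine = quantize_clock(target, 25)     # 25MHz steps, as on Ryzen
coarse = quantize_clock(target, 100)  # 100MHz steps, as on past generations

print(f"25MHz steps:  {fine}MHz (leaves {target - fine}MHz unused)")
print(f"100MHz steps: {coarse}MHz (leaves {target - coarse}MHz unused)")
```

In this made-up case the finer step settles at 3,775MHz while the coarser one falls back to 3,700MHz, so less of the available headroom is wasted.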

eXtended Frequency Range

All modern processors have what’s known as a base clock and a boost clock, the latter being the maximum rated frequency. The Ryzen 7 1800X, for example, has a base and boost clock of 3.60GHz and 4.00GHz, respectively. However, if the CPU detects that it is continuously operating under a certain temperature threshold, which would indicate that a high-performance custom cooling solution (such as water cooling) is being used, an additional reserve of frequency beyond the CPU’s maximum boost clock is unlocked. In the case of the Ryzen 7 1800X, that’s an additional 100MHz. Depending on how you look at it, XFR is either the lazy man’s overclocking mode or the processor thanking you for using better cooling.
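The XFR rule described above can be sketched in a few lines of Python. The clock figures are the Ryzen 7 1800X numbers quoted in the text; the 60°C threshold, the class and the function names are assumptions made purely for illustration, since the actual temperature threshold is not given here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CpuSpec:
    base_mhz: int
    boost_mhz: int
    xfr_headroom_mhz: int


# Figures quoted in the article for the Ryzen 7 1800X.
RYZEN_7_1800X = CpuSpec(base_mhz=3600, boost_mhz=4000, xfr_headroom_mhz=100)

# ASSUMPTION: the real threshold is not stated; 60.0 is a placeholder.
ASSUMED_XFR_TEMP_LIMIT_C = 60.0


def max_clock_mhz(spec, recent_temps_c, temp_limit_c=ASSUMED_XFR_TEMP_LIMIT_C):
    """Return the clock ceiling: the boost clock, plus the XFR headroom
    only if every recent temperature sample is below the limit."""
    if recent_temps_c and all(t < temp_limit_c for t in recent_temps_c):
        return spec.boost_mhz + spec.xfr_headroom_mhz
    return spec.boost_mhz


print(max_clock_mhz(RYZEN_7_1800X, [45.0, 48.5, 50.2]))  # consistently cool
print(max_clock_mhz(RYZEN_7_1800X, [45.0, 72.0, 50.2]))  # one hot sample
```

With consistently cool readings the ceiling rises to 4,100MHz; a single sample over the assumed limit keeps it at the rated 4,000MHz boost clock.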

All three features, Pure Power, Precision Boost and XFR, are highly integrated, automatic and fully controlled by the CPU. What this means for the average consumer is that Ryzen CPUs will perform best if you just leave them alone. AMD even recommends that Windows’ power profile be set to High Performance instead of the default Balanced mode to give full hardware control over to the CPU.

For power users, however, Pure Power, Precision Boost and XFR won’t mean much because any attempt at manual overclocking will override and disable them anyway.

Neural Net Prediction and Smart Prefetch

AMD claims that each Ryzen CPU has a “true artificial intelligence neural network” for better instruction prediction. This is a bit of clever marketing on AMD’s part: the neural network is essentially the CPU’s branch predictor, albeit a much smarter one based on hashed perceptrons. Ryzen can predict two branches per cycle, and the penalty for an incorrect prediction has been reduced by three cycles with the introduction of a micro-op cache. The addition of a deeper 3-level Translation Lookaside Buffer (TLB) and a 0-cycle Recent Predictor means that prefetched instructions get loaded faster as well. The entire cache system has been redesigned to be much faster and more efficient overall (approximately double the bandwidth for L1 and L2, and up to 5x total bandwidth for L3).
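For readers curious what a perceptron-style branch predictor looks like, below is a simplified teaching sketch in Python. It is not AMD's design: the table count, table size, history length, hash function, training threshold and all names are assumptions chosen to keep the example small. The idea is that several small weight tables are each indexed by a hash of the branch address and a slice of recent branch history; the selected weights are summed, and the sign of the sum is the prediction.

```python
class HashedPerceptronPredictor:
    """Toy hashed-perceptron branch predictor (illustrative parameters only)."""

    def __init__(self, num_tables=4, table_size=256, history_bits=16,
                 threshold=8, weight_limit=31):
        self.num_tables = num_tables
        self.table_size = table_size
        self.history_bits = history_bits
        self.threshold = threshold        # keep training while |sum| is this small
        self.weight_limit = weight_limit  # saturating weight range
        self.tables = [[0] * table_size for _ in range(num_tables)]
        self.history = 0                  # global history, 1 bit per branch

    def _indices(self, pc):
        # Each table hashes the branch address with a different history slice.
        seg_bits = self.history_bits // self.num_tables
        mask = (1 << seg_bits) - 1
        indices = []
        for t in range(self.num_tables):
            segment = (self.history >> (t * seg_bits)) & mask
            indices.append((pc ^ (segment * 0x9E3779B1) ^ t) % self.table_size)
        return indices

    def predict(self, pc):
        total = sum(self.tables[t][i] for t, i in enumerate(self._indices(pc)))
        return total >= 0  # True means "predict taken"

    def update(self, pc, taken):
        """Train on the real outcome; return what had been predicted."""
        indices = self._indices(pc)
        total = sum(self.tables[t][i] for t, i in enumerate(indices))
        predicted = total >= 0
        # Adjust weights on a misprediction or a low-confidence prediction.
        if predicted != taken or abs(total) <= self.threshold:
            delta = 1 if taken else -1
            for t, i in enumerate(indices):
                w = self.tables[t][i] + delta
                self.tables[t][i] = max(-self.weight_limit,
                                        min(self.weight_limit, w))
        # Shift the outcome into the global history register.
        hist_mask = (1 << self.history_bits) - 1
        self.history = ((self.history << 1) | int(taken)) & hist_mask
        return predicted


predictor = HashedPerceptronPredictor()
outcomes = [i % 2 == 0 for i in range(200)]  # taken, not taken, taken, ...
hits = [predictor.update(0x400, taken) == taken for taken in outcomes]
print(f"Correct in the last 50 branches: {sum(hits[-50:])}")
```

Because the history slices differ between "last branch was taken" and "last branch was not taken", the predictor can learn the alternating pattern, which a simple counter keyed on the branch address alone would struggle with.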

Simultaneous Multi Threading and the Infinity Fabric

Ryzen is the first AMD processor to support proper Simultaneous Multi Threading (SMT), which is the execution of two threads per core in the same way Intel does Hyper-Threading. In the previous Bulldozer architecture, AMD preached Clustered Multi Threading (CMT), where each module paired two physical integer units with one shared floating point unit.

Physically, Ryzen is built around a CPU Complex (CCX) structure. Each CCX features four cores, each with its own private 512KB L2 cache, and a shared 8MB L3. Now, when you look at the Ryzen 7 1800X, for example, you’ll notice that it is an 8-core processor with 16MB L3. This means it features two CCXes connected at the SoC level via AMD’s custom Infinity Fabric interconnect, which is based on an enhanced coherent HyperTransport protocol.
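The arithmetic behind those figures is easy to check with a short Python sketch. The per-CCX numbers and the two threads per core come from the text; the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoreComplex:
    cores: int = 4             # cores per CCX
    l2_per_core_kb: int = 512  # private L2 per core
    shared_l3_mb: int = 8      # L3 shared within one CCX


def summarize(num_ccx, threads_per_core=2, ccx=CoreComplex()):
    """Total up cores, SMT threads and cache for a chip built from num_ccx CCXes."""
    cores = num_ccx * ccx.cores
    return {
        "cores": cores,
        "threads": cores * threads_per_core,
        "l2_total_mb": cores * ccx.l2_per_core_kb / 1024,
        "l3_total_mb": num_ccx * ccx.shared_l3_mb,
    }


ryzen_7_1800x = summarize(num_ccx=2)
print(ryzen_7_1800x)
```

Two CCXes give 8 cores, 16 threads, 4MB of L2 in total and 16MB of L3, which matches the Ryzen 7 1800X figures above.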

AMD claims that with its Infinity Fabric design, it is able to scale not just processor cores but multi-socket configurations with almost linear performance gain per core increase. An example of this claim is the recently announced 32-core, 64-thread Naples server processor based on the Zen microarchitecture.

FINE GRAINED P-STATES: Ryzen processors are better equipped to react to workload changes quicker and tune performance in finer 25MHz bumps.

Ryzen processors have more than a thousand sensors connected and controlled through AMD’s new interconnect, the Infinity Fabric.

Each Ryzen Core Complex (CCX) has four cores, and a 16-way associative 8MB L3 cache that’s a victim cache of L2. All cores can access the entire L3 cache with the same average latency.

Ryzen also introduces a micro-op queue to reduce prefetch error penalty.
