HWM (Singapore)

A brief introduction to AMD Ryzen and Sense MI

A brief introduction to Sense MI and the Ryzen CPU.

- by Zachary Chan

Any high-level overview of AMD’s Ryzen processors will mention a set of five technologies collectively known as Sense MI. These are Pure Power, Precision Boost, eXtended Frequency Range (XFR), Neural Net Prediction and Smart Prefetch.

Pure Power and Precision Boost

Pure Power and Precision Boost control the voltage and operating frequency of the Ryzen processor. The concept isn’t new, as every modern processor features some form of voltage and boost frequency scaling to optimize performance-per-watt depending on workload.

What makes Ryzen different however is how fine-grained this control has become. AMD claims Ryzen has more than a thousand embedded sensors distributed per core complex that continuously analyze power, operating temperature and frequency. Ryzen processors won’t just respond to varying workload needs more quickly (in as little as 1ms), but with finer granularity as well (25MHz adjustments compared to the 100MHz steps of past generations). The combined effect of Pure Power and Precision Boost is to better optimize the performance-per-watt of a Ryzen CPU at any workload, with the goal of reducing power consumption while maintaining similar performance compared to previous architectures.
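To see what the finer step size buys, here is a minimal Python sketch that snaps a target frequency down to the nearest allowed step. The 25MHz and 100MHz step sizes come from the text above; the target frequency and the function name are made up for illustration and do not describe how AMD's firmware actually picks a clock.

```python
def snap_down(target_mhz: int, step_mhz: int) -> int:
    """Return the largest multiple of step_mhz that does not exceed target_mhz."""
    return (target_mhz // step_mhz) * step_mhz


# Hypothetical frequency that the power and thermal budget would allow.
target_mhz = 3785

coarse_mhz = snap_down(target_mhz, 100)  # 100MHz steps, as in past generations
fine_mhz = snap_down(target_mhz, 25)     # 25MHz steps, as claimed for Ryzen

print(f"100MHz steps: {coarse_mhz}MHz, leaving {target_mhz - coarse_mhz}MHz unused")
print(f" 25MHz steps: {fine_mhz}MHz, leaving {target_mhz - fine_mhz}MHz unused")
```

With smaller steps, the chosen clock can sit closer to whatever the budget allows, which is the point of the finer granularity.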

eXtended Frequency Range

All modern processors have what’s known as a base clock and a boost clock, the latter being the maximum rated frequency. The Ryzen 7 1800X for example, has a base and boost clock of 3.60GHz and 4.00GHz respectively. However, if the CPU detects that it is continuously operating under a certain temperature threshold, which would indicate that a high-performance custom cooling solution is being used (such as water cooling), an additional reserve frequency beyond the CPU’s maximum boost clock is unlocked. In the case of the Ryzen 7 1800X, that’s an additional 100MHz. Depending on how you look at it, XFR is either the lazy man’s overclocking mode or the processor thanking you for using better cooling.
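The XFR behavior described above can be sketched as a simple rule in Python. The clock figures for the Ryzen 7 1800X are taken from the text; the 60°C threshold, the class and the function name are assumptions made purely for illustration, since the article does not state the actual threshold or how the CPU evaluates it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CpuClocks:
    base_mhz: int
    boost_mhz: int
    xfr_headroom_mhz: int


# Figures quoted in the article for the Ryzen 7 1800X.
RYZEN_7_1800X = CpuClocks(base_mhz=3600, boost_mhz=4000, xfr_headroom_mhz=100)


def max_frequency_mhz(cpu: CpuClocks, temperature_c: float,
                      xfr_threshold_c: float = 60.0) -> int:
    """Highest clock the CPU may reach at the given temperature.

    xfr_threshold_c is a hypothetical value; the real threshold is not
    given in the article.
    """
    if temperature_c < xfr_threshold_c:
        # Cool enough: unlock the extra XFR headroom above the boost clock.
        return cpu.boost_mhz + cpu.xfr_headroom_mhz
    return cpu.boost_mhz


print(max_frequency_mhz(RYZEN_7_1800X, temperature_c=45.0))  # well cooled
print(max_frequency_mhz(RYZEN_7_1800X, temperature_c=75.0))  # running warm
```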

All three features, Pure Power, Precision Boost and XFR, are highly integrated, automatic and fully controlled by the CPU. What this means for the average consumer is that Ryzen CPUs will perform best if you just leave them alone. AMD even recommends that Windows’ power profile be set to High Performance instead of the default Balanced mode to give full hardware control over to the CPU.

For power users however, Pure Power, Precision Boost and XFR won’t mean much because any attempt at manual overclocking will override and disable them anyway.

Simultaneous Multi Threading and the Infinity Fabric

Ryzen is the first AMD processor to support proper Simultaneous Multi Threading (SMT), which is the execution of two threads per core in the same way Intel does Hyper Threading. In the previous Bulldozer architecture, AMD preached Clustered Multi Threading (CMT), where each module paired two physical integer cores with one shared floating point unit.

It is interesting to point out that physically, Ryzen is packaged with a CPU Complex (CCX) structure. Each CCX features 4 cores, each with its own private 512KB L2 cache, and a shared 8MB L3. Now, when you look at the Ryzen 7 1800X for example, you’ll notice that it is an 8-core processor with 16MB L3. This means it features two CCXes connected at the SoC level via AMD’s custom Infinity Fabric interconnect, which is based on an enhanced coherent HyperTransport protocol.

AMD claims that with its Infinity Fabric design, it can scale not just processor cores but multi-socket configurations with an almost linear performance gain per core increase. An example of this claim is the recently announced 32-core, 64-thread Naples server processor based on the Zen microarchitecture.
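The CCX figures quoted in the section on Simultaneous Multi Threading and the Infinity Fabric (4 cores per CCX, 512KB of L2 per core, 8MB of shared L3 per CCX, two threads per core) can be tallied in a few lines of Python. This is only an arithmetic sketch; the constant and function names are invented for illustration.

```python
CORES_PER_CCX = 4       # from the article
L2_KB_PER_CORE = 512    # private L2 per core, from the article
L3_MB_PER_CCX = 8       # shared L3 per CCX, from the article
THREADS_PER_CORE = 2    # SMT runs two threads per core


def ccx_totals(ccx_count: int) -> dict:
    """Totals for a processor built from ccx_count CCXes."""
    cores = ccx_count * CORES_PER_CCX
    return {
        "cores": cores,
        "threads": cores * THREADS_PER_CORE,
        "l2_total_kb": cores * L2_KB_PER_CORE,
        "l3_total_mb": ccx_count * L3_MB_PER_CCX,
    }


# Two CCXes, as in the Ryzen 7 1800X.
print(ccx_totals(2))
```

Two CCXes give 8 cores and 16MB of L3, matching the Ryzen 7 1800X figures in the text.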

Neural Net Prediction and Smart Prefetch

AMD claims that each Ryzen CPU has a “true artificial intelligence neural network” for better instruction prediction. This is a bit of clever marketing on AMD’s part: the neural network is essentially the CPU’s branch predictor, albeit a much smarter one based on hashed perceptrons. Ryzen can predict two branches per cycle, and the penalty for an incorrect prediction has been reduced by 3 cycles with the introduction of a micro-op cache. The addition of a deeper 3-level Translation Lookaside Buffer (TLB) and a 0 cycle Recent Predictor means that prefetched instructions get loaded faster as well. The entire cache system has been redesigned to be much faster and more efficient overall (approximately double the bandwidth for L1 and L2, and up to 5X total bandwidth for L3).
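For readers curious what a perceptron-based branch predictor looks like, below is a simplified textbook sketch in Python. It is not AMD's design, which is not public. A real hashed perceptron predictor indexes several weight tables with hashes of the branch address and history, whereas this sketch uses a single table indexed by the branch address. The class name, table size, history length and the toy alternating branch are all assumptions for illustration.

```python
class PerceptronPredictor:
    """Toy perceptron branch predictor (simplified, single weight table)."""

    def __init__(self, table_size: int = 64, history_len: int = 8):
        # Training threshold heuristic commonly used in the perceptron
        # predictor literature: floor(1.93 * history_len + 14).
        self.threshold = int(1.93 * history_len + 14)
        # One weight row per table entry: a bias plus one weight per history bit.
        self.weights = [[0] * (history_len + 1) for _ in range(table_size)]
        # Global history of recent outcomes: +1 = taken, -1 = not taken.
        self.history = [-1] * history_len

    def _row(self, pc: int) -> list:
        return self.weights[pc % len(self.weights)]

    def _output(self, pc: int) -> int:
        row = self._row(pc)
        return row[0] + sum(w * h for w, h in zip(row[1:], self.history))

    def predict(self, pc: int) -> bool:
        """Predict taken when the weighted sum is non-negative."""
        return self._output(pc) >= 0

    def update(self, pc: int, taken: bool) -> None:
        """Train on the real outcome, then shift it into the history."""
        y = self._output(pc)
        t = 1 if taken else -1
        # Adjust weights on a misprediction or when confidence is low.
        if (y >= 0) != taken or abs(y) <= self.threshold:
            row = self._row(pc)
            row[0] += t
            for i, h in enumerate(self.history):
                row[i + 1] += t * h
        self.history = self.history[1:] + [t]


def late_accuracy(steps: int = 200, window: int = 100) -> float:
    """Accuracy over the last `window` steps of a branch that alternates."""
    predictor = PerceptronPredictor()
    pc = 0x40  # hypothetical branch address
    correct = 0
    for i in range(steps):
        taken = (i % 2 == 0)
        if i >= steps - window and predictor.predict(pc) == taken:
            correct += 1
        predictor.update(pc, taken)
    return correct / window


print(late_accuracy())
```

The alternating pattern is learnable because each outcome is the opposite of the previous one, so the weight attached to the most recent history bit turns strongly negative after a short warm-up.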
