Samsung Puts a Processor Into Memory
SAMSUNG HAS DESIGNED a new iteration of Aquabolt (its high-end second-generation HBM2 modules). This one has a neat trick to it: integrated into the wafer is a processor to handle data operations that would otherwise be run by attached processors. The HBM-PIM (High Bandwidth Memory Processing In Memory) module uses some layers of its 3D stack to carry a dinky 300MHz programmable computing unit (PCU), or AI engine. The host processor controls this using standard memory commands. The 16-bit PCU can perform FP16 ADD, MUL, MAC, and MAD calculations within the memory. Samsung says it “exploits bank-level parallelism to provide four times the processing bandwidth than an off-chip solution.” Total processing power is quoted as 1.2TFLOPS.
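For readers unfamiliar with the four operations named above, here is a minimal Python sketch of what they compute at half precision. It is an illustration only, not Samsung's implementation: the function names (`fp16`, `add`, `mul`, `mad`, `mac`, `dot`) are hypothetical, the reading of MAD as a*b + c and MAC as a running accumulate is an assumption, and so is the choice to round to FP16 once per operation.

```python
import struct


def fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def add(a: float, b: float) -> float:
    """FP16 ADD: a + b, rounded to half precision."""
    return fp16(fp16(a) + fp16(b))


def mul(a: float, b: float) -> float:
    """FP16 MUL: a * b, rounded to half precision."""
    return fp16(fp16(a) * fp16(b))


def mad(a: float, b: float, c: float) -> float:
    """FP16 MAD (assumed meaning): a * b + c, rounded once at the end."""
    return fp16(fp16(a) * fp16(b) + fp16(c))


def mac(acc: float, a: float, b: float) -> float:
    """FP16 MAC (assumed meaning): add a * b to a running accumulator."""
    return mad(a, b, acc)


def dot(xs, ys) -> float:
    """Dot product built from repeated MACs, the core of most AI workloads."""
    acc = 0.0
    for x, y in zip(xs, ys):
        acc = mac(acc, x, y)
    return acc


if __name__ == "__main__":
    print(dot([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]))
    print(fp16(0.1))  # shows the precision lost by storing 0.1 in 16 bits
```

The dot product is the reason MAC matters: neural network layers are mostly long chains of multiply-accumulate steps, so running them next to the memory banks avoids shipping every operand to the host.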
Samsung makes some bold claims: a doubling of system performance, and a 70 percent decrease in power consumption when moving from its HBM to its HBM-PIM modules, with no software alteration. Giving over space to a processor does mean losing some capacity. The modules are 6GB, rather than the standard 8GB, effectively losing two layers of the eight-layer chip. The main market will initially be accelerating AI in fairly high-end applications. However, HBM is also found on the better graphics cards, where bandwidth is king. Mixing memory and processor has been difficult because of the different manufacturing processes, and problems with heat dissipation. The first has been solved thanks to the 3D stack construction of HBM modules. The second problem may well account for the relatively modest clock speed of the processor part.