Nvidia’s Titan V Brings Volta to Consumers
LAST MONTH I TALKED ABOUT Intel’s increased efforts in the field of AI, and this month Nvidia launched a new consumer product that targets the same market. Nvidia already has a professional solution, with its DGX-1 servers and Tesla V100 processors, but the Titan V brings the same Volta architecture to individual PCs.
The Titan V is a prosumer product, designed to take deep learning and AI research mainstream. It seems silly to call a $2,999 graphics card “mainstream,” but the Tesla V100, which has similar specs, is currently only available in packs of eight, shipped inside Nvidia’s DGX-1 servers, which cost $149,000. About $29,000 of that can be attributed to the core server components, with the remaining $120K going to the eight Tesla V100 cards, which means one V100 costs around $15,000—the Titan V is a bargain by comparison! The DGX-1 also uses NVLink mezzanine connectors for higher bandwidth interlinks, while the Titan V sticks with the PCIe x16 interface, so you can drop the card into just about any modern PC, and suddenly have a deep learning powerhouse.
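The per-card math above works out as follows (a quick sketch using the list prices quoted in this article; the $29,000 server-component figure is an estimate):

```python
# Back-of-the-envelope pricing from the article's figures.
dgx1_price = 149_000        # DGX-1 server, USD
server_components = 29_000  # estimated non-GPU portion, USD
cards = 8                   # Tesla V100 cards per DGX-1
titan_v_price = 2_999       # Titan V, USD

per_v100 = (dgx1_price - server_components) / cards
print(per_v100)                  # 15000.0 USD per Tesla V100
print(per_v100 / titan_v_price)  # roughly 5x the Titan V's price
```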
How powerful is the Titan V? Nvidia rates it at up to 15 Tflops for single-precision operations, 7.5 Tflops for double-precision, and 30 Tflops for FP16 operations. But the kicker is the new Tensor cores, customized for high-performance deep learning operations that don’t need higher precision. There are 640 Tensor cores in the Titan V, providing a total of 110 Tflops of FP16 tensor performance—roughly five times faster than the previous-gen Pascal, and nearly 10 times faster than any previous GeForce.
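Each Tensor core performs a small matrix multiply-accumulate with FP16 inputs and FP32 accumulation, which is why it can trade precision for raw throughput. A minimal NumPy sketch of that numeric pattern (the 4x4 tile size matches Volta's Tensor core operation, but the data and code here are illustrative, not Nvidia's actual API):

```python
import numpy as np

# Mixed-precision multiply-accumulate: D = A*B + C, with A and B in FP16
# and the accumulation done in FP32 (the pattern Tensor cores accelerate).
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4)).astype(np.float16)  # FP16 input tile
B = rng.standard_normal((4, 4)).astype(np.float16)  # FP16 input tile
C = np.zeros((4, 4), dtype=np.float32)              # FP32 accumulator

# Inputs are widened to FP32 before multiplying, so the running sum
# keeps full single-precision accuracy despite the half-precision inputs.
D = A.astype(np.float32) @ B.astype(np.float32) + C
print(D.dtype)  # float32
```

Keeping the accumulator in FP32 is what makes reduced-precision training practical: individual products lose little accuracy, while the sums avoid the overflow and rounding problems a pure-FP16 accumulation would hit.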
Nvidia has also overhauled some of the underlying architecture in Volta. There are rumblings that the changes make it better for low-level APIs, such as DX12 and Vulkan, with asynchronous compute being one area of traditional weakness on Nvidia’s recent GPUs, but we haven’t been briefed on any specific changes yet. The GV100 processor is very much focused on areas other than gaming, and it has more similarities to the GP100 architecture—a chip that never actually showed up in any consumer products, unless you count the Quadro GP100, which retails for $8,500.
The full GV100 includes up to 84 SMs (Streaming Multiprocessors), each with half as many CUDA cores as the SMs in Maxwell GPUs such as GM204, plus eight Tensor cores per SM. The Titan V has a base clock of 1,200MHz and a boost clock of 1,455MHz. It also includes 12GB of HBM2 memory, down from the 16GB the full GV100 supports, because the chip is partially disabled: four SMs and one of the HBM2 stacks are turned off. That’s hardly surprising, as the chip is massive: it measures 815mm2, packs 21 billion transistors, and is manufactured on TSMC’s 12nm process; it’s as big as Nvidia could make it with current tech. Large chips mean lower yields, but by designing in redundancies, and allowing portions of a chip to be disabled, Nvidia can get more usable chips from each wafer.
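Those unit counts line up with the headline Tflops figure. With 80 of the 84 SMs enabled and 64 FP32 cores per SM, the peak single-precision throughput at boost clock works out like this (a rough sketch; it assumes the standard two operations per core per clock for a fused multiply-add):

```python
# Rough FP32 throughput check from the unit counts above (illustrative).
sms_enabled   = 84 - 4    # four of the full GV100's 84 SMs are disabled
cores_per_sm  = 64        # FP32 CUDA cores per Volta SM
ops_per_clock = 2         # one fused multiply-add = 2 floating-point ops
boost_clock   = 1_455e6   # Hz

tflops = sms_enabled * cores_per_sm * ops_per_clock * boost_clock / 1e12
print(round(tflops, 1))   # ~14.9, in line with the "up to 15 Tflops" rating
```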
While the Titan V is not intended for gamers (it doesn’t even have an SLI connector), that doesn’t mean it can’t run games. With around 20 percent more compute performance and memory bandwidth than the Titan Xp, the card is no slouch if you’re serious about AI research and want to play games after hours. It’s early, so the drivers aren’t fully optimized, but in testing there are games where the Titan V is up to 25 percent faster than a GTX 1080 Ti, and others where it’s slightly slower; overall, it’s a net gain.
The GTX 1080 Ti is far more sensible for gaming purposes, and Nvidia is serious about this being an “entry level” AI chip. With 10 times the deep learning performance, it puts more power than ever into the hands of software developers. I can’t wait to see what they do with it.