OPEN-SOURCE CPUs
The quiet revolution changing the face of processing
MONOPOLIES AREN’T BAD. At least, they don’t have to be. But with a single corporation owning the rights to the x86 instruction set, the consumer processor market is hardly abuzz with competition. It’s true enough that AMD’s star is currently on the ascent, but that’s been helped by a decade of stagnation by Intel, ever since the heights of the original Core 2 releases. The situation is an indication, though, of what would happen if x86’s only competitor dropped out of the processor market—Intel could stop all development, and you would still have no option but to buy its processors.
It isn’t healthy to over-specialize. “But there’s ARM!” you rightly cry. ARM is a peddler of intellectual property, though. The company licenses its design, and manufactures nothing itself. It’s a healthier arrangement—there’s a range of competition, big and small—but the license fees are far from trivial.
Of course, this is hardware—people can’t just give it away for free, like software… Or can they? Just as the open-source Linux kernel has ultimately triumphed over all other competitors (in certain areas, anyway), it’s now the turn of open hardware to have its day.
So, how is this revolution going to happen? What devices are we going to see powered by open hardware? And when is this change going to occur? To answer these questions in a meaningful way, we’re first going to have to delve into what actually makes a processor tick, before considering what’s needed in this day and age of the Internet of Things.
Ones and zeros. At their heart, processors are pretty simple engines; so simple, you can recreate one inside Minecraft (http://lazcraft.info/tagged/CPU). There are registers to store instructions, data, and results. There’s a program counter that moves everything along to the beat of the processor clock. A control unit decodes fetched instructions, shifts data, and co-ordinates the pipelines that execute individual instructions. Bolt on a cache and memory, and you’re pretty much there. We’ve designed a processor—now, where’s our money?
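The fetch-decode-execute cycle described above can be sketched in a few lines of Python. The three-instruction set here is invented purely for illustration—it doesn’t correspond to any real ISA—but the shape of the loop is what every processor, from an 8008 to a RISC-V core, is doing at heart:

```python
def run(program, max_steps=100):
    regs = [0] * 4              # a tiny register file
    pc = 0                      # program counter
    for _ in range(max_steps):
        if pc >= len(program):
            break
        op, a, b = program[pc]              # fetch
        if op == "LOADI":                   # decode + execute
            regs[a] = b                     # load immediate b into register a
        elif op == "ADD":
            regs[a] = regs[a] + regs[b]     # regs[a] += regs[b]
        elif op == "JNZ":                   # branch to instruction b if regs[a] != 0
            if regs[a] != 0:
                pc = b
                continue
        pc += 1                             # advance to the next instruction
    return regs

# Sum 3 + 2 + 1 into r0: r1 counts down, r2 holds the decrement.
prog = [
    ("LOADI", 1, 3),    # r1 = 3
    ("LOADI", 2, -1),   # r2 = -1
    ("ADD",   0, 1),    # r0 += r1
    ("ADD",   1, 2),    # r1 += r2, i.e. r1 -= 1
    ("JNZ",   1, 2),    # loop back to the first ADD while r1 != 0
]
```

Running `run(prog)` leaves 6 in r0. Everything a real control unit adds—pipelining, caches, out-of-order tricks—is an optimization of this loop, not a change to it.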
The element that really defines this architecture is the instruction set—the ISA, Instruction Set Architecture. As you might imagine, this can get complicated. The original design philosophy, if you can call it that, for processor instructions is known as Complex Instruction Set Computer, or CISC. We’re being dismissive, because when x86 was being devised, instructions were chosen and added as needed, building on top of the 1972 8-bit 8008, but really starting with the 1978 8086, and onward.
It was later, out of a university project at Berkeley, that the Reduced Instruction Set Computer (RISC) philosophy was created between 1980 and 1984. The realization was that a compiled version of Unix used only a third of the available CISC instructions. The obvious move was to concentrate design effort on those few instructions, to make them run as fast as possible.
The reduction in complexity was astounding: The RISC II architecture from Berkeley was implemented with 39,000 transistors; the CISC Motorola 68000, produced around the same time and later the power behind the Amiga 500, used 68,000, yet the RISC II design ran up to 420 percent faster. The Intel 80286, also manufactured around then, used 134,000 transistors, and ran far slower than either.
The interesting part of this bit of computing history is that while consumer PCs consolidated around Intel’s x86, the high-performance computing market fell in love with RISC designs. Sun SPARC was the industry leader, but there were also DEC Alpha, IBM POWER, and MIPS (these last two are still going strong), to mention just the leaders at the time. So while, in 1992, Intel had its 66MHz 486DX2 managing 25 MIPS, Alpha had its 150MHz 21064 pumping out 86 MIPS, and the confusingly named MIPS had its 150MHz R4400 managing 85 MIPS.
We should also mention an obscure UK outfit called Acorn Computers, which at the time (1981–1994) was manufacturing the BBC (as in the UK’s state broadcaster) Microcomputer. In 1983, it took the radical decision to develop its own 32-bit RISC processor architecture called ARM to power its next computer model. If that name isn’t familiar, ARM now powers tens of billions of mobile devices.
At this point, no one would blame you for thinking this was a feature on RISC, rather than open hardware. What we’ve been doing is looking at the background of why open hardware is needed. And that background is one of ever-reducing options, not just for consumers, but for developers, manufacturers, and innovators.
The last decade of the 20th century was vibrant for the processor market, with multiple vendors selling multiple architectures. So what happened? Intel kept driving its fab technology, which eventually gave it enough transistors to optimize its x86 architecture—x86 in post-Pentium Pro designs is decoded to internal RISC-like micro-ops—and push out all competition. The main downside of the pursuit of power is that it consumes more energy, which is fine when your target is servers, workstations, desktops, or even bulky laptops. But jump forward four decades, and the emphasis is on low-power architectures for cell phones and ultra-mobile laptops, and Intel’s x86 simply can’t compete with ARM and its RISC design. A lazy way of putting this is that it requires more transistors to decode an x86 CISC instruction than an ARM RISC one, therefore x86 consumes more power per instruction. It isn’t the only reason why Intel has failed to make an impact in the ultra-mobile market, but it’s a major one.
So, all we’ve done is swap one monopolizing architecture for another—big whoop. You can argue that ARM is a better competitive landscape, in that anyone can license the ARM architecture and sign its required NDAs, if they want to and can afford it. With x86, only Intel, AMD, and VIA can design and sell x86 processors; we believe IBM can produce processors based on the 80486 architecture, but it’s legally complicated, and there have been lawsuits.
An open hardware processor promises to do away with licenses and processor monopolies. It might sound like a pipe dream, but a solution that you can buy already exists. It’s called RISC-V (http://riscv.org), and it was developed at the University of California, Berkeley, the birthplace of RISC.
Established during the summer of 2010, RISC-V (the fifth-generation Berkeley design) sets itself apart from other processor designs for a number of key reasons. The first we’ve already covered: It’s an open design, licensed under the permissive open-source BSD license. Another key difference is that previous open architecture designs focused on being simplistic, to ease understanding for academic teaching, rather than being robust and optimized for practical commercial deployment. RISC-V also has all the required commercial-level software tools in place; the instruction set architecture, for example, is fully supported by the open-source compiler GCC. All the components to boot an OS, such as Debian Linux or FreeBSD, have been in place since the start of 2016, including the important U-Boot and UEFI specification support.
These are the firmware and software tools you need to compile an operating system and software for the platform, with the firmware to bootstrap a system from cold, load the UEFI environment, and hand over to an OS kernel. Boom! An entire open system.
So what is RISC-V, and when is it going to be running our desktops/laptops/robot overlords? At its heart, RISC-V is an open, extensible processor architecture. There’s a fixed base set of definitions, with various open extensions. It supports 32-, 64-, and (as yet unused) 128-bit register and address widths (see table, pg. 50; a RISC-V processor aimed at the desktop might be named RV64GC, for instance).
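That naming scheme is mechanical enough to decode in a few lines. The sketch below handles only the simple cases—RV, a width, then single-letter extensions, with “G” as shorthand for the IMAFD bundle; real ISA strings can also carry version numbers and Z-prefixed sub-extensions, which this toy parser ignores:

```python
import re

# Common single-letter RISC-V extension names (a subset, for illustration).
EXTS = {
    "I": "base integer instructions",
    "E": "reduced (16-register) base integer instructions",
    "M": "integer multiply/divide",
    "A": "atomic operations",
    "F": "single-precision floating point",
    "D": "double-precision floating point",
    "C": "compressed 16-bit instructions",
    "V": "vector operations",
}

def decode(isa):
    """Split an ISA string such as 'RV64GC' into (width, feature list)."""
    m = re.fullmatch(r"RV(32|64|128)([A-Z]+)", isa)
    if not m:
        raise ValueError(f"not a recognized ISA string: {isa}")
    width, letters = m.groups()
    letters = letters.replace("G", "IMAFD")   # G is shorthand for IMAFD
    return int(width), [EXTS[letter] for letter in letters]
```

So `decode("RV64GC")` reports a 64-bit design with the general-purpose integer, multiply, atomic, and float extensions plus compressed instructions—exactly the mix you’d want under a desktop OS.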
The base design provides 32 integer registers, with x0 hardwired to always read as zero, a standard RISC design trick. The floating-point extension adds a further 32 float registers on top. Instructions can only operate on registers, with memory reached through explicit load and store instructions; again, standard RISC design.
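The hardwired zero register is worth a moment. A sketch of a RISC-V-style register file shows the trick: writes to x0 are silently discarded, so it always reads as zero, giving the ISA a free constant and a discard target (the class below is our illustration, not any official model):

```python
class RegisterFile:
    """A RISC-V-style integer register file: x0 always reads as zero."""

    def __init__(self, count=32):
        self._regs = [0] * count

    def read(self, index):
        return self._regs[index]

    def write(self, index, value):
        if index != 0:            # writes to x0 are discarded by design
            self._regs[index] = value

regs = RegisterFile()
regs.write(0, 123)    # silently ignored; x0 stays zero
regs.write(5, 42)     # a normal register write
```

This is why RISC-V needs no dedicated “move” or “clear” instructions—copying a register is just adding x0 to it, and unwanted results can be written to x0 and thrown away.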
Interestingly, RISC-V turned out to be unaffected by the Spectre and Meltdown speculative-execution security flaws, which affected Intel, AMD, and more recent ARM processors, mainly because the RISC-V design team rejected complex speculative branch prediction, arguing that it increases processor complexity and power consumption.
To start, it uses a simplified set of conditional instructions: It offers ten branch operations across just six instructions, which work over short branches, a signed 12-bit offset from the program counter. The design then encourages compilers to optimize for a standardized static prediction: Backward branches should be assumed taken, forward branches not taken. This ends up with larger compiled code, but a faster hardware implementation.
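That static rule fits in one line. The sketch below is just the convention from the text expressed as code: with no run-time history, a branch whose target is behind the program counter (a loop) is predicted taken, and one that jumps forward (an error path, an early exit) is predicted not taken:

```python
def predict_taken(pc, target):
    """Static prediction: backward branch -> taken, forward branch -> not."""
    return target < pc

# A loop's closing branch jumps backward, so it is predicted taken...
loop = predict_taken(pc=0x120, target=0x100)      # True
# ...while a branch over an error handler jumps forward: predicted not taken.
skip = predict_taken(pc=0x100, target=0x140)      # False
```

The compiler’s half of the bargain is to lay code out so the common path matches the prediction—loops branch backward, rarely-taken checks branch forward—which is where the slightly larger compiled code comes from.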
Certainly, at this stage RISC-V lacks a number of instruction areas that desktop and server users take for granted, but they’re being worked on. For desktop users, we’re used to SIMD being available to accelerate multimedia and digital signal processing. The packed-SIMD extensions are still being developed; the idea is that single instructions could process 64 bits of data stored across four 16-bit registers.
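The packed-SIMD idea can be illustrated in plain Python: one “instruction” adds four 16-bit lanes packed into a single 64-bit value, with each lane wrapping independently so no carry leaks into its neighbor. This is our sketch of the concept, not the draft extension itself:

```python
MASK16 = 0xFFFF     # one 16-bit lane

def packed_add16(a, b):
    """Add two 64-bit values as four independent 16-bit lanes."""
    result = 0
    for lane in range(4):
        shift = lane * 16
        # Add the two lanes, then mask so overflow stays within the lane.
        lane_sum = ((a >> shift) + (b >> shift)) & MASK16
        result |= lane_sum << shift
    return result

a = 0x0001_0002_0003_0004
b = 0x0010_0020_0030_0040
c = packed_add16(a, b)    # 0x0011_0022_0033_0044: four adds in one call
```

A real packed-SIMD unit does the same thing in one clock by simply cutting the carry chain of a 64-bit adder at the lane boundaries, which is why it is such a cheap way to speed up pixel and audio work.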
Interestingly, there is resistance to the addition of SIMD, with some saying it is a poor use of processor resources, and that it would be better to concentrate on vector extensions. A full vector extension could potentially be implemented within the default register set, be more flexible than a SIMD extension, and ultimately execute as fast, if not faster. There is a solid argument here, as anyone who remembers MMX, MMX2, 3DNow!, 3DNow! 2, SSE1/2/3/4, AVX, and so on can attest. The danger is an ever-growing instruction set, which goes against RISC, rather than a fixed, flexible vector set.
It’s still very early days for RISC-V, but there are products on the market. They are, however, aimed at makers, embedded projects, and research markets. Take SiFive (www.sifive.com): It shipped the HiFive1, an Arduino-style microcontroller using a RISC-V processor (RV32IMAC—go check those codes), in the middle of 2017. Benchmarks (running at 256MHz) showed it outperforming an ARM Cortex-M4 and an Arduino Micro on a MHz-for-MHz level, all for $69.
More interestingly, in mid-2018, SiFive shipped its HiFive Unleashed. It might have cost $999, but this development board was capable of running desktop Linux, and used a quad-core 1.5GHz RISC-V RV64GC architecture. Each core uses a five-stage, single-issue, in-order pipeline, backed by 32KB L1 instruction and data cache, and 2MB L2 cache. The interesting part is that you can download every element of the design—the schematics, bill of materials, and processor design files.
At this early stage of development, it seems silly to compare performance; the single-issue, in-order design is going to peg performance below ARM’s Cortex-A5x processors, which benefit from wider issue and branch prediction. Indeed, tests of the HiFive Unleashed show it running 4–10 times slower than the 2GHz Nvidia Jetson TX2.
The Nvidia SoC shows the sort of competition RISC-V, as an architecture, is up against. It’s not enough to deliver a working processor core; these days, people expect a host of connectivity to come with it, from memory controllers and PCIe buses to Bluetooth and wireless and wired networking. All of these come with their own controllers and patents, so the dream of building a truly open hardware platform is an uphill struggle, but it’s one that can be overcome.
RISC-V is certainly attracting industry attention—current Platinum members of the RISC-V Foundation include Google, HP, IBM, Oracle, Microsoft, Nvidia, and Qualcomm. At the seventh RISC-V workshop at the end of November 2017, Western Digital announced that it was planning to transition a billion cores per year to RISC-V design, for data center and edge computing.
It’s a bold statement from a big player, but its implementations will be, at best, low-end processors running in embedded controllers, which could certainly deliver a cost and competitive advantage to WD down the line. It’s all early days for RISC-V, but like the Linux kernel, which was also dismissed at the time, once academia, researchers, and businesses start using and enhancing your open design, world domination is just a decade away.
If we were Intel, we’d be worried, and it seems ARM is already circling its wagons. But RISC processors are puny and weak, aren’t they? You forget how powerful they were in the early nineties. Up until 2018, China had the fastest supercomputer in the world, the Sunway TaihuLight, built entirely around homegrown Sunway RISC processors. It’s planning three new models for 2020, again x86-free. Japan has a similar design using ARM processors for 2021. Even the recent US Summit supercomputer gets the majority of its computing power not from its IBM POWER9 RISC processors, but from Nvidia’s Tesla V100 units. While Intel or AMD will undoubtedly be powering your desktop through the 2020s, with ARM inside your mobile devices, we’d be surprised if RISC-V wasn’t appearing in all manner of embedded devices, even some mobile ones, and perhaps moving to the data center. Your desktop could be next.