Today's article comes from the journal Electronics. The authors are Dakic et al., from Algebra University, in Croatia. In this paper, they run a performance shootout between the RISC-V and ARM architectures to see which is more suitable for High Performance Computing with Kubernetes. Let's dive in and see what they found.
DOI: 10.3390/electronics13173494
In 2020, Apple revealed the first release in their “M” series of computers, their foray into designing their own chips. No longer would they depend on Intel's processors to power their machines; the future of Apple would be vertically-integrated chip development. And it all started with the M1. While most of the headlines of the time were about the sheer processing power of the computer, many articles buried the real lede. The incredible part of the announcement was less about Apple bringing chip design in-house, and more about the fact that they had switched from x86 to ARM. This wasn’t just a coup for ARM, it was a win for RISC: the Reduced Instruction Set Computer architecture.
For most of its existence, RISC architecture has been viewed as the “little brother” to x86. The large and powerful machines, servers and workstations, used x86. The processors were big, power hungry and generally good at everything. Mobile phones, tablets, and IoT devices used RISC. These chips were less power hungry and more purpose-built for specific functions. They weren’t trying to be good at everything, just good at the mission they were being deployed to complete. The term Reduced Instruction Set is literally true. RISC machines utilize a smaller, highly optimized set of instructions compared to x86, and this enables faster, simpler execution.
Apple’s debut of the M1 helped change a lot of minds. Apple had spent years learning how to optimize their RISC chips for the iPhone, iPad, iPod and Watch. Now, they were confident that RISC, while still utilizing a comparatively smaller instruction set, could outperform x86 at the key tasks that really matter to most computers.
One year after the debut of the M1, another milestone for RISC came to pass. A new supercomputer from Fujitsu and Riken called Fugaku topped the list of the 500 most powerful supercomputers in the world. And it was powered by the Fujitsu A64FX, a RISC processor.
If the M1 had been RISC’s moon-landing moment, then Fugaku was RISC landing on Mars. There had been a number of good reasons to believe that such an accomplishment would never come to pass. Now that it has, it's caused many players in both academia and industry to sit back and rethink everything they ever thought about RISC-based chips. After all: if it was good enough for the most powerful supercomputer in the world, why wasn't it good enough for them?
It’s now a full three years later, and I think it’s fair to say that RISC has arrived. You can find RISC-based chips powering cloud infrastructure, PCs, gaming computers, supercomputers and everything else. At the forefront of this revolution is ARM, who contrary to popular belief is not actually a chipmaker. ARM is a design firm. They design chips; almost exclusively RISC-based chips. To say they’re good at this is an understatement. When it comes to RISC-based chip design, they're in a class of their own. There is ARM, and then there is everyone else. This is why Apple built the M-series computers on the ARM architecture, and why Fujitsu built Fugaku’s processor on it too.
Founded in 1990, ARM is the ultimate secret weapon for computer companies who want to disrupt the status quo. But now, there is a disruptor to the disruptor. Virtually everything ARM produces (and helps other companies produce) is highly proprietary and closely guarded. And now, they have an open-source challenger that is 20 years younger, and coming for blood: RISC-V.
RISC-V is less mature, has a smaller ecosystem, has a shorter track record, and like so many open-source projects is not always a great experience. But, they’re doing it. They’re making an attempt to challenge ARM. The question is: will ARM continue to outpace them, or will RISC-V be able to harness the power of an open community to crowdsource their way to market dominance? Does that seem crazy? Well, there was a time when thinking that Linux would one day have more market share than Unix was crazy. In our industry, sometimes a scrappy open-source project ends up dominating the proprietary project that it cloned. It happens.
The question is: is that the direction we’re heading, or not? Is RISC-V gaining significant ground on ARM? Are they approaching feature parity and performance parity?
That’s what the authors of this paper intended to find out. Specifically as it relates to high-performance computing workloads running in Docker and Kubernetes. This mashup might sound a little odd. When you think of high performance computing you probably think of dedicated racks of physical machines, having C, FORTRAN, and Assembly programs loaded in as close to the metal as possible. And that’s how it was for a long time. But the nature of High Performance Computing is changing. HPC practitioners have begun setting their sights on Docker and Kubernetes. Why? Because like any other programmer, HPC devs are concerned with consistently replicable environments. This is Docker’s specialty. And, the experiments that HPC developers run are increasingly structured as distributed workloads that are virtualized, parallel, and asynchronous. All the things that Kubernetes is good at. Not to mention: HPC developers are largely tasked with designing and executing experiments. And like any other dev, the more they can abstract away the cruft and overhead of server-setup and maintenance, the more time they can spend working on the stuff that actually creates and provides value at their organizations.
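To make the Kubernetes angle a bit more concrete, here is a minimal sketch of how one containerized workload could be pinned to a given CPU architecture. It is not taken from the paper. It builds a standard batch/v1 Job manifest and uses the well-known `kubernetes.io/arch` node label as a node selector. The function name `make_benchmark_job`, the Job name, and the image `example.org/hpc-bench:latest` are all hypothetical placeholders.

```python
import json

# Values Kubernetes reports in the well-known node label "kubernetes.io/arch".
ARCHES = ("amd64", "arm64", "riscv64")


def make_benchmark_job(arch, image, parallelism=4):
    """Build a batch/v1 Job manifest pinned to nodes of one CPU architecture.

    Hypothetical helper for illustration; the paper does not describe this.
    """
    if arch not in ARCHES:
        raise ValueError(f"unknown architecture: {arch}")
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": f"hpc-bench-{arch}"},  # placeholder name
        "spec": {
            # Run this many pods at once, and require this many to succeed.
            "parallelism": parallelism,
            "completions": parallelism,
            "template": {
                "spec": {
                    # Only schedule onto nodes with the requested architecture.
                    "nodeSelector": {"kubernetes.io/arch": arch},
                    "restartPolicy": "Never",
                    "containers": [{"name": "bench", "image": image}],
                }
            },
        },
    }


if __name__ == "__main__":
    # The image below is a made-up placeholder, not a real benchmark image.
    job = make_benchmark_job("riscv64", "example.org/hpc-bench:latest")
    print(json.dumps(job, indent=2))
```

The printed JSON can be saved to a file and passed to `kubectl apply -f`, since kubectl accepts JSON manifests as well as YAML. Changing only the `arch` argument is what lets the same workload definition run on x86, ARM, or RISC-V nodes, assuming the image has been built for that architecture.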
So the research question that these authors are asking is fairly straightforward: If we’re going to be running Dockerized, Kubernetes-ized workloads for our experiments, should we choose ARM or RISC-V architecture as the underlying substrate? Or should we forget about both of them and stick to x86? In other words:
The authors structured this research as a shootout. I will spare you all the details of how exactly they set up this experiment. There are really just a few important parts:
I’m glossing over a lot of the details here, especially about how much difficulty they had getting some of the RISC-V options up and running. Suffice it to say, getting Docker running on ARM was a simple installation; getting it running on RISC-V was like…well…most of the people listening to this are or were Software Engineers, I think you know exactly what it was like. Incomprehensible errors, banging your head against the keyboard, late nights sifting through the archives of obscure mailing lists and reading GitHub issues that were closed 5 years ago. It was work. But, let’s keep in mind that the scope of this experiment is to figure out whether the performance of RISC-V is approaching that of ARM. The goal is not to determine if RISC-V is as user-friendly. I, for one, would much prefer that the folks stewarding RISC-V continue to focus on making their product more performant for now, and save the nice Developer-Experience for later.
But I digress. With everything set up they ran six main types of tests:
They ran those tests for ARM and RISC-V. Then for a baseline (and to make the results more meaningful) they ran the tests on an x86 processor as well. Here are the results: ARM beat RISC-V every time, and it wasn’t even close. ARM also beat x86 a number of times, but there were a few areas where x86 still outperformed: namely all-core performance and HPCC-stream. The only universal for the tests was that RISC-V lost, and it lost by a lot. The authors didn't always provide specific numbers for each test, but the tests that did have numbers showed ARM as between 4X and 15X the performance of RISC-V.
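A figure like "4X" or "15X" is just one platform's score divided by another's. Here is a minimal sketch of that arithmetic, assuming a benchmark where a higher score is better. The function name `relative_performance` is hypothetical, and the numbers are made-up placeholders chosen only to show the calculation. They are not results from the paper.

```python
def relative_performance(scores, baseline):
    """Divide every platform's score by the baseline platform's score.

    Assumes higher scores mean better performance.
    """
    base = scores[baseline]
    if base <= 0:
        raise ValueError("baseline score must be positive")
    return {name: value / base for name, value in scores.items()}


# Placeholder scores for illustration only. NOT measurements from the paper.
placeholder_scores = {"riscv": 2.0, "arm": 16.0}

ratios = relative_performance(placeholder_scores, baseline="riscv")
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}x the baseline")
```

For a benchmark where lower is better, such as elapsed time, the ratio would need to be inverted before it is read as a speedup.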
So, what are the takeaways from this paper?
Well, RISC-V isn’t nothing, but it’s still got some work to do. It’ll be worth revisiting these kinds of tests in the future, as change can happen fast. But for now, ARM is the clear favorite. The decision that HPC devs (who want to use Docker and Kubernetes) need to make is whether x86 or ARM is the right choice, not whether ARM or RISC-V is. That being said, if the proprietary nature of ARM’s designs repels your organization, then RISC-V will allow you to build a chip and computer that is as open as you desire it to be.
If you’d like to read more details about the shootout they set up, or read about all the steps they had to take to get RISC-V running, definitely download the paper. There’s also a GitHub repo dedicated to this paper, where they put a lot of cool stuff. But just a heads up, the GitHub repo is not mentioned in the Data Availability Statement; it’s actually listed as citation #64 in the references.