As the HPC community hurtles toward the exascale era, it’s good to pause and reflect. Here are a few thoughts…

By Steve Conway | May 16, 2016

Steve Conway, IDC Research Vice President, HPC

The DOE CORAL procurement signaled that extreme-performance supercomputers from the U.S., Japan, China and Europe should reach the 100-300PF range in 2017-2018. That’s well short of DOE’s erstwhile stretch goal of deploying a trim, energy-efficient peak exaflop system in 2018 or so, but still impressive. It would appear to leave room for one more pre-exascale generation before full-exascale machines begin dotting the global landscape in the 2020-2024 era.

An exaflop is an arbitrary milestone, a nice round figure with the kind of symbolic lure the four-minute mile once held. And as NERSC Director Horst Simon pointed out many moons ago, there are three temporal stages to these computing milestones that have occurred about once a decade. First will come peak exaflop performance, then a Linpack/TOP500 exaflop, and finally the one that counts most but will likely be celebrated least: sustained exaflop performance on a full, challenging 64-bit user application.

A peak exascale system is merely an “exasize” computer, to cite the term Chinese experts used in an SC13 conference talk. It’s a show dog without a repertoire of tricks. A system that completes a Linpack run at exascale shows at least that a major fraction of the system can be engaged to tackle a dense system of linear equations. The path to the third stage — sustained exaflop performance on challenging user applications — is where many of the biggest hurdles lie. Prominent among these, as is well-known, are scaling the software ecosystem, providing enough reliability and resiliency to finish exa-jobs, and supplying enough IO to keep the heterogeneous processing elements busy. These are the same challenges advanced users face today, only more so.

The IO challenge is particularly nasty. In recent decades, HPC systems have become extremely compute-centric (“f/lopsided”). This increasing imbalance has aggravated the memory wall and narrowed the breadth of applicability of each succeeding generation of high-end supercomputers, especially for data-intensive simulation and the increasingly important field of advanced analytics. Fortunately, strategies are under way to alleviate (but not fix) this issue, including more capable interconnect fabrics, burst buffers and NVRAM, tighter linkages between CPUs and accelerators, clever data reduction methods, and more besides. But no one should expect supercomputers to return to the more balanced status of yesteryear. IDC vendor studies show that the basic architecture of HPC systems is unlikely to change in the next five to seven years, although configurations and some components will shift.

Not long ago, a fundamental premise underlying advanced supercomputer development was that evolutionary market forces were too slow and governments needed to stimulate revolutionary progress. The idea was that the government would do the heavy lifting to pave the way, and the mainstream HPC market would follow to take advantage of the revolutionary advances. In our annual HPC predictions, IDC back then pointed out the risk that the government-supported high-end HPC market might split off as its own ecological niche, while the mainstream market continued to evolve on its own inertial path.

That split, though still possible, has not happened. Instead, government officials, for the most part, have realized that they are no longer the primary drivers of HPC. Market forces have usurped that role. The worldwide HPC market’s diversification and ten-fold expansion in the past three decades, from $2 billion to more than $20 billion, have removed the government from the kingpin position it once held. Government officials in most HPC-exploiting countries have inflected their strategies to take better advantage of market forces, especially technology commoditization and open standards.
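For a rough sense of scale, the market expansion cited above works out to growth of roughly 8 percent a year, compounded. The sketch below shows that arithmetic. It assumes "three decades" means exactly 30 years and treats $20 billion as the end value, although the figure quoted is "more than" that, so the result is a lower bound. The function name is illustrative, not an IDC metric.

```python
def implied_annual_growth(start_value: float, end_value: float, years: float) -> float:
    """Return the compound annual growth rate implied by a start and end value."""
    if start_value <= 0 or end_value <= 0 or years <= 0:
        raise ValueError("values and years must be positive")
    # Solve end = start * (1 + r) ** years for r.
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Figures from the text: $2 billion growing to (at least) $20 billion.
# Assumption: "three decades" is taken as exactly 30 years.
rate = implied_annual_growth(2e9, 20e9, 30)
print(f"Implied compound annual growth: {rate:.1%}")
```

Steady single-digit growth sustained over decades is consistent with the picture of a market shaped by broad commercial demand.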

The fact that governments have met HPC market forces partway is ultimately a good thing for all parties. It means that many of the government-supported advances for exascale computing will sooner or later benefit the mainstream HPC market, including SMEs that buy only a rack or two of technical servers. That, in turn, means that savvy government officials can help justify the skyrocketing investments needed for extreme-scale supercomputers by pointing to ROI that benefits the large mainstream market, including industry and commerce. Government-driven advances can be used both to out-compute and to out-compete.

So, it appears that at least through the early exascale era, vendors will continue to build Linpack machines, because most government buyers will continue to see superior Linpack performance as a mark of leadership. Things might develop differently if more leading sites followed the example of the NCSA “Blue Waters” procurement, where the overwhelming stress was on the assessed needs of user applications and Linpack performance was not even reported. That was a deliberate decision, because “Blue Waters” is also a competent Linpack machine at heart and could have recorded impressive Linpack results. The point here is that, if lots of buyers gave primary consideration to user requirements in their procurements, this should lead to better system balance and wider applicability over time.

At the high end of the supercomputer market, money talks, too. Government funding appetites will play a major role in determining the sequence in which the entrants cross the exascale finish line. In earlier times, the global supercomputer race pitted “muscle cars” from the U.S. and Japan against each other, and these monoliths featured lots of custom technology. But today, as a successful Arnold Schwarzenegger once advised a neophyte bodybuilder, “it’s not the size of your muscles that counts; it’s the size of your wallet.” Among governments, the U.S. is still the largest funder and the Obama Administration’s budget request puts a high priority on exascale funding — although Congress has not approved this yet. The EU has been ramping up exascale funding, although not as fast as China, and Japan is likely to give everyone a run for their money.

World-leading supercomputers have not exactly morphed from muscle cars to family sedans yet, but they’ve been on that path — and it’s generally a healthy one. The adoption of industry standards has been necessary for the expansion and democratization of the HPC industry, for broader collaboration, for better reliability, and for preserving and leveraging investments in software and hardware development. It’s hard to imagine how vendors could make exascale muscle cars affordable, even for government buyers with the deepest pockets. The “Blue Waters” and CORAL procurements, among others, prove that, in the era of evolutionary HPC systems, important innovations can be pursued on behalf of users.

Governments around the world have increasingly recognized that HPC is a transformational technology that can boost not only scientific leadership, but also industrial and economic competitiveness. Accompanying this recognition is the notion that HPC is too strategic to outsource to another country, meaning to the U.S. in most cases. Exascale initiatives in Asia and Europe are promoting the development of indigenous technologies, often in conjunction with non-native components.

I’ve been talking so far about hardware, but we’ve said for some years at IDC that software advances will be more important than hardware progress in determining future HPC leadership. It’s gratifying to see national and regional exascale initiatives increase funding for exascale software development, although the amounts still seem unequal to the task.

The long-term good news is that HPC has become a mature market, one driven by market forces. That gives strong assurance that the market will behave rationally over time. Demand, in the form of buyer and user requirements, will increasingly win out.