Steve Conway, IDC Research Vice President, HPC
Big data arguably originated in the global high-performance computing (HPC) community in the 1950s for government applications such as cryptography, weather forecasting, and space exploration. High-performance data analysis (HPDA)—big data using HPC technology—moved into the private sector in the 1980s, especially for data-intensive modeling and simulation to develop physical products such as cars and airplanes. In the late 1980s, the financial services industry (FSI) became the first commercial market to use HPC technology for advanced data analytics (as opposed to modeling and simulation). Investment banks began to use HPC systems for daunting analytics tasks such as optimizing portfolios of mortgage-backed securities, pricing exotic financial instruments, and managing firm-wide risk. More recently, high-frequency trading joined the list of HPC-enabled FSI applications.
The invention of the cluster by two NASA HPC experts in 1994 made HPC technology far more affordable and helped propel HPC market growth from about $2 billion in 1990 to more than $20 billion in 2013. More than 100,000 HPC systems are now sold each year at starting prices below $50,000, and many of them head into the private sector.
It’s widely known that industrial firms of all sizes have adopted HPC to speed the development of products ranging from cars and planes to golf clubs and potato chips. But lately, something new is happening. Leading commercial companies in a variety of market segments are turning to HPC-born parallel and distributed computing technologies — clusters, grids, and clouds — for challenging big data analytics workloads that enterprise IT technology alone cannot handle effectively. IDC estimates that the move to HPC has already saved PayPal more than $700 million and is saving tens of millions of dollars per year for other companies.
The commercial trend isn’t totally surprising when you realize that some of the key technologies underpinning business analytics (BA) originated in the world of HPC. The evolution of these HPC-born technologies for business analytics has taken two major leaps and is in the midst of a third. The advances have followed this sequence:
Phase 1 was the advance from the mainframe mentality of running single applications on traditional SMP servers to modern clusters (i.e., systems that lash together homogeneous Linux or Windows blades to exploit the attractive economics of commodity hardware).
Phase 2 was the move to grids with the goal of supporting multiple applications across business units coherently. This enables enterprisewide management of the applications and workloads.
Phase 3 is the emerging move to cloud computing, which focuses on delivering generic computing resources to the applications and business units on an on-demand, pay-as-you-go basis. Clouds can be hosted within a company, by an external provider, or as a hybrid combination of both.
Why Businesses Turn to HPC for Advanced Data Analytics
High-performance data analysis is the term IDC coined to describe the formative market for big data workloads that exploit HPC resources. The HPDA market represents the convergence of two streams: the long-standing, data-intensive modeling and simulation (M&S) methods in the HPC industry and application segments that IDC has tracked for more than 25 years, and newer high-performance analytics methods that are increasingly employed both in those segments and by commercial organizations adopting HPC for the first time. HPDA workloads may use long-standing numerical M&S methods; newer methods such as large-scale graph analytics, semantic technologies, and knowledge discovery algorithms; or some combination of the two.
The factors driving businesses to adopt HPC for big data analytics (i.e., HPDA) fall into a few main categories:
High complexity. HPC technology allows companies to aim more complex, intelligent questions at their data infrastructures. This ability can provide important advantages in today’s increasingly competitive markets. HPC technology is especially useful when there is a need to go beyond query-driven searches in order to discover unknown patterns and relationships in data — such as for fraud detection, to reveal hidden commonalities within millions of archived medical records, or to track buying behaviors through wide networks of relatives and acquaintances. IDC believes that HPC technology will play a crucial role in the transition from today’s static searches to the emerging era of higher-value, dynamic pattern discovery.
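The difference between a query-driven search and dynamic pattern discovery can be made concrete with a minimal sketch. The example below is illustrative only and not drawn from any system described in the article: the record format, the `acct`/attribute naming convention, and the `fraud_rings` helper are all hypothetical. It walks a small graph of accounts linked by shared attributes (say, a common device or address) to surface indirect connections that no single query would anticipate.

```python
from collections import defaultdict

# Hypothetical observations: (account, shared attribute) pairs, e.g.
# accounts seen using the same device or mailing address.
edges = [
    ("acct1", "device_A"), ("acct2", "device_A"),
    ("acct2", "addr_X"),   ("acct3", "addr_X"),
    ("acct4", "device_B"),
]

def fraud_rings(edges):
    """Group accounts into connected components via shared attributes.

    A query can answer "which accounts used device_A?"; walking the
    graph also surfaces indirect links (acct1 -> acct3 via acct2)
    that were never asked about explicitly.
    """
    graph = defaultdict(set)
    for acct, attr in edges:
        graph[acct].add(attr)
        graph[attr].add(acct)

    seen, rings = set(), []
    for node in graph:
        # By the naming convention above, accounts start with "acct".
        if node in seen or not node.startswith("acct"):
            continue
        stack, component = [node], set()
        while stack:                      # depth-first walk of one component
            cur = stack.pop()
            if cur in seen:
                continue
            seen.add(cur)
            if cur.startswith("acct"):
                component.add(cur)
            stack.extend(graph[cur])
        rings.append(sorted(component))
    return rings
```

At production scale the same idea runs over billions of edges, which is exactly the kind of graph-analytics workload that drives companies toward HPC-class systems.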
High time criticality. Information that is not available quickly enough may be of little value. The weather report for tomorrow is useless if it’s unavailable until the day after tomorrow. At PayPal, enterprise technology was unable to detect fraudulent transactions until after the charges had hit consumers’ credit cards. The move to high-performance data analysis using HPC technology corrected this problem. For financial services companies engaged in high-frequency trading, HPC technology enables proprietary algorithms to exploit market movements in minute fractions of a second, before the opportunities disappear.
High variability. People generally assume that big data is “deep,” meaning that it involves large amounts of data. They recognize less often that it may also be “wide,” meaning that it can include many variables. Think of “deep” as corresponding to lots of spreadsheet rows and “wide” as referring to lots of columns (although a growing number of high-performance data analysis problems don’t fit neatly into traditional row-and-column spreadsheets). A “deep” query might request a prioritized listing of last quarter’s 500 top customers in Europe. A “wide” query might go on to analyze their buying preferences and behaviors in relation to dozens of criteria. An even “wider” analysis might employ graph analytics to identify any fraudulent behavior within the customer base.
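The "deep" versus "wide" distinction can be sketched in a few lines. The customer records, column names, and scoring rule below are all hypothetical, chosen only to contrast the two query shapes: a deep query scans many rows but touches one column, while a wide analysis combines many columns per row.

```python
# Hypothetical customer records: only a few "deep" rows here,
# but each row carries several "wide" fields.
customers = [
    {"name": "A", "region": "EU", "revenue": 900, "returns": 1, "channel": "web"},
    {"name": "B", "region": "EU", "revenue": 700, "returns": 9, "channel": "store"},
    {"name": "C", "region": "US", "revenue": 800, "returns": 0, "channel": "web"},
]

# "Deep" query: filter and rank on a single column (revenue),
# like listing last quarter's top customers in Europe.
deep = sorted((c for c in customers if c["region"] == "EU"),
              key=lambda c: c["revenue"], reverse=True)

# "Wide" analysis: score each row across several columns at once
# (revenue, return rate, sales channel) with an illustrative formula.
def score(c):
    return c["revenue"] - 50 * c["returns"] + (100 if c["channel"] == "web" else 0)

best = max(customers, key=score)
```

A deep query parallelizes naturally across rows; a wide analysis multiplies the work per row by the number of criteria, which is one reason wide workloads push organizations toward HPC-scale resources.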
© HPC Today 2020 - All rights reserved.