The Lustre parallel file system software has long seen healthy adoption in research labs around the world, despite a challenging start under multiple owners before it was released to the open source community. A boost from OpenSFS, contributions from Whamcloud beginning in 2010, Intel’s acquisition of Whamcloud in 2012, and the introduction of commercial distributions based on the open source version from a number of suppliers, including Intel, have helped accelerate Lustre’s relevance and today are contributing to its growing adoption in enterprise HPC.
According to Earl Joseph, a top HPC industry analyst with IDC, “Along with IBM’s General Parallel File System (GPFS), Lustre is the most widely used file system. But Lustre is experiencing healthy growth in terms of market share while GPFS remains flat. Lustre is also supported by a large number of OEMs, providing the HPC community with a strong base for growth.”
INTEL CONTINUES TO SEE SIGNIFICANT LUSTRE ADOPTION
“Intel has experienced considerable growth in sales, measured by support contracts for the Intel editions of Lustre, year-over-year from 2013 to 2014,” said Bret Costelow, Director of Global Sales for Lustre Solutions at Intel. “This is representative of those Lustre adopters who have moved away from unsupported, roll-your-own versions, and the competition to Intel’s editions,” said Costelow. “And, we continue to see growth momentum in 2015.”
Since the formation of Intel’s High Performance Data Division (HPDD), the company has also grown its Lustre channel, covering OEMs and system integrators, from a handful in 2013 to more than 170 today. “This momentum is evidence of our efforts at Intel to penetrate the enterprise market with a high-performance data solution that offers HPC-class performance with the enterprise-grade reliability, stability, manageability, and high availability features that IT departments require,” stated Costelow.
According to Costelow, Intel has seen multiple design wins with major OEMs, including Bull, Cray, DataDirect Networks, Dell, EMC, Huawei, HP, Inspur, NetApp, SGI, and SuperMicro. These wins underscore Lustre’s growing relevance in the marketplace against other storage solutions, both in academic and national labs and in enterprise HPC, and the momentum Lustre is gaining as a leading parallel file system for high-performance applications.
EXPANDING ENTERPRISE DEPLOYMENTS
“We’re seeing Lustre adoption in more enterprise and enterprise-influencing markets, including oil and gas, financial services, and genomics,” commented Costelow. For example, Tata Consultancy Services (TCS) has been working with Intel to examine the convergence of HPC and Big Data in the Financial Services Industry (FSI), with Intel Enterprise Edition for Lustre software as the underlying file system. Instead of testing with industry-standard benchmarks, which don’t always reflect how system performance affects a particular use case, TCS is testing with a range of real-world financial applications. “Our objective was to come up with a platform for Hadoop data analysis using an HPC cluster that would give us good performance,” said Rekha Singhal, Senior Scientist with TCS. “TATA was able to achieve a 3X performance gain using Hadoop on Intel’s version of Lustre compared to Hadoop on the Hadoop Distributed File System (HDFS).”
At the Bank of Italy, the IT division of the economics and statistics department is modernizing the HPC infrastructure that supports user applications and provides some database storage. The newest system, currently being readied for production, is based on Intel Xeon processors and the Intel Enterprise Edition for Lustre software. It will support about 700 users in the department with 100 TB of storage. The system is designed to enable collaboration across the organization by storing and sharing, from the Linux cluster’s Lustre file system, all of the documents users create with their Windows applications. The installation is also representative of the smaller files Lustre is now expected to move with the same performance and efficiency it delivers in traditional HPC installations. Instead of the typical very large files, in the range of hundreds of gigabytes to terabytes, the new system will hold many millions of files averaging about 4 KB each. For that reason, Intel Enterprise Edition for Lustre version 2.5, which can support many more metadata servers than earlier versions, is being deployed.
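A rough back-of-the-envelope sketch in Python can illustrate why a small-file workload like this one stresses metadata rather than capacity. The 4 KB average and the 100 TB figure come from the text above. The file counts, the choice of 4,096 bytes for "4 KB", the decimal definition of a terabyte, and the function names are illustrative assumptions, not details of the Bank of Italy deployment.

```python
# Illustrative arithmetic only; names and file counts are hypothetical.
KIB = 1024
TB = 10**12  # assumption: decimal terabyte (the article does not specify)


def data_volume_bytes(file_count: int, avg_file_bytes: int = 4 * KIB) -> int:
    """Total bytes of file data for a given number of small files."""
    return file_count * avg_file_bytes


def max_file_count(capacity_bytes: int, avg_file_bytes: int = 4 * KIB) -> int:
    """Upper bound on files that fit in a capacity, ignoring all overhead."""
    return capacity_bytes // avg_file_bytes


if __name__ == "__main__":
    # Assumed file counts, chosen only to show the trend.
    for millions in (10, 100, 1000):
        count = millions * 10**6
        volume_tb = data_volume_bytes(count) / TB
        # Each file needs at least one metadata entry, so the number of
        # metadata objects grows with the count, not with the bytes stored.
        print(f"{millions:>5} million files: {volume_tb:.3f} TB of data, "
              f"{count:,} metadata entries")

    print(f"Ceiling at 100 TB: {max_file_count(100 * TB):,} files")
```

Under these assumptions, even a hundred million files occupy well under one percent of a 100 TB system, while every one of them still has to be created, looked up, and tracked. That is consistent with the emphasis on metadata server scaling described above.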
The oil and gas industry runs some of the largest clusters in the world, according to Brent Gorda, General Manager of Intel’s HPDD. With the massive data sets involved in seismic research, oil and gas is a commercial example of the workloads Lustre was designed for when it was first funded under U.S. Department of Energy grants.
“At DownUnder GeoSolutions (DUG), we operate Intel Lustre filesystems totaling 9 PB across our worldwide processing centers,” stated Phil Schwan, Head of Software at DUG. “And we’re adding multiple PB each year.” Some of DUG’s codes are extremely I/O intensive, and the demands on the cluster’s file system become more extreme because the company runs Intel Xeon Phi™ coprocessors in many of its compute nodes. “Keeping them all busy at 90+ percent of their peak capability requires a lot of filesystem bandwidth,” commented Schwan.
LUSTRE ACHIEVEMENTS IN THE CLOUD
“One of the most exciting opportunities we’re seeing for Lustre expansion is with Intel Cloud Edition for Lustre software,” commented Costelow. Amazon has integrated the Cloud Edition into Amazon Web Services (AWS) to offer high-performance, scalable storage using Lustre in its Elastic Compute Cloud (EC2). SAS, the business analytics software company, delivers clustered analytics services through the Amazon Web Services Marketplace and recommends using Lustre on AWS for its analytics software. “With Amazon and SAS, we are seeing and expect to see even more significant and exciting traction taking place in the near future with Lustre in the Cloud,” added Gorda.
LUSTRE IN ACADEMIA INFLUENCES COMMERCE
Academic research can have a significant influence on commercial enterprise. The HPC systems used in this arena often use Lustre to support their compute- and I/O-intensive workloads. For example, Iowa State University (ISU) in Ames, Iowa, recently installed its Condo cluster, based on Intel Xeon processors, Intel True Scale fabric, and the Intel Enterprise Edition for Lustre software with Intel support. Condo is being used across many scientific disciplines with high potential for commercial impact. ISU researchers use Condo in genomics studies aimed at improving brood stocks, and for genomic analysis of ancestral varieties of corn to gain insight into what might improve the genotype of corn for food crop production. Corn is the most studied plant around the globe.
Another area of research is simulating and analyzing climatic changes across the U.S. Midwest. Scientists are interested in weather changes related to known issues, such as global warming, but are also looking at how changes in land use and farming methods over 100 years might be a factor in the Midwest’s shift to a wetter climate over the last few decades. The researchers hope to inform agricultural practice and policymaking in the face of changing weather patterns. These kinds of simulations demand a level of I/O throughput from data sets that only a parallel file system like Lustre can serve. Together, proofs of concept that converge Big Data and HPC clusters and stretch Lustre beyond its historical uses, along with cloud, commercial, and academic deployments, are building further momentum for Lustre across the evolving marketplace.
INTEL TO BUILD MOMENTUM BY ENHANCING OPPORTUNITIES FOR CHANNEL PARTNERS
To further extend its reach and accelerate Lustre adoption, Intel is developing an entirely new world-class Intel Lustre Solutions Reseller Channel Program. According to Gorda, the goals of this program are to develop an elite worldwide network of resellers with the best-trained and most technically competent staff; to deliver the most current, standardized, technically comprehensive, and efficient web-based progression of modular curriculum and testing in the parallel file systems reseller arena; and to deliver compelling incentives and unparalleled preferential access to Intel Solutions for Lustre Software staff, resources, support, and information. “We are going to bring Lustre resellers the very best information and tools to further their work and continue the momentum we are experiencing with Lustre adoption,” he said.
INTEL ADVANCING LUSTRE INNOVATION THROUGH INTEL PARALLEL COMPUTING CENTERS GRANTS
Intel Parallel Computing Centers are universities, institutions, and labs that are leaders in their field. The primary focus of the Centers is to modernize applications to increase parallelism and scalability through optimizations that leverage cores, caches, threads, and vector capabilities of microprocessors and coprocessors. A new program includes supporting independent developer innovations for Lustre through grants funded by Intel. The goals of the program include the following:
- Fund universities, labs, and institutions to perform Lustre development.
- Provide funding via grant proposals that get new talent involved in developing features that the community cares about.
- Require that the resulting work end up in the community tree for the benefit of the community.
- Seek a broader development community advancing Lustre.
The recipient of the first approved grant is André Brinkmann of Johannes Gutenberg University, Mainz, Germany, for his proposal “Lustre QoS: Network Request Scheduler and Monitoring Revisited.”
LUSTRE LEADING THE CHARGE IN EXASCALE COMPUTING
Of course, Lustre continues to be a key component of the next generation of exascale supercomputers. To that end, Lustre has taken its place alongside other Intel technologies in the Intel scalable system framework, a flexible blueprint for developing high-performance, balanced, power-efficient, and reliable systems capable of supporting both compute- and data-intensive workloads. “To create the next generation of highly efficient supercomputers, we need to make sure the right ingredients are combined in precisely the right way,” stated Al Gara, Intel Fellow and Chief Exascale Architect for Intel’s Technical Computing Group. “To do that, we must look at things from a holistic view, from a total system perspective. And out of that comes the definition and the development of those ingredients.” Intel Enterprise Edition for Lustre is a key element of this scalable system framework, which also includes next-generation Intel Xeon processors and Intel Xeon Phi coprocessors, Intel Omni-Path fabric, silicon photonics, and innovative memory technologies, along with the ability to efficiently integrate them into a broad spectrum of system solutions.
The new Aurora system, being designed and built for Argonne National Laboratory under the Collaboration of Oak Ridge, Argonne, and Lawrence Livermore (CORAL) initiative, is designed around the Intel scalable system framework, and will include the Intel Enterprise Edition for Lustre software. According to Gara, the Intel scalable system framework helps OEMs, SIs, and end users to break through the performance wall that has challenged exascale computing. “The framework will help drive even more momentum behind Lustre, offering both our channel partners and end users an efficient blueprint for building out their next-generation clusters for the important work they want to do,” added Costelow.
NO END IN SIGHT
“Lustre momentum is growing,” said Intel’s Gorda. “And we will continue to strongly press the applications of Lustre throughout the HPC community, whether in the labs, in academia, or the enterprise,” he added. “We think Lustre yet has a lot of opportunity to enhance HPC.”
© HPC Today 2021 - All rights reserved.