Verbatim: Jack Wells, Director of Science – ORNL LCF
February 12, 2014

How is support for Titan's users organized in terms of expertise and tools?

The first thing to mention is the Scientific Computing Group, which consists of computational domain scientists. They bring a combination of domain science expertise and extensive experience with Titan and its architecture into partnerships with the science teams that are awarded large allocations on the machine.

This domain science expertise covers a wide range of computational approaches and disciplines, including astrophysics, biophysics, nuclear physics, chemistry, climate science, materials science, engineering, mathematics, visualization, etc. This model was put in place at the founding of our facility in 2004, about ten years ago, to provide advanced support and collaboration to projects so that our resources can be used effectively. This kind of scientific computing group is becoming a best practice at supercomputing centers in the US: for users to make effective use of these complex machines, you need this kind of advanced support.

We also have another group, called User Assistance and Outreach, that handles a wide variety of tasks: it sets up user accounts, manages the help desk that provides front-line technical support and triage of our users' needs, creates user portals and web pages, and creates and executes a training curriculum spanning a range of topics, such as introduction to supercomputing, accelerators, management techniques, etc. These two groups work hand in hand: the first has a role traditional for computing centers, and the second, the Scientific Computing Group, is less traditional, but we think it is essential for the work that we do.

OLCF provides a wide spectrum of tools to its users, with the aim of offering an environment that includes the widely used and familiar tools: many open-source tools and a certain number of commercial tools. For the latter, we work closely with the vendors to ensure support for jobs that run at the scale of Titan.

From the organizational point of view, we have a Programming Environment and Tools group that spans the whole Oak Ridge Leadership Computing Facility. We also have the Computer Science Research Group within the Computer Science and Mathematics Division, which carries out cutting-edge research in computer science. This arrangement ensures that cutting-edge computer science research and operational tool support are combined. For example, ORNL people in these groups are active in the OpenACC, OpenMP, and MPI standards organizations, pushing these standards and their implementations forward to better meet the needs of our users. We are working on better algorithms for collective communications, better MPI fault tolerance, and various techniques to facilitate porting codes to new platforms like Titan and beyond. These are some of the research activities funded by the computer science and mathematics program within DoE. These people also help us manage the development of these tools by working with our vendors. The vendors are an important part of the story here.
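
To make the directive-based porting style concrete, here is a minimal OpenACC sketch in C. It is not drawn from any OLCF code; it simply illustrates the incremental approach these standards enable, where a single pragma asks the compiler to offload an existing loop to the GPU:

    #include <stdio.h>

    #define N 1000000

    /* Without an OpenACC compiler the pragma is ignored and the loop runs
       on the CPU; with one (e.g. Cray's "cc -h acc" or PGI's "pgcc -acc"),
       the loop is offloaded to the GPU and the copy clauses describe the
       data movement between host and device memory. */
    int main(void)
    {
        static float x[N], y[N];
        const float a = 2.0f;

        for (int i = 0; i < N; i++) { x[i] = (float)i; y[i] = 1.0f; }

        #pragma acc parallel loop copyin(x) copy(y)
        for (int i = 0; i < N; i++)
            y[i] = a * x[i] + y[i];   /* saxpy: y <- a*x + y */

        printf("y[42] = %f\n", y[42]);
        return 0;
    }

The same source compiles and runs unchanged on a plain CPU toolchain, which is precisely the portability argument behind OpenACC and OpenMP directives.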

As part of the Titan budget, we have also contracted with several key tool vendors to ensure that these tools really work at the scale, and with the heterogeneous nature, of Titan. In particular, we have relationships with Allinea for the DDT debugger and with the Technical University of Dresden for the Vampir profiling suite. More recently, we have added the CAPS Entreprise team, which focuses on compilers for Titan's GPU accelerators and on the OpenACC language. This is in addition to the tools provided by Cray.

Four of the six SC13 Gordon Bell Prize finalists used Titan to run their simulations. Can you elaborate on one or two science challenges that Titan has already been or will be able to solve?

This is what it's all about: the outcomes. There is a paper in the Gordon Bell competition from a Swiss-American team: Peter Staar from ETH Zurich, and Thomas Maier and Thomas Schulthess at ORNL. They were performing superconductivity calculations on Titan at 15 PFlops, using the architecture effectively. Superconducting materials conduct electricity without resistance and therefore without loss, showing promise for important applications in power transmission and magnets. From the science point of view, this team solved the two-dimensional single-band Hubbard model in a completely converged fashion. They showed that this model correctly reproduces the phase diagram of high-temperature superconductivity in the cuprate-based superconductors, the high-temperature superconductors that contain copper. This model was proposed for high-Tc superconductors in 1987, just a year after those materials were discovered, and literally thousands of papers have been written on the subject over the last 26 years. This team has for the first time converged the solution, using a new, extremely effective numerical algorithm that scales very efficiently on Titan. Compared to a CPU-only machine (the Cray XE6, which has two CPUs per node where Titan has one CPU and one GPU), Titan completed the simulation 6 times faster and 7 times more energy-efficiently. We take that as a tremendous outcome.
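
For readers unfamiliar with it, the single-band Hubbard model referred to here is, in its standard textbook form (not quoted from the interview),

    H = -t \sum_{\langle i,j \rangle, \sigma} \left( c^{\dagger}_{i\sigma} c_{j\sigma} + \mathrm{h.c.} \right) + U \sum_{i} n_{i\uparrow} n_{i\downarrow}

where t is the hopping amplitude between neighboring lattice sites, U is the on-site Coulomb repulsion, and n_{i\sigma} = c^{\dagger}_{i\sigma} c_{i\sigma} is the number operator. The competition between the kinetic term t and the interaction term U is what makes fully converged solutions of this deceptively simple model so computationally demanding.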

The second topic I want to emphasize is a project led by Masako Yamada, a computational scientist at General Electric (GE) Global Research. GE Global Research has a science project focused on understanding how water droplets freeze on the surfaces of wind turbines operating in cold climates. Her goal is to understand the molecular mechanism behind the freezing of water droplets on these non-icing surfaces. To do so, in collaboration with Mike Brown at ORNL, Yamada performed a series of molecular dynamics simulations over a range of the problem parameters. These simulations are the largest and longest of their kind ever attempted, and they were able to replicate GE's experimental result: GE knew from experiments that the formation of ice is delayed on these non-icing surfaces, and that this effect becomes less prominent as the operating temperature decreases. Yamada reproduced this successfully, and GE is now in a position to use Titan to study the phenomenon and engineer solutions with a computational science approach in addition to its existing experimental capabilities. For that work, she received an IDC HPC Innovation Excellence Award at SC'13.

ORNL has strong industry partnership programs (with Boeing and Ford, for instance). What proportion of Titan's computing resources and computation time do these programs represent?

Our industrial partnership program does not assign a specific amount of time on Titan to these industrial projects. Instead, we work with our partners to ensure that they are effective in obtaining time on Titan through the same three pathways as any university- or laboratory-based research project. The first of these pathways is the INCITE (Innovative and Novel Computational Impact on Theory and Experiment) program, which we manage jointly with our sister facility at Argonne; it represents 60% of the time allocated on Titan. The second, called ALCC (ASCR Leadership Computing Challenge), is a program that our sponsor in Washington manages; that is 30% of the time. The remaining 10% is the Director's Discretionary allocation program that we run here in our center; I chair the internal committee that manages that time. INCITE and ALCC are annual calls for proposals, with projects beginning in January for INCITE and in July for ALCC. The Director's Discretionary program accepts proposals at any time during the year.

The Discretionary program exists for three main purposes. One is to allow people to obtain preliminary results for the INCITE and ALCC programs: these are very competitive programs, oversubscribed by a factor of three, so applicants need preliminary results to put together successful proposals. It is also used for outreach to new communities that do not currently use leadership computing, and that includes industry. The third case is that we can use some of our discretionary time for ORNL priorities. Otherwise, through all the other programs, ORNL scientists are on a level playing field with researchers in the US and around the world.
