ATS-5

ATS-5, the fifth Advanced Technology System (ATS) in the Advanced Simulation and Computing (ASC) program, will be critical to the success of NNSA missions. ATS-5 will support current and future simulation codes and will tackle some of the largest-scale 3D simulation workloads in support of the stockpile stewardship mission. These large-scale simulations are known as “hero”-class simulations, and ATS-5 will reduce their time-to-completion from months to days. Additionally, ATS-5 will be able to run several hero-class simulations simultaneously. Vastly reduced time-to-solution, combined with the ability to run multiple large-scale simulations at once, will dramatically improve NNSA’s ability and agility to manage the stockpile.

In 2027, Crossroads (ATS-3) will be nearing the end of its useful lifetime. ATS-5 will replace Crossroads and will be the first NNSA HPC system in the post-exascale era, providing a large portion of the simulation resources for the NNSA ASC tri-lab community of Lawrence Livermore National Laboratory (LLNL), Los Alamos National Laboratory (LANL), and Sandia National Laboratories (SNL).

ATS-5 will support a diverse set of applications: analysis of manufacturing defects; changes in material properties that result from using new materials with both current and new manufacturing techniques; analysis of the impact of combined environments; aging defects; and, potentially, alternate delivery vehicles. ATS-5 will also enable new and emerging workloads, including digital engineering and machine learning (ML).

Numerous architectural advancements in ATS‑5 system design will support the NNSA Office of Defense Programs mission needs. ATS-5 will be designed with the following architectural advancements and major project goals:

Overcoming the memory wall: continued memory bandwidth performance improvements for tri-lab applications
Improved efficiency: programmer productivity, energy usage, and increased processor utilization
Architectural diversity: ensuring that the high-performance computing ecosystem remains vibrant with multiple advanced technology solutions
Time-to-solution: advancing strong scaling to deliver major improvements in time-to-solution, the most pressing challenge for NNSA’s largest and most complex stockpile simulations

Achieving these goals will enable weapons designers, analysts, and computational scientists to make more routine use of today’s hero-class simulations in support of the stockpile stewardship certification and assessments to ensure that the Nation’s nuclear stockpile is safe, secure, and reliable.

ATS-5 Timeline

Draft Technical Specification - September 2023
RFP release - March 2024
Source selection - July 2024
Award NRE/Build contract - May 2025
System delivery - late 2026/early 2027
Acceptance - August/September 2027

Triad National Security, LLC (Triad), which manages and operates Los Alamos National Laboratory (LANL), is planning to release a Request for Proposal (RFP) for NNSA’s next-generation Advanced Technology System (ATS), ATS-5, to be delivered in the 2027 time frame.

On September 28, 2023, LANL posted the draft Technical Requirements document (current version 1.0) for the planned ATS-5 solicitation. Draft technical documents are posted in the links below and may be subject to change.

All comments, questions, etc. should be sent to ats5-rfp-comments@lanl.gov. Any information provided by industry to Triad is strictly voluntary, and the information obtained from responses to this notice may be used in the development of an acquisition strategy and future solicitation.

Assuring that real applications perform efficiently on ATS-5 is key to their success. A suite of benchmarks has been developed for Request for Proposal (RFP) response evaluation and system acceptance. These codes are representative of the workloads of the NNSA laboratories.

The benchmarks contained within this site represent a pre-RFP draft state and will change somewhat over the next few months. While we expect most of the changes to be additions and modifications, it is possible that we will remove benchmarks prior to the RFP.

To use these benchmarks, please refer to the ATS-5 benchmark documentation as well as the benchmark code repository.

Benchmark changes from Crossroads

The key differences between the Crossroads benchmarks and the ATS-5 benchmarks are summarized below:

Crossroads	ATS-5	Notes
Few GPU-ready benchmarks	All proxy benchmarks have GPU implementations.
System-level performance metric: Scalable System Improvement (SSI), a geometric mean of app FOMs. Use of single-node benchmarks for RFP.	Multi-node benchmarking for system acceptance based on RFP benchmarks, negotiated with vendor as part of SOW.	Attempting to limit multi-node benchmarking for RFP to communication (MPI) and IO (IOR). Expect responses to include multiple node configurations and the ability to compose them to meet our needs in a codesign partnership. Will use scaled single-node improvement to assess proposals (along with other factors) and SSI for acceptance.
Mini-apps + full-scale apps, some of which were export controlled.	Mini-apps only - all open source.
No Machine Learning.	ML training and inference included.	Focuses on material science workloads of relevance.
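
The table describes the system-level metric as a geometric mean of application figures of merit (FOMs). A minimal sketch of that arithmetic follows, assuming each benchmark contributes the ratio of a candidate FOM to a baseline FOM and that higher FOMs are better. The benchmark names and FOM values are hypothetical, and the official SSI definition may apply additional factors that are not modeled here.

```python
import math


def geometric_mean(values):
    """Geometric mean computed through logarithms to avoid overflow."""
    values = list(values)
    if not values:
        raise ValueError("need at least one value")
    if any(v <= 0 for v in values):
        raise ValueError("all values must be positive")
    return math.exp(sum(math.log(v) for v in values) / len(values))


def fom_improvements(baseline, candidate):
    """Per-benchmark ratio candidate FOM / baseline FOM (higher is better)."""
    if set(baseline) != set(candidate):
        raise KeyError("baseline and candidate must cover the same benchmarks")
    return {name: candidate[name] / baseline[name] for name in baseline}


# Hypothetical FOMs, for illustration only (not measured data).
baseline_foms = {"app_a": 1.0, "app_b": 2.0, "app_c": 4.0}
candidate_foms = {"app_a": 2.0, "app_b": 8.0, "app_c": 32.0}

ratios = fom_improvements(baseline_foms, candidate_foms)  # 2x, 4x, 8x
score = geometric_mean(ratios.values())
print(f"geometric-mean improvement: {score:.2f}x")
```

A geometric mean is a natural choice for ratios because a single very large speedup cannot dominate the score the way it would in an arithmetic mean.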

Benchmark Overview

Benchmark	Description	Language	Parallelism
Branson	Implicit Monte Carlo transport	C++	MPI+CUDA/HIP
AMG2023	AMG solver of sparse matrices using Hypre	C	MPI+CUDA/HIP/SYCL; OpenMP on CPU
MiniEM	Electro-Magnetics solver	C++	MPI+Kokkos
MLMD	ML training of an interatomic potential model using HIPYNN on VASP simulation data. ML inference using LAMMPS, Kokkos, and the HIPYNN-trained interatomic potential model.	Python, C++, C	MPI+CUDA/HIP
Parthenon-VIBE	Block structured AMR proxy using the Parthenon framework.	C++	MPI+Kokkos
Sparta	Direct Simulation Monte Carlo	C++	MPI+Kokkos
UMT	Deterministic (Sn) transport	Fortran	MPI+OpenMP and OpenMP Offload

Microbenchmark Overview

Benchmark	Description	Language	Parallelism	Multi-node
Stream	Streaming memory bandwidth test	C/Fortran	OpenMP	No
Spatter	Sparse memory bandwidth test driven by application memory access patterns.	C++	MPI+OpenMP/CUDA/OpenCL	No
OSU MPI + Sandia SMB message rate	MPI Performance Benchmarks	C++	MPI	Yes
DGEMM	Single node floating-point performance on matrix multiply.	C/Fortran	Various	No
DAXPY	Single node floating-point performance of a scaled vector plus a vector.	C/Fortran	Various	No
IOR	Performance testing of parallel file system using various interfaces and access patterns.	C	MPI	Yes
mdtest	Metadata benchmark that performs open/stat/close operations on files and directories.	C	MPI	Yes
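
The DAXPY entry above describes a scaled vector plus a vector, y = alpha*x + y. The sketch below illustrates that operation and the usual accounting of floating-point operations and memory traffic. It is not the ATS-5 benchmark code, which is listed as C/Fortran. The function names are hypothetical, and the byte count assumes 8-byte doubles with two reads and one write per element, ignoring cache effects and write-allocate traffic.

```python
import time

WORD_BYTES = 8  # assumed size of one double-precision value


def daxpy(alpha, x, y):
    """Update y in place so that y[i] = alpha * x[i] + y[i]."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    for i in range(len(x)):
        y[i] = alpha * x[i] + y[i]
    return y


def daxpy_flops(n):
    """One multiply and one add per element."""
    return 2 * n


def daxpy_bytes(n, word_bytes=WORD_BYTES):
    """Read x[i], read y[i], write y[i] for each element."""
    return 3 * n * word_bytes


def time_daxpy(n, alpha=2.0):
    """Time one pass; interpreter overhead dominates in pure Python."""
    x = [1.0] * n
    y = [0.5] * n
    start = time.perf_counter()
    daxpy(alpha, x, y)
    elapsed = time.perf_counter() - start
    return {"seconds": elapsed, "flops": daxpy_flops(n), "bytes": daxpy_bytes(n)}


result = daxpy(2.0, [1.0, 2.0, 3.0], [10.0, 20.0, 30.0])
intensity = daxpy_flops(1000) / daxpy_bytes(1000)  # flops per byte
print(result, f"arithmetic intensity = {intensity:.4f} flop/byte")
```

Under these assumptions the kernel performs 2 flops for every 24 bytes moved, so a compiled DAXPY is typically limited by memory bandwidth and not by floating-point throughput. That is the reason it is usually read alongside a compute-bound kernel such as DGEMM.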