NVIDIA BlueField DPUs on NVIDIA Quantum InfiniBand Networking Take Accelerated Computing to the Next Level

Steve Poole is among many researchers around the world harnessing the power of the network. The distinguished senior scientist at Los Alamos National Laboratory (LANL) envisions huge performance gains using accelerated computing that includes data processing units (DPUs) running on NVIDIA Quantum InfiniBand networks.

In Europe and the US, other HPC developers are devising ways to offload computing and communications work to DPUs. They are supercharging supercomputers with the Arm cores and accelerators inside NVIDIA BlueField-2 DPUs.

An open API for DPUs

Poole’s work is one part of an extensive, multi-year collaboration with NVIDIA that aims to speed computational multiphysics applications by up to 30x. It includes pioneering techniques in computational storage, pattern matching, and more, using BlueField and its NVIDIA DOCA software framework.

The efforts will also help better define OpenSNAPI, an application interface that anyone can use to take advantage of DPUs. Poole chairs the OpenSNAPI project on the Unified Communication Framework, a consortium enabling heterogeneous computing for HPC applications whose members include Arm, IBM, NVIDIA, US National Labs, and US universities.

“DPUs are an integral part of our overall solution and I see great potential in using DOCA and similar software packages in the near future,” said Poole.

Flash storage 10-30 times faster

LANL is already feeling the power of network computing, thanks to a DPU-powered storage system it created. The Accelerated Box of Flash (ABoF, pictured below) combines solid-state storage with DPU and InfiniBand accelerators to accelerate performance-critical parts of a Linux file system. It is up to 30 times faster than similar storage systems and is set to become a key component in LANL’s infrastructure.

A working prototype of the Accelerated Box of Flash. The hardware components are all standard, for easy adoption. Accelerators and storage devices sit in the U.2 slots in the front bays, and an internal PCIe (Peripheral Component Interconnect Express) slot hosts additional accelerator hardware.

ABoF makes possible “more scientific discoveries,” said Dominic Manno, a researcher at LANL, in a recent LANL blog. “Placing compute close to storage minimizes data movement and improves the efficiency of simulation and data analysis pipelines.”
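The data-movement savings Manno describes can be illustrated with a back-of-envelope model. The sketch below is hypothetical and not LANL’s actual code: it simply compares the bytes that cross the network when a selective scan runs on the host (the whole dataset travels) versus when the filter runs on a DPU sitting next to the storage (only matching records travel).

```python
def bytes_moved(dataset_bytes: int, selectivity: float, offload_to_dpu: bool) -> int:
    """Bytes crossing the network for a scan that keeps `selectivity`
    fraction of the data (illustrative model, not a benchmark)."""
    if offload_to_dpu:
        # Filter runs near the storage: only the matches travel.
        return int(dataset_bytes * selectivity)
    # Filter runs on the host: the entire dataset travels first.
    return dataset_bytes

TB = 10**12
host_path = bytes_moved(10 * TB, 0.01, offload_to_dpu=False)
dpu_path = bytes_moved(10 * TB, 0.01, offload_to_dpu=True)
print(host_path // dpu_path)  # 100x less data on the wire in this model
```

Real systems add overheads the model ignores (metadata, replication, DPU compute limits), but it captures why pushing even a simple filter or checksum into the storage path pays off as selectivity drops.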

An internal view of the ABoF chassis showing the connectivity of the NVMe SSDs (front of chassis) to the BlueField-2 DPUs. At the top right is a PCIe slot for an accelerator; this demo used an Eideticom NoLoad device.

DPUs in the enterprise

We’ve seen a good number of DPUs arrive in major enterprise data centers. VAST Data, for example, uses DPUs in its new Ceres all-flash storage nodes. And while it is a competing technology, Fungible has also turned to DPUs to create its disaggregated GPU and storage offering. Clearly we’re in the early stages, but network and system administrators should get up to speed on what DPUs can do to enable more efficient system and application delivery.
