NVIDIA announces H100 Hopper GPU with up to 16896 FP32 cores, 80GB HBM3 memory, 700W of TDP

Published: Mar 22nd 2022, 15:58 GMT   Comments

« press release »

NVIDIA Hopper GPU Architecture Accelerates Dynamic Programming Up to 40x Using New DPX Instructions

Dynamic programming algorithms are used in healthcare, robotics, quantum computing, data science and more.

The NVIDIA Hopper GPU architecture unveiled today at GTC will accelerate dynamic programming — a problem-solving technique used in algorithms for genomics, quantum  computing, route optimization and more — by up to 40x with new DPX instructions.

An instruction set built into NVIDIA H100 GPUs, DPX will help developers write code to achieve speedups on dynamic programming algorithms in multiple industries, boosting workflows for disease diagnosis, quantum simulation, graph analytics and routing optimizations.

What Is Dynamic Programming?

Developed in the 1950s, dynamic programming is a popular technique for solving complex problems with two key techniques: recursion and memoization.

Recursion involves breaking a problem down into simpler sub-problems, saving time and computational effort. In memoization, the answers to these sub-problems — which are reused several times when solving the main problem — are stored. Memoization increases efficiency, so sub-problems don’t need to be recomputed when needed later on in the main problem.

DPX instructions accelerate dynamic programming algorithms by up to 7x on an NVIDIA H100 GPU, compared with NVIDIA Ampere architecture-based GPUs. In a node with four NVIDIA H100 GPUs, that acceleration can be boosted even further.

Use Cases Span Healthcare, Robotics, Quantum Computing, Data Science

Dynamic programming is commonly used in many optimization, data processing and omics algorithms. To date, most developers have run these kinds of algorithms on CPUs or FPGAs — but can unlock dramatic speedups using DPX instructions on NVIDIA Hopper GPUs.


Omics covers a range of biological fields including genomics (focused on DNA), proteomics (focused on proteins) and transcriptomics (focused on RNA). These fields, which inform the critical work of disease research and drug discovery, all rely on algorithmic analyses that can be sped up with DPX instructions.

For example, the Smith-Waterman and Needleman-Wunsch dynamic programming algorithms are used for DNA sequence alignment, protein classification and protein folding. Both use a scoring method to measure how well genetic sequences from different samples align.

Smith-Waterman produces highly accurate results, but takes more compute resources and time than other alignment methods. By using DPX instructions on a node with four NVIDIA H100 GPUs, scientists can speed this process 35x to achieve real-time processing, where the work of base calling and alignment takes place at the same rate as DNA sequencing.

This acceleration will help democratize genomic analysis in hospitals worldwide, bringing scientists closer to providing patients with personalized medicine.

Route Optimization

Finding the optimal route for multiple moving pieces is essential for autonomous robots moving through a dynamic warehouse, or even a sender transferring data to multiple receivers in a computer network.

To tackle this optimization problem, developers rely on Floyd-Warshall, a dynamic programming algorithm used to find the shortest distances between all pairs of destinations in a map or graph. In a server with four NVIDIA H100 GPUs, Floyd-Warshall acceleration is boosted 40x compared to a traditional dual-socket CPU-only server.

Paired with the NVIDIA cuOpt AI logistics software, this speedup in routing optimization could be used for real-time applications in factories, autonomous vehicles, or mapping and routing algorithms in abstract graphs.

NVIDIA DGX H100 systems, DGX PODs and DGX SuperPODs will be available from NVIDIA’s global partners starting in the third quarter.

Customers can also choose to deploy DGX systems at colocation facilities operated by NVIDIA DGX-Ready Data Center partners including Cyxtera, Digital Realty and Equinix IBX data centers.

NVIDIA Data-Center GPUs Specifications
VideoCardz.comNVIDIA H100NVIDIA A100NVIDIA Tesla V100NVIDIA Tesla P100
Die Size814 mm²828 mm²815 mm²610 mm²
Fabrication NodeTSMC N4TSMC N712nm FFN16nm FinFET+
GPU Clusters132/114*1088056
CUDA Cores16896/14592*691251203584
L2 Cache50MB40MB6MB4MB
Tensor Cores528/456*432320
Memory Bus5120-bit5120-bit4096-bit4096-bit
Memory Size80 GB HBM3/HBM2e*40/80GB HBM2e16/32 HBM216GB HBM2
InterfaceSXM5/*PCIe Gen5SXM4/PCIe Gen4SXM2/PCIe Gen3SXM/PCIe Gen3
Launch Year2022202020172016

« end of the press release »

Comment Policy
  1. Comments must be written in English and should not exceed 1000 characters.
  2. Comments deemed to be spam or solely promotional in nature will be deleted. Including a link to relevant content is permitted, but comments should be relevant to the post topic. Discussions about politics are not allowed on this website.
  3. Comments and usernames containing language or concepts that could be deemed offensive will be deleted.
  4. Comments complaining about the post subject or its source will be removed.
  5. A failure to comply with these rules will result in a warning and, in extreme cases, a ban. In addition, please note that comments that attack or harass an individual directly will result in a ban without warning.
  6. VideoCardz has never been sponsored by AMD, Intel, or NVIDIA. Users claiming otherwise will be banned.
  7. VideoCardz Moderating Team reserves the right to edit or delete any comments submitted to the site without notice.
  8. If you have any questions about the commenting policy, please let us know through the Contact Page.
Hide Comment Policy