NVIDIA DGX H100
The ultimate AI infrastructure system
A new era of performance with NVIDIA H100
The fourth-generation DGX AI appliance is built around the new Hopper architecture, delivering unprecedented performance in a single system and massive scalability through DGX POD and SuperPOD enterprise-scale infrastructures. The DGX H100 features eight H100 Tensor Core GPUs, each with 80GB of memory, providing up to 6x more performance than the previous generation of DGX appliances, and is supported by a wide range of NVIDIA AI software applications and expert support.
8x NVIDIA H100 GPUs WITH 640 GIGABYTES OF TOTAL GPU MEMORY
18x NVIDIA® NVLink® connections per GPU, 900 gigabytes per second of GPU-to-GPU bidirectional bandwidth
4x NVIDIA NVSWITCHES™
7.2 terabytes per second of bidirectional GPU-to-GPU bandwidth, 1.5X more than previous generation
8x NVIDIA CONNECTX®-7 and 2x NVIDIA BLUEFIELD® DPU 400 GIGABITS-PER-SECOND NETWORK INTERFACE
1 terabyte per second of peak bidirectional network bandwidth
DUAL x86 CPUs AND 2 TERABYTES OF SYSTEM MEMORY
Powerful CPUs for the most intensive AI jobs
30 TERABYTES NVME SSD
High speed storage for maximum performance
The Transformer Engine uses a combination of software and custom-designed hardware to accelerate the training and inference of transformer models, such as those used in large language models like BERT and GPT-3. It intelligently manages and dynamically switches between FP8 and FP16 calculations, automatically handling re-casting and scaling between the two precisions, speeding up large language models compared to the previous-generation Ampere architecture.
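The scale-then-cast idea behind mixed FP8/FP16 precision can be illustrated with a minimal, purely illustrative Python sketch. This is not NVIDIA's implementation; the function names and the coarse "fake cast" are our own stand-ins, and only the FP8 E4M3 maximum value (448) is taken from the published format.

```python
# Illustrative sketch (not NVIDIA's implementation) of how a Transformer
# Engine-style recipe scales tensors into FP8 range before casting.
FP8_E4M3_MAX = 448.0  # largest finite value in the FP8 E4M3 format

def compute_scale(tensor):
    """Pick a scaling factor so the tensor's largest magnitude maps
    onto the top of the representable FP8 range."""
    amax = max(abs(x) for x in tensor)
    return FP8_E4M3_MAX / amax if amax > 0 else 1.0

def fake_cast_fp8(tensor, scale):
    # Stand-in for a real FP8 cast: scale up, round coarsely, scale back.
    return [round(x * scale) / scale for x in tensor]

activations = [0.0013, -0.0721, 0.0402]
scale = compute_scale(activations)
recovered = fake_cast_fp8(activations, scale)
# Scaling first keeps small activations well inside FP8 range, so far
# less information is lost than when casting without a scale factor.
```

The real Transformer Engine tracks these scaling factors per tensor across training iterations and re-casts automatically, which is what the paragraph above describes.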
The H100 Tensor Core GPUs in the DGX H100 feature fourth-generation NVLink which provides 900GB/s bidirectional bandwidth between GPUs, over 7x the bandwidth of PCIe 5.0.
Building on the capabilities of NVLink and NVSwitch within the DGX H100, the new NVLink Switch System enables scaling of up to 32 DGX H100 appliances in a SuperPOD cluster with up to 57.6TB/s of aggregate bandwidth.
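As a quick sanity check, the per-GPU and per-system figures quoted above follow directly from the published per-link NVLink rate, assuming each fourth-generation NVLink link carries 50GB/s of bidirectional bandwidth:

```python
# Back-of-the-envelope check of the NVLink figures quoted above.
LINKS_PER_GPU = 18        # fourth-generation NVLink links per H100
GBPS_PER_LINK_BIDIR = 50  # GB/s bidirectional per link (25 GB/s each way)
GPUS_PER_SYSTEM = 8

per_gpu_bidir = LINKS_PER_GPU * GBPS_PER_LINK_BIDIR   # 900 GB/s per GPU
system_bidir = GPUS_PER_SYSTEM * per_gpu_bidir        # 7,200 GB/s = 7.2 TB/s
print(per_gpu_bidir, system_bidir)
```

This recovers the 900GB/s per-GPU and 7.2TB/s per-system numbers from the specification highlights.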
Previous-generation GPU accelerators did not support confidential computing: data was encrypted only at rest in storage or in transit across the network. Hopper is the first GPU architecture to include support for confidential computing, securing data from unauthorised access while it is being processed within the DGX H100. NVIDIA Confidential Computing provides hardware-based isolation for multiple MIG instances sharing a single H100 GPU, for single-user H100 GPUs, and between multiple H100 GPUs.
Multi-Instance GPU (MIG) expands the performance and value of each NVIDIA H100 GPU. MIG can partition the H100 GPU into as many as seven instances, each fully isolated with their own high-bandwidth memory, cache, and compute cores. Now administrators can support every workload, from the smallest to the largest, offering a right-sized GPU with guaranteed quality of service (QoS) for every job, optimising utilisation and extending the reach of accelerated computing resources to every user.
Expand GPU access to more users
With MIG, you can achieve up to 7X more GPU resources on a single H100 GPU. MIG gives researchers and developers more resources and flexibility than ever before.
Optimise GPU utilisation
MIG provides the flexibility to choose many different instance sizes, which allows provisioning of right-sized GPU instance for each workload, ultimately delivering optimal utilization and maximizing data center investment.
Run simultaneous mixed workloads
MIG enables inference, training, and high-performance computing (HPC) workloads to run at the same time on a single GPU with deterministic latency and throughput.
Up to 7 GPU instances in a single H100
Dedicated SM, Memory, L2 Cache, Bandwidth for hardware QoS & isolation
Simultaneous workload execution with guaranteed quality of service
All MIG instances run in parallel with predictable throughput & latency
Right-sized GPU allocation
Different sized MIG instances based on target workloads
Run any type of workload on a MIG instance
Diverse deployment environment
Supported on bare metal, Docker, Kubernetes and virtualised environments
Hardware-based isolation of individual MIG instances.
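The right-sizing described above can be made concrete with a small Python sketch. The helper below is our own (it is not an NVIDIA API), and the profile table is an illustrative subset of the published MIG profiles for an 80GB H100, each profile taking some of the GPU's 7 compute slices and 80GB of memory:

```python
# Hypothetical helper (our own names, not an NVIDIA API) showing how
# MIG profiles for an 80GB H100 carve up the GPU.
# Each profile maps to (compute slices out of 7, memory in GB).
MIG_PROFILES = {
    "1g.10gb": (1, 10),
    "2g.20gb": (2, 20),
    "3g.40gb": (3, 40),
    "4g.40gb": (4, 40),
    "7g.80gb": (7, 80),
}

def partition_fits(profiles):
    """Check whether a requested mix of MIG instances fits within one
    H100's 7 compute slices and 80 GB of memory."""
    slices = sum(MIG_PROFILES[p][0] for p in profiles)
    memory = sum(MIG_PROFILES[p][1] for p in profiles)
    return slices <= 7 and memory <= 80

# Seven small instances, one per user:
partition_fits(["1g.10gb"] * 7)                       # fits
# A training job alongside two inference jobs:
partition_fits(["3g.40gb", "2g.20gb", "2g.20gb"])     # fits
```

In practice administrators create and manage these instances with NVIDIA's MIG tooling; the sketch only shows the capacity arithmetic behind "up to 7 instances with guaranteed QoS".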
Dynamic programming is a popular programming technique that solves complex problems by breaking them down into simpler subproblems, using recursion and memoisation to avoid recomputing results. Traditionally these tasks were run on CPUs or FPGAs, but the Hopper architecture introduces new DPX instructions, enabling the GPU to offload these computationally intensive algorithms and boosting performance by up to 7x.
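As a CPU-side illustration of the recursion-plus-memoisation pattern that DPX accelerates (a toy example of ours, not NVIDIA code), here is a memoised Levenshtein edit distance, a classic dynamic-programming problem of the kind DPX targets:

```python
from functools import lru_cache

# Levenshtein edit distance via recursion + memoisation: the same
# dynamic-programming pattern the Hopper DPX instructions offload
# to the GPU for workloads such as genomics sequence alignment.
def edit_distance(a, b):
    @lru_cache(maxsize=None)
    def dist(i, j):
        # Base cases: transforming to/from an empty prefix.
        if i == 0:
            return j
        if j == 0:
            return i
        cost = 0 if a[i - 1] == b[j - 1] else 1
        return min(
            dist(i - 1, j) + 1,          # deletion
            dist(i, j - 1) + 1,          # insertion
            dist(i - 1, j - 1) + cost,   # substitution
        )
    return dist(len(a), len(b))

edit_distance("kitten", "sitting")  # 3
```

The memoisation cache is what turns an exponential recursion into a polynomial-time computation; DPX provides hardware support for the inner min/add operations at the heart of such algorithms.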
NGC provides researchers and data scientists with simple access to a comprehensive catalogue of GPU-optimised software tools for deep learning and high-performance computing (HPC) that take full advantage of NVIDIA GPUs. The NGC container registry features NVIDIA-tuned, tested, certified, and maintained containers for the top deep learning frameworks. It also offers third-party managed HPC application containers, NVIDIA HPC visualisation containers, and partner applications. Find out more
As an end-to-end AI solution provider, Scan can deliver complete AI clusters featuring NVIDIA DGX AI appliances, certified storage platforms and networks in the form of DGX BasePOD and SuperPOD. These NVIDIA reference architectures push performance even further by adding either NVIDIA Base Command orchestration software or NVIDIA Unified Fabric Manager respectively. Find out more
Deep learning appliances such as the DGX H100 only work as intended if the GPU accelerators are fed data consistently and rapidly enough to sustain maximum utilisation. Scan offers a wide range of AI-optimised storage appliances suitable for deployment with the DGX H100. Find out more
Run:ai Atlas combines GPU resources into a virtual pool and enables workloads to be scheduled by user or project across the available resources. By pooling resources and applying an advanced scheduling mechanism to data science workflows, Run:ai greatly increases the ability to fully utilise all available resources, effectively creating a near-unlimited pool of compute. Data scientists can increase the number of experiments they run, speed time to results and ultimately meet the business goals of their AI initiatives. Find out more
Protect your Deep Learning Investment
NVIDIA DGX systems are cutting-edge hardware solutions designed to accelerate your deep learning and AI workloads and projects. Keeping your systems in optimum condition is key to consistently achieving the rapid results you need. Each DGX appliance has a range of comprehensive support contracts covering both software updates and hardware components, coupled with a choice of media retention packages to further protect any sensitive data held in your DGX memory or SSDs. Learn more
NVIDIA DGX H100
GPUs: 8x NVIDIA H100 Tensor Core GPUs
GPU cores: 16,896 CUDA cores & 528 fourth-generation Tensor Cores per GPU
GPU memory: 80GB per GPU - 640GB total
CPUs: 2x Intel Xeon Platinum 8480C, total 112 cores / 224 threads
System memory: 2TB ECC Reg DDR5
OS storage: 2x 1.92TB NVMe SSDs
Data storage: 8x 3.84TB NVMe SSDs
Networking: 8x single-port NVIDIA ConnectX-7 400Gb/s InfiniBand/Ethernet; 2x dual-port NVIDIA ConnectX-7 DPUs each with 2x 400Gb/s InfiniBand/Ethernet
Software: DGX OS / Ubuntu Linux / Red Hat Enterprise Linux
Operating temperature range: 5ºC to 30ºC (41ºF to 86ºF)