VOLTAGEGPU

NVIDIA H200 141GB

The H100 successor with 76% more VRAM. 141GB HBM3e for frontier models and research.

  • From … / GPU / hour
  • Available now
  • <60s deploy time
  • 1x+ multi-GPU

Technical Specifications

  • GPU Memory: 141 GB HBM3e
  • Memory Bandwidth: 4,800 GB/s
  • CUDA Cores: 16,896 (Hopper)
  • Tensor Cores: 528 (4th Gen)
  • FP32: 67 TFLOPS (single precision)
  • Tensor Performance: 3,958 TFLOPS (FP8, mixed precision)
  • NVLink: 900 GB/s (GPU interconnect)
  • Host Interface: PCIe 5.0
141GB HBM3e Memory · 4,800 GB/s Bandwidth · Transformer Engine · FP8 Precision
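A quick way to sanity-check whether a model fits in the H200's 141 GB is to estimate weight memory from parameter count and precision. A rough sketch, assuming example model sizes and ignoring activations, KV cache, and framework overhead:

```python
# Rough estimate of weight memory vs. H200 capacity.
# Ignores activations, KV cache, and framework overhead.

H200_VRAM_GB = 141

BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "fp8": 1}

def weight_memory_gb(num_params: float, precision: str) -> float:
    """Memory needed just for the weights, in GB (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9

for params, name in [(70e9, "70B"), (405e9, "405B")]:
    for prec in ("fp16", "fp8"):
        need = weight_memory_gb(params, prec)
        fits = "fits" if need <= H200_VRAM_GB else "needs multi-GPU"
        print(f"{name} @ {prec}: {need:.0f} GB -> {fits} on one H200")
```

By this estimate a 70B model in FP16 (140 GB) just squeezes onto a single H200, while a 405B model needs multiple GPUs even at FP8.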

Ideal Use Cases

Frontier LLM Training

  • GPT-4 class models
  • LLaMA 405B
  • Mixtral 8x22B

Large-Scale Inference

  • High-throughput serving
  • Production chatbots
  • Multi-modal models

Scientific Research

  • Drug discovery
  • Climate simulation
  • Genomics at scale

Performance Comparison

H200 141GB (this GPU):
  • VRAM: 141 GB HBM3e
  • Bandwidth: 4,800 GB/s
  • FP32: 67 TFLOPS
  • Tensor: 3,958 TFLOPS

H100 80GB:
  • VRAM: 80 GB HBM3
  • Bandwidth: 3,350 GB/s
  • FP32: 67 TFLOPS
  • Tensor: 3,958 TFLOPS

A100 80GB:
  • VRAM: 80 GB HBM2e
  • Bandwidth: 1,555 GB/s
  • FP32: 19.5 TFLOPS
  • Tensor: 312 TFLOPS
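The bandwidth column matters most for LLM inference: small-batch autoregressive decoding is typically memory-bandwidth bound, since each generated token must stream roughly the full weights from memory. A hedged upper-bound sketch using the figures above (the 140 GB weight size is an assumption, e.g. a 70B model in FP16):

```python
# Upper-bound decode throughput (tokens/s) for a bandwidth-bound model.
# Very rough: ignores KV cache reads, kernel efficiency, and batching.

GPUS = {"H200": 4800, "H100": 3350, "A100": 1555}  # GB/s, from the table above

def max_tokens_per_s(bandwidth_gb_s: float, weight_gb: float) -> float:
    # Each token requires one full pass over the weights.
    return bandwidth_gb_s / weight_gb

WEIGHT_GB = 140  # assumed: ~70B parameters in FP16

for name, bw in GPUS.items():
    print(f"{name}: <= {max_tokens_per_s(bw, WEIGHT_GB):.0f} tokens/s")
```

The ratios track the bandwidth column directly: by this bound the H200 decodes about 43% faster than the H100 and roughly 3x faster than the A100, even though H200 and H100 have identical compute.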

Multi-GPU Configurations

  • 2x: 282 GB VRAM, 134 TFLOPS FP32
  • 4x: 564 GB VRAM, 268 TFLOPS FP32
  • 8x: 1,128 GB VRAM, 536 TFLOPS FP32
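The aggregate figures above are just the single-GPU specs multiplied by the node size; a sketch of the arithmetic (real multi-GPU scaling depends on NVLink topology and workload):

```python
# Aggregate VRAM and FP32 compute for N x H200, assuming simple
# linear scaling of the per-GPU specs.

VRAM_GB = 141
FP32_TFLOPS = 67

for n in (2, 4, 8):
    print(f"{n}x: {n * VRAM_GB} GB VRAM, {n * FP32_TFLOPS} TFLOPS FP32")
```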

FAQ

How does the H200 compare to the H100?

76% more VRAM (141 GB vs 80 GB) and 43% more memory bandwidth (4,800 vs 3,350 GB/s). Same compute, much more memory for larger models.

Can I run LLaMA 405B on H200s?

Yes. With 8x H200 you get 1,128 GB of total VRAM, enough for LLaMA 405B inference without quantization.
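The claim checks out arithmetically: 405B parameters at 2 bytes each (BF16/FP16, i.e. unquantized) need about 810 GB for weights, well under the 1,128 GB of an 8x node, leaving headroom for KV cache and activations. A sketch:

```python
# Does LLaMA 405B fit unquantized on 8x H200?
PARAMS = 405e9
BYTES_PER_PARAM = 2        # BF16/FP16, i.e. no quantization
NODE_VRAM_GB = 8 * 141     # 1,128 GB

weights_gb = PARAMS * BYTES_PER_PARAM / 1e9   # 810 GB for weights alone
headroom_gb = NODE_VRAM_GB - weights_gb       # left for KV cache etc.
print(f"weights: {weights_gb:.0f} GB, headroom: {headroom_gb:.0f} GB")
```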

What is HBM3e?

HBM3e is the latest high-bandwidth memory technology, offering 4,800 GB/s of bandwidth per GPU, 43% faster than the HBM3 on the H100.

How does billing work?

VoltageGPU bills per second with no minimum commitment. Run for 5 minutes or 5 months; pay only for what you use.
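Per-second billing makes cost a straight multiplication. A sketch with a placeholder hourly rate (the page does not list a price, so the rate below is hypothetical):

```python
# Cost of a per-second-billed run. RATE_PER_HOUR is a placeholder,
# not VoltageGPU's actual price.
RATE_PER_HOUR = 3.50    # hypothetical $/GPU/hour
NUM_GPUS = 8
RUN_SECONDS = 5 * 60    # a 5-minute run, as in the text

cost = RATE_PER_HOUR / 3600 * RUN_SECONDS * NUM_GPUS
print(f"${cost:.2f}")   # billed for exactly the seconds used
```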


Ready to Deploy H200 141GB?

$5 free credit. No credit card required. Deploy in under 60 seconds.

99.9% Uptime · Per-second billing · Global network