Hopper · Enterprise

Rent NVIDIA H200 141GB

Rent NVIDIA H200 141GB HBM3e 8-GPU clusters from $26.60/hr on the VoltageGPU cloud. 1.1 TB of total VRAM for large-scale LLM training, 200B+ model serving, and frontier AI research.

141 GB HBM3e per GPU (1.1 TB total)
4,800 GB/s memory bandwidth
8-GPU NVLink 4.0 topology
Train 200B+ parameter models

Starting from

$26.60/hr

~$638.40/day

~$19,152/month (24/7)

Deploy H200 141GB

Per-minute billing · No commitment

H200 141GB Technical Specifications

VRAM

8×141 GB HBM3e

Memory Type

HBM3e

Memory Bandwidth

4,800 GB/s

CUDA Cores

16,896

Tensor Cores

528

FP16 Performance

989.5 TFLOPS

FP32 Performance

67 TFLOPS

TDP

700W (SXM)

Architecture

Hopper

Interconnect

NVLink 4.0 / PCIe 5.0

Included Storage

1 TB NVMe SSD

vCPUs

48 vCPUs

System RAM

384 GB DDR5 ECC

Manufacturer

NVIDIA

H200 141GB Cloud Pricing

See how VoltageGPU compares to other cloud GPU providers.

Provider               Hourly Rate   Est. Monthly   vs VoltageGPU
VoltageGPU (You)       $26.60        $19,152        —
RunPod                 $29.90        $21,528        11% cheaper
Vast.ai                $28.50        $20,520        7% cheaper
Lambda                 $32.00        $23,040        17% cheaper
AWS (p5e equivalent)   $42.00        $30,240        37% cheaper

Competitor pricing sourced from public pages as of March 2026. Prices may vary.
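
The Est. Monthly column is derived directly from the hourly rate, assuming a 720-hour month (30 days at 24/7):

terminal
# Est. Monthly = hourly rate × 720 hours (30 days × 24 h)
awk 'BEGIN { printf "$%.0f/month\n", 26.60 * 720 }'   # -> $19152/month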

What Can You Do with the H200 141GB?

Popular workloads and use cases for NVIDIA H200 141GB cloud instances.

🏗️ Large-Scale LLM Training

Train 70B–200B parameter models with massive VRAM across 8 GPUs. The 141 GB per GPU (1.1 TB total) eliminates memory bottlenecks; a sample launch command follows these use cases.

LLM Inference at Scale

Serve 70B models unquantized or 400B+ models with quantization across the 8-GPU configuration for production inference.

🎥 Multi-Modal AI

Train and serve vision-language models, video generation models, and other multi-modal architectures that require massive memory.

🔬 Research Clusters

Dedicated compute for AI research labs. Combine multiple H200 nodes for frontier model development.
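
As a minimal sketch of the training launch referenced above, assuming the pytorch-2.2 template and your own FSDP training script (train_fsdp.py and its flag are placeholders, not a provided artifact), a single-node 8-GPU run starts with torchrun:

terminal
# One worker process per GPU; torchrun sets RANK/WORLD_SIZE for each worker.
# train_fsdp.py is your own FSDP or DDP training script (hypothetical).
torchrun --standalone --nproc_per_node=8 train_fsdp.py --bf16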

H200 141GB Performance Benchmarks

Relative performance scores across common workload categories (B200 = 100).

Training: 96/100
Inference: 95/100
Fine-Tuning: 97/100
Rendering: 72/100

Deploy H200 141GB via API

Programmatically launch an H200 141GB instance with a single API call.

terminal
curl -X POST https://api.voltagegpu.com/v1/pods \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "gpu": "h200-141gb",
    "gpu_count": 8,
    "template": "pytorch-2.2",
    "storage_gb": 1000,
    "name": "my-h200-cluster"
  }'
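
The create call returns the new pod's details. Assuming the API follows the usual REST convention of a matching GET route (the endpoint below is an assumption; check the VoltageGPU API docs for the exact path and response fields), you could poll the pod's status like this:

terminal
# Hypothetical status check -- endpoint shape assumed, not documented here
curl -H "Authorization: Bearer YOUR_API_KEY" \
  https://api.voltagegpu.com/v1/pods/YOUR_POD_ID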

H200 141GB — Frequently Asked Questions

What is the H200 and how does it differ from the H100?
The H200 uses the same Hopper architecture as the H100 but features HBM3e memory with 141 GB per GPU (vs 80 GB HBM3 on the H100) and 4,800 GB/s of bandwidth (vs 3,350 GB/s). With 76% more memory and 43% more bandwidth, the H200 is significantly better for large-model training and inference.
Is the H200 141GB a single GPU or a cluster?
Our H200 141GB offering is an 8-GPU server with 141 GB HBM3e per GPU (1,128 GB total VRAM). The 8 GPUs are connected via NVLink 4.0 for maximum bandwidth. Pricing shown ($26.60/hr) is for the full 8-GPU node.
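Once a node is running, you can verify the NVLink topology yourself with nvidia-smi: GPU pairs connected over NVLink appear as NV<n> entries (the number of links) in the matrix, rather than PCIe path codes such as PIX or SYS.

terminal
# Print the GPU-to-GPU interconnect topology matrix
nvidia-smi topo -m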
What models can run on the H200 cluster?
With over 1 TB of total VRAM, the H200 8-GPU cluster can train models up to 200B parameters in full precision, serve 400B+ quantized models, or run massive multi-modal models. It is the ideal platform for frontier AI research.
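A rough back-of-envelope check of those figures, counting only bytes per parameter for the weights (illustrative arithmetic, not a sizing guarantee; KV cache, activations, and optimizer state add on top):

terminal
# 200B params in BF16 (2 bytes/param) -> weights alone
awk 'BEGIN { printf "200B @ BF16:  %.0f GB\n", 200e9 * 2 / 1e9 }'    # 400 GB
# 400B params quantized to 4-bit (0.5 bytes/param)
awk 'BEGIN { printf "400B @ 4-bit: %.0f GB\n", 400e9 * 0.5 / 1e9 }'  # 200 GB
# Both fit comfortably within the cluster's 1,128 GB of total VRAM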
How does the H200 compare to the H100 for LLM inference?
The H200 delivers roughly 1.5–1.9× higher LLM inference throughput than the H100, primarily due to the larger HBM3e memory (allowing larger batch sizes and KV caches) and higher memory bandwidth. For serving 70B models, the H200 can handle significantly more concurrent users.
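As a concrete serving sketch: any inference engine with tensor parallelism can shard a 70B model across all 8 GPUs. With vLLM, for example (the model name is just an illustration; substitute your own checkpoint):

terminal
# Shard one 70B model across the 8 GPUs via tensor parallelism (vLLM)
pip install vllm
vllm serve meta-llama/Llama-3.1-70B-Instruct --tensor-parallel-size 8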

Start using the H200 141GB today

Deploy an H200 141GB instance in 30 seconds. No upfront costs, no long-term contracts. Per-minute billing starting at $26.60/hr.

Deploy H200 141GB Now