Fine-tuning sits between training and inference: short bursts of GPU work interleaved with eval, dataset shuffling, and human review. Billing granularity (per-second vs per-hour), persistent storage, and dataset egress hit harder than the headline $/hr. Three picks across the axes.
Live pricing + 7-day reliability update on every page load. Curation refreshed manually.
QLoRA on a 70B model fits a single A100 80GB (full-precision LoRA at that scale does not). Region-aware availability across 3 DCs (CANADA-1, NORWAY-1, US-1), 100% data coverage in our 7-day window. Free egress.
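Why the 70B claim holds: a back-of-envelope VRAM estimate, assuming 4-bit base weights, fp16 LoRA adapters, fp32 Adam state on the adapters only, and an illustrative 10 GB of activation/CUDA overhead. The adapter size and overhead numbers are assumptions, not measurements.

```python
# Rough QLoRA VRAM estimate. Assumed knobs: ~200M LoRA params and ~10 GB
# of activation + framework overhead -- illustrative, not benchmarked.
def qlora_vram_gb(params_b: float, lora_params_m: float = 200.0,
                  overhead_gb: float = 10.0) -> float:
    base = params_b * 1e9 * 0.5 / 1e9           # 4-bit weights: 0.5 bytes/param
    adapters = lora_params_m * 1e6 * 2 / 1e9    # fp16 LoRA adapter weights
    optimizer = lora_params_m * 1e6 * 8 / 1e9   # Adam: two fp32 moments/param
    return base + adapters + optimizer + overhead_gb

print(round(qlora_vram_gb(70), 1))  # ~47 GB: inside one A100 80GB
```

Swap in full-precision LoRA (2 bytes/param for the base model) and the base term alone is 140 GB, which is why the single-card claim only works for QLoRA.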
Per-minute billing, persistent filesystems on 1-Click Clusters, snapshot workflow that doesn't require a manual EBS dance. You can stop a 4×H100 box overnight without losing your dataset cache.
Per-second billing + persistent volumes at $0.07/GB-mo. Iterative fine-tuners spend 30-50% of wall-clock time idle (debugging, eval, waiting for human review). On a 14-minute run, per-second billing charges for 14 minutes while per-hour billing charges for a full 60: a ~4× difference invisible in the listing price.
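The granularity math above, as a small sketch. The $2.50/hr rate is illustrative, not any provider's actual price; the only inputs that matter are run length and billing granularity.

```python
import math

# Cost of a short run under a given billing granularity (hypothetical rate).
def billed_cost(run_minutes: float, rate_per_hour: float,
                granularity_seconds: int) -> float:
    run_seconds = run_minutes * 60
    billed_units = math.ceil(run_seconds / granularity_seconds)
    return billed_units * granularity_seconds / 3600 * rate_per_hour

rate = 2.50  # $/hr, illustrative
per_second = billed_cost(14, rate, 1)      # 0.58: pay for 14 minutes
per_hour = billed_cost(14, rate, 3600)     # 2.50: pay for the full hour
print(f"${per_second:.2f} vs ${per_hour:.2f}")
```

The ratio is 60/14 ≈ 4.3×, and it compounds: twenty such runs a day is the difference between ~$12 and ~$50.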
Consumer cards (24 GB VRAM) force aggressive memory offload for anything past 7B QLoRA, roughly quintupling wall-clock time versus an A100 80GB that costs only 2× the hourly rate. Net loss on TCO every time.
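The TCO arithmetic, under the stated assumptions (5× slowdown from offload, A100 at 2× the consumer card's hourly rate). The $1.00/hr consumer rate and 10-hour job length are illustrative placeholders.

```python
# Job cost = baseline hours x slowdown factor x hourly rate.
def job_cost(base_hours: float, slowdown: float, rate_per_hour: float) -> float:
    return base_hours * slowdown * rate_per_hour

a100 = job_cost(10, 1.0, 2.00)       # 10h job, no offload, $2.00/hr -> $20
consumer = job_cost(10, 5.0, 1.00)   # same job, 5x slower, $1.00/hr -> $50
print(a100, consumer)
```

The cheaper card would need to be more than 5× cheaper per hour to break even, and no 24 GB consumer rental gets close to $0.40/hr against a $2.00/hr A100.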
Dataset egress is the silent killer. If your dataset lives in S3 and you fine-tune on Lambda, AWS charges egress every time you re-pull it. A 2 TB dataset runs ~$180 per pull at $0.09/GB. Co-locate compute with storage, or park your dataset in Cloudflare R2 ($0 egress) as a hub.
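A minimal egress calculator for the claim above, assuming flat $0.09/GB S3-style pricing and treating 2 TB as 2000 GB (decimal). Real S3 egress is tiered, so this is an upper-bound sketch, not a quote.

```python
# Egress cost: dataset size x per-GB rate x number of pulls.
def egress_cost(dataset_gb: float, rate_per_gb: float, pulls: int = 1) -> float:
    return dataset_gb * rate_per_gb * pulls

print(egress_cost(2000, 0.09))     # 180.0 -- one pull of a 2 TB dataset
print(egress_cost(2000, 0.09, 5))  # 900.0 -- five re-pulls across experiments
```

The multiplier is the point: $180 once is tolerable, but re-pulling for every experiment sweep turns it into a recurring line item that dwarfs the compute bill for short runs.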
Honesty about gaps beats false confidence. We add data as it becomes structurally available.