The cheapest H100 isn't useful if you can't actually get one.
Most GPU comparison sites stop at price. We've been asking a different question: how often is the GPU you want actually available to rent? For the past week we've polled 15 provider APIs every hour and stored the results, and we now have enough data to publish a reliability score per provider.
Below is the live picture for the H100 (1 GPU). The reliability column is the percentage of the last 7 days of polls where each provider reported that GPU as available.
| Provider | Cheapest H100 /hr | 7d reliability | Hours covered |
|---|---|---|---|
| Cudo | $1.82 | 100% | 596 |
| PrimeIntellect | $1.90 | 100% | 596 |
| Digital Ocean | $6.74 | 100% | 149 |
| Theta EdgeCloud | $2.29 | 100% | 149 |
| Scaleway | $2.52 | 100% | 149 |
| Hyperstack | $1.90 | 90% | 168 |
| RunPod | $2.39 | 50% | 4,032 |
| Shadeform | $1.90 | 41% | 1,815 |
| Lambda | $2.86 | 23% | 6,048 |
| Vast | $1.80 | 13% | 840 |
| Hyperbolic | $1.29 | — | — |
| Seeweb | $1.89 | — | — |
| Verda | $2.29 | — | — |
| Nebius | $2.95 | — | 20 |
| Google Cloud | $4.96 | — | — |
| AWS | $6.88 | — | — |
| Azure | $6.98 | — | — |
What "reliability" means here
Every hour our cron hits each provider's availability API. We record the answer (available, limited, unavailable) along with a timestamp, and we deduplicate the history — a row is only stored when status or stock changes, not on every poll. Each row marks the start of an interval that holds until the next row (or now).
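The store-on-change step can be sketched like this. A minimal sketch: the `AvailabilityRecord` shape and the `recordPoll` helper are illustrative assumptions, not the site's actual code.

```typescript
// Dedup-on-change storage: a row is appended only when status changes.
// Unchanged polls extend the current interval implicitly, since each row
// marks the start of an interval that holds until the next row (or now).
type Status = "available" | "limited" | "unavailable";

interface AvailabilityRecord {
  provider: string;
  status: Status;
  timestamp: number; // ms since epoch; start of an interval
}

function recordPoll(
  history: AvailabilityRecord[],
  provider: string,
  status: Status,
  timestamp: number
): AvailabilityRecord[] {
  const last = history[history.length - 1];
  if (last && last.status === status) return history; // no change: skip
  return [...history, { provider, status, timestamp }];
}
```

Three hourly polls of available, available, unavailable would therefore store only two rows.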
Reliability is the duration-weighted ratio: total time within the 7-day window where status was available, divided by total time covered. So a provider that's available for 6 days then drops for 1 day reads as ~86%, not 50%.
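The interval math above can be written down directly. A sketch under assumed field names; the row shape mirrors the stored history described earlier.

```typescript
// Duration-weighted reliability: time marked "available" divided by total
// covered time. Each row starts an interval lasting until the next row
// (or the end of the window).
function reliability(
  rows: { status: string; timestamp: number }[],
  windowEnd: number
): number {
  let availableMs = 0;
  let coveredMs = 0;
  for (let i = 0; i < rows.length; i++) {
    const start = rows[i].timestamp;
    const end = i + 1 < rows.length ? rows[i + 1].timestamp : windowEnd;
    const dur = Math.max(0, end - start);
    coveredMs += dur;
    if (rows[i].status === "available") availableMs += dur;
  }
  return coveredMs === 0 ? NaN : availableMs / coveredMs;
}

// Worked example from the text: available 6 days, then down 1 day.
const day = 24 * 3600 * 1000;
const rows = [
  { status: "available", timestamp: 0 },
  { status: "unavailable", timestamp: 6 * day },
];
// reliability(rows, 7 * day) → 6/7 ≈ 0.857, not 0.5
```

The key point is that only two rows exist for that week, but the first one carries six days of weight.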
Some details that matter:
- 7-day window. Long enough to smooth out a single bad afternoon, short enough to reflect current state. We'll publish 30-day numbers when we have 30 days of data.
- Per-region aggregation. A provider with H100s in 3 regions averages duration across all 3. If only one region usually has stock, the score reflects that — you'd still see the badge as "available" on the provider page because somewhere you can launch one.
- Coverage floor of 48 hours. Below that we show "—" instead of a number. Two days of coverage is the minimum to draw any conclusion, and providers that started reporting late in the window or have very flat history may still fall below it.
- "available" ≠ guaranteed launch. It means the API said the SKU was in stock during that interval. Hyperscalers like AWS/GCP have no public availability API at all (the "—" rows above), so we genuinely have no signal there.
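The per-region aggregation and the coverage floor combine into one scoring step. A hypothetical sketch: `RegionStats` and `providerScore` are made-up names, and the per-region rollup (summing durations, which is a duration-weighted average) is an assumption about the implementation.

```typescript
// Roll up per-region durations into one provider score, showing "—"
// when total covered time is under the 48-hour floor.
const FLOOR_MS = 48 * 3600 * 1000;

interface RegionStats {
  availableMs: number; // time the region reported "available"
  coveredMs: number;   // total time the region had any data
}

function providerScore(regions: RegionStats[]): string {
  const availableMs = regions.reduce((s, r) => s + r.availableMs, 0);
  const coveredMs = regions.reduce((s, r) => s + r.coveredMs, 0);
  if (coveredMs < FLOOR_MS) return "—"; // not enough coverage to score
  return Math.round((100 * availableMs) / coveredMs) + "%";
}
```

Summing durations before dividing means a region with ten times the coverage contributes ten times the weight, which matches the duration-weighted definition above.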
Why the spread is so wide
The numbers tend to cluster into three bands:
Near-100% reliability is what you see from providers who keep large fleets and don't oversell. They have headroom — even at peak hours, plenty of H100s sit idle. You pay for that with a higher hourly rate, but the GPU is there when you need it.
The middle band is the marketplace tier. RunPod and Shadeform aggregate compute from many smaller hosts, so stock fluctuates with demand and host churn. The price is great when it's there; at peak you may need a second-choice region or an hour's wait.
Under 50% typically means a small fleet that sells out fast, or a brand-new GPU SKU that providers are still scaling up. Worth checking back later.
A provider showing — isn't necessarily unreliable — it just means they don't expose an availability API. AWS, Google Cloud, Azure, and the niche regional clouds all fall in this bucket. The published price is real; whether you can actually launch the SKU you want at any given moment is opaque.
What we do with this data
Every row of every provider page on this site now shows its reliability score directly under the availability badge. So when you're comparing the H100 across providers, you see both the price and how often each provider has actually had it in stock the past week. Same for the H200, A100, L40S and every other GPU we track.
Reliability data is also wired into the /gpu/[model] detail pages as a 7-day heatmap — one cell per provider per day, colored by the percentage of polls that returned available that day. You can spot patterns like "this provider runs out every Sunday" or "this region is consistently green."
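One heatmap cell is just the share of that day's polls that came back available. A sketch with an assumed poll shape and a hypothetical `dayCell` helper:

```typescript
// One cell per provider per day: fraction of polls in [dayStart, dayStart+24h)
// that returned "available". Returns null when the day has no data,
// which renders as an empty cell.
function dayCell(
  polls: { status: string; timestamp: number }[],
  dayStart: number
): number | null {
  const dayEnd = dayStart + 24 * 3600 * 1000;
  const inDay = polls.filter(p => p.timestamp >= dayStart && p.timestamp < dayEnd);
  if (inDay.length === 0) return null;
  const ok = inDay.filter(p => p.status === "available").length;
  return ok / inDay.length;
}
```

Note the heatmap is poll-count based per day, unlike the headline score, which is duration-weighted over the whole window.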
If you're building a workload that needs to actually run, the reliability score is the number you should prioritize after price.
Caveats and what's next
This is week-one data. The honest version of this post in three months will have:
- 30-day reliability scores. Our `AvailabilityRecord` history retention is currently 7 days. Bumping to 30 is a one-line config change once we're confident in the storage budget.
- Per-region scores on the provider page. Today we aggregate across regions; soon you'll be able to see "Nebius eu-north1 was 100% reliable, us-central1 was 32%" separately.
- Hyperscaler best-effort. AWS/GCP/Azure don't expose stock APIs, but their console gives a rough "low / no capacity" signal in the launch flow. Whether to scrape that is an open question — clean signal but messy collection.
- Outage attribution. When reliability drops, we'd love to say why — was it actually the provider, or our poller? Distinguishing those is important.
Honest about today: a few providers have one or two flaky regions that drag their score lower than the live experience feels. Mid-week the polling cron also briefly used a stale endpoint for one provider; the data behind today's score is good but not perfect.
Sources
- All availability data comes from each provider's own public API. We hit one endpoint per provider every hour and store the raw response.
- Reliability scoring is computed from the stored history: each `AvailabilityRecord` row marks the start of an interval; reliability = duration where status was `available` ÷ total covered duration over the 7-day window. The same math is reused on every `/gpu/[model]` and `/providers/[name]` page.
- Pricing is sourced from the SkyPilot catalog plus direct provider APIs, same as everywhere else on the site.
- Methodology, limitations, and caveats: /about.
Got feedback or want a specific GPU added to the table? Email me or DM on LinkedIn.