True-Util™ is the one pricing model behind every Kinesis service — whether the compute is ours, sourced from a partner, or yours. Pay only for the cycles your workloads actually consume, capped at the reserved rate. Idle capacity isn't wasted — it's monetized.
True-Util works the same way regardless of where the compute comes from. Pick the buying mode that matches your workload — the metering, caps, and telemetry are identical.
Multi-tenant compute on the Kinesis grid. You pay for the CPU and GPU cycles your workloads actually consume — and never more than the equivalent Reserved monthly rate. Spiky, variable, or hard-to-forecast workloads save the most.
Single-tenant compute, reserved monthly. Full control over the box, predictable billing, the same Kinesis orchestration and telemetry as Shared. For workloads where steady utilization is a given.
Run the Kinesis grid on compute you already own or rent — AWS savings plans, on-prem servers, your datacenter, your private cloud. Same product, same kernel-level orchestration, priced at 20% of True-Util Shared.
Traditional clouds bill wall-clock time on whatever you reserved — full price whether you used 5% of the box or 95%. True-Util inverts the math: Shared meters real cycles, Dedicated is priced for steady use, and On Your Compute is priced as a percentage of Shared. The more output you get from a processor, the more we both earn from it.
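To make the three buying modes concrete, here is a minimal sketch of how a monthly bill could be computed from the rules stated above. It is an illustration, not the Kinesis billing implementation: the `Plan` type, the function names, and the dollar amounts are hypothetical, and two points are assumptions rather than statements from this page: that Dedicated is billed at exactly the Reserved monthly rate, and that the 20% for On Your Compute applies to the capped Shared bill.

```python
from dataclasses import dataclass

# From the page: On Your Compute is priced at 20% of True-Util Shared.
OWN_COMPUTE_FRACTION = 0.20


@dataclass(frozen=True)
class Plan:
    # Hypothetical Reserved monthly rate in dollars; it acts as the cap.
    reserved_monthly_rate: float


def shared_bill(metered_cost: float, plan: Plan) -> float:
    """Shared: pay for metered cycles, never more than the Reserved rate."""
    if metered_cost < 0:
        raise ValueError("metered_cost must be non-negative")
    return min(metered_cost, plan.reserved_monthly_rate)


def dedicated_bill(plan: Plan) -> float:
    """Dedicated: assumed here to be the flat Reserved monthly rate."""
    return plan.reserved_monthly_rate


def own_compute_bill(metered_cost: float, plan: Plan) -> float:
    """On Your Compute: assumed to be 20% of the capped Shared bill."""
    return OWN_COMPUTE_FRACTION * shared_bill(metered_cost, plan)


if __name__ == "__main__":
    plan = Plan(reserved_monthly_rate=1000.0)  # made-up figure
    for metered in (150.0, 1400.0):
        print(
            f"metered=${metered:.2f}  "
            f"shared=${shared_bill(metered, plan):.2f}  "
            f"dedicated=${dedicated_bill(plan):.2f}  "
            f"own_compute=${own_compute_bill(metered, plan):.2f}"
        )
```

With these made-up numbers, a light month is billed at its metered cost, while a heavy month stops at the cap instead of growing with usage.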
Average cloud customer utilization is under 20%. The remaining 80% or more is what True-Util captures.
The low end is what you pay when the machine is idle (Shared) or unused (On Your Compute). The high end is the Reserved cap, which is what you'd pay on a traditional cloud regardless of usage.
28 vCPUs, 96 GB RAM per card · 1×, 2×, 4×, 8× configs
TRUE-UTIL SHARED OR DEDICATED
24 vCPUs, 96 GB RAM
TRUE-UTIL SHARED OR DEDICATED
4 vCPUs, 8 GB RAM · Spot-class for fault-tolerant work
TRUE-UTIL SHARED
Any hardware you own or rent · Raspberry Pi to H200 superclusters
YOUR HARDWARE, OUR SOFTWARE
The same workloads that cost the most on traditional clouds save the most on True-Util.
H100s sit idle between prompts. The bill is the same whether you served 100 requests or 100,000.
True-Util Shared meters inference time only. No queries, no cost. Bursty traffic caps at the Reserved rate.
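As a rough sketch of "no queries, no cost", the snippet below totals per-request inference time and applies the Reserved cap. The function name, the per-second rate, and the cap are hypothetical values chosen for illustration; the page does not publish the metering granularity or rates.

```python
from typing import Iterable


def inference_bill(
    request_durations_s: Iterable[float],
    rate_per_second: float,
    reserved_monthly_rate: float,
) -> float:
    """Bill only the seconds spent serving requests, capped at Reserved.

    rate_per_second and reserved_monthly_rate are hypothetical inputs.
    """
    billable_seconds = sum(request_durations_s)
    metered_cost = billable_seconds * rate_per_second
    return min(metered_cost, reserved_monthly_rate)


if __name__ == "__main__":
    quiet_month = []                     # no requests served
    bursty_month = [1.0] * 1_000_000     # one million one-second requests
    print(inference_bill(quiet_month, 0.002, 500.0))
    print(inference_bill(bursty_month, 0.002, 500.0))
```

An empty request log produces a zero bill, and the bursty month is billed at the cap rather than at its full metered cost.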
Overprovisioning for traffic that hasn’t shown up. Or worse: underprovisioning and falling over the first time it does.
Pay pennies at low traffic. Costs cap at the Reserved rate during spikes. Headroom without prepayment.
Staging servers run 24/7 to be ready, but burn nights and weekends.
True-Util drops the bill as activity drops. Same reservation, lower cost when the team’s asleep.
Reserved AWS instances, on-prem servers, donated lab GPUs — capacity already paid for, sitting underused.
Run the Kinesis grid on your hardware at 20% of Shared. Same orchestration, FinOps visibility, 80% less spend on what you already own.
$100 in free credit. No credit card required. Deploy your first container in under five minutes: bring a GitHub repo, a Dockerfile, or just describe what you want.