QUANTA
by Ryshe
Pricing

Start free. Pay a nominal fee as you scale.

Point your base URL at the gateway and bring your own provider key. You pay for the tokens you route through it, and the compression usually pays for itself.

Free

$0forever

For trying it on a side project.

  • 1M tokens processed / month
  • 1 API key
  • Drop-in base URL, bring your own provider key
  • Basic savings report (response headers)
  • Community support
Start free

Starter

Popular
$29/ month

For a production feature or two.

  • 25M tokens processed / month
  • 3 API keys
  • Savings dashboard with history
  • Conservative and aggressive compression
  • Email support
Start free, upgrade later

Pro

$99/ month

For teams running multiple workloads.

  • 100M tokens processed / month
  • Unlimited keys, team accounts
  • Per-workflow attribution and forecasting
  • Policy and redaction controls
  • Priority support
Start free, upgrade later

Enterprise

Customin your tenant

Azure-native, governed, deployed by Ryshe.

  • Unlimited volume
  • Deployed inside your Azure tenant
  • Entra ID, Key Vault, private endpoints
  • Audit logging and data retention controls
  • Managed operations and SLAs
Talk to us

Indicative tiers for launch. Token limits and prices may change before general availability.

Questions

How is usage measured?

By tokens processed through the gateway, the same unit your model provider bills you on. You only route the workloads you choose, and the savings typically more than cover the fee.

Do you store my provider keys?

No. In the default mode you bring your own provider key on each request and we forward with it. We never persist it.

Will compression change my answers?

The default compression is lossless: it normalizes whitespace and removes duplicate instructions and messages. Aggressive, reversible compression is opt-in and gated by evaluation.

Can I self-host instead?

Yes. The paid plans exist for the hosted dashboard, team accounts, governance, and support. If you want to run it yourself, the engine layer is built on open standards.