Skip to main content

Pricing

Token-based pricing in USD. You pay only for the tokens you send (input) and the tokens the model returns (output). No subscription, no minimum spend, no per-seat fees — the same rates apply to every account.

Per-million-token rates

ModelInputOutputCard
GLM 5.1 FP8$0.50 / 1M$2.00 / 1M/models/zai-org/GLM-5.1-FP8
Kimi K2.6$0.55 / 1M$2.20 / 1M/models/moonshotai/Kimi-K2.6
MiniMax M2.5$0.45 / 1M$1.80 / 1M/models/MiniMaxAI/MiniMax-M2.5
DeepSeek V4 Flash$0.35 / 1M$1.40 / 1M/models/deepseek-ai/DeepSeek-V4-Flash
Rates are per million tokens. A typical 1k-token chat turn costs a fraction of a cent. See each model card for context window, capabilities, and example prompts.

How billing works

Corvex uses prepaid credits. Top up once and the platform meters every request against your balance.
  • Buy credits in any amount once you sign in — credits land in your account immediately and never expire.
  • Auto-reload keeps your balance above a threshold you set, so production traffic doesn’t stall.
  • Usage pauses at $0 — requests return an insufficient_credits error instead of running up a surprise bill.
  • One invoice per top-up, downloadable from the dashboard for accounting.

Monthly spend caps

A workspace owner can set an optional monthly spend cap — a hard dollar ceiling (for example, $500/month) on top of your prepaid balance.
  • Set it once. The cap lives on the workspace. While usage stays under the cap, requests run normally.
  • Hard stop at the cap. When the month’s spend reaches the cap, inference requests return HTTP 402 with an insufficient_credits error until the cap resets or is raised. There is no soft/overage mode.
  • Resets monthly. The cap resets at the start of each calendar month (UTC).
  • Independent from token quotas. A spend cap is dollar-denominated and is separate from per-key token rate limits (which return 429). New keys are provisioned with no default token cap — usage is bounded by your prepaid balance and any spend cap. You can still set a per-key token limit explicitly; whichever limit is hit first applies.
  • Raise it anytime. Increasing the cap unblocks requests immediately — no waiting for the reset.

What’s included

Every account on the public catalog gets the same platform features at the rates above:
  • Every model in the table on Corvex’s shared inference fleet.
  • OpenAI-compatible /v1/chat/completions, /v1/embeddings, and /v1/models endpoints.
  • Per-key usage metering, rate limits, and structured error responses.
  • The Corvex playground and dashboard at app.corvex.ai.

FAQ

Do I need a contract to get started?

No. Sign in, top up, and call the API. The dashboard handles keys, usage, and invoices end-to-end.

Is there a free tier or free credits?

New accounts can claim starter credits after sign-up. See the dashboard for the current promotion.

What about steady, high-volume workloads?

Corvex Reserved Endpoints offer dedicated single-tenant GPU capacity at a fixed per-hour rate. Better unit economics once your traffic is predictable.

When does pricing change?

Rates on this page mirror the platform catalog at /v1/models. When a rate changes, the next API response and this page update together — the tier-0 drift gate below blocks any release where they disagree.
Pricing is sourced from the Metronome rate card (infrastructure/metronome/config.yaml) and kept in sync with /v1/models. A tier-0 drift gate fails CI if the docs page and the rate card disagree.
Ready to build? Sign up at app.corvex.ai to claim your starter credits. Need a model that isn’t listed? Tell us.