Cheap tokens for AI agents
Agents run in loops and burn tokens around the clock. Point them at a drop-in OpenAI/Anthropic-compatible endpoint and cut the cost ~10×.
Built for token burn
Open-weight models on cost-efficient infrastructure at a fixed USD margin — roughly 10× less per token than OpenRouter. For a daemon making thousands of calls a day, that compounds fast.
Programmatic, drop-in access
Same OpenAI and Anthropic APIs, same models — just a different base URL and key. Pay-as-you-go from a USD balance, so a runaway loop can't run up a card. No subscriptions, no KYC.
Drop-in: change the base URL
# Point any OpenAI- or Anthropic-compatible agent at cheaptokens
export OPENAI_BASE_URL="https://api.cheaptokens.dev/v1"
export OPENAI_API_KEY="YOUR_CHEAPTOKENS_KEY"
# Anthropic-style agents:
export ANTHROPIC_BASE_URL="https://api.cheaptokens.dev" Same model, a fraction of the price
Per-million-token rates versus OpenRouter on the same model.
| Model | OpenRouter | cheaptokens | You save |
|---|---|---|---|
| MiniMax-M2.7 | $0.24 / $0.96 | $0.025 / $0.1 | ~10× cheaper |
| Kimi-K2.6 | $0.66 / $3.41 | $0.07 / $0.35 | ~9× cheaper |
Same models. USD per 1M tokens (input / output). OpenRouter shown for comparison. · Open cost calculator
FAQ
Why does agent token cost matter now?
Agentic harnesses pay full API rates after the 2026 subsidy wind-down, and autonomous loops burn tokens fast — fixed cheap per-token keeps that bounded.
How do I cap spend per agent?
Fund a USD balance; when it's spent the key pauses, so an agent can't overrun a card.
Will it fit my agent framework?
Any OpenAI- or Anthropic-compatible framework works by swapping the base URL.
Related use cases
Scale agents without the bill shock
Create an account, top up with USDT, and point your agents at cheaptokens.
Get started