Economics

What your coding agents really cost, and how to stop paying for it twice.

Agent spend is the runaway cost of the moment. Most of it is re-derivation: a thousand agents working out the same fix, the same setup, the same gotcha, at frontier-model prices, every week. Here is why it is so expensive, and what changes once the expensive thinking happens only once.

Learn it once on the frontier model. Run it cheap forever after.

Figuring a task out for the first time needs a cutting-edge model: full reasoning, retries, dead ends. Doing it again does not. Firmament stores the lesson the first run produces, so every run after can take a much cheaper model and still get it right. The first agent pays for the discovery; the rest of the team gets it served.

the same task, twice (illustrative)

First run · Claude Opus 4.8 · no memory

≈ $6.40

figures it out: full reasoning, retries, dead ends

the lesson is captured and stored in Firmament

Every run after · DeepSeek v4 Flash + the lesson

≈ $0.70

or Claude Haiku 4.5, or GPT-5.1 Codex Mini: whatever is cheap that quarter

same task · any vendor

11% of the cost

The price gap is real, and it is enormous.

The whole argument rests on one fact: a frontier agent model costs one to two orders of magnitude more per token than a capable cheap one. List output prices, per million tokens:

ModelInput /1MOutput /1M
Claude Opus$5.00$25.00
Claude Haiku$1.00$5.00
GPT-5 mini$0.25$2.00
DeepSeek V4-Flash$0.14$0.28

List API prices as of June 2026, from Anthropic, DeepSeek and OpenAI. Model names and rates move fast; the gap does not. Frontier output runs roughly 90× a cheap open model's and about 5× same-vendor Haiku.

Stop paying frontier prices for work you have already done.