Economics
Agent spend is the runaway cost of the moment. Most of it is re-derivation: a thousand agents working out the same fix, the same setup, the same gotcha, at frontier-model prices, every week. Here is why it is so expensive, and what changes once the expensive thinking happens only once.
Figuring a task out for the first time needs a cutting-edge model: full reasoning, retries, dead ends. Doing it again does not. Firmament stores the lesson the first run produces, so every run after can take a much cheaper model and still get it right. The first agent pays for the discovery; the rest of the team gets it served.
First run · Claude Opus 4.8 · no memory
≈ $6.40
figures it out: full reasoning, retries, dead ends
the lesson is captured and stored in Firmament
Every run after · DeepSeek v4 Flash + the lesson
≈ $0.70
or Claude Haiku 4.5, or GPT-5.1 Codex Mini: whatever is cheap that quarter
same task · any vendor
11% of the costThe whole argument rests on one fact: a frontier agent model costs one to two orders of magnitude more per token than a capable cheap one. List output prices, per million tokens:
| Model | Input /1M | Output /1M |
|---|---|---|
| Claude Opus | $5.00 | $25.00 |
| Claude Haiku | $1.00 | $5.00 |
| GPT-5 mini | $0.25 | $2.00 |
| DeepSeek V4-Flash | $0.14 | $0.28 |
List API prices as of June 2026, from Anthropic, DeepSeek and OpenAI. Model names and rates move fast; the gap does not. Frontier output runs roughly 90× a cheap open model's and about 5× same-vendor Haiku.