AI Platforms · Engineering · 12 April 2026

The economics of retrieval

Most teams underestimate how much retrieval quality, not model choice, drives the cost and reliability of an LLM product.

When a team tells us their AI product is “too expensive,” we almost never solve it by switching models. We solve it in the retrieval layer — by chunking better, ranking better, and admitting when retrieval has failed before we ever call the model.
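As a concrete illustration of “admitting when retrieval has failed”: a minimal sketch in Python of a gate in front of the model call. The `retrieve` and `generate` callables and the RELEVANCE_FLOOR threshold are assumptions made for the example, not any particular framework’s API.

```python
from dataclasses import dataclass

# Illustrative threshold: the minimum top-hit score we accept before
# spending tokens on a model call. The value is an assumption;
# calibrate it against your own data.
RELEVANCE_FLOOR = 0.35

@dataclass
class Chunk:
    text: str
    score: float  # ranker similarity, higher is better

def answer(question: str, retrieve, generate) -> str:
    """Call the model only when retrieval has actually succeeded."""
    chunks: list[Chunk] = retrieve(question, k=5)

    # Admit failure before the model call: if the best hit is weak,
    # tell the user so instead of letting the model bluff.
    if not chunks or chunks[0].score < RELEVANCE_FLOOR:
        return "We couldn't find relevant material for this question."

    context = "\n\n".join(c.text for c in chunks)
    return generate(question, context)
```

The threshold is where cost and trust trade off; we tune it against the refusal rate the product can live with.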

The cost of a bad retrieval is not the tokens. It is the user trust you spend on a hallucinated answer.

Three rules we keep returning to

  1. Measure retrieval quality on your own data. Public benchmarks are useful for picking a vendor, useless for picking a configuration. (A minimal measurement sketch follows this list.)
  2. Make retrieval failures legible. If the system can’t find relevant context, the UI must say so. Never paper over it with the model.
  3. Optimise the surface, not the score. A retrieval system with 88% recall and a great refusal UX outperforms one with 94% recall that bluffs.
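On the first rule, a few dozen labelled queries from your own corpus are enough to compare configurations. Below is a minimal recall@k sketch, assuming a `gold` mapping from query to human-judged relevant chunk IDs and a `retrieve` callable returning ranked IDs; both names describe your own stack, not a library.

```python
def recall_at_k(gold: dict[str, set[str]], retrieve, k: int = 5) -> float:
    """Fraction of queries with at least one relevant chunk in the top k.

    gold maps each query to the chunk IDs a human judged relevant;
    retrieve(query, k) returns the IDs of the top-k ranked chunks.
    """
    hits = sum(
        1
        for query, relevant in gold.items()
        if set(retrieve(query, k=k)) & relevant
    )
    return hits / len(gold) if gold else 0.0

# Rank configurations against each other on the same labelled set:
#   recall_at_k(gold, retrieve_v1) vs recall_at_k(gold, retrieve_v2)
```

The absolute number matters less than the comparison: the same labelled set, run against each candidate chunking and ranking configuration.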

We will write more in a future entry about the evaluation harness we use, including how we run retrieval, generation, and end-to-end suites independently.
