2026-04-28

When 1M context actually beats agent retrieval

Long context isn't a substitute for retrieval. It's a substitute for some retrievals.

#context-window#retrieval#claude#agents

The temptation

Claude 4.7 has 1M context. The temptation is to dump everything in.

Don't. Or at least, don't always.

When 1M context wins

  • Single-document analysis: a 200K-word PDF, a meeting transcript, a full repo subdirectory. Retrieval over a single coherent doc adds latency for no quality.
  • Cross-references inside the same artifact: "find every place this function is called in this codebase" — the model SEES every callsite when you give it the whole file.
  • One-shot synthesis: "summarize these 20 emails into one digest" — way faster to drop them all in than to embed and retrieve.

When agent retrieval still wins

  • Decision memory: querying "what did we decide about X six months ago" across 200 wiki entries. The wiki is 5MB. You don't want to load 5MB into every prompt.
  • Pricing checks: "is this proposed pricing consistent with our canonical pricing" — load only the canonical, not the entire business-ops repo.
  • Cross-project: "did we solve this in another repo" — agents that scope to specific repos beat dumping everything.

The hybrid I run

For Struvo decision agents:

  1. Each agent has a scoped retriever — wiki only, code only, pricing only
  2. The retrieval pulls 5-10 relevant entries
  3. THOSE get loaded into the model context with the question
  4. The model answers from the loaded context

Total tokens per decision: about 15-30K instead of 1M+. Latency: 3-5 seconds instead of 30. Cost: about 1/30th.

The 1M is still useful — for the rare case where the agent needs to expand scope. But "always load 1M" is wasteful by default.

What you can steal

The decision is per-call:

  1. Single coherent artifact, one question? Long context.
  2. Many artifacts, one question? Retrieve, then context.
  3. Same question, many artifacts? Embed once, retrieve forever.

The mistake is using long context as a default. It's a tool, not a habit.