

Convergence is not correctness

An agentic loop retries until a verifier says done, so it converges on whatever passes the check. Whether that is correct is decided by the verifier, not by convergence. Watch refinement and reward-hacking pull against each other.

Sources: Loop Engineering

Bet 10 / Loop engineering

A loop always converges. That tells you nothing about correctness.

An agentic loop retries until a verifier says "done", so it converges on whatever passes the check, not on what you meant. Whether "done" is actually correct is set by the verifier, not by convergence. Two forces pull against each other: refinement makes the loop genuinely better, while optimizing against a weak verifier games it. Watch which one wins.

[Interactive widget: readouts for loops that converged (%), of those, correct (%), "done" but wrong (%), and a verdict; a chart of "converged (looks done)" and "correct given done (the truth)" over iterations; and the outcome of every loop at the budget.]

How this is computed

Computed exactly from the loop's stopping-time distribution. At each iteration t, a candidate is correct with probability pₜ = base + refinement·(t − 1); the verifier's false-accept rate is faₜ = fa₀ + gaming·(t − 1). A correct candidate passes with probability 1 − fr, where fr is the verifier's false-reject rate; a wrong one passes with probability faₜ. The loop stops on the first pass.
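The model above can be sketched in a few lines of Python. This is a minimal sketch, not the widget's actual code: the names `loop_outcome`, `LoopOutcome`, and `clamp01` are hypothetical, and clamping pₜ and faₜ to [0, 1] is an assumption added here so the linear ramps stay valid probabilities.

```python
from dataclasses import dataclass


def clamp01(x: float) -> float:
    """Keep a probability inside [0, 1] (an assumption, not stated in the text)."""
    return max(0.0, min(1.0, x))


@dataclass
class LoopOutcome:
    converged: float         # P(the loop stops with "done" within the budget)
    correct_and_done: float  # P(it stops on a correct candidate)
    wrong_and_done: float    # P(it stops on a false accept)

    @property
    def correct_given_done(self) -> float:
        """P(correct | done); defined as 0.0 if the loop can never stop."""
        return self.correct_and_done / self.converged if self.converged > 0 else 0.0


def loop_outcome(base: float, refinement: float, fa0: float, gaming: float,
                 false_reject: float, budget: int) -> LoopOutcome:
    """Exact outcome probabilities for a loop that stops on the first pass."""
    still_running = 1.0  # P(no candidate has passed before iteration t)
    correct_done = 0.0
    wrong_done = 0.0
    for t in range(1, budget + 1):
        p_t = clamp01(base + refinement * (t - 1))   # candidate is correct
        fa_t = clamp01(fa0 + gaming * (t - 1))       # verifier false-accept rate
        pass_correct = p_t * (1.0 - false_reject)    # correct and accepted
        pass_wrong = (1.0 - p_t) * fa_t              # wrong but accepted
        correct_done += still_running * pass_correct
        wrong_done += still_running * pass_wrong
        still_running *= 1.0 - (pass_correct + pass_wrong)
    return LoopOutcome(correct_done + wrong_done, correct_done, wrong_done)
```

With illustrative values such as `loop_outcome(0.3, 0.0, 0.2, 0.0, 0.1, budget)`, raising `budget` from 1 to 20 moves `converged` toward 1 while `correct_given_done` stays put; setting `gaming` above zero makes it fall instead.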

  • With refinement and gaming both at zero, P(correct | done) is the verifier's precision at the base rate: p(1−fr) / (p(1−fr) + (1−p)·fa), where p = base and fa = fa₀. It does not depend on how many times you loop.
  • More iterations raise convergence toward 1 and do nothing for precision. On a weak verifier, that means more loops end on a false accept.
  • Refinement lifts correctness over iterations; gaming collapses it. The verifier, not the loop, decides which wins.
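Why the budget drops out of the precision can be checked in a few lines. This is a sketch under the simplest case, refinement = gaming = 0, so p and fa are constant across iterations; the symbols q (per-iteration pass probability) and N (iteration budget) are introduced here for the derivation and assume q > 0.

```latex
\begin{aligned}
q &= p(1-\mathit{fr}) + (1-p)\,\mathit{fa} \\
P(\text{done by } N) &= \sum_{t=1}^{N} (1-q)^{t-1}\, q \;=\; 1 - (1-q)^{N} \\
P(\text{correct and done by } N) &= \sum_{t=1}^{N} (1-q)^{t-1}\, p(1-\mathit{fr})
  \;=\; \frac{p(1-\mathit{fr})}{q}\,\bigl(1 - (1-q)^{N}\bigr) \\
P(\text{correct} \mid \text{done by } N) &= \frac{p(1-\mathit{fr})}{q}
  \;=\; \frac{p(1-\mathit{fr})}{p(1-\mathit{fr}) + (1-p)\,\mathit{fa}}
\end{aligned}
```

The factor 1 − (1−q)^N is the convergence term: it grows toward 1 with N and cancels in the ratio, so only the verifier's rates remain. Once refinement or gaming is nonzero, pₜ and faₜ change with t and the cancellation no longer holds.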

The argument of Loop Engineering and Bet #10: the verifier is the part that actually decides.


Subhadip Mitra