Activation probe cost

Sources: What runtime interpretability actually costs · Code


What does an activation probe actually cost?

On paper a probe is a rounding error: one dot product against a hidden state the forward pass already computed. So the "too expensive to run in production" folklore cannot be about the math. It is about the implementation, and the gap between a naive hook and a batched async readout is the whole story.

Model: the model size. The forward pass grows with parameters while the probe does not, so the arithmetic gap widens as the model grows.

Probe scope: read one layer or every layer. Even reading every layer keeps the arithmetic a tiny fraction of the forward pass.

Probes / layer: how many linear readouts per layer. A thousand readouts per token is still a fraction of a percent of the arithmetic.

Implementation: naive or batched. Naive is a Python forward hook that breaks CUDA graph capture and copies scores to CPU each step. Batched is one small in-graph matmul with async export.

Outputs: arithmetic overhead (%), memory (bytes) overhead (%), serving latency overhead (%), and a verdict.

Chart (log scale): where the latency actually goes, comparing the arithmetic floor with the two implementations.

How this is computed

Per generated token: forward pass ≈ 2 × params FLOPs; each linear readout is 2 × d_model FLOPs. The arithmetic floor is the ratio of total readout FLOPs to forward-pass FLOPs. The bytes version tells the same story: the probe weights for a thousand readouts are single-digit megabytes against the ~14 GB of model weights moved per token.
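As a rough check of the per-token arithmetic above, here is a minimal sketch in plain Python. The function name `probe_overhead` and the concrete configuration (7B parameters, d_model = 4096, 16-bit weights) are assumptions chosen for illustration. The source only states the ~14 GB figure, which happens to match 7B parameters at 2 bytes each.

```python
def probe_overhead(params, d_model, layers_probed, probes_per_layer, bytes_per_weight=2):
    """Return (flop_ratio, byte_ratio) of probe cost to forward-pass cost, per token."""
    readouts = layers_probed * probes_per_layer

    forward_flops = 2 * params             # forward pass ~ 2 x params FLOPs
    probe_flops = readouts * 2 * d_model   # each linear readout is 2 x d_model FLOPs

    model_bytes = params * bytes_per_weight               # weights moved per token
    probe_bytes = readouts * d_model * bytes_per_weight   # probe weights moved per token

    return probe_flops / forward_flops, probe_bytes / model_bytes


# Hypothetical configuration: 7B parameters, d_model = 4096, 16-bit weights.
PARAMS = 7_000_000_000
D_MODEL = 4096

# A thousand readouts at a single layer.
flop_ratio, byte_ratio = probe_overhead(PARAMS, D_MODEL, layers_probed=1, probes_per_layer=1000)
probe_megabytes = 1000 * D_MODEL * 2 / 1e6

print(f"arithmetic overhead: {flop_ratio:.4%}")
print(f"bytes overhead:      {byte_ratio:.4%}")
print(f"probe weights:       {probe_megabytes:.2f} MB")
```

Under these assumptions both ratios reduce to readouts × d_model / params, which is why the FLOPs view and the bytes view tell the same story.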

Naive hook: a Python forward hook breaks CUDA graph capture, forcing the decode path back to eager. An open vLLM internals plug-in pencils a decode-time mode at ~25% throughput; naive hooks run an order of magnitude over baseline. Per-readout device-to-host copies add synchronization points on top.

Batched async: pack all probes at a layer into one weight matrix, do a single small matmul inside the graph, accumulate in a device buffer, export asynchronously. Cost stays in the low single-digit percent range.
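The packing step can be sketched on CPU with NumPy. This is only a stand-in for the idea, not the post's implementation: the function names, the toy sizes, and the random data are assumptions, and neither CUDA graph capture nor the asynchronous export is modelled. It shows that a thousand separate dot products and one packed matmul give the same scores, and that scores can accumulate in a preallocated buffer and leave once instead of once per readout.

```python
import numpy as np


def score_one_by_one(hidden, probe_vectors):
    """Naive pattern: one dot product, and one result, per probe."""
    return np.array([np.dot(w, hidden) for w in probe_vectors], dtype=hidden.dtype)


def score_batched(hidden, packed):
    """Batched pattern: a single (n_probes, d_model) @ (d_model,) matmul per layer."""
    return packed @ hidden


rng = np.random.default_rng(0)
d_model, n_probes, n_steps = 64, 1000, 8  # toy sizes, chosen only for illustration

# Hypothetical probe directions for one layer; real probes would be trained.
probe_vectors = [rng.standard_normal(d_model).astype(np.float32) for _ in range(n_probes)]
packed = np.stack(probe_vectors)  # pack once, outside the decode loop

# Preallocated buffer standing in for the device-side accumulator.
score_buffer = np.empty((n_steps, n_probes), dtype=np.float32)
for step in range(n_steps):
    hidden = rng.standard_normal(d_model).astype(np.float32)  # stand-in hidden state
    score_buffer[step] = score_batched(hidden, packed)

# Export happens once after the loop, not once per readout.
exported = score_buffer.copy()
```

On a GPU the same shape of code would keep `packed` and the buffer on the device, so that the only host transfer is the final export.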

The cost was never in the math. That is the argument of the post. Overhead figures here are illustrative bands anchored to those measured numbers.
