The Deterministic Horizon: When Extended Reasoning Fails and Tool Delegation Becomes Necessary

Research · official · 1 source · ~1 min read

Accepted at ICML 2026, this paper establishes an Attention Bottleneck Theorem that bounds the state-tracking capacity of decoder-only transformers and identifies a 'Deterministic Horizon' of roughly 19–31 steps, beyond which chain-of-thought reasoning degrades super-exponentially. Empirical validation across 12 models and 8 task domains, including SWE-Bench and WebArena, shows hybrid neural-plus-tool systems reaching 86–94% accuracy versus 24–42% for pure chain-of-thought.

Why it matters

The paper shifts the narrative around reasoning failures from a training-data problem to an architectural capacity limit, providing principled thresholds for when agentic systems should delegate to external tools rather than reason further.
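
As a rough illustration of how such a threshold might be used in an agent loop, the sketch below routes a task based on an estimated number of sequential state-tracking steps. Only the 19–31 step band comes from the summary above. The three-way policy, the names `Route`, `route_task`, `HORIZON_LOW`, and `HORIZON_HIGH`, and the idea that a step estimate is available up front are all assumptions made for illustration; this is not the paper's actual decision rule.

```python
from enum import Enum


class Route(Enum):
    """Hypothetical routing outcomes for an agent planner."""
    REASON = "reason"                # stay in chain-of-thought
    VERIFY = "reason_then_verify"    # reason, but check the result with a tool
    DELEGATE = "delegate"            # hand the task to an external tool


# The 19-31 step band is the range reported in the summary.
# Treating its endpoints as two hard cutoffs is an assumption of this sketch.
HORIZON_LOW = 19
HORIZON_HIGH = 31


def route_task(estimated_steps: int,
               low: int = HORIZON_LOW,
               high: int = HORIZON_HIGH) -> Route:
    """Pick a route from an estimated count of sequential reasoning steps."""
    if estimated_steps < 0:
        raise ValueError("estimated_steps must be non-negative")
    if estimated_steps < low:
        # Comfortably below the reported horizon: reason directly.
        return Route.REASON
    if estimated_steps <= high:
        # Inside the reported band: reasoning may still work, so verify it.
        return Route.VERIFY
    # Past the band: degradation is reported to be severe, so delegate.
    return Route.DELEGATE


if __name__ == "__main__":
    for steps in (5, 25, 40):
        print(steps, route_task(steps).value)
```

In practice the step estimate itself would be uncertain, so a real system would likely calibrate both cutoffs per model and per task domain rather than hard-code them.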

Importance: 3/5

ICML 2026 acceptance plus broad empirical validation; establishes principled architectural limits on chain-of-thought reasoning, with direct implications for agent design.

Tags: reasoning, agents, theory, benchmark, paper, formal-reasoning

Sources

official · The Deterministic Horizon: When Extended Reasoning Fails and Tool Delegation Becomes Necessary (arXiv)