The Deterministic Horizon: When Extended Reasoning Fails and Tool Delegation Becomes Necessary
Accepted at ICML 2026, this paper establishes an Attention Bottleneck Theorem bounding the state-tracking capacity of decoder-only transformers and identifies a 'Deterministic Horizon' around 19–31 steps beyond which chain-of-thought reasoning degrades super-exponentially. Empirical validation across 12 models and 8 task domains — including SWE-Bench and WebArena — shows hybrid neural-plus-tool systems reach 86–94% accuracy versus 24–42% for pure chain-of-thought.
Why it matters
The paper reframes reasoning failures as an architectural capacity limit rather than a training-data problem, and provides principled thresholds for when agentic systems should delegate to external tools instead of reasoning further.
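As a rough illustration of how such a threshold might be used in an agent, the sketch below routes a task to tools once its estimated number of sequential state-tracking steps reaches the horizon. The function name, the step estimate, and the choice of 19 as the default (the conservative low end of the reported 19 to 31 step range) are assumptions made for illustration, not the paper's method.

```python
# Hypothetical routing rule based on the reported "Deterministic Horizon".
# Assumption: the caller can estimate how many sequential state-tracking
# steps a task needs; how to produce that estimate is out of scope here.

HORIZON_LOW = 19   # low end of the reported range (conservative default)
HORIZON_HIGH = 31  # high end of the reported range


def choose_strategy(estimated_steps: int, horizon: int = HORIZON_LOW) -> str:
    """Return "reason" below the horizon, "delegate" at or beyond it."""
    if estimated_steps < 0:
        raise ValueError("estimated_steps must be non-negative")
    return "delegate" if estimated_steps >= horizon else "reason"


if __name__ == "__main__":
    for steps in (5, 19, 40):
        print(steps, choose_strategy(steps))
```

Passing `horizon=HORIZON_HIGH` would give a more permissive rule that keeps reasoning in-model for longer; which end of the range suits a given model is something the summary does not specify.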
Importance: 3/5
ICML 2026 acceptance and broad empirical validation; establishes principled architectural limits on chain-of-thought reasoning, with direct implications for agent design.