How Much Thinking is Enough? Quantifying and Understanding Redundancy in LLM Reasoning
The paper investigates redundancy in the reasoning traces of large language models (LLMs). It quantifies how much of a reasoning trace can be truncated without affecting the correctness of the final answer. The findings suggest that a large share of reasoning steps is unnecessary, which points to a structural issue in how these models are trained.
- The study formalizes reasoning redundancy in terms of the reasoning model itself.
- Redundancy levels were found to be between 61% and 93% across various models and benchmarks.
- The research shows that over-thinking is a structural property of current reasoning models, not a flaw in individual models.
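
To make the idea of a redundancy percentage concrete, here is a minimal sketch of one way a truncation-based measure could be computed. It is an illustration under stated assumptions, not the paper's formalization: the trace is assumed to be split into discrete steps, and `is_correct_with_prefix` is a hypothetical oracle that reports whether the model still answers correctly when given only a prefix of its own reasoning.

```python
from typing import Callable, Sequence


def minimal_sufficient_prefix(
    steps: Sequence[str],
    is_correct_with_prefix: Callable[[Sequence[str]], bool],
) -> int:
    """Return the smallest k such that the first k steps yield a correct answer.

    A linear scan is used because correctness is not assumed to be monotonic
    in prefix length. If no prefix is sufficient, the full length is returned.
    """
    for k in range(len(steps) + 1):
        if is_correct_with_prefix(steps[:k]):
            return k
    return len(steps)


def redundancy(
    steps: Sequence[str],
    is_correct_with_prefix: Callable[[Sequence[str]], bool],
) -> float:
    """Fraction of the trace that lies beyond the minimal sufficient prefix."""
    if not steps:
        return 0.0
    k = minimal_sufficient_prefix(steps, is_correct_with_prefix)
    return 1.0 - k / len(steps)


# Toy example (hypothetical data): the key step is the third of ten.
toy_steps = ["restate problem", "set up equation", "x = 4"] + ["re-check"] * 7


def toy_oracle(prefix: Sequence[str]) -> bool:
    # Stand-in for re-querying a model: correct once the key step is present.
    return "x = 4" in prefix


toy_redundancy = redundancy(toy_steps, toy_oracle)
print(f"{toy_redundancy:.0%} of the toy trace is redundant")
```

In practice the oracle would be the expensive part, since each call means forcing the model to answer from a truncated trace and grading the result.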
arXiv:2605.23926 (Computer Science > Artificial Intelligence), submitted on 21 Apr 2026
Authors: Zhiyuan Zhai, Xinkai You, Wenjing Yan, Xin Wang
Abstract: Reasoning-capable large language models solve hard problems by emitting long chains of thought, paying heavily in latency, GPU time, and energy.
…