What Makes Chain-of-Thought Work at Probe Time? Local Co-occurrence Rather Than Global Derivation
The paper asks which properties of a rationale text drive the accuracy gains from chain-of-thought (CoT) prompting in language models, studying probe-time rather than generation-time behavior. It finds that local co-occurrence, particularly short-range token adjacency, plays a significant role in those gains. The findings suggest that lexical activation contributes more to CoT performance than global logical derivation does.
- Chain-of-thought prompting reliably improves language-model accuracy.
- At probe time, local co-occurrence matters more than global derivation.
- Short-range token adjacency contributes more to performance gains than sentence-level logical ordering.
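
The contrast between short-range token adjacency and sentence-level ordering can be made concrete with two text perturbations. The sketch below is only an illustration under stated assumptions, not the authors' experimental procedure: the function names (`shuffle_sentences`, `shuffle_tokens`, `bigram_overlap`), the whitespace tokenization, and the period-based sentence split are all hypothetical choices made here. Shuffling sentences breaks the global order of the rationale while keeping most adjacent token pairs; shuffling tokens breaks the adjacency itself.

```python
import random


def shuffle_sentences(rationale: str, seed: int = 0) -> str:
    """Permute sentence order; tokens inside each sentence stay adjacent.

    Assumption: sentences are separated by periods (a crude split).
    """
    rng = random.Random(seed)
    sentences = [s.strip() for s in rationale.split(".") if s.strip()]
    rng.shuffle(sentences)
    return ". ".join(sentences) + "."


def shuffle_tokens(rationale: str, seed: int = 0) -> str:
    """Permute all whitespace tokens; this destroys short-range adjacency."""
    rng = random.Random(seed)
    tokens = rationale.split()
    rng.shuffle(tokens)
    return " ".join(tokens)


def bigram_overlap(original: str, perturbed: str) -> float:
    """Fraction of the original's adjacent token pairs found in the perturbed text."""
    def bigrams(text: str) -> set:
        toks = text.split()
        return set(zip(toks, toks[1:]))

    reference = bigrams(original)
    if not reference:
        return 0.0
    return len(reference & bigrams(perturbed)) / len(reference)


if __name__ == "__main__":
    # Hypothetical toy rationale, used only to exercise the functions.
    rationale = "Tom has 3 apples. He buys 2 more. Now he has 5."
    for name, fn in [("sentences", shuffle_sentences), ("tokens", shuffle_tokens)]:
        perturbed = fn(rationale, seed=1)
        print(name, "|", perturbed, "|", round(bigram_overlap(rationale, perturbed), 2))
```

In a probe-style comparison, one would feed each perturbed rationale to a model and compare accuracy against the intact rationale. How the paper actually constructs and scores its conditions is not stated in this excerpt.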
arXiv:2605.26795 (cs), Computer Science > Artificial Intelligence. Submitted on 26 May 2026.
Authors: Xiang Wang, Wei Wei
Abstract: Chain-of-thought (CoT) prompting reliably improves language-model accuracy, but which properties of a rationale text drive the improvement is poorly understood. Prior work has largely studied generation-time behavior.
…
Excerpt limited to ~120 words for fair-use compliance. The full article is at arXiv cs.AI.