Long-Context Reasoning Through Proxy-Based Chain-of-Thought Tuning
The paper presents ProxyCoT, a training framework for improving long-context reasoning in large language models. It starts from the observation that models perform markedly worse on the full context than on a proxy context, a subset of the input that is sufficient to solve the task, even though the underlying reasoning is the same. The authors report improved efficiency and generalization in reasoning across the datasets evaluated.
- Large language models struggle with long-context tasks that require complex reasoning.
- ProxyCoT transfers reasoning capabilities from short proxy contexts to full long contexts.
- Experiments show that ProxyCoT outperforms strong baselines with reduced computational overhead.
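To make the distinction between proxy and full contexts concrete, here is a minimal sketch. It is not the paper's implementation: it assumes the input is already split into passages and that the indices of the passages needed to answer the question are known. The names `build_contexts`, `passages`, and `relevant_ids` are hypothetical and introduced only for illustration.

```python
from typing import List, Tuple


def build_contexts(passages: List[str], relevant_ids: List[int]) -> Tuple[str, str]:
    """Return (proxy_context, full_context) for one example.

    Assumption: relevant_ids marks the passages sufficient to answer the
    question. How such a subset is obtained is not specified here.
    """
    full_context = "\n\n".join(passages)
    # Deduplicate and keep the original passage order, so the proxy is a
    # subsequence of the full input.
    keep = sorted(set(relevant_ids))
    proxy_context = "\n\n".join(passages[i] for i in keep)
    return proxy_context, full_context


# Toy example (made-up data, not from the paper).
passages = [
    "Alice moved to Paris in 2010.",
    "The weather was mild that year.",
    "In Paris, Alice met Bob, who later moved to Rome.",
    "Unrelated filler text.",
]
proxy, full = build_contexts(passages, relevant_ids=[2, 0])
```

Both strings would be paired with the same question. The proxy omits the distractor passages, which is the setting in which the summary says models reason more reliably.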
Opening excerpt (first ~120 words)
arXiv:2605.20201 (cs), Computer Science > Computation and Language. Submitted on 6 Apr 2026.
Authors: Miao Li, Irina Saparina, Alexander Gurung, Mirella Lapata
Abstract: Recent large language models support inputs of up to 10 million tokens, yet they perform poorly on long-context tasks that require complex reasoning. Such tasks can be solved using only a subset of the input -- a proxy context -- rather than the full sequence. Despite sharing the same underlying reasoning process, models exhibit a significant performance disparity between proxy and full contexts.
…
Excerpt limited to ~120 words for fair-use compliance. The full article is at arXiv cs.AI.