ICRL: Learning to Internalize Self-Critique with Reinforcement Learning
The paper presents ICRL, a framework for strengthening the self-improvement ability of language-model-based agents. ICRL jointly trains a solver and a critic so that the model internalizes critique and improves its performance without relying on external feedback. The reported results show performance gains on both agentic and mathematical reasoning tasks.
- ICRL stands for Learning to Internalize Self-Critique with Reinforcement Learning.
- The framework trains a solver and a critic from a shared backbone to enhance the model's ability to self-improve.
- Results show average performance gains of 6.4 points on agentic tasks and 7.0 points on mathematical reasoning tasks.
Opening excerpt (first ~120 words)
Computer Science > Artificial Intelligence
arXiv:2605.15224 (cs) [Submitted on 13 May 2026]
Title: ICRL: Learning to Internalize Self-Critique with Reinforcement Learning
Authors: Jianbo Lin, Xiaomin Yu, Yi Xin, Yifu Guo, Zhuosong Jiang, Zhongqi Yue, Weishi Wang, Heqing Zou, Chengwei Qin, Hui Xiong
Abstract: Large language model-based agents make mistakes, yet critique can often guide the same model toward correct behavior. However, when critique is removed, the model may fail again on the same query, indicating that it has not internalized the critique's guidance into its underlying capability.
…
Excerpt limited to ~120 words for fair-use compliance. The full article is at arXiv cs.AI.