Metacognition as Reward: Reinforcing LLM Reasoning via Knowledge and Regulation Signals

May 25, 2026 · 4:00 AM UTC · 3 min read

#artificial intelligence #machine learning #natural language processing

TL;DR · WeSearch summary

The paper introduces Metacognition-as-Reward (MaR), a reinforcement learning framework for improving the reasoning of large language models (LLMs). MaR guides LLM reasoning with reward signals based on metacognitive knowledge and metacognitive regulation, giving a more comprehensive reward. In the reported experiments, MaR achieves up to a 7.7% gain over base models and outperforms stronger models on individual benchmarks.

Key facts

▪ Recent reinforcement learning methods have improved the reasoning abilities of large language models.
▪ The Metacognition-as-Reward framework guides LLM reasoning through metacognitive knowledge and regulation.
▪ Experiments show that MaR achieves up to a 7.7% gain over base models and outperforms stronger models on individual benchmarks.

Original article

arXiv cs.AI

Opening excerpt (first ~120 words)

Computer Science > Computation and Language
arXiv:2605.23384 (cs)
Submitted on 22 May 2026
Title: Metacognition as Reward: Reinforcing LLM Reasoning via Knowledge and Regulation Signals
Authors: Sirui Chen, Lei Xu, Yuying Zhao, Yutian Chen, Yu Wang, Beier Zhu, Hanwang Zhang, Shengjie Zhao, Chaochao Lu
Abstract: Recent RL methods have substantially improved the reasoning abilities of LLMs.

…

Excerpt limited to ~120 words for fair-use compliance. The full article is at arXiv cs.AI.
