WeSearch

Distribution-Aware Reward: Reinforcement Learning over Predictive Distributions for LLM Regression

#machine learning #reinforcement learning #language models #predictive distributions
⚡ TL;DR · AI summary

The paper introduces Distribution-Aware Reward, a reinforcement learning objective for regression with large language models. Instead of scoring each decoded number independently, the method treats multiple decoded samples for the same input as an empirical predictive distribution and rewards that distribution as a whole, targeting both the accuracy and the dispersion of the predictions. The authors report that the approach outperforms supervised fine-tuning and pointwise reinforcement learning across various tasks, with better uncertainty diagnostics and model robustness.
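The summary does not give the reward's formula, so the following is only a rough sketch of the general idea of scoring a set of samples jointly rather than one at a time. It assumes the empirical continuous ranked probability score (CRPS) as a stand-in distribution-level score; the function names `pointwise_rewards`, `empirical_crps`, and `distribution_reward` are hypothetical and are not taken from the paper.

```python
from statistics import fmean


def pointwise_rewards(samples, target):
    """Score each decoded number on its own (negative absolute error)."""
    return [-abs(s - target) for s in samples]


def empirical_crps(samples, target):
    """Empirical CRPS of a sample set against one true value (lower is better).

    CRPS = mean|x_i - y| - 0.5 * mean|x_i - x_j|
    The first term measures accuracy; the second credits spread, so a
    confidently wrong sample set scores worse than one that hedges.
    """
    accuracy_term = fmean([abs(s - target) for s in samples])
    spread_term = 0.5 * fmean([abs(a - b) for a in samples for b in samples])
    return accuracy_term - spread_term


def distribution_reward(samples, target):
    """Assumed stand-in reward: one shared score for the whole sample set."""
    return -empirical_crps(samples, target)


if __name__ == "__main__":
    target = 2.0
    hedged = [1.0, 3.0]          # spread around the true value
    overconfident = [1.0, 1.0]   # collapsed onto a wrong value

    # Pointwise scoring cannot tell the two sets apart on average.
    print(fmean(pointwise_rewards(hedged, target)))
    print(fmean(pointwise_rewards(overconfident, target)))

    # The set-level score prefers the hedged set.
    print(distribution_reward(hedged, target))
    print(distribution_reward(overconfident, target))
```

In this toy case both sample sets have the same mean pointwise error, but the set-level score penalizes the collapsed, overconfident set more heavily. That is the kind of distinction a distribution-level reward can express and a per-sample reward cannot.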

Original article
arXiv cs.AI
Opening excerpt (first ~120 words)

Computer Science > Machine Learning
arXiv:2605.20740 (cs)
[Submitted on 20 May 2026]

Title: Distribution-Aware Reward: Reinforcement Learning over Predictive Distributions for LLM Regression
Authors: Jungsoo Park, Hyungjoo Chae, Ethan Mendes, Jay DeYoung, Varsha Kishore, Wei Xu, Alan Ritter

Abstract: Large language models can predict real-valued quantities from heterogeneous inputs such as text, code, and molecular strings, but most training objectives score each decoded floating-point number independently, improving point estimates without ensuring calibrated predictive distributions.

Excerpt limited to ~120 words for fair-use compliance. The full article is at arXiv cs.AI.
