$f$-Trajectory Balance: A Loss Family for Tuning GFlowNets, Generative Models, and LLMs with Off- and On-Policy Data
The paper introduces $f$-Trajectory Balance, a family of loss functions for tuning generative models, including GFlowNets and large language models. The losses apply in both on-policy and off-policy settings while keeping the same global minimizer. The authors demonstrate them across several generative modeling tasks.
- The mean square error between target and model log probabilities is an effective surrogate loss for training generative models.
- The proposed $f$-Trajectory Balance loss functions correspond to $f$-divergences, allowing for improved mode coverage.
- The authors applied their loss functions to tasks such as molecule discovery and tuning large language models.
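
The first point above can be illustrated with a minimal sketch of a squared log-ratio loss on a toy categorical model. This is not the paper's $f$-Trajectory Balance loss, only the plain mean-square-error surrogate that the summary refers to. The function names, the three-outcome target, and the learnable `log_z` term are hypothetical choices made for illustration.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_norm = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_norm for v in logits]


def squared_log_ratio_loss(logits, log_z, log_reward, samples):
    """Mean of (log_z + log q(x) - log R(x))^2 over the sampled outcomes x.

    The samples may come from the model itself (on-policy) or from any
    other distribution that covers the outcomes (off-policy).
    """
    log_q = log_softmax(logits)
    residuals = [log_z + log_q[x] - log_reward[x] for x in samples]
    return sum(r * r for r in residuals) / len(residuals)


# Hypothetical toy target: unnormalized rewards over three outcomes.
rewards = [1.0, 2.0, 5.0]
log_reward = [math.log(r) for r in rewards]
samples = [0, 1, 2, 2, 1, 0]  # arbitrary off-policy samples covering every outcome

# At the optimum, q(x) = R(x) / Z and log_z = log Z, so every residual vanishes.
optimal_logits = log_reward
optimal_log_z = math.log(sum(rewards))
loss_at_optimum = squared_log_ratio_loss(optimal_logits, optimal_log_z, log_reward, samples)

# A uniform model with log_z = 0 does not match the target, so the loss is positive.
loss_uniform = squared_log_ratio_loss([0.0, 0.0, 0.0], 0.0, log_reward, samples)
```

In this sketch the loss is zero, up to floating-point error, exactly when the model matches the normalized target on the sampled outcomes, regardless of how the samples were drawn. That is the sense in which such a loss can share one minimizer across on-policy and off-policy data.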
Opening excerpt (first ~120 words)
arXiv:2605.15417 (cs), Computer Science > Machine Learning. Submitted on 14 May 2026. Authors: Jake Fawkes, Jason Hartford. Abstract: In GFlowNets and variational inference, it has been shown that the mean square error between target and model log probabilities is an effective, low variance, surrogate loss for training generative models.
…
Excerpt limited to ~120 words for fair-use compliance. The full article is at arXiv cs.AI.