Benchmarking the Limits of In-Context Reinforcement Learning for Ad-Hoc Teamwork
The paper explores the limitations of In-Context Reinforcement Learning (ICRL) in the context of Ad-Hoc Teamwork (AHT). A benchmark called ICRL4AHT is introduced to evaluate the performance of various ICRL algorithms in multi-agent settings. Results indicate that these algorithms struggle with test-time adaptation and often perform worse than random baselines.
- The study introduces a benchmark named ICRL4AHT to assess ICRL's effectiveness in AHT scenarios.
- The benchmark includes a diverse suite of teammates and provides a reproducible evaluation pipeline.
- Findings reveal that ICRL algorithms fail to adapt effectively in multi-agent environments, underperforming compared to random strategies.
Opening excerpt (first ~120 words)
arXiv:2605.24423 (cs) [Submitted on 23 May 2026]
Title: Benchmarking the Limits of In-Context Reinforcement Learning for Ad-Hoc Teamwork
Authors: Yuheng Jing, Kai Li, Ziwen Zhang, Jiajun Zhang, Zeyao Ma, Jiaxi Yang, Lei Zhang, Zhe Wu, Jinmin He, Junliang Xing, Jian Cheng
Abstract: In-Context Reinforcement Learning (ICRL) has enabled foundation agents to adapt instantaneously to novel tasks, yet its efficacy in Ad-Hoc Teamwork (AHT), where coordination with unknown partners is required, remains unexplored.
…
Excerpt limited to ~120 words for fair-use compliance. The full article is at arXiv cs.AI.