
One Policy, Infinite NPCs: Persona-Traceable Shared RL Policies for Scalable Game Agents

#artificial intelligence #gaming #reinforcement learning
⚡ TL;DR · AI summary

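The paper proposes controlling non-player characters (NPCs) in life-simulation games with a single shared reinforcement learning policy that is conditioned on each character's persona. The method, called pcsp, is meant to deliver persona-specific NPC behavior that scales to many agents and runs in real time. The results cover persona identification and behavioral divergence in multi-agent environments: on a 300-persona benchmark, the abstract reports compositional zero-shot persona identification up to 17x above chance, semantic-behavioral alignment at a Spearman rho of about 0.73, and 22x faster inference than an LLM-as-policy baseline.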
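To make the idea of one policy shared across personas concrete, here is a minimal sketch. It is not the paper's architecture, which the excerpt does not describe: it assumes the simplest possible form of persona conditioning, where a persona vector is concatenated to the observation and fed to one shared linear-softmax policy. The function name `make_shared_policy`, the dimensions, and the `cautious` and `bold` persona vectors are all hypothetical.

```python
import math
import random


def make_shared_policy(obs_dim, persona_dim, n_actions, seed=0):
    """Build one set of weights that every NPC reuses (illustrative only)."""
    rng = random.Random(seed)
    in_dim = obs_dim + persona_dim
    # One weight row per action; untrained random values stand in for a learned policy.
    weights = [[rng.gauss(0.0, 1.0) for _ in range(in_dim)] for _ in range(n_actions)]

    def action_probs(obs, persona):
        # Persona conditioning (assumed form): concatenate persona to observation.
        x = list(obs) + list(persona)
        logits = [sum(w * v for w, v in zip(row, x)) for row in weights]
        # Numerically stable softmax over action logits.
        top = max(logits)
        exps = [math.exp(value - top) for value in logits]
        total = sum(exps)
        return [e / total for e in exps]

    return action_probs


policy = make_shared_policy(obs_dim=4, persona_dim=3, n_actions=5)

obs = [1.0, 0.5, -0.5, 0.0]   # the same game state for both NPCs
cautious = [1.0, 0.0, 0.0]    # hypothetical persona vectors
bold = [0.0, 0.0, 1.0]

probs_cautious = policy(obs, cautious)
probs_bold = policy(obs, bold)
```

Both NPCs see the same observation and share the same weights, so any difference between `probs_cautious` and `probs_bold` comes only from the persona input. Adding an NPC under this assumption means supplying another persona vector, not training another policy.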

Original article
arXiv cs.AI
Opening excerpt (first ~120 words)

Computer Science > Artificial Intelligence
arXiv:2605.23652 (cs)
[Submitted on 22 May 2026]
Title: One Policy, Infinite NPCs: Persona-Traceable Shared RL Policies for Scalable Game Agents
Authors: Yoosung Hong
Abstract: On a 300-persona life-simulation benchmark, pcsp achieves compositional zero-shot persona identification up to 17x above chance, Spearman ρ ≈ 0.73 semantic-behavioral alignment, and 22x faster inference than an LLM-as-policy baseline.

Excerpt limited to ~120 words for fair-use compliance. The full article is at arXiv cs.AI.
