RealUserSim: Bridging the Reality Gap in Agent Benchmarking via Grounded User Simulation

Tags: human-computer interaction, artificial intelligence, user simulation
⚡ TL;DR · AI summary

The paper introduces RealUserSim, a user simulation framework that grounds simulated users in real behavioral data to improve agent benchmarking. It argues that current LLM-based simulated users often fail to represent human behavior accurately. Drawing on more than 14,000 authentic conversations, the framework is reported to substantially improve the fidelity of agent evaluations.

Original article
arXiv cs.AI
Opening excerpt (first ~120 words)

Computer Science > Human-Computer Interaction
arXiv:2605.20204 (cs)
[Submitted on 7 Apr 2026]

Title: RealUserSim: Bridging the Reality Gap in Agent Benchmarking via Grounded User Simulation
Authors: Ming Zhu, Juntao Tan, Rithesh Murthy, Jielin Qiu, Liangwei Yang, Wenting Zhao, Silvio Savarese, Shelby Heinecke, Huan Wang

Abstract: LLM-based user simulation is the primary mechanism for end-to-end agent evaluation, yet simulated users are poor proxies for real humans: unconstrained LLM defaults produce a Formalism Ceiling (style match rates of 6-8% against real users), while hand-crafted behavioral directives…

Excerpt limited to ~120 words for fair-use compliance. The full article is at arXiv cs.AI.
