When Correct Beliefs Collapse: Epistemic Resilience of LLMs under Clinical Pressure
The paper discusses the challenges faced by large language models (LLMs) in maintaining correct medical diagnoses under pressure. It introduces a stress test framework called Med-Stress to evaluate the stability of beliefs in clinical dialogue. The authors propose two methods, RBED and R-FT, to enhance the robustness of LLMs against belief changes during high-pressure scenarios.
- ▪LLMs can show significant multi-turn sycophancy, leading to incorrect diagnoses under escalating pressure.
- ▪The Med-Stress framework was developed to assess the stability of medical beliefs in LLMs.
- ▪The proposed RBED and R-FT methods aim to improve the resilience of LLMs against belief changes.
Opening excerpt (first ~120 words) tap to expand
Computer Science > Artificial Intelligence arXiv:2605.23932 (cs) [Submitted on 23 Apr 2026] Title:When Correct Beliefs Collapse: Epistemic Resilience of LLMs under Clinical Pressure Authors:Boyu Xiao, Xiuqi Tian, Xuwen Song, Haochun Wang, Guanchun Song, Sendong Zhao, Bing Qin View a PDF of the paper titled When Correct Beliefs Collapse: Epistemic Resilience of LLMs under Clinical Pressure, by Boyu Xiao and 6 other authors View PDF HTML (experimental) Abstract:Despite strong medical benchmark accuracy, LLMs can exhibit severe multi-turn sycophancy in clinical dialogue, abandoning initial correct diagnosis under escalating pressure. We propose \textbf{\textsc{Med-Stress}}, a targeted stress test framework that evaluates belief stability under escalating pressure.
…
Excerpt limited to ~120 words for fair-use compliance. The full article is at arXiv cs.AI.