MemAudit: Post-hoc Auditing of Poisoned Agent Memory via Causal Attribution and Structural Anomaly Detection
The article discusses a new framework called MemAudit designed for auditing the memory of language model agents. This framework addresses vulnerabilities caused by adversarial users who can inject malicious records into the agents' memory. MemAudit significantly reduces the success rates of memory injection attacks in various scenarios.
- ▪MemAudit is a post-hoc causal memory auditing framework for memory-augmented language model agents.
- ▪The framework combines a counterfactual memory influence score and a memory consistency graph to identify harmful memories.
- ▪Evaluation results show that MemAudit reduces QA attack success from 70% to 0% and RAP attack success from 83.3% to 0%.
Opening excerpt (first ~120 words) tap to expand
Computer Science > Artificial Intelligence arXiv:2605.23723 (cs) [Submitted on 22 May 2026] Title:MemAudit: Post-hoc Auditing of Poisoned Agent Memory via Causal Attribution and Structural Anomaly Detection Authors:Zhewen Tan, Yilun Yao, Huiyan Jin, Wenhan Yu, Guoan Wang, Mengyuan Fan, liang lu, Feng Liu, Xiangzheng Zhang, Duohe Ma, Tong Yang, Lin Sun View a PDF of the paper titled MemAudit: Post-hoc Auditing of Poisoned Agent Memory via Causal Attribution and Structural Anomaly Detection, by Zhewen Tan and 11 other authors View PDF HTML (experimental) Abstract:Large language model agents increasingly rely on persistent memory to store past interactions, retrieve relevant demonstrations, and improve long-horizon task execution.
…
Excerpt limited to ~120 words for fair-use compliance. The full article is at arXiv cs.AI.