ASRU: Activation Steering Meets Reinforcement Unlearning for Multimodal Large Language Models
The paper introduces ASRU, a machine unlearning framework for multimodal large language models (MLLMs). It addresses the challenge of preserving generation quality while unlearning sensitive information. The authors report that ASRU improves both unlearning effectiveness and generation quality while preserving model utility.
- ASRU incorporates generation quality as a core evaluation objective for multimodal unlearning.
- The framework achieves a 24.6% improvement in unlearning effectiveness and a 5.8x improvement in generation quality.
- ASRU uses a small amount of retained supervision data to optimize the unlearning process.
Opening excerpt (first ~120 words)
arXiv:2605.15687 (cs), Computation and Language. Submitted on 15 May 2026.
Authors: Jiahui Guang, Yingjie Zhu, Cuiyun Gao, Haiyan Wang, Jing Li, Di Shao, Zhaoquan Gu
Abstract: Multimodal large language models (MLLMs) may memorize sensitive cross-modal information during pretraining, making machine unlearning (MU) crucial. Existing methods typically evaluate unlearning effectiveness based on output deviations, while overlooking the generation quality after unlearning.
…
Excerpt limited to ~120 words for fair-use compliance. The full article is at arXiv cs.AI.