ASRU: Activation Steering Meets Reinforcement Unlearning for Multimodal Large Language Models

May 18, 2026 · 4:00 AM UTC · 2 min read

#machine learning #artificial intelligence #language models

TL;DR · WeSearch summary

The paper introduces ASRU, a machine unlearning framework for multimodal large language models (MLLMs). It addresses the challenge of maintaining generation quality while unlearning sensitive information. Reported experiments show a 24.6% improvement in unlearning effectiveness and a 5.8x improvement in generation quality, while preserving model utility.

Key facts

▪ ASRU incorporates generation quality as a core evaluation objective for multimodal unlearning.
▪ The framework achieves a 24.6% improvement in unlearning effectiveness and a 5.8x enhancement in generation quality.
▪ ASRU uses a small amount of retained supervision data to optimize the unlearning process.

Original article

arXiv cs.AI

Opening excerpt (first ~120 words)

Computer Science > Computation and Language
arXiv:2605.15687 (cs) [Submitted on 15 May 2026]

Title: ASRU: Activation Steering Meets Reinforcement Unlearning for Multimodal Large Language Models
Authors: Jiahui Guang, Yingjie Zhu, Cuiyun Gao, Haiyan Wang, Jing Li, Di Shao, Zhaoquan Gu

Abstract: Multimodal large language models (MLLMs) may memorize sensitive cross-modal information during pretraining, making machine unlearning (MU) crucial. Existing methods typically evaluate unlearning effectiveness based on output deviations, while overlooking the generation quality after unlearning.

…

Excerpt limited to ~120 words for fair-use compliance. The full article is at arXiv cs.AI.
