Spectral Unforgetting: Post-Hoc Recovery of Damaged Capabilities Without Retraining
The paper "Spectral Unforgetting" addresses catastrophic forgetting in language models during fine-tuning. It proposes DG-Hard, a post-hoc method that aims to recover damaged capabilities without retraining, using only the pretrained checkpoint and its fine-tuned descendant. The reported results indicate that capability loss induced by fine-tuning can be mitigated through spectral repair.
- The study focuses on catastrophic forgetting in language models during fine-tuning.
- DG-Hard is introduced as a checkpoint-only spectral repair method that preserves target-task gains while recovering lost capabilities.
- The method shows strong balanced repair results across multiple model and task settings, restoring safety alignment without using alignment data.
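The summary does not say how DG-Hard works internally, so the sketch below is not the paper's algorithm. It only illustrates, under loudly stated assumptions, what a generic "checkpoint-only spectral" edit can look like: take the fine-tuning update between the two checkpoints, decompose it with an SVD, and keep only its strongest singular directions. The function name `truncate_delta_spectrum`, the rank parameter `keep`, and the synthetic weights are all hypothetical.

```python
import numpy as np


def truncate_delta_spectrum(w_base, w_ft, keep):
    """Return w_base plus the top-`keep` singular directions of (w_ft - w_base).

    Hypothetical illustration of a checkpoint-only spectral edit.
    This is NOT DG-Hard; the selection rule (hard rank cut) is an assumption.
    """
    delta = w_ft - w_base  # fine-tuning update for one weight matrix
    u, s, vt = np.linalg.svd(delta, full_matrices=False)
    # Hard cut: zero every singular value beyond the first `keep`.
    s_kept = np.where(np.arange(s.size) < keep, s, 0.0)
    return w_base + (u * s_kept) @ vt


# Synthetic stand-ins for one layer of the two checkpoints (assumed shapes).
rng = np.random.default_rng(0)
w_base = rng.standard_normal((8, 6))
task = 2.0 * np.outer(rng.standard_normal(8), rng.standard_normal(6))  # rank-1 update
noise = 0.01 * rng.standard_normal((8, 6))  # small dense drift
w_ft = w_base + task + noise

w_repaired = truncate_delta_spectrum(w_base, w_ft, keep=1)
```

In this toy setup the repaired matrix differs from the base checkpoint by a rank-1 update, and setting `keep` to the full rank reproduces the fine-tuned weights. Which directions a real method keeps or discards, and why, is exactly what the paper would need to specify.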
Opening excerpt (first ~120 words)
arXiv:2605.20296 (cs), Computer Science > Machine Learning. Submitted on 19 May 2026.
Authors: Aarash Abro, Muhammad Tahir
Abstract: Fine-tuning a language model for a target task routinely degrades capabilities the training data never explicitly threatened. We study this phenomenon, known as catastrophic forgetting, and propose a post-hoc repair solution that uses only the pretrained checkpoint $W_{\mathrm{base}}$ and its fine-tuned descendant $W_{\mathrm{ft}}$.
…
Excerpt limited to ~120 words for fair-use compliance. The full article is at arXiv cs.AI.