Memorization Dynamics of Fill-in-the-Middle Pretraining

May 25, 2026 · 4:00 AM UTC · 2 min read

#artificial intelligence #machine learning #language models

⚡ TL;DR · AI summary

The paper 'Memorization Dynamics of Fill-in-the-Middle Pretraining' examines how the fill-in-the-middle (FIM) pretraining objective affects verbatim memorization in causal language models. The authors compare FIM with the standard left-to-right (LTR) objective by pretraining matched Llama 3.2 models on a FineWeb-Gutenberg corpus containing repeated Gutenberg excerpts. They find that FIM more often recovers short or partially matching spans, while LTR is stronger on high-confidence, long exact continuations.
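For readers unfamiliar with FIM, the following is a minimal sketch of the data transformation it usually implies: a document is cut into prefix, middle, and suffix, then reordered so that a left-to-right model sees the suffix before it predicts the middle. The sentinel strings `<PRE>`, `<SUF>`, `<MID>` and the helper names `to_fim` and `from_fim` are hypothetical and chosen for illustration; the excerpt does not say which sentinel tokens, split strategy, or FIM rate the authors used.

```python
import random

# Hypothetical sentinel strings; real tokenizers define their own special tokens.
PRE, SUF, MID = "<PRE>", "<SUF>", "<MID>"


def to_fim(doc: str, rng: random.Random) -> str:
    """Rearrange a non-empty document into prefix-suffix-middle (PSM) order."""
    # Pick two distinct cut points that split the text into three spans.
    i, j = sorted(rng.sample(range(len(doc) + 1), 2))
    prefix, middle, suffix = doc[:i], doc[i:j], doc[j:]
    # Training stays left to right, but the middle now comes last,
    # so it is predicted with both prefix and suffix as context.
    return f"{PRE}{prefix}{SUF}{suffix}{MID}{middle}"


def from_fim(sample: str) -> str:
    """Invert to_fim, assuming the document itself contains no sentinels."""
    rest = sample[len(PRE):]
    prefix, rest = rest.split(SUF, 1)
    suffix, middle = rest.split(MID, 1)
    return prefix + middle + suffix
```

A standard LTR sample would simply be the document itself; the comparison in the paper is between models trained on the two kinds of samples.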

Key facts

▪ The study investigates the memorization dynamics of FIM in a controlled setting.
▪ FIM more often recovers short or partially matching spans than LTR.
▪ Verbatim extraction under FIM training grows approximately linearly with repetitions.
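The distinction between long exact continuations and partially matching spans can be made concrete with two simple scores. The sketch below is an illustration only: `exact_prefix_len` and `partial_match_ratio` are hypothetical helpers, and the summary does not state which metrics the authors actually report.

```python
from difflib import SequenceMatcher
from typing import Sequence


def exact_prefix_len(reference: Sequence[str], generated: Sequence[str]) -> int:
    """Count how many leading tokens of the generation match the reference exactly."""
    n = 0
    for ref_tok, gen_tok in zip(reference, generated):
        if ref_tok != gen_tok:
            break
        n += 1
    return n


def partial_match_ratio(reference: Sequence[str], generated: Sequence[str]) -> float:
    """Similarity in [0, 1] that also rewards matching spans after a mismatch."""
    return SequenceMatcher(None, reference, generated).ratio()
```

A generation that diverges on one early token scores low on the first measure but can still score high on the second, which is the kind of gap the FIM versus LTR comparison describes.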

Original article

arXiv cs.AI

Opening excerpt (first ~120 words)

arXiv:2605.22981 (cs), Computer Science > Computation and Language
Submitted on 21 May 2026

Title: Memorization Dynamics of Fill-in-the-Middle Pretraining
Authors: Tobias von Arx, Tanguy Dieudonné

Abstract: Fill-in-the-middle (FIM) is a pretraining objective widely used to equip causal language models with infilling ability, yet its effect on verbatim memorization remains underexplored. We study the memorization dynamics of FIM in a controlled setting by pretraining matched Llama 3.2 models with FIM and standard left-to-right (LTR) objectives on a FineWeb-Gutenberg corpus containing repeated Gutenberg excerpts.

…

Excerpt limited to ~120 words for fair-use compliance. The full article is at arXiv cs.AI.
