Microsoft Research: LLMs Corrupt Your Files During Delegated Work
A recent Microsoft Research study finds that large language models (LLMs) can corrupt documents during delegated tasks. Using a framework called DELEGATE-52, the researchers found that even advanced models degrade document content by an average of 25%. The degradation varies with factors such as document size and interaction length, which points to the unreliability of current LLMs in delegated workflows.
- LLMs are increasingly used for delegated work, which requires trust in their accuracy.
- The DELEGATE-52 study evaluated 19 LLMs on simulated workflows across 52 professional domains.
- Current LLMs corrupt an average of 25% of document content during long workflows.
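To make a figure like "25% of document content" concrete, here is a minimal sketch of one way to score how much of a document survives an edit. It is an illustration only: the excerpt does not say how DELEGATE-52 measures corruption, so the line-level metric and the function names `degradation` and `mean_degradation` are assumptions, not the study's method.

```python
from difflib import SequenceMatcher


def degradation(original: str, edited: str) -> float:
    """Fraction of the original's lines that do not survive in the edited text.

    Hypothetical metric for illustration; not the DELEGATE-52 scoring rule.
    0.0 means every original line is preserved in order, 1.0 means none are.
    """
    orig_lines = original.splitlines()
    if not orig_lines:
        return 0.0
    matcher = SequenceMatcher(None, orig_lines, edited.splitlines(), autojunk=False)
    # Matching blocks are runs of identical lines found in both versions.
    preserved = sum(block.size for block in matcher.get_matching_blocks())
    return 1.0 - preserved / len(orig_lines)


def mean_degradation(pairs):
    """Average degradation over (original, edited) document pairs."""
    scores = [degradation(original, edited) for original, edited in pairs]
    if not scores:
        raise ValueError("need at least one document pair")
    return sum(scores) / len(scores)


# One of four lines was altered, so a quarter of the original is lost.
score = degradation("a\nb\nc\nd", "a\nb\nX\nd")  # 0.25
```

A line-level score like this is crude: it counts a one-character change as a lost line and ignores whether the change was the requested edit. A real evaluation would need to separate intended edits from unintended ones.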
Opening excerpt (first ~120 words)
LLMs Corrupt Your Documents When You Delegate
Philippe Laban, Tobias Schnabel, Jennifer Neville. April 2026.

Large Language Models (LLMs) are poised to disrupt knowledge work, with the emergence of delegated work as a new interaction paradigm (e.g., vibe coding). Delegation requires trust – the expectation that the LLM will faithfully execute the task without introducing errors into documents. We introduce DELEGATE-52 to study the readiness of AI systems in delegated workflows. DELEGATE-52 simulates long delegated workflows that require in-depth document editing across 52 professional domains, such as coding, crystallography, and music notation.
…
Excerpt limited to ~120 words for fair-use compliance. The full article is at Microsoft Research.