A Universal Cliff and a Design Fingerprint: Cross-Section Defect Detection Under LLM Orchestration
The paper examines how well language-model systems detect cross-section defects, that is, contradictions between two distant sections of a document. It reports a sharp drop in detection capability when models operate under orchestration rather than as a single agent. It also finds that the most aligned systems may not be the safest, which points to structural issues in how such defects are detected.
- A universal detection cliff is observed, where models lose the ability to find cross-section defects under orchestration.
- Detection capability falls by two-thirds or more across various tested paradigms.
- Only one developer's model shows improvement in defect detection as alignment strengthens, but it also raises false alarms.
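To make the idea of a cross-section defect concrete, here is a minimal toy sketch in Python. It is not the paper's method, benchmark, or data. The `Claim` type, the function names, and the sample document are all hypothetical, and the assumption that each worker sees exactly one section is a simplification made for illustration. The sketch only shows why a checker that sees the whole document can catch a contradiction between sections, while checkers that each see one section cannot.

```python
from dataclasses import dataclass


@dataclass
class Claim:
    """A hypothetical, simplified statement extracted from one section."""
    section: str
    key: str
    value: str


def find_contradictions(claims):
    """Return (key, first_section, later_section) for claims that share a key but disagree."""
    seen = {}
    found = []
    for claim in claims:
        earlier = seen.get(claim.key)
        if earlier is not None and earlier.value != claim.value:
            found.append((claim.key, earlier.section, claim.section))
        seen.setdefault(claim.key, claim)
    return found


def single_agent_check(claims):
    """One checker sees every section at once."""
    return find_contradictions(claims)


def orchestrated_check(claims):
    """Assumed toy orchestration: each worker sees only the claims of its own section."""
    by_section = {}
    for claim in claims:
        by_section.setdefault(claim.section, []).append(claim)
    found = []
    for section_claims in by_section.values():
        found.extend(find_contradictions(section_claims))
    return found


# Invented example document: two sections disagree about the sample size.
DOCUMENT = [
    Claim("methods", "sample_size", "120"),
    Claim("methods", "design", "randomized"),
    Claim("results", "sample_size", "95"),
]

if __name__ == "__main__":
    print(single_agent_check(DOCUMENT))   # [('sample_size', 'methods', 'results')]
    print(orchestrated_check(DOCUMENT))   # []
```

In this toy setup the contradiction exists only in the relation between the two sections, so partitioning by section removes it from every worker's view. Real orchestration pipelines differ in how they split and recombine work, so this should be read as an intuition aid only.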
Opening excerpt (first ~120 words)
arXiv:2605.26174 (cs), Computer Science > Software Engineering. Submitted on 25 May 2026. Author: Hiroki Fukui.
Abstract: Production language-model systems answer a request by partitioning it across an invisible orchestration of worker agents that recompose one integrated report. We ask what this does to a class of defect no single worker can see: a contradiction in the relation between two distant sections of a document.
…
Excerpt limited to ~120 words for fair-use compliance. The full article is at arXiv cs.AI.