JUDO: A Juxtaposed Domain-Oriented Multimodal Reasoner for Industrial Anomaly QA
The paper presents JUDO, a framework for industrial anomaly question answering that integrates domain knowledge into visual and textual reasoning. By supplying domain-specific context, JUDO improves how Large Multimodal Models (LMMs) understand anomalies. The reported experiments indicate that JUDO outperforms existing models on the MMAD benchmark.
- JUDO is a Juxtaposed Domain-Oriented Multimodal Reasoner designed for industrial anomaly detection.
- The framework uses visual reasoning to segment defect regions by comparing query images with normal images.
- JUDO incorporates domain knowledge through supervised fine-tuning and reinforcement learning to enhance reasoning capabilities.
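To make the query-versus-normal comparison concrete, here is a minimal sketch of the simplest possible version of that idea: a per-pixel difference against a normal reference image, thresholded into a defect mask. This is not JUDO's method, which relies on a learned multimodal model; the function names, the list-of-lists image representation, and the `threshold` value are all assumptions chosen for illustration.

```python
from typing import List, Optional, Tuple

# Hypothetical representation: a grayscale image as rows of intensities in [0, 1].
Image = List[List[float]]
Mask = List[List[int]]


def difference_mask(query: Image, normal: Image, threshold: float = 0.2) -> Mask:
    """Flag pixels where the query deviates from the normal reference.

    Illustrative baseline only; the threshold is an assumed value.
    """
    same_shape = len(query) == len(normal) and all(
        len(q_row) == len(n_row) for q_row, n_row in zip(query, normal)
    )
    if not same_shape:
        raise ValueError("query and normal images must have the same shape")
    return [
        [1 if abs(q - n) > threshold else 0 for q, n in zip(q_row, n_row)]
        for q_row, n_row in zip(query, normal)
    ]


def bounding_box(mask: Mask) -> Optional[Tuple[int, int, int, int]]:
    """Return (top, left, bottom, right) of flagged pixels, or None if none are flagged."""
    coords = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    if not coords:
        return None
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return min(rows), min(cols), max(rows), max(cols)


# Toy example: a uniform 4x4 "normal" part and a query with a two-pixel defect.
normal = [[0.5] * 4 for _ in range(4)]
query = [row[:] for row in normal]
query[1][2] = 0.9
query[2][2] = 0.95

mask = difference_mask(query, normal)
box = bounding_box(mask)
```

In this toy case the mask flags only the two altered pixels, and `box` encloses them. A raw pixel difference like this is sensitive to lighting and misalignment, which is one reason learned, visually grounded reasoning is attractive for this task.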
arXiv:2605.20284 (cs), Computer Science > Computer Vision and Pattern Recognition. Submitted on 19 May 2026.
Authors: Hyunju Kang, Woohyun Lee, Jaewon Kim, Hogun Park
Abstract: Industrial anomaly detection has been significantly advanced by Large Multimodal Models (LMMs), enabling diverse human instructions beyond detection, particularly through visually grounded reasoning for better image understanding.
…
Excerpt limited to ~120 words for fair-use compliance. The full article is at arXiv cs.AI.