JUDO: A Juxtaposed Domain-Oriented Multimodal Reasoner for Industrial Anomaly QA

May 22, 2026 · 4:00 AM UTC · 3 min read

#computer vision #artificial intelligence #machine learning

TL;DR · WeSearch summary

The paper presents JUDO, a new framework for industrial anomaly detection that integrates domain knowledge into visual and textual reasoning. JUDO enhances the performance of Large Multimodal Models by incorporating domain-specific context, allowing for improved anomaly understanding. Experimental results indicate that JUDO outperforms existing models on the MMAD benchmark.

Key facts

▪ JUDO is a Juxtaposed Domain-Oriented Multimodal Reasoner designed for industrial anomaly detection.
▪ The framework uses visual reasoning to segment defect regions by comparing query images with normal images.
▪ JUDO incorporates domain knowledge through supervised fine-tuning and reinforcement learning to enhance reasoning capabilities.
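
The second key fact, comparing a query image against a normal image to localize a defect, can be illustrated with a deliberately simple sketch. This is not JUDO's method: the summary does not describe how the model performs the comparison, so the per-pixel absolute difference, the `threshold` value, and the helper names `difference_mask` and `bounding_box` are all assumptions chosen only to show the general idea of reference-based localization.

```python
from typing import List, Optional, Tuple

Image = List[List[float]]  # grayscale image as rows of intensities in [0, 1]


def difference_mask(query: Image, normal: Image, threshold: float = 0.2) -> List[List[bool]]:
    """Flag pixels where the query deviates from the normal reference.

    Illustrative stand-in only: the per-pixel absolute difference and
    `threshold` are assumptions, not the segmentation used by JUDO.
    """
    same_shape = len(query) == len(normal) and all(
        len(q_row) == len(n_row) for q_row, n_row in zip(query, normal)
    )
    if not same_shape:
        raise ValueError("query and normal images must have the same shape")
    return [
        [abs(q - n) > threshold for q, n in zip(q_row, n_row)]
        for q_row, n_row in zip(query, normal)
    ]


def bounding_box(mask: List[List[bool]]) -> Optional[Tuple[int, int, int, int]]:
    """Return (top, left, bottom, right) of the flagged pixels, or None if there are none."""
    coords = [(r, c) for r, row in enumerate(mask) for c, flagged in enumerate(row) if flagged]
    if not coords:
        return None
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return (min(rows), min(cols), max(rows), max(cols))


# Synthetic example: a uniform "normal" image and a query with two altered pixels.
normal = [[0.5] * 4 for _ in range(4)]
query = [row[:] for row in normal]
query[1][2] = 0.9  # synthetic bright defect pixel
query[2][2] = 0.1  # synthetic dark defect pixel

mask = difference_mask(query, normal)
print(bounding_box(mask))
```

A raw pixel difference like this only works when the two images are aligned and lit identically, which is one reason learned visual reasoning is attractive for industrial inspection.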

Original article

arXiv cs.AI

Opening excerpt (first ~120 words)

arXiv:2605.20284 (cs) · Computer Vision and Pattern Recognition
Submitted on 19 May 2026
Title: JUDO: A Juxtaposed Domain-Oriented Multimodal Reasoner for Industrial Anomaly QA
Authors: Hyunju Kang, Woohyun Lee, Jaewon Kim, Hogun Park

Abstract: Industrial anomaly detection has been significantly advanced by Large Multimodal Models (LMMs), enabling diverse human instructions beyond detection, particularly through visually grounded reasoning for better image understanding.

…

Excerpt limited to ~120 words for fair-use compliance. The full article is at arXiv cs.AI.
