IAD-Unify: A Region-Grounded Unified Model for Industrial Anomaly Segmentation, Understanding, and Generation
Summary: arXiv:2604.12440v1 Announce Type: cross
Abstract: Real-world industrial inspection requires not only localizing defects, but also explaining them in natural language and generating controlled defect edits. However, existing approaches fail to jointly support all three capabilities within a unified framework and evaluation protocol.
To address this gap, we present IAD-Unify, a dual-encoder unified framework in which a frozen DINOv2-based region expert supplies precise anomaly evidence to a shared Qwen3.5-4B vision-language backbone via lightweight token injection. This design supports three tasks:
- Anomaly segmentation
- Region-grounded understanding
- Mask-guided generation
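The abstract does not specify how the token injection is implemented. As a minimal sketch of one plausible reading, the example below projects per-patch features from a frozen region expert into the backbone's embedding width and appends the tokens for patches flagged as anomalous to the backbone's token sequence. The function names (`linear`, `inject_region_tokens`), the toy dimensions, and the concatenation strategy are all illustrative assumptions, not the authors' method.

```python
from typing import List

Vector = List[float]


def linear(x: Vector, weight: List[Vector], bias: Vector) -> Vector:
    """Apply y = W x + b, where W is given as a list of output rows."""
    return [
        sum(w_i * x_i for w_i, x_i in zip(row, x)) + b
        for row, b in zip(weight, bias)
    ]


def inject_region_tokens(
    backbone_tokens: List[Vector],
    patch_features: List[Vector],
    patch_mask: List[bool],
    weight: List[Vector],
    bias: Vector,
) -> List[Vector]:
    """Append projected region-expert tokens to the backbone sequence.

    Hypothetical helper: only patches flagged in patch_mask are injected,
    and the projection is a single linear layer (an assumption).
    """
    region_tokens = [
        linear(feature, weight, bias)
        for feature, keep in zip(patch_features, patch_mask)
        if keep
    ]
    return backbone_tokens + region_tokens


# Toy sizes (assumed): backbone width 3, region-expert width 2.
backbone_tokens = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
patch_features = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
patch_mask = [False, True, True]  # patches predicted as anomalous
weight = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # 3 x 2 projection
bias = [0.0, 0.0, 0.5]

sequence = inject_region_tokens(
    backbone_tokens, patch_features, patch_mask, weight, bias
)
print(len(sequence))  # two backbone tokens plus two injected region tokens
```

In this sketch the region expert's weights are never updated; only the small projection would be trainable, which is one way a "lightweight" injection could keep the added parameter count low.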
To evaluate all three capabilities under one protocol, we construct Anomaly-56K, a multi-task IAD evaluation platform spanning 59,916 images across 24 categories and 104 defect variants. Controlled ablations yield four findings:
- Decisive mechanism: Region grounding is critical for understanding; removing it lowers location accuracy by more than 76 percentage points.
- Deployment viability: Performance with predicted regions closely matches oracle-region results, which supports practical deployment.
- Image fidelity: Region-grounded generation yields the best full-image fidelity and improved perceptual quality in masked regions.
- Training efficiency: Pre-initialized joint training improves understanding at a cost of only 0.16 dB in generation quality.
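The abstract reports the generation cost in dB without naming the metric. Assuming it is PSNR (an assumption, not stated in the source), the sketch below computes PSNR over a full image and over a masked region only, and converts a 0.16 dB drop into the equivalent ratio of mean squared errors. The function name `psnr` and the toy pixel values are illustrative.

```python
import math
from typing import Optional, Sequence


def psnr(
    reference: Sequence[float],
    generated: Sequence[float],
    mask: Optional[Sequence[bool]] = None,
    max_value: float = 255.0,
) -> float:
    """PSNR in dB over flattened pixels; restricted to mask if given."""
    pairs = [
        (r, g)
        for i, (r, g) in enumerate(zip(reference, generated))
        if mask is None or mask[i]
    ]
    if not pairs:
        raise ValueError("no pixels selected")
    mse = sum((r - g) ** 2 for r, g in pairs) / len(pairs)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy 4-pixel image (assumed values): only the masked half was edited.
reference = [100.0, 100.0, 100.0, 100.0]
generated = [100.0, 100.0, 110.0, 90.0]
mask = [False, False, True, True]

full_psnr = psnr(reference, generated)          # errors averaged over 4 pixels
masked_psnr = psnr(reference, generated, mask)  # errors averaged over 2 pixels

# A PSNR drop of d dB multiplies the MSE by 10 ** (d / 10).
mse_ratio = 10 ** (0.16 / 10)
print(f"full={full_psnr:.2f} dB, masked={masked_psnr:.2f} dB")
print(f"0.16 dB drop -> MSE x {mse_ratio:.4f}")
```

Under the PSNR assumption, a 0.16 dB decrease corresponds to a mean squared error roughly 3.8% higher. The toy example also shows why full-image and masked-region scores are reported separately: unedited pixels dilute the error in the full-image figure.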
IAD-Unify also performs strongly on the MMAD benchmark, including on categories that were not part of the training data, which indicates cross-category generalization.
In summary, IAD-Unify integrates anomaly segmentation, region-grounded understanding, and mask-guided generation in a single framework, pairing defect localization with natural-language explanations, while Anomaly-56K provides a common platform for assessing all three capabilities.
