Universal Defect Generation Model with Large-Scale Dataset

Large-Scale Universal Defect Generation: Foundation Models and Datasets

Source: arXiv:2604.08915v1 (cross-listed)

Abstract: Existing defect/anomaly generation methods often rely on few-shot learning, which overfits to specific defect categories due to the lack of large-scale paired defect editing data. This issue is aggravated by substantial variations in defect scale and morphology, resulting in limited generalization and in degraded realism and category consistency. We address these challenges by introducing UDG, a large-scale dataset of 300K normal-abnormal-mask-caption quadruplets spanning diverse domains, and by presenting UniDG, a universal defect generation foundation model that supports both reference-based defect generation and text instruction-based defect editing without per-category fine-tuning.

Introduction

Defect generation has long been limited by its reliance on few-shot learning. Few-shot methods are fit to a handful of examples from specific defect categories, so they tend to overfit to those categories and generalize poorly to new ones. The root cause is the lack of large-scale paired defect editing data, and the problem is made worse by the wide variation in defect scale and morphology. The result is generated defects with reduced realism and weaker category consistency.

Introducing UDG and UniDG

To tackle these challenges, the researchers built a new dataset, UDG, which consists of 300,000 quadruplets. Each quadruplet holds a normal image, an abnormal image, a defect mask, and a caption. The dataset spans diverse domains, which gives defect generation models the large-scale paired training data that few-shot approaches lack.
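To make the quadruplet structure concrete, here is a minimal sketch of how one record could be represented in Python. The class name `DefectQuadruplet`, its field names, and the array layouts are assumptions made for illustration; the source does not describe UDG's actual file format or schema.

```python
from dataclasses import dataclass

import numpy as np


@dataclass(frozen=True)
class DefectQuadruplet:
    """Hypothetical container for one normal-abnormal-mask-caption record."""

    normal: np.ndarray    # H x W x 3 defect-free image (assumed layout)
    abnormal: np.ndarray  # H x W x 3 image of the same scene with a defect
    mask: np.ndarray      # H x W boolean map, True where the defect is
    caption: str          # text description of the defect

    def __post_init__(self):
        # Basic consistency checks between the three image-like fields
        if self.normal.shape != self.abnormal.shape:
            raise ValueError("normal and abnormal images must share a shape")
        if self.mask.shape != self.normal.shape[:2]:
            raise ValueError("mask must match the image height and width")

    def defect_area_ratio(self) -> float:
        """Fraction of pixels covered by the defect mask."""
        return float(self.mask.mean())


# Toy record: a 2 x 4 bright patch on an 8 x 8 black image
normal = np.zeros((8, 8, 3), dtype=np.uint8)
abnormal = normal.copy()
mask = np.zeros((8, 8), dtype=bool)
mask[2:4, 2:6] = True
abnormal[mask] = 255

sample = DefectQuadruplet(normal, abnormal, mask, "a short scratch near the center")
print(sample.defect_area_ratio())  # 8 of 64 pixels are masked
```

Keeping the mask next to the image pair is what makes a record usable for editing-style training: the mask says where the abnormal image differs from the normal one, and the caption says what the difference is.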

Alongside UDG, the team has introduced UniDG, a universal defect generation foundation model that allows for:

Reference-based defect generation
Text instruction-based defect editing

Importantly, UniDG does not require per-category fine-tuning, so one model covers both modes across defect categories instead of needing a separately tuned model for each.

Innovative Features of UniDG

UniDG combines three techniques:

Defect-Context Editing: Adaptive defect cropping and a structured diptych input format are used to improve the quality of generated defects.
Multimodal Attention: The model integrates the reference and target conditions through MM-DiT multimodal attention, which makes generated defects more coherent with their context.
Two-Stage Training Strategy: Training runs in two stages, Diversity-SFT followed by Consistency-RFT, to increase diversity while improving realism and reference consistency.
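The cropping and diptych ideas in the list above can be sketched with plain NumPy. This is an illustration under stated assumptions, not UniDG's implementation: the source does not say how the crop adapts or how the diptych is laid out. Here the crop is padded in proportion to the defect size, and the diptych places a reference crop on the left and a target crop on the right. The function names and the `margin` and `min_size` parameters are hypothetical.

```python
import numpy as np


def adaptive_crop_box(mask, margin=0.5, min_size=4):
    """Return (top, left, bottom, right) around the defect.

    The padding scales with the defect's larger side, so small defects
    get tight crops and large defects keep more context (an assumption).
    """
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        raise ValueError("mask contains no defect pixels")
    top, bottom = int(ys.min()), int(ys.max()) + 1
    left, right = int(xs.min()), int(xs.max()) + 1
    side = max(bottom - top, right - left, min_size)
    pad = int(round(side * margin))
    height, width = mask.shape
    # Clamp the padded box to the image bounds
    return (max(top - pad, 0), max(left - pad, 0),
            min(bottom + pad, height), min(right + pad, width))


def make_diptych(reference_crop, target_crop):
    """Place the reference crop (left) and the target crop (right) side by side."""
    if reference_crop.shape != target_crop.shape:
        raise ValueError("both panels must share a shape; resize beforehand")
    return np.concatenate([reference_crop, target_crop], axis=1)


# Toy example: a 2 x 4 defect inside a 16 x 16 image
mask = np.zeros((16, 16), dtype=bool)
mask[6:8, 5:9] = True
box = adaptive_crop_box(mask)
top, left, bottom, right = box

reference = np.full((16, 16, 3), 200, dtype=np.uint8)  # stands in for a defect image
target = np.zeros((16, 16, 3), dtype=np.uint8)         # stands in for a normal image
diptych = make_diptych(reference[top:bottom, left:right],
                       target[top:bottom, left:right])
print(box, diptych.shape)
```

Cropping before building the diptych keeps a small defect from being lost in a large image, which is one plausible reason to pair the two steps.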

Performance Evaluation

UniDG was evaluated in extensive experiments on benchmarks including MVTec-AD and VisA. According to the results, it outperforms prior few-shot anomaly generation methods as well as existing image insertion and editing baselines. Reported gains include:

Improved synthesis quality
Enhanced single-class and multi-class anomaly detection
Effective localization of defects
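The article does not name the metrics behind these results. As an assumption for illustration, the sketch below uses AUROC, a common choice for anomaly detection and localization on benchmarks such as MVTec-AD and VisA. The same function serves both levels: image-level detection takes one score per image, and pixel-level localization takes the flattened mask and score map. All numbers are toy values, not results from the paper.

```python
import numpy as np


def auroc(labels, scores):
    """Area under the ROC curve, computed by pairwise comparison.

    Equals the probability that a random positive outscores a random
    negative, with ties counted as half a win.
    """
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    n_pos = int(labels.sum())
    n_neg = labels.size - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative sample")
    pos = scores[labels][:, None]   # column of positive scores
    neg = scores[~labels][None, :]  # row of negative scores
    wins = (pos > neg).sum() + 0.5 * (pos == neg).sum()
    return float(wins / (n_pos * n_neg))


# Image-level detection: one anomaly score per image (toy values)
image_auroc = auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])

# Pixel-level localization: flatten the ground-truth mask and the score map
gt_mask = np.array([[0, 1], [0, 0]])
score_map = np.array([[0.2, 0.9], [0.1, 0.3]])
pixel_auroc = auroc(gt_mask.ravel(), score_map.ravel())

print(image_auroc, pixel_auroc)
```

The pairwise form is quadratic in the number of samples, so it suits small examples like this one; for full-resolution score maps a sort-based implementation is the usual choice.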

Conclusion

In conclusion, UDG and UniDG address the two main gaps in defect generation: the missing large-scale paired data and the need for per-category fine-tuning. With a 300K-quadruplet dataset and a single foundation model that handles both reference-based generation and text instruction-based editing, this work supports more robust anomaly detection and localization across domains. The code for UniDG is available on GitHub.

RichlyAI Blog AI Guide, Tutorials, Industrial Insights, & more!
