Uni-DAD: Unified Distillation and Adaptation of Diffusion Models for Few-step Few-shot Image Generation
arXiv:2511.18281v3
Abstract
Diffusion models (DMs) produce high-quality images, but sampling from them is costly, and that cost remains after a model is adapted to a new domain. Distilled DMs generate faster, yet they typically stay confined to their teacher's domain. Fast, high-quality generation in a novel domain therefore usually relies on a two-stage pipeline that either adapts and then distills or distills and then adapts. Both orderings add design complexity and often degrade image quality or diversity.
Introduction of Uni-DAD
To address these issues, we introduce Uni-DAD, a single-stage pipeline that unifies DM distillation and adaptation. It combines two training signals:
- Dual-domain distribution-matching distillation (DMD) objective: guides the student toward the distributions of both the source teacher and a target teacher.
- Multi-head generative adversarial network (GAN) loss: promotes realism in the generated images across multiple feature scales.
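As a rough illustration of how two such signals might be combined into one student objective, the sketch below adds a source DMD term, a target DMD term, and a GAN term averaged over discriminator heads. Everything in it is an assumption made for illustration: the function names, the weights `w_source`, `w_target`, and `w_gan`, and the choice of the non-saturating GAN loss are hypothetical and are not taken from the Uni-DAD code.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def multi_head_gan_loss(head_logits):
    """Non-saturating generator loss averaged over discriminator heads.

    head_logits holds one list of logits per head (for example, one head
    per feature scale). Each head contributes equally to the average,
    regardless of how many logits it emits.
    """
    per_head = [
        sum(softplus(-z) for z in logits) / len(logits)
        for logits in head_logits
    ]
    return sum(per_head) / len(per_head)


def student_loss(dmd_source, dmd_target, head_logits,
                 w_source=1.0, w_target=1.0, w_gan=1.0):
    """Weighted sum of the three terms (weights are hypothetical)."""
    return (w_source * dmd_source
            + w_target * dmd_target
            + w_gan * multi_head_gan_loss(head_logits))


if __name__ == "__main__":
    # Two heads with different numbers of logits; scalar DMD terms are
    # placeholders standing in for values computed elsewhere.
    print(student_loss(0.8, 1.2, [[0.3, -0.1], [1.5]]))
```

Averaging per head before averaging across heads keeps a head with many spatial logits from dominating one with few, which is one plausible design choice among several.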
Advantages of Uni-DAD
This design has three advantages:
- Preservation of diverse source knowledge: Source-domain distillation lets the student retain the variety of knowledge held by the source teacher.
- Stabilized training and reduced overfitting: The multi-head GAN helps stabilize training, particularly in few-shot settings where data is limited.
- Adaptation to structurally distant domains: The target teacher enables adaptation to domains whose structure differs significantly from the source.
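The interplay between the source and target teachers can be pictured through the update direction used in distribution-matching distillation, where the student is pushed along the difference between a "fake" score (estimated on student samples) and a teacher score. The sketch below applies that idea to two teachers and blends the two directions. It is a minimal sketch under stated assumptions: the single shared fake score, the blending weights, and all function names are hypothetical, and the actual Uni-DAD formulation may differ.

```python
def dmd_direction(fake_score, teacher_score):
    """Per-element DMD direction: fake score minus teacher score."""
    return [f - t for f, t in zip(fake_score, teacher_score)]


def dual_domain_direction(fake_score, source_score, target_score,
                          w_source=0.5, w_target=0.5):
    """Blend the source and target directions (weights are hypothetical).

    Setting w_target to 0 reduces this to source-only distillation, and
    setting w_source to 0 reduces it to target-only distillation.
    """
    src = dmd_direction(fake_score, source_score)
    tgt = dmd_direction(fake_score, target_score)
    return [w_source * s + w_target * t for s, t in zip(src, tgt)]


if __name__ == "__main__":
    # Toy 2-element "scores"; real scores would be image-shaped tensors.
    print(dual_domain_direction([1.0, 2.0], [0.0, 0.0], [1.0, 2.0]))
```

In this toy case the student already agrees with the target teacher, so the target direction is zero and only the source term contributes to the update.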
Evaluation and Results
We evaluated Uni-DAD on two benchmarks with diffusion backbones: few-shot image generation (FSIG) and subject-driven personalization (SDP). Uni-DAD matches and often exceeds the quality of state-of-the-art (SoTA) adaptation methods, even with as few as four sampling steps. In many cases it also outperforms two-stage pipelines in both the quality and the diversity of the generated images.
Conclusion
In conclusion, Uni-DAD merges distillation and adaptation into a single, efficient pipeline, making fast, high-quality image generation in new domains possible without a two-stage procedure.
Code Availability
The code for Uni-DAD is available on GitHub.
