DeCo: Efficient Frequency-Decoupled Pixel Diffusion for AI Images

DeCo: Frequency-Decoupled Pixel Diffusion for End-to-End Image Generation

Source: arXiv:2511.19365v2

Abstract: Pixel diffusion aims to generate images directly in pixel space in an end-to-end fashion. This approach avoids the limitations of Variational Autoencoders (VAE) in the two-stage latent diffusion, offering higher model capacity. Existing pixel diffusion models suffer from slow training and inference times, as they typically model both high-frequency signals and low-frequency semantics within a single diffusion transformer (DiT). To pursue a more efficient pixel diffusion paradigm, we propose the frequency-DeCoupled pixel diffusion framework.

Introduction

Pixel diffusion generates images directly in pixel space, without the VAE stage used by latent diffusion. Existing pixel diffusion models, however, are slow to train and sample, because a single diffusion transformer (DiT) has to model both high-frequency detail and low-frequency semantics. DeCo separates the two: the DiT handles low-frequency semantics while a lightweight decoder produces high-frequency detail, with the aim of speeding up both training and inference.
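The idea of separating low- and high-frequency content can be illustrated on a tiny grid of numbers. The sketch below is only a conceptual illustration, not DeCo's architecture: it assumes a simple block-average low-pass filter, and the function name `split_frequencies` and the sample `image` are hypothetical.

```python
# Illustrative only: a box-filter split into coarse and fine content.
# This is NOT how DeCo decomposes images; it just shows what "low frequency"
# and "high frequency" mean for a small 2D signal.

def split_frequencies(image, block=2):
    """Split a 2D list into a low-frequency part (block means, repeated
    back to full size) and a high-frequency residual."""
    h, w = len(image), len(image[0])
    assert h % block == 0 and w % block == 0, "size must be divisible by block"
    low = [[0.0] * w for _ in range(h)]
    for by in range(0, h, block):
        for bx in range(0, w, block):
            # Mean of the block: the coarse, low-frequency content.
            total = sum(image[y][x]
                        for y in range(by, by + block)
                        for x in range(bx, bx + block))
            mean = total / (block * block)
            for y in range(by, by + block):
                for x in range(bx, bx + block):
                    low[y][x] = mean
    # Residual: the fine detail that the coarse version misses.
    high = [[image[y][x] - low[y][x] for x in range(w)] for y in range(h)]
    return low, high


image = [
    [1.0, 3.0, 10.0, 10.0],
    [1.0, 3.0, 10.0, 10.0],
    [5.0, 5.0, 0.0, 8.0],
    [5.0, 5.0, 8.0, 0.0],
]
low, high = split_frequencies(image, block=2)

# Adding the two parts back together recovers the original image.
recon = [[low[y][x] + high[y][x] for x in range(4)] for y in range(4)]
```

Flat regions end up entirely in `low`, while the checkerboard-like corner ends up mostly in `high`. Two parts with such different statistics are plausibly easier to handle with two specialized components than with one.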

Key Features of DeCo

  • Decoupled Generation: A lightweight pixel decoder generates high-frequency details, conditioned on semantic guidance from the DiT. This frees the DiT to specialize in low-frequency semantics.
  • Frequency-aware Flow-matching Loss: The loss emphasizes visually salient frequencies and suppresses insignificant ones, which improves the quality of the generated images.
  • Performance Metrics: On ImageNet, DeCo reports a Fréchet Inception Distance (FID, lower is better) of 1.62 at 256×256 and 2.22 at 512×512, narrowing the gap with latent diffusion methods.
  • Leading Text-to-Image Model: In a system-level comparison, DeCo’s pretrained text-to-image model achieves a leading overall score of 0.86 on GenEval.
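To make the frequency-aware loss more concrete, here is a minimal sketch of one way a flow-matching error could be weighted per frequency. It is written under stated assumptions and is not the paper's implementation: it assumes a linear interpolation path (so the target velocity is `x1 - x0`), uses a 2D DCT as the frequency transform, and uses a made-up weight table in `frequency_weights`. All function names are hypothetical.

```python
import math

def dct_1d(v):
    """Orthonormal DCT-II of a list of floats."""
    n = len(v)
    out = []
    for k in range(n):
        s = sum(v[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out

def dct_2d(block):
    """Separable 2D DCT: transform rows, then columns."""
    rows = [dct_1d(r) for r in block]
    cols = [dct_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]  # indexed as [ky][kx]

def frequency_weights(n):
    """Hypothetical weights (an assumption, not the paper's values):
    larger for low frequencies, smaller for high frequencies."""
    return [[1.0 / (1.0 + ky + kx) for kx in range(n)] for ky in range(n)]

def freq_weighted_fm_loss(v_pred, x0, x1, weights):
    """Weighted squared velocity error, measured per DCT coefficient.
    Assumes the linear path x_t = (1 - t) * x0 + t * x1, so the target
    velocity is x1 - x0."""
    n = len(x0)
    err = [[v_pred[y][x] - (x1[y][x] - x0[y][x]) for x in range(n)]
           for y in range(n)]
    coeffs = dct_2d(err)
    total = sum(weights[ky][kx] * coeffs[ky][kx] ** 2
                for ky in range(n) for kx in range(n))
    return total / (n * n)


n = 4
x0 = [[0.0] * n for _ in range(n)]                          # stand-in for noise
x1 = [[float(x + y) for x in range(n)] for y in range(n)]   # stand-in for data
target = [[x1[y][x] - x0[y][x] for x in range(n)] for y in range(n)]
w = frequency_weights(n)

# A perfect prediction gives zero loss.
loss_perfect = freq_weighted_fm_loss(target, x0, x1, w)

# Two wrong predictions with the same pixel-space error energy:
# a constant offset (lowest frequency) and a checkerboard (high frequency).
pred_low = [[target[y][x] + 1.0 for x in range(n)] for y in range(n)]
pred_high = [[target[y][x] + (-1.0) ** (x + y) for x in range(n)]
             for y in range(n)]
loss_low = freq_weighted_fm_loss(pred_low, x0, x1, w)
loss_high = freq_weighted_fm_loss(pred_high, x0, x1, w)
```

With these placeholder weights, `loss_high` comes out smaller than `loss_low` even though both errors have the same plain mean squared error. The model is penalized less for mistakes in the frequencies the weight table treats as unimportant. Which frequencies count as visually salient is the paper's design choice and is not reproduced here.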

Conclusion

DeCo is a promising step for pixel diffusion. By decoupling the generation of high- and low-frequency components, it improves efficiency as well as the quality of generated images. The code is publicly available at https://github.com/Zehong-Ma/DeCo.

Future Directions

It remains to be seen how well frequency decoupling carries over to other areas of generative modeling. Gains in speed and image quality could benefit applications such as virtual reality, gaming, and artistic content creation.


Lazarus Omolua