HP-Edit: A Human-Preference Post-Training Framework for Image Editing
arXiv:2604.19406v1
Generative diffusion models have become the standard for real-world content editing. Reinforcement learning (RL) techniques such as Diffusion-DPO and Flow-GRPO have improved generation quality, but Reinforcement Learning from Human Feedback (RLHF) has remained largely unexplored for diffusion-based editing. The main obstacle is the lack of scalable human-preference datasets and of frameworks tailored to diverse editing needs.
To address this, the authors introduce HP-Edit, a post-training framework for Human Preference-aligned Editing. It is released together with RealPref-50K, a real-world dataset that spans eight common editing tasks while balancing the editing of common objects.
Key Features of HP-Edit
- Human-Preference Scoring: HP-Edit combines a small amount of human-preference scoring data with a pretrained visual large language model (VLM) to build HP-Scorer, an automatic evaluator aligned with human preferences.
- Scalable Preference Dataset: HP-Scorer is then used to construct a preference dataset efficiently and at scale for training.
- Reward Function for Post-Training: HP-Scorer also serves as the reward function when post-training the editing model.
- RealPref-Bench Benchmark: Alongside the dataset, HP-Edit introduces RealPref-Bench, a benchmark for evaluating real-world editing performance across models.
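The two roles of HP-Scorer listed above (labeling preference data and acting as a reward) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Edit` record, the `min_margin` threshold, and the toy score table are hypothetical, and the scorer is a plain callable standing in for the VLM-based HP-Scorer. The summary does not say which RL algorithm HP-Edit uses, so the group-relative normalization shown here is a generic GRPO-style choice, not the authors' method.

```python
from dataclasses import dataclass
from itertools import combinations
from statistics import mean, pstdev
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class Edit:
    """One candidate output for an editing request (hypothetical record)."""
    source_id: str    # identifier of the input image
    instruction: str  # the edit instruction given to the model
    output_id: str    # identifier of the edited image


# Stand-in for HP-Scorer: maps a candidate edit to a scalar, higher is better.
Scorer = Callable[[Edit], float]


def build_preference_pairs(
    candidates: Sequence[Edit], scorer: Scorer, min_margin: float = 1.0
) -> List[Tuple[Edit, Edit]]:
    """Return (preferred, rejected) pairs among candidates for one request.

    Pairs whose score gap is below min_margin (an assumed threshold) are
    skipped, so only clearly separated preferences enter the dataset.
    """
    scored = [(c, scorer(c)) for c in candidates]
    pairs = []
    for (a, score_a), (b, score_b) in combinations(scored, 2):
        if score_a == score_b or abs(score_a - score_b) < min_margin:
            continue
        pairs.append((a, b) if score_a > score_b else (b, a))
    return pairs


def group_relative_advantages(
    rewards: Sequence[float], eps: float = 1e-6
) -> List[float]:
    """Normalize rewards within one sampled group (GRPO-style assumption)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Toy usage: three candidate edits for one request, with made-up scores.
candidates = [Edit("img0", "make the sky pink", f"out{i}") for i in range(3)]
toy_scores = {"out0": 4.5, "out1": 2.0, "out2": 4.0}


def toy_scorer(edit: Edit) -> float:
    return toy_scores[edit.output_id]


pairs = build_preference_pairs(candidates, toy_scorer, min_margin=1.0)
print([(w.output_id, l.output_id) for w, l in pairs])
# prints [('out0', 'out1'), ('out2', 'out1')]

advantages = group_relative_advantages([toy_scorer(c) for c in candidates])
```

The pair (out0, out2) is dropped because the two scores differ by only 0.5. In a real pipeline the score table would be replaced by calls to the learned evaluator, and the advantages would weight the policy update of the editing model.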
Experimental Validation
Extensive experiments on models such as Qwen-Image-Edit-2509 show that HP-Edit significantly improves performance, with outputs aligning more closely with human preferences. In addressing the limitations of existing models, the work also offers a reference point for future research on image editing.
HP-Edit illustrates how incorporating human feedback into model training can lead to more intuitive and user-friendly editing experiences.
Conclusion
HP-Edit and RealPref-50K are a notable step for image editing. Frameworks of this kind can help keep editing models aligned with human preferences and improve the quality of digital content creation, and the work is expected to encourage further research on human-preference-aligned AI applications.
