Generating Satellite Imagery Data for Wildfire Detection through Mask-Conditioned Generative AI
arXiv:2604.02479v1 (cross-listed)
Abstract
The scarcity of labeled satellite imagery remains a fundamental bottleneck for deep-learning (DL)-based wildfire monitoring systems. This paper investigates whether a diffusion-based foundation model for Earth Observation (EO), EarthSynth, can synthesize realistic post-wildfire Sentinel-2 RGB imagery conditioned on existing burn masks, without task-specific retraining.
Research Overview
Utilizing burn masks derived from the CalFireSeg-50 dataset (Martin et al., 2025), the study designs and evaluates six controlled experimental configurations. These configurations systematically vary key aspects of the pipeline:
- Pipeline Architecture: Mask-only full generation versus inpainting with pre-fire context.
- Prompt Engineering Strategy: Three hand-crafted prompts and a VLM-generated prompt via Qwen2-VL.
- Region-wise Color-Matching Post-Processing: A method aimed at enhancing the visual accuracy of generated imagery.
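The summary does not spell out how the region-wise color matching is computed. As a minimal sketch, assuming it amounts to a per-region, per-channel mean and standard-deviation transfer from a reference image, the step could look like the following. The function name `match_region_colors`, the choice of reference image, and the statistics used are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def match_region_colors(generated, reference, mask, eps=1e-6):
    """Hypothetical region-wise color matching.

    generated, reference: uint8 RGB arrays of shape (H, W, 3).
    mask: (H, W) array, nonzero where the burn mask marks burned pixels.
    Each region (burned, unburned) of the generated image is shifted and
    scaled per channel to match the reference statistics in that region.
    """
    out = generated.astype(np.float64)
    ref = reference.astype(np.float64)
    burned = np.asarray(mask, dtype=bool)
    for region in (burned, ~burned):
        if not region.any():
            continue  # nothing to match in an empty region
        gen_px = out[region]  # shape (N, 3)
        ref_px = ref[region]
        gen_mean, gen_std = gen_px.mean(axis=0), gen_px.std(axis=0)
        ref_mean, ref_std = ref_px.mean(axis=0), ref_px.std(axis=0)
        # Standardize the generated pixels, then rescale to the reference.
        out[region] = (gen_px - gen_mean) / (gen_std + eps) * ref_std + ref_mean
    return np.clip(np.rint(out), 0, 255).astype(np.uint8)
```

Matching statistics separately inside and outside the mask keeps the burned area from being pulled toward the colors of the surrounding vegetation, which a single global transfer would tend to do.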
Methodology
The quantitative assessment was conducted on 10 stratified test samples using four complementary metrics:
- Burn IoU: A measure of the overlap between the predicted and actual burned areas.
- Burn-Region Color Distance (ΔC_burn): An evaluation of color accuracy in burned regions.
- Darkness Contrast: A metric assessing the visibility of burned areas.
- Spectral Plausibility: An evaluation of the realism of the generated imagery.
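To make the first three metrics concrete, here is a rough sketch under stated assumptions. Burn IoU is the standard intersection-over-union of two binary masks. The exact definitions of ΔC_burn and Darkness Contrast are not given in this summary, so the versions below (Euclidean distance between mean RGB colors inside the burn mask, and mean brightness outside the mask minus mean brightness inside it) are assumed forms. The function names are hypothetical, and Spectral Plausibility is omitted because its definition cannot be inferred from the text.

```python
import numpy as np

def burn_iou(pred_mask, true_mask):
    """Intersection-over-union of two binary burn masks."""
    pred = np.asarray(pred_mask, dtype=bool)
    true = np.asarray(true_mask, dtype=bool)
    union = np.logical_or(pred, true).sum()
    if union == 0:
        return 1.0  # convention chosen here: two empty masks agree perfectly
    return float(np.logical_and(pred, true).sum() / union)

def burn_color_distance(generated, reference, mask):
    """Assumed form of ΔC_burn: distance between mean RGB colors in the burn region."""
    m = np.asarray(mask, dtype=bool)
    gen_mean = np.asarray(generated, dtype=np.float64)[m].mean(axis=0)
    ref_mean = np.asarray(reference, dtype=np.float64)[m].mean(axis=0)
    return float(np.linalg.norm(gen_mean - ref_mean))

def darkness_contrast(image, mask):
    """Assumed form: mean brightness outside the burn mask minus mean brightness inside."""
    m = np.asarray(mask, dtype=bool)
    brightness = np.asarray(image, dtype=np.float64).mean(axis=-1)
    return float(brightness[~m].mean() - brightness[m].mean())
```

Under these definitions, a higher Burn IoU and Darkness Contrast and a lower ΔC_burn are better, which matches the direction in which the results are reported.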
Results
The results indicate that inpainting-based pipelines consistently outperform full-tile generation across all metrics. Notably:
- The structured inpainting prompt achieved the best spatial alignment with a Burn IoU of 0.456.
- Burn saliency was maximized with a Darkness Contrast of 20.44.
- Color matching produced the lowest color distance (ΔC_burn = 63.22), though it came at the cost of reduced burn saliency.
Conclusion
The findings suggest that VLM-assisted inpainting is competitive with hand-crafted prompts, providing a foundation for incorporating generative data augmentation into wildfire detection pipelines. Such augmentation is a possible way to ease the scarcity of labeled satellite imagery for wildfire monitoring.
Access to Resources
For further exploration, the code and experimental details are available on Kaggle.
