A Data-Centric Vision Transformer Baseline for SAR Sea Ice Classification
arXiv:2604.03094v1
Introduction
Accurate and automated sea ice classification is vital for climate monitoring and maritime safety in the Arctic. The operational standard for such classifications is Synthetic Aperture Radar (SAR), thanks to its all-weather capabilities. However, distinguishing morphologically similar ice classes poses significant challenges, particularly under conditions of severe class imbalance.
Research Objectives
The paper does not claim to present a fully validated multimodal system. It aims instead to establish a trustworthy SAR-only baseline that future fusion work can build upon.
Data and Methodology
The study uses the AI4Arctic/ASIP Sea Ice Dataset (v2), which comprises 461 Sentinel-1 scenes matched with expert ice charts, and makes the following data-centric choices:
- Full-resolution Sentinel-1 Extra Wide inputs
- Leakage-aware stratified patch splitting
- SIGRID-3 stage-of-development labels
- Training-set normalization
Together, these choices form the setup on which the Vision Transformer (ViT) baselines are evaluated.
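The source does not describe how the leakage-aware split or the normalization is implemented. The following is a minimal sketch under stated assumptions: "leakage-aware" is taken to mean that all patches cut from one scene stay in the same split, stratification is done on each scene's dominant label, and normalization statistics are computed from training patches only. The names `split_by_scene` and `mean_and_std`, the class labels, and the toy values are all hypothetical.

```python
import math
import random
from collections import Counter, defaultdict


def split_by_scene(patches, test_fraction=0.25, seed=0):
    """Assign whole scenes to train or test (assumed reading of 'leakage-aware').

    patches: list of (scene_id, label) tuples, one per patch.
    Returns (train_idx, test_idx) as lists of patch indices.
    """
    by_scene = defaultdict(list)
    for index, (scene_id, _label) in enumerate(patches):
        by_scene[scene_id].append(index)

    # Stratify at scene level: bucket scenes by their most frequent patch label.
    strata = defaultdict(list)
    for scene_id, indices in by_scene.items():
        labels = Counter(patches[i][1] for i in indices)
        strata[labels.most_common(1)[0][0]].append(scene_id)

    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for label in sorted(strata):
        scenes = sorted(strata[label])
        rng.shuffle(scenes)
        # Hold out at least one scene per stratum when more than one exists.
        n_test = max(1, round(len(scenes) * test_fraction)) if len(scenes) > 1 else 0
        for k, scene_id in enumerate(scenes):
            target = test_idx if k < n_test else train_idx
            target.extend(by_scene[scene_id])
    return train_idx, test_idx


def mean_and_std(values):
    """Population mean and standard deviation of a list of floats."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(variance)


# Toy data: 8 scenes with 3 patches each; labels and values are illustrative only.
scene_labels = ["first_year"] * 4 + ["multi_year"] * 4
patches = [(f"scene{s}", label) for s, label in enumerate(scene_labels) for _ in range(3)]
backscatter = [float(i % 7) for i in range(len(patches))]

train_idx, test_idx = split_by_scene(patches)

# Training-set normalization: statistics come from training patches only,
# then the same transform is applied to every patch, held-out ones included.
train_mean, train_std = mean_and_std([backscatter[i] for i in train_idx])
normalized = [(v - train_mean) / train_std for v in backscatter]
```

Splitting at scene level rather than patch level prevents neighbouring patches of the same scene from landing on both sides of the split, and reusing the training statistics at test time keeps held-out data from influencing preprocessing.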
Model Comparisons
The study compares ViT-Base models trained with cross-entropy and with weighted cross-entropy against a ViT-Large model trained with focal loss. Focal loss addresses class imbalance by down-weighting well-classified examples so that training concentrates on hard ones, which is particularly useful for rare ice classes.
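To make the difference between the three losses concrete, here is a minimal single-example sketch in plain Python. It is not the paper's training code: the function names are hypothetical, and `gamma=2.0` is a commonly used focal-loss setting assumed here for illustration, not a value reported in the source.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target, class_weights=None):
    """Cross-entropy for one example; optional per-class weights give the weighted variant."""
    p_t = softmax(logits)[target]
    weight = 1.0 if class_weights is None else class_weights[target]
    return -weight * math.log(p_t)


def focal_loss(logits, target, gamma=2.0):
    """Focal loss for one example: cross-entropy scaled by (1 - p_t) ** gamma."""
    p_t = softmax(logits)[target]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


# Two illustrative examples whose true class is 0.
easy = [4.0, 0.0, 0.0]  # model is already confident in the true class
hard = [0.0, 2.0, 0.0]  # model prefers a wrong class

easy_ratio = focal_loss(easy, 0) / cross_entropy(easy, 0)
hard_ratio = focal_loss(hard, 0) / cross_entropy(hard, 0)
```

Here `easy_ratio` is much smaller than `hard_ratio`: the modulating factor shrinks the loss of the confident example far more than that of the misclassified one. Weighted cross-entropy, by contrast, scales the loss by a fixed per-class weight regardless of how confident the model is.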
Results
Among the configurations tested, the ViT-Large model with focal loss achieved:
- Held-out accuracy: 69.6%
- Weighted F1 score: 68.8%
- Precision on the minority Multi-Year Ice class: 83.9%
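For readers unfamiliar with these metrics, the sketch below shows how accuracy, support-weighted F1, and per-class precision are commonly computed from true and predicted labels. The labels and predictions are toy values chosen for illustration and are unrelated to the paper's results; the function names are hypothetical.

```python
from collections import Counter


def per_class_counts(y_true, y_pred):
    """True positives, false positives and false negatives per class."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1
            fn[truth] += 1
    return tp, fp, fn


def precision(tp, fp, cls):
    denom = tp[cls] + fp[cls]
    return tp[cls] / denom if denom else 0.0


def recall(tp, fn, cls):
    denom = tp[cls] + fn[cls]
    return tp[cls] / denom if denom else 0.0


def f1(tp, fp, fn, cls):
    p, r = precision(tp, fp, cls), recall(tp, fn, cls)
    return 2 * p * r / (p + r) if (p + r) else 0.0


def weighted_f1(y_true, y_pred):
    """Mean of per-class F1, weighted by each class's share of the true labels."""
    tp, fp, fn = per_class_counts(y_true, y_pred)
    support = Counter(y_true)
    return sum(support[c] / len(y_true) * f1(tp, fp, fn, c) for c in support)


# Toy, imbalanced example: 6 first-year patches and 2 multi-year patches.
y_true = ["FYI"] * 6 + ["MYI"] * 2
y_pred = ["FYI"] * 5 + ["MYI"] + ["MYI", "FYI"]

tp, fp, fn = per_class_counts(y_true, y_pred)
accuracy = sum(tp.values()) / len(y_true)
myi_precision = precision(tp, fp, "MYI")
score = weighted_f1(y_true, y_pred)
```

Because weighted F1 is dominated by the majority classes, reporting precision on the minority class separately, as the paper does for Multi-Year Ice, shows how reliable predictions of the rare class are.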
Conclusion
The results show that focal-loss training provides a better precision-recall trade-off than weighted cross-entropy, particularly for rare ice classes. This establishes a cleaner baseline for future multimodal fusion efforts, which may incorporate optical, thermal, or meteorological data.
Implications for Future Research
This research sets the stage for studies that aim to improve sea ice classification through multimodal data integration. With a reliable SAR-only baseline in place, later work can explore how different data types can be fused to improve accuracy and reliability in sea ice monitoring and climate studies.
