Inference-Path Optimization via Circuit Duplication in Frozen Visual Transformers for Marine Species Classification
arXiv:2604.03428v1
Abstract: Automated underwater species classification is constrained by annotation cost and environmental variation that limits the transferability of fully supervised models. Recent work has shown that frozen embeddings from self-supervised vision foundation models already provide a strong label-efficient baseline for marine image classification. Here we investigate whether this frozen-embedding regime can be improved at inference time, without fine-tuning or changing model weights.
Automated underwater species classification faces two persistent obstacles: high annotation costs and environmental variation. Both limit how well fully supervised models transfer across diverse marine environments. Recent work indicates that frozen embeddings from self-supervised vision foundation models already provide a strong, label-efficient baseline for marine image classification.
This article explores whether the frozen-embedding regime can be improved at inference time, without fine-tuning or altering model weights. The approach centers on Circuit Duplication, a technique initially proposed for Large Language Models (LLMs). It traverses a selected range of transformer layers twice during the forward pass, which has been shown to improve performance on natural language processing tasks.
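The idea of traversing a layer range twice can be illustrated with a minimal Python sketch. It assumes the duplicated range is replayed as a block immediately after its first traversal; the function name `forward_with_duplication`, the `start`/`end` parameters, and the toy tracing layers are hypothetical and stand in for real transformer blocks.

```python
from typing import Callable, Sequence


def forward_with_duplication(layers: Sequence[Callable], x, start: int, end: int):
    """Run `layers` in order, traversing layers[start:end] twice.

    Effective order: 0..end-1, then start..end-1 again, then end..N-1.
    The layers themselves are untouched; only the execution path changes.
    """
    if not 0 <= start < end <= len(layers):
        raise ValueError("need 0 <= start < end <= len(layers)")
    for layer in layers[:end]:        # normal pass up to the end of the circuit
        x = layer(x)
    for layer in layers[start:end]:   # second traversal of the chosen circuit
        x = layer(x)
    for layer in layers[end:]:        # remaining layers run once
        x = layer(x)
    return x


def make_tracing_layer(index: int) -> Callable:
    # Toy stand-in for a transformer block: appends its index to a trace list.
    return lambda trace: trace + [index]


toy_layers = [make_tracing_layer(i) for i in range(6)]
execution_order = forward_with_duplication(toy_layers, [], start=2, end=4)
print(execution_order)  # [0, 1, 2, 3, 2, 3, 4, 5]
```

Because the same layer objects are reused, the sketch adds compute at inference time but no parameters, which matches the frozen-weights constraint described above.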
Methodology
The study evaluates Circuit Duplication on the class-imbalanced AQUA20 benchmark, using frozen DINOv3 embeddings under two settings:
- Global Circuit Selection: A single duplicated circuit is chosen for the entire dataset.
- Class-Specific Circuit Selection: Each species may receive a different optimal circuit based on its unique characteristics.
Both settings use simple semi-supervised downstream classifiers to measure the effect of Circuit Duplication.
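The difference between the two selection settings can be sketched in Python on a small score table. This is only an illustration, not the study's procedure: the validation F1 values are made up, the class names other than octopus are placeholders, and the helpers `select_global` and `select_per_class` are hypothetical. Each candidate circuit is keyed by a (start, end) layer range, with `None` standing for the standard forward pass.

```python
from statistics import mean

# Hypothetical validation F1 per candidate circuit and per class (made-up numbers).
val_f1 = {
    None:   {"octopus": 0.50, "species_b": 0.75, "species_c": 0.70},
    (2, 4): {"octopus": 0.70, "species_b": 0.75, "species_c": 0.60},
    (5, 8): {"octopus": 0.55, "species_b": 0.80, "species_c": 0.75},
}


def select_global(scores):
    """Pick the single circuit with the best macro F1 (unweighted class mean)."""
    return max(scores, key=lambda circuit: mean(scores[circuit].values()))


def select_per_class(scores):
    """Pick, independently for each class, the circuit with the best F1."""
    classes = next(iter(scores.values())).keys()
    return {
        cls: max(scores, key=lambda circuit: scores[circuit][cls])
        for cls in classes
    }


global_choice = select_global(val_f1)
per_class_choice = select_per_class(val_f1)
print(global_choice)     # (5, 8)
print(per_class_choice)  # {'octopus': (2, 4), 'species_b': (5, 8), 'species_c': (5, 8)}
```

In this toy table the globally best circuit is not the best one for octopus, which is the kind of class-dependent effect that class-specific selection is meant to capture.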
Results
The results indicate that Circuit Duplication consistently improves on the standard frozen forward pass. Under the maximum label budget, class-specific selection achieved a macro F1 score of 0.875. This narrows the gap to the fully supervised ConvNeXt benchmark, which has a macro F1 score of 0.889, to 1.4 points without any gradient-based training. Notably, four species surpassed their fully supervised references, with octopus classification improving by 12.1 F1 points.
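For readers less familiar with the metric, the following Python sketch shows how macro F1 averages per-class F1 scores with equal weight, which is why it suits a class-imbalanced benchmark, and checks the gap arithmetic from the two reported scores. The per-class counts and the helper names `f1` and `macro_f1` are hypothetical; only 0.875 and 0.889 come from the text.

```python
def f1(tp: int, fp: int, fn: int) -> float:
    """F1 from true positives, false positives, and false negatives."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(counts: dict) -> float:
    """Unweighted mean of per-class F1, so rare classes count as much as common ones."""
    return sum(f1(*c) for c in counts.values()) / len(counts)


# Made-up (tp, fp, fn) counts for one common and one rare class.
toy_counts = {"common": (90, 10, 10), "rare": (1, 1, 1)}
print(round(macro_f1(toy_counts), 3))  # 0.7

# Gap between the reported macro F1 scores, expressed in F1 points.
gap_points = round((0.889 - 0.875) * 100, 1)
print(gap_points)  # 1.4
```

The common class alone scores 0.9 in this toy example, but the weak rare class pulls the macro average down to 0.7, so gains on rare species such as octopus have a visible effect on the headline number.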
Conclusion
Across all label budgets, approximately 75% of classes preferred a class-specific circuit, indicating that the benefit genuinely depends on the class. This research represents a pioneering application of Circuit Duplication in computer vision, specifically in marine species classification. The findings could lead to more efficient and cost-effective methods for automated species recognition, improving the ability to monitor and protect marine biodiversity.
