Optimizing Tile Selection in Frozen WSI-MIL with FOCI

Are Compact Rationales Free? Measuring Tile Selection Headroom in Frozen WSI-MIL

Multiple instance learning (MIL) classifiers for whole-slide image (WSI) analysis reach promising slide-level area under the curve (AUC) scores, but obtaining interpretable outputs from them remains a challenge. Attention scores are commonly used as post-hoc explanations, yet they can reflect the model’s aggregation preferences rather than provide a compact rationale for the classification result.

This article examines post-hoc rationale highlighting in frozen WSI-MIL classifiers. The central question is whether a slide-level prediction can be reproduced from a compact subset of tiles without retraining the model. The question is operationalized through Finding Optimal Contextual Instances (FOCI), a lightweight rationale-readout layer applied to a frozen MIL backbone.

Key Concepts and Methodology

FOCI is trained with model-output sufficiency and tile-exclusion objectives, which are defined over the subsets of tiles that are kept and dropped. The method is evaluated with an insertion-style Sequential Reveal Protocol (SRP) tailored for WSI-MIL, and the Selection Headroom Index (SHI) is introduced to quantify the efficacy of the selected rationales.

Model-Output Sufficiency: Ensures that the selected tiles are sufficient to reproduce the model’s output.
Exclusion Objectives: Focus on the impact that excluding certain tiles has on the model’s predictions.
Sequential Reveal Protocol (SRP): A structured method to evaluate the contribution of tiles incrementally.
Selection Headroom Index (SHI): A metric to assess the quality of selected rationales in WSI-MIL.
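
The sufficiency check, the exclusion check, and the insertion-style reveal described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not FOCI's implementation: `frozen_model` is a hypothetical toy classifier (softmax-weighted pooling of per-tile logits), the tolerance `tol` and the criterion used for the minimum sufficient tile count are assumed, and all function names are invented for this example. The SHI is not computed here because the article does not give its formula.

```python
import math


def frozen_model(tile_logits):
    """Toy stand-in for a frozen MIL classifier (hypothetical, for illustration).

    Softmax-weighted pooling of per-tile logits, followed by a sigmoid.
    """
    if not tile_logits:
        return 0.5  # empty bag: uninformative prediction
    m = max(tile_logits)
    weights = [math.exp(z - m) for z in tile_logits]
    pooled = sum(w * z for w, z in zip(weights, tile_logits)) / sum(weights)
    return 1.0 / (1.0 + math.exp(-pooled))


def sufficiency_gap(tile_logits, keep_idx):
    """Absolute gap between the kept-tiles prediction and the full-slide prediction."""
    kept = [tile_logits[i] for i in keep_idx]
    return abs(frozen_model(kept) - frozen_model(tile_logits))


def exclusion_drop(tile_logits, keep_idx):
    """Drop in prediction when the kept tiles are removed from the bag."""
    keep = set(keep_idx)
    rest = [z for i, z in enumerate(tile_logits) if i not in keep]
    return frozen_model(tile_logits) - frozen_model(rest)


def sequential_reveal(tile_logits, ranking):
    """Insertion-style curve: prediction after revealing the top-k ranked tiles."""
    return [
        frozen_model([tile_logits[i] for i in ranking[:k]])
        for k in range(1, len(ranking) + 1)
    ]


def minimum_sufficient_k(curve, full_pred, tol=0.1):
    """Smallest k whose prediction lies within `tol` of the full-slide prediction.

    The tolerance-based criterion is an assumption made for this sketch.
    """
    for k, pred in enumerate(curve, start=1):
        if abs(pred - full_pred) <= tol:
            return k
    return len(curve)


# Hypothetical per-tile logits for one slide.
tile_logits = [0.1, 3.0, -0.5, 2.5, 0.0, -1.0]
full_pred = frozen_model(tile_logits)

# Two rankings of the same tiles: most informative first, and the reverse.
good_ranking = sorted(range(len(tile_logits)), key=lambda i: -tile_logits[i])
poor_ranking = good_ranking[::-1]

msk_good = minimum_sufficient_k(sequential_reveal(tile_logits, good_ranking), full_pred)
msk_poor = minimum_sufficient_k(sequential_reveal(tile_logits, poor_ranking), full_pred)
print("MSK with good ranking:", msk_good)
print("MSK with poor ranking:", msk_poor)
print("sufficiency gap (top 2):", sufficiency_gap(tile_logits, good_ranking[:2]))
print("exclusion drop (top 2):", exclusion_drop(tile_logits, good_ranking[:2]))
```

In this toy setting, a ranking that surfaces the informative tiles first reaches the full-slide prediction with far fewer tiles than the reversed ranking. That difference between rankings is the kind of gap a selection-headroom measure is meant to capture.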

Findings and Implications

Across three WSI benchmarks and seven MIL backbones, the results show that whether compact rationales can be obtained depends strongly on the available selection headroom. The model architectures differ notably:

Transformer and Multi-Branch Attention Aggregators: These models are capable of accommodating compact rationales effectively.
Near-Minimal Attention-Pooling Baselines: These models tend to reach a saturation point, limiting the effectiveness of compact rationales.
Hard-Selection Backbones: These architectures often present conflicts when paired with external readout mechanisms.

For instance, on TransMIL, the Minimum Sufficient K (MSK) tile count falls by 32-56% across benchmarks relative to the model’s current CLS-proxy ranking. The combination of ACMIL with FOCI achieves the highest mean SHI, at +0.465.
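
To make the arithmetic behind a relative MSK reduction concrete, the short Python sketch below computes the percentage change between a baseline tile count and a reduced one. The function name and the tile counts are hypothetical, chosen only so that the results land on the two ends of the reported 32-56% range. They are not values from the study.

```python
def relative_reduction_pct(baseline_k, new_k):
    """Percentage reduction in tile count relative to the baseline ranking."""
    if baseline_k <= 0:
        raise ValueError("baseline_k must be positive")
    return 100.0 * (baseline_k - new_k) / baseline_k


# Hypothetical MSK counts, used only to illustrate the calculation.
low_end = relative_reduction_pct(50, 34)
high_end = relative_reduction_pct(50, 22)
print(low_end)   # 32.0
print(high_end)  # 56.0
```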

Conclusion

These results position FOCI as a model-level interpretability and auditing tool. The selected tiles are not claims of diagnostic sufficiency at the clinical or pathologist level. They are candidate rationales that give a compact, reviewable view of where a frozen MIL classifier localizes its prediction. The work improves the interpretability of WSI-MIL models and opens avenues for further research on tile selection in complex image analysis tasks.

RichlyAI Blog AI Guide, Tutorials, Industrial Insights, & more!