Learning Stable Predictors from Weak Supervision under Distribution Shift
arXiv:2604.05002v1 (announce type: cross)
Abstract
Learning from weak or proxy supervision is common when ground-truth labels are unavailable, yet robustness under distribution shift remains poorly understood, especially when the supervision mechanism itself changes. We formalize this as supervision drift, defined as changes in P(y | x, c) across contexts, and study it in CRISPR-Cas13d experiments where guide efficacy is inferred indirectly from RNA-seq responses.
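One way to write the definition down is to contrast it with ordinary covariate shift. The symbols c and c' for two contexts (for example, two cell lines or two time points) are notation assumed here for illustration, not taken from the paper.

```latex
% Supervision drift between two contexts c and c' (notation assumed for illustration):
% the mapping from inputs to weak labels itself differs.
\exists\, x : \quad P(y \mid x, c) \neq P(y \mid x, c')

% Covariate shift, for contrast: the inputs move, the labeling mechanism does not.
P(x \mid c) \neq P(x \mid c'), \qquad P(y \mid x, c) = P(y \mid x, c')
```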
Research Overview
We use data from two human cell lines and multiple time points to build a controlled non-IID benchmark with explicit domain and temporal shifts, while keeping the weak-label construction fixed. Holding the label construction fixed lets us isolate the effect of supervision drift on model performance.
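The benchmark design can be sketched as pairing a source context with target contexts that differ along exactly one axis. This is a minimal illustration only: the record layout, the identifiers (`line_A`, `t1`, and so on), and the helper names are hypothetical placeholders, not the paper's data or code.

```python
from collections import defaultdict

# Hypothetical records: (guide_id, cell_line, time_point). All names are placeholders.
records = [
    ("g1", "line_A", "t1"), ("g2", "line_A", "t1"),
    ("g1", "line_B", "t1"), ("g2", "line_B", "t1"),
    ("g1", "line_A", "t2"), ("g2", "line_A", "t2"),
]

def group_by_context(records):
    """Group guide ids by their (cell_line, time_point) context."""
    groups = defaultdict(list)
    for guide, cell_line, time_point in records:
        groups[(cell_line, time_point)].append(guide)
    return dict(groups)

def transfer_splits(groups, source):
    """List target contexts that differ from the source in exactly one axis."""
    src_line, src_time = source
    splits = {"cross_cell_line": [], "temporal": []}
    for line, time in groups:
        if line != src_line and time == src_time:
            splits["cross_cell_line"].append((line, time))
        elif line == src_line and time != src_time:
            splits["temporal"].append((line, time))
    return splits

groups = group_by_context(records)
splits = transfer_splits(groups, source=("line_A", "t1"))
print(splits)
```

In-domain evaluation would additionally hold out part of the source context itself; that step is omitted here for brevity.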
Key Findings
- Strong In-Domain Performance: In-domain, ridge regression reached an R2 of 0.356 and a Spearman correlation (rho) of 0.442.
- Partial Cross-Cell-Line Transfer: Models transferred partially across cell lines, with a correlation (rho) of approximately 0.40.
- Challenges in Temporal Transfer: Temporal transfer failed for all models, with negative R2 values and near-zero correlations. For instance, XGBoost yielded an R2 of -0.155 and a rho of 0.056.
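The two metrics reported above answer different questions, which a small sketch can make concrete. R2 compares squared error against a predict-the-mean baseline, so it goes negative whenever the model does worse than that baseline; Spearman rho only looks at rank order. The numbers below are toy values chosen for illustration, not data from the paper, and the helper functions are a plain-Python sketch rather than the authors' evaluation code.

```python
from statistics import mean

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

def spearman_rho(a, b):
    """Spearman rho: Pearson correlation computed on ranks."""
    return pearson(average_ranks(a), average_ranks(b))

# Toy values, not from the paper.
y_true = [0.1, 0.4, 0.2, 0.9, 0.6]
y_offset = [1.1, 1.4, 1.2, 1.9, 1.6]   # correct ordering, wrong scale
y_baseline = [mean(y_true)] * len(y_true)  # always predict the mean

r2_offset = r2_score(y_true, y_offset)
rho_offset = spearman_rho(y_true, y_offset)
r2_baseline = r2_score(y_true, y_baseline)
print(f"offset predictions: R2={r2_offset:.2f}, rho={rho_offset:.2f}")
print(f"mean baseline:      R2={r2_baseline:.2f}")
```

The offset predictions keep a perfect ranking while R2 is negative, so a negative R2 on its own could still leave a useful ranking. The temporal-transfer result is stronger than that: both R2 and rho collapse, meaning neither the scale nor the ordering carries over. In practice, `sklearn.metrics.r2_score` and `scipy.stats.spearmanr` compute the same quantities.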
Additional Analyses
Further analyses were consistent with this pattern. Feature-label relationships remained stable across cell lines but changed substantially over time. This indicates that the temporal failures are primarily attributable to supervision drift rather than to limitations of the models themselves.
Implications of the Research
These findings point to feature stability as a diagnostic for detecting non-transferability before deployment. By checking whether feature-label relationships shift between contexts, practitioners can judge whether a model is likely to transfer.
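A feature-stability check of this kind could be sketched as follows. The paper's exact procedure is not given in this summary, so the specific choice here is an assumption: compute each feature's correlation with the weak label within a context, then correlate those per-feature profiles between contexts. The simulated data, weights, and function names are all hypothetical.

```python
import random
from statistics import mean

def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

def feature_label_profile(rows, labels):
    """Correlation of each feature column with the weak label in one context."""
    n_features = len(rows[0])
    return [pearson([row[j] for row in rows], labels) for j in range(n_features)]

def profile_stability(profile_a, profile_b):
    """Agreement between two profiles: near 1 is stable, near 0 or below suggests drift."""
    return pearson(profile_a, profile_b)

def simulate_context(rng, weights, n=300, noise=0.3):
    """Synthetic context: labels are a noisy linear function of the features."""
    rows = [[rng.gauss(0.0, 1.0) for _ in weights] for _ in range(n)]
    labels = [sum(w * x for w, x in zip(weights, row)) + rng.gauss(0.0, noise)
              for row in rows]
    return rows, labels

rng = random.Random(0)
w_reference = [1.0, -0.5, 0.3, 0.0]   # hypothetical labeling mechanism
w_drifted = [-0.3, 0.8, -1.0, 0.5]    # hypothetical changed mechanism

reference = feature_label_profile(*simulate_context(rng, w_reference))
same_mechanism = feature_label_profile(*simulate_context(rng, w_reference))
changed_mechanism = feature_label_profile(*simulate_context(rng, w_drifted))

stable_score = profile_stability(reference, same_mechanism)
drifted_score = profile_stability(reference, changed_mechanism)
print(f"same mechanism:    {stable_score:.2f}")
print(f"changed mechanism: {drifted_score:.2f}")
```

The check needs weak labels from the target context but no trained model, so it can run before any transfer experiment. A low score is a warning that the labeling mechanism may differ, not proof of it.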
Conclusion
We formalize supervision drift and show its impact on model performance when learning from weak supervision under distribution shift. Models that transfer partially across cell lines fail across time points, where the feature-label relationships themselves change.
