Mamba-SSM with LLM Reasoning for Feature Selection: Faithfulness-Aware Biomarker Discovery
In a study published on arXiv, researchers describe a feature-selection pipeline for biomarker discovery that pairs a Mamba state space model (SSM) with Large Language Model (LLM) reasoning. The goal is to identify candidate biomarkers while removing tissue-composition confounders, which can degrade the performance of downstream classifiers.
The study starts from a known limitation of gradient saliency in deep sequence models: the resulting gene lists are often contaminated by confounders. These genes can mislead a classifier and make the identified biomarkers less reliable. The researchers asked two questions: can LLM chain-of-thought (CoT) reasoning filter out these confounders, and does the quality of that reasoning track downstream performance?
Methodology
The researchers trained a Mamba SSM on RNA sequencing data from the breast cancer (BRCA) cohort of The Cancer Genome Atlas (TCGA). They ranked genes by gradient saliency, kept the top 50 as candidates, and then used DeepSeek-R1 to evaluate each candidate gene with a structured CoT prompt. This filtering step reduced the candidates to a final set of 17 genes.
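The saliency-ranking step can be illustrated with a small sketch. The source does not say how saliency was computed or aggregated, so everything here is an assumption: central finite differences stand in for the autograd gradients a real pipeline would take from the trained Mamba model, saliency is averaged as an absolute value over samples, and the toy linear model and gene names are hypothetical placeholders.

```python
from typing import Callable, List, Sequence


def finite_difference_saliency(
    model: Callable[[Sequence[float]], float],
    samples: List[List[float]],
    eps: float = 1e-4,
) -> List[float]:
    """Mean absolute sensitivity of the model output to each input feature."""
    n_features = len(samples[0])
    totals = [0.0] * n_features
    for x in samples:
        for j in range(n_features):
            up, down = list(x), list(x)
            up[j] += eps
            down[j] -= eps
            # Central difference approximates d(model)/d(x_j) at this sample.
            grad_j = (model(up) - model(down)) / (2 * eps)
            totals[j] += abs(grad_j)
    return [t / len(samples) for t in totals]


def top_k_features(saliency: Sequence[float], names: Sequence[str], k: int) -> List[str]:
    """Return the k feature names with the largest saliency, highest first."""
    order = sorted(range(len(saliency)), key=lambda j: saliency[j], reverse=True)
    return [names[j] for j in order[:k]]


# Hypothetical stand-in for a trained model: a fixed linear score over four genes.
TOY_WEIGHTS = [0.1, -2.0, 0.5, 1.5]


def toy_model(x: Sequence[float]) -> float:
    return sum(w * v for w, v in zip(TOY_WEIGHTS, x))


toy_names = ["gene_a", "gene_b", "gene_c", "gene_d"]  # placeholder names
toy_samples = [[1.0, 0.2, 0.3, 0.4], [0.5, 0.1, 0.9, 0.7]]
toy_saliency = finite_difference_saliency(toy_model, toy_samples)
top2 = top_k_features(toy_saliency, toy_names, k=2)
```

In the study's setting, k would be 50 and the ranked features would be genes from the expression matrix. A ranking like this only measures how sensitive the model is to a feature, which is why confounded genes can rank highly.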
Results
All results are reported on the held-out test split. The raw set of 50 genes selected by gradient saliency alone, without LLM filtering, performed worse than a 5,000-gene baseline: an Area Under the Curve (AUC) of 0.832 versus 0.903. The 17-gene LLM-filtered set outperformed both, reaching an AUC of 0.927 while using about 294 times fewer features than the baseline (5,000 / 17 ≈ 294). The improvement over the unfiltered set suggests that LLM reasoning removed features that were hurting the classifier.
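For readers who want to check these numbers, the sketch below computes AUC from its rank-based definition and reproduces the feature-reduction ratio. It is an illustration only: the labels and scores are made-up toy values, not the study's data, and the source does not say which classifier or AUC implementation was used.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy labels and scores, purely illustrative.
toy_labels = [1, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.4, 0.35, 0.8, 0.7, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)

# Feature reduction reported in the text: 5,000 baseline genes vs. 17 selected.
reduction = 5000 / 17
```

The pairwise loop is quadratic in the number of samples, which is fine for a sanity check; a library routine would be the usual choice on a full test split.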
Faithfulness Audit
To check whether the LLM's selections reflect established biology, the researchers ran a faithfulness audit against COSMIC CGC, OncoKB, and PAM50. Of the 17 selected genes, 6 (35.3%) were validated BRCA biomarkers. The audit also showed that the selection missed 10 of the 16 known BRCA genes present in the input, including the well-established gene FOXA1.
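The audit figures can be read as precision and recall over sets of gene names. The sketch below uses placeholder identifiers rather than real gene symbols, since the source does not list the selected genes. It also assumes that the 6 validated genes are the same 6 known genes that were not missed, which gives a recall of 6/16 = 37.5%; that recall figure is derived here and is not stated in the source.

```python
from typing import Dict, Iterable, List, Union


def audit(selected: Iterable[str], known_in_input: Iterable[str]) -> Dict[str, Union[float, List[str]]]:
    """Compare a selected gene set against known biomarkers present in the input."""
    selected_set, known_set = set(selected), set(known_in_input)
    if not selected_set or not known_set:
        raise ValueError("both gene sets must be non-empty")
    hits = selected_set & known_set      # selected genes that are known biomarkers
    missed = known_set - selected_set    # known biomarkers that were not selected
    return {
        "precision": len(hits) / len(selected_set),
        "recall": len(hits) / len(known_set),
        "hits": sorted(hits),
        "missed": sorted(missed),
    }


# Placeholder identifiers, sized to match the counts in the text:
# 16 known genes in the input, 17 selected, 6 in common.
known = [f"known_{i}" for i in range(16)]
selected = known[:6] + [f"other_{i}" for i in range(11)]
report = audit(selected, known)
```

Under these assumptions, `report["precision"]` is 6/17 (about 35.3%) and `report["recall"]` is 6/16 (37.5%), with 10 entries in `report["missed"]`.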
Conclusion
The results show that downstream performance and reasoning faithfulness diverge, a pattern the study describes as selective faithfulness. Targeted removal of confounders through LLM reasoning appears to improve predictive performance even though the selected set recovers only a minority of the known biomarkers. The study presents LLM reasoning as a useful filter for saliency-based feature selection in genomics, with the caveat that strong classifier performance does not by itself indicate that the reasoning is faithful to established biology.
