When Do We Need LLMs? A Diagnostic for Language-Driven Bandits
arXiv:2604.05859v1
Abstract: We study Contextual Multi-Armed Bandits (CMABs) for non-episodic sequential decision making problems where the context includes both textual and numerical information (e.g., recommendation systems, dynamic portfolio adjustments, offer selection; all frequent problems in finance). While Large Language Models (LLMs) are increasingly applied to these settings, utilizing LLMs for reasoning at every decision step is computationally expensive and uncertainty estimates are difficult to obtain.
Introduction
Large Language Models (LLMs) are increasingly used in decision-making processes that involve complex contextual information. In finance, for instance, professionals rely on both textual and numerical data to guide their choices, so decision-support systems must handle both.
Challenges with LLMs
Despite their impressive capabilities, LLMs come with significant challenges when deployed for real-time decision making:
- Computational Expense: Running LLM reasoning at every decision point is resource-intensive, which makes it impractical for many applications.
- Uncertainty Estimates: Reliable uncertainty estimates are difficult to obtain from LLMs, yet a bandit needs them to decide when to explore.
Introducing LLMP-UCB
To address these challenges, we introduce LLMP-UCB, a novel bandit algorithm that leverages uncertainty estimates derived from LLMs through repeated inference. This approach aims to balance the need for thorough reasoning with the constraints of computational resources.
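The idea of deriving uncertainty from repeated inference can be illustrated with a minimal sketch. It is not the LLMP-UCB algorithm itself, whose details are not given here. It assumes each LLM call returns a noisy scalar reward estimate, and it uses the sample mean plus a multiple of the sample standard deviation as the upper confidence bound. The names `ucb_from_repeated_inference`, `fake_llm_score`, `n_samples`, and `beta`, as well as the arm labels, are hypothetical.

```python
import random
import statistics
from typing import Callable, Sequence


def ucb_from_repeated_inference(
    score_fn: Callable[[str, str], float],
    context: str,
    arms: Sequence[str],
    n_samples: int = 5,
    beta: float = 1.0,
) -> str:
    """Return the arm with the highest mean + beta * std over repeated scores."""
    best_arm, best_ucb = arms[0], float("-inf")
    for arm in arms:
        # Each call stands in for one stochastic LLM inference on (context, arm).
        samples = [score_fn(context, arm) for _ in range(n_samples)]
        mean = statistics.fmean(samples)
        # Spread across repeated calls is used as the uncertainty estimate.
        spread = statistics.stdev(samples) if n_samples > 1 else 0.0
        ucb = mean + beta * spread
        if ucb > best_ucb:
            best_arm, best_ucb = arm, ucb
    return best_arm


def fake_llm_score(context: str, arm: str) -> float:
    # Hypothetical stand-in for an LLM call: a noisy reward estimate in [0, 1].
    base = {"offer_a": 0.2, "offer_b": 0.8}[arm]
    return min(1.0, max(0.0, random.gauss(base, 0.05)))


random.seed(0)
choice = ucb_from_repeated_inference(
    fake_llm_score, "client prefers low fees", ["offer_a", "offer_b"]
)
```

The sketch makes the cost visible: every decision requires `n_samples` calls per arm, which is why repeated inference is expensive.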
Findings
Our experiments compared LLMP-UCB with lightweight numerical bandits:
- Accuracy: Lightweight numerical bandits that operate on text embeddings, whether dense or Matryoshka, matched or exceeded the accuracy of LLM-based solutions.
- Cost Efficiency: The lightweight approaches operate at a fraction of the computational cost associated with LLMs.
- Dimensionality as a Lever: The dimensionality of embeddings serves as a practical lever to adjust the exploration-exploitation balance, allowing for cost-performance trade-offs without increasing prompt complexity.
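To make the lightweight alternative concrete, the following sketch runs a numerical bandit over a truncated embedding. Disjoint LinUCB is swapped in as one familiar numerical bandit; the text above does not say which bandit the experiments used. Truncation keeps the leading coordinates, as Matryoshka-style embeddings are designed to allow, so `dim` plays the role of the dimensionality lever. The helper names, the 4-dimensional vector, and the parameters `alpha` and `ridge` are all illustrative assumptions.

```python
import math
from typing import List


def truncate_embedding(vec: List[float], dim: int) -> List[float]:
    """Keep the first `dim` coordinates and rescale to unit length."""
    head = vec[:dim]
    norm = math.sqrt(sum(v * v for v in head)) or 1.0
    return [v / norm for v in head]


class LinUCBArm:
    """One arm of disjoint LinUCB, storing the inverse of A and the vector b."""

    def __init__(self, dim: int, ridge: float = 1.0) -> None:
        # A starts as ridge * I, so its inverse is I / ridge.
        self.a_inv = [
            [1.0 / ridge if i == j else 0.0 for j in range(dim)]
            for i in range(dim)
        ]
        self.b = [0.0] * dim

    def _a_inv_times(self, x: List[float]) -> List[float]:
        return [sum(r * v for r, v in zip(row, x)) for row in self.a_inv]

    def ucb(self, x: List[float], alpha: float) -> float:
        theta = self._a_inv_times(self.b)  # ridge-regression estimate
        mean = sum(t * v for t, v in zip(theta, x))
        quad = sum(v * w for v, w in zip(x, self._a_inv_times(x)))
        return mean + alpha * math.sqrt(max(0.0, quad))

    def update(self, x: List[float], reward: float) -> None:
        # Sherman-Morrison rank-one update of the inverse of (A + x x^T).
        a_inv_x = self._a_inv_times(x)
        denom = 1.0 + sum(v * w for v, w in zip(x, a_inv_x))
        for i in range(len(x)):
            for j in range(len(x)):
                self.a_inv[i][j] -= a_inv_x[i] * a_inv_x[j] / denom
        self.b = [bi + reward * v for bi, v in zip(self.b, x)]


def select_arm(arms: List[LinUCBArm], x: List[float], alpha: float = 1.0) -> int:
    """Return the index of the arm with the largest upper confidence bound."""
    scores = [arm.ucb(x, alpha) for arm in arms]
    return max(range(len(arms)), key=scores.__getitem__)


full = [0.6, 0.8, 0.0, 0.0]          # hypothetical 4-d text embedding
x = truncate_embedding(full, dim=2)  # smaller dim means cheaper updates
arms = [LinUCBArm(dim=2), LinUCBArm(dim=2)]
arms[1].update(x, reward=1.0)        # arm 1 observed a reward of 1
chosen = select_arm(arms, x)
```

Each update here costs on the order of `dim` squared operations, so lowering `dim` reduces both compute and the number of parameters the bandit has to explore, with no change to any prompt.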
Guidance for Practitioners
Based on our findings, we propose a geometric diagnostic tool that enables practitioners to determine when to employ LLM-driven reasoning versus a lightweight numerical bandit approach. This tool serves as a framework for deploying cost-effective and uncertainty-aware decision systems.
Conclusion
Our research provides a principled deployment framework applicable across AI use cases in financial services. It helps organizations decide when LLM-driven reasoning is worth its cost and when a lightweight numerical bandit is sufficient.
