LatentRefusal: Latent-Signal Refusal for Unanswerable Text-to-SQL Queries
arXiv:2601.10398v3
Abstract
In LLM-based text-to-SQL systems, unanswerable and underspecified user queries can lead the model to produce not only incorrect text but also executable programs that yield misleading results or violate safety constraints, which poses a major barrier to safe deployment. Existing refusal strategies for such queries either rely on output-level instruction following, which is brittle due to model hallucinations, or estimate output uncertainty, which adds complexity and overhead.
Introduction
As large language models (LLMs) are used more widely in text-to-SQL systems, handling unanswerable queries becomes increasingly critical. When users submit vague or incomplete queries, the system may still emit SQL that executes but returns misleading results, which creates operational risk. Mitigating this risk requires a robust refusal mechanism.
LatentRefusal Mechanism
To address this challenge, we formalize safe refusal in text-to-SQL systems as an answerability-gating problem. Our proposed solution, LatentRefusal, introduces a latent-signal refusal mechanism that predicts the answerability of queries based on intermediate hidden activations from a large language model.
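The gating idea can be illustrated with a minimal sketch. It assumes the probe returns a score in [0, 1] that is read as the estimated probability that the question is answerable, and that a fixed threshold of 0.5 decides whether SQL generation runs at all. The names `gated_text_to_sql`, `GateResult`, `probe`, `generate_sql`, and the refusal message are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence

# Hypothetical names throughout: none of these identifiers come from the paper.
REFUSAL_MESSAGE = "This question cannot be answered from the given schema."


@dataclass
class GateResult:
    answerable: bool        # True if the gate let the query through
    score: float            # probe score (assumed: higher means more answerable)
    sql: Optional[str]      # generated SQL, or None when refused
    message: Optional[str]  # refusal message, or None when answered


def gated_text_to_sql(
    question: str,
    hidden_states: Sequence[Sequence[float]],
    probe: Callable[[Sequence[Sequence[float]]], float],
    generate_sql: Callable[[str], str],
    threshold: float = 0.5,  # assumed operating point, not taken from the paper
) -> GateResult:
    """Score answerability first; call the SQL generator only if the gate opens."""
    score = probe(hidden_states)
    if score < threshold:
        # Refuse without ever producing an executable program.
        return GateResult(False, score, None, REFUSAL_MESSAGE)
    return GateResult(True, score, generate_sql(question), None)
```

Because the decision is made before any SQL is produced, a refused query never reaches the database. In practice the threshold would be tuned on held-out data rather than fixed at 0.5.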
Tri-Residual Gated Encoder
At the core of LatentRefusal lies the Tri-Residual Gated Encoder, a lightweight probing architecture designed to enhance the model’s ability to distinguish between answerable and unanswerable queries. This architecture effectively suppresses schema noise while amplifying sparse, localized cues that indicate a mismatch between the user’s question and the database schema.
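This summary does not describe the internals of the Tri-Residual Gated Encoder, so the following is not that architecture. It is a generic gated-pooling sketch, under the assumption that a learned per-token gate can down-weight the many schema tokens and up-weight a few cue tokens before pooling, with a mean-pooled residual path added back. All function names and the toy weights are hypothetical.

```python
import math
from typing import List, Sequence


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def mean_pool(tokens: Sequence[Sequence[float]]) -> List[float]:
    """Plain average over token activations (every token weighted equally)."""
    dim = len(tokens[0])
    return [sum(t[d] for t in tokens) / len(tokens) for d in range(dim)]


def gated_pool(tokens: Sequence[Sequence[float]],
               gate_weights: Sequence[float],
               gate_bias: float = 0.0) -> List[float]:
    """Gate-weighted average: tokens with a low gate contribute little."""
    gates = [sigmoid(dot(t, gate_weights) + gate_bias) for t in tokens]
    total = sum(gates) + 1e-12  # guard against all gates underflowing to zero
    dim = len(tokens[0])
    return [sum(g * t[d] for g, t in zip(gates, tokens)) / total
            for d in range(dim)]


def gated_residual_pool(tokens: Sequence[Sequence[float]],
                        gate_weights: Sequence[float],
                        gate_bias: float = 0.0) -> List[float]:
    """Gated summary plus a mean-pooled residual path (illustrative only)."""
    gated = gated_pool(tokens, gate_weights, gate_bias)
    residual = mean_pool(tokens)
    return [g + r for g, r in zip(gated, residual)]


# Toy input: nine "schema" tokens and one sparse "cue" token.
toy_tokens = [[0.0, 1.0]] * 9 + [[5.0, 0.0]]
toy_gate = [2.0, -2.0]  # hand-picked so the cue token receives a high gate
```

With the toy input, the single cue token dominates the first dimension of `gated_pool(toy_tokens, toy_gate)`, whereas `mean_pool(toy_tokens)` dilutes it across the nine schema tokens. In a real probe the gate weights would be learned rather than hand-picked.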
Methodology
The methodology behind LatentRefusal involves several key steps:
- Intermediate Activation Analysis: A probe reads the LLM's intermediate hidden activations and identifies patterns that signal unanswerability.
- Noise Suppression: The Tri-Residual Gated Encoder minimizes irrelevant schema noise, improving the accuracy of predictions.
- Localized Cue Amplification: The mechanism focuses on specific cues that are indicative of a mismatch, thereby enhancing the model’s precision.
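As a rough illustration of the first step, a probe can be trained on pooled activations labeled as answerable or unanswerable. The sketch below swaps in a plain logistic-regression probe trained with stochastic gradient descent, which is much simpler than the encoder described above. The two-dimensional features stand in for pooled hidden activations and are made up for illustration, and `train_linear_probe` and `probe_score` are hypothetical names.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def probe_score(w: Sequence[float], b: float, x: Sequence[float]) -> float:
    """Estimated probability that the query behind features x is answerable."""
    return sigmoid(dot(w, x) + b)


def train_linear_probe(features: Sequence[Sequence[float]],
                       labels: Sequence[int],
                       epochs: int = 200,
                       lr: float = 0.5) -> Tuple[List[float], float]:
    """Fit logistic regression with per-example gradient steps on log loss."""
    w = [0.0] * len(features[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            err = probe_score(w, b, x) - y  # gradient of log loss w.r.t. the logit
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b


# Made-up pooled features: label 1 = answerable, label 0 = unanswerable.
toy_features = [[0.1, 0.9], [0.2, 0.8], [0.9, 0.1], [0.8, 0.2]]
toy_labels = [1, 1, 0, 0]
weights, bias = train_linear_probe(toy_features, toy_labels)
```

The backbone LLM is not updated here: only the small probe is trained, which is what makes this style of approach lightweight.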
Empirical Evaluations
We conducted extensive empirical evaluations across a variety of ambiguous and unanswerable settings. Our results, supported by ablation studies and interpretability analyses, demonstrate the effectiveness of LatentRefusal. Key findings include:
- LatentRefusal significantly enhances the model’s refusal capabilities, reducing the likelihood of generating erroneous SQL queries.
- Across four benchmarks, LatentRefusal raises the average F1 score to 88.5 percent on both backbone models.
- The probe adds approximately 2 milliseconds of overhead, making it a practical addition to existing systems.
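For readers who want to reproduce this kind of measurement on their own system, the sketch below shows how an F1 score and a per-call overhead could be computed. It assumes binary labels with 1 as the positive class; this summary does not say which class the reported F1 treats as positive. The helper names and the toy labels are hypothetical, and none of the numbers are results from the paper.

```python
import time
from typing import Callable, Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 = 2*TP / (2*TP + FP + FN), with label 1 as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 0.0 if denom == 0 else 2 * tp / denom


def mean_overhead_ms(fn: Callable[[], object], repeats: int = 100) -> float:
    """Average wall-clock time of fn() in milliseconds over several calls."""
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    return (time.perf_counter() - start) * 1000.0 / repeats


# Toy labels for illustration only: 2 true positives, 1 false positive,
# 1 false negative, 1 true negative.
toy_true = [1, 1, 1, 0, 0]
toy_pred = [1, 1, 0, 1, 0]
toy_f1 = f1_score(toy_true, toy_pred)  # 2*2 / (2*2 + 1 + 1) = 2/3
```

An average over benchmarks, as in the reported figure, would be the mean of per-benchmark scores computed this way. `mean_overhead_ms` would be called with a zero-argument function that wraps a single probe forward pass.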
Conclusion
LatentRefusal formalizes refusal as an answerability-gating problem and implements it with a lightweight probing architecture, providing an efficient safety layer for LLM-based text-to-SQL systems. The approach mitigates the risks posed by unanswerable queries at a cost of roughly 2 milliseconds of probe overhead, and supports more robust interactions between users and databases.
