Reasoning about Intent for Ambiguous Requests
arXiv:2511.10453v3
Abstract
Large language models often respond to ambiguous requests by implicitly committing to one interpretation, which frustrates users and creates safety risks when that interpretation is wrong. In this article, we propose a method for generating a structured response that enumerates the different ways an ambiguous request can be interpreted, each paired with a corresponding answer. Our models are trained using reinforcement learning with a dual reward objective: maximize recall of valid interpretations on ambiguous inputs, and maximize precision on unambiguous inputs so that spurious alternatives are suppressed.
Introduction
Ambiguity in user requests is a common challenge in natural language processing, particularly for large language models (LLMs). When faced with an ambiguous query, LLMs frequently default to a single interpretation, which can frustrate users and create safety risks when that interpretation is not the intended one. This article presents an approach that addresses these concerns with a structured response: the response makes the ambiguity explicit and lists multiple interpretations, each with its corresponding answer.
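To make the idea of a structured response concrete, the following is a minimal sketch of one possible representation. The class names, field names, and rendering layout are assumptions made for illustration; the source does not specify the actual output format.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Interpretation:
    """One reading of the request, paired with the answer under that reading."""
    reading: str
    answer: str


@dataclass
class StructuredResponse:
    """Hypothetical container: lists every interpretation instead of committing to one."""
    request: str
    interpretations: List[Interpretation]

    def is_ambiguous(self) -> bool:
        # More than one listed reading signals that ambiguity was detected.
        return len(self.interpretations) > 1

    def render(self) -> str:
        # Plain-text rendering; the layout is illustrative only.
        lines = [f"Request: {self.request}"]
        for i, interp in enumerate(self.interpretations, start=1):
            lines.append(f"{i}. If you mean {interp.reading}: {interp.answer}")
        return "\n".join(lines)


response = StructuredResponse(
    request="When was the first Harry Potter released?",
    interpretations=[
        Interpretation(reading="the novel", answer="1997"),
        Interpretation(reading="the film adaptation", answer="2001"),
    ],
)
print(response.render())
```

An unambiguous request would carry a single interpretation under this representation, so the same structure covers both cases.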
Methodology
Our method trains models with reinforcement learning under a dual reward objective. Its two components are:
- Coverage Maximization: The model aims to recall as many valid interpretations as possible when faced with ambiguous inputs.
- Precision Enhancement: On unambiguous inputs, the model focuses on suppressing any spurious alternatives, ensuring that the responses are accurate and relevant.
This approach requires only multiple valid answers per input as supervision; it does not need annotated clarification questions or explicit interpretations. This reduces the complexity of training and fits the natural flow of conversation that users expect.
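The dual objective described above can be sketched as a single reward function computed from the set of valid answers alone. This is a simplified illustration, not the authors' implementation: the function names, the exact-match comparison after normalization, and the rule that an input counts as ambiguous when it has more than one valid answer are all assumptions.

```python
from typing import List, Set


def normalize(answer: str) -> str:
    # Assumed matching rule: case-insensitive, whitespace-collapsed exact match.
    return " ".join(answer.lower().split())


def dual_reward(predicted: List[str], gold: List[str]) -> float:
    """Hypothetical reward: recall on ambiguous inputs, precision on unambiguous ones."""
    pred: Set[str] = {normalize(a) for a in predicted}
    ref: Set[str] = {normalize(a) for a in gold}
    if not pred or not ref:
        return 0.0
    matched = pred & ref
    if len(ref) > 1:
        # Ambiguous input (assumed: more than one valid answer):
        # reward the fraction of valid answers that were covered.
        return len(matched) / len(ref)
    # Unambiguous input: reward the fraction of predicted answers that are
    # valid, which penalizes spurious alternatives.
    return len(matched) / len(pred)


# Ambiguous input, only one of two valid answers covered.
print(dual_reward(["1997"], ["1997", "2001"]))
# Unambiguous input, one correct answer plus one spurious alternative.
print(dual_reward(["Paris", "Lyon"], ["paris"]))
```

Note that the supervision consists only of the list of valid answers for each input; no clarification questions or interpretation labels appear anywhere in the computation.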
Experiments and Results
We evaluated our method on conversational question answering and semantic parsing tasks. Our approach achieved higher coverage of valid answers than baseline methods. The structured output format also makes the different interpretations of a request visible to users.
Human Evaluation
To further validate our approach, we performed a human evaluation in which participants assessed the meaningfulness of the predicted interpretations and their corresponding answers. The feedback was positive, with participants appreciating the transparency and clarity of the structured responses.
Conclusion
Our method improves how large language models handle ambiguous requests. By generating structured responses that enumerate the possible interpretations, it improves the user experience and mitigates the safety risks that arise when a model silently commits to the wrong interpretation. The approach promotes transparency, and its structured output format supports downstream applications, which makes it a step toward more robust and user-friendly conversational AI systems.
