Consistency-Guided Decoding with Proof-Driven Disambiguation for Three-Way Logical Question Answering
arXiv:2604.06196v1
Abstract
Three-way logical question answering (QA) assigns True/False/Unknown to a hypothesis H given a premise set S. While modern large language models (LLMs) can be accurate on isolated examples, we identify two recurring failure modes in 3-way logic QA: (i) negation inconsistency, where answers to H and ¬H violate the deterministic label mapping, and (ii) epistemic Unknown, where the model predicts Unknown due to uncertainty or instability even when S entails one side. We present CGD-PD, a lightweight test-time layer that (a) queries a single 3-way classifier on both H and a mechanically negated form of H, (b) projects the pair onto a negation-consistent decision when possible, and (c) invokes a proof-driven disambiguation step that uses targeted binary entailment probes to selectively resolve Unknown outcomes, requiring only an average of 4-5 model calls. On the FOLIO benchmark’s first-order-logic fields, CGD-PD yields consistent gains across frontier LLMs, with relative improvements in accuracy of up to 16% over the base model, while also reducing Unknown predictions.
Introduction
Large language models (LLMs) answer many natural-language questions accurately, but logical reasoning remains difficult for them, particularly in three-way logical question answering, where a hypothesis H must be labeled True, False, or Unknown given a premise set S. This article describes two recurring failure modes of current models on this task and summarizes CGD-PD, a test-time method intended to make their answers more reliable.
Identifying the Challenges
The paper identifies two recurring failure modes in three-way logic QA:
- Negation Inconsistency: The model’s answers to a hypothesis H and its negation ¬H violate the deterministic label mapping. If H is True then ¬H must be False, if H is False then ¬H must be True, and if H is Unknown then ¬H must also be Unknown.
- Epistemic Unknown: The model predicts Unknown because of uncertainty or instability, even when the premise set S entails either H or ¬H.
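The label mapping behind negation consistency can be written down directly. The following is a minimal sketch, assuming labels are the plain strings "True", "False", and "Unknown"; the names `NEGATION_MAP` and `is_negation_consistent` are illustrative and do not come from the paper.

```python
# Deterministic label mapping under negation (illustrative names).
NEGATION_MAP = {"True": "False", "False": "True", "Unknown": "Unknown"}


def is_negation_consistent(label_h: str, label_not_h: str) -> bool:
    """Return True when the answers to H and to ¬H respect the mapping."""
    return NEGATION_MAP[label_h] == label_not_h


# A model that answers "True" for both H and ¬H is negation-inconsistent.
print(is_negation_consistent("True", "False"))  # consistent pair
print(is_negation_consistent("True", "True"))   # inconsistent pair
```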
Introducing CGD-PD
To address these failure modes, the paper presents Consistency-Guided Decoding with Proof-Driven Disambiguation (CGD-PD), a lightweight layer that operates at test time in three steps:
- Querying Mechanism: CGD-PD queries a single 3-way classifier for both the hypothesis H and its mechanically negated form.
- Negation-Consistent Decision Making: The framework projects the pair of outputs onto a negation-consistent decision when possible.
- Proof-Driven Disambiguation: Targeted binary entailment probes selectively resolve Unknown outcomes. The full procedure requires an average of only 4-5 model calls.
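The three steps above can be sketched as a single function. This is an illustration under stated assumptions, not the paper's implementation: the summary does not give the exact projection rule or probe schedule, so both are guesses here. The callables `classify` (a 3-way classifier) and `entails` (a binary entailment probe) are hypothetical stand-ins for model calls, and the negated hypothesis is passed in rather than generated, since the negation mechanism is not described.

```python
from typing import Callable

# Label flip implied by negating the hypothesis.
NEGATE = {"True": "False", "False": "True", "Unknown": "Unknown"}


def cgd_pd_sketch(
    premises: str,
    hypothesis: str,
    negated_hypothesis: str,
    classify: Callable[[str, str], str],  # hypothetical 3-way classifier
    entails: Callable[[str, str], bool],  # hypothetical binary probe
) -> str:
    # (a) Query the same 3-way classifier on H and on its negation.
    label_h = classify(premises, hypothesis)
    label_neg = classify(premises, negated_hypothesis)

    # (b) Assumed projection rule: accept a definite label only when the
    # answer to the negation implies the same label.
    if label_h != "Unknown" and NEGATE[label_neg] == label_h:
        return label_h

    # (c) Assumed probe schedule: one binary entailment probe per side.
    supports_h = entails(premises, hypothesis)
    supports_neg = entails(premises, negated_hypothesis)
    if supports_h and not supports_neg:
        return "True"
    if supports_neg and not supports_h:
        return "False"
    return "Unknown"
```

This sketch issues between two and four calls per question. The paper reports an average of 4-5 model calls, so its actual probe procedure may differ from the one assumed here.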
Results and Impact
On the first-order-logic fields of the FOLIO benchmark, CGD-PD yields consistent gains across frontier LLMs:
- Relative accuracy improvements of up to 16% over the base model.
- Fewer Unknown predictions.
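Note that the reported 16% is a relative improvement, which is not the same as a gain of 16 accuracy points. The short sketch below shows the arithmetic. The accuracies used are hypothetical values chosen for illustration and are not results from the paper.

```python
def relative_improvement(base_accuracy: float, new_accuracy: float) -> float:
    """Fractional gain of new_accuracy over base_accuracy."""
    if base_accuracy <= 0:
        raise ValueError("base_accuracy must be positive")
    return (new_accuracy - base_accuracy) / base_accuracy


# Hypothetical accuracies, for illustration only (not from the paper):
# an 8-point absolute gain on a 50% baseline is a 16% relative gain.
gain = relative_improvement(0.50, 0.58)
print(f"relative gain: {gain:.0%}")
```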
Conclusion
CGD-PD targets two specific shortcomings of LLMs in three-way logical question answering: negation inconsistency and epistemic Unknown predictions. By enforcing consistency between the answers to H and ¬H and resolving Unknown outcomes with targeted binary entailment probes, it improves accuracy and reduces Unknown predictions on FOLIO at an average cost of 4-5 model calls.
