Consistency-Guided Decoding for Logical QA Accuracy


Consistency-Guided Decoding with Proof-Driven Disambiguation for Three-Way Logical Question Answering

Source: arXiv:2604.06196v1

Abstract

Three-way logical question answering (QA) assigns True/False/Unknown to a hypothesis H given a premise set S. While modern large language models (LLMs) can be accurate on isolated examples, we identify two recurring failure modes in 3-way logic QA: (i) negation inconsistency, where answers to H and ¬H violate the deterministic label mapping, and (ii) epistemic Unknown, where the model predicts Unknown due to uncertainty or instability even when S entails one side. We present CGD-PD, a lightweight test-time layer that (a) queries a single 3-way classifier on both H and a mechanically negated form of H, (b) projects the pair onto a negation-consistent decision when possible, and (c) invokes a proof-driven disambiguation step that uses targeted binary entailment probes to selectively resolve Unknown outcomes, requiring only an average of 4-5 model calls. On the FOLIO benchmark’s first-order-logic fields, CGD-PD yields consistent gains across frontier LLMs, with relative improvements in accuracy of up to 16% over the base model, while also reducing Unknown predictions.

Introduction

Large language models (LLMs) now perform well on many question-answering tasks, but logical reasoning remains difficult for them. This is especially true of three-way logical QA, where a model must label a hypothesis True, False, or Unknown given a set of premises. This article summarizes the two failure modes the paper identifies and the test-time method it proposes to address them.

Identifying the Challenges

The paper identifies two recurring failure modes in three-way logical QA:

  • Negation Inconsistency: The model's answers to a hypothesis H and its negation ¬H violate the deterministic label mapping. If H is True, ¬H must be False (and vice versa); if H is Unknown, ¬H must also be Unknown.
  • Epistemic Unknown: The model predicts Unknown because of uncertainty or instability, even when the premise set S entails either H or ¬H.
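
The two failure modes above can be stated as simple checks. The following is a minimal sketch, not code from the paper: the names `NEGATION_MAP`, `is_negation_consistent`, and `is_epistemic_unknown` are hypothetical, and the second check assumes a gold label is available, which is only the case during evaluation.

```python
from typing import Dict

# Deterministic label mapping under negation:
# the label of H fixes the label that ¬H must receive.
NEGATION_MAP: Dict[str, str] = {
    "True": "False",
    "False": "True",
    "Unknown": "Unknown",
}


def is_negation_consistent(label_h: str, label_not_h: str) -> bool:
    """Return True when the answers to H and ¬H respect the mapping."""
    return NEGATION_MAP[label_h] == label_not_h


def is_epistemic_unknown(predicted: str, gold: str) -> bool:
    """Return True when the model says Unknown although S entails one side.

    Assumption: `gold` is the reference label from an evaluation set.
    """
    return predicted == "Unknown" and gold in ("True", "False")
```

For instance, answering True to both H and ¬H fails the first check, while answering Unknown to both passes it even though the prediction may still be an epistemic Unknown.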

Introducing CGD-PD

To address these failure modes, the paper presents Consistency-Guided Decoding with Proof-Driven Disambiguation (CGD-PD), a lightweight layer applied at test time. It works in three steps:

  • Querying Mechanism: CGD-PD queries a single 3-way classifier for both the hypothesis H and its mechanically negated form.
  • Negation-Consistent Decision Making: The framework projects the pair of answers onto a negation-consistent decision when possible.
  • Proof-Driven Disambiguation: Targeted binary entailment probes are used to selectively resolve Unknown outcomes. The method requires an average of only 4-5 model calls.
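
The three steps above can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: the summary does not specify the projection rule, the negation template, or the probe logic, so `negate`, `project`, `disambiguate`, and `cgd_pd_sketch` are hypothetical, and the model is abstracted as two caller-supplied functions.

```python
from typing import Callable, Optional

# (premises, hypothesis) -> "True" | "False" | "Unknown"
Classifier = Callable[[str, str], str]
# (premises, statement) -> True if the premises are judged to entail it
EntailmentProbe = Callable[[str, str], bool]

FLIP = {"True": "False", "False": "True", "Unknown": "Unknown"}


def negate(hypothesis: str) -> str:
    """Mechanical negation via a fixed template (assumed wording)."""
    return f"It is not the case that {hypothesis}"


def project(label_h: str, label_not_h: str) -> Optional[str]:
    """Assumed projection rule: one label for H, or None on a clash."""
    implied = FLIP[label_not_h]  # what the answer on ¬H implies about H
    if label_h == implied:
        return label_h  # pair is already negation-consistent
    if label_h == "Unknown":
        return implied  # only the ¬H query was definite
    if implied == "Unknown":
        return label_h  # only the H query was definite
    return None  # e.g. True/True: projection cannot reconcile the pair


def disambiguate(premises: str, hypothesis: str,
                 entails: EntailmentProbe) -> str:
    """Assumed probe logic: two binary entailment checks, one per side."""
    pro = entails(premises, hypothesis)
    con = entails(premises, negate(hypothesis))
    if pro and not con:
        return "True"
    if con and not pro:
        return "False"
    return "Unknown"  # neither side, or both sides, appear entailed


def cgd_pd_sketch(premises: str, hypothesis: str,
                  classify: Classifier, entails: EntailmentProbe) -> str:
    """Query H and ¬H, project, then probe only if still unresolved."""
    label_h = classify(premises, hypothesis)
    label_not_h = classify(premises, negate(hypothesis))
    decision = project(label_h, label_not_h)
    if decision is None or decision == "Unknown":
        decision = disambiguate(premises, hypothesis, entails)
    return decision
```

In this sketch the probes run only when projection leaves the answer unresolved, so a pair that is already consistent and definite costs two classifier calls and no probe calls. The call count of this sketch is a property of the assumptions above and is not meant to reproduce the paper's reported average.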

Results and Impact

On the FOLIO benchmark's first-order-logic fields, the paper reports consistent gains across frontier LLMs:

  • Relative accuracy improvements of up to 16% over the base model.
  • Fewer Unknown predictions than the base model.
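
Note that the 16% figure is a relative improvement, not a gain in percentage points. The sketch below shows how the two reported quantities could be computed; the function names are hypothetical, and the numbers in the usage note are invented for illustration and are not results from the paper.

```python
from typing import Sequence


def relative_improvement(base_accuracy: float, new_accuracy: float) -> float:
    """Relative gain of new_accuracy over base_accuracy, as a fraction."""
    if base_accuracy <= 0:
        raise ValueError("base_accuracy must be positive")
    return (new_accuracy - base_accuracy) / base_accuracy


def unknown_rate(predictions: Sequence[str]) -> float:
    """Fraction of predictions labelled Unknown."""
    if not predictions:
        raise ValueError("predictions must be non-empty")
    return sum(p == "Unknown" for p in predictions) / len(predictions)
```

With made-up accuracies, `relative_improvement(0.50, 0.58)` is about 0.16: a 16% relative improvement that corresponds to only 8 percentage points of absolute accuracy.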

Conclusion

CGD-PD targets two specific weaknesses of LLMs in three-way logical QA: inconsistent answers under negation and unwarranted Unknown predictions. As a test-time layer, it enforces negation consistency and resolves Unknown outcomes with targeted entailment probes, which improves accuracy on FOLIO while reducing Unknown predictions.


Lazarus Omolua