OSCAR: Reducing Hallucinations in Diffusion Language Models

OSCAR: Orchestrated Self-verification and Cross-path Refinement

Summary: arXiv:2604.01624v2

Abstract

Diffusion language models (DLMs) expose their denoising trajectories, offering a natural handle for inference-time control; accordingly, an ideal hallucination mitigation framework should intervene during generation using this model-native signal rather than relying on an externally trained hallucination classifier.

Toward this, we formulate commitment uncertainty localization: given a denoising trajectory, identify token positions whose cross-chain entropy exceeds an unsupervised threshold before factually unreliable commitments propagate into self-consistent but incorrect outputs.
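One plausible way to write this formulation down is given below. It is a sketch rather than the paper's own notation: the symbols N (number of chains), x_i^{(n)} (token committed by chain n at position i), V (vocabulary), tau (threshold), and U (flagged set) are assumptions made for illustration, and the paper may compute entropy over predictive distributions rather than over empirical token counts.

```latex
\hat{p}_i(v) = \frac{1}{N}\sum_{n=1}^{N} \mathbf{1}\!\left[x_i^{(n)} = v\right],
\qquad
H_i = -\sum_{v \in \mathcal{V}} \hat{p}_i(v)\,\log \hat{p}_i(v),
\qquad
\mathcal{U} = \{\, i : H_i > \tau \,\}
```

Under this reading, a position where all chains agree has H_i = 0, and entropy grows as the chains spread their commitments over more distinct tokens.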

Introduction to OSCAR

We introduce OSCAR, a training-free inference-time framework that operationalizes commitment uncertainty localization. OSCAR runs N parallel denoising chains with randomized reveal orders, computes cross-chain Shannon entropy to detect high-uncertainty positions, and then performs targeted remasking conditioned on retrieved evidence.
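The detection step can be illustrated with a minimal sketch. It assumes that each chain has already produced a token sequence of equal length and that entropy is taken over the empirical distribution of committed tokens per position. The function names and the threshold rule (mean plus k standard deviations) are hypothetical choices for illustration; the source only says the threshold is unsupervised.

```python
import math
from collections import Counter


def cross_chain_entropy(chains):
    """Per-position Shannon entropy (bits) of the tokens committed by N chains."""
    n_chains = len(chains)
    length = len(chains[0])
    if any(len(chain) != length for chain in chains):
        raise ValueError("all chains must have the same length")
    entropies = []
    for pos in range(length):
        counts = Counter(chain[pos] for chain in chains)
        # Empirical distribution over the tokens seen at this position.
        probs = [c / n_chains for c in counts.values()]
        entropies.append(sum(-p * math.log2(p) for p in probs))
    return entropies


def flag_uncertain(entropies, k=1.0):
    """Assumed rule (not from the source): flag positions above mean + k * std."""
    mean = sum(entropies) / len(entropies)
    var = sum((h - mean) ** 2 for h in entropies) / len(entropies)
    threshold = mean + k * math.sqrt(var)
    return [i for i, h in enumerate(entropies) if h > threshold]


# Toy example with N = 4 chains that disagree only at position 0.
chains = [
    ["Paris", "is", "the", "capital", "of", "France"],
    ["Paris", "is", "the", "capital", "of", "France"],
    ["Lyon", "is", "the", "capital", "of", "France"],
    ["Paris", "is", "the", "capital", "of", "France"],
]
entropies = cross_chain_entropy(chains)
flagged = flag_uncertain(entropies)
```

In this toy input only position 0 carries nonzero entropy, so it is the only position flagged for later remasking.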

Methodology

The framework employs a series of trajectory-level assessments, including a cross-chain divergence-at-hallucination (CDH) metric, for principled comparison of localization methods. The approach involves the following key steps:

Parallel Denoising Chains: OSCAR runs N denoising chains in parallel, so that disagreement between chains can be measured.
Randomized Reveal Orders: Each chain reveals tokens in a randomized order, which reduces order-dependent bias and gives a more reliable assessment of uncertainty.
Cross-chain Shannon Entropy: Shannon entropy computed across chains identifies token positions with high uncertainty, which become the targets for intervention.
Targeted Remasking: OSCAR remasks the detected high-uncertainty positions and regenerates them conditioned on retrieved evidence to correct potential hallucinations.
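The reveal-order and remasking steps above can be sketched with a toy stand-in for the model. Everything here is hypothetical: `toy_predict`, `CANDIDATES`, the `evidence` mapping, and the fixed `flagged` list are placeholders, and a real DLM would predict from the partially revealed sequence and batch the chains rather than loop over them.

```python
import random

MASK = "<mask>"

# Hypothetical toy "model": candidate tokens per position.
CANDIDATES = [["Paris", "Lyon"], ["is"], ["the"], ["capital"]]


def toy_predict(tokens, pos, rng, evidence=None):
    """Stand-in for one denoising prediction at position pos."""
    if evidence is not None and pos in evidence:
        return evidence[pos]  # condition on retrieved evidence
    return rng.choice(CANDIDATES[pos])


def run_chain(length, rng):
    """Denoise a fully masked sequence, revealing positions in random order."""
    tokens = [MASK] * length
    order = list(range(length))
    rng.shuffle(order)  # randomized reveal order for this chain
    for pos in order:
        tokens[pos] = toy_predict(tokens, pos, rng)
    return tokens


def remask_and_refine(tokens, flagged, rng, evidence):
    """Re-mask only the flagged positions and re-predict them with evidence."""
    refined = list(tokens)
    for pos in flagged:
        refined[pos] = MASK
    for pos in flagged:
        refined[pos] = toy_predict(refined, pos, rng, evidence)
    return refined


rng = random.Random(0)
chains = [run_chain(len(CANDIDATES), rng) for _ in range(4)]  # N = 4
flagged = [0]             # as if returned by an entropy-based detector
evidence = {0: "Paris"}   # hypothetical retrieved fact for position 0
refined = [remask_and_refine(c, flagged, rng, evidence) for c in chains]
```

Only the flagged positions are regenerated; every other token keeps the value its chain originally committed to, which is what makes the remasking targeted.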

Results

Ablation studies confirm that localization and correction contribute complementary gains, and that these gains are robust across configurations of N in {4, 8, 16}. OSCAR is applied to multiple datasets, including:

TriviaQA
HotpotQA
RAGTruth
CommonsenseQA

With models such as LLaDA-8B and Dream-7B, OSCAR demonstrates significant improvements in generation quality on these datasets. Its uncertainty-guided remasking reduces hallucinated content, improves factual accuracy, and makes the integration of retrieved evidence more effective.

Conclusion

OSCAR's native entropy-based uncertainty signal surpasses that of specialized trained detectors, which highlights the inherent capacity of diffusion language models to identify factual uncertainty. This is an advantage over the sequential token commitment structure typical of autoregressive models, and it suggests that DLMs are particularly well suited to managing factual accuracy during text generation.

RichlyAI Blog AI Guide, Tutorials, Industrial Insights, & more!
