A Two-Stage LLM Framework for Accessible and Verified XAI Explanations
arXiv:2604.12543v1
Abstract: Large Language Models (LLMs) are increasingly used to translate the technical outputs of eXplainable Artificial Intelligence (XAI) methods into accessible natural-language explanations. However, existing approaches often lack guarantees of accuracy, faithfulness, and completeness. At the same time, current efforts to evaluate such narratives remain largely subjective or confined to post-hoc scoring, offering no safeguards to prevent flawed explanations from reaching end-users. To address these limitations, this paper proposes a Two-Stage LLM Meta-Verification Framework that consists of:
- An Explainer LLM that converts raw XAI outputs into natural-language narratives.
- A Verifier LLM that assesses them in terms of faithfulness, coherence, completeness, and hallucination risk.
- An iterative refeed mechanism that uses the Verifier’s feedback to refine and improve the explanations.
Experiments across five XAI techniques and datasets, using three families of open-weight LLMs, show that verification is crucial for filtering unreliable explanations while improving linguistic accessibility compared with raw XAI outputs. The analysis of the Entropy Production Rate (EPR) during the refinement process indicates that the Verifier’s feedback progressively guides the Explainer toward more stable and coherent reasoning.
Overall, the proposed framework provides an efficient pathway toward more trustworthy and democratized XAI systems.
Introduction
As AI is integrated into more sectors, demand for transparent and explainable AI systems keeps growing. eXplainable Artificial Intelligence (XAI) aims to expose how AI systems reach their decisions, but its outputs are technical, and turning them into narratives that non-experts can follow remains difficult.
Current Challenges in XAI
Despite advancements in XAI methodologies, several issues persist:
- Accuracy: Explanations may misrepresent the underlying model behavior.
- Faithfulness: Explanations often fail to reflect the logic of the AI system they describe.
- Completeness: Details essential to understanding the AI’s decision may be omitted.
- Subjectivity: Evaluation of XAI explanations is largely subjective or limited to post-hoc scoring, which does not stop a flawed explanation from reaching end-users.
The Proposed Framework
The Two-Stage LLM Meta-Verification Framework targets these challenges by separating the generation of explanations from their validation:
- Stage 1 – Explanation Generation: The Explainer LLM takes raw outputs from XAI techniques and translates them into human-readable narratives.
- Stage 2 – Verification: The Verifier LLM assesses each narrative for faithfulness, coherence, completeness, and hallucination risk.
- Iterative refeed: The Verifier’s feedback is passed back to the Explainer, which uses it to refine the explanation.
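The generate-then-verify flow, together with the refeed mechanism mentioned in the abstract, can be sketched as a short loop. This is a minimal illustration, not the paper's implementation. The names `Verdict`, `explain_with_verification`, `toy_explainer`, and `toy_verifier` are hypothetical, as are the 0-to-1 score scale, the acceptance threshold, and the round limit. Both LLMs are replaced by plain Python functions, so no real model API is assumed.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass
class Verdict:
    """Verifier output. The 0..1 scale is an assumption for this sketch."""
    scores: Dict[str, float]  # faithfulness, coherence, completeness, hallucination_risk
    feedback: str             # text fed back to the Explainer on rejection

    def accepted(self, threshold: float) -> bool:
        # Quality criteria: higher is better. Hallucination risk: lower is better.
        quality = min(self.scores[c] for c in ("faithfulness", "coherence", "completeness"))
        return quality >= threshold and self.scores["hallucination_risk"] <= 1.0 - threshold


# Hypothetical call shapes standing in for the two LLMs.
Explainer = Callable[[str, Optional[str]], str]  # (xai_output, feedback) -> narrative
Verifier = Callable[[str, str], Verdict]         # (xai_output, narrative) -> verdict


def explain_with_verification(
    xai_output: str,
    explainer: Explainer,
    verifier: Verifier,
    max_rounds: int = 3,      # assumed stopping rule
    threshold: float = 0.8,   # assumed acceptance threshold
) -> Tuple[Optional[str], int]:
    """Return (narrative, rounds_used); narrative is None if nothing passed."""
    feedback: Optional[str] = None
    for round_no in range(1, max_rounds + 1):
        narrative = explainer(xai_output, feedback)   # Stage 1: generation
        verdict = verifier(xai_output, narrative)     # Stage 2: verification
        if verdict.accepted(threshold):
            return narrative, round_no
        feedback = verdict.feedback                   # refeed into the next round
    return None, max_rounds  # filtered out: the explanation is never shown to a user


def toy_explainer(xai_output: str, feedback: Optional[str]) -> str:
    # Stand-in for an LLM: the first draft omits the features, the revision names them.
    if feedback is None:
        return "The model predicted a high risk."
    return "The model predicted a high risk, driven by age (+0.42) and income (-0.13)."


def toy_verifier(xai_output: str, narrative: str) -> Verdict:
    # Stand-in for an LLM: completeness is the share of feature names mentioned.
    features = [part.split("=")[0].strip() for part in xai_output.split(",")]
    missing = [name for name in features if name not in narrative]
    scores = {
        "faithfulness": 1.0,
        "coherence": 1.0,
        "completeness": 1.0 - len(missing) / len(features),
        "hallucination_risk": 0.0,
    }
    return Verdict(scores=scores, feedback="Mention: " + ", ".join(missing))


if __name__ == "__main__":
    result = explain_with_verification("age=+0.42, income=-0.13", toy_explainer, toy_verifier)
    print(result)
```

Returning `None` when no draft passes reflects the filtering role the abstract assigns to verification: a narrative that is never accepted is withheld rather than shown with a low score.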
Conclusion
The framework aims to improve the quality of XAI explanations and to make AI insights accessible to a wider audience, which in turn can foster greater trust among users. By making verification part of the explanation generation process rather than a post-hoc score, this research lays groundwork for further work on explainable AI.
