The Persuasion Paradox: When LLM Explanations Fail to Improve Human-AI Team Performance
Summary of arXiv:2604.03237v1 (cross-listed)
Large language models (LLMs) are increasingly deployed with natural-language explanations intended to make their outputs more transparent and to foster user trust. However, the study summarized here reports a concerning pattern: these explanations boost user confidence in AI outputs without reliably improving performance when humans collaborate with AI. This phenomenon is termed the “Persuasion Paradox.”
The study, conducted across three controlled human-subject experiments, examined the impact of LLM explanations on human-AI team performance in tasks involving abstract visual reasoning and deductive logical reasoning. The findings indicate a complex relationship between AI predictions, user confidence, and task accuracy.
Key Findings
- Visual Reasoning Tasks: On RAVEN matrices, LLM explanations increased user confidence without improving accuracy. Users who relied on these explanations were also less able to recover from the model's errors.
- Deductive Logical Reasoning: On LSAT problems the result was different. LLM explanations yielded the highest accuracy and recovery rates, ahead of both traditional expert-written explanations and probability-based aids.
- Model Uncertainty Exposure: Interfaces that displayed model uncertainty through predicted probabilities, along with a selective automation policy deferring uncertain cases to human intervention, significantly outperformed explanation-based interfaces in terms of accuracy and error recovery.
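The selective automation policy mentioned in the last finding can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the `Prediction` type, the `route` and `team_answers` helpers, and the 0.8 threshold are all hypothetical choices made for the example.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    label: str          # the model's top answer
    probability: float  # the model's predicted probability for that answer


def route(prediction: Prediction, threshold: float = 0.8) -> str:
    """Return 'automate' when the model is confident enough, else 'defer'.

    The threshold is an assumed value for illustration only.
    """
    if not 0.0 <= prediction.probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    return "automate" if prediction.probability >= threshold else "defer"


def team_answers(predictions, human_answers, threshold=0.8):
    """Use the model's answer on confident cases and the human's on the rest."""
    if len(predictions) != len(human_answers):
        raise ValueError("need one human answer per prediction")
    return [
        p.label if route(p, threshold) == "automate" else h
        for p, h in zip(predictions, human_answers)
    ]


preds = [Prediction("A", 0.95), Prediction("B", 0.55)]
print(team_answers(preds, ["C", "D"]))  # the second case is deferred to the human
```

In practice the threshold would likely be tuned on held-out data, and the policy is only as good as the calibration of the model's probabilities.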
The Task Dependency of Explanations
The divergence across tasks shows that narrative explanations are not uniformly effective. Their value depends on the cognitive modality of the task: in these experiments they helped on verbal, deductive LSAT problems but not on abstract visual RAVEN matrices.
Implications for Human-AI Interaction Design
These findings raise critical questions about the validity of conventional metrics used to assess human-AI interactions. Common subjective evaluations such as trust, confidence, and perceived clarity do not serve as reliable indicators of team performance.
In light of this research, the authors advocate for a paradigm shift in interaction design. Instead of viewing explanations as a one-size-fits-all solution, the emphasis should be placed on:
- Prioritizing calibrated reliance on AI systems.
- Supporting effective error recovery.
- Designing interfaces that acknowledge and address model uncertainty.
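To make these priorities measurable, an evaluation can score behavior directly rather than rely on self-reported trust or confidence. The sketch below assumes one plausible definition of error recovery (the share of trials where the AI was wrong but the final human decision was correct). The paper's exact definition may differ, and the function names are hypothetical.

```python
def team_accuracy(final_answers, truth):
    """Fraction of trials where the final human-AI decision is correct."""
    if not truth or len(final_answers) != len(truth):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(f == t for f, t in zip(final_answers, truth)) / len(truth)


def error_recovery_rate(ai_answers, final_answers, truth):
    """Among trials where the AI was wrong, the fraction the human got right.

    Assumed definition for illustration. Returns None if the AI made no errors.
    """
    if not (len(ai_answers) == len(final_answers) == len(truth)):
        raise ValueError("inputs must be of equal length")
    ai_wrong = [(f, t) for a, f, t in zip(ai_answers, final_answers, truth) if a != t]
    if not ai_wrong:
        return None
    return sum(f == t for f, t in ai_wrong) / len(ai_wrong)


truth = ["A", "B", "C", "D"]
ai    = ["A", "B", "D", "A"]   # the AI is wrong on the last two trials
final = ["A", "B", "C", "A"]   # the human overrides one error and follows the other

print(team_accuracy(final, truth))            # 0.75
print(error_recovery_rate(ai, final, truth))  # 0.5
```

Reporting both numbers alongside confidence ratings makes it visible when an interface raises confidence without helping users catch model errors.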
As AI continues to evolve, understanding the nuanced dynamics between human users and AI systems will be crucial for ensuring optimal collaboration and performance in diverse applications.
