Human-Centered Evaluation of Shapley XAI in High-Stakes AI

Rethinking XAI Evaluation: A Human-Centered Audit of Shapley Benchmarks in High-Stakes Settings

Recent advances in artificial intelligence (AI) have put growing emphasis on explainable AI (XAI), with Shapley values emerging as a prominent tool for explaining model decisions. However, the proliferation of Shapley variants has fragmented XAI methodology, making it difficult to reach consensus on how to deploy them in critical applications.

A new study published on arXiv (arXiv:2604.22662v1) highlights the need for an evaluation framework that aligns with human decision-making needs in high-stakes environments. The research evaluates eight Shapley value formulations in operational risk workflows, with particular attention to fraud detection.

Key Findings and Methodology

The authors used a unified amortized framework to assess the semantic differences between the Shapley variants, which allowed them to examine how those differences manifest under the low-latency constraints typical of risk management. The large-scale empirical evaluation included:

Four distinct risk datasets
A realistic fraud detection environment
3,735 case reviews with professional analysts

The analysis revealed a fundamental misalignment between standard quantitative metrics and human-perceived clarity. Metrics such as sparsity and faithfulness, while important in theory, did not correlate well with how analysts perceived the explanations the AI systems provided.

Implications for Future Research and Practice

One of the most striking outcomes was that, although none of the formulations improved objective analyst performance, the explanations consistently increased analysts' decision confidence. This raises concerns about automation bias in high-stakes settings, where overreliance on AI-generated explanations could lead to critical errors.
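The pattern described here, confidence rising while accuracy stays flat, can be monitored in a deployment with a simple comparison between reviews done with and without explanations. The sketch below assumes a hypothetical review record with a `correct` flag and a self-reported `confidence` score in [0, 1]. The function names, field names, and thresholds are illustrative and do not come from the study.

```python
from statistics import mean


def condition_summary(reviews):
    """Mean accuracy and mean confidence for one study condition.

    reviews: list of dicts with 'correct' (bool) and 'confidence' (float).
    This record schema is a hypothetical one chosen for the example.
    """
    return {
        "accuracy": mean(1.0 if r["correct"] else 0.0 for r in reviews),
        "confidence": mean(r["confidence"] for r in reviews),
    }


def confidence_without_accuracy(baseline, explained,
                                min_conf_gain=0.05, max_acc_gain=0.0):
    """Flag the case where explanations raise confidence but not accuracy.

    The thresholds are illustrative defaults, not validated cutoffs, and
    this is a descriptive comparison of means, not a significance test.
    """
    base = condition_summary(baseline)
    expl = condition_summary(explained)
    acc_delta = expl["accuracy"] - base["accuracy"]
    conf_delta = expl["confidence"] - base["confidence"]
    return {
        "accuracy_delta": acc_delta,
        "confidence_delta": conf_delta,
        "flag": conf_delta >= min_conf_gain and acc_delta <= max_acc_gain,
    }


# Hypothetical reviews: same accuracy in both conditions, higher confidence
# when explanations are shown.
baseline = [
    {"correct": True, "confidence": 0.6},
    {"correct": False, "confidence": 0.5},
]
explained = [
    {"correct": True, "confidence": 0.8},
    {"correct": False, "confidence": 0.9},
]

print(confidence_without_accuracy(baseline, explained))
```

A raised flag is only a prompt for closer review. A real study would need enough cases per condition and a proper statistical test before drawing conclusions.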

The authors argue that the current evaluation proxies, which rely heavily on quantitative assessments, are inadequate for predicting the real-world impact of AI explanations on human decision-making. They emphasize the need for a shift toward more human-centered evaluation metrics that consider how explanations influence analyst behavior and decision outcomes.

Recommendations for Operational Decision Systems

Based on their findings, the researchers offer several recommendations for organizations looking to implement XAI in operational decision systems:

Prioritize human-centric evaluation metrics that assess clarity, relevance, and decision utility.
Conduct user studies that involve professionals in relevant fields to gather qualitative feedback on AI explanations.
Continuously iterate and refine Shapley formulations based on empirical findings to enhance alignment with human cognitive processes.
Foster interdisciplinary collaboration to bridge the gap between theoretical AI research and practical application in high-stakes environments.

In conclusion, the study underscores the critical need for rethinking how we evaluate XAI systems, particularly in high-stakes settings. By placing human decision-making at the forefront of evaluation frameworks, we can enhance the effectiveness of AI systems and mitigate the risks associated with automation bias.

RichlyAI Blog AI Guide, Tutorials, Industrial Insights, & more!
