Measuring Rationale Stability for Explainable Pattern Recognition

Empirical Characterization of Rationale Stability Under Controlled Perturbations for Explainable Pattern Recognition

Source: arXiv:2604.04456v1

Abstract

Reliable pattern recognition systems should exhibit consistent behavior across similar inputs, and their explanations should remain stable. However, most Explainable AI evaluations remain instance-centric and do not explicitly quantify whether attribution patterns are consistent across samples that share the same class or represent small variations of the same input.

Introduction

Pattern recognition systems are only trustworthy if both their predictions and their explanations are reliable and consistent, which makes consistency a central concern in Explainable AI (XAI). This article discusses a metric designed to assess the consistency of model explanations, focusing on label-preserving perturbations, that is, small changes to an input that leave its label unchanged.

Proposed Metric and Methodology

We introduce a metric that quantifies the stability of model explanations. The metric is implemented with a pre-trained BERT model on the SST-2 sentiment analysis dataset. Additional robustness tests are conducted with the RoBERTa and DistilBERT models and with the IMDB dataset. SHAP (SHapley Additive exPlanations) is used to compute feature importance for the test samples.

Key Components of the Methodology:

Cosine Similarity: The proposed metric measures the cosine similarity between the SHAP values of inputs that share the same label.
Detection of Inconsistencies: The metric aims to identify inconsistent behavior, such as biased reliance on specific features or a failure to maintain consistent reasoning for similar predictions.
Experimental Evaluation: A series of experiments is conducted to evaluate how effectively the metric identifies misaligned predictions and inconsistencies in model explanations.
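
The cosine-similarity component can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each input's SHAP values have already been computed and are stored as a token-to-score dictionary, and it aggregates by mean pairwise similarity within each label, which is one plausible choice. The function names and the toy attribution values are hypothetical.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine similarity between two sparse attribution vectors (token -> score)."""
    # Only tokens present in both inputs contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumption: all-zero attributions count as dissimilar
    return dot / (norm_a * norm_b)


def class_consistency(attributions, labels):
    """Mean pairwise cosine similarity of attributions within each label."""
    by_label = {}
    for attr, label in zip(attributions, labels):
        by_label.setdefault(label, []).append(attr)

    scores = {}
    for label, group in by_label.items():
        pairs = list(combinations(group, 2))
        if pairs:  # a label with a single sample has no pairs to compare
            total = sum(cosine_similarity(a, b) for a, b in pairs)
            scores[label] = total / len(pairs)
    return scores


# Toy attribution values, invented for illustration (not real SHAP output).
attributions = [
    {"great": 0.8, "movie": 0.1},
    {"great": 0.6, "film": 0.2},
    {"boring": -0.7, "movie": 0.1},
    {"boring": -0.5, "plot": -0.5},
]
labels = ["positive", "positive", "negative", "negative"]

scores = class_consistency(attributions, labels)
for label, score in scores.items():
    print(f"{label}: {score:.3f}")
```

In this toy data, the two positive reviews both lean on "great" and score close to 1, while the negative pair scores lower because one explanation splits its weight between "boring" and "plot". Keying attributions by token is a simplification: it treats synonyms as unrelated features, so a real pipeline may need a shared feature space.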

Results and Comparison

The experiments indicate that the proposed metric identifies cases where a model's behavior deviates from its intended objectives. In comparisons against standard fidelity metrics, the new metric provides a more nuanced perspective on model behavior, surfacing inconsistencies that traditional methods may overlook.

Significance of the Findings

This framework enhances the understanding of model behavior by enabling a more robust verification of rationale stability. The ability to quantify whether models rely on consistent attribution patterns for similar inputs is crucial for building trustworthy AI systems.
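
As a rough illustration of checking rationale stability for similar inputs, the sketch below compares the attribution vector of one input with those of its label-preserving variants. It rests on assumptions: attributions are represented as token-to-score dictionaries, the helper `perturbation_stability` is hypothetical, and the numbers are invented. It does not reproduce the paper's procedure.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two sparse attribution vectors (token -> score)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumption: all-zero attributions count as dissimilar
    return dot / (norm_a * norm_b)


def perturbation_stability(original, variants):
    """Mean cosine similarity between an input's attributions and its variants'."""
    if not variants:
        raise ValueError("at least one perturbed variant is required")
    total = sum(cosine_similarity(original, v) for v in variants)
    return total / len(variants)


# Invented attributions for one input and two label-preserving variants.
original = {"great": 0.8, "movie": 0.1}
variants = [
    {"great": 0.8, "film": 0.1},   # weight stays on "great"
    {"movie": 0.8, "great": 0.1},  # weight shifts to a different token
]

stability = perturbation_stability(original, variants)
print(f"stability: {stability:.3f}")
```

The first variant keeps its weight on the same token and scores high. The second keeps the same label but moves its weight elsewhere, which pulls the average down. That shift in rationale without a change in prediction is the kind of behavior a stability check is meant to expose.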

Conclusion

The findings underscore the importance of consistency in explainable AI, particularly in applications involving pattern recognition. By offering a metric that assesses consistency under controlled perturbations, we pave the way for more effective evaluations of model behavior, ultimately contributing to the development of reliable AI systems.

Availability

The code for the proposed metric and methodology is publicly available at the following repository: https://github.com/anmspro/ESS-XAI-Stability.

RichlyAI Blog AI Guide, Tutorials, Industrial Insights, & more!
