Order in the Court: Explainable AI Methods Prone to Disagreement.

Michael Neely, Stefan F. Schouten, Maurits Bleeker, Ana Lucic

2021

In Natural Language Processing, feature-additive explanation methods quantify the independent contribution of each input token towards a model's decision. By computing the rank correlation between attention weights and the scores produced by a small sample of these methods, previous analyses have sought to either invalidate or support the role of attention-based explanations as a faithful and plausible measure of salience. To investigate what can reliably be concluded from measures of rank correlation, we comprehensively compare feature-additive methods, including attention-based explanations, across several neural architectures and tasks. In most cases, we find that none of our chosen methods agree. Therefore, we argue that rank correlation is largely uninformative and does not measure the quality of feature-additive methods. Additionally, the range of conclusions a practitioner may draw from a single explainability algorithm is limited.
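
As a rough illustration of the kind of comparison the abstract describes, the sketch below computes a rank correlation between two sets of per-token scores. It assumes Kendall's tau (in its simple tau-a form) as the correlation measure; the abstract does not say which measure is used, and the function name `kendall_tau` and the score values are hypothetical, made up only for this example.

```python
from itertools import combinations


def kendall_tau(xs, ys):
    """Kendall's tau-a between two equal-length lists of per-token scores.

    Tied pairs count as neither concordant nor discordant.
    """
    if len(xs) != len(ys):
        raise ValueError("score lists must have the same length")
    if len(xs) < 2:
        raise ValueError("need at least two tokens to compare rankings")
    concordant = discordant = 0
    for i, j in combinations(range(len(xs)), 2):
        # Positive product: both methods order tokens i and j the same way.
        product = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    n_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical scores for a five-token input (illustrative values only).
attention_weights = [0.40, 0.10, 0.25, 0.05, 0.20]
other_method_scores = [0.30, 0.35, 0.05, 0.10, 0.20]

tau = kendall_tau(attention_weights, other_method_scores)
print(f"Kendall's tau: {tau:.2f}")
```

A value near 1 means the two methods rank tokens almost identically, a value near -1 means the rankings are reversed, and a value near 0 means the rankings are unrelated. On its own, such a number says only how much two methods agree, not which of them (if either) is faithful to the model.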
