How robust are ensemble machine learning explanations?

Calzarossa, Maria Carla; Giudici, Paolo; Zieni, Rasha

doi:10.1016/j.neucom.2025.129686

Several explainable AI methods are now available, but the explanations they produce can vary widely, especially when many input features are considered. This lack of robustness may limit their usability. In this paper we address this gap by contributing a methodology that: i) measures the robustness of a given set of explanations; ii) suggests how to improve robustness by tuning the model parameters. Without loss of generality, we illustrate our proposal on ensemble tree models, which typically reach high predictive performance in classification problems. We consider a toy case study with artificially generated data, as well as two real case studies from cybersecurity, specifically models used for detecting phishing websites.
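
The abstract does not say which robustness measure the paper uses. As a purely illustrative sketch, and not the authors' method, one simple way to quantify how much a set of explanations agree is the mean pairwise Spearman rank correlation between feature-importance vectors obtained from repeated fits of the same model. The function names and the importance values below are hypothetical.

```python
from itertools import combinations
from statistics import mean


def ranks(values):
    """Return average ranks (1 = largest value), sharing ranks among ties."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of entries tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def spearman(a, b):
    """Spearman correlation: Pearson correlation of the two rank vectors."""
    ra, rb = ranks(a), ranks(b)
    ma, mb = mean(ra), mean(rb)
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    var_a = sum((x - ma) ** 2 for x in ra)
    var_b = sum((y - mb) ** 2 for y in rb)
    if var_a == 0 or var_b == 0:
        raise ValueError("rank correlation is undefined for constant vectors")
    return cov / (var_a * var_b) ** 0.5


def explanation_agreement(explanations):
    """Mean pairwise Spearman correlation over a set of importance vectors."""
    pairs = list(combinations(explanations, 2))
    if not pairs:
        raise ValueError("need at least two explanations")
    return mean(spearman(a, b) for a, b in pairs)


# Hypothetical importances of four features from three refits of one model.
runs = [
    [0.40, 0.30, 0.20, 0.10],
    [0.38, 0.32, 0.18, 0.12],
    [0.10, 0.20, 0.30, 0.40],
]
agreement = explanation_agreement(runs)
```

Under this assumed measure, values near 1 mean the refits rank the features consistently, while values near 0 or below indicate unstable explanations. Comparing the score across model settings (for example, the number of trees in the ensemble) is one way such a measure could guide parameter tuning.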