Predicting and Explaining Risk of Disease Worsening Using Temporal Features in Multiple Sclerosis

Buonocore, T. M.; Bosoni, P.; Nicora, G.; Vazifehdan, M.; Bellazzi, R.; Parimbelli, E.; Dagliati, A.
2023-01-01

Abstract

We present an evaluation study of two post-hoc, model-agnostic XAI methods, SHAP and AraucanaXAI, used to provide insights into the most predictive factors of worsening in multiple sclerosis (MS) patients, based on clinical observations collected over a period of 2.5 years. We pre-processed the temporal features with a Latent Class Mixed Modelling (LCMM) approach in order to discover and extract temporal trajectories as an additional informative feature. The two XAI approaches are compared on four quantitative evaluation metrics: identity, fidelity, separability, and time to compute an explanation. In addition, a qualitative comparison of the post-hoc explanations is carried out on specific scenarios in which the ML model predicted the outcome incorrectly, in an effort to debug potentially problematic model behaviour. The combined qualitative and quantitative results form the basis for a critical discussion of the properties of XAI methods and of the desiderata for healthcare applications at large, advocating for more meaningful and extensive XAI evaluation studies involving human experts.
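The abstract names identity, fidelity and separability but does not define them. As a rough illustration only, the sketch below assumes the definitions commonly used in the XAI evaluation literature: identity (recomputing the explanation for the same instance yields the same explanation), separability (distinct instances receive distinct explanations) and fidelity (the explanation's surrogate agrees with the black-box prediction). All function names and the toy explainer are hypothetical; this is not the authors' implementation, and it uses neither SHAP nor AraucanaXAI.

```python
import numpy as np


def identity_score(explain, X):
    """Fraction of instances whose explanation is unchanged when recomputed."""
    same = [np.allclose(explain(x), explain(x)) for x in X]
    return float(np.mean(same))


def separability_score(explain, X):
    """Fraction of pairs of distinct instances that get distinct explanations."""
    explanations = [explain(x) for x in X]
    hits, total = 0, 0
    for i in range(len(X)):
        for j in range(i + 1, len(X)):
            if np.array_equal(X[i], X[j]):
                continue  # identical instances are not counted for separability
            total += 1
            if not np.allclose(explanations[i], explanations[j]):
                hits += 1
    return hits / total if total else float("nan")


def fidelity_score(black_box_pred, surrogate_pred):
    """Fraction of instances where the surrogate matches the black-box label."""
    return float(np.mean(np.asarray(black_box_pred) == np.asarray(surrogate_pred)))


# Toy, deterministic explainer (assumption): per-feature contribution w * x
# of a hypothetical linear model with weights w.
w = np.array([2.0, -1.0])
toy_explain = lambda x: w * x

X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]])
identity = identity_score(toy_explain, X)
separability = separability_score(toy_explain, X)
fidelity = fidelity_score([1, 0, 1, 1], [1, 0, 0, 1])
```

With a deterministic explainer, identity is trivially satisfied; the metric becomes informative for explainers that rely on sampling. Time to compute an explanation, the fourth metric, can be measured by wrapping each `explain` call with `time.perf_counter()`.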
2023
CEUR Workshop Proceedings
English
24th Working Notes of the Conference and Labs of the Evaluation Forum, CLEF-WN 2023
2023
grc
3497
1210
1218
9
CEUR-WS
black-box; degenerative disease; disease worsening; evaluation; explainability; interpretable machine learning; local explanation; Multiple sclerosis; neurological disease; predictive modelling; surrogate model; temporal data mining; temporal features; XAI
no
none
273
info:eu-repo/semantics/conferenceObject
7
4 Contribution in Conference Proceedings (Proceeding)::4.1 Contribution in Conference Proceedings

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11571/1487683