Alternative Strategies to Generate Class Activation Maps Supporting AI-based Advice in Vertebral Fracture Detection in X-ray Images

Pe S.; Bortolotto C.; Carone L.; Cisarri A.; Salina A.; Preda L.; Bellazzi R.; Parimbelli E.
2025-01-01

Abstract

Background: Balancing artificial intelligence (AI) support with appropriate human oversight is challenging, with associated risks such as algorithm aversion and technology dominance. Research areas like eXplainable AI (XAI) and Frictional AI aim to address these challenges. Studies have shown that presenting XAI explanations as juxtaposed evidence supporting contrasting classifications, rather than just providing predictions, can be beneficial.

Objectives: This study aimed to design and compare multiple pipelines for generating juxtaposed evidence in the form of class activation maps (CAMs) that highlight areas of interest in a fracture detection task with X-ray images.

Materials and Methods: We designed three pipelines to generate such evidence, all based on a fracture detection task on 630 thoraco-lumbar X-ray images (48% of which contained fractures). The first, a single-model approach, applies an algorithm of the Grad-CAM family to a ResNeXt-50 network trained through transfer learning. The second, a dual-model approach, employs two networks, one optimized for sensitivity and the other for specificity, to provide targeted explanations for positive and negative cases. The third, a generative approach, leverages autoencoders to create activation maps from feature tensors extracted from the raw images. Each approach produced two versions of activation maps: AM3, as we termed it, which captures fine-grained, low-level features, and AM4, which highlights high-level, aggregated features. We conducted a validation study by comparing the generated maps with binary ground-truth masks derived from a consensus of four clinician annotators, which identify the actual locations of fractures in a subset of positive cases.

Results: HiResCAM proved to be the best-performing Grad-CAM variant and was used in both the single- and dual-model strategies. The generative approach demonstrated the greatest overlap with the clinicians' assessments, indicating its ability to align with human expertise.

Conclusion: The results highlight the potential of Judicial AI to enhance diagnostic decision-making and foster a synergistic collaboration between humans and AI.
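
The validation step described in the abstract, comparing an activation map with a binary ground-truth mask, can be illustrated with a minimal sketch. The abstract does not state which overlap metric or binarization threshold the authors used, so the choices here (min-max normalization, a 0.5 threshold, intersection over union and the Dice coefficient) and the function name `overlap_scores` are assumptions made purely for illustration, not the authors' procedure.

```python
import numpy as np

def overlap_scores(activation_map, truth_mask, threshold=0.5):
    """Binarize an activation map and score its overlap with a binary mask.

    Hypothetical helper: the metric and threshold are illustrative
    assumptions, not taken from the paper.
    """
    am = np.asarray(activation_map, dtype=float)
    # Min-max normalize to [0, 1]; a constant map becomes all zeros.
    span = am.max() - am.min()
    am = (am - am.min()) / span if span > 0 else np.zeros_like(am)

    pred = am >= threshold                      # binarized activation map
    truth = np.asarray(truth_mask).astype(bool) # clinician consensus mask

    inter = np.logical_and(pred, truth).sum()
    union = np.logical_or(pred, truth).sum()
    total = pred.sum() + truth.sum()

    # Two empty masks are treated as a perfect match.
    iou = inter / union if union else 1.0
    dice = 2 * inter / total if total else 1.0
    return {"iou": float(iou), "dice": float(dice)}

# Toy 4x4 example: the map covers three of the four annotated pixels.
example_map = np.array([
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.9, 0.8, 0.0],
    [0.0, 0.7, 0.2, 0.0],
    [0.0, 0.0, 0.0, 0.0],
])
example_mask = np.array([
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
])
print(overlap_scores(example_map, example_mask))  # iou = 0.75, dice = 6/7
```

In a comparison like the one reported, such scores would be averaged over the annotated positive cases for each pipeline and each map version (AM3 and AM4); the aggregation actually used is not given in the abstract.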

Use this identifier to cite or link to this document: https://hdl.handle.net/11571/1549982