RelAI: an automated approach to judge pointwise ML prediction reliability
Peracchio, Lorenzo;Nicora, Giovanna;Parimbelli, Enea;Buonocore, Tommaso Mario;Tavazzi, Eleonora;Bergamaschi, Roberto;Dagliati, Arianna;Bellazzi, Riccardo
2025-01-01
Abstract
Objectives: Despite significant advances in AI/ML, deployment in clinical practice faces logistical, regulatory, and trust-related challenges. To promote trust and informed use of ML predictions in real-world scenarios, reliable assessment of individual predictions is essential. We propose RelAI, a tool for pointwise reliability assessment of ML predictions that can support the identification of prediction errors during deployment. Materials and Methods: RelAI uses Autoencoders (AEs) to detect distributional shifts (Density principle) and a proxy model to encode local performance (Local Fit principle). We validated RelAI on a synthetic dataset and on a real-world scenario involving Multiple Sclerosis (MS) patient outcomes. Results: On the synthetic dataset, RelAI effectively identified unreliable predictions, outperforming alternative approaches. In the MS case study, reliable predictions exhibited higher accuracy and were associated with specific demographic and clinical features, such as sex, residence, and eye symptoms. Discussion and Conclusion: RelAI can support ML deployment in clinical settings by providing pointwise reliability assessments, ensuring regulatory compliance, and fostering user trust. Its model-agnostic nature and its compatibility with Python-based ML pipelines enhance its potential for widespread adoption.
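The two principles named in the abstract can be illustrated with a small sketch. This is not the RelAI implementation, and several parts are assumptions made for illustration: a closed-form linear autoencoder (equivalent to PCA) stands in for the Autoencoders, k-nearest-neighbour accuracy on the training set stands in for the proxy model, and all function names, thresholds, and the toy data are hypothetical.

```python
import numpy as np


def fit_linear_autoencoder(X, n_components):
    """Fit a linear autoencoder in closed form (equivalent to PCA).

    Assumption: a linear stand-in for the neural Autoencoders in the abstract.
    """
    mean = X.mean(axis=0)
    # The top right singular vectors of the centered data span the
    # subspace that minimizes squared reconstruction error.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]


def reconstruction_error(X, mean, components):
    """Mean squared reconstruction error per sample."""
    centered = X - mean
    reconstructed = centered @ components.T @ components
    return np.mean((centered - reconstructed) ** 2, axis=1)


def local_accuracy(X_query, X_ref, correct_ref, k):
    """Fraction of correct predictions among the k nearest reference samples.

    Assumption: a simple stand-in for a proxy model of local performance.
    """
    dists = np.linalg.norm(X_query[:, None, :] - X_ref[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]
    return correct_ref[nearest].mean(axis=1)


def assess_reliability(X_query, X_train, correct_train, n_components=1,
                       k=5, error_quantile=0.95, min_local_accuracy=0.8):
    """Flag a query as reliable only if it passes both checks."""
    mean, comps = fit_linear_autoencoder(X_train, n_components)
    # Density check: reconstruction error must not exceed a training quantile.
    train_err = reconstruction_error(X_train, mean, comps)
    threshold = np.quantile(train_err, error_quantile)
    in_distribution = reconstruction_error(X_query, mean, comps) <= threshold
    # Local Fit check: the classifier must have been accurate nearby.
    well_fit = local_accuracy(X_query, X_train, correct_train, k) >= min_local_accuracy
    return in_distribution & well_fit


# Toy data: training samples lie near the line y = x, and a hypothetical
# classifier is assumed to be correct only where x < 0.5.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, 200)
X_train = np.column_stack([x, x + rng.normal(0.0, 0.01, 200)])
correct_train = (x < 0.5).astype(float)

X_query = np.array([
    [0.2, 0.2],  # in-distribution, in a region where the classifier was correct
    [0.8, 0.8],  # in-distribution, but in a region where the classifier failed
    [0.2, 5.0],  # far from the training data
])
reliable = assess_reliability(X_query, X_train, correct_train)
print(reliable)
```

In this sketch a prediction is flagged as reliable only when both checks pass. How RelAI actually combines the two principles, and which thresholds it uses, is not stated in the abstract.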


