ViTMARE - A Vision Transformer Pipeline for Anomaly Detection in 3D Brain MRI

Peracchio, L.; Corso, L.; Santangelo, G.; Seetha, S. T.; Bortolotto, C.; Dagliati, A.; Bellazzi, R.; Nicora, G.

doi:10.3233/SHTI260128

Abstract: AI models for medical imaging often fail under dataset shift and on underrepresented patient subgroups. Detecting out-of-distribution scans, which can arise from rare pathologies, atypical anatomy, or acquisition artifacts, is therefore essential for robust deployment. We introduce ViTMARE (Vision Transformer Masked Autoencoder Reconstruction Error), a volumetric anomaly-detection pipeline for 3D brain MRI that uses Vision Transformer Masked Autoencoders (ViTMAEs) adapted to volumetric data by treating axial slices as input channels. The model is fine-tuned on normal brain volumes and evaluated using a synthetic-lesion generator that produces anatomically plausible abnormalities. During inference, ViTMARE performs multiple reconstructions (N=100) and aggregates the resulting binary anomaly masks via majority voting, followed by morphological closing and opening to suppress spurious noise. On a test set of real images with added synthetic anomalies, ViTMARE achieves a median Dice score of 0.793, a median precision of 0.912, and a median recall of 0.748. We present a reproducible pipeline and demonstrate that combining voting-based fusion with morphological postprocessing yields robust voxel-level anomaly detection.
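The fusion and postprocessing steps named in the abstract (majority voting over repeated binary masks, then morphological closing and opening, scored with Dice, precision, and recall) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the strict-majority rule, the 6-connected structuring element, and the toy volume are all assumptions made for illustration, and the toy uses 5 runs rather than N=100. How the per-run binary masks are obtained from the reconstruction error is not specified in the abstract, so the sketch starts from masks that already exist.

```python
import numpy as np
from scipy import ndimage


def majority_vote(masks):
    """Fuse N binary masks of shape (N, D, H, W) into one (D, H, W) mask.

    Assumption: a voxel is kept when strictly more than half of the runs flag it.
    """
    masks = np.asarray(masks, dtype=bool)
    return masks.sum(axis=0) > masks.shape[0] / 2


def postprocess(mask, iterations=1):
    """Apply morphological closing, then opening, to a 3D binary mask.

    Assumption: a 6-connected (face-neighbour) structuring element.
    """
    structure = ndimage.generate_binary_structure(3, 1)
    closed = ndimage.binary_closing(mask, structure=structure, iterations=iterations)
    return ndimage.binary_opening(closed, structure=structure, iterations=iterations)


def dice_precision_recall(pred, truth):
    """Voxel-level Dice, precision, and recall for two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = int(np.logical_and(pred, truth).sum())
    fp = int(np.logical_and(pred, ~truth).sum())
    fn = int(np.logical_and(~pred, truth).sum())
    dice = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 1.0
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 1.0
    return dice, precision, recall


# Toy data (purely synthetic, not from the paper): a 6x6x6 cubic "lesion".
truth = np.zeros((16, 16, 16), dtype=bool)
truth[4:10, 4:10, 4:10] = True

runs = np.repeat(truth[None], 5, axis=0)  # 5 hypothetical reconstruction runs
runs[:, 6, 6, 6] = False                  # a one-voxel hole present in every run
runs[:, 13, 13, 13] = True                # an isolated false positive in every run
for i in range(5):
    runs[i, 1, 1, 1 + i] = True           # noise that appears in a single run only

fused = majority_vote(runs)               # removes the single-run noise
cleaned = postprocess(fused)              # fills the hole, drops the isolated voxel
dice, precision, recall = dice_precision_recall(cleaned, truth)
print(f"Dice={dice:.3f} precision={precision:.3f} recall={recall:.3f}")
```

In this toy case, voting discards noise that appears in only one run, closing fills the one-voxel hole, and opening removes the isolated voxel that survived voting. Opening with a small structuring element also trims the corners and edges of the cube, which lowers recall while precision stays high. That is one plausible way such a pipeline can show higher precision than recall, but the toy example says nothing about why the reported medians differ.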