HS2RGB: an Encoder Approach to Transform Hyper-Spectral Images to Enriched RGB Images
Gazzoni M.; Torti E.; Marenzi E.; Danese G.; Leporati F.
2024-01-01
Abstract
Hyperspectral imaging (HSI) captures detailed spectral information across numerous wavelengths, enabling richer object characterization than conventional RGB imaging. Despite these advantages, training deep learning models on HSI data is challenging because large HSI datasets are scarce, whereas RGB datasets are widely available. To address this issue, we propose an encoder model that transforms hyperspectral images into enriched RGB images. These enriched images provide a graphical depiction of the HSI data and form a new dataset that can be used as input for well-known models pre-trained on RGB images. In this work, we introduce HS2RGB, an encoder model based on the Vision Transformer (ViT) architecture, which condenses hyperspectral data into a three-element vector interpreted as RGB channels. The results demonstrate the effectiveness of the images generated by the encoder, which show better visual differentiation of features than traditional RGB images. Moreover, the latent vectors of the same tissue type are more consistent across different samples than those of images generated with feature selection and transformation techniques such as PCA and t-SNE. Finally, we tested the enriched RGB images with Meta's SAM model for instance segmentation and found that our model's images yield more precise identification of regions of interest, such as tumours in medical images.
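For orientation, the PCA baseline mentioned in the abstract can be sketched as follows. This is a minimal illustration of a generic PCA projection of a hyperspectral cube onto three channels, not the HS2RGB encoder and not necessarily the exact baseline configuration used by the authors. The function name `pca_to_rgb`, the `(height, width, bands)` cube layout, the per-channel min-max scaling, and the random test cube are all assumptions made for this example.

```python
import numpy as np


def pca_to_rgb(cube: np.ndarray) -> np.ndarray:
    """Project an (H, W, B) cube onto its first three principal components.

    Hypothetical helper for illustration; assumes B >= 3 and H * W >= 3.
    Returns an (H, W, 3) array with each channel scaled to [0, 1].
    """
    h, w, bands = cube.shape
    # Flatten to one spectrum per row and centre each band (astype copies).
    pixels = cube.reshape(-1, bands).astype(np.float64)
    pixels -= pixels.mean(axis=0)

    # Rows of vt are the principal axes, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(pixels, full_matrices=False)
    projected = pixels @ vt[:3].T  # shape (H * W, 3)

    # Assumed display scaling: independent min-max per output channel.
    lo = projected.min(axis=0)
    span = projected.max(axis=0) - lo
    span[span == 0] = 1.0  # avoid division by zero on constant channels
    return ((projected - lo) / span).reshape(h, w, 3)


# Synthetic stand-in for a real hyperspectral image (assumption).
rng = np.random.default_rng(0)
cube = rng.random((16, 16, 64))
rgb = pca_to_rgb(cube)
```

A projection like this is fitted per image, so the same tissue type can land on different colours in different samples; the abstract reports that the learned encoder is more consistent in this respect.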