A Pixel-Based Deep Learning Approach for Building Height Estimation From Single SAR Images

Russo L.; Memar B.; Gamba P.
2026-01-01

Abstract

Accurate building height estimation from very high-resolution (VHR) Synthetic Aperture Radar (SAR) imagery plays a pivotal role in urban analysis tasks. This paper presents a pixel-based deep learning (DL) framework for estimating building height maps from single COSMO-SkyMed (CSK) SAR images. Supervised training is provided through a refined normalized Digital Surface Model (nDSM), constructed by fusing public building height data with a globally available DSM baseline using a distance-weighted blending scheme. The proposed architecture features a modified Attention U-Net with dual decoders, specialized for built-up and background areas, and is trained using a Mean Absolute Error (MAE) loss for increased robustness to SAR-specific distortions. The model is evaluated across a multi-continental dataset covering eight cities, and tested under both in-distribution and cross-city out-of-distribution (OOD) conditions. The results show that the approach outperforms recent object-based and multimodal benchmarks, especially in European and American cities, although challenges remain in high-rise Asian metropolises.
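The robustness argument for the MAE loss can be illustrated with a small numerical sketch. The code below is not the authors' training code; the height values and the single 40 m outlier are made-up assumptions, chosen only to show how one badly distorted pixel affects MAE compared with a squared-error loss.

```python
def mae(pred, target):
    """Mean absolute error between two equal-length sequences of heights (m)."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def mse(pred, target):
    """Mean squared error, shown only for comparison with MAE."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


# Hypothetical reference heights (m) for five pixels.
target = [10.0, 12.0, 8.0, 15.0, 9.0]

# A prediction that is off by 1 m on every pixel.
pred = [11.0, 13.0, 9.0, 16.0, 10.0]

# The same prediction, except the last pixel is off by 40 m,
# standing in for a strongly distorted pixel (illustrative assumption).
pred_outlier = [11.0, 13.0, 9.0, 16.0, 49.0]

print("clean   MAE:", mae(pred, target), " MSE:", mse(pred, target))
print("outlier MAE:", mae(pred_outlier, target), " MSE:", mse(pred_outlier, target))
```

In this toy case the single outlier raises the MAE from 1.0 to 8.8, while the MSE grows from 1.0 to 320.8. Because the absolute error grows linearly rather than quadratically, a few extreme pixels contribute far less to an MAE objective than to a squared-error one.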

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11571/1550698