An unbalanced optimal transport framework for histogram-valued regression with applications to sports analytics
Spelta, Alessandro
2026-01-01
Abstract
This paper develops a regression framework for histogram-valued data using unbalanced optimal transport. By generalizing classical optimal transport theory to account for mass imbalances, the proposed methodology operates within the space of non-negative measures, offering a more flexible and robust framework for regression analysis in distributional settings. The framework determines the optimal barycentric coordinates for constructing unbalanced Wasserstein barycenters that establish an optimal mapping between input and target histograms while preserving the underlying distributional structures. The effectiveness of the approach is demonstrated through simulation studies and an empirical application to football analytics, using performance metrics from the 2023–2024 Italian Serie A season. By regressing player-level statistics onto team-level histograms, we quantify the extent to which individual player contributions align with team-level collective dynamics. Analyzing the deviation of individual contributions across match outcomes (wins, losses, and draws), we uncover patterns that distinguish successful team strategies from less effective ones.


