An unbalanced optimal transport framework for histogram-valued regression with applications to sports analytics

Spelta, Alessandro
2026-01-01

Abstract

This paper develops a regression framework for histogram-valued data using unbalanced optimal transport. By generalizing classical optimal transport theory to account for mass imbalances, the proposed methodology operates within the space of non-negative measures, which makes regression analysis in distributional settings more flexible and robust. The method determines the optimal barycentric coordinates used to construct unbalanced Wasserstein barycenters, which establish an optimal mapping between input and target histograms while preserving the underlying distributional structures. The effectiveness of the approach is demonstrated through simulation studies and an empirical application to football analytics, using performance metrics from the 2023–2024 Italian Serie A season. By regressing player-level statistics onto team-level histograms, we quantify the extent to which individual player contributions align with team-level collective dynamics. By analyzing how individual contributions deviate across match outcomes (wins, losses, and draws), we uncover patterns that distinguish successful team strategies from less effective ones.

Use this identifier to cite or link to this document: https://hdl.handle.net/11571/1551419