Land Use Regression on Interpolated Urban Graphs to Assess Personal Exposure to Air Pollution

Pala, Daniele;Bosoni, Pietro;Vazifehdan, Mahin;Bellazzi, Riccardo;Dagliati, Arianna
2024-01-01

Abstract

Past research has demonstrated that continuous exposure to pollutants such as PM2.5 and PM10 is associated with an increased risk of developing, and of worsening, respiratory and neurodegenerative diseases. Quantifying and reducing exposure to these pollutants is crucial for assessing these risks and for effective prevention. In this study, we estimate personal exposure to PM2.5 by integrating sensor measurements with meteorological data and land use parameters, which could affect actual pollution levels, especially in areas located far from the sensors. Pollution data were collected from a dense network of sensors located in Pavia, Italy; meteorological and geographical data were collected from public sources. We used the geographical data to create graphs that model the city road structure, and applied Land Use Regression methods to estimate air pollution at their nodes, adjusting the measurements interpolated from the sensors for the effects of weather data, land use parameters such as the distance from the closest high-traffic road, and additional temporal information such as weekends/holidays and working days. We tested several regression methods: linear regression, both simple and with regularization (Ridge, LASSO and ElasticNet), Random Forest regression, Gradient Boosting and Support Vector Regression (SVR). Results show that meteorological variables, namely temperature and humidity, and temporal factors contribute significantly to pollution estimates at the graph nodes that differ from the values obtained through sensor interpolation alone.
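
As an illustration of the interpolation step mentioned in the abstract, the following minimal sketch spreads sensor readings onto road-graph nodes. The abstract does not state which interpolation scheme was used, so inverse distance weighting is an assumption here, as are the function name `idw_interpolate`, the `power` parameter, the projected coordinates in metres and the toy values.

```python
import math

def idw_interpolate(node, sensors, power=2.0):
    """Inverse-distance-weighted PM2.5 estimate at one graph node.

    node:    (x, y) position in metres (assumed projected coordinates)
    sensors: list of ((x, y), pm25) pairs (hypothetical readings)
    power:   distance exponent; 2.0 is a common default, not the paper's choice
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for (sx, sy), value in sensors:
        dist = math.hypot(node[0] - sx, node[1] - sy)
        if dist == 0.0:
            # The node coincides with a sensor: use its reading directly.
            return value
        weight = 1.0 / dist ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total

# Toy data: two sensors on a straight road and three graph nodes.
sensors = [((0.0, 0.0), 10.0), ((100.0, 0.0), 30.0)]
nodes = {"a": (0.0, 0.0), "b": (50.0, 0.0), "c": (75.0, 0.0)}
estimates = {name: idw_interpolate(xy, sensors) for name, xy in nodes.items()}
```

In a Land Use Regression setting, node-level estimates of this kind would serve as a baseline that a regression model then adjusts using weather, land use and temporal covariates.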

Use this identifier to cite or link to this document: https://hdl.handle.net/11571/1516555