Predicting bioavailability of potentially toxic elements (PTEs) in sediment using various machine learning (ML) models: A case study in Mahabad Dam and River-Iran
Sacchi, Elisa (Writing – Review & Editing)
2024-01-01
Abstract
Given the significant impact of potentially toxic elements (PTEs) on ecosystems and human health, this paper investigated the contamination level of four PTEs (Zn, Cu, Mo and Pb) and their mobility in sediments of the Mahabad dam and river. Choosing the most effective machine learning (ML) algorithms is essential for accurately predicting the bioavailability of PTEs. Therefore, four ML models, namely decision tree regression (DTR), random forest regression (RFR), multi-layer perceptron regression (MLPR) and support vector regression (SVR), were applied and compared for estimating the bioavailability of the selected PTEs. The models used nine variables (total concentration, pH, electrical conductivity (EC), organic matter (OM) and the five chemical fractions F1 to F5 obtained by sequential extraction) measured in 100 sediment samples. The results showed that the contamination level decreases from Zn and Cu to Pb and Mo, whereas the mobility coefficient of the elements in the sediment follows the order Zn > Cu > Mo > Pb; the coefficient of variation indicated greater spatial variability for Zn and Cu. Among the four tested models, DTR and RFR performed best in predicting variations in PTE bioavailability (with ROC AUC > 0.9, R² > 0.8 and MSE > 0.5), followed by MLPR and SVR. Furthermore, the relevance of the factors controlling metal availability, evaluated using RFR-based feature importance and Pearson correlation, revealed that pH was the most important physicochemical property for Zn, Cu and Mo bioavailability, whereas EC was the determining factor for Pb. Regarding chemical speciation, F5 was inversely correlated with the target, while F1 and F2 were directly correlated; these fractions contributed significantly to the prediction results. This study illustrates the potential of ML for PTE risk control in sediments and for early warning of PTE contamination in the surrounding water.
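The relationship between the sequential-extraction fractions and mobility reported in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: the mobility factor is computed as the share of the two most labile fractions, (F1 + F2) / (F1 + ... + F5) × 100, which is one common convention but may differ from the authors' definition; the data are synthetic placeholders, not the study's measurements; and the names `mobility_factor` and `pearson_r` are hypothetical helpers introduced only for illustration.

```python
import math
import random


def mobility_factor(fractions):
    """Percentage of an element held in the two most labile fractions.

    Assumption (not stated in the abstract): MF = (F1 + F2) / sum(F1..F5) * 100.
    `fractions` is a sequence of five concentrations ordered F1 to F5.
    """
    total = sum(fractions)
    if total <= 0:
        raise ValueError("fractions must sum to a positive value")
    return 100.0 * (fractions[0] + fractions[1]) / total


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Synthetic placeholder data: 100 "samples", each with five fractions F1..F5.
# These values are randomly generated and are NOT the study's measurements.
rng = random.Random(0)
samples = [[rng.uniform(1.0, 10.0) for _ in range(5)] for _ in range(100)]

# Use the assumed mobility factor as a stand-in for the bioavailability target.
targets = [mobility_factor(sample) for sample in samples]

# Correlate each fraction with the target, one fraction at a time.
correlations = {
    f"F{i + 1}": pearson_r([sample[i] for sample in samples], targets)
    for i in range(5)
}

for name, r in correlations.items():
    print(f"{name}: r = {r:+.2f}")
```

Because the stand-in target is built from F1 and F2, the labile fractions correlate positively with it and the residual fraction F5 negatively. That matches the direction of the relationships described in the abstract, but it follows from the assumed definition rather than from the measured data.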