Multi-layer machine learning approach for bottom-up classification and reconstruction of electrical load profiles in smart distribution networks
Bosisio, Alessandro; Cirocco, Alessandro
2026-01-01
Abstract
This paper presents a novel four-layer machine learning approach for the classification and reconstruction of user and prosumer profiles in electrical distribution networks. Unlike conventional black-box methodologies, the proposed framework introduces a transparent, modular architecture that addresses the challenge of managing large volumes of data while ensuring efficient network operations, using intelligent preprocessing that retains over 80% of the original data. The approach develops separate models for domestic and non-domestic users and prosumers, recognizing their fundamentally different consumption patterns. The four-layer architecture integrates clustering-based pattern recognition, Random Forest classification and regression, and temperature dependency modeling to enhance electricity demand estimation and forecasting. Key innovations include bidirectional z-score normalization for energy profile reconstruction, temperature-aware load reconstruction that improves accuracy during seasonal variations (by up to 7.76% during summer periods), and systematic integration of quantitative energy data with qualitative contextual information such as Italian economic activity classification codes and tariff-based consumption patterns. Comprehensive benchmarking against four baseline methods (Cluster-based Profile Assignment, Linear Regression, Ridge Regression, and Random Forest) shows that the proposed framework achieves a 59.8–70.8% reduction in relative mean absolute error compared to the best conventional baseline for each feeder. Validation on real utility data from three medium-voltage feeders demonstrates practical applicability and robustness across different network configurations. The approach scales from individual user modeling to feeder-level reconstruction, achieving relative mean absolute error values below 4.2% and mean absolute percentage error values below 9%.
The methodology scales from training on thousands of users to modeling utility-scale deployments and is validated on real utility infrastructure rather than in simulated environments.
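The abstract does not define its normalization or error metrics, so the following is only a minimal sketch under stated assumptions. It reads "bidirectional" z-score normalization as a forward transform (profile to dimensionless shape) paired with an inverse transform (shape back to energy units). It takes relative mean absolute error to be the MAE divided by the mean actual load. Both readings, and all function names, are illustrative assumptions and not the authors' implementation.

```python
from statistics import mean, pstdev


def zscore_forward(profile):
    """Normalize a load profile to zero mean and unit variance.

    Returns the normalized shape plus the mean and standard deviation,
    which are needed later to map a shape back to energy units.
    """
    mu = mean(profile)
    sigma = pstdev(profile)
    if sigma == 0:
        sigma = 1.0  # flat profile: avoid division by zero
    return [(x - mu) / sigma for x in profile], mu, sigma


def zscore_inverse(shape, mu, sigma):
    """Map a normalized shape back to energy units (the reverse direction)."""
    return [z * sigma + mu for z in shape]


def relative_mae(actual, predicted):
    """MAE divided by the mean actual load (assumed definition)."""
    errors = [abs(a - p) for a, p in zip(actual, predicted)]
    return mean(errors) / mean(actual)


def mape(actual, predicted):
    """Mean absolute percentage error as a fraction; actual values must be non-zero."""
    return mean([abs(a - p) / abs(a) for a, p in zip(actual, predicted)])


# Hypothetical four-interval profile in kWh, for illustration only.
measured = [2.0, 4.0, 6.0, 8.0]
shape, mu, sigma = zscore_forward(measured)
round_trip = zscore_inverse(shape, mu, sigma)

# Hypothetical reconstruction used to exercise the two metrics.
reconstructed = [2.5, 3.5, 6.5, 7.5]
rmae_value = relative_mae(measured, reconstructed)
mape_value = mape(measured, reconstructed)
```

In a pipeline of the kind the abstract describes, the mean and standard deviation passed to `zscore_inverse` would presumably come from a regression model and not from the measured profile itself; the round trip here only checks that the two directions are consistent.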


