The FedRepo algorithm, designed to mitigate concept drift in federated learning environments, is structured around several key principles. These principles ensure that the algorithm adapts dynamically to changes in data distributions across different consumers (households), so that the deployed models remain effective. In our electricity consumption forecasting example, we use Random Forest (RF) regressor models for the regression task. Note, however, that the approach can be adapted to classification tasks by using RF classifiers and an appropriate performance evaluation metric instead.

Here, we give an overview of the individual steps of the FedRepo approach; each step is elaborated on in the remainder of the notebook.

  • Local model training: each consumer trains its own RF regressor model locally. This ensures that sensitive usage data is never shared with other consumers or with the central node. Only model parameters, such as the number of trees or the minimum number of samples required to split an internal node, are shared with the central node.
  • Federated model construction: at the central node, federated cluster models are constructed by aggregating the insights from local models trained by a group of consumers.
  • Concept drift detection: the performance of deployed federated models is regularly evaluated at the local level. This allows the framework to detect concept drift when it occurs.
  • Mitigation: if concept drift is detected, several maintenance steps are taken to mitigate its effects. These can include retraining local models on recent data.
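The steps above can be illustrated with a minimal, dependency-free sketch. It is not the FedRepo implementation: a "tree" is modelled as any callable, the federated model is assumed to be a simple pool of trees from several workers, and the relative-tolerance drift rule (`tolerance=0.2`) is an illustrative assumption. All function names are hypothetical.

```python
from statistics import mean
from typing import Callable, List, Sequence

# Hypothetical stand-in: a "tree" is any callable mapping a feature vector to a prediction.
Tree = Callable[[Sequence[float]], float]


def build_federated_model(local_forests: List[List[Tree]]) -> List[Tree]:
    """Pool the trees of several local forests into one federated forest (assumed aggregation)."""
    return [tree for forest in local_forests for tree in forest]


def predict(forest: List[Tree], x: Sequence[float]) -> float:
    """Regression forest prediction: the average over all tree outputs."""
    return mean(tree(x) for tree in forest)


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute difference between targets and predictions."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def drift_detected(baseline_error: float, recent_error: float, tolerance: float = 0.2) -> bool:
    """Assumed rule: flag drift when the recent error exceeds the baseline by more than `tolerance` (relative)."""
    return recent_error > baseline_error * (1.0 + tolerance)


# Toy example: two workers, each contributing two constant "trees".
forest_a = [lambda x: 1.0, lambda x: 2.0]
forest_b = [lambda x: 3.0, lambda x: 4.0]
federated = build_federated_model([forest_a, forest_b])

X_recent = [[0.0], [1.0]]
y_stable = [2.5, 2.5]    # matches what the federated model predicts
y_shifted = [5.0, 5.0]   # the consumption pattern has changed

preds = [predict(federated, x) for x in X_recent]
baseline = 0.5  # assumed error measured when the model was deployed
stable_flag = drift_detected(baseline, mean_absolute_error(y_stable, preds))
shifted_flag = drift_detected(baseline, mean_absolute_error(y_shifted, preds))
```

In this toy setting only the shifted targets trigger the drift flag; in practice the trees would be fitted regression trees and the error metric and threshold would be chosen for the forecasting task at hand.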

These principles are reflected in the four main phases of FedRepo: Initialization, Model training, Context-aware inference, and Dynamic model maintenance. The image below shows these phases and gives an overview of the methodology. Throughout the methodology, three repositories (hence the name FedRepo) are kept at the central node and continuously updated to adapt to concept drift:

  • \(Θ\): a repository of workers, which contains at any moment the workers for which new federated models need to be constructed.
  • \(Φ\): a repository of global federated random forest models, which contains at any moment the active (deployed) federated models.
  • \(Γ\): a repository of tree models, which contains at any moment subsets of trees from local RF models of each worker.
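One possible way to picture the three repositories is as a small container class. The sketch below is an assumption for illustration only: the class name, method names, and the rule that a worker leaves \(Θ\) once a federated model covering it is deployed (and re-enters when drift is reported) are hypothetical and not taken from the FedRepo code.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Set


@dataclass
class CentralRepositories:
    """Hypothetical container for the three repositories kept at the central node."""
    theta: Set[str] = field(default_factory=set)               # Θ: workers awaiting a new federated model
    phi: Dict[str, List[str]] = field(default_factory=dict)    # Φ: active federated model id -> contributing workers
    gamma: Dict[str, List[Any]] = field(default_factory=dict)  # Γ: worker id -> subset of its local trees

    def register(self, worker: str, trees: List[Any]) -> None:
        """Store a worker's tree subset and mark the worker as needing a federated model."""
        self.gamma[worker] = trees
        self.theta.add(worker)

    def deploy(self, model_id: str, workers: List[str]) -> List[Any]:
        """Assemble a federated model from stored trees and mark it as active."""
        trees = [t for w in workers for t in self.gamma[w]]
        self.phi[model_id] = list(workers)
        self.theta.difference_update(workers)  # assumed: covered workers leave Θ
        return trees

    def report_drift(self, worker: str) -> None:
        """Assumed maintenance step: a drifting worker re-enters Θ."""
        self.theta.add(worker)


# Usage with placeholder strings standing in for tree objects.
repos = CentralRepositories()
repos.register("household_1", ["t1", "t2"])
repos.register("household_2", ["t3"])
model_trees = repos.deploy("cluster_0", ["household_1", "household_2"])
repos.report_drift("household_2")
```

After these calls, \(Φ\) holds one active model, \(Γ\) still holds both workers' trees, and \(Θ\) contains only the worker that reported drift.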

Note that a worker refers to a consumer in this use case; in other applications, it could be any type of client or device. In the following, each of the main phases is discussed in turn and executed on the UK Power Networks dataset.

Authors: EluciDATA Lab
