Fighting Back Against Attacks in Federated Learning 


Federated Learning (FL) is changing how we train AI models. Instead of sending all of your sensitive data to a central location, FL keeps the data where it is and only shares model updates. This preserves privacy and enables AI to run closer to where the data is generated.

But this openness also creates a risk: attackers can join the training process and subtly influence it, resulting in degraded accuracy, biased outputs or hidden backdoors in the model.

In this project, we set out to analyze how we can detect and mitigate such attacks in FL. To do this, we built a multi-node simulator that allows researchers and industry professionals to reproduce attacks and test defences more efficiently.

Why This Matters

  • A non-technical example: Consider a shared recipe book that chefs from many restaurants contribute to. Each chef updates just a few recipes with their own improvements. A dishonest chef could deliberately add the wrong ingredients to sabotage a dish, or quietly insert a special flavour that only they know how to fix. If nobody checks the recipes carefully, all future diners across all restaurants could end up with ruined or manipulated meals.
  • A technical example: The same concept appears in FL as data poisoning (manipulating training examples) and model poisoning (altering weight updates). These attacks are especially damaging when the federation has non-IID data distributions, imbalanced data partitions or late-joining clients. Established defences such as Multi-KRUM, Trimmed Mean and Divide and Conquer can still fail in certain scenarios.

Building the Multi-Node FL Attack Simulator

To evaluate the resilience of federated learning against real-world threats, we built a multi-node attack simulator on top of the Scaleout Systems FEDn framework. The simulator makes it possible to reproduce attacks, test defences, and scale experiments to hundreds or even thousands of clients in a controlled environment.

Key capabilities: 

  • Flexible deployment: Runs distributed FL jobs using Kubernetes, Helm and Docker.
  • Realistic data settings: Supports IID/non-IID label distributions, imbalanced data partitions and late-joining clients.
  • Attack injection: Includes implementations of common poisoning attacks (Label Flipping, Little is Enough) and allows new attacks to be defined with ease.
  • Defense benchmarking: Integrates existing aggregation strategies (FedAvg, Trimmed Mean, Multi-KRUM, Divide and Conquer) and allows experimentation with a range of defensive strategies and aggregation rules.
  • Scalable experimentation: Simulation parameters such as the number of clients, the share of malicious clients and participation patterns can be tuned from a single configuration file (see the sketch below).
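
To make the last point concrete, here is a minimal, hypothetical configuration sketch expressed as a Python dict; the parameter names and structure are assumptions for illustration only, not the simulator's actual schema.

```python
# Hypothetical simulation configuration (illustrative, not FEDn's real schema).
simulation_config = {
    "num_clients": 100,              # total number of FL clients in the federation
    "malicious_fraction": 0.2,       # share of clients running an attack
    "attack": "label_flipping",      # or "little_is_enough"
    "data_distribution": "non_iid",  # "iid" or "non_iid" label split
    "imbalanced_partitions": True,   # unequal local dataset sizes
    "late_joining": {"role": "malicious", "start_round": 5},
    "aggregator": "trimmed_mean",    # e.g. fedavg, trimmed_mean, multi_krum
    "rounds": 50,
}
```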

Building on FEDn’s architecture means the simulations benefit from robust training orchestration and client management, and can be monitored visually through the Studio web interface.

To get started with the first example project using FEDn, here is the quickstart guide.

The FEDn framework is free for all academic and research projects, as well as for industrial testing and trials.

The attack simulator is available and ready to use as open-source software.

The Attacks We Studied

  • Label Flipping (Data Poisoning) – Malicious clients flip labels in their local datasets, such as changing “cat” to “dog”, to reduce accuracy.
  • Little is Enough (Model Poisoning) – Attackers make small but targeted adjustments to their model updates to shift the global model toward their own goals. In this thesis we applied the Little is Enough attack every third round. A minimal sketch of both attacks follows this list.
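
As a rough illustration of the two attacks (not the simulator's actual implementation), the sketch below shows how a malicious client could poison its data or its update; the function names and class indices are assumptions for illustration.

```python
import numpy as np

def flip_labels(labels, src=3, dst=5):
    """Label-flipping data poisoning: relabel every sample of class `src`
    as class `dst` (e.g. 'cat' -> 'dog') before local training."""
    poisoned = labels.copy()
    poisoned[poisoned == src] = dst
    return poisoned

def little_is_enough(benign_updates, z=1.0):
    """Little-is-Enough model poisoning: craft an update that stays within
    roughly z standard deviations of the benign updates per parameter, so it
    is hard to filter out yet still pulls the aggregate in the attacker's
    direction."""
    stacked = np.stack(benign_updates)   # shape: (n_clients, n_params)
    mean = stacked.mean(axis=0)
    std = stacked.std(axis=0)
    return mean - z * std                # the malicious update
```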

Beyond Attacks — Understanding Unintentional Impact

While this study focuses on deliberate attacks, it is equally useful for understanding the consequences of marginal contributions caused by misconfigurations or device malfunctions in large-scale federations.

In our recipe example, even an honest chef might accidentally use the wrong ingredient because their oven is broken or their scale is inaccurate. The error is unintentional, but it still changes the shared recipe in ways that could be harmful if repeated by many contributors.

In cross-device or fleet learning setups, where thousands or millions of heterogeneous devices contribute to a shared model, faulty sensors, outdated configurations or unstable connections can degrade model performance in ways similar to malicious attacks. Studying attack resilience also reveals how to make aggregation rules robust to such unintentional noise.

Mitigation Strategies Explained

In FL, aggregation rules decide how to combine model updates from clients. Robust aggregation rules aim to reduce the influence of outliers, whether caused by malicious attacks or faulty devices. Here are the strategies we tested:

  • FedAvg (baseline) – Simply averages all updates without filtering. Very vulnerable to attacks.
  • Trimmed Mean (TrMean) – Sorts each parameter across clients, then discards the highest and lowest values before averaging. Reduces extreme outliers but can miss subtle attacks (a sketch of TrMean and Multi-KRUM follows this list).
  • Multi-KRUM – Scores each update by how close it is to its nearest neighbours in parameter space, keeping only those with the smallest total distance. Very sensitive to the number of updates chosen (k).
  • EE-Trimmed Mean (newly developed) – An adaptive version of TrMean that uses epsilon-greedy scheduling to decide when to test different client subsets. More resilient to changing client behaviour, late arrivals and non-IID distributions.
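
To make the first two robust rules concrete, here is a minimal sketch of coordinate-wise trimmed mean and Multi-KRUM-style scoring, assuming each client update has been flattened into a NumPy vector; it is illustrative only, not the FEDn implementation.

```python
import numpy as np

def trimmed_mean(updates, trim_ratio=0.2):
    """Coordinate-wise trimmed mean: for every parameter, drop the highest
    and lowest `trim_ratio` fraction of client values, then average."""
    n = len(updates)
    stacked = np.sort(np.stack(updates), axis=0)   # (n_clients, n_params)
    k = int(n * trim_ratio)
    return stacked[k:n - k].mean(axis=0)

def multi_krum(updates, num_malicious, num_selected):
    """Multi-KRUM: score each update by its summed squared distance to its
    n - f - 2 nearest neighbours, then average the lowest-scoring updates."""
    n = len(updates)
    stacked = np.stack(updates)
    dists = np.linalg.norm(stacked[:, None] - stacked[None, :], axis=2) ** 2
    neighbours = n - num_malicious - 2             # neighbours counted per update
    # skip column 0 (distance to itself), sum distances to nearest neighbours
    scores = np.sort(dists, axis=1)[:, 1:neighbours + 1].sum(axis=1)
    selected = np.argsort(scores)[:num_selected]
    return stacked[selected].mean(axis=0)
```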

The tables and plots presented in this post were originally designed by the Scaleout team.

Experiments

Across 180 experiments we evaluated different aggregation strategies under various attack types, malicious client ratios and data distributions. For further details, please read the full thesis here.

The table above shows one of the series of experiments, using a label-flipping attack with non-IID label distribution and partially imbalanced data partitions. The table reports Test Accuracy and Test Loss AUC, computed over all participating clients. Each aggregation strategy’s results are shown in two rows, corresponding to the two late-join policies (benign clients joining from the fifth round, or malicious clients joining from the fifth round). Columns separate the results for the three malicious proportions, yielding six experiment configurations per aggregation strategy. The best result in each configuration is shown in bold.

While the table shows a relatively homogeneous response across all defense strategies, the individual plots present a very different view. In FL, although a federation may reach a certain level of accuracy, it is equally important to examine client participation, specifically which clients successfully contributed to the training and which were rejected as malicious. The following plots illustrate client participation under different defense strategies.

With 20% malicious clients under a label-flipping attack on non-IID, partially imbalanced data, Trimmed Mean maintained overall accuracy but never fully blocked any client from contributing. While coordinate-wise trimming reduced the impact of malicious updates, it filtered parameters individually rather than excluding entire clients, allowing both benign and malicious participants to remain in the aggregation throughout training.

In a scenario with 30% late-joining malicious clients and non-IID, imbalanced data, Multi-KRUM mistakenly selected a malicious update from round 5 onward. High data heterogeneity made benign updates appear less similar to each other, allowing the malicious update to rank as one of the most central and to persist in one third of the aggregated model for the rest of training.

Why we need adaptive aggregation strategies

Existing robust aggregation rules generally depend on static thresholds to decide which client updates to include when aggregating the new global model. This is a shortcoming of current aggregation strategies that can make them vulnerable to late-participating clients, non-IID data distributions or data volume imbalances between clients. These insights led us to develop EE-Trimmed Mean (EE-TrMean).

EE-TrMean: An epsilon-greedy aggregation strategy

EE-TrMean builds on the classical Trimmed Mean, but adds an exploration-vs-exploitation, epsilon-greedy layer for client selection.

  • Exploration phase: All clients are allowed to contribute and a standard Trimmed Mean aggregation round is executed. 
  • Exploitation phase: Only the clients that were trimmed the least are included, selected through a mean scoring system based on the previous rounds they participated in.
  • The switch between the two phases is controlled by an epsilon-greedy policy with a decaying epsilon and an alpha ramp.

Each client earns a score based on whether its parameters survive trimming in each round. Over time the algorithm increasingly favors the highest-scoring clients, while occasionally exploring others to detect changes in behaviour. This adaptive approach allows EE-TrMean to increase resilience in cases where data heterogeneity and malicious activity are high. A simplified sketch of the selection logic is shown below.
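
The following is a simplified sketch of the epsilon-greedy selection loop as described above; the scoring rule, decay schedule and variable names are assumptions for illustration, not the exact EE-TrMean implementation.

```python
import random

def select_clients(clients, scores, round_idx, eps_start=1.0, eps_decay=0.95,
                   keep_fraction=0.7):
    """Epsilon-greedy client selection for EE-TrMean (simplified sketch).

    With probability epsilon (decaying over rounds) we *explore*: every client
    participates and a standard trimmed-mean round is run. Otherwise we
    *exploit*: only the clients with the best trimming-survival scores join."""
    epsilon = eps_start * (eps_decay ** round_idx)
    if random.random() < epsilon:
        return list(clients)                                    # exploration
    ranked = sorted(clients, key=lambda c: scores.get(c, 0.0), reverse=True)
    return ranked[: max(1, int(len(ranked) * keep_fraction))]   # exploitation

def update_scores(scores, survived, alpha=0.3):
    """After aggregation, update each client's running score: 1 if its
    parameters largely survived trimming this round, 0 otherwise."""
    for client, ok in survived.items():
        scores[client] = (1 - alpha) * scores.get(client, 0.0) + alpha * float(ok)
    return scores
```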

In a label-flipping scenario with 20% malicious clients and late benign joiners on non-IID, partially imbalanced data, EE-TrMean alternated between exploration and exploitation phases, initially allowing all clients and then selectively blocking low-scoring ones. While it occasionally excluded a benign client due to data heterogeneity (still far better than the known strategies), it successfully identified and minimized the contributions of malicious clients during training. This simple yet powerful modification improves the quality of the clients’ contributions. The literature reports that as long as the majority of clients are honest, the model’s accuracy remains reliable.
