FedAvg (Federated Averaging) excels at efficient convergence in trusted environments because it aggregates client model updates via a simple weighted average. This keeps server-side computation low and leads to faster training rounds. For example, in benchmark studies with IID (Independent and Identically Distributed) data and honest participants, FedAvg can reach target accuracy in roughly 30-50% fewer communication rounds than more complex robust aggregators, making it the default choice for collaborative research or internal corporate training where client integrity is assumed.
Byzantine-Robust Federated Learning (e.g., Krum) vs FedAvg

Introduction
A foundational comparison of the standard FedAvg aggregation algorithm and Byzantine-robust alternatives like Krum, focusing on the critical trade-off between resilience and convergence.
Byzantine-robust algorithms like Krum take a different approach by explicitly defending against malicious clients that may submit poisoned updates. Krum selects the single client update whose summed squared distance to its n - f - 2 nearest neighbors is smallest, which filters out statistical outliers. The result is a trade-off: higher computational cost per round and potentially slower convergence in exchange for proven resilience. Krum tolerates up to f Byzantine clients out of n, provided n ≥ 2f + 3, without compromising the global model's integrity, a critical requirement for open or adversarial cross-silo settings.
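The selection rule described above can be sketched in a few lines of plain Python. This is a minimal illustration, not a production implementation: the function names (`squared_distance`, `krum`) and the toy updates are hypothetical, and updates are assumed to be flattened lists of floats.

```python
from typing import List, Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two flattened updates."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def krum(updates: Sequence[Sequence[float]], f: int) -> int:
    """Return the index of the update Krum would select.

    Assumes at most f of the n updates are Byzantine and n >= 2f + 3.
    """
    n = len(updates)
    if n < 2 * f + 3:
        raise ValueError("Krum requires n >= 2f + 3")
    k = n - f - 2  # number of nearest neighbors scored per candidate
    scores: List[float] = []
    for i in range(n):
        # Distances from candidate i to every other update, closest first.
        dists = sorted(
            squared_distance(updates[i], updates[j]) for j in range(n) if j != i
        )
        scores.append(sum(dists[:k]))
    # The candidate with the smallest score sits closest to its neighbors.
    return min(range(n), key=scores.__getitem__)


# Toy data (hypothetical): five similar updates and one obvious outlier.
updates = [
    [1.0, 1.0],
    [1.1, 0.9],
    [0.9, 1.1],
    [1.0, 1.2],
    [1.2, 1.0],
    [100.0, -100.0],
]
chosen = krum(updates, f=1)
selected_update = updates[chosen]
```

On this toy input the outlier can never win, because every one of its neighbor distances is large. Note that the server needs the value of f as an input, so the defense is only as good as that estimate.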
The key trade-off: If your priority is maximizing training speed and efficiency in a controlled, vetted network (e.g., internal departmental collaboration), choose FedAvg. If you prioritize security and model integrity in potentially untrusted or regulated multi-party environments (e.g., cross-company healthcare research under HIPAA where data provenance is uncertain), choose a Byzantine-robust aggregator like Krum. For a deeper understanding of aggregation strategies, explore our guide on Secure Aggregation (SecAgg) vs Differential Privacy (DP) for Federated Learning and the analysis of FedProx vs FedAvg for Heterogeneous Clients.
FedAvg vs Krum: Byzantine-Robust Federated Learning
Direct comparison of standard federated averaging against the Krum algorithm for security and performance.
| Metric | FedAvg (Standard) | Krum (Byzantine-Robust) |
|---|---|---|
| Byzantine Client Resilience | No | Yes |
| Convergence Rate (Typical) | 1.0x (Baseline) | 0.6x - 0.8x |
| Server-Side Aggregation Cost per Round | O(n) | O(n²) |
| Primary Use Case | Trusted, Homogeneous Clients | Untrusted, Adversarial Environments |
| Model Utility (IID Data) | High | Moderate to High |
| Model Utility (Non-IID Data) | Moderate | Low to Moderate |
| Algorithm Complexity | Low | High |
TL;DR Summary
A security-focused comparison of standard aggregation versus robust algorithms, evaluating resilience against malicious clients and the associated cost in convergence and performance.
Choose FedAvg
For cooperative, homogeneous environments prioritizing speed. FedAvg's simple weighted averaging of client updates offers fast convergence and high final accuracy when all participants are honest and data is IID or mildly non-IID. It is the baseline aggregator in FL frameworks such as TensorFlow Federated and Flower due to its simplicity and low computational overhead. However, it provides no defense against malicious or faulty clients.
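As a rough sketch of the weighted averaging involved, the snippet below weights each client update by its local example count, which is the usual FedAvg weighting. The function name `fedavg` and the toy client data are hypothetical, and updates are assumed to be flattened lists of floats rather than framework tensors.

```python
from typing import List, Sequence


def fedavg(
    updates: Sequence[Sequence[float]], num_examples: Sequence[int]
) -> List[float]:
    """Average client updates, weighted by each client's local example count."""
    if not updates or len(updates) != len(num_examples):
        raise ValueError("need exactly one example count per client update")
    total = sum(num_examples)
    if total <= 0:
        raise ValueError("total example count must be positive")
    dim = len(updates[0])
    aggregated = [0.0] * dim
    for update, n_k in zip(updates, num_examples):
        weight = n_k / total  # client's share of all training examples
        for i in range(dim):
            aggregated[i] += weight * update[i]
    return aggregated


# Three hypothetical clients with two-parameter updates.
client_updates = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
client_sizes = [10, 30, 60]
global_update = fedavg(client_updates, client_sizes)
```

The server does one pass over the n updates and never compares them with each other, which is where both the low overhead and the lack of any defense come from.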
Key Trade-off: Security vs. Utility
Robust Aggregation sacrifices model utility for security. Techniques like Krum, Median, or Trimmed Mean introduce bias by discarding potentially valid but outlier updates. This can reduce final accuracy by 2-10% absolute on non-IID data compared to FedAvg. The choice hinges on whether the threat model (malicious clients) poses a greater risk than a slight performance drop.
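To make the bias concrete, here is a small sketch of coordinate-wise Median and Trimmed Mean aggregation. The helper names (`coordinate_median`, `trimmed_mean`), the `trim` parameter, and the sample updates are illustrative assumptions, not taken from any particular framework.

```python
from statistics import median
from typing import List, Sequence


def coordinate_median(updates: Sequence[Sequence[float]]) -> List[float]:
    """Take the median of each parameter across all client updates."""
    return [median(coord) for coord in zip(*updates)]


def trimmed_mean(updates: Sequence[Sequence[float]], trim: int) -> List[float]:
    """Drop the `trim` smallest and largest values per coordinate, then average."""
    n = len(updates)
    if trim < 0 or 2 * trim >= n:
        raise ValueError("trim must satisfy 0 <= 2 * trim < number of updates")
    result = []
    for coord in zip(*updates):
        kept = sorted(coord)[trim : n - trim]
        result.append(sum(kept) / len(kept))
    return result


# Four plausible updates plus one extreme outlier (hypothetical values).
updates = [
    [1.0, 10.0],
    [2.0, 20.0],
    [3.0, 30.0],
    [4.0, 40.0],
    [1000.0, -1000.0],
]
median_update = coordinate_median(updates)
trimmed_update = trimmed_mean(updates, trim=1)
```

Both aggregators ignore the extreme client here, but the trimmed mean also throws away one honest value at each end of every coordinate. That discarded honest signal is the source of the utility loss described above, and it grows when honest clients legitimately disagree, as they do on non-IID data.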
Key Trade-off: Computational & Communication Cost
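Robust algorithms increase overhead. Krum requires O(n²) pairwise distance calculations per round, where n is the number of clients, and each distance costs O(d) for a model with d parameters. For 1000 clients, this is about 500,000 distinct pairs (roughly 1M ordered comparisons), adding significant server-side compute vs. FedAvg's O(n) averaging. Combining these methods with Secure Aggregation (SecAgg) is not straightforward, because the server must inspect individual updates to compare them, and doing so adds further protocol and communication overhead. FedAvg remains the most lightweight option.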
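The arithmetic behind the quadratic cost is easy to check. The sketch below counts the client pairs Krum must compare and estimates the per-coordinate work, assuming a hypothetical model size `d`; the function name and the chosen numbers are for illustration only.

```python
from math import comb


def krum_pairwise_cost(n: int, d: int) -> dict:
    """Estimate Krum's per-round distance workload for n clients and d parameters."""
    distinct_pairs = comb(n, 2)      # each unordered pair needs one distance
    ordered_pairs = n * (n - 1)      # count if distances are not reused symmetrically
    return {
        "distinct_pairs": distinct_pairs,
        "ordered_pairs": ordered_pairs,
        # One subtract-square-add step per coordinate per distinct pair.
        "coordinate_ops": distinct_pairs * d,
    }


# 1000 clients and an assumed model with one million parameters.
cost = krum_pairwise_cost(n=1000, d=1_000_000)
```

For 1000 clients this gives 499,500 distinct pairs (999,000 ordered), so the pair count alone is manageable. The model dimension is what makes the total expensive: each of those distances touches every parameter.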
When to Choose: Decision Guide by Role
Krum for Security Architects
Verdict: Mandatory for high-risk environments. Strengths: Krum and other Byzantine-robust algorithms (e.g., Median, Trimmed Mean) are designed to detect and filter out malicious client updates. They provide provable resilience against data poisoning and model manipulation attacks, which is critical for cross-silo collaborations with low trust, such as in competitive finance or healthcare consortia. The core trade-off is a higher server-side computational cost per round and potentially slower convergence, but this is justified when the threat model includes adversarial participants.
FedAvg for Security Architects
Verdict: Acceptable only in fully trusted or low-risk settings. Strengths: FedAvg offers simplicity and faster convergence under ideal, non-adversarial conditions. However, it is highly vulnerable; a single malicious client can significantly skew the global model. Its use should be restricted to environments with verified, vetted participants (e.g., internal departmental training) or where other layers such as Secure Aggregation (SecAgg) or Differential Privacy (DP) provide complementary protection. For architects, the choice hinges on threat modeling: if you cannot guarantee client integrity, robust aggregation is non-negotiable.
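A short worked example shows how little it takes for one client to skew a plain average. The sketch assumes an unweighted mean and an attacker who knows (or can estimate) the honest updates; the helper names and the target value are hypothetical.

```python
from typing import List, Sequence


def mean_update(updates: Sequence[Sequence[float]]) -> List[float]:
    """Unweighted average of client updates (FedAvg with equal client sizes)."""
    n = len(updates)
    return [sum(coord) / n for coord in zip(*updates)]


def craft_skewing_update(
    honest: Sequence[Sequence[float]], target: Sequence[float]
) -> List[float]:
    """Solve for the single extra update that moves the mean to `target`."""
    n_total = len(honest) + 1
    honest_sums = [sum(coord) for coord in zip(*honest)]
    return [n_total * t - s for t, s in zip(target, honest_sums)]


# Nine honest clients agree on the update [1.0, 1.0].
honest_updates = [[1.0, 1.0] for _ in range(9)]
clean_mean = mean_update(honest_updates)

# One malicious client wants the aggregate to point the opposite way.
malicious = craft_skewing_update(honest_updates, target=[-1.0, -1.0])
poisoned_mean = mean_update(honest_updates + [malicious])
```

Here a single update of [-19.0, -19.0] out of ten flips the aggregate from [1.0, 1.0] to [-1.0, -1.0]. Because the mean places no bound on any one client's contribution, the attacker can reach any target simply by scaling its update.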
Final Verdict and Recommendation
A decisive comparison of standard and robust aggregation algorithms for federated learning, guiding the choice between performance and security.
FedAvg excels at efficient convergence and high model utility in trusted, homogeneous environments because it simply averages client updates. For example, in benchmark studies with IID data and honest clients, FedAvg achieves ~95% of centralized training accuracy with significantly lower computational overhead per round compared to robust methods. Its simplicity makes it the default choice for cross-device FL on millions of benign mobile devices or within a single organization's secure silos.
Byzantine-robust algorithms like Krum take a different approach by statistically filtering or aggregating client updates to tolerate malicious actors. This strategy results in a critical trade-off: enhanced security at the cost of slower convergence and potential utility loss, especially under high attack rates. For instance, Krum keeps only the single best-scoring update each round and discards the rest, honest updates included; this protects the global model but can increase the rounds-to-convergence by 20-30% compared to FedAvg in a clean setting.
The key trade-off is between trust assumptions and resilience. If your priority is maximizing model accuracy and training speed in a controlled, low-risk environment (e.g., internal data collaboration), choose FedAvg. It is the foundation of most production FL systems. If you prioritize security and must operate in an adversarial, cross-silo setting with untrusted participants (e.g., multi-competitor consortia), choose a Byzantine-Robust algorithm like Krum or Median. For a deeper understanding of the underlying privacy mechanisms that complement these approaches, see our analysis of Secure Aggregation (SecAgg) vs Differential Privacy (DP) for Federated Learning. Furthermore, the choice of framework significantly impacts your ability to implement these algorithms; compare production-ready options in FedML vs Flower (Flwr).

About the author
Prasad Kumkar
CEO & MD, Inference Systems
Prasad Kumkar is the CEO & MD of Inference Systems and writes about AI systems architecture, LLM infrastructure, model serving, evaluation, and production deployment. Over 5+ years, he has worked across computer vision models, L5 autonomous vehicle systems, and LLM research, with a focus on taking complex AI ideas into real-world engineering systems.
His work and writing cover AI systems, large language models, AI agents, multimodal systems, autonomous systems, inference optimization, RAG, evaluation, and production AI engineering.