Glossary

Communication Rounds

Communication Rounds are the fundamental iterative cycles in federated learning where a central server coordinates model training across distributed clients without sharing raw data.

FEDERATED LEARNING

What Is a Communication Round?

A core iterative cycle in federated learning where a global model is collaboratively improved across distributed devices without centralizing raw data.

In federated learning, a communication round is the fundamental iterative cycle in which a central server coordinates the collaborative training of a global model across a distributed network of clients. Each round consists of three phases: the server broadcasts the current global model to a selected subset of clients; each client performs local training (e.g., via Stochastic Gradient Descent) on its private data and returns a model update; and the server aggregates the returned updates (e.g., using Federated Averaging) to produce an improved global model. This process repeats until convergence. Only model updates are exchanged, so raw, sensitive data stays on the clients.
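The three phases can be sketched end to end on a toy problem. This is a minimal illustration, not a production design: the scalar model, the synthetic clients, and the names `local_train` and `run_round` are all assumptions made for the example.

```python
import random

def local_train(w, data, lr=0.1, epochs=5):
    """Run a few epochs of SGD on a 1-D least-squares model y = w * x."""
    for _ in range(epochs):
        for x, y in data:
            grad = 2 * (w * x - y) * x
            w -= lr * grad
    return w

def run_round(global_w, clients, cohort_size, rng):
    """One round: select and broadcast, train locally, aggregate."""
    cohort = rng.sample(clients, cohort_size)            # server selects clients
    results = [(local_train(global_w, data), len(data)) for data in cohort]
    total = sum(n for _, n in results)
    return sum(w * n for w, n in results) / total        # FedAvg-style weighted average

rng = random.Random(0)
# Hypothetical clients: each holds private (x, y) pairs generated from y = 3x.
clients = [[(x, 3.0 * x) for x in (rng.uniform(-1, 1) for _ in range(20))]
           for _ in range(10)]

w = 0.0
for _ in range(20):                                      # 20 communication rounds
    w = run_round(w, clients, cohort_size=4, rng=rng)
```

Only the scalar `w` crosses the client boundary in either direction; the `(x, y)` pairs never leave the list that stands in for each client.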

The efficiency of federated learning is commonly measured by the number of communication rounds required for the model to converge, because this directly affects training time, bandwidth cost, and device energy consumption. Key challenges include managing statistical heterogeneity (non-IID data) across clients, which can cause client drift and slow convergence, and coping with partial client participation, where only a fraction of devices are available each round. Advanced algorithms like FedProx and SCAFFOLD are designed to reduce the required rounds and improve stability under these conditions.

FEDERATED LEARNING

Key Components of a Communication Round

A communication round is the fundamental iterative cycle in federated learning where the global model is updated through decentralized collaboration. Each round consists of distinct, orchestrated phases between a central server and participating edge clients.

Server Model Broadcast

The round begins with the central server selecting a cohort of available clients and transmitting the current global model parameters to them. This broadcast is a one-to-many distribution, often optimized for bandwidth. The selection strategy can be random, based on device capability, or designed to maximize statistical diversity. In cross-device FL, this phase must handle intermittent connectivity and device churn.
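As a rough sketch of the random selection strategy, the server can filter for clients that currently report as available and sample a cohort from them. The eligibility fields (`online`, `charging`) and the function name `select_cohort` are hypothetical; real systems define their own availability criteria.

```python
import random

def select_cohort(clients, cohort_size, rng):
    """Pick a uniformly random cohort from clients that report as available."""
    available = [c for c in clients if c["online"] and c["charging"]]
    k = min(cohort_size, len(available))   # tolerate churn: fewer clients than requested
    return rng.sample(available, k)

rng = random.Random(42)
# Hypothetical device registry with made-up availability flags.
clients = [{"id": i, "online": i % 3 != 0, "charging": i % 2 == 0} for i in range(30)]
cohort = select_cohort(clients, cohort_size=5, rng=rng)
```

Capping `k` at the number of available clients is one simple way to handle intermittent connectivity: the round proceeds with a smaller cohort instead of failing.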

Local Model Training (Epochs)

Each selected client performs local stochastic gradient descent (Local SGD) on its private, on-device dataset. This usually means one or more passes (epochs) over the local data. The key hyperparameter is the number of local steps or epochs before communication. Performing more local computation reduces communication frequency but can lead to client drift, where local models diverge due to non-IID data distributions. Algorithms like FedProx add a proximal term to the local loss to constrain this drift.
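The effect of a proximal term can be seen on a scalar model. The sketch below assumes a squared-error local loss and adds the term (mu/2) * (w - global_w)^2 to it; the data, learning rate, and the name `fedprox_local_train` are illustrative choices, not values from any particular system.

```python
def fedprox_local_train(global_w, data, mu, lr=0.05, steps=200):
    """Local SGD on (w*x - y)^2 plus the proximal term (mu/2) * (w - global_w)^2."""
    w = global_w
    for step in range(steps):
        x, y = data[step % len(data)]
        grad_loss = 2 * (w * x - y) * x
        grad_prox = mu * (w - global_w)      # pulls w back toward the global model
        w -= lr * (grad_loss + grad_prox)
    return w

data = [(1.0, 5.0), (2.0, 10.0)]             # this client's local optimum is w = 5
plain = fedprox_local_train(0.0, data, mu=0.0)   # plain local SGD (no proximal term)
prox = fedprox_local_train(0.0, data, mu=5.0)    # proximal term limits the drift
```

With `mu=0.0` the client runs all the way to its own optimum, while a positive `mu` keeps the local model closer to the broadcast model, which is the constraint on drift described above.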

Client Update Computation & Privacy

After local training, the client computes an update. This is typically the difference between the initialized and final model weights (delta) or the computed gradients. To preserve privacy, this update may be protected before transmission:

Differential Privacy: Adding calibrated noise to the update.
Secure Aggregation Preparation: Adding cryptographic masks that cancel out when all updates are summed, so the server can recover only the aggregate, not individual contributions.
Compression: Applying techniques like quantization or sparsification to reduce upload payload size.
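A common way to apply the differential privacy step is to clip the update to a fixed norm and then add Gaussian noise. The sketch below shows only the mechanics: the clipping bound and noise scale are arbitrary placeholders and are not calibrated to any privacy budget.

```python
import math
import random

def clip_update(update, max_norm):
    """Scale the update so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(v * v for v in update))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [v * scale for v in update]

def add_gaussian_noise(update, stddev, rng):
    """Add independent Gaussian noise to each coordinate."""
    return [v + rng.gauss(0.0, stddev) for v in update]

rng = random.Random(0)
delta = [3.0, 4.0]                               # L2 norm is 5
clipped = clip_update(delta, max_norm=1.0)       # approximately [0.6, 0.8]
private = add_gaussian_noise(clipped, stddev=0.1, rng=rng)  # stddev is a placeholder
```

Clipping bounds how much any single client can move the aggregate, which is what lets the noise scale be chosen relative to a known sensitivity.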

Secure Update Aggregation

The server collects updates from participating clients. The core algorithmic step is model aggregation, most commonly Federated Averaging (FedAvg), which computes an average of client updates weighted by each client's number of local training examples. For security:

Secure Aggregation protocols ensure the server learns only the aggregated sum, not individual updates.
Byzantine-robust aggregation rules (e.g., coordinate-wise median, trimmed mean) limit the influence of malicious updates from model poisoning attacks.

This phase transforms many local insights into a single, improved global model.
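To make the difference between plain averaging and a robust rule concrete, the sketch below compares a weighted average with a coordinate-wise trimmed mean on four made-up updates, one of which is an outlier. The function names and the data are illustrative assumptions.

```python
def fedavg(updates, num_examples):
    """Coordinate-wise average of updates, weighted by local example counts."""
    total = sum(num_examples)
    dim = len(updates[0])
    return [sum(u[i] * n for u, n in zip(updates, num_examples)) / total
            for i in range(dim)]

def trimmed_mean(updates, trim):
    """Coordinate-wise mean after dropping the `trim` lowest and highest values."""
    dim = len(updates[0])
    out = []
    for i in range(dim):
        vals = sorted(u[i] for u in updates)
        kept = vals[trim:len(vals) - trim]
        out.append(sum(kept) / len(kept))
    return out

# Three plausible updates and one extreme (possibly poisoned) update.
updates = [[1.0, 1.0], [1.2, 0.8], [0.8, 1.2], [100.0, -100.0]]
avg = fedavg(updates, num_examples=[10, 10, 10, 10])   # pulled far off by the outlier
robust = trimmed_mean(updates, trim=1)                 # stays close to the honest updates
```

Note that a trimmed mean needs to see individual updates, whereas secure aggregation hides them, so combining the two protections takes additional protocol design.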

Global Model Update & Evaluation

The aggregated update is applied to the global model, creating a new model version for the next round. The server then evaluates the new global model's performance, typically on a held-out validation set. Key metrics tracked across rounds include global accuracy, loss convergence, and fairness across client distributions. This evaluation informs decisions like adjusting client selection or learning rates for subsequent rounds.
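A minimal sketch of this step, assuming a two-parameter linear model and a small held-out set on the server. The server learning rate, the validation data, and the names `apply_update` and `mse` are assumptions for illustration.

```python
def apply_update(global_w, avg_delta, server_lr=1.0):
    """Apply the aggregated delta to the global model."""
    return [w + server_lr * d for w, d in zip(global_w, avg_delta)]

def mse(w, validation):
    """Mean squared error of the linear model y = w[0] * x + w[1]."""
    return sum((w[0] * x + w[1] - y) ** 2 for x, y in validation) / len(validation)

validation = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]   # held-out points from y = 2x + 1
old_w = [1.5, 0.5]
new_w = apply_update(old_w, avg_delta=[0.4, 0.4])

# Track the metric across rounds; here just before and after one update.
history = [mse(old_w, validation), mse(new_w, validation)]
```

A per-round history like this is what a server would inspect when deciding whether to change the cohort size or learning rate for later rounds.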

Round Completion & Synchronization

The round concludes with the server preparing for the next iteration. In synchronous FL, the server waits until all selected clients respond or a timeout expires, creating a consistent update cycle. In asynchronous FL, clients update the global model as soon as they finish, improving efficiency at the cost of potential staleness. The total number of communication rounds largely determines both training time and final model quality, and trading rounds against local computation per round is the central communication-computation trade-off.
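One way asynchronous schemes handle staleness is to give older results a smaller weight when mixing them into the global model. The decay rule below is a hypothetical choice made for illustration, not a prescribed formula.

```python
def async_apply(global_w, client_w, staleness, base_mix=0.5):
    """Blend a client's model into the global model, down-weighting stale results."""
    alpha = base_mix / (1 + staleness)     # hypothetical decay: older result, smaller weight
    return (1 - alpha) * global_w + alpha * client_w

# staleness = number of global updates applied since the client fetched its model
fresh = async_apply(global_w=0.0, client_w=1.0, staleness=0)   # alpha = 0.5
stale = async_apply(global_w=0.0, client_w=1.0, staleness=4)   # alpha = 0.1
```

The fresh result moves the global model halfway toward the client's model, while the stale one moves it only a tenth of the way.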

KEY DETERMINANTS

Factors Influencing Communication Efficiency

This table compares the primary technical and system factors that determine the efficiency of communication rounds in federated learning, focusing on their impact on latency, bandwidth, and overall training convergence.

Factor	High Efficiency (Favorable)	Low Efficiency (Unfavorable)	Primary Impact
Client Compute Speed	Fast, dedicated hardware (e.g., NPU)	Slow, shared CPU on constrained device	Local Training Time
Network Bandwidth	High (>100 Mbps), stable connection	Low (<1 Mbps), intermittent connection	Update Upload/Download Time
Model Size	Small, compressed (<1 MB)	Large, uncompressed (>100 MB)	Data Transferred per Round
Client Participation Rate	High, consistent participation	Low, sporadic participation	Statistical Utility per Round
Data Heterogeneity (Non-IID)	Low, similar distributions across clients	High, divergent distributions	Rounds to Convergence
Aggregation Algorithm	Robust to stragglers & heterogeneity (e.g., FedProx)	Simple averaging (FedAvg) on highly heterogeneous data	Convergence Efficiency
Update Compression	Applied (e.g., quantization, sparsification)	Not applied (full precision updates sent)	Bandwidth Consumption
Security/Privacy Overhead	Lightweight or selectively applied	Heavy (e.g., full Homomorphic Encryption)	Round Completion Latency
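A back-of-the-envelope estimate shows how the model size, bandwidth, and compute rows combine. The sketch assumes one full model download and one equally sized upload per round; the local training times are made-up values.

```python
def round_time_seconds(model_mb, down_mbps, up_mbps, local_train_s):
    """Estimate one client's round time: download + local training + upload."""
    megabits = model_mb * 8                # megabytes to megabits
    return megabits / down_mbps + local_train_s + megabits / up_mbps

# Favorable column: 1 MB model, 100 Mbps link, 5 s of local training (assumed).
fast = round_time_seconds(model_mb=1, down_mbps=100, up_mbps=100, local_train_s=5)
# Unfavorable column: 100 MB model, 1 Mbps link, 60 s of local training (assumed).
slow = round_time_seconds(model_mb=100, down_mbps=1, up_mbps=1, local_train_s=60)
```

Under these assumptions the favorable client finishes in about 5 seconds and the unfavorable one in about 28 minutes. In a synchronous round the slowest selected client sets the pace.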

COMMUNICATION ROUNDS

Frequently Asked Questions

Communication rounds are the fundamental iterative cycles in federated learning. This FAQ addresses the core mechanics, optimization strategies, and trade-offs involved in coordinating model training across decentralized devices.

What is a communication round in federated learning?

A communication round is the core iterative cycle in federated learning where a central server coordinates the collaborative training of a global model across multiple decentralized clients without exchanging raw data. Each round consists of four key phases: 1) Server Selection & Broadcast, where the server selects a subset of available clients and sends them the current global model; 2) Local Training, where each selected client computes an update by training the model on its private, on-device data; 3) Update Transmission, where clients send their local model updates (e.g., gradients or weight deltas) back to the server; and 4) Aggregation, where the server combines these updates, often using an algorithm like Federated Averaging (FedAvg) and optionally under a secure aggregation protocol, to produce a new, improved global model. The process then repeats, with the refined global model being broadcast in the next round.

FEDERATED LEARNING

Related Terms

Communication rounds are the fundamental iterative unit of federated learning. Understanding related concepts is crucial for designing efficient and robust decentralized training systems.

Federated Averaging (FedAvg)

The foundational aggregation algorithm that defines a standard communication round. The server broadcasts the global model, clients perform local SGD, and the server computes a weighted average of the returned model updates. Its simplicity makes it the baseline, but its performance degrades significantly under non-IID data and partial client participation.

Client Drift

A critical phenomenon where local client models, optimized on their heterogeneous data, diverge from the global objective during multiple local update steps. This divergence accumulates over communication rounds, slowing convergence and reducing final accuracy. Algorithms like FedProx and SCAFFOLD are designed explicitly to mitigate client drift by constraining or correcting local updates.

Statistical Heterogeneity

The defining characteristic of federated data where the data distribution varies significantly across clients (non-IID). This is the root cause of client drift and makes aggregation challenging. Heterogeneity necessitates:

Drift-aware optimization methods (e.g., FedProx, SCAFFOLD).
Personalization techniques to adapt the global model locally.
Careful client selection strategies per round.

Secure Aggregation

A cryptographic protocol that ensures privacy within a communication round. It allows the server to compute the sum of client updates without being able to inspect any individual client's contribution. This protects against a curious server reconstructing raw data from gradients. It adds computational overhead but is essential for high-assurance privacy in cross-device settings.
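The core idea can be illustrated with pairwise additive masks over integers. This toy sketch derives every mask from a shared seed in one place, which a real protocol would never do: there, each pair of clients agrees on its mask privately, and the protocol also handles clients that drop out. The function name and the integer encoding of updates are assumptions.

```python
import random

def masked_updates(updates, seed=0, modulus=2**32):
    """Add pairwise masks that cancel when all masked updates are summed."""
    n = len(updates)
    masked = list(updates)
    for i in range(n):
        for j in range(i + 1, n):
            # Stand-in for a mask derived from a secret shared only by clients i and j.
            mask = random.Random(f"{seed}-{i}-{j}").randrange(modulus)
            masked[i] = (masked[i] + mask) % modulus   # client i adds the mask
            masked[j] = (masked[j] - mask) % modulus   # client j subtracts it
    return masked

updates = [5, 11, 20]                 # integer-encoded (quantized) scalar updates
masked = masked_updates(updates)      # what the server would receive
aggregate = sum(masked) % 2**32       # masks cancel, leaving the true sum
```

Each masked value on its own looks like a random number to the server, yet the sum comes out to 36, the sum of the original updates.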

Partial Participation

The practical reality in cross-device FL where only a subset of available clients participates in each communication round, due to device availability, connectivity, or sampling strategies. This requires:

Unbiased client sampling to maintain statistical representativeness.
Aggregation algorithms robust to missing updates.
Asynchronous or semi-asynchronous protocols to avoid waiting on stragglers.

Communication Efficiency

A primary optimization goal for federated learning, as communication is often the bottleneck compared to local computation. Techniques to reduce rounds or data per round include:

Local steps: Performing multiple SGD steps per round.
Compression: Sending sparsified, quantized, or sketched updates.
Adaptive client selection: Choosing clients with the most informative updates.

The trade-off is between round efficiency and final model accuracy.
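As an example of the compression technique, top-k sparsification sends only the largest-magnitude entries of an update as index and value pairs. The sketch below is one simple variant; the function names are illustrative, and practical schemes often also accumulate the dropped entries for later rounds.

```python
def top_k_sparsify(update, k):
    """Keep the k largest-magnitude entries as (index, value) pairs."""
    idx = sorted(range(len(update)), key=lambda i: abs(update[i]), reverse=True)[:k]
    return sorted((i, update[i]) for i in idx)

def densify(pairs, dim):
    """Rebuild a dense update on the server, filling dropped entries with zeros."""
    out = [0.0] * dim
    for i, v in pairs:
        out[i] = v
    return out

update = [0.01, -0.9, 0.05, 0.4, -0.02]
sparse = top_k_sparsify(update, k=2)          # [(1, -0.9), (3, 0.4)]
restored = densify(sparse, dim=len(update))   # [0.0, -0.9, 0.0, 0.4, 0.0]
```

Here two of five entries are transmitted. The small entries that were dropped are the accuracy cost paid for the bandwidth saving.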

About the author

Prasad Kumkar

CEO & MD, Inference Systems

Prasad Kumkar is the CEO & MD of Inference Systems and writes about AI systems architecture, LLM infrastructure, model serving, evaluation, and production deployment. Over 5+ years, he has worked across computer vision models, L5 autonomous vehicle systems, and LLM research, with a focus on taking complex AI ideas into real-world engineering systems.

His work and writing cover AI systems, large language models, AI agents, multimodal systems, autonomous systems, inference optimization, RAG, evaluation, and production AI engineering.
