Vertical Federated Learning (VFL) is a decentralized training paradigm for scenarios where data is partitioned by features (columns) across parties, not by samples (rows). In VFL, different entities—such as a bank and an e-commerce platform—hold distinct feature sets about the same customers. They collaborate to train a model, such as a joint credit risk predictor, by aligning their data via encrypted entity matching and then computing over encrypted or masked intermediate results, ensuring raw feature data never leaves its owner's silo.
Vertical Federated Learning

What is Vertical Federated Learning?
Vertical Federated Learning (VFL) is a collaborative machine learning paradigm where different organizations, each holding different feature sets about the same set of entities, jointly train a model without directly sharing their raw data.
The core technical challenge in VFL is performing secure, aligned computation without centralizing data. Common architectures use a split neural network, where each party computes the initial layers on its local features. The intermediate outputs (often called smashed data or embeddings) are combined at a coordinator or another party, which computes the final layers and the loss. To limit what these exchanges reveal, training often relies on cryptographic techniques such as homomorphic encryption or secure multi-party computation when computing gradients. VFL differs from horizontal federated learning, where parties share the same feature space but hold different samples.
Key Characteristics of VFL
VFL's architecture is defined by several core technical characteristics: how features are partitioned, how samples are aligned, how the model is split, which roles the parties play, and how the forward and backward passes are protected.
Feature Partitioning by Entity
In VFL, data is partitioned vertically, that is, by feature. Different parties (e.g., a bank and an e-commerce platform) hold different attributes (features) for the same set of users (entities or sample IDs).
- Bank: Holds credit score, transaction history.
- E-commerce: Holds purchase history, browsing behavior.
The goal is to train a model that utilizes this combined feature space without any party exposing its raw feature columns. This contrasts with Horizontal Federated Learning (HFL), where parties have the same feature set but different samples.
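The difference between the two partitioning schemes can be illustrated with a toy table. The sketch below is only an illustration: the customer IDs, feature names, values, and the helper names `vertical_partition` and `horizontal_partition` are all made up for this example.

```python
# Toy customer table: one row per entity, one column per feature.
# All IDs, feature names, and values are invented for illustration.
records = {
    "u1": {"credit_score": 710, "txn_count": 42, "purchases": 5, "page_views": 130},
    "u2": {"credit_score": 640, "txn_count": 17, "purchases": 2, "page_views": 48},
    "u3": {"credit_score": 780, "txn_count": 63, "purchases": 9, "page_views": 210},
    "u4": {"credit_score": 590, "txn_count": 8, "purchases": 1, "page_views": 15},
}


def vertical_partition(table, feature_groups):
    """Split by columns: each party keeps the same entity IDs but only its own features."""
    return {
        party: {eid: {f: row[f] for f in feats} for eid, row in table.items()}
        for party, feats in feature_groups.items()
    }


def horizontal_partition(table, id_groups):
    """Split by rows: each party keeps every feature but only its own entities."""
    return {
        party: {eid: dict(table[eid]) for eid in ids}
        for party, ids in id_groups.items()
    }


# VFL-style view: bank and e-commerce platform share customers, not columns.
vfl = vertical_partition(
    records,
    {"bank": ["credit_score", "txn_count"], "ecommerce": ["purchases", "page_views"]},
)

# HFL-style view: two clients share the schema, not the customers.
hfl = horizontal_partition(records, {"client_a": ["u1", "u2"], "client_b": ["u3", "u4"]})
```

In the vertical view, both parties hold rows for `u1` through `u4` but disjoint columns; in the horizontal view, each client holds all four columns for a disjoint subset of customers.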
Sample Alignment & Cryptography
A prerequisite for VFL is Private Set Intersection (PSI) to securely identify the common set of entities across parties without revealing non-overlapping samples.
- Process: Parties use cryptographic protocols (e.g., based on Diffie-Hellman, oblivious transfer) to compute the intersection of their ID lists.
- Output: Only the aligned, overlapping samples are used for training. The protocol ensures no party learns the full ID list of another.
This step is computationally intensive but critical for privacy and model validity, preventing training on misaligned data.
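A Diffie-Hellman-style intersection can be sketched in a few lines. This is a toy under stated assumptions, not a secure implementation: the modulus `P`, the `hash_to_group` mapping, and the function names are illustrative choices, both parties are simulated in one process, and a production system would rely on a vetted elliptic-curve group and an audited PSI library.

```python
import hashlib
import secrets

# Toy modulus (a Mersenne prime). Chosen for brevity, not for security.
P = 2**127 - 1


def hash_to_group(identifier: str) -> int:
    """Map an ID to a nonzero value modulo P (illustrative hash-to-group)."""
    digest = hashlib.sha256(identifier.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (P - 1) + 1


def blind(values, secret):
    """Raise each group element to a party's secret exponent."""
    return [pow(v, secret, P) for v in values]


def psi(ids_a, ids_b):
    """Return the IDs in party A's list that also appear in party B's list."""
    a_secret = secrets.randbelow(P - 3) + 2  # known only to A
    b_secret = secrets.randbelow(P - 3) + 2  # known only to B

    # Step 1 (A -> B): H(x)^a for each of A's IDs.
    a_once = blind([hash_to_group(x) for x in ids_a], a_secret)
    # Step 2 (B -> A): (H(x)^a)^b in the same order, plus H(y)^b for B's IDs.
    a_twice = blind(a_once, b_secret)
    b_once = blind([hash_to_group(y) for y in ids_b], b_secret)
    # Step 3 (A, locally): compute (H(y)^b)^a and match double-blinded values.
    # Exponentiation commutes, so equal IDs yield equal double-blinded tags.
    b_twice = set(blind(b_once, a_secret))
    return [x for x, tag in zip(ids_a, a_twice) if tag in b_twice]


common = psi(["alice", "bob", "carol"], ["carol", "dave", "alice"])
```

In this arrangement only party A learns the intersection; A would then share the aligned IDs (or their order) with B, depending on the protocol variant.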
Split Neural Network Architecture
The model architecture is physically split across participants. A typical setup involves:
- Bottom Models: Each party holds a local model (e.g., a few neural network layers) that processes its private features.
- Interactive Layer: The outputs (embeddings or smashed data) from all bottom models are sent to a guest party or a neutral coordinator.
- Top Model: The guest/coordinator aggregates these intermediate outputs and runs the remaining layers of the network to produce the final prediction.
During backward propagation, gradients flow back through the top model to each party's bottom model for local updates, without exposing raw features.
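The forward and backward flow described above can be sketched with a deliberately tiny model. The following assumes the basic plaintext setup with no encryption, a one-layer linear bottom model per party, and a top model that sums the embeddings and applies a sigmoid; the class names, data, and learning rate are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class BottomModel:
    """A party's local model: one linear layer producing a scalar embedding."""

    def __init__(self, n_features):
        self.w = [0.0] * n_features

    def forward(self, x):
        return sum(wi * xi for wi, xi in zip(self.w, x))

    def backward(self, x, grad_embedding, lr):
        # Chain rule: dL/dw = dL/d(embedding) * x. Raw x never leaves the party.
        self.w = [wi - lr * grad_embedding * xi for wi, xi in zip(self.w, x)]


class TopModel:
    """Held by the label owner: sums the embeddings and applies a sigmoid."""

    def __init__(self):
        self.bias = 0.0

    def step(self, embeddings, label, lr):
        p = sigmoid(sum(embeddings) + self.bias)
        loss = -(label * math.log(p) + (1 - label) * math.log(1 - p))
        grad = p - label  # dL/dz for sigmoid with cross-entropy loss
        self.bias -= lr * grad
        # The top model only sums, so every embedding receives the same gradient.
        return loss, [grad] * len(embeddings)


def train_epoch(bottoms, top, party_features, labels, lr=0.1):
    total = 0.0
    for i, y in enumerate(labels):
        xs = [feats[i] for feats in party_features]
        embeddings = [m.forward(x) for m, x in zip(bottoms, xs)]  # parties -> top
        loss, grads = top.step(embeddings, y, lr)                 # top model
        for m, x, g in zip(bottoms, xs, grads):                   # top -> parties
            m.backward(x, g, lr)
        total += loss
    return total / len(labels)


# Made-up aligned samples: row i of each party refers to the same entity.
bank_x = [[1.0, 0.2], [0.1, 0.9], [0.9, 0.1], [0.2, 0.8]]
shop_x = [[0.5], [-0.5], [0.7], [-0.6]]
labels = [1, 0, 1, 0]

bottoms = [BottomModel(2), BottomModel(1)]
top = TopModel()
losses = [train_epoch(bottoms, top, [bank_x, shop_x], labels) for _ in range(20)]
```

Only scalar embeddings and their gradients cross the party boundary here; real deployments typically use wider embeddings and deeper top models, and may add the protections discussed in the following sections.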
Asymmetric Roles: Host & Guest
VFL typically involves asymmetric participant roles, unlike HFL, where every client performs the same local training task.
- Guest Party: The party that holds the labels (Y) for the aligned samples and usually hosts the top model. It initiates the training task and computes the final loss.
- Host Party(ies): Parties that hold only features (X) and host bottom models. They contribute feature representations but do not possess labels.
This role distinction fundamentally shapes the protocol's communication pattern, incentive structure, and privacy considerations.
Privacy-Preserving Forward Pass
The forward pass is designed to limit leakage of private features. Hosts never send raw features; they send only intermediate results, either in plaintext or in protected form.
- Plaintext Embeddings: In basic setups, hosts send plaintext embeddings (smashed data) to the guest. This reveals some information but not raw features.
- Enhanced Privacy: For stronger guarantees, hosts protect their embeddings using Homomorphic Encryption (HE) or Secure Multi-Party Computation (SMPC). The guest can perform computations on these protected values to continue the forward pass without seeing the underlying embeddings.
With the protected variants, the guest cannot directly invert the embeddings to reconstruct host features. Plaintext embeddings offer no such guarantee.
Secure Gradient Exchange
The backward pass must also protect sensitive information. Gradients can leak information about the underlying training data.
- Gradient Protection: Hosts receive gradients for their bottom models from the guest. To prevent label leakage from the guest to the hosts, these gradients may be obfuscated or computed using cryptographic techniques.
- Aggregation without Disclosure: Protocols like Secure Aggregation can be extended to VFL, allowing the coordinator to compute the necessary aggregated gradient information without learning any party's individual contribution.
This secure exchange is crucial for maintaining the privacy guarantee for all parties throughout the training lifecycle.
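One common building block for aggregation without disclosure is pairwise masking: each pair of parties agrees on a random mask that one adds and the other subtracts, so the masks cancel in the sum. The sketch below is a simplified illustration under stated assumptions: the modulus, the fixed-point scale, and all function names are hypothetical, masks are generated centrally rather than through a key agreement, and dropout handling is omitted.

```python
import secrets

MODULUS = 2**32
SCALE = 10**4  # fixed-point scale used to encode floats as integers


def encode(x):
    return int(round(x * SCALE)) % MODULUS


def decode(v):
    # Map a residue in [0, MODULUS) back to a signed fixed-point value.
    if v >= MODULUS // 2:
        v -= MODULUS
    return v / SCALE


def pairwise_masks(n_parties):
    """masks[(i, j)] is the secret shared by parties i and j, with i < j."""
    return {
        (i, j): secrets.randbelow(MODULUS)
        for i in range(n_parties)
        for j in range(i + 1, n_parties)
    }


def mask_value(party, value, masks, n_parties):
    """Hide one party's value behind the masks it shares with every other party."""
    masked = encode(value)
    for other in range(n_parties):
        if other == party:
            continue
        pair = (min(party, other), max(party, other))
        sign = 1 if party < other else -1  # lower index adds, higher subtracts
        masked = (masked + sign * masks[pair]) % MODULUS
    return masked


def aggregate(masked_values):
    """The coordinator sums masked values; the pairwise masks cancel out."""
    return decode(sum(masked_values) % MODULUS)


gradients = [0.25, -1.5, 3.125]  # one made-up scalar per party
masks = pairwise_masks(len(gradients))
masked = [mask_value(i, g, masks, len(gradients)) for i, g in enumerate(gradients)]
total = aggregate(masked)
```

The coordinator sees only the masked values, each of which looks uniformly random on its own, yet their sum decodes to the true total.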
Vertical vs. Horizontal Federated Learning
A comparison of the two primary data partitioning schemes in federated learning, highlighting their architectural differences, use cases, and technical challenges.
| Feature | Vertical Federated Learning (VFL) | Horizontal Federated Learning (HFL) |
|---|---|---|
| Data Partitioning Scheme | Features are partitioned across clients (same sample IDs, different features). | Samples are partitioned across clients (different sample IDs, same feature set). |
| Typical Use Case | Collaboration between organizations with complementary data on the same entities (e.g., a bank and an e-commerce site analyzing shared customers). | Training across many devices/users with similar data schemas (e.g., next-word prediction across millions of smartphones). |
| Sample Alignment Requirement | Required: parties must identify their common entities before training. | Not required: each client trains on its own samples. |
| Privacy-Preserving Entity Resolution | Required (e.g., via Private Set Intersection) to find common samples without exposing IDs. | Not required. |
| Model Architecture | Typically a split neural network. Clients hold bottom models for their features; a server or the label-holding party holds the top model. | All clients train an identical, full model architecture locally. |
| Communication Overhead per Round | High (requires exchanging intermediate activations/gradients for each aligned sample). | Lower (exchanges only model parameters or gradients). |
| Scalability to Massive Client Numbers | Limited: entity alignment and cryptographic protocols make it practical mainly for a small number of organizations (cross-silo). | High: cross-device deployments can involve millions of devices. |
| Primary Privacy Risk | Potential leakage from intermediate activations (smashed data). | Potential leakage from shared model gradients/updates. |
| Common Aggregation Method | Gradient/activation aggregation from split layers. | Parameter averaging (e.g., Federated Averaging). |
| Handling of Non-IID Data | Inherently addresses feature-wise heterogeneity. | Challenged by sample-wise heterogeneity; often addressed with algorithms like FedProx. |
Common Use Cases for VFL
Vertical Federated Learning (VFL) enables collaborative model training across organizations that hold different attributes (features) about the same entities. Its primary applications are in industries where data is highly sensitive, siloed, and complementary.
Frequently Asked Questions
Vertical Federated Learning (VFL) enables collaborative model training between organizations that hold different data features about the same entities, such as customers or patients, without sharing the raw underlying data. This FAQ addresses its core mechanisms, differences from horizontal FL, and its critical role in privacy-preserving, cross-organizational AI.
Vertical Federated Learning (VFL) is a collaborative machine learning paradigm where two or more parties, each holding a different set of features for the same set of entities (e.g., customers, patients), jointly train a model without directly exchanging their raw feature data. It works by aligning entities via privacy-preserving entity resolution (e.g., using cryptographic hashes) and then training a model where the computation is split: each party computes on its local features, and only necessary intermediate results, such as embeddings or gradients, are exchanged under encryption to complete forward and backward passes. A common architecture uses a split neural network, where the bottom layers reside with each data party and the top layers are on a coordinating server or a designated party.
Related Terms
Vertical Federated Learning (VFL) operates within a broader ecosystem of privacy-preserving and distributed machine learning techniques. These related concepts define the protocols, security measures, and architectural patterns that make VFL feasible and robust.
Horizontal Federated Learning (HFL)
The more common counterpart to VFL, where participating parties hold different data samples (e.g., different users) but with the same feature space. For example, two hospitals train a model on their separate patient records, which all contain the same clinical features. The core challenge is data distribution heterogeneity across clients.
- Key Difference: HFL partitions data by rows (samples); VFL partitions by columns (features).
- Typical Use: Cross-device learning on smartphones or IoT sensors.
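For contrast with VFL's split-layer exchange, HFL typically aggregates whole models by parameter averaging. A minimal sketch of a sample-size-weighted average in the style of Federated Averaging follows; the function name and the numbers are illustrative, and real systems average full tensors over many rounds.

```python
def federated_average(client_weights, client_sizes):
    """Average client parameter vectors, weighting each by its local sample count."""
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[k] * size for w, size in zip(client_weights, client_sizes)) / total
        for k in range(n_params)
    ]


# Two clients with the same model shape but different amounts of local data.
global_weights = federated_average([[1.0, 2.0], [3.0, 6.0]], [1, 3])
```

Because every client holds the full model, only parameters move; no per-sample activations are exchanged, which is the communication difference noted for HFL versus VFL.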
Split Learning
A distributed technique where a neural network is split by layers between parties. The client holding the raw data computes the initial layers and sends the intermediate outputs (smashed data or activations) to another party (e.g., a server), which completes the forward pass and returns the gradients needed to update the client's layers. This allows collaborative training without sharing raw input data.
- Relation to VFL: Often used as the underlying training protocol for VFL, especially when one party holds the labels. The model itself, not only the data, is divided across parties.
- Privacy Risk: Smashed data can potentially leak information, requiring privacy safeguards.
Secure Multi-Party Computation (SMPC)
A cryptographic framework that enables multiple parties to jointly compute a function (like model training or inference) over their private inputs while revealing only the final output. No party learns anything about the others' raw data beyond what is implied by the result.
- Role in VFL: Used to securely compute aggregated statistics, gradients, or loss calculations during the collaborative training process. For example, securely computing the sum of gradients from different feature holders.
- Common Protocols: Garbled Circuits, Secret Sharing.
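Additive secret sharing, the simplest of these protocols, can be sketched as follows. This is an illustration under assumptions: the modulus `Q` and the function names are hypothetical, all parties are simulated in a single process, inputs are non-negative integers, and the parties are assumed to follow the protocol honestly.

```python
import secrets

Q = 2**61 - 1  # modulus for the shares; any sufficiently large integer would do


def share(secret, n_parties):
    """Split a value into n random-looking shares that sum to it modulo Q."""
    shares = [secrets.randbelow(Q) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % Q)
    return shares


def reconstruct(shares):
    return sum(shares) % Q


def secure_sum(private_inputs):
    """Compute the sum of all inputs while each party sees only random shares."""
    n = len(private_inputs)
    # Each party splits its input and sends one share to every party.
    distributed = [share(x, n) for x in private_inputs]
    # Party j adds the j-th share from everyone and publishes only that partial sum.
    partial_sums = [sum(distributed[i][j] for i in range(n)) % Q for j in range(n)]
    # Combining the published partial sums reveals the total, and nothing else.
    return reconstruct(partial_sums)


total = secure_sum([12, 30, 7])
```

Any subset of fewer than all shares of a value is uniformly random, which is why the individual inputs stay hidden while the total is still recoverable.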
Entity Alignment
The prerequisite step in VFL where parties must securely identify overlapping samples (entities) across their disjoint feature sets without exposing their entire datasets. This is necessary because VFL assumes collaboration on the same set of entities (e.g., the same customers).
- Core Challenge: Performing alignment with privacy guarantees, often using techniques like Private Set Intersection (PSI).
- Without Alignment: Parties would be training on misaligned data, rendering the model useless. This step is unique to the vertical data partition setting.
Homomorphic Encryption (HE)
A form of encryption that allows computation on ciphertexts. An operation performed on encrypted data produces an encrypted result which, when decrypted, matches the result of the operation as if it had been performed on the plaintext.
- Role in VFL: Enables a party (e.g., the label holder) to perform computations on encrypted intermediate results from other feature holders. This allows forward/backward propagation without decrypting sensitive features.
- Trade-off: Provides strong cryptographic security but introduces significant computational overhead.
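The additive property can be seen in a toy Paillier-style scheme, one example of an additively homomorphic cryptosystem. The sketch below is for illustration only and rests on several assumptions: the primes are far too small for real use, the function names are hypothetical, messages are non-negative integers (real embeddings would need fixed-point encoding), and which party holds the private key depends on the protocol.

```python
import math
import secrets


def keygen(p=2**31 - 1, q=2**61 - 1):
    """Toy key generation from two known Mersenne primes (insecure sizes)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse; requires Python 3.8+
    return n, (lam, mu)


def encrypt(n, m):
    """Encrypt integer m (0 <= m < n) with generator g = n + 1."""
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    n_sq = n * n
    # (n + 1)^m mod n^2 simplifies to 1 + m*n.
    return ((1 + m * n) % n_sq) * pow(r, n, n_sq) % n_sq


def decrypt(n, private_key, c):
    lam, mu = private_key
    u = pow(c, lam, n * n)
    return ((u - 1) // n) * mu % n


def add_encrypted(n, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts."""
    return c1 * c2 % (n * n)


def scale_encrypted(n, c, k):
    """Raising a ciphertext to a public integer k multiplies the plaintext by k."""
    return pow(c, k, n * n)


n, private_key = keygen()
# Two parties encrypt integer-encoded intermediate values under the same public key.
c_sum = add_encrypted(n, encrypt(n, 1200), encrypt(n, 345))
c_scaled = scale_encrypted(n, encrypt(n, 7), 6)
```

Sums and public-scalar products are exactly what a linear layer needs, but each operation is a big-integer modular exponentiation or multiplication, which is where the computational overhead comes from.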
Cross-Silo Federated Learning
A federated learning setting characterized by a small number of reliable, organizational participants (e.g., 2-100 banks or hospitals). Participants have significant computational resources and stable connectivity. This is the primary deployment scenario for VFL.
- Contrast with Cross-Device FL: Cross-device involves millions of unstable edge devices (phones). VFL's need for entity alignment and complex cryptographic protocols makes it practically suited only for the cross-silo context.
- Trust Model: Parties are often known but mutually distrustful regarding their proprietary data assets.

About the author
Prasad Kumkar
CEO & MD, Inference Systems
Prasad Kumkar is the CEO & MD of Inference Systems and writes about AI systems architecture, LLM infrastructure, model serving, evaluation, and production deployment. Over 5+ years, he has worked across computer vision models, L5 autonomous vehicle systems, and LLM research, with a focus on taking complex AI ideas into real-world engineering systems.
His work and writing cover AI systems, large language models, AI agents, multimodal systems, autonomous systems, inference optimization, RAG, evaluation, and production AI engineering.