Glossary

Vertical Federated Learning

Vertical Federated Learning (VFL) is a privacy-preserving machine learning paradigm where multiple parties, each holding different feature sets about the same set of entities (e.g., customers), collaboratively train a model without directly sharing their raw data.

FEDERATED LEARNING

What is Vertical Federated Learning?

Vertical Federated Learning (VFL) is a collaborative machine learning paradigm where different organizations, each holding different feature sets about the same set of entities, jointly train a model without directly sharing their raw data.

Vertical Federated Learning (VFL) is a decentralized training paradigm for scenarios where data is partitioned by features (columns) across parties, not by samples (rows). In VFL, different entities—such as a bank and an e-commerce platform—hold distinct feature sets about the same customers. They collaborate to train a model, such as a joint credit risk predictor, by aligning their data via encrypted entity matching and then computing over encrypted or masked intermediate results, ensuring raw feature data never leaves its owner's silo.

The core technical challenge in VFL is computing over aligned samples without centralizing the data. Common architectures use a split neural network, where each party computes the initial layers on its local features. The intermediate outputs (often called smashed data or embeddings) are sent to a coordinator or another party, which combines them to compute the final layers and the loss. To limit what these exchanges reveal, training often relies on cryptographic techniques such as homomorphic encryption or secure multi-party computation when computing gradients. This makes VFL distinct from horizontal federated learning, where parties share the same feature space but hold different samples.

VERTICAL FEDERATED LEARNING

Key Characteristics of VFL

The architecture of VFL is defined by several core technical characteristics: how features are partitioned, how samples are aligned, how the model is split, and how the forward and backward passes are protected.

Feature Partitioning by Entity

In VFL, data is partitioned vertically, that is, by feature. Different parties (e.g., a bank and an e-commerce platform) hold different attributes (features) for the same set of users (entities or sample IDs).

Bank: Holds credit score, transaction history.
E-commerce: Holds purchase history, browsing behavior.

The goal is to train a model that utilizes this combined feature space without any party exposing its raw feature columns. This contrasts with Horizontal Federated Learning (HFL), where parties have the same feature set but different samples.
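The difference between the two partitioning schemes can be illustrated with a small sketch. The joint table, the column names, and the helper functions `vertical_slice` and `horizontal_slice` are hypothetical; in a real VFL deployment no single party ever holds the joint table.

```python
# Hypothetical joint table keyed by customer ID. It exists here only to
# illustrate the two partitioning schemes; in VFL no party holds it in full.
joint_table = {
    "c1": {"credit_score": 710, "txn_count": 42, "purchases": 5, "sessions": 31},
    "c2": {"credit_score": 640, "txn_count": 17, "purchases": 12, "sessions": 80},
    "c3": {"credit_score": 580, "txn_count": 63, "purchases": 1, "sessions": 4},
    "c4": {"credit_score": 690, "txn_count": 28, "purchases": 7, "sessions": 22},
}

BANK_COLUMNS = ("credit_score", "txn_count")
SHOP_COLUMNS = ("purchases", "sessions")


def vertical_slice(table, columns):
    """Keep every entity but only the given feature columns (VFL-style)."""
    return {eid: {c: row[c] for c in columns} for eid, row in table.items()}


def horizontal_slice(table, entity_ids):
    """Keep every feature column but only the given entities (HFL-style)."""
    return {eid: dict(table[eid]) for eid in entity_ids}


# Vertical partitioning: same customers, disjoint feature columns.
bank_view = vertical_slice(joint_table, BANK_COLUMNS)
shop_view = vertical_slice(joint_table, SHOP_COLUMNS)

# Horizontal partitioning: same feature columns, disjoint customers.
client_1_view = horizontal_slice(joint_table, ["c1", "c2"])
client_2_view = horizontal_slice(joint_table, ["c3", "c4"])
```

In the vertical case both views share the same keys (customer IDs) but no columns; in the horizontal case the views share all columns but no keys.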

Sample Alignment & Cryptography

A prerequisite for VFL is sample alignment, typically performed with Private Set Intersection (PSI), which securely identifies the common set of entities across parties without revealing non-overlapping samples.

Process: Parties use cryptographic protocols (e.g., based on Diffie-Hellman, oblivious transfer) to compute the intersection of their ID lists.
Output: Only the aligned, overlapping samples are used for training. The protocol ensures that no party learns which IDs another party holds outside the intersection.

This step is computationally intensive but critical for privacy and model validity, preventing training on misaligned data.
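A Diffie-Hellman-style intersection can be sketched in a few lines. This is a toy illustration under stated assumptions, not a secure implementation: the modulus `P`, the helper names, and the sample IDs are all hypothetical, both parties are simulated in one process, and a real deployment would use a vetted elliptic-curve group and an audited PSI library.

```python
import hashlib
import math
import random
import secrets

# Toy modulus (a Mersenne prime). Assumption for illustration only; it is
# far too small and unstructured to be a secure group choice.
P = 2**127 - 1


def hash_to_group(identifier):
    """Map an ID string to an integer in [1, P - 1]."""
    digest = hashlib.sha256(identifier.encode("utf-8")).digest()
    return 1 + int.from_bytes(digest, "big") % (P - 1)


def new_secret_exponent():
    """Pick a secret exponent coprime to P - 1 so that masking is a bijection."""
    while True:
        k = 2 + secrets.randbelow(P - 3)
        if math.gcd(k, P - 1) == 1:
            return k


def mask(values, exponent):
    return [pow(v, exponent, P) for v in values]


def dh_psi(ids_a, ids_b):
    """Return the IDs of party A that also appear in party B's list."""
    key_a, key_b = new_secret_exponent(), new_secret_exponent()

    # Round 1: each party masks its own hashed IDs with its own secret.
    masked_a = mask([hash_to_group(i) for i in ids_a], key_a)  # A -> B
    masked_b = mask([hash_to_group(i) for i in ids_b], key_b)  # B -> A
    random.shuffle(masked_b)  # B hides its ordering before sending

    # Round 2: each party applies its secret to the other's masked values.
    double_a = mask(masked_a, key_b)       # computed by B, order preserved
    double_b = set(mask(masked_b, key_a))  # computed by A

    # H(x)^(a*b) matches on both sides exactly when the underlying IDs match.
    return [i for i, d in zip(ids_a, double_a) if d in double_b]


bank_ids = ["u1", "u2", "u3", "u4"]
shop_ids = ["u3", "u9", "u1"]
common = dh_psi(bank_ids, shop_ids)
```

Because exponentiation commutes, both parties arrive at the same doubly masked value for a shared ID, while a singly masked value does not reveal the ID to a party that lacks the corresponding secret exponent.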

Split Neural Network Architecture

The model architecture is physically split across participants. A typical setup involves:

Bottom Models: Each party holds a local model (e.g., a few neural network layers) that processes its private features.
Interactive Layer: The outputs (embeddings or smashed data) from all bottom models are sent to a guest party or a neutral coordinator.
Top Model: The guest/coordinator aggregates these intermediate outputs and runs the remaining layers of the network to produce the final prediction.

During backward propagation, gradients flow back through the top model to each party's bottom model for local updates, without exposing raw features.

Asymmetric Roles: Host & Guest

VFL typically involves asymmetric participant roles, unlike HFL, where all clients perform the same local training task under a central server.

Guest Party: The party that holds the labels (Y) for the aligned samples and usually hosts the top model. It initiates the training task and computes the final loss.
Host Party(ies): Parties that hold only features (X) and host bottom models. They contribute feature representations but do not possess labels.

This role distinction fundamentally shapes the protocol's communication pattern, incentive structure, and privacy considerations.

Privacy-Preserving Forward Pass

The forward pass is designed to limit leakage of private features. The key design choice is whether intermediate results are transmitted in plaintext or in protected form.

Plaintext Embeddings: In basic setups, hosts send plaintext embeddings (smashed data) to the guest. This avoids sending raw features, but the embeddings can still reveal information about them.
Enhanced Privacy: For stronger guarantees, hosts protect their embeddings with Homomorphic Encryption (HE) or secret-share them under Secure Multi-Party Computation (SMPC). The guest can then compute on the protected values to continue the forward pass without seeing them in the clear.

With these protections in place, the guest cannot directly invert the embeddings to reconstruct host features.

Secure Gradient Exchange

The backward pass must also protect sensitive information. Gradients can leak information about the underlying training data.

Gradient Protection: Hosts receive gradients for their bottom models from the guest. To prevent label leakage from the guest to the hosts, these gradients may be obfuscated or computed using cryptographic techniques.
Aggregation without Disclosure: Protocols like Secure Aggregation can be extended to VFL, allowing the coordinator to compute the necessary aggregated gradient information without learning any party's individual contribution.

This secure exchange is crucial for maintaining the privacy guarantee for all parties throughout the training lifecycle.
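One concrete form of label leakage can be shown with a short sketch. It assumes a simplified setting in which a host's output feeds the logit directly (for example, a top model that only sums host outputs, as in vertical logistic regression) and the loss is binary cross-entropy; the function names and the sample values are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def logit_gradients(logits, labels):
    """Per-sample gradient of binary cross-entropy w.r.t. the logit."""
    return [sigmoid(z) - y for z, y in zip(logits, labels)]


def infer_labels_from_gradients(gradients):
    """A curious host's guess: sigmoid(z) lies strictly between 0 and 1,
    so the gradient sigmoid(z) - y is negative exactly when y == 1."""
    return [1 if g < 0 else 0 for g in gradients]


logits = [-1.2, 0.3, 2.0, -0.4, 0.9]
true_labels = [1, 0, 1, 1, 0]

gradients = logit_gradients(logits, true_labels)  # what the guest sends back
recovered = infer_labels_from_gradients(gradients)
```

In this setting the sign of each unprotected gradient reveals the corresponding label, which is why the gradients returned to hosts may need to be obfuscated or computed under a cryptographic protocol.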

FEDERATED LEARNING PARADIGMS

Vertical vs. Horizontal Federated Learning

A comparison of the two primary data partitioning schemes in federated learning, highlighting their architectural differences, use cases, and technical challenges.

Feature	Vertical Federated Learning (VFL)	Horizontal Federated Learning (HFL)
Data Partitioning Scheme	Features are partitioned across clients (same sample IDs, different features).	Samples are partitioned across clients (different sample IDs, same feature set).
Typical Use Case	Collaboration between organizations with complementary data on the same entities (e.g., a bank and an e-commerce site analyzing shared customers).	Training across many devices/users with similar data schemas (e.g., next-word prediction across millions of smartphones).
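Sample Alignment Requirement	Required (samples must be matched across parties before training).	Not required (each client trains on its own samples).
Privacy-Preserving Entity Resolution	Required (e.g., via Private Set Intersection) to find common samples without exposing IDs.	Not required.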
Model Architecture	Typically a split neural network. Clients hold bottom models for their features; a server holds the top model.	All clients train an identical, full model architecture locally.
Communication Overhead per Round	High (requires exchanging intermediate activations/gradients for each aligned sample).	Lower (exchanges only model parameters or gradients).
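Scalability to Massive Client Numbers	Limited; practically suited to cross-silo settings with a small number of organizations.	High; cross-device deployments can involve millions of devices.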
Primary Privacy Risk	Potential leakage from intermediate activations (smashed data).	Potential leakage from shared model gradients/updates.
Common Aggregation Method	Gradient/activation aggregation from split layers.	Parameter averaging (e.g., Federated Averaging).
Handling of Non-IID Data	Inherently addresses feature-wise heterogeneity.	Challenged by sample-wise heterogeneity; requires algorithms like FedProx.

VERTICAL DATA PARTITIONING

Common Use Cases for VFL

Vertical Federated Learning (VFL) enables collaborative model training across organizations that hold different attributes (features) about the same entities. Its primary applications are in industries where data is highly sensitive, siloed, and complementary.

Financial Credit Scoring

A bank and an e-commerce platform collaborate to build a more accurate credit risk model. The bank holds core financial features (account history, loan repayments), while the e-commerce platform holds complementary behavioral data (purchase history, browsing patterns). VFL allows the joint model to leverage both feature sets to predict default risk without either party exposing their raw customer data.

Key Features: Combines financial stability signals with real-time spending behavior.
Privacy Benefit: Sensitive financial records never leave the bank's silo.
Regulatory Alignment: Facilitates compliance with data protection laws (e.g., GDPR, CCPA).

Healthcare & Precision Medicine

A hospital's electronic health records (EHRs) contain clinical data (lab results, diagnoses), while a genomics research institute holds genetic data for the same patient cohort. VFL enables the training of a predictive model for disease progression or drug response that integrates clinical and genomic features, which is impossible with data isolated in separate institutions.

Key Features: Unifies phenotypic (clinical) and genotypic data.
Privacy Imperative: Protects Protected Health Information (PHI) under HIPAA and similar regulations.
Research Impact: Accelerates biomedical discovery without centralizing sensitive patient data.

Cross-Platform Recommendation Systems

A video streaming service and a music streaming service want to improve content recommendations. They share a common user base but possess different feature sets: one has viewing history, the other has listening history. Using VFL, they can train a model that learns unified user preferences from multimodal engagement signals, leading to better cross-service recommendations without sharing watch or listen logs.

Key Features: Learns from multimodal user behavior (video, audio).
Business Benefit: Enhances user engagement and retention for both platforms.
Data Sovereignty: Each company retains full control over its proprietary interaction data.

Smart City & IoT Analytics

Different municipal departments or utility companies hold vertical slices of city data. A transportation agency has traffic flow data, while the energy utility has smart meter readings from the same city blocks. VFL can train a model to optimize traffic light timing for reduced congestion and lower localized energy consumption, using features from both domains without creating a centralized data lake.

Key Features: Correlates disparate urban sensor data (traffic, energy, environment).
Operational Efficiency: Enables holistic urban planning without transferring raw data between agencies.
Scalability: Model can incorporate new data sources (e.g., air quality sensors) as additional vertical parties.

Manufacturing Supply Chain Optimization

An original equipment manufacturer (OEM), a parts supplier, and a logistics provider each hold different features about the same production process. The OEM has assembly line quality data, the supplier has component material specs, and the logistics firm has shipping condition data. VFL enables predictive maintenance or yield optimization models that account for the entire supply chain's variables, improving resilience and reducing downtime.

Key Features: Integrates design, manufacturing, and logistics features.
Competitive Advantage: Partners collaborate on a shared objective while protecting proprietary process data.
Outcome: Reduces defects and predicts bottlenecks using a complete, but privacy-preserving, feature set.

Fraud Detection in Banking Consortia

Multiple banks within a consortium wish to build a more robust fraud detection model. While they cannot share transaction details or customer identifiers, they may have overlapping customers. Using VFL with entity alignment techniques, they can train a model where each bank contributes its unique feature perspective (e.g., different spending channels, geographic patterns) on shared entities, creating a defense system that understands fraud patterns across the entire banking ecosystem.

Key Features: Aggregates threat intelligence across institutions.
Security Critical: Models evolve faster than fraudsters can adapt to a single bank's patterns.
Regulatory Model: Operates within strict financial data privacy regulations (e.g., GLBA).

VERTICAL FEDERATED LEARNING

Frequently Asked Questions

Vertical Federated Learning (VFL) enables collaborative model training between organizations that hold different data features about the same entities, such as customers or patients, without sharing the raw underlying data. This FAQ addresses its core mechanisms, differences from horizontal FL, and its critical role in privacy-preserving, cross-organizational AI.

Vertical Federated Learning (VFL) is a collaborative machine learning paradigm where two or more parties, each holding a different set of features for the same set of entities (e.g., customers, patients), jointly train a model without directly exchanging their raw feature data. It works by aligning entities via privacy-preserving entity resolution (e.g., using cryptographic hashes) and then training a model where the computation is split: each party computes on its local features, and only necessary intermediate results, such as embeddings or gradients, are exchanged under encryption to complete forward and backward passes. A common architecture uses a split neural network, where the bottom layers reside with each data party and the top layers are on a coordinating server or a designated party.

VFL ECOSYSTEM

Related Terms

Vertical Federated Learning (VFL) operates within a broader ecosystem of privacy-preserving and distributed machine learning techniques. These related concepts define the protocols, security measures, and architectural patterns that make VFL feasible and robust.

Horizontal Federated Learning (HFL)

The more common counterpart to VFL, where participating parties hold different data samples (e.g., different users) but with the same feature space. For example, two hospitals train a model on their separate patient records, which all contain the same clinical features. The core challenge is data distribution heterogeneity across clients.

Key Difference: HFL partitions data by rows (samples); VFL partitions by columns (features).
Typical Use: Cross-device learning on smartphones or IoT sensors.
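For contrast with VFL's split training, the parameter averaging commonly used in HFL can be sketched as follows. The function name `federated_average`, the weight vectors, and the sample counts are hypothetical; the sketch shows only the server-side aggregation step of a Federated Averaging round, weighted by each client's number of samples.

```python
def federated_average(client_weights, client_sizes):
    """Average parameter vectors, weighting each client by its sample count."""
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]


# Two hospitals trained the same model architecture on different patients.
hospital_a_weights = [0.2, -1.0, 0.5]  # trained on 300 samples
hospital_b_weights = [0.6, 0.0, 0.1]   # trained on 100 samples

global_weights = federated_average(
    [hospital_a_weights, hospital_b_weights], [300, 100]
)
```

This works only because every client holds the full model over the same feature space, which is exactly what VFL parties lack.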

Split Learning

A distributed technique where a neural network is split layer-wise between parties. The client holding the raw data computes the initial layers and sends the intermediate outputs (smashed data or activations) to another party (e.g., a server), which completes the forward pass and starts the backward pass. This allows collaborative training without sharing raw input data.

Relation to VFL: Often used as the underlying training protocol for VFL, especially when one party holds the labels. In this setup the model itself is partitioned across parties rather than replicated on each one.
Privacy Risk: Smashed data can potentially leak information, requiring privacy safeguards.

Secure Multi-Party Computation (SMPC)

A cryptographic framework that enables multiple parties to jointly compute a function (like model training or inference) over their private inputs while revealing only the final output. No party learns anything about the others' raw data beyond what is implied by the result.

Role in VFL: Used to securely compute aggregated statistics, gradients, or loss calculations during the collaborative training process. For example, securely computing the sum of gradients from different feature holders.
Common Protocols: Garbled Circuits, Secret Sharing.
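The secure-sum example can be sketched with additive secret sharing. This is a simplified illustration that assumes honest-but-curious parties and simulates them all in one process; `MODULUS`, `SCALE`, and the function names are hypothetical choices, and real gradients are encoded as fixed-point integers for the purpose of the sketch.

```python
import secrets

MODULUS = 2**61 - 1  # shares live in the integers modulo this value
SCALE = 10**6        # fixed-point scale for encoding real-valued gradients


def encode(x):
    return round(x * SCALE) % MODULUS


def decode(v):
    if v > MODULUS // 2:  # upper half of the range represents negatives
        v -= MODULUS
    return v / SCALE


def share(value, n_parties):
    """Split value into n random-looking shares that sum to it mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def secure_sum(private_values):
    n = len(private_values)
    # Party i splits its value into n shares and sends share j to party j.
    all_shares = [share(encode(v), n) for v in private_values]
    # Party j adds the shares it received and publishes only that partial sum.
    partial_sums = [
        sum(all_shares[i][j] for i in range(n)) % MODULUS for j in range(n)
    ]
    return decode(sum(partial_sums) % MODULUS)


party_gradients = [0.25, -1.5, 0.75]  # one private value per party
total_gradient = secure_sum(party_gradients)
```

Any single share or partial sum is uniformly distributed on its own, so it reveals nothing about an individual party's value; only the final sum is reconstructed.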

Entity Alignment

The prerequisite step in VFL where parties must securely identify overlapping samples (entities) across their disjoint feature sets without exposing their entire datasets. This is necessary because VFL assumes collaboration on the same set of entities (e.g., the same customers).

Core Challenge: Performing alignment with privacy guarantees, often using techniques like Private Set Intersection (PSI).
Without Alignment: Parties would be training on misaligned data, rendering the model useless. This step is unique to the vertical data partition setting.
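Once the common IDs are known, each party reorders its own rows locally so that row k refers to the same entity everywhere. The sketch below assumes the intersection has already been computed; a plain set intersection stands in for the PSI output, and the helper `align_rows` and all data values are hypothetical.

```python
def align_rows(ids, rows, common_ids):
    """Return this party's rows reordered to follow common_ids."""
    index_of = {entity_id: k for k, entity_id in enumerate(ids)}
    return [rows[index_of[entity_id]] for entity_id in common_ids]


bank_ids = ["c3", "c1", "c7", "c2"]
bank_rows = [[710, 12], [640, 3], [580, 25], [690, 8]]

shop_ids = ["c2", "c9", "c3", "c1"]
shop_rows = [[5, 0.2], [1, 0.9], [14, 0.4], [7, 0.1]]

# Stand-in for the PSI result; in practice neither party sees the other's
# full ID list. Sorting gives both parties the same canonical order.
common_ids = sorted(set(bank_ids) & set(shop_ids))

bank_aligned = align_rows(bank_ids, bank_rows, common_ids)
shop_aligned = align_rows(shop_ids, shop_rows, common_ids)
```

After this step, `bank_aligned[k]` and `shop_aligned[k]` describe the same customer, and the non-overlapping entities (`c7`, `c9`) are excluded from training.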

Homomorphic Encryption (HE)

A form of encryption that allows computation on ciphertexts. An operation performed on encrypted data produces an encrypted result which, when decrypted, matches the result of the operation as if it had been performed on the plaintext.

Role in VFL: Enables a party (e.g., the label holder) to perform computations on encrypted intermediate results from other feature holders. This allows forward/backward propagation without decrypting sensitive features.
Trade-off: Provides strong cryptographic security but introduces significant computational overhead.
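The homomorphic property can be demonstrated with a toy additively homomorphic scheme in the style of Paillier. This is a sketch for illustration only: the primes are fixed, publicly known, and far too small for real security, the function names are hypothetical, and the embedding values are assumed to be already encoded as small non-negative integers. The code requires Python 3.8+ for the modular inverse via `pow`.

```python
import math
import secrets


def keygen(p=2**31 - 1, q=2**61 - 1):
    """Toy key generation from two fixed Mersenne primes (not secure)."""
    n = p * q
    phi = (p - 1) * (q - 1)
    mu = pow(phi, -1, n)  # modular inverse of phi modulo n
    return n, (phi, mu)   # public key, private key


def encrypt(n, message):
    n_sq = n * n
    while True:
        r = 1 + secrets.randbelow(n - 1)
        if math.gcd(r, n) == 1:
            break
    # With generator g = n + 1, g^m mod n^2 equals 1 + m*n.
    return (1 + message * n) * pow(r, n, n_sq) % n_sq


def decrypt(n, private_key, ciphertext):
    phi, mu = private_key
    n_sq = n * n
    return (pow(ciphertext, phi, n_sq) - 1) // n * mu % n


def add_encrypted(n, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts."""
    return c1 * c2 % (n * n)


def scale_encrypted(n, c, k):
    """Raising a ciphertext to the power k multiplies the plaintext by k."""
    return pow(c, k, n * n)


public_n, private_key = keygen()

# A feature holder encrypts two integer-encoded embedding values.
enc_a = encrypt(public_n, 7)
enc_b = encrypt(public_n, 35)

# Another party computes 3*a + b on ciphertexts without decrypting them.
enc_result = add_encrypted(public_n, scale_encrypted(public_n, enc_a, 3), enc_b)

# Only the private-key holder can read the result.
result = decrypt(public_n, private_key, enc_result)
```

Additive schemes of this kind support sums and multiplication by known constants, which covers linear layers; the modular exponentiations on large integers are the source of the computational overhead noted above.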

Cross-Silo Federated Learning

A federated learning setting characterized by a small number of reliable, organizational participants (e.g., 2-100 banks or hospitals). Participants have significant computational resources and stable connectivity. This is the primary deployment scenario for VFL.

Contrast with Cross-Device FL: Cross-device involves millions of unstable edge devices (phones). VFL's need for entity alignment and complex cryptographic protocols makes it practically suited only for the cross-silo context.
Trust Model: Parties are often known but mutually distrustful regarding their proprietary data assets.

About the author

Prasad Kumkar

CEO & MD, Inference Systems

Prasad Kumkar is the CEO & MD of Inference Systems and writes about AI systems architecture, LLM infrastructure, model serving, evaluation, and production deployment. Over 5+ years, he has worked across computer vision models, L5 autonomous vehicle systems, and LLM research, with a focus on taking complex AI ideas into real-world engineering systems.

His work and writing cover AI systems, large language models, AI agents, multimodal systems, autonomous systems, inference optimization, RAG, evaluation, and production AI engineering.
