Glossary

Pod Disruption Budget (PDB)

A Pod Disruption Budget (PDB) is a Kubernetes API object that limits the number of concurrent voluntary disruptions to pods in an application, ensuring high availability during cluster maintenance operations like node drains or updates.

KUBERNETES POLICY

What is Pod Disruption Budget (PDB)?

A Pod Disruption Budget (PDB) is a Kubernetes API object that specifies the minimum number or percentage of pods for an application that must remain available during voluntary disruptions, ensuring high availability for stateful and critical workloads.

A Pod Disruption Budget (PDB) is a declarative Kubernetes policy that constrains the number of concurrent voluntary disruptions to pods belonging to a single application. It sets a threshold with either the minAvailable or the maxUnavailable field, which the Eviction API enforces during operations such as draining a node for maintenance, scaling down a node pool with the cluster autoscaler, or manually evicting a pod. A PDB does not prevent involuntary disruptions such as hardware failure, although pods lost that way still count against the budget.

By enforcing availability guarantees, PDBs help maintain Service Level Objectives (SLOs) for stateful services like databases or agent backends. When an eviction is requested, the Kubernetes API server checks the PDB that matches the pod; if the eviction would violate the budget, the request is rejected and the caller can retry later. This allows SREs and DevOps engineers to schedule maintenance safely and reduces the risk of a planned operation taking down too many replicas at once.

KUBERNETES AVAILABILITY

Key Features of Pod Disruption Budgets

A Pod Disruption Budget (PDB) is a Kubernetes API object that specifies the minimum number or percentage of pods that must remain available during voluntary disruptions, acting as a safeguard for application availability.

Voluntary vs. Involuntary Disruptions

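A PDB only governs voluntary disruptions that go through the Eviction API. Voluntary disruptions are initiated by cluster administrators or automated processes, and include:

Draining a node for maintenance or upgrade.
Deleting a pod managed by a Deployment.
Updating a pod template that triggers a rolling update.

Of these, only the node drain is limited by a PDB. Deleting a pod or its controller directly bypasses the budget, and a rolling update is paced by the workload's own update strategy rather than by the PDB.

Involuntary disruptions, like a node hardware failure or a pod eviction due to resource exhaustion, are not blocked by a PDB, but the pods they take down still count against the budget. The budget's role is to ensure planned maintenance does not violate your application's availability requirements.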

minAvailable and maxUnavailable

You define a PDB using one of two mutually exclusive fields:

minAvailable: Specifies the absolute number or percentage of pods that must remain ready. For a 10-pod deployment, minAvailable: 80% (or 8) means an eviction is refused if it would leave fewer than 8 pods ready.
maxUnavailable: Specifies the absolute number or percentage of pods that can be unavailable. For the same deployment, maxUnavailable: 20% (or 2) means evictions can take down up to 2 pods at a time. Percentages that do not map to a whole number of pods are rounded up, so with 7 pods minAvailable: 50% requires 4 ready pods.

These parameters create a budget of allowed concurrent disruptions, pacing the eviction process.
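The budget arithmetic can be sketched in a few lines of Python. This is a simplified model and not the controller's actual code: the function names `resolve` and `allowed_disruptions` are hypothetical, and the sketch assumes the expected pod count is already known.

```python
def resolve(value, expected):
    """Resolve an int, or a percentage string such as "20%", to a pod count.

    Percentages are rounded up, as described for PDB thresholds.
    """
    if isinstance(value, str):
        percent = int(value.rstrip("%"))
        return -(-percent * expected // 100)  # integer ceiling division
    return value


def allowed_disruptions(expected, healthy, min_available=None, max_unavailable=None):
    """Return how many more pods could be evicted right now (never negative)."""
    if (min_available is None) == (max_unavailable is None):
        raise ValueError("set exactly one of min_available / max_unavailable")
    if min_available is not None:
        desired_healthy = resolve(min_available, expected)
    else:
        desired_healthy = max(0, expected - resolve(max_unavailable, expected))
    return max(0, healthy - desired_healthy)


print(allowed_disruptions(10, 10, min_available="80%"))  # 2
print(allowed_disruptions(10, 10, max_unavailable=2))    # 2
print(allowed_disruptions(7, 7, min_available="50%"))    # 3
print(allowed_disruptions(10, 7, min_available=8))       # 0
```

The last call shows the budget already exhausted: with only 7 of 10 pods ready and a floor of 8, no further eviction is allowed.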

Selector-Based Scope

A PDB does not reference a Deployment or StatefulSet directly. Instead, it uses a label selector (spec.selector) to identify the pods it protects. This creates a loose coupling.

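Example: A PDB whose selector.matchLabels contains app: api-server will apply to all pods with that label, whether they are managed by a Deployment, ReplicaSet, or StatefulSet. This is powerful for protecting logical application groups but requires careful label management to ensure the PDB matches the intended pod set. Avoid overlapping selectors: if more than one PDB matches the same pod, the Eviction API treats it as a misconfiguration and refuses the eviction.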
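As an illustration, the sketch below builds a policy/v1 PodDisruptionBudget manifest as a Python dict and mimics matchLabels matching. The helper names `build_pdb` and `selects`, and the object name api-server-pdb, are assumptions made for this example; the real matching is done by Kubernetes, not by this code.

```python
import json


def build_pdb(name, match_labels, max_unavailable):
    """Return a policy/v1 PodDisruptionBudget manifest as a plain dict."""
    return {
        "apiVersion": "policy/v1",
        "kind": "PodDisruptionBudget",
        "metadata": {"name": name},
        "spec": {
            "maxUnavailable": max_unavailable,
            "selector": {"matchLabels": match_labels},
        },
    }


def selects(match_labels, pod_labels):
    """matchLabels semantics: every listed key/value must be present on the pod."""
    return all(pod_labels.get(key) == value for key, value in match_labels.items())


# Hypothetical PDB name; the label comes from the example in the text.
pdb = build_pdb("api-server-pdb", {"app": "api-server"}, 1)
print(json.dumps(pdb, indent=2))

labels = pdb["spec"]["selector"]["matchLabels"]
print(selects(labels, {"app": "api-server", "tier": "web"}))  # True
print(selects(labels, {"app": "worker"}))                     # False
```

The JSON output is the same object you would normally write as a YAML manifest, so it can be saved to a file and applied with kubectl apply -f.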

Interaction with Disruption Processes

When a cluster administrator runs kubectl drain <node>, the process interacts with PDBs:

The drain command cordons the node and identifies all pods on it.
For each pod, it requests an eviction, and the API server checks whether that eviction would violate a matching PDB.
Pods whose eviction is allowed are terminated. Evictions that would violate the PDB are rejected and retried, so the node stays cordoned but not fully cleared until replacement pods become ready elsewhere or the drain times out.

The Kubernetes Eviction API enforces PDBs, making this behavior consistent across all tools that evict pods through it.
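The retry behavior can be pictured with a small simulation. It is a toy model under stated assumptions: the function `drain` is hypothetical, every pod on the node belongs to the same PDB, and every evicted pod is recreated on another node and becomes ready before the next retry round.

```python
def drain(pods_on_node, healthy, desired_healthy, max_rounds=10):
    """Simulate draining a node whose pods are all covered by one PDB.

    healthy: ready pods for the application across the whole cluster.
    desired_healthy: the minimum number of ready pods the budget requires.
    Returns a log of (round, pod, http_status) tuples.
    """
    log = []
    pending = list(pods_on_node)
    for round_no in range(1, max_rounds + 1):
        if not pending:
            break
        still_pending = []
        evicted = 0
        for pod in pending:
            if healthy - 1 >= desired_healthy:
                healthy -= 1
                evicted += 1
                log.append((round_no, pod, 200))  # eviction accepted
            else:
                still_pending.append(pod)
                log.append((round_no, pod, 429))  # rejected, retry later
        # Assumption: replacements for evicted pods are ready before the next round.
        healthy += evicted
        pending = still_pending
    return log


for entry in drain(["api-0", "api-1", "api-2"], healthy=4, desired_healthy=3):
    print(entry)
```

With four ready pods and a floor of three, only one pod can leave per round, so the three pods on the node are evicted over three rounds.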

Health & Readiness Probe Dependency

A PDB's calculations are based on pods in the Ready state (as determined by their readiness probe). A pod that is running but failing its readiness probe is considered unavailable for the budget's purposes.

Critical Implication: If many pods are in a crash-loop or failing probes, they may already be counted as "unavailable." A voluntary disruption could then be blocked because the minAvailable threshold is already breached, even though no new pods are being deleted. Effective PDBs require stable pod health.
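A rough sketch of how readiness feeds the budget: the code below counts pods whose Ready condition is "True" and compares the result with a minAvailable floor. The pod dicts are trimmed, hypothetical stand-ins for real pod objects, and `is_ready` and `count_healthy` are illustrative helper names.

```python
def is_ready(pod):
    """A pod counts as healthy when its Ready condition has status "True"."""
    conditions = pod.get("status", {}).get("conditions", [])
    return any(
        c.get("type") == "Ready" and c.get("status") == "True" for c in conditions
    )


def count_healthy(pods):
    return sum(1 for pod in pods if is_ready(pod))


# Hypothetical pods: db-2 is running but failing its readiness probe.
pods = [
    {"metadata": {"name": "db-0"}, "status": {"conditions": [{"type": "Ready", "status": "True"}]}},
    {"metadata": {"name": "db-1"}, "status": {"conditions": [{"type": "Ready", "status": "True"}]}},
    {"metadata": {"name": "db-2"}, "status": {"conditions": [{"type": "Ready", "status": "False"}]}},
]

min_available = 2
healthy = count_healthy(pods)
print(healthy)                          # 2
print(max(0, healthy - min_available))  # 0 evictions allowed
```

Here the failing pod has already used up the budget, so a drain would be blocked even though nothing has been evicted yet.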

Use with Stateful Workloads

PDBs are crucial for stateful applications like databases (managed by StatefulSets). Here, maxUnavailable: 1 is a common pattern to ensure only one replica is down at a time, preserving quorum and preventing data loss.

Example: A 3-node etcd cluster requires a quorum of 2 nodes. A PDB with maxUnavailable: 1 ensures voluntary evictions never take down more than one pod at a time, maintaining cluster availability and consensus. An involuntary failure during maintenance can still cost a second member, so the PDB is a safety rail for automated operations on critical stateful services rather than a complete guarantee.
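The quorum reasoning behind that choice is simple majority arithmetic, sketched below. The function names are hypothetical, and the check only covers voluntary evictions.

```python
def quorum(members):
    """Smallest majority of a cluster with the given number of members."""
    return members // 2 + 1


def tolerated_failures(members):
    """How many members can be down while a majority remains."""
    return members - quorum(members)


def budget_preserves_quorum(members, max_unavailable):
    """True if evicting max_unavailable pods at once still leaves a majority."""
    return max_unavailable <= tolerated_failures(members)


print(quorum(3), tolerated_failures(3))  # 2 1
print(budget_preserves_quorum(3, 1))     # True
print(budget_preserves_quorum(3, 2))     # False
print(budget_preserves_quorum(5, 2))     # True
```

For a 3-member cluster, maxUnavailable: 1 is the largest value that keeps a majority. A 5-member cluster could tolerate 2, although 1 leaves headroom for an unplanned failure.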

KUBERNETES AVAILABILITY MECHANISMS

PDB vs. Other Availability Controls

A comparison of the Pod Disruption Budget (PDB) with other Kubernetes controllers and patterns that influence application availability, highlighting their distinct purposes and operational scopes.

Feature / Mechanism	Pod Disruption Budget (PDB)	Horizontal Pod Autoscaler (HPA)	Liveness/Readiness Probes	Service Mesh (e.g., Istio)
Primary Purpose	Limit voluntary disruptions during maintenance	Scale pod count based on resource demand	Determine container health & readiness for traffic	Manage & secure service-to-service communication
Triggering Event	Eviction API request (e.g., kubectl drain, cluster autoscaler scale-down)	Metric threshold (e.g., CPU > 70%)	Container process state or HTTP response	Network traffic patterns & policy configuration
Operational Scope	Pod availability within an application (Deployment/StatefulSet)	Resource utilization of a Deployment/ReplicaSet	Individual container health within a pod	Network layer between services across the cluster
Key Configuration Parameter	minAvailable or maxUnavailable (count or percentage)	Target metric value and min/max replica bounds	Probe type (HTTP, TCP, Exec), delay, timeout, period	Traffic routing rules, retry policies, fault injection
Protects Against	Voluntary disruptions from cluster operations	Resource starvation due to increased load	Application hangs or deadlocks	Network failures, latency spikes, partial outages
Granularity of Control	Application-level (selector-based)	Deployment-level	Container-level	Service-level & network path-level
Typical Use Case	Ensuring >= 2 pods of a payment service remain during a node upgrade	Adding replicas when request latency increases	Restarting a container if its liveness endpoint fails 3 times	Shifting 10% of traffic to a canary version and retrying failed requests

POD DISRUPTION BUDGET

Frequently Asked Questions

A Pod Disruption Budget (PDB) is a critical Kubernetes policy for ensuring application availability during voluntary cluster maintenance. These questions address its core mechanics, use cases, and operational best practices.

A Pod Disruption Budget (PDB) is a Kubernetes API object that specifies the minimum number or percentage of pods for a given application that must remain available during voluntary disruptions. It works by constraining clients of the Eviction API (such as the node drain process), preventing them from evicting pods if doing so would violate the defined availability threshold.

When a user initiates a voluntary action, such as draining a node for maintenance, a kernel upgrade, or a node pool scale-down, each pod eviction is checked against the relevant PDB. If evicting a pod would reduce the number of available replicas below minAvailable, or push the number of unavailable replicas above maxUnavailable, the Eviction API rejects the request with HTTP 429 (Too Many Requests). The drain command then retries until replacement pods are ready elsewhere and the eviction no longer breaches the PDB, so availability is maintained.
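For illustration, an eviction is requested by POSTing an Eviction object to the pod's eviction subresource. The sketch below only builds the request and interprets the documented response codes; it does not call a cluster. The helper names `eviction_request` and `interpret`, and the pod name used in the example, are assumptions for this example.

```python
def eviction_request(namespace, pod_name):
    """Build the path and body for a policy/v1 Eviction of one pod."""
    path = f"/api/v1/namespaces/{namespace}/pods/{pod_name}/eviction"
    body = {
        "apiVersion": "policy/v1",
        "kind": "Eviction",
        "metadata": {"name": pod_name, "namespace": namespace},
    }
    return path, body


def interpret(status_code):
    """Map the Eviction API status codes to what a caller should do next."""
    if status_code == 200:
        return "allowed: the pod is being terminated"
    if status_code == 429:
        return "blocked by the budget: wait and retry"
    if status_code == 500:
        return "misconfiguration, e.g. several PDBs match the pod"
    return "unexpected response"


# Hypothetical pod name, used only to show the shape of the request.
path, body = eviction_request("default", "api-server-7d9f")
print(path)
print(body)
print(interpret(429))
```

A client that receives 429 is expected to back off and try again, which is what the drain process does.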

KUBERNETES DEPLOYMENT

Related Terms

A Pod Disruption Budget (PDB) is a critical component of a robust deployment strategy. It works in concert with other Kubernetes controllers and policies to ensure application availability during voluntary cluster operations.

ReplicaSet

A core Kubernetes controller that ensures a specified number of identical pod replicas are running at any given time. It is the fundamental mechanism for maintaining desired scale and self-healing. A PDB works by placing constraints on the voluntary disruption of pods managed by a ReplicaSet or Deployment.

Key Function: Continuously monitors pod count and creates/deletes pods to match the replicas field.
Relation to PDB: The PDB's minAvailable or maxUnavailable settings are enforced against the pods this controller manages.

Horizontal Pod Autoscaler (HPA)

A Kubernetes controller that automatically scales the number of pods in a Deployment, ReplicaSet, or StatefulSet based on observed metrics like CPU utilization or custom application metrics. It dynamically adjusts the replicas count on the target workload.

Key Function: Enables applications to handle variable load by adding or removing pod instances.
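Interaction with PDB: A scale-down triggered by the HPA is not an eviction. The workload's controller deletes the surplus pods directly, so the PDB does not block it. The interaction is indirect: percentage thresholds and maxUnavailable are computed from the workload's desired replica count, so the number of disruptions a PDB allows changes as the HPA scales the application up or down.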

Node Drain

A kubectl command (kubectl drain <node-name>) that safely evicts all pods from a node in preparation for maintenance (e.g., kernel upgrade, node termination). It is the primary voluntary disruption that a PDB is designed to govern.

Process: The drain command cordons the node (marks it unschedulable) and begins evicting pods. It respects Pod Disruption Budgets, waiting if eviction would violate the policy.
PDB Enforcement: If evicting a pod would cause the number of available pods to fall below minAvailable, or the number of unavailable pods to exceed maxUnavailable, the Eviction API rejects the request with HTTP 429 (Too Many Requests), and the drain command waits and retries until the eviction succeeds or the drain times out.

Voluntary vs. Involuntary Disruptions

A critical distinction in Kubernetes availability management that defines the scope of a PDB's protection.

Voluntary Disruptions: Initiated by the cluster operator or automated controllers. Examples include:
- Draining a node for maintenance.
- Deleting a pod managed by a Deployment.
- A rolling update.
PDBs apply only to voluntary disruptions performed through the Eviction API, such as a node drain. Direct pod deletion and rolling updates bypass them.
Involuntary Disruptions: Caused by unavoidable failures outside operator control. Examples include:
- Hardware failure of the underlying physical machine.
- Kernel panic.
- Cloud provider preempting a virtual machine.
PDBs do NOT protect against these, although pods lost this way still count against the budget. For this, you rely on ReplicaSets and sufficient replica counts.

Deployment

A higher-level Kubernetes abstraction that manages ReplicaSets and provides declarative updates for pods. It is the most common controller used with stateless applications.

Key Function: Describes the desired state for pods and manages rolling updates and rollbacks.
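PDB Association: A PDB is typically applied to the pods created by a Deployment. During a rolling update, the Deployment controller coordinates pod creation and deletion according to its own update strategy (the maxUnavailable and maxSurge settings under strategy.rollingUpdate); these deletions are not evictions and are not limited by the PDB. The PDB protects the same pods when a node drain or another Eviction API client tries to remove them, so the two settings should be chosen together to maintain application availability.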

Readiness Probe

A periodic test performed by the kubelet on a container to determine if it is ready to accept network traffic. A pod is considered "ready" when all its containers' readiness probes pass.

Key Function: Signals when a pod is fully initialized and operational, allowing it to be added to Service load balancers.
Critical for PDBs: The PDB's calculation of "available" pods is based on pods with a status of Ready. If a pod's readiness probe fails, it is subtracted from the available count. Therefore, correctly configured readiness probes are essential for the PDB to accurately assess the health of the application and make correct eviction decisions.

About the author

Prasad Kumkar

CEO & MD, Inference Systems

Prasad Kumkar is the CEO & MD of Inference Systems and writes about AI systems architecture, LLM infrastructure, model serving, evaluation, and production deployment. Over 5+ years, he has worked across computer vision models, L5 autonomous vehicle systems, and LLM research, with a focus on taking complex AI ideas into real-world engineering systems.

His work and writing cover AI systems, large language models, AI agents, multimodal systems, autonomous systems, inference optimization, RAG, evaluation, and production AI engineering.
