Rolling Update

A rolling update is a deployment strategy that incrementally replaces instances of an old application version with new ones, keeping the service available throughout the update when health checks and rollout limits are configured correctly.

DEPLOYMENT STRATEGY

What is Rolling Update?

A rolling update is a zero-downtime deployment strategy that incrementally replaces instances of an old application version with new ones.

A rolling update is a deployment strategy that incrementally replaces instances of an old application version with new ones, with the goal of zero downtime and continuous service availability. It is a core feature of modern orchestrators like Kubernetes, which starts new pods with the updated container image while terminating old ones, keeping the number of running pods close to the desired replica count. This method provides a controlled, automated rollout and is a fundamental practice within Agent Deployment Observability for safely updating autonomous agents in production.

The strategy's key mechanism is its controlled, stepwise progression. The orchestrator updates pods in small batches and waits for new instances to pass their readiness probes before proceeding. This allows continuous health monitoring: if new pods fail, the rollout stalls while most old pods keep serving, and an operator or pipeline can roll back to the previous version. For agentic systems, rolling updates are essential for deploying new reasoning loops or tool integrations without disrupting the deterministic execution required by enterprise clients, making them a critical component of resilient DevOps and SRE workflows.
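The stepwise progression can be illustrated with a small simulation. This is a simplified sketch rather than the Kubernetes controller's actual algorithm: it assumes every new pod passes its readiness probe immediately, and the names `Step` and `simulate_rollout` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Step:
    old: int  # pods still running the old version
    new: int  # ready pods running the new version


def simulate_rollout(replicas: int, max_surge: int, max_unavailable: int) -> List[Step]:
    """Trace a rolling update in which every new pod passes readiness at once."""
    if max_surge <= 0 and max_unavailable <= 0:
        raise ValueError("max_surge and max_unavailable cannot both be zero")
    ceiling = replicas + max_surge       # most pods allowed to exist at once
    floor = replicas - max_unavailable   # fewest pods that must stay available
    old, new = replicas, 0
    trace = [Step(old, new)]
    while old > 0 or new < replicas:
        # Surge phase: start as many new pods as the ceiling permits.
        started = min(replicas - new, ceiling - (old + new))
        if started > 0:
            new += started
            trace.append(Step(old, new))
        # Drain phase: stop old pods without dropping below the floor.
        stopped = min(old, (old + new) - floor)
        if stopped > 0:
            old -= stopped
            trace.append(Step(old, new))
    return trace


for step in simulate_rollout(replicas=4, max_surge=1, max_unavailable=0):
    print(f"old={step.old} new={step.new} total={step.old + step.new}")
```

With one surge pod and no unavailability allowed, the total in this model alternates between the desired count and one above it until no old pods remain.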

DEPLOYMENT STRATEGY

Key Features of Rolling Updates

A rolling update is a zero-downtime deployment strategy that incrementally replaces old application instances with new ones. Its core features ensure controlled, safe rollouts in production environments.

Zero-Downtime Deployment

The primary objective of a rolling update is to maintain service availability throughout the deployment process. This is achieved by:

Sequential Pod Replacement: New pods are created and must pass their readiness probes before they count as available; old pods are terminated only as far as the maxUnavailable setting allows.
Traffic Shifting: The orchestrator (e.g., Kubernetes) routes traffic only to ready pods, so requests move from old to new pods as the rollout advances.
Continuous Service: End users should see no interruption, provided enough replicas stay available and old pods finish in-flight requests before they exit.

Controlled Rollout Pace

Rolling updates provide fine-grained control over the speed and risk of the deployment through configurable parameters:

maxUnavailable: Defines the maximum number or percentage of pods that can be unavailable during the update (default 25%; percentages are rounded down). This bounds the loss of capacity.
maxSurge: Defines the maximum number or percentage of pods that can be created above the desired number (default 25%; percentages are rounded up). This allows faster rollouts by temporarily over-provisioning.
minReadySeconds: Requires a new pod to stay ready for a minimum period before it is considered available, catching failures that appear shortly after startup.
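To make the effect of these parameters concrete, the sketch below converts percentage settings into absolute pod counts, rounding maxSurge up and maxUnavailable down as Kubernetes does. The helper names `resolve` and `rollout_bounds` are hypothetical, and the zero-zero fallback is an assumption of this sketch.

```python
from typing import Tuple, Union

Bound = Union[int, str]  # an absolute pod count, or a percentage such as "25%"


def resolve(value: Bound, replicas: int, round_up: bool) -> int:
    """Convert a count or percentage into an absolute number of pods."""
    if isinstance(value, int):
        return value
    scaled = int(value.rstrip("%")) * replicas
    # Integer arithmetic avoids floating-point surprises when rounding.
    return -(-scaled // 100) if round_up else scaled // 100


def rollout_bounds(replicas: int, max_surge: Bound = "25%",
                   max_unavailable: Bound = "25%") -> Tuple[int, int]:
    """Return (most pods allowed at once, fewest pods that must stay available)."""
    surge = resolve(max_surge, replicas, round_up=True)
    unavailable = resolve(max_unavailable, replicas, round_up=False)
    if surge == 0 and unavailable == 0:
        # Assumption: with both at zero no pod could ever be replaced, so this
        # sketch allows one unavailable pod to let the rollout progress.
        unavailable = 1
    return replicas + surge, replicas - unavailable


print(rollout_bounds(10))        # (13, 8)
print(rollout_bounds(4, 1, 0))   # (5, 4)
```

For 10 replicas at 25% each, up to 13 pods may exist at once and at least 8 must stay available.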

Automated Health Validation

The strategy relies on probes to autonomously validate the health of new instances, preventing defective versions from receiving traffic.

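Readiness Probes: Determine whether a new pod is fully initialized and ready. Traffic is only directed to pods that pass this check.
Liveness Probes: Continuously check whether a running container is healthy. The kubelet restarts a container whose liveness probe fails.
Stalled Rollouts: If new pods never become ready, the controller stops replacing old pods, so the rollout stalls instead of spreading the defect. In Kubernetes, a Deployment that makes no progress within progressDeadlineSeconds is marked as failed, but it is not rolled back automatically; a rollback must be triggered by an operator, a pipeline, or a progressive-delivery tool such as Argo Rollouts.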
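Probe results are usually debounced before they change a pod's status. The sketch below models that idea with consecutive-result thresholds named after the Kubernetes probe fields failureThreshold and successThreshold. The `ProbeTracker` class is hypothetical and simplifies how the kubelet actually tracks probe state.

```python
from dataclasses import dataclass


@dataclass
class ProbeTracker:
    failure_threshold: int = 3  # consecutive failures before the pod is not ready
    success_threshold: int = 1  # consecutive successes before the pod is ready
    ready: bool = False
    _successes: int = 0
    _failures: int = 0

    def observe(self, ok: bool) -> bool:
        """Record one probe result and return the resulting readiness."""
        if ok:
            self._successes += 1
            self._failures = 0
            if self._successes >= self.success_threshold:
                self.ready = True
        else:
            self._failures += 1
            self._successes = 0
            if self._failures >= self.failure_threshold:
                self.ready = False
        return self.ready


tracker = ProbeTracker()
results = [tracker.observe(ok) for ok in (False, True, False, False, False)]
print(results)  # [False, True, True, True, False]
```

A single failed check does not remove the pod from service here; only three failures in a row do.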

Version Coexistence & Rollback

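During the update, old and new application versions run side by side, which keeps a working version in service while the new one is validated.

Gradual Traffic Migration: Traffic is spread across old and new pods roughly in proportion to how many of each are ready. The split is a side effect of the rollout rather than a controlled percentage, so deliberate A/B testing needs explicit traffic splitting.
Rollback Capability: If the new version fails, the orchestrator can reverse the process by scaling the previous version back up and terminating the new pods. The rollback is itself a rolling update, so it is incremental rather than atomic. It relies on the immutable nature of container images and declarative state.
State Management: For stateless services, this is straightforward. Stateful services need data formats and schemas that both versions can read and write, and typically persistent storage (e.g., PersistentVolumes) that survives pod replacement.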

Declarative & Orchestrated Execution

Rolling updates are managed declaratively by orchestration platforms like Kubernetes Deployments or Amazon ECS Services.

Declarative Spec: The engineer defines the desired end state (new image, replica count). The orchestrator's controller loop executes the precise sequence of pod lifecycle events to achieve it.
Event-Driven Coordination: The system handles pod scheduling, image pulling, network attachment, and health checking without manual intervention.
Integration with Ecosystem: Works alongside the Horizontal Pod Autoscaler (HPA), Pod Disruption Budgets (PDBs), and service meshes for advanced traffic shaping and resilience.
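The declarative model comes down to comparing desired state with observed state and acting on the difference. The sketch below shows one pass of such a comparison. The `reconcile` function and its string actions are hypothetical, and a real controller would also pace these actions according to the rollout limits.

```python
from typing import List


def reconcile(image: str, replicas: int, pods: List[str]) -> List[str]:
    """List the actions needed to reach the desired state.

    `pods` holds the image each currently running pod was started from.
    """
    current = [p for p in pods if p == image]
    stale = [p for p in pods if p != image]
    actions = []
    if len(current) < replicas:
        actions.append(f"create {replicas - len(current)} pod(s) running {image}")
    elif len(current) > replicas:
        actions.append(f"delete {len(current) - replicas} surplus pod(s)")
    if stale:
        actions.append(f"delete {len(stale)} pod(s) running an older image")
    return actions


print(reconcile("app:v2", 3, ["app:v1", "app:v1", "app:v2"]))
print(reconcile("app:v2", 3, ["app:v2", "app:v2", "app:v2"]))  # [] when converged
```

Running the comparison repeatedly until it returns an empty list is the essence of a controller loop.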

Use Cases & Strategic Fit

Rolling updates are the default strategy for most stateless microservices but must be evaluated against alternatives.

Ideal For: Frequent, automated deployments of backend APIs, web services, and worker processes where brief version coexistence is acceptable.
Comparison to Blue-Green: Less resource-intensive (no full parallel environment) but slower to roll back completely. Blue-green offers a near-instant switchover.
Comparison to Canary: A rolling update eventually replaces every instance, while a canary deployment first exposes the new version to a small, controlled subset of traffic and promotes it only after validation, often using a service mesh for precise traffic splitting.

DEPLOYMENT STRATEGY COMPARISON

Rolling Update vs. Other Deployment Strategies

A comparison of key operational characteristics for common deployment strategies used in modern, containerized environments, focusing on availability, risk, and operational overhead.

Feature / Metric	Rolling Update	Blue-Green Deployment	Canary Deployment
Primary Goal	Zero-downtime incremental replacement	Instant rollback capability	Risk-mitigated validation with real users
Infrastructure Overhead	Minimal (single environment)	High (duplicate full environment)	Moderate (partial duplicate capacity)
Rollback Speed	Slow (incremental reverse update)	Near-instant (traffic switch)	Fast (traffic re-routing)
Release Risk Profile	Moderate (all traffic shifts gradually)	Low (validated before full switch)	Very Low (validated on small subset)
Traffic Control Granularity	Pod-level	Environment-level (100% traffic)	Precise percentage or user segment
Resource Cost During Update	~100% of baseline plus maxSurge (about 125% at the Kubernetes default)	~200% of baseline	~100-120% of baseline
Complexity of Stateful Data Migration	High (requires backward/forward compatibility)	Moderate (can run dual-write patterns)	High (requires backward/forward compatibility)
Ideal Use Case	Stateless microservices, frequent patches	Major version upgrades, stateful applications	New features with unknown performance impact

DEPLOYMENT STRATEGIES

Where Rolling Updates Are Used

Rolling updates are a fundamental deployment pattern for achieving zero-downtime releases. They are a core feature of modern orchestration platforms and are applied across diverse operational scenarios.

Kubernetes Deployments

The Deployment controller in Kubernetes is the canonical implementation of a rolling update. It manages the lifecycle of ReplicaSets to incrementally update pods. Key parameters control the update:

maxUnavailable: The maximum number of pods that can be unavailable during the update (e.g., 25%).
maxSurge: The maximum number of pods that can be created over the desired number (e.g., 25%).

The controller uses readiness probes to decide when a new pod counts as available, and scales down the old ReplicaSet only as far as maxUnavailable allows.
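As an illustration, a Deployment that uses these parameters can be built as a Python dictionary and serialized to JSON, which kubectl accepts alongside YAML. The field names follow the apps/v1 Deployment API; the application name, image, port, and probe path are placeholder assumptions.

```python
import json


def deployment_manifest(name: str, image: str, replicas: int = 4) -> dict:
    """Build an apps/v1 Deployment that uses the RollingUpdate strategy."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            "minReadySeconds": 10,
            "selector": {"matchLabels": labels},
            "strategy": {
                "type": "RollingUpdate",
                "rollingUpdate": {"maxSurge": "25%", "maxUnavailable": "25%"},
            },
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{
                        "name": name,
                        "image": image,
                        # Placeholder endpoint: the app must serve this path.
                        "readinessProbe": {
                            "httpGet": {"path": "/readyz", "port": 8080},
                            "periodSeconds": 5,
                        },
                    }]
                },
            },
        },
    }


manifest = deployment_manifest("web", "registry.example.com/web:1.4.2")
print(json.dumps(manifest, indent=2))
```

Changing only the image and re-applying the manifest is what starts a new rolling update.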

Cloud-Native Service Updates

Managed services on AWS, Google Cloud, and Azure use rolling updates for their Platform-as-a-Service offerings.

AWS Elastic Beanstalk: Uses rolling updates for environment updates, with configurable batch size and health check grace periods.
Google Cloud Run: Deploys each change as a new revision and shifts traffic to it once it is ready; a gradual rollout can be configured by splitting traffic between revisions.
Azure App Service: Deploys new code via deployment slots. A slot swap switches traffic between two complete environments, which is closer to a blue-green cutover than to an incremental rolling update.

These platforms abstract the underlying orchestration but apply the same core principle of health-verified replacement.

Continuous Delivery Pipelines

CI/CD tools like GitLab CI/CD, Jenkins, and Argo CD integrate rolling updates as a deployment step. They typically execute a sequence like:

Trigger on a merge to the main branch.
Build and push a new container image.
Update the Kubernetes Deployment manifest (e.g., image tag).
Apply the manifest, initiating the cluster's native rolling update.

Tools like Argo Rollouts add progressive-delivery features beyond the basic rolling update, such as automated analysis and promotion based on metrics.
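The pipeline sequence above might translate into the following commands. This is a sketch that only builds the argument lists and does not run them; the registry, application name, and commit tag are placeholder assumptions, while the docker and kubectl subcommands are standard ones.

```python
from typing import List, Optional


def deploy_commands(registry: str, app: str, commit: str,
                    container: Optional[str] = None,
                    timeout: str = "120s") -> List[List[str]]:
    """Return the commands a pipeline job could run, in order."""
    image = f"{registry}/{app}:{commit}"
    container = container or app  # assume the container is named after the app
    return [
        ["docker", "build", "-t", image, "."],
        ["docker", "push", image],
        # Changing the image tag triggers the Deployment's rolling update.
        ["kubectl", "set", "image", f"deployment/{app}", f"{container}={image}"],
        # Blocks until the rollout completes, or exits non-zero on timeout.
        ["kubectl", "rollout", "status", f"deployment/{app}", f"--timeout={timeout}"],
    ]


for command in deploy_commands("registry.example.com", "web", "abc123"):
    print(" ".join(command))
```

A pipeline could pass each list to `subprocess.run(command, check=True)` so that a failed rollout status fails the job.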

Microservices & API Versioning

In a microservices architecture, rolling updates allow teams to deploy new versions of a single service independently. This is critical for:

Backward-Compatible API Changes: Deploying a new service version that understands both old and new request formats, allowing client services to update at their own pace.
Database Schema Migrations: Applying non-breaking schema changes (e.g., adding a nullable column) before deploying code that uses the new schema. During the rollout, pods running old and new code share the same database, so the schema must work for both versions until the last old pod is gone.
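A backward-compatible handler can be sketched as follows. The payload shapes are invented for illustration: an older format with a single `name` field and a newer one with `first_name` and `last_name`. The function `parse_customer` is hypothetical.

```python
from typing import Dict


def parse_customer(payload: Dict[str, str]) -> Dict[str, str]:
    """Accept both the old and the new request format during a rollout."""
    if "first_name" in payload or "last_name" in payload:
        # New format: the fields arrive already split.
        first = payload.get("first_name", "")
        last = payload.get("last_name", "")
    else:
        # Old format: split the single name field on the first space.
        first, _, last = payload.get("name", "").partition(" ")
    return {"first_name": first, "last_name": last}


old_style = parse_customer({"name": "Ada Lovelace"})
new_style = parse_customer({"first_name": "Ada", "last_name": "Lovelace"})
print(old_style == new_style)  # True
```

Because both formats normalize to the same result, clients can switch formats at their own pace while old and new pods coexist.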

Stateful Application Updates

Rolling updates for stateful applications like databases (e.g., Cassandra, ZooKeeper) require special coordination to maintain quorum and data consistency. Operators often use:

StatefulSets in Kubernetes, which update pods one at a time in reverse ordinal order (highest ordinal first).
PodDisruptionBudgets (PDBs) to keep a minimum number of pods available during node drains and other voluntary evictions; the rolling update itself is not limited by a PDB.
Application-specific readiness probes that check for cluster membership and data replication status before considering a pod ready.

The update proceeds one pod at a time to preserve the cluster's operational state.
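The ordering and the quorum concern can be sketched in a few lines. The `partition` argument mirrors the StatefulSet rollingUpdate.partition field, under which only pods with an ordinal greater than or equal to the partition are updated. The function names and the pod name `zk` are hypothetical.

```python
from typing import List


def statefulset_update_order(name: str, replicas: int, partition: int = 0) -> List[str]:
    """Pods in the order a StatefulSet rolling update would replace them."""
    return [f"{name}-{ordinal}" for ordinal in range(replicas - 1, partition - 1, -1)]


def keeps_quorum(members: int, unavailable: int) -> bool:
    """True if a strict majority of members is still up."""
    return members - unavailable > members // 2


print(statefulset_update_order("zk", 3))               # ['zk-2', 'zk-1', 'zk-0']
print(statefulset_update_order("zk", 5, partition=3))  # ['zk-4', 'zk-3']
print(keeps_quorum(members=3, unavailable=1))          # True
print(keeps_quorum(members=3, unavailable=2))          # False
```

Updating one pod at a time keeps a three-member ensemble above quorum, while taking two members down at once would not.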

Edge & IoT Fleet Management

Rolling updates are a key strategy for updating software on large fleets of edge devices or IoT gateways. Platforms like AWS IoT Greengrass or Azure IoT Edge manage updates by:

Deploying to a small percentage of devices first (a canary set).
Monitoring device health and success metrics.
Gradually increasing the deployment percentage in a rolling fashion.

This approach minimizes the blast radius of a faulty update and ensures the majority of the fleet remains operational, which is critical for geographically distributed systems where physical access is difficult.
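The staged approach above can be sketched as a generic wave-based rollout. This is not the API of AWS IoT Greengrass or Azure IoT Edge; `staged_rollout`, the stage percentages, and the failure limit are assumptions chosen for illustration.

```python
from typing import Callable, Dict, List, Sequence


def staged_rollout(devices: Sequence[str], update: Callable[[str], bool],
                   stages: Sequence[int] = (5, 25, 100),
                   max_failure_percent: float = 2.0) -> Dict[str, object]:
    """Update devices wave by wave and halt if a wave fails too often."""
    updated: List[str] = []
    failed: List[str] = []
    done = 0
    for percent in stages:
        target = -(-len(devices) * percent // 100)  # ceiling division
        wave = devices[done:target]
        done = max(done, target)
        wave_failed = []
        for device in wave:
            if update(device):
                updated.append(device)
            else:
                wave_failed.append(device)
        failed.extend(wave_failed)
        if wave and 100 * len(wave_failed) / len(wave) > max_failure_percent:
            # Stop before the next, larger wave is touched.
            return {"updated": updated, "failed": failed, "halted": True}
    return {"updated": updated, "failed": failed, "halted": False}


fleet = [f"dev-{i}" for i in range(100)]
result = staged_rollout(fleet, update=lambda device: device != "dev-7")
print(result["halted"], len(result["updated"]), result["failed"])
```

In this run the first wave of five devices succeeds, the second wave hits one failure out of twenty, and the rollout halts with the remaining 75 devices untouched.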

ROLLING UPDATE

Frequently Asked Questions

A rolling update is a deployment strategy that incrementally replaces instances of an old application version with new ones, ensuring zero downtime. This glossary section answers common technical questions about its implementation, mechanics, and role in modern DevOps.

A rolling update is a deployment strategy that incrementally replaces instances of an old application version with new ones while keeping the service available. An orchestrator (like Kubernetes) manages the lifecycle of application pods or containers according to a defined update strategy: it replaces pods running the old version one by one or in small batches, scheduling new pods with the updated version in their place. It uses readiness probes to verify that each new pod is operational before moving on to the next batch, and liveness probes to restart containers that later become unhealthy. The result is a gradual, controlled transition in which the number of available replicas stays at or above a configured minimum, so the service can keep handling production traffic throughout the update.

DEPLOYMENT STRATEGIES & OBSERVABILITY

Related Terms

A rolling update is a core deployment pattern that interacts with several other critical concepts for managing application lifecycles in production, especially within orchestrated environments like Kubernetes.

Canary Deployment

A risk-mitigation strategy where a new version is released to a small, controlled subset of users or infrastructure (the 'canary') before a full rollout. It is often used in conjunction with a rolling update to first validate stability and performance metrics.

Key Difference: A rolling update gradually replaces all instances, while a canary deployment initially targets a subset for validation.
Use Case: Testing a new model version with 5% of production traffic to monitor for latency spikes or errors before proceeding.

Blue-Green Deployment

A strategy that maintains two identical, full-scale production environments (Blue and Green). Traffic is switched entirely from the old environment (Blue) to the new one (Green) in a single cutover.

Key Difference: Provides instant rollback by switching traffic back, whereas a rolling update rollback requires re-deploying the previous version across instances.
Trade-off: Requires double the infrastructure resources during the transition but minimizes deployment risk and complexity.

Traffic Splitting

The underlying mechanism for directing a controlled percentage of user requests to different service versions. It is the enabling technology for canary deployments and A/B tests.

Implementation: Often managed by a service mesh (like Istio or Linkerd) or an API gateway using rules based on HTTP headers or weights.
Observability Link: Critical for monitoring key performance indicators (KPIs) like error rates and latency per version during a progressive rollout.
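Weight-based splitting can be illustrated with deterministic hashing, so that a given user always lands on the same version. This is a sketch of the idea, not how Istio or Linkerd implement it; `pick_version` and the version labels are hypothetical.

```python
import hashlib


def pick_version(user_id: str, canary_percent: int) -> str:
    """Route a stable share of users to the canary based on a hash of their ID."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100  # bucket in 0..99
    return "canary" if bucket < canary_percent else "stable"


users = [f"user-{i}" for i in range(10_000)]
share = sum(pick_version(u, 5) == "canary" for u in users) / len(users)
print(f"canary share: {share:.1%}")
```

Hashing keeps routing sticky without storing any per-user state, and raising `canary_percent` only moves additional users onto the new version.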

Readiness & Liveness Probes

Health checks defined in a pod specification that the orchestrator uses to manage a rolling update's lifecycle.

Readiness Probe: Determines if a pod is ready to serve traffic. A new pod must pass this check before it is added to the Service's endpoints, so requests are not sent to a pod that cannot handle them yet.
Liveness Probe: Determines if a container is still healthy. If it fails, the kubelet restarts the container.
Rollout Control: These probes govern the pace and success of the incremental pod replacement in a rolling update.

Pod Disruption Budget (PDB)

A Kubernetes policy that limits the number of concurrent voluntary disruptions to pods in an application. It is a safety mechanism for maintenance operations such as node drains and cluster upgrades.

Function: Blocks an eviction if it would leave fewer than a minimum number of pods available (e.g., minAvailable: 90%) or more than a maximum number unavailable (e.g., maxUnavailable: 1).
Impact on Updates: A Deployment's rolling update is paced by its own maxUnavailable and maxSurge settings and is not limited by the PDB, although pods made unavailable by the rollout do count against the budget. The PDB matters when nodes are drained while a rollout is in progress.
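The budget arithmetic behind a PDB can be sketched as below: the number of voluntary evictions currently permitted is the count of healthy pods above the required floor. The function name is hypothetical, and the sketch takes absolute counts only, leaving percentage handling out.

```python
from typing import Optional


def disruptions_allowed(healthy: int, expected: int,
                        min_available: Optional[int] = None,
                        max_unavailable: Optional[int] = None) -> int:
    """Number of pods that could be evicted right now without breaking the budget."""
    if (min_available is None) == (max_unavailable is None):
        raise ValueError("set exactly one of min_available or max_unavailable")
    if min_available is not None:
        floor = min_available
    else:
        floor = expected - max_unavailable
    return max(0, healthy - floor)


print(disruptions_allowed(healthy=10, expected=10, min_available=9))   # 1
print(disruptions_allowed(healthy=9, expected=10, min_available=9))    # 0
print(disruptions_allowed(healthy=10, expected=10, max_unavailable=1)) # 1
```

When a pod is already unhealthy or being replaced, the allowance drops to zero and further evictions are refused until it recovers.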

Rollback

The process of reverting a deployment to a previous, stable version. In the context of a rolling update, this is triggered manually when issues are observed, or automatically by a pipeline or progressive-delivery tool when the new version fails health checks or metric analysis.

Kubernetes Mechanism: Executed with kubectl rollout undo or by updating the Deployment manifest to point to the previous container image; either way, the controller performs a new rolling update back to the previous pod template.
Observability Dependency: Effective rollbacks depend on rapid detection of anomalies through metrics, logs, and traces collected during the update.
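Rollback relies on the orchestrator keeping a history of earlier revisions. The sketch below is a simplified model of that history, in which an undo re-applies the previous revision as the newest one. The `DeploymentHistory` class is hypothetical and tracks only image names rather than full pod templates.

```python
from typing import List


class DeploymentHistory:
    """Minimal revision history: the last entry is the live revision."""

    def __init__(self, image: str) -> None:
        self.revisions: List[str] = [image]

    @property
    def current(self) -> str:
        return self.revisions[-1]

    def rollout(self, image: str) -> None:
        """Record a new revision; re-applying the same image changes nothing."""
        if image != self.current:
            self.revisions.append(image)

    def undo(self) -> str:
        """Make the previous revision the newest one and return it."""
        if len(self.revisions) < 2:
            raise RuntimeError("no previous revision to roll back to")
        previous = self.revisions.pop(-2)
        self.revisions.append(previous)
        return previous


history = DeploymentHistory("app:v1")
history.rollout("app:v2")
print(history.undo())      # app:v1
print(history.revisions)   # ['app:v2', 'app:v1']
```

Because the failed revision stays in the history, a second undo in this model would return to it, which is why the faulty image should be fixed or superseded after a rollback.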

About the author

Prasad Kumkar

CEO & MD, Inference Systems

Prasad Kumkar is the CEO & MD of Inference Systems and writes about AI systems architecture, LLM infrastructure, model serving, evaluation, and production deployment. Over 5+ years, he has worked across computer vision models, L5 autonomous vehicle systems, and LLM research, with a focus on taking complex AI ideas into real-world engineering systems.

His work and writing cover AI systems, large language models, AI agents, multimodal systems, autonomous systems, inference optimization, RAG, evaluation, and production AI engineering.
