Inferensys

Glossary

Rolling Update

A rolling update is a deployment strategy that incrementally replaces instances of an old application version with new ones, ensuring zero downtime during the update process.
DEPLOYMENT STRATEGY

What is Rolling Update?

A rolling update is a zero-downtime deployment strategy that incrementally replaces instances of an old application version with new ones.

A rolling update is a deployment strategy that incrementally replaces instances of an old application version with new ones so that the service stays available throughout the rollout. It is a core feature of modern orchestrators like Kubernetes, which starts new pods with the updated container image and terminates old ones while keeping the number of running pods within configured bounds around the desired replica count. This method provides a controlled, automated rollout and is a fundamental practice within Agent Deployment Observability for safely updating autonomous agents in production.

The strategy's key mechanism is its controlled, stepwise progression. The orchestrator updates pods in subsets, waiting for new instances to pass readiness probes before proceeding. Because the rollout stalls when new pods fail to become ready, problems surface while most of the old version is still serving traffic, and the previous version can be restored by rolling back. For agentic systems, rolling updates are essential for deploying new reasoning loops or tool integrations without disrupting the deterministic execution required by enterprise clients, making them a critical component of resilient DevOps and SRE workflows.
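The stepwise progression described above can be sketched as a small simulation. This is a simplified model and not Kubernetes' actual controller logic: the function name `simulate_rollout` and the `is_ready` callback (a stand-in for a readiness probe) are hypothetical, and pods that fail readiness are assumed to stay unready.

```python
def simulate_rollout(replicas, max_surge, max_unavailable, is_ready=lambda index: True):
    """Simplified model of a rolling update; returns (completed, history).

    history holds (old_ready, new_ready) after each step. `is_ready` is a
    hypothetical stand-in for the readiness probe of the index-th new pod.
    """
    old = replicas      # ready pods still running the old version
    new_total = 0       # new-version pods created so far
    new_ready = 0       # new-version pods that passed readiness
    history = []
    while old > 0 or new_ready < replicas:
        # Scale up: total pods may not exceed replicas + max_surge.
        room = replicas + max_surge - (old + new_total)
        to_create = max(0, min(room, replicas - new_total))
        for _ in range(to_create):
            if is_ready(new_total):
                new_ready += 1
            new_total += 1
        # Scale down: ready pods may not drop below replicas - max_unavailable.
        spare = (old + new_ready) - (replicas - max_unavailable)
        to_remove = max(0, min(spare, old))
        old -= to_remove
        if to_create == 0 and to_remove == 0:
            return False, history  # stalled: no step is allowed
        history.append((old, new_ready))
    return True, history


# Healthy new version: old pods are replaced one at a time.
print(simulate_rollout(4, max_surge=1, max_unavailable=0))
# -> (True, [(3, 1), (2, 2), (1, 3), (0, 4)])

# New version never becomes ready: the rollout stalls with all old pods serving.
print(simulate_rollout(4, 1, 0, is_ready=lambda index: False))
# -> (False, [(4, 0)])
```

In the second run the single surge pod never becomes ready, so no old pod is removed and capacity is unaffected. That stalled state is what an operator or monitoring system would act on.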

Key Features of Rolling Updates

A rolling update is a zero-downtime deployment strategy that incrementally replaces old application instances with new ones. Its core features ensure controlled, safe rollouts in production environments.

01

Zero-Downtime Deployment

The primary objective of a rolling update is to maintain service availability throughout the deployment process. This is achieved by:

  • Sequential Pod Replacement: New pods are created and must pass their readiness probes before they count as available. How many old pods may be terminated in the meantime is bounded by maxUnavailable.
  • Traffic Shifting: The orchestrator (e.g., Kubernetes) routes traffic only to ready pods, so requests move from old to new pods as the new ones become healthy.
  • Continuous Service: End-users should experience no interruption, provided the replica count and maxUnavailable are set so that enough replicas always remain available to serve requests.
02

Controlled Rollout Pace

Rolling updates provide fine-grained control over the speed and risk of the deployment through configurable parameters:

  • maxUnavailable: Defines the maximum number or percentage of pods that can be unavailable during the update (e.g., 25%). This controls the impact on capacity.
  • maxSurge: Defines the maximum number or percentage of pods that can be created over the desired number (e.g., 25%). This allows for faster rollouts by temporarily over-provisioning.
  • minReadySeconds: Requires a new pod to stay ready for a minimum period before it counts as available, catching failures that appear shortly after startup.
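As a worked example of how these parameters bound a rollout, the sketch below computes the minimum number of available pods and the maximum total pods for a given replica count. It assumes the Kubernetes conventions that both values default to 25%, that a percentage for maxSurge is rounded up, and that a percentage for maxUnavailable is rounded down. The helper names `resolve` and `rollout_bounds` are illustrative, and edge cases such as both values resolving to zero are ignored.

```python
def resolve(value, replicas, round_up):
    """Turn an absolute count or an 'NN%' string into a pod count."""
    if isinstance(value, str) and value.endswith("%"):
        scaled = int(value[:-1]) * replicas  # integer math avoids float error
        return -(-scaled // 100) if round_up else scaled // 100
    return int(value)


def rollout_bounds(replicas, max_surge="25%", max_unavailable="25%"):
    """Return the pod-count limits that hold during a rolling update."""
    surge = resolve(max_surge, replicas, round_up=True)
    unavailable = resolve(max_unavailable, replicas, round_up=False)
    return {
        "min_available": replicas - unavailable,
        "max_total": replicas + surge,
    }


print(rollout_bounds(10))       # {'min_available': 8, 'max_total': 13}
print(rollout_bounds(4, 1, 0))  # {'min_available': 4, 'max_total': 5}
```

With 10 replicas and the defaults, at least 8 pods stay available and at most 13 exist at once. Setting maxUnavailable to 0 and maxSurge to 1 trades rollout speed for full capacity throughout.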
03

Automated Health Validation

The strategy relies on probes to autonomously validate the health of new instances, preventing defective versions from receiving traffic.

  • Readiness Probes: Determine if a new pod is fully initialized and ready. Traffic is only directed to pods that pass this check.
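  • Liveness Probes: Continuously check whether a running container is healthy. A container that fails its liveness probe is restarted.
  • Rollback Trigger: If new pods fail their probes, they never become available, so the rollout stalls instead of replacing more healthy pods. Kubernetes reports a stalled rollout once progressDeadlineSeconds is exceeded but does not roll back a Deployment on its own; an automated rollback requires an operator or external tooling to act on that signal.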
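One way tooling can detect a failed rollout is to inspect the Deployment's status. The sketch below assumes the object has already been fetched and parsed (for example from `kubectl get deployment <name> -o json`) and classifies it using the `Progressing` condition with reason `ProgressDeadlineExceeded`, which Kubernetes sets when a rollout stops making progress. The function name `rollout_state` is hypothetical, and the check is simplified: it ignores `observedGeneration`, for instance.

```python
def rollout_state(deployment):
    """Classify a parsed Deployment object as 'stalled', 'complete', or 'progressing'."""
    spec = deployment.get("spec", {})
    status = deployment.get("status", {})

    # A stalled rollout is reported through the Progressing condition.
    for cond in status.get("conditions", []):
        if (cond.get("type") == "Progressing"
                and cond.get("status") == "False"
                and cond.get("reason") == "ProgressDeadlineExceeded"):
            return "stalled"

    # Complete: every desired replica is updated and available, and no extras remain.
    desired = spec.get("replicas", 1)
    if (status.get("updatedReplicas", 0) == desired
            and status.get("availableReplicas", 0) == desired
            and status.get("replicas", 0) == desired):
        return "complete"
    return "progressing"


example = {
    "spec": {"replicas": 4},
    "status": {
        "replicas": 5,
        "updatedReplicas": 1,
        "availableReplicas": 4,
        "conditions": [
            {"type": "Progressing", "status": "False",
             "reason": "ProgressDeadlineExceeded"},
        ],
    },
}
print(rollout_state(example))  # stalled
```

A pipeline could treat the "stalled" result as the signal to pause or roll back, which is the circuit-breaker behavior described above.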
04

Version Coexistence & Rollback

During the update, multiple application versions run simultaneously, which keeps the previous version close at hand for recovery.

  • Gradual Traffic Migration: Traffic is spread across old and new pods roughly in proportion to how many of each are ready. This permits a rough performance comparison, but controlled A/B testing requires explicit traffic splitting, for example through a service mesh.
  • Rollback Capability: If the new version fails, the orchestrator can reverse the process by scaling the old version back up and terminating the new pods. The rollback is itself a rolling update, so it is incremental rather than instantaneous. It relies on the immutable nature of container images and on declarative state.
  • State Management: For stateless services, this is straightforward. Stateful services require careful design, for example using PersistentVolumes, to ensure data consistency across versions.
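For Kubernetes Deployments, a rollback is typically triggered with `kubectl rollout undo`. The sketch below only builds and runs that command; the function names, the default namespace, and the example Deployment name `agent-api` are assumptions for illustration, and it presumes `kubectl` is installed and configured for the target cluster.

```python
import subprocess


def rollout_undo_command(name, namespace="default", to_revision=None):
    """Build the kubectl command that rolls a Deployment back."""
    cmd = ["kubectl", "rollout", "undo", f"deployment/{name}",
           "--namespace", namespace]
    if to_revision is not None:
        # Without this flag, kubectl reverts to the previous revision.
        cmd.append(f"--to-revision={to_revision}")
    return cmd


def roll_back(name, namespace="default", to_revision=None, run=subprocess.run):
    """Run the rollback; `run` is injectable so the call can be tested offline."""
    return run(rollout_undo_command(name, namespace, to_revision), check=True)


print(" ".join(rollout_undo_command("agent-api", to_revision=3)))
# kubectl rollout undo deployment/agent-api --namespace default --to-revision=3
```

Because the rollback proceeds as another rolling update, it is subject to the same maxSurge and maxUnavailable limits as the original rollout.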
05

Declarative & Orchestrated Execution

Rolling updates are managed declaratively by orchestration platforms like Kubernetes Deployments or Amazon ECS Services.

  • Declarative Spec: The engineer defines the desired end state (new image, replica count). The orchestrator's controller loop executes the precise sequence of pod lifecycle events to achieve it.
  • Event-Driven Coordination: The system handles pod scheduling, image pulling, network attachment, and health checking without manual intervention.
  • Integration with Ecosystem: Works alongside the Horizontal Pod Autoscaler (HPA), PodDisruptionBudgets (PDB), and service meshes for advanced traffic shaping and resilience.
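To make the declarative spec concrete, the sketch below builds a Kubernetes Deployment manifest as a Python dictionary and prints it as JSON, which `kubectl apply -f` accepts as well as YAML. The field names follow the `apps/v1` Deployment API. The application name, image, port, probe path `/readyz`, and all numeric values are placeholders chosen for illustration.

```python
import json


def deployment_manifest(name, image, replicas=4):
    """Return a Deployment manifest that uses the RollingUpdate strategy."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            "minReadySeconds": 10,           # stay ready 10s before counting as available
            "progressDeadlineSeconds": 300,  # report a stalled rollout after 5 minutes
            "strategy": {
                "type": "RollingUpdate",
                "rollingUpdate": {"maxSurge": 1, "maxUnavailable": 0},
            },
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{
                        "name": name,
                        "image": image,
                        "readinessProbe": {
                            "httpGet": {"path": "/readyz", "port": 8080},
                            "periodSeconds": 5,
                        },
                    }],
                },
            },
        },
    }


manifest = deployment_manifest("agent-api", "registry.example.com/agent-api:2.0.0")
print(json.dumps(manifest, indent=2))
```

Changing only the `image` value and re-applying the manifest is what triggers a rolling update; the controller works out the sequence of pod creations and terminations.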
06

Use Cases & Strategic Fit

Rolling updates are the default strategy for most stateless microservices but must be evaluated against alternatives.

  • Ideal For: Frequent, automated deployments of backend APIs, web services, and worker processes where brief version coexistence is acceptable.
  • Comparison to Blue-Green: Less resource-intensive (no full parallel environment) but slower to roll back completely. Blue-green offers a near-instantaneous switchover.
  • Comparison to Canary: A rolling update is a broad technique; a canary deployment is a specific application of it where the new version is released to a small, controlled subset of traffic first, often using a service mesh for sophisticated traffic splitting.
DEPLOYMENT STRATEGY COMPARISON

Rolling Update vs. Other Deployment Strategies

A comparison of key operational characteristics for common deployment strategies used in modern, containerized environments, focusing on availability, risk, and operational overhead.

| Feature / Metric | Rolling Update | Blue-Green Deployment | Canary Deployment |
| --- | --- | --- | --- |
| Primary Goal | Zero-downtime incremental replacement | Instant rollback capability | Risk-mitigated validation with real users |
| Infrastructure Overhead | Minimal (single environment) | High (duplicate full environment) | Moderate (partial duplicate capacity) |
| Rollback Speed | Slow (incremental reverse update) | Near-instant (traffic switch) | Fast (traffic re-routing) |
| Release Risk Profile | Moderate (all traffic shifts gradually) | Low (validated before full switch) | Very Low (validated on small subset) |
| Traffic Control Granularity | Pod-level | Environment-level (100% traffic) | Precise percentage or user segment |
| Resource Cost During Update | ~100% of baseline plus maxSurge (up to ~125% at the Kubernetes default) | 200% of baseline | ~100-120% of baseline |
| Complexity of Stateful Data Migration | High (requires backward/forward compatibility) | Moderate (can run dual-write patterns) | High (requires backward/forward compatibility) |
| Ideal Use Case | Stateless microservices, frequent patches | Major version upgrades, stateful applications | New features with unknown performance impact |

DEPLOYMENT STRATEGIES

Where Rolling Updates Are Used

Rolling updates are a fundamental deployment pattern for achieving zero-downtime releases. They are a core feature of modern orchestration platforms and are applied across diverse operational scenarios.

04

Microservices & API Versioning

In a microservices architecture, rolling updates allow teams to deploy new versions of a single service independently. This is critical for:

  • Backward-Compatible API Changes: Deploying a new service version that understands both old and new request formats, allowing client services to update at their own pace.
  • Database Schema Migrations: Applying non-breaking schema changes (e.g., adding a nullable column) before deploying code that uses the new schema. During the rolling update, pods running the old code and pods running the new code share the same database, so the schema must remain compatible with both versions until the transition completes.
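The nullable-column case can be demonstrated with an in-memory SQLite database. This is an illustrative sketch only: the `users` table, the `locale` column, and the two insert functions standing in for the old and new service versions are all hypothetical.

```python
import sqlite3


def old_version_insert(conn, name):
    """Old code: unaware of the new column."""
    conn.execute("INSERT INTO users (name) VALUES (?)", (name,))


def new_version_insert(conn, name, locale):
    """New code: writes the new column."""
    conn.execute("INSERT INTO users (name, locale) VALUES (?, ?)", (name, locale))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
old_version_insert(conn, "ada")

# Step 1: apply the non-breaking migration before any new pods start.
conn.execute("ALTER TABLE users ADD COLUMN locale TEXT")  # nullable, no default

# Step 2: during the rollout, both versions write to the same table.
old_version_insert(conn, "grace")
new_version_insert(conn, "linus", "fi-FI")

rows = conn.execute("SELECT name, locale FROM users ORDER BY id").fetchall()
print(rows)  # [('ada', None), ('grace', None), ('linus', 'fi-FI')]
```

Rows written by the old version simply carry NULL in the new column, so neither version breaks while both are running. A change such as dropping or renaming a column would not have this property and would need to wait until no old pods remain.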
ROLLING UPDATE

Frequently Asked Questions

A rolling update is a deployment strategy that incrementally replaces instances of an old application version with new ones, ensuring zero downtime. This glossary section answers common technical questions about its implementation, mechanics, and role in modern DevOps.

A rolling update works by having an orchestrator (like Kubernetes) manage the lifecycle of application pods or containers according to a defined update strategy. The orchestrator replaces pods running the old version one by one or in small batches, scheduling new pods with the updated version in their place. It uses readiness probes to verify that each new pod is fully operational before proceeding to the next batch, and liveness probes to restart containers that later become unhealthy. This creates a gradual, controlled transition in which the number of available replicas is not allowed to drop below a configured minimum, so the service can keep handling production traffic throughout the update.

Prasad Kumkar

About the author

CEO & MD, Inference Systems

Prasad Kumkar is the CEO & MD of Inference Systems and writes about AI systems architecture, LLM infrastructure, model serving, evaluation, and production deployment. For more than five years, he has worked across computer vision models, L5 autonomous vehicle systems, and LLM research, with a focus on taking complex AI ideas into real-world engineering systems.

His work and writing cover AI systems, large language models, AI agents, multimodal systems, autonomous systems, inference optimization, RAG, evaluation, and production AI engineering.