Inferensys

StatefulSet

A StatefulSet is a Kubernetes workload controller designed to manage stateful applications by providing stable, unique network identifiers and persistent storage that survives pod rescheduling.
KUBERNETES CONTROLLER

What is a StatefulSet?

A StatefulSet is a Kubernetes workload API object used to manage stateful applications. Unlike a Deployment, which manages stateless pods, a StatefulSet maintains a sticky identity for each of its pods. These pods are created from the same specification but are not interchangeable; each has a persistent identifier that it maintains across any rescheduling. This is essential for applications like databases (e.g., Cassandra, MongoDB), message queues, and other systems where stable network identifiers and persistent storage are required for proper operation and data integrity.

The controller provides ordered, graceful deployment and scaling. Pods are created sequentially, following a strict order (0, 1, 2,...), and scaling down follows the reverse order. Each pod in a StatefulSet derives its hostname from the pattern <statefulset-name>-<ordinal-index>. Paired with a headless Service, this provides stable DNS records for direct pod discovery. Persistent storage is managed via PersistentVolumeClaims (PVCs), where each pod gets its own, uniquely bound PVC that persists even if the pod is rescheduled to another node, ensuring data durability.
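
The naming pattern above can be sketched in a few lines of Python. This is only an illustration of how the names compose, not Kubernetes code: the function names are hypothetical, and the StatefulSet name `web`, the Service name `web-headless`, the `default` namespace, and the `cluster.local` cluster domain are assumed values.

```python
def pod_name(statefulset: str, ordinal: int) -> str:
    """Hostname of a StatefulSet pod: <statefulset-name>-<ordinal-index>."""
    return f"{statefulset}-{ordinal}"


def pod_fqdn(statefulset: str, ordinal: int, service: str,
             namespace: str = "default",
             cluster_domain: str = "cluster.local") -> str:
    """Stable DNS name of a pod behind its headless Service.

    The namespace and cluster domain defaults are assumptions; clusters
    can be configured with a different cluster domain.
    """
    host = pod_name(statefulset, ordinal)
    return f"{host}.{service}.{namespace}.svc.{cluster_domain}"


if __name__ == "__main__":
    # Print the DNS names of a hypothetical three-replica StatefulSet.
    for i in range(3):
        print(pod_fqdn("web", i, service="web-headless"))
```

Because the name depends only on the StatefulSet name and the ordinal, a rescheduled pod comes back with the same hostname and DNS record.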

Key Features of StatefulSet

StatefulSet is a Kubernetes workload controller designed for stateful applications. It provides guarantees about the ordering and uniqueness of pods, stable network identities, and persistent storage that survives pod rescheduling.

How StatefulSet Works

A StatefulSet manages the deployment and scaling of a set of Pods while providing guarantees about their ordering and uniqueness. Unlike a Deployment, each Pod in a StatefulSet derives a persistent, predictable hostname from the StatefulSet's name and a sequential index (e.g., web-0, web-1). This stable network identity is crucial for applications like databases, which often rely on fixed hostnames for cluster membership and service discovery.

The controller provides stable persistent storage by creating a unique PersistentVolumeClaim (PVC) for each Pod, bound to its ordinal index. When a Pod is rescheduled, it reattaches to its specific PVC, preserving its data. Operations like scaling and updates follow a strict, sequential order (e.g., web-1 is not created until web-0 is Running and Ready), ensuring deterministic behavior for clustered applications that require ordered deployment and termination.
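
As a rough model of this behavior, the sketch below derives per-pod PVC names and the order of operations during scaling under the default OrderedReady policy. It is a simplified simulation rather than controller code: the helper names are hypothetical, the claim template name `data` is an assumption, and readiness checks between steps are not modeled. The PVC name format (`<claim-template-name>-<pod-name>`) follows the Kubernetes convention.

```python
from typing import List, Tuple


def pvc_name(claim_template: str, statefulset: str, ordinal: int) -> str:
    """PVC created for one pod: <claim-template-name>-<statefulset-name>-<ordinal>."""
    return f"{claim_template}-{statefulset}-{ordinal}"


def scale_steps(statefulset: str, current: int, desired: int) -> List[Tuple[str, str]]:
    """Ordered (action, pod) steps when changing the replica count.

    Scale-up creates pods in ascending ordinal order; scale-down deletes
    them in descending order. The real controller also waits for each pod
    to be Running and Ready before moving on, which is not modeled here.
    """
    if desired >= current:
        return [("create", f"{statefulset}-{i}") for i in range(current, desired)]
    return [("delete", f"{statefulset}-{i}")
            for i in range(current - 1, desired - 1, -1)]


if __name__ == "__main__":
    print(scale_steps("web", 0, 3))   # scale up from 0 to 3 replicas
    print(scale_steps("web", 3, 1))   # scale down from 3 to 1 replica
    print(pvc_name("data", "web", 1))
```

Since the PVC name is a pure function of the template name and the pod's ordinal, a recreated `web-1` finds the claim that the previous `web-1` used.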

KUBERNETES WORKLOAD CONTROLLERS

StatefulSet vs. Deployment

A comparison of two core Kubernetes controllers, highlighting their distinct design purposes and operational characteristics for managing pods.

| Feature | StatefulSet | Deployment |
| --- | --- | --- |
| Primary Use Case | Stateful applications requiring stable identity and persistent storage (e.g., databases, message queues). | Stateless applications where pods are fungible and interchangeable (e.g., web servers, APIs). |
| Pod Identity & Naming | Stable, predictable hostnames (e.g., app-0, app-1) and ordinal index. Pods are not fungible. | Non-deterministic, hash-based pod names (e.g., app-7cbb8796fd-2xzqk). Pods are fungible. |
| Pod Creation & Scaling Order | Ordered, sequential creation (0, 1, 2...) and termination (reverse order). Supports ordered rolling updates. | Parallel, non-ordered creation and termination. All pods are treated identically. |
| Persistent Storage | Uses PersistentVolumeClaims (PVCs) that are created per pod and follow the pod during rescheduling. Each pod gets its own unique storage. | Typically uses shared or ephemeral storage. Any PersistentVolumeClaim is shared identically across all pod replicas. |
| Network Identity (DNS) | Stable DNS hostname per pod: <pod-name>.<service-name>.<namespace>.svc.cluster.local. Headless Service required for direct pod access. | Service load-balances traffic to any backend pod. Individual pod DNS is not stable or typically used. |
| Update Strategy | RollingUpdate (ordered) or OnDelete. Cannot use the Recreate strategy. | RollingUpdate (parallel) or Recreate. |
| Pod Management Policy | Can be OrderedReady (default, enforces order) or Parallel (relaxes ordering during scaling). | Always parallel; no ordering concept. |
| Use in Agentic Observability | Suitable for stateful agent backends requiring durable memory, audit logs, or ordered execution of agent replicas. | Suitable for stateless agent inference endpoints, telemetry collectors, or load-balanced API fronts for agent interactions. |

KUBERNETES WORKLOADS

StatefulSet Use Cases

StatefulSets manage stateful applications in Kubernetes by providing stable network identity and persistent storage. Here are the primary scenarios where they are essential.

03

Leader-Follower & Consensus Systems (e.g., etcd, ZooKeeper)

Systems that use leader election or consensus protocols (like Raft) have strict requirements for member identity and startup sequence, which StatefulSets enforce.

  • Predictable Network Identifiers: The stable DNS names (etcd-0.etcd-headless.default.svc.cluster.local) allow cluster members to discover each other reliably, forming a stable cluster membership list.
  • Ordered, Graceful Startup: The OrderedReady pod management policy (the default) ensures Pod n-1 is Running and Ready before Pod n is created. This allows for safe cluster bootstrap where the first pod (etcd-0) often initializes the cluster.
  • Safe Updates & Failures: During a rolling update, pods are terminated and recreated one at a time in reverse ordinal order (highest ordinal first), helping to preserve quorum and leadership stability.
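
To make the points above concrete, the following sketch builds a static peer list from the stable DNS names and lists the order in which a rolling update would replace pods. It is an illustrative sketch under assumptions: the helper names are hypothetical, the plain-HTTP scheme and peer port 2380 are assumed values, and the output string follows the `name=url` comma-separated form that etcd's `--initial-cluster` flag expects. The `partition` parameter mirrors the StatefulSet RollingUpdate partition, where only pods with an ordinal greater than or equal to the partition are updated.

```python
from typing import List


def initial_cluster(name: str, service: str, namespace: str, replicas: int,
                    port: int = 2380, scheme: str = "http") -> str:
    """Comma-separated member=peer-URL list built from stable pod DNS names."""
    members = []
    for i in range(replicas):
        member = f"{name}-{i}"
        url = f"{scheme}://{member}.{service}.{namespace}.svc.cluster.local:{port}"
        members.append(f"{member}={url}")
    return ",".join(members)


def rolling_update_order(replicas: int, partition: int = 0) -> List[int]:
    """Ordinals in the order a rolling update replaces them (highest first)."""
    return list(range(replicas - 1, partition - 1, -1))


def quorum(replicas: int) -> int:
    """Smallest majority of a cluster with the given member count."""
    return replicas // 2 + 1


if __name__ == "__main__":
    print(initial_cluster("etcd", "etcd-headless", "default", 3))
    print(rolling_update_order(3))   # [2, 1, 0]
    print(quorum(3))                 # 2
```

Replacing one pod at a time keeps a three-member cluster at two live members during the update, which still meets the majority of two.
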
04

Custom Sharded Data Stores

Applications that implement custom data sharding, where each pod is responsible for a specific subset of data (a shard), depend on StatefulSets for pod-specific storage and identity.

  • Shard-Pod Affinity: Each shard (e.g., shard-0-data) is permanently associated with a specific pod (e.g., data-pod-0) via its persistent volume. The shard's data moves with its pod.
  • Direct Addressing: Clients can address requests directly to a specific shard by connecting to the stable DNS name of its corresponding pod (data-pod-2.data-service).
  • Scaling Complexity: Adding a new shard requires adding a new pod (scale up), which will create a new PVC for the new shard's data. Removing a shard (scale down) requires safely migrating its data first: by default the associated PVC is retained rather than deleted, but no pod serves its data once the pod is gone.
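
A client-side router for such a store could look roughly like the sketch below. The hash-modulo placement scheme is an assumption chosen for brevity, not something Kubernetes provides, and the function names are hypothetical; the pod and Service names reuse the `data-pod` and `data-service` examples from the list above.

```python
import hashlib


def shard_for_key(key: str, shard_count: int) -> int:
    """Map a key to a shard ordinal with a stable hash.

    hashlib is used instead of the built-in hash() because the built-in
    is randomized per process for strings.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def shard_address(key: str, shard_count: int,
                  statefulset: str = "data-pod",
                  service: str = "data-service") -> str:
    """DNS name of the pod that owns the key's shard."""
    ordinal = shard_for_key(key, shard_count)
    return f"{statefulset}-{ordinal}.{service}"


if __name__ == "__main__":
    for key in ("user-1", "user-2", "user-3"):
        print(key, "->", shard_address(key, shard_count=3))
```

With plain modulo placement, changing `shard_count` remaps most keys, which is one reason scaling a sharded StatefulSet usually involves an explicit data migration step.
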
05

Monitoring & Logging (Stateful Agents)

While DaemonSets are typical for node-level agents, StatefulSets are used for monitoring components that require their own persistent state, such as time-series databases or log aggregators.

  • Prometheus with Long-Term Storage: A Prometheus server deployed as a StatefulSet can use a large persistent volume to store its time-series database (TSDB) blocks, surviving pod restarts.
  • Dedicated Storage per Instance: In multi-tenant or high-availability setups, each instance of a monitoring service can have its own dedicated, persistent configuration and cache.
  • Predictable Service Discovery: Other services can discover a specific monitoring pod instance via its stable hostname for direct debugging or querying.
06

Key Contrast: When NOT to Use a StatefulSet

Understanding the alternatives clarifies the StatefulSet's role. Use a Deployment (which manages ReplicaSets on your behalf) when:

  • Statelessness: The application is truly stateless or stores state externally (e.g., in a cloud database).
  • Fungible Pods: Any pod replica is identical and can handle any request. Pods do not need unique, stable identities.
  • Simple Scaling & Updates: You require fast, parallel scaling and rolling updates without ordered constraints.
  • Shared Storage: All pods can read from and write to the same ReadWriteMany (RWX) persistent volume.

Use a DaemonSet when you need one pod per node for cluster-wide services like logging, monitoring, or storage drivers.

KUBERNETES STATEFUL APPLICATIONS

Frequently Asked Questions

A StatefulSet is a core Kubernetes workload controller designed for deploying and managing stateful applications. It provides guarantees about the ordering and uniqueness of pods, stable network identities, and persistent storage that survives pod rescheduling. This FAQ addresses common questions about its operation, use cases, and role in agent deployment observability.

A StatefulSet is a Kubernetes controller that manages the deployment and scaling of a set of Pods while providing guarantees about ordering and stable, unique network identifiers. It works by assigning each pod a persistent, ordinal identity (e.g., app-0, app-1) that is maintained across pod reschedules. This identity is used to create stable PersistentVolumeClaims (PVCs) and a stable headless Service DNS name for each pod (e.g., app-0.app-svc.namespace.svc.cluster.local). Updates and scaling operations follow a strict, predictable order (e.g., sequential creation/termination), which is essential for stateful applications like databases, message queues, and agent memory stores that require stable storage and network endpoints.

Key mechanisms:

  • Stable Pod Identity: Pods are created from an identical spec but are not interchangeable; each has a persistent ordinal index.
  • Stable Storage: Each pod gets its own PVC, created from the StatefulSet's volumeClaimTemplates and bound to a unique PersistentVolume (PV), ensuring data persists even if the pod is rescheduled to a different node.
  • Ordered Operations: Pods are created, scaled up, updated, and terminated in a strict sequential order (by default).
  • Network Identity: A headless Service controls the pod's network domain, allowing other pods to discover and connect to specific instances directly.
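
These mechanisms map onto a handful of fields in the StatefulSet spec. The sketch below assembles a minimal manifest and its headless Service as Python dictionaries and prints them as JSON, which `kubectl apply -f` accepts alongside YAML. The builder function names, the `app` and `app-svc` names, the image `example.com/app:1.0`, the port, the mount path, and the storage size are all placeholder assumptions; the field names come from the `apps/v1` StatefulSet and `v1` Service APIs.

```python
import json


def headless_service(name: str, service: str, port: int = 80) -> dict:
    """Headless Service (clusterIP: None) that governs the pods' DNS domain."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": service},
        "spec": {
            "clusterIP": "None",
            "selector": {"app": name},
            "ports": [{"name": "main", "port": port}],
        },
    }


def statefulset_manifest(name: str, service: str, replicas: int,
                         image: str, storage: str = "1Gi") -> dict:
    """Minimal StatefulSet with one per-pod volume claim named 'data'."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": name},
        "spec": {
            "serviceName": service,                 # network identity
            "replicas": replicas,
            "podManagementPolicy": "OrderedReady",  # ordered operations (default)
            "updateStrategy": {"type": "RollingUpdate"},
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{
                    "name": name,
                    "image": image,
                    "volumeMounts": [{"name": "data", "mountPath": "/data"}],
                }]},
            },
            "volumeClaimTemplates": [{              # stable storage, one PVC per pod
                "metadata": {"name": "data"},
                "spec": {
                    "accessModes": ["ReadWriteOnce"],
                    "resources": {"requests": {"storage": storage}},
                },
            }],
        },
    }


if __name__ == "__main__":
    objects = [headless_service("app", "app-svc"),
               statefulset_manifest("app", "app-svc", 3, "example.com/app:1.0")]
    print(json.dumps({"apiVersion": "v1", "kind": "List", "items": objects}, indent=2))
```
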

About the author

Prasad Kumkar

CEO & MD, Inference Systems

Prasad Kumkar is the CEO & MD of Inference Systems and writes about AI systems architecture, LLM infrastructure, model serving, evaluation, and production deployment. Over 5+ years, he has worked across computer vision models, L5 autonomous vehicle systems, and LLM research, with a focus on taking complex AI ideas into real-world engineering systems.

His work and writing cover AI systems, large language models, AI agents, multimodal systems, autonomous systems, inference optimization, RAG, evaluation, and production AI engineering.