Inferensys

Readiness Probe

A readiness probe is a type of health check in Kubernetes that determines if a container is fully initialized and ready to accept network requests.
KUBERNETES HEALTH CHECK

What is a Readiness Probe?

A readiness probe is a Kubernetes health check mechanism that determines if a containerized application is fully initialized and ready to accept network traffic.

A readiness probe is a periodic diagnostic executed by the kubelet on a worker node to verify that a container within a pod is ready to serve requests. Unlike a liveness probe, which determines if a container should be restarted, a readiness probe controls whether the pod's IP address is added to the endpoints of a matching Service. If the probe fails, the pod is temporarily removed from service load balancers, preventing traffic from being sent to an unprepared or busy instance. This mechanism is crucial for supporting graceful startup, rolling updates, and canary deployments by ensuring traffic only routes to healthy, initialized pods.

Probes are defined in a pod's specification and can be configured as an HTTP GET request, a TCP socket check, or an exec command. The probe's success threshold, failure threshold, period, and timeout are tunable parameters. This fine-grained control allows operators to accurately model an application's true readiness state, accounting for dependencies like database connections or cache warming. Properly configured readiness probes are a foundational element of service reliability, preventing cascading failures and enabling zero-downtime deployments by ensuring the orchestrator only directs traffic to pods that are operationally prepared.
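As a rough illustration of where these parameters live in a pod's specification, the sketch below builds a minimal Pod manifest as a Python dictionary and prints it as JSON, a format `kubectl apply -f` accepts alongside YAML. The pod name, image, `/ready` path, port, and every numeric value are placeholders chosen for the example, not recommendations.

```python
import json


def build_pod_manifest(name: str, image: str, port: int) -> dict:
    """Return a minimal Pod manifest whose container has an HTTP readiness probe."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name, "labels": {"app": name}},
        "spec": {
            "containers": [
                {
                    "name": name,
                    "image": image,
                    "ports": [{"containerPort": port}],
                    "readinessProbe": {
                        # Hypothetical endpoint; the application must serve it.
                        "httpGet": {"path": "/ready", "port": port},
                        "initialDelaySeconds": 5,
                        "periodSeconds": 10,
                        "timeoutSeconds": 2,
                        "successThreshold": 1,
                        "failureThreshold": 3,
                    },
                }
            ]
        },
    }


if __name__ == "__main__":
    manifest = build_pod_manifest("demo-api", "example.com/demo-api:1.0", 8080)
    print(json.dumps(manifest, indent=2))
```

Swapping `httpGet` for `tcpSocket` (with a `port`) or `exec` (with a `command` list) changes the mechanism while the timing fields stay the same.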

Key Characteristics of a Readiness Probe

Unlike a liveness probe, a readiness probe does not restart the container on failure; instead, it temporarily removes the pod from Service endpoints until the probe passes again.

01

Purpose: Traffic Gating

The primary function is to gate traffic. A container passes its readiness probe only when it is fully operational, including completed dependency initialization (e.g., database connections, cache warming, large model loading), and a pod is Ready only when all of its containers are ready. Until then, the pod's IP address is kept out of the Service's endpoints, so kube-proxy and Ingress controllers do not route requests to it. This prevents user requests from hitting a partially started application, which is critical for stateful services and agents loading large context.
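On the application side, traffic gating usually means exposing an endpoint that reports failure until initialization is done. The following is a minimal sketch using only the Python standard library; the `/ready` path, the `dependencies_ready` flag, and the `start_server` helper are names invented for this example.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Set by the application once its startup work (hypothetically: database
# connection, cache warm-up, model load) has finished.
dependencies_ready = threading.Event()


class ReadinessHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/ready":
            self.send_error(404)
            return
        # 200 counts as probe success; 503 falls outside 200-399, so it fails.
        status = 200 if dependencies_ready.is_set() else 503
        body = b"ready\n" if status == 200 else b"not ready\n"
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep periodic probe requests out of the log output


def start_server(port: int = 0) -> ThreadingHTTPServer:
    """Serve the readiness endpoint on a background thread.

    Binds to localhost for the demo. Inside a pod the server would need to
    listen on an address the kubelet can reach, such as 0.0.0.0.
    """
    server = ThreadingHTTPServer(("127.0.0.1", port), ReadinessHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

The application calls `dependencies_ready.set()` when initialization completes, and can call `dependencies_ready.clear()` later to take itself out of rotation temporarily without being restarted.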

02

Probe Types & Mechanisms

Kubernetes supports several mechanisms for executing the check. Besides the built-in gRPC probe available in recent versions, the three most common are:

  • HTTP GET Probe: Sends an HTTP request to a specified path and port. Success is defined by a response status code between 200 and 399. This is the most common type for web services and APIs.
  • TCP Socket Probe: Attempts to open a TCP connection to a specified port. Success is defined by the connection being established. Used for non-HTTP services like databases or custom gRPC servers.
  • Exec Probe: Executes a specified command inside the container. Success is defined by a zero exit code. Used for complex, application-specific startup logic that cannot be expressed as a simple HTTP or TCP check.
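To make the success criteria above concrete, here is a simplified sketch of the three checks in Python. It is not the kubelet's implementation: the function names are made up for illustration, and details such as redirect handling, HTTPS, and custom headers are left out.

```python
import http.client
import socket
import subprocess


def http_get_probe(host: str, port: int, path: str = "/", timeout: float = 1.0) -> bool:
    """Succeed when the response status code is at least 200 and below 400."""
    try:
        conn = http.client.HTTPConnection(host, port, timeout=timeout)
        try:
            conn.request("GET", path)
            return 200 <= conn.getresponse().status < 400
        finally:
            conn.close()
    except (OSError, http.client.HTTPException):
        return False  # refused, timed out, or malformed response


def tcp_socket_probe(host: str, port: int, timeout: float = 1.0) -> bool:
    """Succeed when a TCP connection to the port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def exec_probe(command: list, timeout: float = 1.0) -> bool:
    """Succeed when the command exits with status code 0 within the timeout."""
    try:
        result = subprocess.run(
            command,
            timeout=timeout,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        return result.returncode == 0
    except (OSError, subprocess.TimeoutExpired):
        return False
```

In each case a timeout is treated the same as a failure, which mirrors how `timeoutSeconds` affects the probe result.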

03

Configuration Parameters

Probe behavior is finely tuned via parameters in the container spec:

  • initialDelaySeconds: Wait time after container start before initiating probes. Crucial for applications with long boot sequences.
  • periodSeconds: How often (in seconds) to perform the probe.
  • timeoutSeconds: Number of seconds after which the probe times out.
  • successThreshold: Minimum consecutive successes for the probe to be considered passed after failing.
  • failureThreshold: Number of consecutive failures required for the probe to be considered failed. After this, the pod is marked 'Not Ready'.
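The interplay of `successThreshold` and `failureThreshold` can be pictured as a small state machine. The sketch below is an approximation written for this glossary, with a hypothetical `ReadinessTracker` class; it models only the consecutive-result counting and ignores `initialDelaySeconds`, `periodSeconds`, and `timeoutSeconds`.

```python
from dataclasses import dataclass


@dataclass
class ReadinessTracker:
    """Track a container's ready state from a stream of probe results."""

    success_threshold: int = 1
    failure_threshold: int = 3
    ready: bool = False  # a container with a readiness probe starts out not ready
    _streak: int = 0     # consecutive results that contradict the current state

    def observe(self, probe_succeeded: bool) -> bool:
        """Record one probe result and return the resulting ready state."""
        if probe_succeeded == self.ready:
            # The result agrees with the current state, so any run of
            # contradicting results is broken.
            self._streak = 0
            return self.ready
        self._streak += 1
        needed = self.success_threshold if probe_succeeded else self.failure_threshold
        if self._streak >= needed:
            self.ready = probe_succeeded
            self._streak = 0
        return self.ready
```

With `failure_threshold=3`, two failed probes followed by a success leave the container ready; only three failures in a row flip it to not ready. As a rough guide, a ready container that starts failing is marked not ready after about `failureThreshold × periodSeconds` seconds.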

04

Interaction with Service Mesh

In a service mesh like Istio or Linkerd, readiness also depends on the sidecar proxy. Each container is probed separately, and the pod receives traffic only when both the application container and the sidecar proxy container report ready. This ensures the proxy's routing configuration and mTLS setup are initialized first. Misconfiguration here is a common cause of traffic drops during deployments, where the app starts but the proxy is not ready to route.
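The all-containers rule can be sketched in a few lines. This is a simplified model that ignores readiness gates and other pod conditions; the function names, IP addresses, and the `app` container name are placeholders (`istio-proxy` is the name Istio gives its injected sidecar).

```python
from typing import Dict, List


def pod_is_ready(container_ready: Dict[str, bool]) -> bool:
    """A pod counts as ready only when every one of its containers is ready."""
    return bool(container_ready) and all(container_ready.values())


def ready_endpoints(pods: Dict[str, Dict[str, bool]]) -> List[str]:
    """Return the pod IPs that would be eligible to receive Service traffic."""
    return sorted(ip for ip, containers in pods.items() if pod_is_ready(containers))


if __name__ == "__main__":
    pods = {
        "10.0.0.1": {"app": True, "istio-proxy": True},
        "10.0.0.2": {"app": True, "istio-proxy": False},  # proxy still starting
    }
    print(ready_endpoints(pods))
```

In the example data, the second pod is excluded even though its application container is ready, which is the deployment-time traffic drop scenario described above.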

05

Distinction from Liveness & Startup Probes

  • Readiness vs. Liveness: A failed liveness probe causes the container to be restarted. A failed readiness probe only removes the pod from Service endpoints. Use liveness to catch deadlocks; use readiness for temporary unavailability.
  • Readiness vs. Startup: A startup probe is intended for applications with very slow startup times, such as legacy applications. It holds off liveness and readiness checks until it succeeds once, after which the liveness and readiness probes take over for the remainder of the container's lifecycle.
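The hand-off between the probes, and the startup time a startup probe allows, can be sketched as follows. Both helper functions are hypothetical, and the time calculation is an approximate upper bound that ignores probe timeouts and scheduling jitter.

```python
def active_probes(has_startup_probe: bool, startup_succeeded: bool) -> set:
    """Return which probe kinds are being executed at this point in time."""
    if has_startup_probe and not startup_succeeded:
        # Liveness and readiness checks are held back until startup succeeds.
        return {"startup"}
    return {"liveness", "readiness"}


def max_startup_seconds(
    initial_delay_seconds: int, period_seconds: int, failure_threshold: int
) -> int:
    """Approximate time a container gets to start before it is restarted."""
    return initial_delay_seconds + period_seconds * failure_threshold
```

For example, a startup probe with `periodSeconds` of 10 and `failureThreshold` of 30 gives the application roughly 300 seconds to come up, without having to loosen the liveness probe that runs afterwards.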

06

Critical for Deployment Strategies

Readiness probes are foundational for safe rolling updates, canary deployments, and blue-green deployments. During a rolling update, Kubernetes waits for new pods to pass their readiness probes before terminating old ones (within the Deployment's maxSurge and maxUnavailable limits), ensuring continuous service availability. For a canary, a new version pod is created; only after its readiness probe passes is it added to the pool to receive a percentage of live traffic, enabling performance validation without impacting all users.

Readiness Probe vs. Liveness Probe vs. Startup Probe

A comparison of the three primary health check mechanisms used by Kubernetes to manage container lifecycle and traffic routing.

| Feature | Readiness Probe | Liveness Probe | Startup Probe |
| --- | --- | --- | --- |
| Primary Purpose | Determines if a container is ready to accept network traffic. | Determines if a container is still running and responsive. | Determines if a container application has successfully started. |
| Probe Failure Action | Removes the pod's IP from all Service endpoints. Traffic is not routed to the pod. | Kills the container and restarts it according to the pod's restartPolicy. | Kills the container and restarts it according to the pod's restartPolicy. |
| Typical Use Case | Waiting for dependencies (DB, cache, API) to be ready. Warming up large caches. | Detecting deadlocks or application hangs where the process is running but unresponsive. | Legacy applications with long, unpredictable startup times (e.g., Java apps initializing). |
| Probe Timing | Runs continuously throughout the container's lifecycle after startup. | Runs continuously throughout the container's lifecycle after startup. | Runs only during the container's initial startup phase. Disabled after first success. |
| Impact on Deployment | Prevents new pods from receiving traffic until fully initialized, enabling smooth rollouts. | Forces restart of unhealthy containers, aiding recovery but possibly causing temporary disruption. | Prevents liveness/readiness probes from starting prematurely, avoiding unnecessary restarts during slow startup. |
| Common Configuration (initialDelaySeconds) | 5-30 seconds | 5-30 seconds | 0-60+ seconds (often longer to accommodate slow starts) |
| Configuration Methods | HTTP GET request, TCP socket connection, or Exec command. | HTTP GET request, TCP socket connection, or Exec command. | HTTP GET request, TCP socket connection, or Exec command. |
| Default State if Not Configured | Container is assumed ready immediately upon starting. | Container is assumed live. No probe-triggered restarts; the container is restarted only if its process exits, according to restartPolicy. | Not configured by default. Liveness/readiness probes begin as soon as the container starts, subject to their own initialDelaySeconds. |

READINESS PROBE

Frequently Asked Questions

A readiness probe is a critical health check mechanism in container orchestration that determines if an application instance is fully initialized and ready to accept network traffic. These questions address its core function, configuration, and role in reliable deployments.

A readiness probe is a periodic diagnostic check performed by a container orchestrator (like Kubernetes) to determine if a containerized application is ready to serve requests. It works by executing a configured test—such as an HTTP GET request, a TCP socket check, or a command execution inside the container—at a defined interval. If the probe succeeds, the container's endpoint is added to the service's load balancing pool. If it fails, the endpoint is removed, preventing traffic from being sent to an unprepared instance.

Key Mechanism:

  • Orchestrator Action: The kubelet on the node executes the probe.
  • Probe Types: HTTP (checks status code), TCP (checks socket connection), Exec (runs a command).
  • Traffic Control: Determines whether the pod is listed in Service endpoints, which load balancers and service meshes use for routing.
  • Failure Handling: The pod's Ready condition becomes False, and it is excluded from service discovery.

About the author

Prasad Kumkar

CEO & MD, Inference Systems

Prasad Kumkar is the CEO & MD of Inference Systems and writes about AI systems architecture, LLM infrastructure, model serving, evaluation, and production deployment. Over 5+ years, he has worked across computer vision models, L5 autonomous vehicle systems, and LLM research, with a focus on taking complex AI ideas into real-world engineering systems.

His work and writing cover AI systems, large language models, AI agents, multimodal systems, autonomous systems, inference optimization, RAG, evaluation, and production AI engineering.