Kayenta

Kayenta is an open-source, automated canary analysis service that performs statistical comparisons of metrics between control and canary deployments to provide a deployment verdict.

PRODUCTION CANARY ANALYSIS

What is Kayenta?

Kayenta is an open-source, automated canary analysis service developed by Netflix and Google that performs statistical comparisons of metrics between control and canary deployments to provide a deployment verdict.

Kayenta is an Automated Canary Analysis (ACA) service that statistically compares a canary deployment (a new version) against a control deployment (the stable baseline) using a configured set of metrics. It automates the evaluation of a release's health by analyzing Service Level Indicators (SLIs) like error rates, latency, and throughput, and it provides an objective deployment verdict (promote or roll back) based on predefined success criteria. This makes it a core component of progressive delivery and continuous deployment pipelines.

The service integrates with monitoring backends like Prometheus, Datadog, and Stackdriver to fetch time-series metrics. Because the control and canary are measured over the same time window, normal traffic patterns affect both sides alike, which helps the analysis isolate statistically significant regressions. Kayenta only produces the judgment; it does not route traffic itself. It is commonly used with Spinnaker for orchestration and alongside traffic-management tools like Istio or Flagger, which handle the canary's traffic split, while the orchestrator acts on the final verdict.

AUTOMATED CANARY ANALYSIS SERVICE

Key Features of Kayenta

Kayenta is an open-source, automated canary analysis service developed by Netflix and Google. It performs statistical comparisons of metrics between control and canary deployments to provide a deployment verdict, enabling safe, data-driven releases.

Automated Statistical Analysis

Kayenta's core function is to automatically compare a comprehensive set of metrics from the canary (new version) against the control (baseline version). It uses statistical tests to determine if observed differences are significant, moving the decision from subjective judgment to objective, data-driven analysis.

Key Tests: The default judge uses the Mann-Whitney U test, a non-parametric test, to compare the distributions of metrics like latency, error rates, and throughput between the canary and the control.
Automated Verdict: Produces a final score from the aggregated metric comparisons and checks it against configured thresholds, removing human guesswork from the deployment decision.

Multi-Metric, Multi-Dimensional Evaluation

The system evaluates deployments across a broad spectrum of metrics—not just technical system health but also business KPIs. It aggregates data from multiple sources to provide a holistic view of the canary's impact.

Metric Types: Analyzes infrastructure metrics (CPU, memory), application metrics (error rates, latency percentiles), and business metrics (conversion rates, revenue per session).
Multi-Dimensional Analysis: Can segment analysis by dimensions like geographic region, user cohort, or device type to detect issues that only affect specific subsets of traffic.

Integration with Existing Observability Stacks

Kayenta is designed as a pluggable service that fetches time-series data from industry-standard monitoring backends. It does not replace your observability tools but acts as an analysis layer on top of them.

Supported Providers: Native integrations include Atlas, Datadog, Graphite, New Relic, Prometheus, and Stackdriver.
Unified Analysis: This allows teams to use a single canary analysis service regardless of their underlying monitoring choices, standardizing the release process across an organization.
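
To make the "analysis layer on top" idea concrete, here is a minimal sketch of how the same metric could be requested for both deployments over an identical time window. It targets the Prometheus HTTP API's `/api/v1/query_range` endpoint; the host name, metric name, label name, and scope values are hypothetical, and this is not how Kayenta itself is implemented.

```python
from urllib.parse import urlencode


def build_range_query_url(base_url, promql, start_ts, end_ts, step_s):
    """Build a Prometheus range-query URL for a single PromQL expression."""
    params = {"query": promql, "start": start_ts, "end": end_ts, "step": step_s}
    return f"{base_url}/api/v1/query_range?{urlencode(params)}"


def control_and_canary_urls(base_url, metric, label, control, canary,
                            start_ts, end_ts, step_s=60):
    """Return one URL per deployment; only the scope label value differs."""
    urls = {}
    for role, scope in (("control", control), ("canary", canary)):
        # Hypothetical query shape: per-second rate of a counter, filtered
        # to the deployment identified by `label`.
        promql = f'sum(rate({metric}{{{label}="{scope}"}}[1m]))'
        urls[role] = build_range_query_url(base_url, promql, start_ts, end_ts, step_s)
    return urls


if __name__ == "__main__":
    for role, url in control_and_canary_urls(
        "http://prometheus.example:9090",  # hypothetical host
        "http_requests_total", "version", "stable", "canary-1",
        1700000000, 1700003600,
    ).items():
        print(role, url)
```

Keeping the window and step identical for both requests is the point of the sketch: any difference in the returned series then comes from the deployments, not from how they were sampled.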

Declarative Configuration & Score Thresholds

Success criteria are defined declaratively through configuration files. Users specify which metrics to analyze, their relative importance, and the pass/fail thresholds for each.

Weighted Scoring: Metrics are organized into groups, and each group is assigned a weight. The system calculates a composite score from the weighted group results, and the canary must meet a minimum overall threshold to pass.
Flexible Policies: Configurations define separate marginal and pass score thresholds, and individual metrics can be marked critical so that their failure fails the whole analysis. This allows nuanced policies where some metric regressions are tolerated more than others.
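
The scoring idea can be sketched in a few lines. The example below assumes metrics are bundled into weighted groups whose weights sum to 100, that a group's contribution is its weight times the fraction of its metrics that passed, and that the result is compared against a marginal and a pass threshold. The function names, group names, and numbers are illustrative assumptions, not Kayenta's code.

```python
def composite_score(group_results, group_weights):
    """Weighted score in [0, 100].

    group_results: {group name: [True/False per metric in the group]}
    group_weights: {group name: weight}, weights are expected to sum to 100
    """
    if abs(sum(group_weights.values()) - 100) > 1e-9:
        raise ValueError("group weights must sum to 100")
    score = 0.0
    for group, weight in group_weights.items():
        results = group_results[group]
        if not results:
            raise ValueError(f"group {group!r} has no metric results")
        pass_ratio = sum(results) / len(results)  # fraction of passing metrics
        score += weight * pass_ratio
    return score


def classify(score, marginal, pass_threshold):
    """Map a score onto a verdict using the two thresholds."""
    if score >= pass_threshold:
        return "PASS"
    if score >= marginal:
        return "MARGINAL"
    return "FAIL"


if __name__ == "__main__":
    results = {
        "errors": [True, True],
        "latency": [True, False],
        "saturation": [False],
    }
    weights = {"errors": 50, "latency": 30, "saturation": 20}
    score = composite_score(results, weights)  # 50*1.0 + 30*0.5 + 20*0.0
    print(score, classify(score, marginal=50, pass_threshold=75))
```

With these made-up inputs the score is 65, which lands between the two thresholds, so the verdict is MARGINAL.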

Tight Integration with Spinnaker

Kayenta was built as a core component of the Spinnaker continuous delivery platform. This integration enables automated canary analysis as a native stage within a Spinnaker deployment pipeline.

Pipeline Automation: A Spinnaker pipeline can automatically deploy a canary, trigger Kayenta analysis, and then promote or roll back the release based on the verdict, creating a fully automated, safe deployment workflow.
Centralized Management: Provides a unified interface within Spinnaker for configuring canary analysis and reviewing results.
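
As a rough illustration of the promote-or-roll-back step, the sketch below assumes the pipeline receives one score per analysis interval, aborts as soon as an intermediate score drops below the marginal threshold, and promotes only if the last score reaches the pass threshold. This is one plausible gating policy written for this article; the function name and return values are hypothetical rather than part of Spinnaker's API.

```python
def canary_stage_outcome(interval_scores, marginal, pass_threshold):
    """Decide what the pipeline should do after a run of analysis intervals."""
    if not interval_scores:
        raise ValueError("at least one interval score is required")
    # Any intermediate interval below the marginal threshold ends the canary early.
    for score in interval_scores[:-1]:
        if score < marginal:
            return "ROLL_BACK"
    # The final interval must clear the stricter pass threshold.
    if interval_scores[-1] >= pass_threshold:
        return "PROMOTE"
    return "ROLL_BACK"


if __name__ == "__main__":
    print(canary_stage_outcome([80, 90, 95], marginal=50, pass_threshold=75))
    print(canary_stage_outcome([80, 40, 95], marginal=50, pass_threshold=75))
```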

Objective, Repeatable Release Decisions

By codifying release criteria into configuration, Kayenta ensures that every deployment is evaluated against the same objective standards. This eliminates variance between teams or engineers and builds institutional knowledge around what constitutes a 'safe' release.

Audit Trail: Provides a clear record of which metrics were analyzed, their results, and the final verdict, creating an auditable trail for compliance and post-mortem analysis.
Continuous Improvement: Teams can iteratively refine their metric thresholds and weights based on historical analysis data, continuously improving the precision and safety of their deployment process.

AUTOMATED CANARY ANALYSIS

How Kayenta Works

Kayenta operates by automating the statistical analysis of a canary deployment. It continuously collects a predefined set of canary metrics—such as error rates, latency percentiles, and business KPIs—from both the stable baseline (control) and the new candidate version (canary). The service then executes a series of statistical tests and comparisons against configured thresholds and Service Level Objectives (SLOs). This process evaluates whether the canary's performance is statistically equivalent or superior to the control, or if it shows regressions that warrant a rollback.

The analysis culminates in an automated deployment verdict, promote or roll back, based on the aggregated metric scores. Kayenta integrates with continuous delivery platforms like Spinnaker and monitoring backends such as Prometheus, Datadog, and Stackdriver. Its architecture is metric-source agnostic, allowing teams to define custom judgment criteria and weight different metrics. This provides a repeatable, quantitative gate for progressive rollouts, replacing manual checks with verifiable engineering standards for release safety.

AUTOMATED CANARY ANALYSIS

Kayenta vs. Other Deployment Analysis Tools

A feature comparison of Kayenta against other common tools and platforms used for evaluating canary deployments and progressive rollouts.

Feature / Capability	Kayenta	Generic A/B Testing Framework	Basic Health Check / SLO Monitoring
Primary Purpose	Automated statistical canary analysis for deployment verdicts	Statistical comparison of user-facing variants for product decisions	Threshold-based alerting on service health metrics
Analysis Methodology	Statistical hypothesis testing (e.g., t-tests, Mann-Whitney U) on metric distributions	Frequentist or Bayesian inference on aggregate success rates (e.g., conversion)	Simple rule-based checks (e.g., error rate > X%)
Integration with Deployment Orchestration
Automated Promotion/Rollback Trigger
Real-time Metric Comparison (Control vs. Canary)
Support for Custom Business Metrics
Built-in Metric Aggregation (Min, Max, P95, etc.)
Judgment Configuration (Pass/Fail Criteria)	Flexible, metric-specific thresholds and weights	Typically single primary metric with significance threshold	Static, binary thresholds per metric
Native Cloud Provider Integration (AWS, GCP, etc.)
Requires Service Mesh (e.g., Istio, Linkerd) for Traffic Routing

KAYENTA

Frequently Asked Questions

Kayenta is an open-source, automated canary analysis service developed by Netflix and Google. It provides a statistical framework for comparing a new software version (the canary) against a stable baseline (the control) to determine if the new version is safe to release. These FAQs address its core functionality, integration, and role in modern deployment pipelines.

Kayenta is an open-source, automated canary analysis (ACA) service that performs statistical comparisons of metrics between a stable control deployment and a new canary deployment to generate a deployment verdict. It works by ingesting time-series metrics (e.g., error rates, latency, throughput) from monitoring systems like Prometheus, Datadog, or Stackdriver. Kayenta then executes a configured analysis, which typically involves:

Metric Fetching: Retrieving identical metrics for both the control and canary groups over the same time window.
Statistical Comparison: Applying algorithms to compare the two data series. A common method is the Mann-Whitney U Test, a non-parametric test that assesses if the distributions of the two samples differ significantly.
Score Aggregation: Each metric is assigned a pass/fail status based on configurable thresholds (e.g., a 95% confidence interval). These results are aggregated into an overall score.
Verdict Generation: Based on the aggregated score and a minimum pass threshold, Kayenta outputs a final verdict: PASS (promote the canary), FAIL (roll it back), or MARGINAL (requires manual review).

This automated, data-driven process replaces error-prone manual checks, enabling safe, high-velocity deployments.
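
For readers who want to see the statistical comparison step in code, here is a minimal, standard-library-only sketch of a two-sided Mann-Whitney U test. It uses the normal approximation with a tie correction and no continuity correction, which is only reasonable for moderately large samples. The function names and sample data are made up for illustration, and Kayenta's own judge may differ in its details.

```python
import math


def rank_with_ties(values):
    """Return 1-based ranks (ties get the average rank) and tie-group sizes."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    tie_sizes = []
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        tie_sizes.append(j - i + 1)
        i = j + 1
    return ranks, tie_sizes


def mann_whitney_u(control, canary):
    """Return (U statistic for the control sample, two-sided p-value)."""
    n1, n2 = len(control), len(canary)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")
    ranks, tie_sizes = rank_with_ties(list(control) + list(canary))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    n = n1 + n2
    mean_u = n1 * n2 / 2
    tie_term = sum(t**3 - t for t in tie_sizes) / (n * (n - 1))
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term)
    if var_u == 0:  # every observation is identical, so nothing to distinguish
        return u1, 1.0
    z = (u1 - mean_u) / math.sqrt(var_u)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return u1, p_value


if __name__ == "__main__":
    control_latency_ms = list(range(1, 21))  # made-up samples
    canary_latency_ms = list(range(21, 41))  # clearly shifted upward
    u, p = mann_whitney_u(control_latency_ms, canary_latency_ms)
    print(f"U={u}, p={p:.2e}, significant at 5%: {p < 0.05}")
```

A small p-value only says the two distributions differ; a real judge also needs to know which direction counts as a regression for each metric before turning that into a pass or fail.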


About the author

Prasad Kumkar

CEO & MD, Inference Systems

Prasad Kumkar is the CEO & MD of Inference Systems and writes about AI systems architecture, LLM infrastructure, model serving, evaluation, and production deployment. Over more than five years, he has worked across computer vision models, L5 autonomous vehicle systems, and LLM research, with a focus on taking complex AI ideas into real-world engineering systems.

His work and writing cover AI systems, large language models, AI agents, multimodal systems, autonomous systems, inference optimization, RAG, evaluation, and production AI engineering.
