AI Integration for OpenShift AI
A practical guide to augmenting Red Hat OpenShift AI's native MLOps tooling with custom AI agents for production-grade model operations and data scientist support.

Where AI Fits into the OpenShift AI Stack
OpenShift AI provides a curated stack of open-source tools (such as Jupyter, Kubeflow, and KServe) for building, training, and serving models. AI integration focuses on augmenting these core surfaces with intelligent automation. Key integration points include the Model Serving layer (monitoring inference logs for drift and performance), the Pipeline Orchestration engine (optimizing Tekton or Argo workflows for resource efficiency), and the JupyterHub interface (providing in-notebook code and data assistance). AI agents can plug into these layers via Kubernetes Operators, webhooks, and the platform's REST APIs to act as a co-pilot for the entire AI lifecycle.
For implementation, an AI agent typically runs as a separate service or Job within the OpenShift AI project, with RBAC scoped to specific namespaces. It consumes events from the OpenShift AI dashboard API, Prometheus metrics for model endpoints, and pipeline run logs. Use cases are operational: an agent can analyze failed pipeline steps, suggest code fixes or parameter adjustments, and automatically retry with new resource requests. For model serving, it can trigger canary deployments or scale replicas based on predicted latency, moving beyond simple threshold-based alerts. This turns the platform from a static orchestration tool into a self-optimizing system.
Rollout requires careful governance, as these agents interact with production models and data. Start by deploying an AI agent in audit-only mode alongside a non-critical pipeline or development model serving endpoint. Use OpenShift's built-in audit trails and Red Hat Quay image scanning to ensure the agent's own containers are secure. Gradually introduce approval workflows, where the agent's suggestions—like a pipeline parameter change or a model rollback—require a human sign-off via a Slack webhook or ServiceNow ticket before execution. This controlled integration maximizes platform utility while maintaining the security and compliance posture expected in enterprise Kubernetes environments.
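The audit-only and approval steps described above can be sketched as a small gate that records every proposal and runs nothing until a human approves it. This is a minimal illustration, not a platform feature: `ApprovalGate`, `Proposal`, and the status strings are hypothetical names, and the real sign-off would arrive from a Slack webhook or ServiceNow ticket rather than a direct method call.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Proposal:
    """A change the agent wants to make, held until a human decides."""
    description: str
    action: Callable[[], str]
    status: str = "pending"  # pending -> approved -> executed
    result: str = ""


@dataclass
class ApprovalGate:
    """Hypothetical gate: logs every step and blocks unapproved actions."""
    audit_only: bool = True
    audit_log: List[str] = field(default_factory=list)

    def propose(self, description: str, action: Callable[[], str]) -> Proposal:
        self.audit_log.append(f"proposed: {description}")
        return Proposal(description, action)

    def approve(self, proposal: Proposal, approver: str) -> None:
        proposal.status = "approved"
        self.audit_log.append(f"approved by {approver}: {proposal.description}")

    def execute(self, proposal: Proposal) -> bool:
        # In audit-only mode nothing runs, even if it was approved.
        if self.audit_only or proposal.status != "approved":
            self.audit_log.append(f"skipped: {proposal.description}")
            return False
        proposal.result = proposal.action()
        proposal.status = "executed"
        self.audit_log.append(f"executed: {proposal.description}")
        return True
```

Starting with `audit_only=True` gives a record of what the agent would have done, which can be reviewed before the flag is ever switched off.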
Key Integration Surfaces in OpenShift AI
Model Serving & Inference Layer
Integrate AI agents directly with OpenShift AI's KServe and ModelMesh serving runtimes to automate operational workflows. This surface is ideal for agents that monitor model performance, manage A/B testing, and handle canary rollouts.
Key integration points:
- KServe InferenceService CRDs: Agents can watch for changes to `InferenceService` resources to trigger scaling, logging, or alerting actions.
- ModelMesh Serving: Use the REST/gRPC endpoints of deployed models for programmatic inference, enabling agents to act as orchestrators or quality gates.
- Serving Runtime Metrics: Pull Prometheus metrics from the `odh-model-controller` and `odh-modelmesh` namespaces to detect latency spikes, error rates, or drift.
Example Agent Workflow: An AI agent analyzes inference latency percentiles. If P99 exceeds a threshold, it automatically scales the InferenceService replica count and posts an alert to a Slack channel for the data science team.
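The scaling decision in this workflow can be sketched in a few lines. The 500 ms threshold, the proportional scaling rule, and the replica cap are assumptions chosen for illustration. `replica_patch` builds a merge-patch body for the `spec.predictor.minReplicas` field of a KServe `InferenceService`; applying it to the cluster and posting to Slack are left out.

```python
import math
from typing import Dict, Sequence


def percentile(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile; pct is in (0, 100]."""
    if not samples:
        raise ValueError("no latency samples")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def desired_replicas(latencies_ms: Sequence[float], current: int,
                     threshold_ms: float = 500.0, max_replicas: int = 10) -> int:
    """Scale up in proportion to the P99 overshoot, capped at max_replicas."""
    p99 = percentile(latencies_ms, 99)
    if p99 <= threshold_ms:
        return current
    scaled = math.ceil(current * p99 / threshold_ms)
    # Always add at least one replica when the threshold is breached.
    return min(max(scaled, current + 1), max_replicas)


def replica_patch(replicas: int) -> Dict:
    """Merge-patch body that sets the predictor's minimum replica count."""
    return {"spec": {"predictor": {"minReplicas": replicas}}}
```

Keeping the decision separate from the patch makes it easy to log the recommendation first and only apply it once the agent is trusted to act.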
High-Value AI Use Cases for OpenShift AI
Extend Red Hat OpenShift AI's managed MLOps platform with custom AI agents and copilots to automate operational workflows, enhance data scientist productivity, and optimize the full model lifecycle from development to production.
Model Performance & Drift Monitoring Agent
Deploy an AI agent that continuously analyzes model serving metrics, inference logs, and data drift from the OpenShift AI Model Serving layer. The agent correlates performance dips with cluster events (node pressure, network latency) and suggests retraining triggers or scaling adjustments, moving monitoring from periodic review to proactive alerting.
Jupyter Notebook Data Science Copilot
Embed a context-aware assistant within OpenShift AI's JupyterHub environment. It analyzes notebook code, suggests relevant datasets from connected data sources, explains pipeline errors, and generates documentation snippets. This reduces time spent on environment setup and debugging, letting data scientists focus on experimentation.
Pipeline Optimization & Failure Triage
Integrate an AI agent with OpenShift AI Pipelines (Kubeflow/Tekton) to analyze execution logs and resource metrics. The agent identifies recurring failure patterns (e.g., OOM errors in specific steps), suggests parameter tuning or resource limit increases, and can auto-generate RCA summaries for failed runs, accelerating CI/CD for ML.
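As a rough sketch of the triage step, the agent can label each failed step's log with a failure class and flag the classes that recur. The regular expressions, the `(step_name, log_text)` input shape, and the 1.5x memory bump are illustrative assumptions; a production agent would read logs from the pipeline backend and would likely pair these rules with an LLM summary.

```python
import re
from collections import Counter
from typing import Dict, Iterable, Tuple

# Hypothetical failure signatures; extend these for your own pipelines.
PATTERNS = {
    "oom": re.compile(r"OOMKilled|out of memory|MemoryError", re.IGNORECASE),
    "image_pull": re.compile(r"ImagePullBackOff|ErrImagePull"),
    "timeout": re.compile(r"timed? ?out|DeadlineExceeded", re.IGNORECASE),
}


def classify(log_text: str) -> str:
    """Return the first matching failure class, or 'unknown'."""
    for label, pattern in PATTERNS.items():
        if pattern.search(log_text):
            return label
    return "unknown"


def recurring_failures(runs: Iterable[Tuple[str, str]],
                       min_count: int = 2) -> Dict[Tuple[str, str], int]:
    """Count (step, failure class) pairs and keep those seen min_count+ times."""
    counts = Counter((step, classify(log)) for step, log in runs)
    return {key: n for key, n in counts.items()
            if n >= min_count and key[1] != "unknown"}


def suggest_memory_mib(current_mib: int, factor: float = 1.5,
                       cap_mib: int = 16384) -> int:
    """Propose a larger memory limit for an OOM-prone step, up to a cap."""
    return min(int(current_mib * factor), cap_mib)
```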
Intelligent Resource Scheduling & GPU Management
Augment OpenShift AI's scheduler with an AI layer that analyzes job queues, GPU utilization metrics, and user priorities. It predicts resource needs for pending experiments, suggests optimal nvidia.com/gpu allocations, and can preempt low-priority workloads to maximize hardware ROI for high-value training jobs.
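One simple way to reason about the preemption behavior described here is a greedy pass over pending jobs in priority order. The sketch below is an assumption-laden toy, not how the OpenShift scheduler works: `Job`, the integer priorities, and the rule that only strictly lower-priority running jobs may be preempted are all hypothetical, and GPU counts stand in for `nvidia.com/gpu` requests.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Job:
    name: str
    priority: int  # higher means more important
    gpus: int      # GPUs requested


def schedule(pending: List[Job], running: List[Job],
             total_gpus: int) -> Tuple[List[str], List[str]]:
    """Return (jobs to start, jobs to preempt) for one scheduling pass."""
    free = total_gpus - sum(j.gpus for j in running)
    victims = sorted(running, key=lambda j: j.priority)  # lowest priority first
    to_start: List[str] = []
    to_preempt: List[str] = []
    for job in sorted(pending, key=lambda j: -j.priority):
        available, freed = free, []
        for victim in victims:
            if available >= job.gpus:
                break
            if victim.priority < job.priority:
                freed.append(victim)
                available += victim.gpus
        # Only preempt if doing so actually lets the job fit.
        if available >= job.gpus:
            for victim in freed:
                victims.remove(victim)
                to_preempt.append(victim.name)
            free = available - job.gpus
            to_start.append(job.name)
    return to_start, to_preempt
```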
Compliance & Audit Workflow Automation
Automate governance tasks by connecting AI to the OpenShift AI dashboard and underlying Kubernetes APIs. An agent reviews model registry entries for missing metadata, checks pipeline runs for PII data handling, and generates audit-ready reports on model lineage and data usage for regulated industries.
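The metadata review can start as a plain completeness check before any LLM is involved. The required field names below are hypothetical, as is the dictionary shape of a registry entry; substitute whatever your model registry actually exposes.

```python
from typing import Dict, Iterable, List

# Hypothetical required fields for an audit-ready registry entry.
REQUIRED_FIELDS = ("owner", "version", "training_data_uri", "license")


def missing_metadata(entry: Dict) -> List[str]:
    """List required fields that are absent, None, or blank."""
    missing = []
    for name in REQUIRED_FIELDS:
        value = entry.get(name)
        if value is None or not str(value).strip():
            missing.append(name)
    return missing


def audit_report(entries: Iterable[Dict]) -> Dict[str, List[str]]:
    """Map each non-compliant model name to its missing fields."""
    report = {}
    for entry in entries:
        gaps = missing_metadata(entry)
        if gaps:
            report[entry.get("name", "<unnamed>")] = gaps
    return report
```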
Self-Service Onboarding & Environment Provisioning
Implement a natural-language interface for data scientists to request and configure OpenShift AI projects. An AI agent interprets requests (e.g., "set up a PyTorch env with 2 GPUs and access to S3 bucket X"), validates quotas, generates the necessary ResourceClaim and Secret manifests, and guides users through the approval workflow.
Example AI Agent Workflows for OpenShift AI
These workflows illustrate how custom AI agents can augment Red Hat OpenShift AI's native MLOps tooling, automating operational tasks and providing intelligent support to data science teams. Each pattern connects to specific APIs, data sources, and user surfaces within the platform.
Trigger: Scheduled cron job (e.g., daily) or event from the OpenShift AI model monitoring dashboard.
Context Pulled:
- Model inference logs and performance metrics from the OpenShift AI model server (Seldon or KServe).
- Historical baseline metrics stored in a platform-integrated data store (e.g., Prometheus, S3 via Data Science Pipelines).
- Model metadata (version, training data signature) from the OpenShift AI model registry.
Agent Action:
- The agent retrieves current inference metrics (latency, throughput, error rate, custom business metrics) and compares them against the defined baseline, using stability indices such as PSI and CSI to quantify distribution shifts.
- It analyzes the feature distribution of recent inference requests versus the training dataset.
- Using an LLM, the agent generates a plain-English summary of the drift: type (data, concept), severity, and likely impacted segments.
System Update:
- The agent creates a Jira ticket via webhook or updates an existing OpenShift AI dashboard alert with the analysis summary.
- It can optionally trigger a retraining pipeline in OpenShift AI Pipelines (Tekton) by submitting a new `PipelineRun` with parameters derived from the analysis.
- It sends a formatted Slack/MS Teams notification to the responsible data science team, including links to the relevant dashboard and pipeline run.
Human Review Point: The notification and proposed retraining pipeline require team approval before execution, managed through the OpenShift AI UI or linked ITSM ticket.
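The retraining step and its review point can be sketched as a function that builds a Tekton `PipelineRun` manifest from the drift analysis. This assumes a Tekton-backed pipeline as described above. The pipeline name and parameter names are hypothetical, and setting `spec.status` to `PipelineRunPending` is one possible way to hold the run until someone approves it; verify that your Tekton version supports pending runs before relying on it.

```python
from typing import Dict


def build_retraining_run(pipeline: str, drift_summary: Dict,
                         pending: bool = True) -> Dict:
    """Build a PipelineRun manifest whose params come from the drift analysis."""
    manifest = {
        "apiVersion": "tekton.dev/v1",
        "kind": "PipelineRun",
        # generateName lets the API server pick a unique suffix per run.
        "metadata": {"generateName": f"{pipeline}-retrain-"},
        "spec": {
            "pipelineRef": {"name": pipeline},
            "params": [
                {"name": key, "value": str(value)}
                for key, value in sorted(drift_summary.items())
            ],
        },
    }
    if pending:
        # Created in a pending state so a human can release it after review.
        manifest["spec"]["status"] = "PipelineRunPending"
    return manifest
```

The returned dictionary can be serialized to YAML or JSON and attached to the Jira ticket or chat notification, so reviewers see exactly what would run.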
Typical Implementation Architecture
A production AI integration for OpenShift AI extends the platform's native MLOps tooling with custom agents, orchestration, and governance layers.
The integration typically layers AI agents and workflows onto three key surfaces of the OpenShift AI stack: the JupyterHub environment for data scientist support, the model serving layer (KServe, Seldon) for runtime monitoring and optimization, and the pipeline orchestration engine (Kubeflow Pipelines, Tekton) for intelligent workflow automation. Agents connect via the platform's REST APIs and Kubernetes Custom Resource Definitions (CRDs), ingesting logs, metrics, and metadata from these components to provide contextual assistance, predictive analysis, and automated actions.
A common pattern involves deploying a central orchestrator agent as a service within the OpenShift AI project namespace. This agent subscribes to events (e.g., pipeline failures, model drift alerts, resource quota breaches) via webhooks and Kubernetes event exporters. It then routes tasks to specialized worker agents, such as a notebook copilot that helps debug code in JupyterLab, a serving optimizer that suggests canary weights or instance scaling, or a data curator that validates pipeline input datasets. These agents act on the user's behalf, executing approved actions through the OpenShift AI API or generating pull requests for GitOps-managed configurations.
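The routing role of the orchestrator can be illustrated with a small dispatch table. The event types, the `type` key on the event dictionary, and the worker functions are hypothetical; in practice the events would arrive over a webhook and the workers would be separate services.

```python
from typing import Callable, Dict, List, Optional

Handler = Callable[[dict], str]


class Orchestrator:
    """Hypothetical router that hands each event to one worker agent."""

    def __init__(self) -> None:
        self._routes: Dict[str, Handler] = {}
        self.unrouted: List[dict] = []

    def register(self, event_type: str, handler: Handler) -> None:
        self._routes[event_type] = handler

    def dispatch(self, event: dict) -> Optional[str]:
        handler = self._routes.get(event.get("type", ""))
        if handler is None:
            # Keep unknown events for later inspection instead of dropping them.
            self.unrouted.append(event)
            return None
        return handler(event)


def serving_optimizer(event: dict) -> str:
    return f"review scaling for {event['model']}"


def pipeline_triage(event: dict) -> str:
    return f"triage failed run {event['run_id']}"
```

A usage example: register `serving_optimizer` under `"model_drift"` and `pipeline_triage` under `"pipeline_failure"`, then call `dispatch` for each incoming event.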
Governance and rollout are managed through OpenShift's native RBAC and Projects. AI agents are granted service accounts with scoped permissions, and all agent decisions and actions are logged to the cluster's audit trails and optionally to a dedicated vector store for explainability. The rollout is often phased, starting with read-only monitoring and alert summarization for platform admins, then progressing to assisted actions within data science projects, and finally enabling automated remediation for predefined, low-risk operational scenarios like pipeline retries or log cleanup.
Code and Payload Examples
Analyzing Model Serving Endpoints
Integrate AI agents with OpenShift AI's model serving layer (KServe, Seldon) to monitor performance, detect drift, and trigger retraining. Agents can poll serving metrics, analyze prediction logs, and compare live data against training distributions.
Example Python agent checking a KServe endpoint:
```python
# OpenShiftAIAgent comes from the inference_systems package used in this
# article; it is an illustrative client wrapper, not a Red Hat SDK.
from inference_systems.agent import OpenShiftAIAgent

agent = OpenShiftAIAgent()

# Get model serving endpoint details from OpenShift AI
endpoint = agent.get_serving_endpoint(
    project_name="fraud-detection",
    model_name="v2",
)

# Fetch recent inference logs via OpenShift AI's monitoring API
logs = agent.query_inference_logs(
    endpoint_url=endpoint["metrics_url"],
    timeframe="24h",
    limit=1000,
)

# Calculate prediction drift using your chosen statistical test.
# calculate_psi and training_stats are not defined in this snippet: supply
# your own PSI implementation and a reference sample captured at training time.
drift_score = calculate_psi(
    reference_distribution=training_stats,
    current_distribution=logs["predictions"],
)

if drift_score > 0.25:
    # Trigger automated retraining workflow
    agent.create_retraining_job(
        model_id=endpoint["model_id"],
        trigger="drift_detected",
        priority="high",
    )
```
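The example above calls `calculate_psi` without defining it. A minimal standard-library sketch of the Population Stability Index follows, assuming numeric prediction scores and equal-width bins derived from the reference sample. The bin count, the `eps` floor for empty bins, and the severity labels are assumptions. The 0.1 and 0.25 cut-offs are a common rule of thumb, with 0.25 matching the threshold used in the example.

```python
import math
from typing import List, Sequence


def calculate_psi(reference_distribution: Sequence[float],
                  current_distribution: Sequence[float],
                  bins: int = 10, eps: float = 1e-6) -> float:
    """PSI = sum over bins of (cur - ref) * ln(cur / ref)."""
    if not reference_distribution or not current_distribution:
        raise ValueError("both samples must be non-empty")
    lo, hi = min(reference_distribution), max(reference_distribution)
    width = (hi - lo) / bins or 1.0  # avoid zero width for constant samples

    def proportions(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            # Values outside the reference range fall into the edge bins.
            counts[min(max(idx, 0), bins - 1)] += 1
        # Floor empty bins at eps so the logarithm stays defined.
        return [max(c / len(values), eps) for c in counts]

    ref_p = proportions(reference_distribution)
    cur_p = proportions(current_distribution)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_p, cur_p))


def drift_severity(psi: float) -> str:
    """Rule-of-thumb labels; tune the cut-offs for your own models."""
    if psi < 0.1:
        return "none"
    if psi < 0.25:
        return "moderate"
    return "significant"
```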
Realistic Operational Impact and Time Savings
This table shows the measurable impact of integrating custom AI agents and copilots into Red Hat OpenShift AI workflows, focusing on reducing manual toil for data scientists, MLOps engineers, and platform teams.
| Metric | Before AI | After AI | Notes |
|---|---|---|---|
| Pipeline failure root cause analysis | Manual log review (30-90 mins) | Automated correlation & suggestion (2-5 mins) | AI analyzes Tekton logs, model dependencies, and resource events to pinpoint likely cause |
| Model performance drift detection review | Weekly manual dashboard checks | Proactive anomaly alerts with context | AI monitors Seldon/Prometheus metrics, correlates with data pipeline changes |
| Jupyter environment provisioning | Ticket-based, 1-2 business days | Self-service via AI agent, <15 mins | AI validates resource requests against quotas and team policies before automated provisioning |
| Model serving resource right-sizing | Static allocation, periodic manual review | Continuous recommendation engine | AI analyzes inference latency, concurrency, and cost to suggest CPU/GPU adjustments |
| Experiment tracking and artifact lineage | Manual notebook logging, prone to gaps | Automated metadata capture and query | AI agent parses notebook cells and pipeline runs to populate MLflow/Weights & Biases |
| Data scientist support (common queries) | Search documentation, ask peers (20+ mins) | Contextual AI copilot in JupyterLab | Agent grounded in internal docs, cluster policies, and past resolved issues |
| Compliance scan for model registry images | Scheduled weekly scans, manual triage | Pre-promotion scan integrated in CI/CD | AI evaluates CVEs against runtime context to prioritize critical fixes |
| Multi-model endpoint canary analysis | Manual A/B test setup and metric review | Automated canary rollout with guardrails | AI monitors business and performance metrics, suggests rollback or full promotion |
Governance, Security, and Phased Rollout
A practical framework for deploying AI agents on OpenShift AI with security, compliance, and controlled adoption in mind.
Integrating AI agents into OpenShift AI requires a security-first approach that aligns with the platform's native governance model. This means embedding AI workflows within OpenShift's existing Role-Based Access Control (RBAC), Project and Namespace isolation, and audit logging framework. For instance, an AI agent that analyzes Jupyter notebook logs for cost optimization should only have service account permissions scoped to the specific project, with all its API calls and tool usage logged to the cluster's audit trail. Sensitive operations, like modifying a model server's resource limits, should be gated behind OpenShift's approval workflows or integrated with external policy engines like OPA Gatekeeper to ensure compliance before execution.
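To make the scoping concrete, here is a sketch that builds a namespaced read-only `Role` and its `RoleBinding` as plain dictionaries. The role name, service account name, and the exact resource list are assumptions; the API groups (`rbac.authorization.k8s.io`, `serving.kserve.io`) and verbs are standard Kubernetes and KServe values. Creating the objects in the cluster is left to your usual tooling.

```python
from typing import Dict

READ_VERBS = {"get", "list", "watch"}


def scoped_role(namespace: str, name: str = "ai-agent-readonly") -> Dict:
    """Read-only Role limited to one project namespace."""
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "Role",
        "metadata": {"name": name, "namespace": namespace},
        "rules": [
            {"apiGroups": ["serving.kserve.io"],
             "resources": ["inferenceservices"],
             "verbs": ["get", "list", "watch"]},
            {"apiGroups": [""],
             "resources": ["pods", "pods/log"],
             "verbs": ["get", "list"]},
        ],
    }


def role_binding(namespace: str, role_name: str, service_account: str) -> Dict:
    """Bind the Role to the agent's ServiceAccount in the same namespace."""
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "RoleBinding",
        "metadata": {"name": f"{role_name}-binding", "namespace": namespace},
        "subjects": [{"kind": "ServiceAccount", "name": service_account,
                      "namespace": namespace}],
        "roleRef": {"apiGroup": "rbac.authorization.k8s.io",
                    "kind": "Role", "name": role_name},
    }


def is_read_only(role: Dict) -> bool:
    """True if no rule grants a verb beyond get/list/watch."""
    return all(set(rule["verbs"]) <= READ_VERBS for rule in role["rules"])
```

Using a `Role` rather than a `ClusterRole` keeps the agent's permissions inside a single project, which matches the isolation model described above.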
A phased rollout is critical for managing risk and proving value. Start with a read-only observation phase, where AI agents monitor model serving metrics, pipeline runtimes, and resource utilization within the OpenShift AI dashboard and Prometheus metrics, generating reports and alerts without taking action. The next phase introduces assistive automation in low-risk areas, such as suggesting optimized ResourceQuotas for data science projects or drafting Git commit messages for pipeline code changes in the connected Git repository. The final phase enables closed-loop automation for pre-approved workflows, like auto-scaling model inference deployments based on predicted traffic or triggering re-training pipelines when data drift is detected in the model registry.
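The three phases can be encoded as a simple policy check that the agent consults before acting. The phase names and the action-to-phase mapping below are hypothetical; the point of the sketch is that unknown actions are denied by default and that each phase unlocks a strictly larger set of actions.

```python
from enum import IntEnum


class Phase(IntEnum):
    OBSERVE = 1   # read-only monitoring, reports, and alerts
    ASSIST = 2    # suggestions in low-risk areas
    AUTOMATE = 3  # closed-loop automation for pre-approved workflows


# Hypothetical mapping from agent action to the earliest phase that allows it.
REQUIRED_PHASE = {
    "generate_report": Phase.OBSERVE,
    "suggest_resource_quota": Phase.ASSIST,
    "draft_commit_message": Phase.ASSIST,
    "scale_inference_deployment": Phase.AUTOMATE,
    "trigger_retraining": Phase.AUTOMATE,
}


def is_allowed(action: str, current_phase: Phase) -> bool:
    """Deny unknown actions; allow known ones once their phase is reached."""
    required = REQUIRED_PHASE.get(action)
    return required is not None and current_phase >= required
```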
Governance extends to the AI models and prompts themselves. Treat prompts as configuration managed in GitOps repositories alongside your application code, with changes peer-reviewed and deployed through OpenShift's CI/CD pipelines. Use OpenShift AI's native experiment tracking and model lineage features to maintain an audit trail of which model version an agent consulted for a recommendation. For production-critical agents, implement a human-in-the-loop step using OpenShift's notification system to Slack or Teams, requiring manual approval before executing significant changes to cluster resources or model deployments. This layered approach ensures AI augments your platform team's capabilities without introducing unmanaged risk or operational chaos.
For related architectural patterns on securing and orchestrating AI agents within enterprise platforms, see our guides on AI Governance and LLMOps Platforms and leveraging API Management and Gateway Platforms for secure, policy-enforced tool calling.
Frequently Asked Questions
Common questions about extending Red Hat OpenShift AI with custom AI agents, model monitoring, and data scientist support workflows.
AI agents connect to OpenShift AI via its APIs and Kubernetes-native architecture, typically operating as sidecar containers or separate services within the same project namespace.
Typical Integration Points:
- JupyterHub API: Agents can be injected into notebook sessions to provide real-time code suggestions, dependency troubleshooting, or data exploration guidance based on the notebook's kernel and imported libraries.
- ModelMesh/KServe Inference Endpoints: Agents monitor inference logs, latency, and error rates. They can trigger automated scaling events or route traffic away from underperforming model servers.
- OpenShift AI Dashboard: Custom plugins or API calls can surface agent-generated insights—like recommended pipeline parameters or data drift alerts—directly in the data scientist's console.
- S3-Compatible Object Storage (for pipelines): Agents analyze pipeline artifacts and metrics stored in MinIO or cloud storage to suggest optimizations for subsequent runs.
Security & Permissions: Agents use ServiceAccounts with RBAC roles scoped to the project, ensuring they only access permitted resources like specific notebook pods or inference services.

About the author
Prasad Kumkar
CEO & MD, Inference Systems
Prasad Kumkar is the CEO & MD of Inference Systems and writes about AI systems architecture, LLM infrastructure, model serving, evaluation, and production deployment. Over 5+ years, he has worked across computer vision models, L5 autonomous vehicle systems, and LLM research, with a focus on taking complex AI ideas into real-world engineering systems.
His work and writing cover AI systems, large language models, AI agents, multimodal systems, autonomous systems, inference optimization, RAG, evaluation, and production AI engineering.