Autonomous diagnostics transforms reactive maintenance into a proactive, self-healing system. By integrating AI agents with physical equipment, you create a system that can autonomously detect, diagnose, and even guide the remediation of faults. This process hinges on building a knowledge graph of machine failure modes and using a small language model (SLM) like Phi-3 to reason over technical manuals and sensor data in natural language, generating actionable root-cause analysis.
Setting Up Autonomous Diagnostics for Manufacturing Equipment

Introduction
This guide provides a blueprint for creating an autonomous diagnostic agent that interprets error codes, sensor readings, and maintenance logs for manufacturing equipment.
The outcome is a significant reduction in mean-time-to-repair (MTTR). The diagnostic agent integrates directly with collaborative robotics (cobots) and human technicians, providing step-by-step repair guidance. This approach is a core component of modern self-healing physical infrastructure, moving beyond simple alerts to closed-loop correction. For foundational concepts, see our guide on How to Architect a Self-Healing Power Grid Controller.
Key Concepts
Building a self-healing manufacturing system requires integrating several core technologies. These concepts form the blueprint for an agent that interprets data, reasons about failures, and guides repairs.
Knowledge Graph of Failure Modes
A knowledge graph structures your diagnostic data, linking error codes, sensor readings, maintenance logs, and repair procedures. This creates a machine-readable map of cause-and-effect relationships.
- Nodes represent entities: specific machine components, error IDs, or sensor types.
- Edges define relationships: error-1234 TRIGGERED_BY overheating_bearing.
- This graph enables the AI to perform multi-hop reasoning, tracing a symptom back to a root cause across interconnected data points, far beyond simple rule-based systems.
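As a rough illustration of multi-hop reasoning, the sketch below walks a tiny in-memory graph from a symptom to its terminal causes. It assumes a plain dictionary rather than a graph database. Only the error-1234 TRIGGERED_BY overheating_bearing edge comes from the text; the other nodes, the CAUSED_BY relation, and the function name `trace_root_causes` are hypothetical.

```python
from collections import deque

# Hypothetical failure-mode graph: node -> list of (relation, target) edges.
# Only the first edge appears in the guide; the rest are invented for illustration.
EDGES = {
    "error-1234": [("TRIGGERED_BY", "overheating_bearing")],
    "overheating_bearing": [("CAUSED_BY", "lubrication_loss")],
    "lubrication_loss": [("CAUSED_BY", "clogged_grease_line")],
}


def trace_root_causes(symptom, edges):
    """Breadth-first walk from a symptom; returns every path that ends at a
    node with no further outgoing (unvisited) edge."""
    paths = []
    queue = deque([[symptom]])
    while queue:
        path = queue.popleft()
        # Skip targets already on this path so a cyclic graph cannot loop forever.
        next_hops = [(rel, dst) for rel, dst in edges.get(path[-1], []) if dst not in path]
        if not next_hops:
            if len(path) > 1:  # a lone symptom with no edges is not a diagnosis
                paths.append(path)
            continue
        for rel, dst in next_hops:
            queue.append(path + [rel, dst])
    return paths


for path in trace_root_causes("error-1234", EDGES):
    print(" -> ".join(path))
```

Each returned path alternates node and relation names, so it can be shown to a technician as an auditable chain of evidence.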
Small Language Model (SLM) for Manuals
A Small Language Model (SLM) like Microsoft's Phi-3 provides the natural language reasoning layer. It is fine-tuned to interpret unstructured text—equipment manuals, technician notes, and forum discussions—and answer diagnostic questions.
- Key Advantage: SLMs offer high accuracy on specialized tasks with lower latency and cost than massive LLMs, making them ideal for real-time, on-premise deployment.
- Use Case: The agent queries the SLM with a sensor anomaly; the model cross-references the manual to suggest probable faulty components and required tools.
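One way to picture the use case above is a prompt that pairs the sensor anomaly with retrieved manual excerpts. This is a minimal sketch, not a Phi-3 API: the function names, the prompt wording, and the `generate` callable (a stand-in for whatever inference wrapper you host locally) are all assumptions.

```python
def build_diagnostic_prompt(anomaly, manual_excerpts):
    """Combine a sensor anomaly with numbered manual excerpts into one prompt."""
    numbered = "\n".join(
        f"[{i}] {text}" for i, text in enumerate(manual_excerpts, start=1)
    )
    return (
        "You are a maintenance assistant. Answer only from the excerpts.\n"
        f"Manual excerpts:\n{numbered}\n"
        f"Sensor anomaly: {anomaly}\n"
        "List the probable faulty components and the tools required, "
        "citing excerpt numbers."
    )


def ask_slm(anomaly, manual_excerpts, generate):
    """`generate` is any callable str -> str wrapping your model (hypothetical)."""
    return generate(build_diagnostic_prompt(anomaly, manual_excerpts))
```

Numbering the excerpts lets the model cite its sources, which makes the answer easier to check against the manual.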
Root-Cause Analysis (RCA) Agent
The RCA Agent is the core orchestrator. It ingests real-time telemetry, queries the knowledge graph and SLM, and synthesizes findings into a diagnostic report.
- Workflow: 1. Ingest sensor alerts and logs. 2. Retrieve related historical failures from the knowledge base. 3. Reason using the SLM to interpret context. 4. Generate a confidence-scored report listing probable root causes.
- This moves diagnostics from reactive alarm monitoring to proactive, evidence-based analysis.
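The four workflow steps can be sketched as a single function. This is a simplified illustration under stated assumptions: the record fields (`symptom`, `root_cause`), the `interpret` callable standing in for the SLM, and the scoring rule (share of matching historical cases) are all hypothetical choices, not the only way to score confidence.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    cause: str
    confidence: float  # share of matching historical cases with this cause
    evidence: list


def run_rca(alert, history, interpret):
    """Return probable root causes for an alert, highest confidence first."""
    # Steps 1-2: take the ingested alert and retrieve related historical failures.
    related = [h for h in history if h["symptom"] == alert["symptom"]]
    # Step 3: `interpret` stands in for the SLM call that adds context.
    context = interpret(alert, related)
    # Step 4: score each candidate cause by how often it explained this symptom.
    counts = {}
    for case in related:
        counts[case["root_cause"]] = counts.get(case["root_cause"], 0) + 1
    report = [
        Hypothesis(cause, n / len(related), [context]) for cause, n in counts.items()
    ]
    return sorted(report, key=lambda h: h.confidence, reverse=True)
```

An empty report signals a novel fault with no historical match, which is a natural trigger for human review.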
Cobot-Guided Repair Procedures
Collaborative Robots (Cobots) act as the physical interface. Once a fault is diagnosed, the system generates step-by-step repair instructions displayed on a cobot's interface or an AR headset worn by a technician.
- The cobot can physically guide the human, highlighting components with a laser pointer or presenting tools.
- This human-in-the-loop approach ensures safety and leverages human dexterity while the AI handles complex planning and information retrieval, drastically reducing mean-time-to-repair (MTTR).
Sensor Fusion & Data Ingestion Pipeline
Reliable diagnostics depend on a robust pipeline that unifies data from disparate sources.
- Sources: Vibration sensors, thermal cameras, PLC error codes, and power quality monitors.
- Technology Stack: Use Apache Kafka or MQTT for real-time streaming. Normalize data into a unified time-series database like InfluxDB or TimescaleDB.
- Fusion: Apply algorithms to correlate events across sensor modalities, turning raw signals into contextualized 'health indicators' for the diagnostic agent.
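A very simple form of fusion is time-window correlation: events from different sensor types that occur close together are treated as one health indicator. The sketch below assumes events for a single machine, each a dict with hypothetical `ts` (seconds) and `modality` keys, and an arbitrary 5-second window. Production systems typically use more sophisticated correlation.

```python
def fuse_events(events, window_s=5.0):
    """Group events that fall within `window_s` of the first event in a group.
    A group that spans two or more modalities becomes a health indicator."""
    indicators = []
    group = []
    for event in sorted(events, key=lambda e: e["ts"]):
        if group and event["ts"] - group[0]["ts"] > window_s:
            indicators.extend(_summarize(group))
            group = []
        group.append(event)
    if group:
        indicators.extend(_summarize(group))
    return indicators


def _summarize(group):
    modalities = sorted({e["modality"] for e in group})
    if len(modalities) < 2:
        return []  # a single sensor type alone is not treated as corroborated
    return [{"start": group[0]["ts"], "modalities": modalities, "event_count": len(group)}]
```

Requiring at least two modalities is a design choice that trades sensitivity for fewer false alarms; adjust it to your equipment.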
Human-in-the-Loop (HITL) Governance
Autonomy requires oversight. A HITL governance layer defines when the system can act alone and when it must seek human approval.
- Confidence Thresholds: Only execute automated procedures (e.g., a cobot-guided step) if the RCA agent's confidence score exceeds 95%.
- Approval Loops: For critical actions or novel fault scenarios, the system pauses and presents its reasoning to a technician for verification.
- This framework builds trust, ensures safety, and is essential for compliance in high-stakes industrial environments. Learn more about designing these systems in our guide on Human-in-the-Loop (HITL) Governance Systems.
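The governance rules above reduce to a small gating function. This sketch takes the 95% threshold from the text; the `Decision` enum, the `gate` function, and the boolean flags for critical and novel cases are hypothetical names chosen for illustration.

```python
from enum import Enum


class Decision(Enum):
    AUTO_EXECUTE = "auto_execute"
    NEEDS_APPROVAL = "needs_approval"


def gate(confidence, is_critical, is_novel, threshold=0.95):
    """Decide whether an automated procedure may run without a technician."""
    # Critical actions and novel faults always go to a human, whatever the score.
    if is_critical or is_novel:
        return Decision.NEEDS_APPROVAL
    # "Exceeds 95%" is read as strictly greater than the threshold.
    if confidence > threshold:
        return Decision.AUTO_EXECUTE
    return Decision.NEEDS_APPROVAL


print(gate(0.97, is_critical=False, is_novel=False))  # Decision.AUTO_EXECUTE
print(gate(0.97, is_critical=True, is_novel=False))   # Decision.NEEDS_APPROVAL
```

Checking the override conditions first means a high confidence score can never bypass the approval loop.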
Step 1: Design the System Architecture
The architecture defines how data flows, where intelligence resides, and how the system scales. A robust design is the prerequisite for effective autonomous diagnostics.
An autonomous diagnostic system is a multi-agent system comprising three core layers: the sensing layer (IoT sensors, PLCs, error logs), the reasoning layer (an SLM that interprets manuals and logs, plus a knowledge graph of failure modes), and the action layer (integration with cobots and a CMMS for guided repair). Data flows from edge sensors to a central data lake, where it is processed for real-time anomaly detection and historical analysis. Keeping the three layers separate means each can be replaced or scaled without changes to the others.
Begin by mapping your physical equipment to a digital twin. Define the communication protocols (e.g., OPC UA, MQTT) for secure data ingestion. Architect the reasoning layer to use a fine-tuned SLM, like Phi-3, for natural language querying of maintenance manuals. The output is a root-cause analysis report and a procedural guide, which is routed to a human technician's interface or directly to a cobot for execution. This design directly reduces mean-time-to-repair (MTTR) by automating the diagnostic bottleneck. For related foundational concepts, see our guide on Human-in-the-Loop (HITL) Governance Systems.
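The three-layer separation can be sketched as a pipeline whose layers are interchangeable callables. Everything here is illustrative: the `Reading`, `Diagnosis`, and `DiagnosticPipeline` types and their fields are assumptions, and real layers would wrap OPC UA or MQTT clients, the SLM, and a CMMS or cobot interface.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Reading:
    machine_id: str
    signal: str
    value: float


@dataclass
class Diagnosis:
    machine_id: str
    root_cause: str
    steps: list


@dataclass
class DiagnosticPipeline:
    sense: Callable[[], list]        # sensing layer: returns Reading objects
    reason: Callable[[list], list]   # reasoning layer: Readings -> Diagnoses
    act: Callable[[Diagnosis], str]  # action layer: routes one Diagnosis

    def run_once(self):
        """Pull readings, diagnose them, and dispatch each diagnosis."""
        readings = self.sense()
        return [self.act(diagnosis) for diagnosis in self.reason(readings)]
```

Because each layer is only a callable, a stub can replace any of them during testing before real hardware is connected.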
Tool Comparison: SLMs and Knowledge Graph Databases
This table compares the two primary reasoning engines for an autonomous diagnostic system: a Small Language Model (SLM) for natural language understanding and a Knowledge Graph Database for structured relationship mapping.
| Feature / Metric | Small Language Model (SLM) | Knowledge Graph Database | Integrated System (Recommended) |
|---|---|---|---|
| Primary Function | Natural language reasoning on unstructured text (manuals, logs) | Storing and querying structured relationships between entities (parts, failures) | SLM queries the knowledge graph to ground its reasoning in factual relationships |
| Data Input | Unstructured text documents, error logs, technician notes | Structured data (CSV, SQL), ontology schemas, entity-relationship models | Both unstructured text and structured data, fused into a unified context |
| Output for Diagnosis | Natural language hypothesis, potential root cause description | Graph path traversal showing connected failure modes, parts, and symptoms | A root-cause analysis report citing specific graph relationships and supporting log excerpts |
| Reasoning Explainability | Medium (can generate step-by-step chain-of-thought) | High (explicit, auditable relationship paths) | High (combines logical graph paths with natural language explanation) |
| Integration with Cobots | Generates natural language repair instructions | Provides structured procedure steps and part location data | Guides the cobot through a verified sequence of repair actions |
| Update Mechanism | Fine-tuning on new data, prompt engineering | CRUD operations, schema evolution, batch ingestion | Continuous learning loop: SLM findings can propose new graph relationships |
| Latency for Query | < 100 ms (on-device inference) | < 10 ms (for local graph traversal) | < 200 ms (combined query and reasoning cycle) |
| Common Tools | Phi-3, Llama 3.1, Gemma (fine-tuned) | Neo4j, Amazon Neptune, TerminusDB | Custom agent orchestrating both (e.g., using LangChain or LlamaIndex) |
Common Mistakes
Implementing autonomous diagnostics for manufacturing equipment is a high-stakes engineering challenge. These are the most frequent technical pitfalls developers encounter and how to fix them.
A frequent pitfall is an agent that produces generic or incorrect repair recommendations. This occurs when the Small Language Model (SLM) lacks grounding in your specific equipment data, typically because a general-purpose model is used without fine-tuning or Retrieval-Augmented Generation (RAG).
Fix:
- Fine-tune your SLM (e.g., Phi-3) on your equipment manuals, historical work orders, and failure logs.
- Implement a multi-hop RAG system where the agent must retrieve relevant schematics, error code definitions, and past case resolutions before generating an answer.
- Use a verification agent to cross-check proposed steps against a knowledge graph of valid procedures before presenting them to a technician.
Without these steps, the agent operates on generic knowledge, leading to dangerous recommendations.
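The verification step in the fix above can be as simple as comparing a proposed step list with an approved sequence. In this sketch a plain dictionary stands in for the knowledge graph of valid procedures, and the procedure name, step names, and `verify_steps` function are hypothetical.

```python
# Hypothetical approved procedures; in practice these would come from the
# knowledge graph rather than a hard-coded dictionary.
VALID_PROCEDURES = {
    "replace_bearing": [
        "lock_out_power",
        "remove_guard",
        "extract_bearing",
        "install_bearing",
        "refit_guard",
    ],
}


def verify_steps(procedure, proposed_steps, valid=VALID_PROCEDURES):
    """Return (ok, problems). A proposal passes only if it matches the
    approved sequence exactly."""
    approved = valid.get(procedure)
    if approved is None:
        return False, [f"unknown procedure: {procedure}"]
    problems = [f"unapproved step: {s}" for s in proposed_steps if s not in approved]
    problems += [f"missing step: {s}" for s in approved if s not in proposed_steps]
    if not problems and proposed_steps != approved:
        problems.append("steps are out of order")
    return not problems, problems
```

A failed check should block the instructions from reaching the technician and return the problem list to the agent for another attempt.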

About the author
Prasad Kumkar
CEO & MD, Inference Systems
Prasad Kumkar is the CEO & MD of Inference Systems and writes about AI systems architecture, LLM infrastructure, model serving, evaluation, and production deployment. Over 5+ years, he has worked across computer vision models, L5 autonomous vehicle systems, and LLM research, with a focus on taking complex AI ideas into real-world engineering systems.
His work and writing cover AI systems, large language models, AI agents, multimodal systems, autonomous systems, inference optimization, RAG, evaluation, and production AI engineering.