
Using LLMs to auto-generate system documentation is a strategic entry point for broader code modernization and dark data discovery.
Documentation is the easiest sell: it's the rare project leadership will greenlight because it appears low-risk, yet it unlocks the entire legacy codebase for AI analysis. This creates a sanctioned beachhead for dark data recovery.
The real target is the data model. An LLM like GPT-4 or Claude 3, when tasked with summarizing a COBOL module, must first parse its logic and data flows. This process implicitly builds a semantic map of business rules and entity relationships trapped in the legacy system.
This creates a vectorized knowledge graph. The extracted concepts and dependencies are embedded into a vector database like Pinecone or Weaviate, forming the foundational Retrieval-Augmented Generation (RAG) layer needed for accurate AI assistants.
The output is a byproduct. The generated Markdown or Confluence page is merely evidence of the process. The strategic asset is the structured, queryable understanding of the system now available to fuel agentic AI workflows and automated refactoring tools.
Outdated or missing documentation creates a single point of failure for institutional knowledge. This directly blocks modernization efforts like the Strangler Fig pattern and inflates the cost of Dark Data Recovery.
- Key Benefit 1: A generative AI project immediately surfaces critical knowledge gaps and system dependencies.
- Key Benefit 2: It creates a structured, searchable artifact that becomes the foundation for all subsequent AI work, from RAG to agentic workflows.
Comparing the ROI of using Generative AI for documentation as a low-risk entry point versus traditional or direct modernization approaches.
| Phase & Metric | Traditional Manual Audit | Direct Code Modernization | Generative AI Documentation (Trojan Horse) |
|---|---|---|---|
| Phase 1: Entry Cost & Time | $250k-500k, 6-9 months | $1M+, 12-18 months | $50k-150k, 2-4 months |
| Phase 1: Primary Output | Static PDF/Word docs | Partially refactored code modules | Interactive, queryable knowledge graph |
| Phase 1: Dark Data Discovery | Manual, < 5% of total data | Incidental, focused on code paths | Automated, surfaces 30-50% of trapped data |
| Phase 2: Foundation for RAG | — | Limited to modernized modules | — |
| Phase 2: Data for Model Fine-Tuning | None | Synthetic or limited real data | Validated, historical datasets from docs |
| Phase 3: Unblocks Strangler Fig Pattern | — | — | — |
| Phase 3: Reduces Tech Debt for AI Agents | — | High risk of new debt | — |
| Total 18-Month ROI (Risk-Adjusted) | 0-5% | High variance, -20% to 30% | 200-400% |
Documentation generation is the wedge. It provides immediate, measurable value by converting COBOL copybooks or mainframe logs into searchable knowledge, bypassing initial stakeholder resistance to a full legacy modernization project.
This creates a production-ready data pipeline. The process of parsing, chunking, and embedding legacy artifacts for an LLM like GPT-4 or Claude 3 establishes the exact extract-transform-load (ETL) workflow needed for downstream AI applications like Retrieval-Augmented Generation (RAG).
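The chunking stage of that pipeline can be sketched in a few lines. This is a minimal version assuming fixed-size character windows with overlap; production pipelines typically split on structural boundaries instead, such as COBOL paragraphs or copybook record definitions.

```python
def chunk_text(text: str, size: int = 400, overlap: int = 80) -> list[str]:
    """Split a legacy artifact into overlapping windows for embedding.

    Overlap preserves context that straddles a chunk boundary; real
    pipelines often chunk on structural boundaries rather than fixed
    character counts.
    """
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap
    return chunks

# Synthetic 1000-character document for demonstration.
doc = "".join(chr(65 + i % 26) for i in range(1000))
parts = chunk_text(doc, size=400, overlap=80)
```

The overlap is the design choice that matters: without it, a business rule split across two chunks is invisible to retrieval, because neither chunk contains the whole rule.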
The output is a strategic asset. The generated documentation, stored in a vector database like Pinecone or Weaviate, becomes the first mapped index of your Dark Data, revealing system dependencies and data flows previously invisible to modern tools.
This blueprint enables agentic systems. With a documented and indexed system, you can deploy AI agents using frameworks like LangChain to autonomously query this knowledge base, creating a low-risk testbed for autonomous workflow orchestration.
LLMs like GPT-4 and Claude 3, trained on generic public code, invent plausible-but-false system dependencies and data flows. Treating this output as authoritative documentation creates a dangerous false map of your enterprise.
- Introduces systemic risk for future development and security audits.
- Obscures the true 'Dark Data' flows that need recovery for accurate AI training.
Direct LLM-driven refactoring fails because models lack the business logic and architectural context embedded in legacy systems.
Direct refactoring with an LLM is impossible without first extracting and structuring the implicit business rules and data dependencies trapped in the code. Models like GPT-4 or Claude 3 operate on syntax, not decades of accreted operational logic.
The core failure is missing context. An LLM sees a COBOL copybook or a Java class file, but not the end-to-end transaction flow, the stateful session management, or the side effects on downstream mainframe systems. This leads to functionally broken outputs that compile but corrupt data.
Refactoring is an architectural decision, not a syntactic translation. An LLM cannot decide to decompose a monolith into microservices, select between event-driven or RESTful patterns, or implement the Strangler Fig Pattern for safe incremental replacement.
Evidence from production systems shows LLM-generated refactors for legacy banking logic introduce an average of 15-20 critical logic errors per 1,000 lines of code. The remediation cost exceeds the value of the automated translation, creating negative ROI.
Using generative AI to auto-document legacy systems is not an IT project; it's a strategic wedge for unlocking dark data and enabling broader modernization.
Legacy systems have no living documentation, creating a single point of failure for institutional knowledge. This bottleneck stalls every downstream AI initiative, from RAG to agentic workflows.
Using an LLM to auto-generate documentation is a low-risk, high-reward entry point for discovering and mobilizing dark data.
Your first mission is documentation. Proposing a full legacy system overhaul triggers budget and risk aversion. Instead, pitch a project to auto-generate missing or outdated system documentation using a Large Language Model (LLM). This creates a strategic beachhead for data discovery without declaring war on the existing architecture.
The real objective is data mapping. Frameworks like LangChain or LlamaIndex can ingest COBOL copybooks, JCL scripts, and data dictionaries. As the LLM processes these artifacts to write summaries, it simultaneously builds a semantic map of your data landscape, identifying entities, relationships, and critical business logic trapped in the mainframe.
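Loader frameworks aside, the core extraction can be sketched with stdlib Python. The copybook and regex below are simplified illustrations: real copybooks have REDEFINES, OCCURS, and continuation rules that this deliberately ignores.

```python
import re

# Invented, minimal copybook for illustration.
COPYBOOK = """\
       01  CUSTOMER-RECORD.
           05  CUST-ID        PIC 9(6).
           05  CUST-NAME      PIC X(30).
           05  CUST-BALANCE   PIC S9(7)V99.
"""

# Matches "<level> <name> [PIC <picture>]." on each line.
FIELD_RE = re.compile(r"^\s*(\d+)\s+([A-Z0-9-]+)(?:\s+PIC\s+(\S+))?\.", re.MULTILINE)

def parse_copybook(src: str) -> list[dict]:
    """Extract a rudimentary data dictionary: level, field name, PIC clause."""
    return [
        {"level": int(level), "name": name, "pic": pic or None}
        for level, name, pic in FIELD_RE.findall(src)
    ]

fields = parse_copybook(COPYBOOK)
```

The point is that even this crude pass yields structured entities (field names, types, record hierarchy) that can be handed to the LLM as grounding, rather than asking it to guess the data model from raw text.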
This process reveals the infrastructure gap. The generated documentation will expose dependencies and data flows that your current API wrapping strategy misses. This evidence justifies the next phase: building a federated RAG system using Pinecone or Weaviate to index this newly discovered context, directly feeding your Dark Data Recovery initiatives.
Evidence: A 2023 study by Gartner found that organizations using AI for initial system discovery reduced the time to identify critical data assets for modernization by 70%. This reconnaissance mission provides the actionable intelligence required to plan a true Strangler Fig migration, not another doomed big-bang project.

About the author
CEO & MD, Inference Systems
Prasad Kumkar is the CEO & MD of Inference Systems and writes about AI systems architecture, LLM infrastructure, model serving, evaluation, and production deployment. Over the past five-plus years, he has worked across computer vision models, L5 autonomous vehicle systems, and LLM research, with a focus on taking complex AI ideas into real-world engineering systems.
His work and writing cover AI systems, large language models, AI agents, multimodal systems, autonomous systems, inference optimization, RAG, evaluation, and production AI engineering.
Documentation generation has a tangible, immediate ROI and requires minimal initial system disruption. It bypasses the Infrastructure Gap by working with existing data exports.
- Key Benefit 1: Delivers a working AI product in weeks, not quarters, building stakeholder confidence for larger initiatives like automated code modernization.
- Key Benefit 2: The process inherently performs a lightweight Legacy System Audit, identifying data quality issues and security models that would poison downstream Machine Learning models.

The documentation corpus becomes the first vectorized knowledge base. This is the prerequisite for enterprise-grade Retrieval-Augmented Generation (RAG) and feeding Agentic AI workflows.
- Key Benefit 1: Transforms inert legacy data into an active asset for real-time AI decisioning and predictive maintenance systems.
- Key Benefit 2: Establishes the data governance and explainable AI audit trails required for AI TRiSM compliance and Sovereign AI deployments.

Creating a simple API facade over a legacy database does not solve the underlying data quality issues or create semantic understanding. It's a brittle bridge.
- Key Benefit 1: Generative documentation forces a semantic data strategy, mapping business logic that pure API wrapping misses.
- Key Benefit 2: It provides the context engineering needed for AI agents to correctly interpret legacy system outputs, preventing costly hallucinations and errors in autonomous workflows.

Many AI projects stall in pilot purgatory because they lack a clear path to production integration. A documentation project has a built-in deployment path: the IT and engineering teams themselves.
- Key Benefit 1: Creates immediate utility for human-in-the-loop validation, ensuring the AI output is accurate and building trust.
- Key Benefit 2: The validated documentation becomes the single source of truth for MLOps pipelines and digital twin creation, directly enabling the next phase of Legacy System Modernization.

Accurate, AI-generated system documentation is the control plane for the next step: automated code modernization. It provides the business logic map that LLMs need for reliable refactoring.
- Key Benefit 1: Dramatically reduces the technical debt and risk associated with big bang migrations by enabling incremental, understood changes.
- Key Benefit 2: Empowers AI-native SDLC tools and coding agents to safely interact with and modernize legacy codebases, turning a cost center into a strategic asset.
Evidence: Projects that start with documentation see a 70% higher success rate for subsequent code refactoring phases because the data foundation and stakeholder buy-in are already secured.
Frame the AI not as a writer, but as an interrogator. Use its output to identify gaps, contradictions, and undocumented business logic buried in COBOL or RPG code. This process surfaces the actionable inventory of dark data required for modernization.
- Prioritizes the Strangler Fig Pattern by identifying low-risk, high-value modules for incremental migration.
- Creates the data map needed for effective Retrieval-Augmented Generation (RAG) and agentic workflow design.
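One lightweight way to turn the interrogation into a backlog, assuming you can extract fan-in (caller counts) from the codebase: rank the undocumented modules by how many programs depend on them. The module names and counts below are invented.

```python
# Invented module inventory: fan-in (number of calling programs) is a cheap
# proxy for how much risk an undocumented module carries.
modules = {
    "PAYROLL.CBL": {"callers": 14, "documented": True},
    "TAXCALC.CBL": {"callers": 9,  "documented": False},
    "GLPOST.CBL":  {"callers": 3,  "documented": False},
    "INVREC.CBL":  {"callers": 6,  "documented": True},
}

# Interrogation backlog: undocumented modules, highest fan-in first.
backlog = sorted(
    (name for name, m in modules.items() if not m["documented"]),
    key=lambda name: modules[name]["callers"],
    reverse=True,
)
```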
The real value isn't the PDF manual; it's the structured data catalog and dependency graph generated as a byproduct. This becomes the blueprint for building robust APIs that expose legacy functions to modern AI stacks.
- Enables real-time data feeds for MLOps pipelines and autonomous agents.
- Prevents the technical debt of brittle, point-to-point API wrapping by identifying canonical data sources first.
AI-generated docs inherit the biases, obsolete logic, and security flaws of the source material. Deploying this context into Agentic AI systems or RAG assistants propagates legacy risks at machine speed.
- Violates core AI TRiSM principles for explainability and data protection.
- Amplifies legacy data quality issues, corrupting downstream machine learning models with historical inaccuracies.
The end goal is a living, queryable model—a digital twin of your legacy environment. This allows safe simulation for automated code modernization projects and provides the context layer for hyper-personalized AI-powered consumer experiences.
- Enables shadow mode deployment of new AI agents against the emulated system.
- Solves the data foundation problem for constructing accurate industrial metaverse simulations.
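A shadow-mode harness is conceptually simple: replay the same inputs through the emulated legacy rule and the candidate replacement, and record every divergence instead of letting the candidate take effect. Both rules below are invented stand-ins for real business logic.

```python
def legacy_interest(balance_cents: int) -> int:
    """Emulated legacy rule (invented): 1.5% interest, truncated toward zero."""
    return balance_cents * 15 // 1000

def candidate_interest(balance_cents: int) -> int:
    """Candidate replacement under test: 1.5% with standard rounding."""
    return round(balance_cents * 0.015)

# Shadow mode: the candidate runs alongside the legacy rule; only the
# legacy result would take effect, while mismatches are logged for review.
divergences = []
for balance in (100_000, 123_456, 654_321):
    old, new = legacy_interest(balance), candidate_interest(balance)
    if old != new:
        divergences.append({"input": balance, "legacy": old, "candidate": new})
```

Note that the two implementations differ only in rounding, yet they diverge on real inputs: exactly the kind of subtle behavioral drift that shadow mode exists to catch before cutover.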
Treating this as an IT project guarantees failure. Success requires an executive role—the Chief Dark Data Officer—to own the recovered data as a strategic AI asset and govern its use across sovereign AI infrastructure and confidential computing environments.
- Orchestrates the shift from legacy mainframes to hybrid cloud AI architecture.
- Ensures recovered data fuels competitive advantage in precision medicine and predictive sales orchestration, not just compliance.
Deploy a lightweight LLM agent to analyze codebases (COBOL, RPG) and generate structured documentation. This creates a low-risk, high-value artifact that serves as your Trojan Horse.
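A minimal sketch of the "structured documentation" contract: ask the agent for JSON against a fixed schema and validate the response rather than trusting it. The schema keys and module names are invented for illustration, and the LLM call itself is replaced with a canned response.

```python
import json

# Hypothetical schema for structured module documentation (keys are invented).
DOC_SCHEMA = {"module", "purpose", "inputs", "outputs", "dependencies"}

def build_doc_prompt(module_name: str, source: str) -> str:
    """Prompt asking the LLM for JSON matching DOC_SCHEMA."""
    keys = ", ".join(sorted(DOC_SCHEMA))
    return (
        f"Document the {module_name} module below. Respond with JSON "
        f"containing exactly these keys: {keys}. Quote field names verbatim.\n\n"
        f"{source}"
    )

def validate_doc(raw: str) -> dict:
    """Reject responses that drop schema keys instead of trusting them blindly."""
    doc = json.loads(raw)
    missing = DOC_SCHEMA - doc.keys()
    if missing:
        raise ValueError(f"response missing keys: {sorted(missing)}")
    return doc

# The LLM call is stubbed with a canned response for illustration.
canned = json.dumps({
    "module": "PAYROLL.CBL",
    "purpose": "Compute net pay from the employee master file",
    "inputs": ["EMP-MASTER"],
    "outputs": ["PAY-REGISTER"],
    "dependencies": ["TAXCALC.CBL"],
})
doc = validate_doc(canned)
```

Forcing a schema is what makes the output an artifact rather than prose: the `dependencies` lists across modules can be joined directly into the dependency graph the later phases rely on.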
The generated documentation is not the end goal. It's the map for your next move: dark data recovery. Use the discovered schemas and logic to build targeted extraction pipelines.
An AI-generated documentation system must be governed to avoid creating another unmaintainable artifact. This requires treating the outputs as version-controlled, living assets.
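One minimal governance mechanism, assuming each generated doc carries front-matter recording a digest of the source it was generated from: a CI job can then flag any page whose source has drifted, keeping the docs "living" rather than another snapshot.

```python
import hashlib

def source_digest(src: str) -> str:
    """Short content hash of the source artifact a doc page was generated from."""
    return hashlib.sha256(src.encode()).hexdigest()[:12]

def is_stale(doc_frontmatter: dict, current_source: str) -> bool:
    """A doc page is stale when its recorded digest no longer matches the source."""
    return doc_frontmatter.get("source_digest") != source_digest(current_source)

# Invented example: doc generated against version 1 of a module.
src_v1 = "MOVE A TO B."
doc_meta = {"source_digest": source_digest(src_v1), "generated_by": "llm-doc-agent"}

# The module later changes; the same doc is now out of date.
src_v2 = src_v1 + "\nADD 1 TO B."
```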
With mobilized dark data and understood system logic, you can now fuel the modern AI ecosystem. This is the exit strategy for your Trojan Horse operation.
This approach is strategic, not magical, and it fails if you ignore its core constraint: generative AI cannot recover complex, undocumented business nuance on its own.
We build AI systems for teams that need search across company data, workflow automation across tools, or AI features inside products and internal software.
- Give teams answers from docs, tickets, runbooks, and product data with sources and permissions. Useful when people spend too long searching or get different answers from different systems.
- Use AI to route work, draft outputs, trigger actions, and keep approvals and logs in place. Useful when repetitive work moves across multiple tools and teams.
- Build assistants, guided actions, or decision support into the software your team or customers already use. Useful when AI needs to be part of the product, not a separate tool.
We look at the workflow, the data, and the tools involved. Then we tell you what is worth building first.
1. We understand the task, the users, and where AI can actually help.
2. We define what needs search, automation, or product integration.
3. We implement the part that proves the value first.
4. We add the checks and visibility needed to keep it useful.

The first call is a practical review of your use case and the right next step.
Talk to Us