Glossary

Plan Repair

Plan repair is the process of modifying an existing plan that has failed during execution due to unexpected state changes, often using local modifications instead of full re-planning.

AUTOMATED PLANNING SYSTEMS

What is Plan Repair?

Plan repair is a core capability of automated planning systems: an existing plan is modified after it fails during execution, typically because the environment changed in a way the plan did not anticipate.

Rather than generating a new plan from scratch, plan repair applies local modifications to the failing plan. The term is often used loosely as a synonym for replanning, although strictly speaking replanning discards the failed plan and plans again from the current state. Repair matters for autonomous agents operating in dynamic, uncertain environments, where the assumptions made about the initial world state can become invalid. The goal is to produce a corrected plan that achieves the original objectives from the new, unexpected state while limiting disruption and computational cost relative to full re-planning.

Effective plan repair strategies leverage the structure of the original plan and the nature of the failure. Techniques range from simple patches, like reordering or substituting actions, to more sophisticated methods that reason about causal links and landmarks. This process is a key component of robust execution monitoring within agentic cognitive architectures, enabling systems to recover from setbacks and continue pursuing complex, multi-step goals autonomously.

Core Characteristics of Plan Repair

Localized Modification

Plan repair focuses on making minimal changes to an existing, failed plan rather than discarding it and starting from scratch. This involves identifying the specific point of failure and adjusting subsequent actions to accommodate the new world state. The core principle is incrementalism: preserving as much of the original, valid plan structure as possible to conserve computational effort and maintain plan stability. For example, if a delivery robot finds a door locked, a repair algorithm might insert a 'request key' action rather than re-planning the entire route from the warehouse.
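The locked-door example can be sketched in a few lines. This is a minimal illustration, assuming a STRIPS-style model in which facts are plain strings; the `Action` class, the helper names, and the `request_key` action are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Action:
    name: str
    pre: frozenset                   # facts required before the action
    add: frozenset                   # facts the action makes true
    delete: frozenset = frozenset()  # facts the action makes false

def first_failure(state, plan):
    """Simulate the plan; return (index, missing facts) of the first inapplicable step."""
    state = set(state)
    for i, act in enumerate(plan):
        missing = act.pre - state
        if missing:
            return i, missing
        state = (state - act.delete) | act.add
    return None

def repair_by_insertion(state, plan, library):
    """Insert one library action before the failing step; leave all other steps untouched."""
    plan = list(plan)
    failure = first_failure(state, plan)
    if failure is None:
        return plan
    i, missing = failure
    # Reconstruct the state just before the failing step.
    before = set(state)
    for act in plan[:i]:
        before = (before - act.delete) | act.add
    for cand in library:
        if cand.pre <= before and missing <= cand.add:
            patched = plan[:i] + [cand] + plan[i:]
            if first_failure(state, patched) is None:
                return patched
    return None  # no single-action patch exists

fs = frozenset
open_door = Action("open_door", fs({"at_door", "has_key"}), fs({"door_open"}))
enter_room = Action("enter_room", fs({"at_door", "door_open"}), fs({"in_room"}), fs({"at_door"}))
deliver = Action("deliver", fs({"in_room", "holding_parcel"}), fs({"delivered"}), fs({"holding_parcel"}))
request_key = Action("request_key", fs({"at_door"}), fs({"has_key"}))

observed = {"at_door", "holding_parcel"}  # the robot expected to have the key, but does not
plan = [open_door, enter_room, deliver]
repaired = repair_by_insertion(observed, plan, [request_key])
print([a.name for a in repaired])
```

The three original steps survive unchanged; only one action is spliced in ahead of the failure point.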

Execution Monitoring & Failure Detection

Repair is triggered by a discrepancy between the expected state (as predicted by the plan's effects) and the observed state during execution. This requires continuous execution monitoring to detect failures such as:

Precondition violations: An action's required conditions are not met.
Unexpected state changes: External events alter the world independently of the agent's actions.
Action execution failures: An action is attempted but does not produce its intended effects.

The monitoring system compares sensor readings or state assertions against the plan's timeline to identify the precise moment and nature of the failure.
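A monitor of this kind can be sketched as a set comparison. The function below is an illustrative assumption, not a standard API: it takes the expected and observed states as sets of facts and uses a simple priority order to label the discrepancy with one of the three failure types listed above.

```python
def check_step(expected, observed, last_add, next_pre):
    """Compare expected and observed state after an action has run.

    expected: facts the plan predicts to hold now
    observed: facts reported by sensors
    last_add: add effects of the action that just ran
    next_pre: preconditions of the action about to run
    """
    missing = expected - observed      # predicted but not seen
    unexpected = observed - expected   # seen but not predicted
    if last_add & missing:
        kind = "action_execution_failure"
    elif next_pre - observed:
        kind = "precondition_violation"
    elif missing or unexpected:
        kind = "unexpected_state_change"
    else:
        kind = None                    # execution is on track
    return kind, missing, unexpected

# The door should be open after 'open_door', but the sensor disagrees.
result = check_step(
    expected={"at_door", "door_open"},
    observed={"at_door"},
    last_add={"door_open"},
    next_pre={"door_open"},
)
print(result[0])
```

A real monitor would also need state estimation to cope with noisy or partial sensor data; this sketch assumes the observed state is fully known.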

Replanning vs. Plan Repair

Although the terms are often used interchangeably, replanning and plan repair sit at different points on a spectrum. Classical replanning treats the current, unexpected state as a new initial state and invokes the full planner from scratch. With a sound and complete planner this finds a valid plan whenever one exists, but it can be computationally expensive and may produce a plan that bears little resemblance to the original. Plan repair instead attempts to patch the existing plan. It is often faster in practice when the failure is local, although in the worst case repair is no cheaper than planning from scratch. The choice depends on the severity of the failure, time constraints, and domain dynamics. In highly dynamic environments, where a response is needed quickly, local modifications are usually tried first.

Plan-Space Repair

This is a primary algorithmic approach where repair operations are performed directly on the plan structure itself, not by searching the state space. Common repair operators include:

Action insertion: Adding a new action to achieve a missing precondition.
Action reordering: Changing the sequence of actions to resolve a causal link threat or resource conflict.
Action substitution: Replacing a failed action with a different one that achieves the same subgoal.
Goal re-establishment: Adding actions to re-achieve a goal fact that was made true earlier but has since become false.

These operators are applied iteratively until the plan is consistent and reaches the goal from the current state.
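The iterative application of these operators can be sketched as a bounded search over edited plans. This is one possible arrangement, assuming a STRIPS-style model and a totally ordered plan; the `Action` class, the function names, and the `max_edits` bound are illustrative assumptions. Goal re-establishment is covered here by insertion, since it amounts to inserting an action that re-adds the lost fact.

```python
from collections import deque
from dataclasses import dataclass

@dataclass(frozen=True)
class Action:
    name: str
    pre: frozenset
    add: frozenset
    delete: frozenset = frozenset()

def achieves(state, plan, goal):
    """True if every step is applicable in turn and the goal holds at the end."""
    state = set(state)
    for act in plan:
        if not act.pre <= state:
            return False
        state = (state - act.delete) | act.add
    return goal <= state

def one_step_edits(plan, library):
    """Yield every plan reachable by applying a single repair operator."""
    n = len(plan)
    for i in range(n + 1):
        for act in library:
            yield plan[:i] + (act,) + plan[i:]            # insertion
    for i in range(n):
        for act in library:
            if act != plan[i]:
                yield plan[:i] + (act,) + plan[i + 1:]    # substitution
    for i in range(n - 1):
        yield plan[:i] + (plan[i + 1], plan[i]) + plan[i + 2:]  # adjacent reordering

def repair(state, plan, goal, library, max_edits=2):
    """Breadth-first search over edits, so plans needing fewer edits are found first."""
    start = tuple(plan)
    queue, seen = deque([(start, 0)]), {start}
    while queue:
        cand, depth = queue.popleft()
        if achieves(state, cand, goal):
            return list(cand)
        if depth == max_edits:
            continue
        for nxt in one_step_edits(cand, library):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, depth + 1))
    return None

fs = frozenset
take_elevator = Action("take_elevator", fs({"at_ground", "elevator_working"}), fs({"upstairs"}), fs({"at_ground"}))
take_stairs = Action("take_stairs", fs({"at_ground"}), fs({"upstairs"}), fs({"at_ground"}))

# The elevator is out of order, so the original one-step plan fails.
fixed = repair({"at_ground"}, [take_elevator], fs({"upstairs"}), [take_stairs])
print([a.name for a in fixed])
```

Because the search is breadth-first over the number of edits, the repaired plan differs from the original by as few operator applications as possible within the bound.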

Dependency-Directed Repair

This sophisticated technique analyzes the causal structure of the plan to understand why the failure occurred. It builds a dependency graph linking actions through their preconditions and effects. When a failure is detected (e.g., a required precondition is false), the algorithm backtracks through this graph to find the culprit action whose effect was expected but did not materialize, or whose effect was unexpectedly deleted. Repair then focuses on this subgraph, minimizing changes to unrelated parts of the plan. This method is more informed than blind plan-space operators.
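The backtracking step can be illustrated with a small dependency graph. The sketch below assumes a totally ordered, STRIPS-style plan and derives causal links by recording, for each precondition, the most recent earlier step that added it; the `Step` type and function names are hypothetical.

```python
from typing import NamedTuple

class Step(NamedTuple):
    name: str
    pre: frozenset
    add: frozenset
    delete: frozenset = frozenset()

def causal_links(plan):
    """Map (consumer index, fact) to the producer index; -1 means no earlier step supplies it."""
    last_producer, links = {}, {}
    for i, step in enumerate(plan):
        for fact in step.pre:
            links[(i, fact)] = last_producer.get(fact, -1)
        for fact in step.delete:
            last_producer.pop(fact, None)
        for fact in step.add:
            last_producer[fact] = i
    return links

def culprit(plan, failed_step, missing_fact):
    """Backtrack from the failed precondition to the step that should have supplied it."""
    return causal_links(plan)[(failed_step, missing_fact)]

def dependents(plan, producer):
    """Steps that depend, directly or transitively, on the producer's effects."""
    affected = {producer}
    # Producers always precede consumers, so one pass in index order suffices.
    for (consumer, _fact), source in sorted(causal_links(plan).items()):
        if source in affected:
            affected.add(consumer)
    return sorted(affected)

fs = frozenset
plan = [
    Step("pick_up_key", fs({"at_desk"}), fs({"has_key"})),
    Step("walk_to_door", fs({"at_desk"}), fs({"at_door"}), fs({"at_desk"})),
    Step("unlock_door", fs({"at_door", "has_key"}), fs({"door_unlocked"})),
    Step("open_door", fs({"door_unlocked"}), fs({"door_open"})),
    Step("log_status", fs({"at_door"}), fs({"logged"})),
]

# 'unlock_door' (step 2) fails because 'has_key' is false.
source = culprit(plan, 2, "has_key")
print(plan[source].name, dependents(plan, source))
```

Steps 1 and 4 do not depend on the key, so a repair confined to the reported subgraph leaves them alone.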

Integration with Contingency Planning

Robust autonomous systems often combine plan repair with contingency planning. Before execution, the planner may generate a primary plan alongside a set of expected failure modes and pre-computed repair strategies or branching points. During execution, if a monitored failure matches a predicted contingency, the corresponding repair patch can be applied without further search. This hybrid approach blends the efficiency of pre-computation with the flexibility of runtime repair, and it is well suited to domains with known, high-probability uncertainties.
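A minimal sketch of this hybrid lookup follows. It assumes failures are identified by a pair (failed action, missing fact) and that pre-computed patches are stored in a dictionary; both the failure signature and the `runtime_repair` callback are assumptions made for illustration.

```python
def handle_failure(failed_action, missing_fact, remaining, contingencies, runtime_repair):
    """Apply a pre-computed patch if the failure was anticipated, else repair at runtime."""
    patch = contingencies.get((failed_action, missing_fact))
    if patch is not None:
        # Anticipated failure: splice the stored patch in front of the remaining steps.
        return "contingency", list(patch) + list(remaining)
    # Unanticipated failure: defer to whatever runtime repair procedure is available.
    return "runtime", runtime_repair(failed_action, missing_fact, remaining)

# Patches computed before execution for failures considered likely.
contingencies = {("open_door", "has_key"): ["request_key"]}

def no_repair(failed_action, missing_fact, remaining):
    """Placeholder for a runtime repair procedure; returns None to signal 'not repaired'."""
    return None

remaining = ["open_door", "enter_room"]
anticipated = handle_failure("open_door", "has_key", remaining, contingencies, no_repair)
surprise = handle_failure("enter_room", "door_open", remaining, contingencies, no_repair)
print(anticipated, surprise)
```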

How Plan Repair Works

Plan repair modifies an existing action sequence after execution fails, typically because of an unexpected state deviation or an action failure.

Instead of discarding the entire plan and starting a costly full re-planning cycle, repair algorithms attempt local modifications, such as inserting, deleting, or reordering actions, to restore the plan's feasibility from the current, altered world state. When the failure is local this is often cheaper than complete re-planning, which makes it valuable for autonomous agents operating in non-deterministic, real-world environments where perfect execution cannot be guaranteed.

Effective plan repair relies on maintaining a causal link structure from the original plan, which records the dependencies between actions and the subgoals they achieve. When a failure is detected, the system identifies the broken causal links, that is, preconditions or goals that are no longer supported, and searches for minimal patches to re-establish them. A common basis is partial-order causal link (POCL) planning, which follows a least-commitment strategy: new actions are inserted to re-establish open preconditions, and ordering constraints are added to resolve threats to existing causal links. The goal is to produce a valid plan that achieves the original objectives from the new current state with minimal disruption to the remaining, still-valid plan steps.
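Identifying broken causal links can be sketched as a filter over the links that span the current execution point. The `CausalLink` type and the index conventions below are assumptions made for this example: a producer index of -1 stands for the initial state, and a link is "active" once its producer has run and its consumer has not.

```python
from typing import NamedTuple

class CausalLink(NamedTuple):
    producer: int  # index of the step that establishes the fact; -1 = initial state
    fact: str
    consumer: int  # index of the step that needs the fact

def broken_links(links, next_step, observed):
    """Return active links whose fact does not hold in the observed state.

    next_step is the index of the next step to execute, so steps with a
    smaller index have already run.
    """
    return [
        link for link in links
        if link.producer < next_step <= link.consumer  # producer done, consumer pending
        and link.fact not in observed
    ]

links = [
    CausalLink(-1, "holding_parcel", 3),
    CausalLink(0, "has_key", 1),
    CausalLink(1, "door_open", 2),
    CausalLink(2, "in_room", 3),
]

# Step 0 has run, but the key was dropped before step 1 could use it.
broken = broken_links(links, next_step=1, observed={"holding_parcel"})
print(broken)
```

Links whose producer has not yet run, such as the one for `door_open`, are not reported, because nothing has gone wrong with them yet. Only the reported links need to be re-established by the repair.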

PLAN REPAIR

Frequently Asked Questions

Plan repair is a critical capability for autonomous systems operating in dynamic environments. This FAQ addresses common technical questions about the mechanisms, trade-offs, and applications of modifying plans during execution.

Plan repair is the process of modifying an existing action sequence that has failed or become suboptimal during execution because of unexpected changes in the environment. It works by detecting a discrepancy between the expected and observed world state, then locally adjusting the plan (for example by removing invalid actions, reordering steps, or inserting corrective actions) instead of starting a computationally expensive full re-planning cycle. Common techniques include plan-space repair and the use of execution monitors that trigger repair when preconditions are violated.

Related Terms

Plan repair operates within a broader ecosystem of automated planning concepts. These related terms define the formalisms, algorithms, and processes that enable the creation, validation, and dynamic adjustment of action sequences.

Automated Planning

The computational process of generating a sequence of actions, known as a plan, that transforms an initial state of the world into a desired goal state. It is the foundational discipline from which plan repair emerges.

Core Problem: Given a formal description of the initial state, goal state, and possible actions, find a valid sequence.
Contrast with Plan Repair: Automated planning typically generates a plan from scratch, while plan repair modifies an existing, failing plan.
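For contrast with repair, the core problem can be sketched as a search from scratch. The breadth-first forward search below is a deliberately simple illustration, assuming a STRIPS-style encoding with actions given as (name, preconditions, add effects, delete effects) tuples; practical planners rely on heuristics and far more compact representations.

```python
from collections import deque

def plan_bfs(initial, goal, actions):
    """Breadth-first forward search; returns a shortest list of action names, or None."""
    start = frozenset(initial)
    queue, seen = deque([(start, [])]), {start}
    while queue:
        state, path = queue.popleft()
        if goal <= state:
            return path
        for name, pre, add, delete in actions:
            if pre <= state:                      # action is applicable
                nxt = (state - delete) | add      # apply its effects
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, path + [name]))
    return None  # the goal is unreachable

fs = frozenset
actions = [
    ("request_key", fs({"at_door"}), fs({"has_key"}), fs()),
    ("open_door", fs({"at_door", "has_key"}), fs({"door_open"}), fs()),
    ("enter_room", fs({"at_door", "door_open"}), fs({"in_room"}), fs({"at_door"})),
]

found = plan_bfs({"at_door"}, fs({"in_room"}), actions)
print(found)
```

Full replanning amounts to calling a planner like this again with the observed state as `initial`, whereas plan repair starts from the existing plan and edits it.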

Plan Execution

The phase where a generated plan's sequence of actions is dispatched to actuators or simulators to physically or virtually change the state of the world. Plan repair is triggered by failures detected during execution monitoring.

Execution Monitoring: Continuously compares the expected state (from the plan's predictions) with the observed state (from sensors).
Failure Detection: A discrepancy between expected and observed state signals the need for repair. Common causes include action failure, exogenous events, or modeling errors.

Contingent Planning

A planning paradigm that generates conditional plans (e.g., trees or policies) specifying different future actions based on the outcomes of sensory observations made during execution. It is a proactive alternative to reactive plan repair.

Key Structure: Plans are often conditional branches (if-else) or full policies mapping belief states to actions.
Use Case vs. Repair: Ideal for domains with predictable uncertainty (e.g., sensor noise). Plan repair is used when uncertainty is unpredictable or the contingent plan's branches are insufficient.

Replanning

Often used synonymously with plan repair, but can imply a more comprehensive approach. Strictly, replanning involves discarding the failed plan and initiating a new planning cycle from the current (unexpected) state.

Full Replanning: Treats the current state as a new initial state and calls the planner again. This can be computationally expensive, but the resulting plan is not constrained by the structure of the failed one.
Contrast with Local Repair: Plan repair seeks minimal modifications (e.g., splicing in a new action sequence) to preserve the valid parts of the original plan, which is often more efficient.

Execution Monitoring

The continuous process of tracking plan progress and comparing the predicted state effects of actions against the real-world observed state. It is the critical subsystem that identifies the need for plan repair.

Components: Includes state estimation (determining the current world state) and discrepancy detection.
Triggers: Monitors for action failures (an action's preconditions were met but its effects did not occur), unexpected state changes (exogenous events), or violated state invariants.

Temporal Planning

A class of automated planning that deals with actions having explicit durations, concurrent execution, and temporal constraints between plan events. Plan repair in temporal domains must respect these complex constraints.

Added Complexity: Repair must adjust action start/end times and manage resource profiles, not just action ordering.
Example: In a manufacturing schedule, if a machine breaks down, repairing the plan may involve rescheduling subsequent jobs on other machines while respecting delivery deadlines.
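The manufacturing example can be sketched as a greedy rescheduling pass. This is a simplified illustration under stated assumptions: each job runs on any machine with the same duration, jobs on healthy machines keep their slots, displaced jobs are appended in earliest-deadline order to whichever healthy machine frees up first, and a job interrupted by the breakdown restarts from the beginning. The field names and the `reschedule` function are hypothetical.

```python
def reschedule(jobs, machines, broken, now):
    """Move unfinished jobs off a broken machine; return (schedule, late job names)."""
    healthy = [m for m in machines if m != broken]
    if not healthy:
        raise ValueError("no healthy machine left")
    free_at = {m: now for m in healthy}   # earliest time each healthy machine is free
    schedule, displaced = {}, []
    for job in jobs:
        end = job["start"] + job["duration"]
        if job["machine"] == broken and end > now:
            displaced.append(job)         # not finished when the machine broke
            continue
        schedule[job["name"]] = (job["machine"], job["start"])  # slot is unchanged
        if job["machine"] != broken:
            free_at[job["machine"]] = max(free_at[job["machine"]], end)
    late = []
    for job in sorted(displaced, key=lambda j: j["deadline"]):
        machine = min(healthy, key=lambda m: free_at[m])
        start = free_at[machine]
        free_at[machine] = start + job["duration"]
        schedule[job["name"]] = (machine, start)
        if free_at[machine] > job["deadline"]:
            late.append(job["name"])      # repair could not meet this deadline
    return schedule, late

jobs = [
    {"name": "A", "machine": "M1", "start": 0, "duration": 2, "deadline": 5},
    {"name": "B", "machine": "M1", "start": 2, "duration": 3, "deadline": 10},
    {"name": "C", "machine": "M2", "start": 0, "duration": 4, "deadline": 6},
    {"name": "D", "machine": "M1", "start": 5, "duration": 2, "deadline": 8},
]

# Machine M1 breaks down at time 2, after job A has finished.
schedule, late = reschedule(jobs, ["M1", "M2"], broken="M1", now=2)
print(schedule, late)
```

A non-empty `late` list signals that this local repair is insufficient and a broader replanning step may be needed.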

About the author

Prasad Kumkar

CEO & MD, Inference Systems

Prasad Kumkar is the CEO & MD of Inference Systems and writes about AI systems architecture, LLM infrastructure, model serving, evaluation, and production deployment. Over 5+ years, he has worked across computer vision models, L5 autonomous vehicle systems, and LLM research, with a focus on taking complex AI ideas into real-world engineering systems.

His work and writing cover AI systems, large language models, AI agents, multimodal systems, autonomous systems, inference optimization, RAG, evaluation, and production AI engineering.
