Simulated Annealing (SA) is a metaheuristic optimization algorithm that finds an approximate global optimum in a large, complex search space. It is a probabilistic technique inspired by the physical process of annealing in metallurgy, where a material is heated and slowly cooled to reduce defects and minimize its internal energy. The algorithm's core mechanism is its ability to accept moves to worse states with a probability that decreases over time, controlled by a temperature parameter and a defined cooling schedule. This allows it to escape local optima that trap simpler algorithms like Hill Climbing.
Glossary
Simulated Annealing

What is Simulated Annealing?
A probabilistic optimization algorithm inspired by metallurgical annealing, designed to escape local optima by occasionally accepting worse solutions.
The algorithm begins at a high temperature, where it frequently accepts worse solutions to explore the state space broadly. As the temperature cools according to the schedule, it becomes increasingly greedy, converging toward a local—and ideally global—optimum. Key parameters include the initial temperature, cooling rate, and stopping criterion. It is widely applied to NP-hard problems like the traveling salesman, VLSI design, and scheduling, where exhaustive search is infeasible. Within Agentic Cognitive Architectures, SA provides a robust planning and decision-making component for agents navigating environments with deceptive reward landscapes.
Key Features of Simulated Annealing
Simulated Annealing is a probabilistic metaheuristic for global optimization, inspired by the physical annealing process in metallurgy. Its defining features enable it to escape local optima and navigate complex, high-dimensional search spaces.
Probabilistic Acceptance of Worse Moves
The core mechanism that differentiates Simulated Annealing from greedy algorithms like Hill Climbing. At each iteration, the algorithm may accept a new candidate solution that is worse than the current one, with a probability given by the Metropolis criterion: P(accept) = exp(-ΔE / T), where ΔE is the change in solution cost (or energy) and T is the current temperature. This stochastic acceptance allows the search to escape local optima by traversing through valleys in the energy landscape.
Temperature Parameter & Cooling Schedule
The temperature (T) is a control parameter that dictates the exploration-exploitation balance. A high initial temperature encourages widespread exploration (high probability of accepting worse moves). The cooling schedule is a deterministic function that gradually reduces the temperature over time, slowly shifting the algorithm's behavior from exploration to exploitation. Common schedules include:
- Geometric Cooling:
T_{k+1} = α * T_k, whereαis a constant (e.g., 0.95). - Linear Cooling:
T_{k+1} = T_k - β. - Logarithmic Cooling:
T_{k+1} = T_0 / log(k), which guarantees convergence to the global optimum but is impractically slow.
Asymptotic Convergence Guarantee
Under specific theoretical conditions, Simulated Annealing can be proven to converge to the global optimum with probability 1. This guarantee requires an infinitely slow cooling schedule, such as the logarithmic schedule T(k) ≥ C / log(1 + k), where C is a problem-dependent constant. While impractical for finite-time computation, this property establishes it as a global optimization method, unlike purely local search algorithms. In practice, faster cooling schedules are used to find high-quality solutions efficiently.
Neighborhood Structure
The algorithm's performance is heavily dependent on how a neighbor is defined for a given solution. The neighborhood structure is the set of all states reachable from the current state by applying a small, predefined perturbation (e.g., swapping two cities in a TSP tour, flipping a bit in a binary string). A well-designed neighborhood should:
- Be efficiently computable.
- Allow the search to traverse the entire state space.
- Provide a smooth gradient for the algorithm to follow. Poor neighborhood design can render the temperature schedule ineffective.
Application to Discrete & Continuous Spaces
While classically applied to combinatorial optimization problems (e.g., Traveling Salesperson Problem, job scheduling, circuit placement), variants exist for continuous domains. For continuous optimization, the neighborhood move is typically a random step in parameter space, often drawn from a Gaussian distribution. The step size can be annealed alongside the temperature. This makes it a versatile tool for optimizing agent policies, neural network weights, or hyperparameters where gradient information is unavailable or unreliable.
Connection to Markov Chain Monte Carlo (MCMC)
Simulated Annealing is fundamentally linked to MCMC sampling methods. At a fixed temperature T, the sequence of states visited by the algorithm forms a Markov chain whose stationary distribution is the Boltzmann distribution π(s) ∝ exp(-E(s) / T). This means the algorithm spends more time in low-energy (high-quality) states. Annealing can be viewed as running a series of MCMC chains at progressively lower temperatures, gradually focusing the sampling on the deepest minima. This statistical mechanics perspective provides a robust theoretical foundation.
Frequently Asked Questions
A probabilistic optimization technique inspired by metallurgy, used to find global optima in complex search spaces by allowing controlled exploration of worse states.
Simulated Annealing is a probabilistic optimization algorithm designed to approximate the global optimum of a given function in a large search space. It works by iteratively proposing random moves to neighboring states. A move to a better state is always accepted, while a move to a worse state is accepted with a probability determined by the Metropolis criterion: P(accept) = exp(-ΔE / T), where ΔE is the change in the objective function (energy) and T is a global temperature parameter. The algorithm starts with a high temperature, allowing frequent exploration of worse states to escape local optima. Over time, the temperature is gradually reduced according to a cooling schedule, making the algorithm increasingly greedy and converging towards a (hopefully global) optimum.
Key components include:
- State: A candidate solution to the optimization problem.
- Energy (Cost) Function: The objective function to be minimized.
- Neighbor Generation Function: Defines how to perturb the current state to create a new candidate.
- Temperature (T): Controls the probability of accepting worse moves.
- Cooling Schedule: The rule for decreasing
Tover time (e.g.,T_{k+1} = α * T_k, whereαis a decay rate like 0.95).
Enabling Efficiency, Speed & Accuracy
Intelligent Analysis, Decision & Execution
We build AI systems for teams that need search across company data, workflow automation across tools, or AI features inside products and internal software.
Talk to Us
Search across company data
Give teams answers from docs, tickets, runbooks, and product data with sources and permissions.
Useful when people spend too long searching or get different answers from different systems.

Automate internal workflows
Use AI to route work, draft outputs, trigger actions, and keep approvals and logs in place.
Useful when repetitive work moves across multiple tools and teams.

Add AI to products and internal tools
Build assistants, guided actions, or decision support into the software your team or customers already use.
Useful when AI needs to be part of the product, not a separate tool.
Related Terms
Simulated Annealing is a foundational metaheuristic within the broader family of optimization and search algorithms. These related techniques share the goal of efficiently navigating complex state spaces, but employ different strategies for exploration, exploitation, and escaping local optima.
Hill Climbing
Hill Climbing is a simple local search algorithm and a direct conceptual precursor to Simulated Annealing. It iteratively moves to a neighboring state that improves the objective function value, akin to always climbing uphill.
- Key Difference: Unlike Simulated Annealing, it never accepts worse moves. This makes it highly efficient but极易陷入 local optima from which it cannot escape.
- Use Case: Ideal for convex or simple optimization landscapes where the global optimum is easily reachable from many starting points.
- Variants: Include Stochastic Hill Climbing (randomly selects a better neighbor) and Random-Restart Hill Climbing (runs the algorithm multiple times from different start points).
Tabu Search
Tabu Search is another metaheuristic designed to escape local optima by using memory. It maintains a tabu list—a short-term memory of recently visited states or moves—that are forbidden (tabu) for a number of iterations.
- Mechanism: This enforced "forgetting" prevents cycling and encourages exploration of new regions of the search space. It can accept worsening moves if they are not tabu.
- Contrast with SA: While SA uses probabilistic acceptance and a cooling schedule, Tabu Search uses deterministic rules guided by memory. Tabu Search often requires careful tuning of the tabu list size and aspiration criteria (rules that override the tabu status).
- Application: Extensively used in scheduling, routing, and other combinatorial optimization problems.
Genetic Algorithms
Genetic Algorithms (GAs) are population-based metaheuristics inspired by natural selection. Instead of iteratively improving a single solution (like SA), GAs maintain a population of candidate solutions.
- Core Operators: Solutions are combined and mutated using crossover and mutation operators to create a new generation. Selection pressure favors fitter solutions.
- Exploration vs. Exploitation: The population allows for parallel exploration of the search space. Crossover exploits building blocks of good solutions, while mutation introduces exploration.
- Comparison: GAs are inherently parallel and can handle complex, multi-modal landscapes well. They are less directly analogous to physical processes than SA but share the goal of global optimization. They often require defining a genetic representation (encoding) of the problem.
Monte Carlo Tree Search (MCTS)
Monte Carlo Tree Search (MCTS) is a heuristic search algorithm for optimal decision-making, famous for its success in Go. It combines tree search with random sampling.
- Four-Step Process: 1) Selection: Traverse the tree using a policy (e.g., UCB1) to select a node. 2) Expansion: Add a child node. 3) Simulation: Run a random playout (rollout) to a terminal state. 4) Backpropagation: Update the node statistics with the result.
- Stochastic Element: Like SA's random acceptance of worse states, MCTS uses random simulations to estimate the value of states, which provides a broad, exploratory view of the search space.
- Domain: Primarily used in sequential decision-making problems like games and planning, whereas SA is typically applied to static optimization problems.
Local Beam Search
Local Beam Search is a variant of hill climbing that maintains not one, but k current states (the beam width) simultaneously. At each iteration, it generates all successors of all k states, then selects the k best from the combined set.
- Parallel Hill Climbing: It can be seen as running
khill climbs in parallel, with limited information sharing (only the best successors are kept). - Difference from SA: It is a deterministic, greedy algorithm within the beam. It lacks SA's explicit mechanism (probabilistic acceptance) to escape local optima, though a stochastic beam search variant selects successors probabilistically.
- Trade-off: It uses more memory than hill climbing but can explore a slightly broader area. It can still converge all beams to the same local optimum.
Metropolis-Hastings Algorithm
The Metropolis-Hastings Algorithm is a Markov Chain Monte Carlo (MCMC) method for sampling from a probability distribution. Simulated Annealing is directly derived from it.
- Foundational Logic: Given a current state
x, propose a new statex'. Accept the move with probabilitymin(1, P(x')/P(x)). This generates a Markov chain whose stationary distribution isP(x). - Connection to SA: Simulated Annealing adapts this for optimization by treating
P(x)asexp(-f(x)/T), wheref(x)is the cost function andTis the temperature. The cooling schedule gradually reducesT, making the distribution increasingly concentrated on low-cost states. - Significance: This formal probabilistic foundation is what allows SA to asymptotically converge to a global optimum with a sufficiently slow cooling schedule, providing theoretical guarantees not present in simpler heuristics.

About the author
Prasad Kumkar
CEO & MD, Inference Systems
Prasad Kumkar is the CEO & MD of Inference Systems and writes about AI systems architecture, LLM infrastructure, model serving, evaluation, and production deployment. Over 5+ years, he has worked across computer vision models, L5 autonomous vehicle systems, and LLM research, with a focus on taking complex AI ideas into real-world engineering systems.
His work and writing cover AI systems, large language models, AI agents, multimodal systems, autonomous systems, inference optimization, RAG, evaluation, and production AI engineering.
Partnered with leading AI, data, and software stack.
How We Work
Custom AI workflows for your Business
One-fit-all AI don't work for modern businesses. At Inferensys, we aim to understand your business & custom requirements; which we use to define most efficient agentic workflows, the data, and the tools for your business.
01
Review the use case
We understand the task, the users, and where AI can actually help.
Read more02
Pick the right approach
We define what needs search, automation, or product integration.
Read more03
Build the first useful version
We implement the part that proves the value first.
Read more04
Improve from there
We add the checks and visibility needed to keep it useful.
Read moreThe first call is a practical review of your use case and the right next step.
Talk to Us