In the Monte Carlo Tree Search (MCTS) algorithm, the playout policy (or rollout policy) is the decision rule applied during the simulation phase to play quickly from a newly expanded node to a terminal state. Its primary function is to generate a sample outcome (a win, a loss, or a numeric reward) that is used to estimate the value of the node where the simulation began. Because the search depends on running many simulations, the policy must be cheap to evaluate: it is typically a uniform random selection among legal actions, a lightweight heuristic, or an inexpensive approximation of strong play. The quality and speed of the policy together determine the statistical efficiency of the search. A more informed policy can make each sampled outcome more informative, but it also costs more per move, so fewer simulations fit within a fixed computational budget.
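
The idea can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration rather than part of any particular MCTS implementation: the toy game (players alternately take 1 to 3 stones, and whoever takes the last stone wins), the function names `playout`, `uniform_policy`, `heuristic_policy`, and `estimate_value`, and the `max_steps` safety limit. It shows one playout running from a given state to a terminal state under a pluggable policy, and a value estimate formed by averaging many such outcomes.

```python
import random
from typing import Callable, List, Tuple

# Hypothetical toy game: state is (stones remaining, player to move: 0 or 1).
State = Tuple[int, int]
Policy = Callable[[State, List[int], random.Random], int]


def legal_actions(state: State) -> List[int]:
    stones, _ = state
    return list(range(1, min(3, stones) + 1))  # take 1, 2, or 3 stones


def apply_action(state: State, action: int) -> State:
    stones, player = state
    return (stones - action, 1 - player)


def is_terminal(state: State) -> bool:
    return state[0] == 0


def outcome_for_player0(state: State) -> float:
    # At a terminal state, the player to move is the one who did NOT
    # take the last stone, so the other player has won.
    _, player_to_move = state
    return 1.0 if player_to_move == 1 else 0.0


def uniform_policy(state: State, actions: List[int], rng: random.Random) -> int:
    # Cheapest playout policy: pick any legal action with equal probability.
    return rng.choice(actions)


def heuristic_policy(state: State, actions: List[int], rng: random.Random) -> int:
    # Lightweight heuristic (assumed for this toy game): leave the opponent
    # a multiple of 4 stones when possible, otherwise move at random.
    remainder = state[0] % 4
    if remainder in actions:
        return remainder
    return rng.choice(actions)


def playout(state: State, policy: Policy, rng: random.Random,
            max_steps: int = 1000) -> float:
    # Simulation phase: follow the policy until a terminal state is reached.
    steps = 0
    while not is_terminal(state):
        if steps >= max_steps:
            raise RuntimeError("playout exceeded max_steps")
        action = policy(state, legal_actions(state), rng)
        state = apply_action(state, action)
        steps += 1
    return outcome_for_player0(state)


def estimate_value(state: State, policy: Policy, n_playouts: int,
                   seed: int = 0) -> float:
    # Monte Carlo estimate: average outcome over independent playouts.
    rng = random.Random(seed)
    total = sum(playout(state, policy, rng) for _ in range(n_playouts))
    return total / n_playouts


if __name__ == "__main__":
    start = (10, 0)
    print("uniform:  ", estimate_value(start, uniform_policy, 1000))
    print("heuristic:", estimate_value(start, heuristic_policy, 1000))
```

In this sketch both players follow the same policy during a playout, which is a common simplification. Swapping `uniform_policy` for `heuristic_policy` changes only the per-move decision rule, which makes the trade-off concrete: the heuristic does slightly more work per move and produces outcomes that depend less on chance.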
