The UCT algorithm is a heuristic that guides the tree search phase of Monte Carlo Tree Search (MCTS) by applying the UCB1 (Upper Confidence Bound) formula from multi-armed bandit problems to each node in the search tree. For a given node, it calculates a score that balances exploitation (selecting the child node with the highest average simulated reward) against exploration (selecting less-visited child nodes that may have untapped potential). This balance is critical for efficiently navigating vast state spaces in problems like game playing (e.g., Go, chess) and complex automated planning.
