Bayesian Optimization (BO) is a sequential, sample-efficient strategy for finding the global optimum of an expensive-to-evaluate black-box function. It works by constructing a probabilistic surrogate model (typically a Gaussian Process) to approximate the unknown function, and an acquisition function to decide where to sample next, balancing exploration of uncertain regions against exploitation of known promising areas.
The process is iterative:
1. Build a surrogate model: Fit a probabilistic model (e.g., a Gaussian Process) to all previously observed (input, output) pairs.
2. Define an acquisition function: Use the surrogate's predictive distribution (mean and uncertainty) to compute a utility score for sampling any candidate point. Common choices include Expected Improvement (EI), Upper Confidence Bound (UCB), and Probability of Improvement (PI).
3. Optimize the acquisition function: Find the point that maximizes the acquisition function. This inner problem is much cheaper than the original one, since the acquisition function is inexpensive to evaluate (and often differentiable), though it can be multimodal, so multi-start or dense-sampling methods are typically used.
4. Evaluate the true function: Sample the expensive black-box function at the chosen point.
5. Update the surrogate model: Incorporate the new observation and repeat from step 1 until the evaluation budget is exhausted.
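The loop above can be sketched in a few dozen lines. This is a minimal illustration, not a production implementation: it assumes an RBF-kernel GP surrogate, Expected Improvement as the acquisition function, and a dense candidate grid as the (cheap) inner optimizer. The 1-D objective and all hyperparameters (length scale, noise jitter, `xi`) are illustrative choices.

```python
# Minimal Bayesian Optimization loop: GP surrogate (RBF kernel) + Expected Improvement.
import numpy as np
from scipy.stats import norm

def rbf_kernel(A, B, length_scale=0.3):
    """Squared-exponential kernel between the row vectors of A and B."""
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-0.5 * d2 / length_scale**2)

def gp_posterior(X_train, y_train, X_test, noise=1e-6):
    """GP posterior mean and std at X_test, given (slightly jittered) observations."""
    K = rbf_kernel(X_train, X_train) + noise * np.eye(len(X_train))
    K_s = rbf_kernel(X_train, X_test)
    K_inv = np.linalg.inv(K)
    mu = K_s.T @ K_inv @ y_train
    var = 1.0 - np.sum(K_s * (K_inv @ K_s), axis=0)  # k(x, x) = 1 for the RBF kernel
    return mu, np.sqrt(np.maximum(var, 1e-12))       # clamp tiny negative variances

def expected_improvement(mu, sigma, y_best, xi=0.01):
    """EI for maximization: E[max(f - y_best - xi, 0)] under the GP posterior."""
    z = (mu - y_best - xi) / sigma
    return (mu - y_best - xi) * norm.cdf(z) + sigma * norm.pdf(z)

def objective(x):
    # Stand-in for the expensive black-box function (1-D, to be maximized).
    return -np.sin(3 * x) - x**2 + 0.7 * x

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 2.0, size=(3, 1))      # small initial design
y = objective(X).ravel()
grid = np.linspace(-1.0, 2.0, 500)[:, None]  # candidate pool = cheap inner optimizer

for _ in range(10):
    mu, sigma = gp_posterior(X, y, grid)                 # step 1: fit surrogate
    ei = expected_improvement(mu, sigma, y.max())        # step 2: score candidates
    x_next = grid[np.argmax(ei)].reshape(1, 1)           # step 3: maximize acquisition
    y_next = objective(x_next).ravel()                   # step 4: evaluate true function
    X, y = np.vstack([X, x_next]), np.concatenate([y, y_next])  # step 5: update data

print(f"best x = {X[np.argmax(y), 0]:.3f}, best y = {y.max():.3f}")
```

With only 13 evaluations of the objective, EI concentrates samples near the global maximum while still probing high-uncertainty regions; replacing the grid with a multi-start gradient optimizer is the usual refinement in higher dimensions.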
This framework is particularly powerful in Recursive Self-Improvement contexts, where an AI system uses BO to optimize its own internal hyperparameters or learning curricula, treating its own performance metric as the expensive black-box function to be maximized.