A deep ensemble is a machine learning method that trains multiple deep neural networks independently, typically from different random initializations, and combines their predictions by averaging or voting to produce a final, more robust output. By leveraging the diversity among members, the technique reduces predictive variance and improves generalization, and it can be interpreted as a form of approximate Bayesian inference that requires no change to the underlying training procedure. The spread of the members' predictions also serves as a practical uncertainty estimate, which is why deep ensembles are a cornerstone of self-consistency mechanisms for building reliable agentic systems.
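The recipe above can be sketched in a few lines. The following is a minimal, illustrative example (not drawn from any particular library): five small NumPy networks are trained on the same toy regression task, differing only in their random seed; the ensemble prediction is the mean over members, and the member disagreement (standard deviation) serves as an uncertainty estimate. The network size, learning rate, and data are arbitrary choices for the sketch.

```python
import numpy as np

def init_params(rng, d_in=1, d_h=16):
    # Each ensemble member gets its own random initialization.
    return {
        "W1": rng.normal(0.0, 1.0, (d_in, d_h)),
        "b1": np.zeros(d_h),
        "W2": rng.normal(0.0, 1.0, (d_h, 1)),
        "b2": np.zeros(1),
    }

def forward(p, x):
    h = np.tanh(x @ p["W1"] + p["b1"])
    return h @ p["W2"] + p["b2"]

def train(p, x, y, lr=0.05, steps=500):
    # Plain full-batch gradient descent on 0.5 * MSE.
    for _ in range(steps):
        h = np.tanh(x @ p["W1"] + p["b1"])
        err = (h @ p["W2"] + p["b2"]) - y
        dW2 = h.T @ err / len(x)
        db2 = err.mean(axis=0)
        dh = (err @ p["W2"].T) * (1.0 - h ** 2)   # backprop through tanh
        dW1 = x.T @ dh / len(x)
        db1 = dh.mean(axis=0)
        p["W1"] -= lr * dW1; p["b1"] -= lr * db1
        p["W2"] -= lr * dW2; p["b2"] -= lr * db2
    return p

# Toy data: noisy sine wave.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, (64, 1))
y = np.sin(3 * x) + rng.normal(0, 0.05, x.shape)

# Deep ensemble: M networks trained independently, differing only in seed.
M = 5
members = [train(init_params(np.random.default_rng(seed)), x, y)
           for seed in range(M)]

x_test = np.linspace(-1, 1, 50).reshape(-1, 1)
preds = np.stack([forward(p, x_test) for p in members])   # shape (M, 50, 1)
mean_pred = preds.mean(axis=0)       # ensemble prediction (averaging)
epistemic_std = preds.std(axis=0)    # member disagreement ~ model uncertainty
```

In practice each member would be a full-size network trained with stochastic gradient descent (whose minibatch noise adds further diversity), but the structure is the same: train independently, then aggregate predictions and read uncertainty off the disagreement.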
