Meta-Reinforcement Learning (Meta-RL) is a machine learning paradigm where an agent learns a learning algorithm itself, acquiring prior knowledge from a distribution of related tasks that enables rapid adaptation—or few-shot learning—to novel tasks with minimal additional experience. The core objective is to develop a meta-policy or a set of adaptable parameters that can be quickly fine-tuned, moving beyond single-task optimization to a system that learns how to learn efficiently across a task family.




