Experience replay is a machine learning technique, primarily used in deep reinforcement learning (DRL), in which an agent stores its past experiences—each a tuple of (state, action, reward, next state)—in a fixed-size memory called a replay buffer. During training, the agent samples mini-batches of these stored experiences uniformly at random to update its Q-network or policy network. Sampling from the buffer breaks the strong temporal correlations present in sequential experience, which stabilizes training by decorrelating updates, and it helps mitigate catastrophic forgetting of rare but important events by keeping them available for repeated reuse.
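The mechanism described above can be sketched as a small data structure; this is a minimal illustrative implementation (the class name `ReplayBuffer` and its method names are chosen for this example, not taken from any particular library):

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size buffer of (state, action, reward, next_state) tuples."""

    def __init__(self, capacity):
        # A deque with maxlen automatically evicts the oldest
        # transition once the buffer reaches capacity.
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state):
        # Store one transition observed by the agent.
        self.buffer.append((state, action, reward, next_state))

    def sample(self, batch_size):
        # Uniform random sampling breaks the temporal correlation
        # between consecutive transitions.
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)
```

In a typical training loop, the agent pushes one transition per environment step and, once the buffer holds at least one batch worth of data, draws a mini-batch each update to compute gradients for its Q-network or policy network.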
