Reservoir Sampling is a probabilistic algorithm for buffer management in continual learning. It maintains a uniformly random subset, or 'reservoir,' of fixed size k from a potentially infinite stream of data. The core algorithm, Algorithm R, processes items sequentially: the first k items fill the reservoir, and each subsequent i-th item replaces a randomly chosen reservoir entry with probability k/i. This guarantees every item in the stream has an equal probability of being in the final reservoir, enabling unbiased sampling from non-stationary distributions without prior knowledge of the stream's length.
Glossary
Reservoir Sampling

What is Reservoir Sampling?
Reservoir Sampling is a family of randomized algorithms designed to select a simple random sample of a fixed size from a data stream of unknown or very large length, with each element having an equal probability of being included in the final sample.
In Edge-CL, reservoir sampling is critical for implementing experience replay within strict memory constraints. It provides a statistically sound method for maintaining a replay buffer that represents the overall data distribution seen so far. This allows models to rehearse past experiences efficiently, directly combating catastrophic forgetting. Variations like weighted reservoir sampling adapt the algorithm for non-uniform sampling, which is useful for prioritizing rare or informative examples in the buffer to improve learning efficiency and model robustness in sequential tasks.
Key Properties of Reservoir Sampling
Reservoir Sampling is a family of randomized algorithms designed to select a uniform random sample of k items from a data stream of unknown length n, where n can be very large or potentially infinite, using only O(k) memory.
Single-Pass Guarantee
The core property of reservoir sampling is its single-pass operation. The algorithm processes each element of the stream exactly once, without the need to store the entire stream or revisit previous elements. This makes it memory-efficient and suitable for online learning and streaming data scenarios where data arrives continuously and cannot be stored in full.
- Mechanism: It maintains a 'reservoir' of size
k. For the firstkelements, it fills the reservoir. For each subsequent elementi(wherei > k), it includes the element with probabilityk/i, replacing a randomly chosen existing reservoir element.
Uniform Random Sampling
The defining statistical guarantee: after processing n elements, every subset of size k from the n elements has an equal probability of being the final reservoir. Each individual element has a probability of exactly k/n of being in the sample. This property holds regardless of the order of the stream.
- Proof Sketch: For element
i, the probability it enters the reservoir isk/i. The probability it survives all subsequent replacement attempts (for elementsjfromi+1ton) isΠ (1 - k/(j * k)) = i/n. Multiplying these givesk/n. - Implication for Continual Learning: This ensures the replay buffer is an unbiased statistical sample of the entire data stream, preventing distributional skew in rehearsal.
Algorithm R (The Classic Version)
Algorithm R is the foundational, simple reservoir sampling algorithm. It is straightforward to implement and provides the uniform sampling guarantee.
Pseudocode:
codeInitialize reservoir R with first k elements. for i = k+1 to n: j = random integer in [1, i] if j <= k: R[j] = element i
- Complexity:
O(n)time,O(k)space. - Drawback: It requires generating a random number for every element in the stream after the first
k, which can be a computational bottleneck for extremely high-speed streams.
Algorithm L (Optimized for Speed)
Algorithm L is an optimized version that reduces the number of random number generations. Instead of checking every element, it uses skip intervals calculated from a geometric distribution.
- Mechanism: It calculates how many elements to skip before the next candidate for replacement, based on the current reservoir size and a random variable. This dramatically reduces the number of iterations in the main loop.
- Advantage: While maintaining the uniform sampling property, it runs in
O(k(1 + log(n/k)))expected time, which is significantly faster than Algorithm R for largen. - Use Case: Preferred in high-throughput streaming applications or when the sampling operation itself is a performance-critical path.
Application in Continual Learning Buffers
In Continual Learning, reservoir sampling is the standard algorithm for buffer management in rehearsal-based methods like Experience Replay.
- Process: As new task data streams in, the algorithm decides which new samples to insert into the fixed-size replay buffer and which old ones to evict.
- Why it works: It ensures the buffer always contains a uniform random sample of all data seen so far. This provides a statistically sound basis for interleaving old and new data during training, which is crucial for mitigating catastrophic forgetting.
- Contrast with Heuristics: Unlike FIFO (First-In-First-Out) or other deterministic strategies, reservoir sampling prevents bias towards recent data, promoting better long-term knowledge retention.
Weighted Reservoir Sampling
A powerful generalization where each stream element i has an associated weight w_i. The algorithm selects a sample where the probability of an element being chosen is proportional to its weight.
- Use Case in ML: Extremely relevant for importance sampling and non-uniform buffer management. For example, samples with higher loss or gradient magnitude can be given higher weight to be retained in the buffer, as they may be more informative for rehearsal.
- Algorithm: Variants like Algorithm A-Chao or Algorithm A-Res extend the core idea to handle weights, maintaining
O(k)memory but with more complex per-element operations. - Connection: This bridges simple random sampling to more sophisticated, curriculum-based or hard example mining strategies within continual learning frameworks.
Frequently Asked Questions
Reservoir Sampling is a foundational probabilistic algorithm for managing data streams in continual learning. These questions address its core mechanics, applications, and trade-offs for edge AI systems.
Reservoir Sampling is a family of randomized algorithms designed to select a simple random sample of k items from a data stream of unknown, potentially infinite length n, with each item having an equal probability of being included in the final sample. The classic Algorithm R works by initially filling a reservoir (a buffer of size k) with the first k stream elements. For each subsequent element i (where i > k), the algorithm generates a random integer j uniformly from [1, i]. If j ≤ k, the element at reservoir index j is replaced with the new stream element i. This process guarantees that after processing n elements, every possible subset of size k is equally likely to be the reservoir's contents, providing a uniform random sample without requiring prior knowledge of n.
Key Property: The probability any given element is in the final reservoir is exactly k / n.
Enabling Efficiency, Speed & Accuracy
Intelligent Analysis, Decision & Execution
We build AI systems for teams that need search across company data, workflow automation across tools, or AI features inside products and internal software.
Talk to Us
Search across company data
Give teams answers from docs, tickets, runbooks, and product data with sources and permissions.
Useful when people spend too long searching or get different answers from different systems.

Automate internal workflows
Use AI to route work, draft outputs, trigger actions, and keep approvals and logs in place.
Useful when repetitive work moves across multiple tools and teams.

Add AI to products and internal tools
Build assistants, guided actions, or decision support into the software your team or customers already use.
Useful when AI needs to be part of the product, not a separate tool.
Related Terms
Reservoir Sampling is a core component of buffer management for rehearsal-based continual learning. The following terms define the broader ecosystem of algorithms, challenges, and deployment scenarios where it is applied.
Experience Replay
A foundational continual learning technique that mitigates catastrophic forgetting by storing a subset of past training data (or their feature representations) and interleaving them with new data during training. This rehearsal of old tasks helps maintain performance. Reservoir Sampling is the standard probabilistic algorithm for managing the fixed-size memory buffer in online scenarios where the data stream is potentially infinite.
Replay Buffer
The fixed-capacity memory storage used in rehearsal-based continual learning. Its management is critical for performance. Key strategies include:
- Reservoir Sampling: Maintains a uniform random sample.
- Core-set Selection: Chooses representative prototypes to maximize coverage.
- Ring Buffer: FIFO replacement, simple but non-random.
- Gradient-based Selection: Prioritizes samples that maximize loss if forgotten. Effective Buffer Management directly influences the stability-plasticity trade-off.
Online Continual Learning
The strictest and most realistic variant of continual learning, where the model receives a single, non-repeating pass through a stream of data, often one sample or a small mini-batch at a time. This scenario prohibits multiple epochs over past data, making efficient buffer algorithms like Reservoir Sampling essential for feasible rehearsal. It closely mimics real-world edge deployment where data arrives sequentially and cannot be stored indefinitely.
Gradient Episodic Memory (GEM)
A rehearsal-based algorithm that stores past data in an episodic memory. Its core innovation is to constrain the gradient update for a new task so that it does not increase the loss on the memory examples. This is formulated as a quadratic programming problem. While GEM defines how to use the buffer, Reservoir Sampling or other strategies define what goes into the buffer, making them complementary components of the system.
Stability-Plasticity Dilemma
The fundamental trade-off in all continual learning systems. Stability refers to a model's ability to retain knowledge from previous tasks. Plasticity is its capacity to learn new information efficiently. Rehearsal-based methods using a buffer directly address this: the buffer size and management policy (e.g., Reservoir Sampling) are hyperparameters that control the balance. A larger buffer favors stability but may reduce plasticity for new patterns.
Edge-CL
The practical domain of deploying continual learning on resource-constrained edge devices (e.g., smartphones, IoT sensors). Key constraints are memory, compute, and energy. Reservoir Sampling is highly suitable for Edge-CL because it is computationally lightweight (O(1) per sample decision), requires minimal memory overhead (only the buffer itself), and operates in a single pass, making it ideal for on-device, online data streams.

About the author
Prasad Kumkar
CEO & MD, Inference Systems
Prasad Kumkar is the CEO & MD of Inference Systems and writes about AI systems architecture, LLM infrastructure, model serving, evaluation, and production deployment. Over 5+ years, he has worked across computer vision models, L5 autonomous vehicle systems, and LLM research, with a focus on taking complex AI ideas into real-world engineering systems.
His work and writing cover AI systems, large language models, AI agents, multimodal systems, autonomous systems, inference optimization, RAG, evaluation, and production AI engineering.
Partnered with leading AI, data, and software stack.
How We Work
Custom AI workflows for your Business
One-fit-all AI don't work for modern businesses. At Inferensys, we aim to understand your business & custom requirements; which we use to define most efficient agentic workflows, the data, and the tools for your business.
01
Review the use case
We understand the task, the users, and where AI can actually help.
Read more02
Pick the right approach
We define what needs search, automation, or product integration.
Read more03
Build the first useful version
We implement the part that proves the value first.
Read more04
Improve from there
We add the checks and visibility needed to keep it useful.
Read moreThe first call is a practical review of your use case and the right next step.
Talk to Us