Sliding window chunking is a text segmentation technique in which a fixed-size window moves sequentially across a document, producing overlapping segments (chunks) so that context which would otherwise be cut at an arbitrary split point is preserved. The method is widely used in Retrieval-Augmented Generation (RAG) and semantic search pipelines because it mitigates the context fragmentation problem: concepts and entities that fall near a chunk boundary also appear in the adjacent chunk. Two parameters define the technique: the chunk size (the window length) and the chunk overlap (the number of tokens or characters shared between consecutive chunks). The window therefore advances by the chunk size minus the overlap at each step, which means the overlap must be smaller than the chunk size.
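
The two parameters can be illustrated with a minimal Python sketch. The function name `sliding_window_chunks` and the whitespace tokenization are assumptions made for this example; a real pipeline would typically count tokens with the tokenizer of its embedding model. The window advances by `chunk_size - chunk_overlap` items per step.

```python
from typing import List, Sequence


def sliding_window_chunks(
    tokens: Sequence[str], chunk_size: int, chunk_overlap: int
) -> List[List[str]]:
    """Split `tokens` into windows of `chunk_size` items, where consecutive
    windows share `chunk_overlap` items. (Hypothetical helper for illustration.)"""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - chunk_overlap  # how far the window moves each time
    chunks: List[List[str]] = []
    for start in range(0, len(tokens), step):
        chunks.append(list(tokens[start:start + chunk_size]))
        # Stop once the window has reached the end of the input, so no
        # trailing chunk is emitted that only repeats already-covered items.
        if start + chunk_size >= len(tokens):
            break
    return chunks


# Whitespace splitting stands in for a real tokenizer (an assumption).
words = "a b c d e f g h i j".split()
for chunk in sliding_window_chunks(words, chunk_size=4, chunk_overlap=2):
    print(" ".join(chunk))
# a b c d
# c d e f
# e f g h
# g h i j
```

In this example every boundary pair (such as `c d`) appears in two consecutive chunks, which is the property that keeps boundary content retrievable. A larger overlap gives more redundancy at the cost of more chunks to store and index.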
