Top-K retrieval is the process of returning only the 'K' highest-scoring documents or data points from a search operation, where 'K' is a user-defined positive integer. The parameter is fundamental to vector search, approximate nearest neighbor (ANN) algorithms, and hybrid search systems, where it controls latency, computational cost, and the precision-recall trade-off: a larger 'K' tends to raise recall at the cost of more candidates to score and return, while a smaller 'K' is cheaper but may omit relevant results. Limiting results to a manageable set also keeps downstream processing, such as reranking with a cross-encoder, efficient.
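
As a rough illustration of the idea, the sketch below scores every vector in a small in-memory corpus against a query and keeps only the K best matches. It assumes exact (brute-force) cosine similarity rather than an ANN index, and the names `cosine_similarity`, `top_k`, `corpus`, and the document IDs are hypothetical, chosen only for this example.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_a * norm_b)


def top_k(query, corpus, k):
    """Return up to k (doc_id, score) pairs, highest score first.

    Hypothetical helper: scores every document exactly, which is only
    practical for small corpora. Real systems typically use an ANN index.
    """
    if k <= 0:
        return []
    scored = (
        (doc_id, cosine_similarity(query, vector))
        for doc_id, vector in corpus.items()
    )
    # heapq.nlargest keeps only k items in memory while scanning the scores.
    return heapq.nlargest(k, scored, key=lambda pair: pair[1])


# Toy corpus of 2-dimensional vectors (illustrative values only).
corpus = {
    "doc_a": [1.0, 0.0],
    "doc_b": [0.0, 1.0],
    "doc_c": [1.0, 1.0],
    "doc_d": [-1.0, 0.0],
}
query = [1.0, 0.0]

results = top_k(query, corpus, k=2)
for doc_id, score in results:
    print(f"{doc_id}: {score:.3f}")
```

With `k=2`, only the two closest documents are passed on, so a more expensive stage such as a cross-encoder reranker would need to score two candidates instead of the whole corpus. Raising `k` widens that candidate set and shifts the balance toward recall at a higher cost.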
