Sparse retrieval is a traditional information retrieval method where documents and queries are represented as high-dimensional, sparse vectors based on term occurrence, enabling efficient keyword-based search. In this model, each dimension corresponds to a unique term in the vocabulary, and values are typically weighted using functions like TF-IDF or BM25. The sparsity arises because any single document contains only a small subset of all possible terms, making the vector mostly zeros. Relevance is scored by computing the similarity, often a dot product, between the query vector and pre-computed document vectors.
