BM25 is a probabilistic retrieval model that scores a document by how often each query term occurs in it, weighting rarer terms more heavily through inverse document frequency, applying diminishing returns to repeated occurrences, and normalizing for document length. It refines earlier models such as TF-IDF by introducing non-linear term frequency saturation and length normalization, controlled by two parameters conventionally named k1 and b, which makes it robust for general text search. As a lexical (sparse) retrieval method, it operates on exact keyword matches and is a core component of hybrid search systems, where it is combined with semantic vector search.
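
The scoring described above can be illustrated with a minimal sketch. It is not a reference implementation: the function name `bm25_scores`, the defaults `k1=1.2` and `b=0.75` (commonly used values), and the smoothed IDF form are assumptions chosen for illustration, and documents are assumed to be non-empty lists of already tokenized terms.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenized document in `docs` against the tokenized `query`.

    Hypothetical helper for illustration; assumes `docs` is a non-empty
    list of non-empty token lists.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))

    scores = []
    for doc in docs:
        tf = Counter(doc)
        # Length normalization: documents longer than average get a
        # larger denominator, which lowers their score.
        norm = k1 * (1 - b + b * len(doc) / avg_len)
        score = 0.0
        for term in query:
            if tf[term] == 0:
                continue  # exact-match model: absent terms add nothing
            # One common smoothed IDF variant (assumed here); it stays
            # non-negative even for very frequent terms.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Saturating term frequency: this factor approaches k1 + 1
            # as tf grows, so repeats yield diminishing returns.
            score += idf * tf[term] * (k1 + 1) / (tf[term] + norm)
        scores.append(score)
    return scores


# Usage: the third document shares no term with the query and scores 0.
docs = [
    "the cat sat on the mat".split(),
    "the dog chased the cat".split(),
    "birds sing at dawn".split(),
]
scores = bm25_scores(["cat"], docs)
```

In this example the first two documents each contain "cat" once, so the shorter second document scores higher; that difference comes entirely from length normalization. Setting b to 0 disables length normalization, while a larger k1 makes term frequency saturate more slowly.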
