ColBERT (Contextualized Late Interaction over BERT) is a neural information retrieval model that provides efficient and effective search by computing contextualized embeddings for every token in a query and document and scoring relevance via a late interaction mechanism. Unlike models that produce a single vector per passage, ColBERT encodes queries and documents independently into fine-grained token-level embeddings, enabling pre-computation and indexing of document representations for scalable retrieval. The model's core scoring function, MaxSim, calculates relevance by summing the maximum cosine similarity between each query token embedding and all document token embeddings.
