ColBERT is a dense retrieval model that uses a late interaction mechanism to compute similarity between queries and documents. Unlike a standard bi-encoder that produces a single vector per document, ColBERT encodes text into fine-grained, contextualized token-level embeddings. This allows for a more expressive, token-wise similarity computation, capturing nuanced semantic relationships that a single vector might miss, while still enabling efficient approximate nearest neighbor (ANN) search via pre-computed document token embeddings.
