A cross-encoder is a neural architecture for reranking that processes a query and a candidate document simultaneously through a single transformer model to output a direct relevance or similarity score. Unlike a bi-encoder, which encodes inputs separately for fast retrieval, a cross-encoder performs deep, joint attention across the entire input pair. This allows it to capture complex semantic interactions and subtle linguistic nuances, making its relevance judgments highly precise but computationally expensive, as scores cannot be pre-computed.
