Dense Passage Retrieval (DPR) is a neural information retrieval architecture that replaces sparse, term-based scoring such as BM25 with dense vector representations. It uses two separate BERT-based encoders: a question encoder and a passage encoder. The two are trained jointly with a contrastive objective that raises the dot-product (inner-product) similarity between a question and its relevant passage while lowering its similarity to irrelevant ones. At inference, all passages are encoded once, offline, into a vector index; retrieval then encodes the query and runs a fast approximate nearest neighbor (ANN) search over that index to find the passage embeddings that score highest against it.
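
The training objective and the lookup step can be illustrated with a minimal sketch in plain Python. It assumes inner-product scoring and a softmax-style negative log-likelihood loss over one positive and several negative passages. The embeddings are hand-written toy vectors rather than BERT outputs, the names `contrastive_loss`, `retrieve`, and `passage_index` are hypothetical, and the exhaustive sort stands in for the ANN index a real system would use.

```python
import math

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def contrastive_loss(q, pos, negs):
    """Negative log-likelihood of the positive passage under a softmax
    over the question's scores against one positive and the negatives."""
    scores = [dot(q, pos)] + [dot(q, n) for n in negs]
    m = max(scores)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_z - scores[0]

def retrieve(q, index, k=1):
    """Exact top-k by inner product (a stand-in for ANN search)."""
    ranked = sorted(index.items(), key=lambda kv: dot(q, kv[1]), reverse=True)
    return [pid for pid, _ in ranked[:k]]

# Assumption: toy 3-dimensional embeddings, as if produced by the passage encoder.
passage_index = {
    "p_paris": [0.9, 0.1, 0.0],
    "p_tokyo": [0.1, 0.8, 0.2],
    "p_cairo": [0.0, 0.2, 0.9],
}
# Assumption: toy question embedding, as if produced by the question encoder.
question_vec = [1.0, 0.2, 0.0]

top = retrieve(question_vec, passage_index, k=2)
loss_good = contrastive_loss(
    question_vec,
    passage_index["p_paris"],
    [passage_index["p_tokyo"], passage_index["p_cairo"]],
)
loss_bad = contrastive_loss(
    question_vec,
    passage_index["p_cairo"],
    [passage_index["p_paris"], passage_index["p_tokyo"]],
)
print(top, round(loss_good, 3), round(loss_bad, 3))
```

The loss is smaller when the passage labeled as positive already scores highest against the question, which is the signal that pulls matching question and passage embeddings together during training. Because passage vectors do not depend on the query, they can be computed once and only the question needs to be encoded at query time.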
