Approximate Nearest Neighbor (ANN) search is a computational method for finding vectors in a database that are most similar to a query vector, accepting a small margin of error in exchange for orders-of-magnitude improvements in speed and memory usage compared to exact search. It is foundational to real-time semantic retrieval in systems using vector embeddings, such as RAG architectures and agentic memory. Instead of exhaustively comparing the query to every vector, ANN algorithms use probabilistic data structures like graphs, trees, or hash-based indexes to rapidly navigate the embedding space.
