Embedding drift is a change over time in the statistical distribution of generated vector embeddings, or in the semantic relationships they encode, that degrades the performance of downstream systems such as semantic search and retrieval-augmented generation (RAG). It arises from shifts in the input data distribution, from updates to the embedding model itself, or from fine-tuning on new domains. In each case, items that were previously aligned in the embedding space can become misaligned.
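
One way to make this definition concrete is to measure it. The following is a minimal sketch, not a standard or definitive method, of two simple drift signals. The function names (`centroid_drift`, `neighbor_overlap`) and the choice of metrics are illustrative assumptions; embeddings are assumed to be plain lists of floats. The first signal compares the centroids of a reference batch and a current batch produced by the same model, which is sensitive to input distribution shift. The second compares each item's nearest neighbours before and after a model change, which avoids comparing vectors from two different embedding spaces directly.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def centroid(vectors):
    """Component-wise mean of a non-empty batch of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def centroid_drift(reference, current):
    """Cosine distance between batch centroids (hypothetical drift score).

    Assumes both batches come from the same embedding model, so the
    vectors live in the same space. 0.0 means the centroids point the
    same way; larger values mean the distribution has moved.
    """
    return 1.0 - cosine_similarity(centroid(reference), centroid(current))


def top_k_neighbors(vectors, index, k):
    """Indices of the k most cosine-similar items to vectors[index]."""
    scored = [
        (cosine_similarity(vectors[index], v), j)
        for j, v in enumerate(vectors)
        if j != index
    ]
    # Sort by descending similarity; break ties by index for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return {j for _, j in scored[:k]}


def neighbor_overlap(old_vectors, new_vectors, k):
    """Mean fraction of top-k neighbours preserved across two embeddings.

    old_vectors[i] and new_vectors[i] must embed the same item. Each set
    of neighbours is computed within its own space, so the two models
    may differ in dimensionality. 1.0 means neighbourhoods are unchanged.
    """
    if len(old_vectors) != len(new_vectors):
        raise ValueError("both embeddings must cover the same items")
    if not 1 <= k < len(old_vectors):
        raise ValueError("k must be between 1 and the number of items minus 1")
    total = 0.0
    for i in range(len(old_vectors)):
        old_nn = top_k_neighbors(old_vectors, i, k)
        new_nn = top_k_neighbors(new_vectors, i, k)
        total += len(old_nn & new_nn) / k
    return total / len(old_vectors)
```

In practice, either score would be tracked over time and compared against a threshold chosen empirically for the application. The brute-force neighbour search here is quadratic in the number of items, so it is only suitable for a small probe set.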
