A Sentence Transformer is a transformer-based model, typically derived from an architecture such as BERT or RoBERTa, that has been fine-tuned with contrastive learning objectives such as triplet loss. Unlike its base model, which outputs one embedding per token, a Sentence Transformer applies a pooling operation (commonly mean pooling over the token embeddings) to produce a single, fixed-dimensional vector per input text. This enables efficient semantic similarity comparisons via metrics like cosine similarity, forming the core of modern semantic search and retrieval systems.
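
As a minimal sketch of this workflow, the snippet below uses the sentence-transformers library to encode a few sentences into fixed-dimensional vectors and score them with cosine similarity; the checkpoint name "all-MiniLM-L6-v2" is just one publicly available example model, chosen here for illustration.

```python
from sentence_transformers import SentenceTransformer, util

# Load a pretrained Sentence Transformer checkpoint (example choice;
# any sentence-transformers model name would work the same way).
model = SentenceTransformer("all-MiniLM-L6-v2")

sentences = [
    "How do I reset my password?",
    "Steps to recover a forgotten password",
    "Best hiking trails near Denver",
]

# encode() runs the transformer, applies the model's pooling layer,
# and returns one fixed-dimensional vector per input text.
embeddings = model.encode(sentences, convert_to_tensor=True)

# Cosine similarity between the first sentence and the other two:
# semantically related pairs score higher than unrelated ones.
scores = util.cos_sim(embeddings[0], embeddings[1:])
print(scores)
```

Because each text maps to a single vector, similarity search reduces to vector comparisons, which is what makes this setup practical for retrieval at scale.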
