Contrastive learning is a self-supervised machine learning technique that trains a model to learn useful data representations by distinguishing between similar (positive) and dissimilar (negative) examples. The core mechanism involves a contrastive loss function, such as InfoNCE, which maximizes agreement between positive pairs—like different augmentations of the same image or a matched image-text pair—and minimizes it for negative pairs. This process creates a structured embedding space where semantic similarity is encoded as geometric proximity, without requiring manually labeled data.
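The InfoNCE idea can be sketched in a few lines: treat each anchor's matching positive as the correct "class" in a softmax over the whole batch, so every other example in the batch serves as a negative. The function name `info_nce`, the batch-of-pairs setup, and the temperature value below are illustrative assumptions, not a specific library's API.

```python
import numpy as np

def info_nce(anchors, positives, temperature=0.1):
    """InfoNCE loss sketch: anchors[i] and positives[i] form a positive pair;
    positives[j] for j != i act as in-batch negatives."""
    # Normalize embeddings so dot products are cosine similarities.
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    # Similarity matrix: entry [i, j] compares anchor i with candidate j.
    logits = a @ p.T / temperature
    # Softmax over each row, with a max-shift for numerical stability.
    logits -= logits.max(axis=1, keepdims=True)
    exp = np.exp(logits)
    probs = exp / exp.sum(axis=1, keepdims=True)
    # Cross-entropy against the diagonal: the matched positive should win.
    return -np.mean(np.log(np.diag(probs)))
```

Minimizing this loss pulls each positive pair together and pushes the anchor away from every other example in the batch, which is what shapes the embedding space so that semantic similarity becomes geometric proximity.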
