Rotary Positional Embedding (RoPE) is a positional encoding method that injects positional information into a transformer by applying a rotation to the query and key vectors based on their token positions. Concretely, each consecutive pair of embedding dimensions is rotated in its own 2D plane by an angle proportional to the token's absolute position, with a distinct frequency per pair (as in sinusoidal encodings). Because rotating one vector by angle mθ and another by nθ leaves their inner product depending only on (m − n)θ, the attention scores inherently capture relative positional relationships through the dot product. Unlike additive embeddings, RoPE's rotational approach preserves the norm of the vectors and tends to offer better length extrapolation.
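The two properties above, that attention scores depend only on relative offsets and that vector norms are unchanged, can be checked directly with a minimal sketch. This is an illustrative NumPy implementation, not any particular library's API; the function name `rope_rotate` and the interleaved pairing of dimensions are assumptions for the example.

```python
import numpy as np

def rope_rotate(x, position, base=10000):
    """Rotate vector x (even dim d) by position-dependent angles.

    Each pair of dimensions (2i, 2i+1) is rotated by
    angle = position * base^(-2i/d), following the RoPE formulation.
    Illustrative sketch; names and layout are assumptions.
    """
    d = x.shape[-1]
    assert d % 2 == 0, "embedding dimension must be even"
    half = d // 2
    freqs = base ** (-2.0 * np.arange(half) / d)   # per-pair frequencies
    angles = position * freqs                      # linear in position
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[..., 0::2], x[..., 1::2]            # interleaved pairs
    out = np.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin           # 2D rotation per pair
    out[..., 1::2] = x1 * sin + x2 * cos
    return out

rng = np.random.default_rng(0)
q = rng.normal(size=8)
k = rng.normal(size=8)

# Relative-position property: the score depends only on the offset m - n.
s1 = rope_rotate(q, 5) @ rope_rotate(k, 3)     # positions (5, 3), offset 2
s2 = rope_rotate(q, 12) @ rope_rotate(k, 10)   # positions (12, 10), offset 2
print(np.allclose(s1, s2))

# Norm preservation: rotation does not change vector length.
print(np.allclose(np.linalg.norm(rope_rotate(q, 7)), np.linalg.norm(q)))
```

Both checks print `True`: equal relative offsets yield identical attention scores, and the rotated vector keeps its original norm, which is exactly what distinguishes RoPE from additive position embeddings.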
