Position Interpolation (PI) is a method for extending the effective context window of a transformer model that uses rotary position embeddings (RoPE). It linearly down-scales the position indices of a longer input sequence so that they fit within the positional range the model was originally trained on. Instead of extrapolating to unseen, larger positions, PI compresses the position space, allowing models such as LLaMA to attend to sequences of up to 32,768 tokens (16 times the original 2,048-token window) with minimal fine-tuning. This approach mitigates the high perplexity and instability typically associated with naive context length extrapolation.
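The down-scaling step can be illustrated with a minimal sketch. It assumes a model trained on 2,048 positions and an 8,192-token input; the function names `interpolated_positions` and `rope_angles`, and the frequency base of 10000, are illustrative choices made here and are not taken from any particular library.

```python
def interpolated_positions(seq_len: int, trained_len: int) -> list[float]:
    """Rescale position indices so they stay inside the trained range."""
    # Inputs that already fit keep their integer positions unchanged.
    if seq_len <= trained_len:
        scale = 1.0
    else:
        # Linear down-scaling: position m becomes m * (trained_len / seq_len).
        scale = trained_len / seq_len
    return [pos * scale for pos in range(seq_len)]


def rope_angles(position: float, head_dim: int, base: float = 10000.0) -> list[float]:
    """Rotation angles for one (possibly fractional) position.

    Assumes the common rotary-embedding frequency schedule base^(-2i/d).
    """
    return [position * base ** (-2.0 * i / head_dim) for i in range(head_dim // 2)]


# Hypothetical setup: trained on 2,048 positions, input of 8,192 tokens.
positions = interpolated_positions(seq_len=8192, trained_len=2048)
last_angles = rope_angles(positions[-1], head_dim=8)
```

In this sketch every fourth token lands on an integer position the model saw during training, and the tokens in between receive fractional positions. The largest position stays below 2,048, so no rotation angle exceeds what the model encountered in training.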
