Transformers are overkill for static route planning. With a fixed set of stops and known distances, the problem is a deterministic combinatorial optimization, not a sequence modeling task learned from data. Using a BERT- or GPT-style model to solve the Traveling Salesman Problem is an architectural mismatch: BERT is an encoder-only language model and GPT a decoder-only one, and neither was designed to search over permutations of stops for a minimum-cost tour.
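To make the "deterministic" point concrete, here is a minimal sketch of an exact classical approach, the Held-Karp dynamic program, on a tiny instance. The function name `held_karp` and the sample coordinates are illustrative assumptions, not anything from the discussion above, and the exponential cost means this exact method is only practical for small stop counts (larger instances would typically use a heuristic or a dedicated solver).

```python
from itertools import combinations
from math import dist


def held_karp(points):
    """Exact TSP via Held-Karp dynamic programming, O(n^2 * 2^n).

    Returns (tour_length, visiting_order); the tour starts at index 0
    and implicitly returns to it at the end.
    """
    n = len(points)
    if n < 2:
        return 0.0, list(range(n))

    # Pairwise Euclidean distances between stops.
    d = [[dist(a, b) for b in points] for a in points]

    # best[(mask, j)] = (cost, predecessor) of the cheapest path that starts
    # at stop 0, visits exactly the stops encoded in mask, and ends at j.
    best = {(1 << j, j): (d[0][j], 0) for j in range(1, n)}

    for size in range(2, n):
        for subset in combinations(range(1, n), size):
            mask = sum(1 << j for j in subset)
            for j in subset:
                prev_mask = mask ^ (1 << j)
                best[(mask, j)] = min(
                    (best[(prev_mask, k)][0] + d[k][j], k)
                    for k in subset
                    if k != j
                )

    # Close the tour by returning to stop 0.
    full = (1 << n) - 2  # every stop except 0
    cost, last = min((best[(full, j)][0] + d[j][0], j) for j in range(1, n))

    # Walk the predecessor links backwards to recover the visiting order.
    tour, mask = [], full
    while last != 0:
        tour.append(last)
        mask, last = mask ^ (1 << last), best[(mask, last)][1]
    tour.append(0)
    tour.reverse()
    return cost, tour


if __name__ == "__main__":
    # Hypothetical stops: the four corners of a unit square.
    stops = [(0, 0), (0, 1), (1, 1), (1, 0)]
    length, order = held_karp(stops)
    print(length, order)
```

The same input always yields the same optimal tour length, with no training data, no sampling, and no learned weights, which is the sense in which a static instance does not call for a learned sequence model.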














