@inproceedings{chi-etal-2024-attention,
    title     = "Attention Alignment and Flexible Positional Embeddings Improve Transformer Length Extrapolation",
    author    = "Chi, Ta-Chung and Fan, Ting-Han and Rudnicky, Alexander",
    editor    = "Duh, Kevin and Gomez, Helena and Bethard, Steven",
    booktitle = "Findings of the Association for Computational Linguistics: NAACL 2024",
    month     = jun,
    year      = "2024",
    address   = "Mexico City, Mexico",
    publisher = "Association for Computational Linguistics",
    url       = "https://aclanthology.org/2024.findings-naacl.10/",
    doi       = "10.18653/v1/2024.findings-naacl.10",
    pages     = "132--148"
}