Copy Suppression: Comprehensively Understanding a Motif in Language Model Attention Heads

Authors: Callum Stuart McDougall, Arthur Conmy, Cody Rushing, Thomas McGrath, Neel Nanda
Published in: Proceedings of the 7th BlackboxNLP Workshop: Analyzing and Interpreting Neural Networks for NLP
Editors: Yonatan Belinkov, Najoung Kim, Jaap Jumelet, Hosein Mohebbi, Aaron Mueller, Hanjie Chen
Publisher: Association for Computational Linguistics
Location: Miami, Florida, US
Date: 2024-11
Pages: 337-363
Type: conference publication
Citation key: mcdougall-etal-2024-copy
DOI: 10.18653/v1/2024.blackboxnlp-1.22
URL: https://aclanthology.org/2024.blackboxnlp-1.22/