Analyzing the Attention Heads for Pronoun Disambiguation in Context-aware Machine Translation Models

Paweł Mąka; Yusuf Can Semerci; Jan Scholtes; Gerasimos Spanakis

Abstract

In this paper, we investigate the role of attention heads in Context-aware Machine Translation models for pronoun disambiguation in the English-to-German and English-to-French language directions. We analyze their influence by both observing and modifying the attention scores corresponding to the plausible relations that could impact a pronoun prediction. Our findings reveal that while some heads do attend to the relations of interest, not all of them influence the models’ ability to disambiguate pronouns. We show that certain heads are underutilized by the models, suggesting that model performance could be improved if these heads attended to one of the relations more strongly. Furthermore, we fine-tune the most promising heads and observe an increase in pronoun disambiguation accuracy of up to 5 percentage points, which demonstrates that the improvements in performance can be solidified into the models’ parameters.

Anthology ID: 2025.coling-main.424
Volume: Proceedings of the 31st International Conference on Computational Linguistics
Month: January
Year: 2025
Address: Abu Dhabi, UAE
Editors: Owen Rambow, Leo Wanner, Marianna Apidianaki, Hend Al-Khalifa, Barbara Di Eugenio, Steven Schockaert
Venue: COLING
Publisher: Association for Computational Linguistics
Pages: 6348–6377
URL: https://aclanthology.org/2025.coling-main.424/
Cite (ACL): Paweł Mąka, Yusuf Can Semerci, Jan Scholtes, and Gerasimos Spanakis. 2025. Analyzing the Attention Heads for Pronoun Disambiguation in Context-aware Machine Translation Models. In Proceedings of the 31st International Conference on Computational Linguistics, pages 6348–6377, Abu Dhabi, UAE. Association for Computational Linguistics.
Cite (Informal): Analyzing the Attention Heads for Pronoun Disambiguation in Context-aware Machine Translation Models (Mąka et al., COLING 2025)
PDF: https://aclanthology.org/2025.coling-main.424.pdf
