Controlling Formality in Low-Resource NMT with Domain Adaptation and Re-Ranking: SLT-CDT-UoS at IWSLT2022

Sebastian Vincent, Loïc Barrault, Carolina Scarton


Abstract
This paper describes the SLT-CDT-UoS group’s submission to the first Special Task on Formality Control for Spoken Language Translation, part of the IWSLT 2022 Evaluation Campaign. Our efforts were split between two fronts: data engineering and altering the objective function for best hypothesis selection. We used language-independent methods to extract formal and informal sentence pairs from the provided corpora; using English as a pivot language, we propagated formality annotations to languages treated as zero-shot in the task; we also further improved formality control with a hypothesis re-ranking approach. On the test sets for English-to-German and English-to-Spanish, we achieved an average accuracy of .935 in the constrained setting and .995 in the unconstrained setting. In a zero-shot setting for English-to-Russian and English-to-Italian, we scored an average accuracy of .590 in the constrained setting and .659 in the unconstrained setting.
Anthology ID:
2022.iwslt-1.31
Volume:
Proceedings of the 19th International Conference on Spoken Language Translation (IWSLT 2022)
Month:
May
Year:
2022
Address:
Dublin, Ireland (in-person and online)
Venues:
ACL | IWSLT
Publisher:
Association for Computational Linguistics
Pages:
341–350
URL:
https://aclanthology.org/2022.iwslt-1.31
DOI:
10.18653/v1/2022.iwslt-1.31
Cite (ACL):
Sebastian Vincent, Loïc Barrault, and Carolina Scarton. 2022. Controlling Formality in Low-Resource NMT with Domain Adaptation and Re-Ranking: SLT-CDT-UoS at IWSLT2022. In Proceedings of the 19th International Conference on Spoken Language Translation (IWSLT 2022), pages 341–350, Dublin, Ireland (in-person and online). Association for Computational Linguistics.
Cite (Informal):
Controlling Formality in Low-Resource NMT with Domain Adaptation and Re-Ranking: SLT-CDT-UoS at IWSLT2022 (Vincent et al., IWSLT 2022)
PDF:
https://aclanthology.org/2022.iwslt-1.31.pdf
Data
MuST-C, ParaCrawl, WikiMatrix