%0 Conference Proceedings
%T Machine Translation Between High-resource Languages in a Language Documentation Setting
%A Kann, Katharina
%A Ebrahimi, Abteen
%A Stenzel, Kristine
%A Palmer, Alexis
%Y Serikov, Oleg
%Y Voloshina, Ekaterina
%Y Postnikova, Anna
%Y Klyachko, Elena
%Y Neminova, Ekaterina
%Y Vylomova, Ekaterina
%Y Shavrina, Tatiana
%Y Ferrand, Eric Le
%Y Malykh, Valentin
%Y Tyers, Francis
%Y Arkhangelskiy, Timofey
%Y Mikhailov, Vladislav
%Y Fenogenova, Alena
%S Proceedings of the first workshop on NLP applications to field linguistics
%D 2022
%8 October
%I International Conference on Computational Linguistics
%C Gyeongju, Republic of Korea
%F kann-etal-2022-machine
%X Language documentation encompasses translation, typically into the dominant high-resource language in the region where the target language is spoken. To make data accessible to a broader audience, additional translation into other high-resource languages might be needed. Working within a project documenting Kotiria, we explore the extent to which state-of-the-art machine translation (MT) systems can support this second translation – in our case from Portuguese to English. This translation task is challenging for multiple reasons: (1) the data is out-of-domain with respect to the MT system’s training data, (2) much of the data is conversational, (3) existing translations include non-standard and uncommon expressions, often reflecting properties of the documented language, and (4) the data includes borrowings from other regional languages. Despite these challenges, existing MT systems perform at a usable level, though there is still room for improvement. We then conduct a qualitative analysis and suggest ways to improve MT between high-resource languages in a language documentation setting.
%U https://aclanthology.org/2022.fieldmatters-1.3
%P 26-33