Unsupervised Compositional Translation of Multiword Expressions

Pablo Gamallo, Marcos Garcia


Abstract
This article describes a dependency-based strategy that uses compositional distributional semantics and cross-lingual word embeddings to translate multiword expressions (MWEs). Our unsupervised approach performs translation as a process of word contextualization by taking into account lexico-syntactic contexts and selectional preferences. This strategy is suited to translating phraseological combinations and phrases whose constituent words lexically restrict each other. Several experiments on adjective-noun and verb-object compounds show that mutual contextualization (co-compositionality) clearly outperforms other compositional methods. The paper also contributes a new, freely available dataset of English-Spanish MWEs used to validate the proposed compositional strategy.
Anthology ID:
W19-5106
Volume:
Proceedings of the Joint Workshop on Multiword Expressions and WordNet (MWE-WN 2019)
Month:
August
Year:
2019
Address:
Florence, Italy
Editors:
Agata Savary, Carla Parra Escartín, Francis Bond, Jelena Mitrović, Verginica Barbu Mititelu
Venue:
MWE
SIG:
SIGLEX
Publisher:
Association for Computational Linguistics
Pages:
40–48
URL:
https://aclanthology.org/W19-5106
DOI:
10.18653/v1/W19-5106
Cite (ACL):
Pablo Gamallo and Marcos Garcia. 2019. Unsupervised Compositional Translation of Multiword Expressions. In Proceedings of the Joint Workshop on Multiword Expressions and WordNet (MWE-WN 2019), pages 40–48, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
Unsupervised Compositional Translation of Multiword Expressions (Gamallo & Garcia, MWE 2019)
PDF:
https://aclanthology.org/W19-5106.pdf