On the Relation between Syntactic Divergence and Zero-Shot Performance

Ofir Arviv, Dmitry Nikolaev, Taelin Karidi, Omri Abend


Abstract
We explore the link between the extent to which syntactic relations are preserved in translation and the ease of correctly constructing a parse tree in a zero-shot setting. While previous work suggests such a relation, it tends to focus on the macro level and not on the level of individual edges—a gap we aim to address. As a test case, we take the transfer of Universal Dependencies (UD) parsing from English to a diverse set of languages and conduct two sets of experiments. In one, we analyze zero-shot performance based on the extent to which English source edges are preserved in translation. In another, we apply three linguistically motivated transformations to UD, creating more cross-lingually stable versions of it, and assess their zero-shot parsability. In order to compare parsing performance across different schemes, we perform extrinsic evaluation on the downstream task of cross-lingual relation extraction (RE) using a subset of a standard English RE benchmark translated to Russian and Korean. In both sets of experiments, our results suggest a strong relation between cross-lingual stability and zero-shot parsing performance.
Anthology ID:
2021.emnlp-main.394
Volume:
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2021
Address:
Online and Punta Cana, Dominican Republic
Editors:
Marie-Francine Moens, Xuanjing Huang, Lucia Specia, Scott Wen-tau Yih
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
4803–4817
URL:
https://aclanthology.org/2021.emnlp-main.394
DOI:
10.18653/v1/2021.emnlp-main.394
Cite (ACL):
Ofir Arviv, Dmitry Nikolaev, Taelin Karidi, and Omri Abend. 2021. On the Relation between Syntactic Divergence and Zero-Shot Performance. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 4803–4817, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.
Cite (Informal):
On the Relation between Syntactic Divergence and Zero-Shot Performance (Arviv et al., EMNLP 2021)
PDF:
https://aclanthology.org/2021.emnlp-main.394.pdf
Video:
https://aclanthology.org/2021.emnlp-main.394.mp4
Code:
ofirarviv/improving-ud
Data:
Translated TACRED, Universal Dependencies