A little perturbation makes a difference: Treebank augmentation by perturbation improves transfer parsing

Ayan Das, Sudeshna Sarkar


Abstract
We present an approach to cross-lingual transfer of a dependency parser so that a parser trained on a single source language can cater more effectively to diverse target languages. We show that the cross-lingual performance of the parser can be enhanced by over-generating the source language treebank. To do this, we augment the source language treebank with a perturbed version of itself, in which controlled perturbation is introduced into the parse trees by stochastically reordering the dependents with respect to their heads while keeping the structure of the parse trees unchanged. This enables the parser to capture diverse syntactic patterns in addition to those found in the source language. The resulting parser parses target languages with different syntactic structures more effectively. With English as the source language, our system shows average improvements of 6.7% and 7.7% in UAS and LAS, respectively, over 29 target languages compared to the baseline single-source parser trained on the unperturbed source language treebank. This is also a significant improvement over the transfer parser proposed by (CITATION), which involves an “order-free” parser algorithm.
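The perturbation idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual procedure, so everything here is an assumption made for illustration: the function names (`perturb_order`, `reindex_heads`, `augment`), the single flip probability `p`, the choice to move a dependent to the opposite side of its head, and the representation of a tree as a list of 0-based head indices with -1 for the root. The sketch also assumes projective input trees, since it rebuilds the word order by projective linearization.

```python
import random


def perturb_order(heads, p, rng):
    """Return a new surface order (a list of original token indices).

    heads[i] is the 0-based index of token i's head, or -1 for the root.
    Each dependent is moved to the other side of its head with
    probability p (hypothetical scheme, not taken from the paper).
    """
    children = [[] for _ in heads]
    root = None
    for i, h in enumerate(heads):
        if h == -1:
            root = i
        else:
            children[h].append(i)

    def linearize(h):
        left, right = [], []
        for c in children[h]:  # dependents in original surface order
            on_left = c < h
            if rng.random() < p:
                on_left = not on_left  # flip side relative to the head
            (left if on_left else right).append(c)
        order = []
        for c in left:
            order.extend(linearize(c))
        order.append(h)
        for c in right:
            order.extend(linearize(c))
        return order

    return linearize(root)


def reindex_heads(heads, order):
    """Rewrite head indices so they refer to positions in the new order."""
    new_pos = {old: new for new, old in enumerate(order)}
    new_heads = [0] * len(heads)
    for old, h in enumerate(heads):
        new_heads[new_pos[old]] = -1 if h == -1 else new_pos[h]
    return new_heads


def augment(treebank, p, seed=0):
    """Return the original trees followed by one perturbed copy of each."""
    rng = random.Random(seed)
    out = list(treebank)
    for forms, heads in treebank:
        order = perturb_order(heads, p, rng)
        out.append(([forms[i] for i in order], reindex_heads(heads, order)))
    return out


# Toy tree: "She reads old books", with "reads" as the root.
forms = ["She", "reads", "old", "books"]
heads = [1, -1, 3, 1]
for f, h in augment([(forms, heads)], p=0.3, seed=1):
    print(f, h)
```

In this sketch the set of head-dependent pairs is the same before and after perturbation; only the linear order of the words changes, which matches the abstract's description of reordering dependents while leaving the tree structure intact. Dependency labels, if present, would travel with their tokens in the same way as the word forms.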
Anthology ID:
2019.icon-1.9
Volume:
Proceedings of the 16th International Conference on Natural Language Processing
Month:
December
Year:
2019
Address:
International Institute of Information Technology, Hyderabad, India
Editors:
Dipti Misra Sharma, Pushpak Bhattacharya
Venue:
ICON
Publisher:
NLP Association of India
Pages:
75–84
URL:
https://aclanthology.org/2019.icon-1.9
Cite (ACL):
Ayan Das and Sudeshna Sarkar. 2019. A little perturbation makes a difference: Treebank augmentation by perturbation improves transfer parsing. In Proceedings of the 16th International Conference on Natural Language Processing, pages 75–84, International Institute of Information Technology, Hyderabad, India. NLP Association of India.
Cite (Informal):
A little perturbation makes a difference: Treebank augmentation by perturbation improves transfer parsing (Das & Sarkar, ICON 2019)
PDF:
https://aclanthology.org/2019.icon-1.9.pdf