Making a Point: Pointer-Generator Transformers for Disjoint Vocabularies

Nikhil Prabhu, Katharina Kann


Abstract
Explicit mechanisms for copying have improved the performance of neural models for sequence-to-sequence tasks in the low-resource setting. However, they rely on an overlap between source and target vocabularies. Here, we propose a model that does not: a pointer-generator transformer for disjoint vocabularies. We apply our model to a low-resource version of the grapheme-to-phoneme conversion (G2P) task, and show that it outperforms a standard transformer by an average of 5.1 WER over 15 languages. While our model does not beat the best-performing baseline, we demonstrate that it provides complementary information to it: an oracle that combines the best outputs of the two models improves over the strongest baseline by 7.7 WER on average in the low-resource setting. In the high-resource setting, our model performs comparably to a standard transformer.
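The oracle comparison in the abstract can be illustrated with a minimal sketch. It assumes that WER here means the percentage of words whose predicted phoneme sequence differs from the reference, and that the oracle counts a word as correct when at least one of the two models gets it right. The function names and the toy data are hypothetical and are not taken from the paper or its code release.

```python
from typing import List, Sequence


def wer(predictions: Sequence[Sequence[str]], gold: Sequence[Sequence[str]]) -> float:
    """Word error rate in percent: share of words with a wrong phoneme sequence.

    Assumption: a word counts as an error unless the prediction matches the
    reference exactly.
    """
    if not gold or len(predictions) != len(gold):
        raise ValueError("predictions and gold must be non-empty and aligned")
    errors = sum(list(p) != list(g) for p, g in zip(predictions, gold))
    return 100.0 * errors / len(gold)


def oracle_wer(
    preds_a: Sequence[Sequence[str]],
    preds_b: Sequence[Sequence[str]],
    gold: Sequence[Sequence[str]],
) -> float:
    """WER of an oracle that picks, per word, whichever model is correct.

    A word is an oracle error only if both models are wrong on it.
    """
    if not gold or not (len(preds_a) == len(preds_b) == len(gold)):
        raise ValueError("all inputs must be non-empty and aligned")
    errors = sum(
        list(a) != list(g) and list(b) != list(g)
        for a, b, g in zip(preds_a, preds_b, gold)
    )
    return 100.0 * errors / len(gold)


# Hypothetical toy data: four words, phonemes written as plain ASCII symbols.
gold: List[List[str]] = [["k", "a", "t"], ["d", "o", "g"], ["f", "i", "sh"], ["b", "er", "d"]]
model_a = [["k", "a", "t"], ["d", "o", "g"], ["f", "i", "s"], ["b", "i", "d"]]
model_b = [["k", "a", "t"], ["d", "o", "k"], ["f", "i", "sh"], ["b", "a", "d"]]

print(wer(model_a, gold))                  # model A alone
print(wer(model_b, gold))                  # model B alone
print(oracle_wer(model_a, model_b, gold))  # per-word best of both
```

In this toy setup the two models make errors on different words, so the oracle's error rate is lower than either model's. This is the sense in which two systems can provide complementary information even when one has the lower individual WER.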
Anthology ID:
2020.aacl-srw.13
Volume:
Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing: Student Research Workshop
Month:
December
Year:
2020
Address:
Suzhou, China
Editors:
Boaz Shmueli, Yin Jou Huang
Venue:
AACL
Publisher:
Association for Computational Linguistics
Pages:
85–92
URL:
https://aclanthology.org/2020.aacl-srw.13
Cite (ACL):
Nikhil Prabhu and Katharina Kann. 2020. Making a Point: Pointer-Generator Transformers for Disjoint Vocabularies. In Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing: Student Research Workshop, pages 85–92, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
Making a Point: Pointer-Generator Transformers for Disjoint Vocabularies (Prabhu & Kann, AACL 2020)
PDF:
https://aclanthology.org/2020.aacl-srw.13.pdf
Code:
nala-cub/g2p-pgt