Badr Jaidi
2022
Impact of Sequence Length and Copying on Clause-Level Inflection
Badr Jaidi | Utkarsh Saboo | Xihan Wu | Garrett Nicolai | Miikka Silfverberg
Proceedings of the 2nd Workshop on Multi-lingual Representation Learning (MRL)
We present the University of British Columbia’s submission to the MRL shared task on multilingual clause-level morphology. Our submission extends word-level inflectional models to the clause level in two ways: first, by evaluating the role that byte-pair encoding (BPE) plays in the learning of inflectional morphology, and second, by evaluating the importance of a copy bias obtained through data hallucination. Experiments demonstrate a strong preference for language-tuned BPE and a copy bias over a vanilla transformer. The methods are complementary for the inflection and analysis tasks: combined models see error reductions of 38% for inflection and 15.6% for analysis. However, this synergy does not hold for reinflection, which performs best under a BPE-only setting. A deeper analysis of the errors generated by our models illustrates that the copy bias may be too strong: the combined model produces predictions more similar to those of the copy-influenced system, despite the success of the BPE model.