Generating from AMRs into High and Low-Resource Languages using Phylogenetic Knowledge and Hierarchical QLoRA Training (HQL)

William Soto Martinez, Yannick Parmentier, Claire Gardent


Abstract
Multilingual generation from Abstract Meaning Representations (AMRs) verbalises AMRs into multiple languages. Previous work has focused on high- and medium-resource languages, relying on large amounts of training data. In this work, we consider both high- and low-resource languages, capping training data size at the lower bound set by our low-resource languages, i.e. 31K. We propose a straightforward technique to enhance results on low-resource languages while preserving performance on high-resource languages. We iteratively refine a multilingual model into a set of monolingual models using Low-Rank Adaptation with a training curriculum based on a tree structure; this permits investigating how the languages used at each iteration impact generation performance on high- and low-resource languages. We show an improvement over both monolingual and multilingual approaches. Comparing different ways of grouping languages at each iteration step, we find two working configurations: grouping related languages, which promotes transfer, or grouping distant languages, which facilitates regularisation.
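The tree-structured curriculum described in the abstract can be pictured with a minimal sketch. It is not the authors' implementation: the `Node` class, the `fine_tune` stub, and the language grouping are hypothetical names and values chosen for illustration, and the stub only records which language groups a model has been trained on instead of updating any adapter weights.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A node in a (hypothetical) language tree: one group of languages."""
    languages: tuple
    children: list = field(default_factory=list)


def fine_tune(parent_adapter, languages):
    """Stand-in for one round of adapter training on data for `languages`.

    Assumption: a real version would continue training low-rank adapter
    weights; this stub only appends the group to the training history.
    """
    return parent_adapter + [languages]


def hierarchical_train(node, parent_adapter=None):
    """Walk the tree top-down, starting each node from its parent's adapter.

    Returns a dict mapping each leaf language to its training history.
    Leaves are assumed to hold exactly one language.
    """
    adapter = fine_tune(parent_adapter or [], node.languages)
    if not node.children:
        return {node.languages[0]: adapter}
    models = {}
    for child in node.children:
        models.update(hierarchical_train(child, adapter))
    return models


# Illustrative grouping only; not the language set used in the paper.
tree = Node(("en", "de", "es", "fr"), [
    Node(("en", "de"), [Node(("en",)), Node(("de",))]),
    Node(("es", "fr"), [Node(("es",)), Node(("fr",))]),
])
models = hierarchical_train(tree)
```

In this sketch each monolingual model inherits the training of every group on the path from the root to its leaf, so changing how languages are grouped at the intermediate level changes what each final model has seen.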
Anthology ID:
2024.inlg-main.7
Volume:
Proceedings of the 17th International Natural Language Generation Conference
Month:
September
Year:
2024
Address:
Tokyo, Japan
Editors:
Saad Mahamood, Nguyen Le Minh, Daphne Ippolito
Venue:
INLG
SIG:
SIGGEN
Publisher:
Association for Computational Linguistics
Pages:
70–81
URL:
https://aclanthology.org/2024.inlg-main.7
Cite (ACL):
William Soto Martinez, Yannick Parmentier, and Claire Gardent. 2024. Generating from AMRs into High and Low-Resource Languages using Phylogenetic Knowledge and Hierarchical QLoRA Training (HQL). In Proceedings of the 17th International Natural Language Generation Conference, pages 70–81, Tokyo, Japan. Association for Computational Linguistics.
Cite (Informal):
Generating from AMRs into High and Low-Resource Languages using Phylogenetic Knowledge and Hierarchical QLoRA Training (HQL) (Soto Martinez et al., INLG 2024)
PDF:
https://aclanthology.org/2024.inlg-main.7.pdf
Supplementary attachment:
2024.inlg-main.7.Supplementary_Attachment.pdf