Rewarding Semantic Similarity under Optimized Alignments for AMR-to-Text Generation

Lisa Jin, Daniel Gildea


Abstract
A common way to combat exposure bias is by applying scores from evaluation metrics as rewards in reinforcement learning (RL). Metrics leveraging contextualized embeddings appear more flexible than their n-gram matching counterparts and thus ideal as training rewards. However, metrics such as BERTScore greedily align candidate and reference tokens, which can allow system outputs to receive excess credit relative to a reference. Furthermore, past approaches featuring semantic similarity rewards suffer from repetitive outputs and overfitting. We address these issues by proposing metrics that replace the greedy alignments in BERTScore with optimized ones. We compute them on a model’s trained token embeddings to prevent domain mismatch. Our model optimizing discrete alignment metrics consistently outperforms cross-entropy and BLEU reward baselines on AMR-to-text generation. In addition, we find that this approach enjoys stable training compared to a non-RL setting.
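The excess-credit problem mentioned in the abstract can be illustrated with a toy example. The sketch below is only an illustration under stated assumptions, not the metrics proposed in the paper. The similarity matrix is made up, the function names `greedy_recall` and `one_to_one_recall` are hypothetical, and a brute-force one-to-one assignment stands in for whatever optimized alignment the authors actually use. It compares a greedy recall-style score, where each reference token independently takes its best-matching candidate token, with a score where each candidate token may be matched at most once.

```python
from itertools import permutations


def greedy_recall(sim):
    """Greedy recall: every reference token (row) takes its best candidate
    token (column), even if several rows pick the same column."""
    return sum(max(row) for row in sim) / len(sim)


def one_to_one_recall(sim):
    """Recall under the best one-to-one alignment: each candidate token
    (column) is used at most once. Brute force over assignments, so it is
    only suitable for toy sizes; assumes rows <= columns."""
    n_ref, n_cand = len(sim), len(sim[0])
    best = max(
        sum(sim[i][j] for i, j in enumerate(cols))
        for cols in permutations(range(n_cand), n_ref)
    )
    return best / n_ref


# Hypothetical similarities: rows are reference tokens, columns are
# candidate tokens. Both reference tokens are most similar to candidate 0.
sim = [
    [1.0, 0.25],
    [0.75, 0.5],
]

greedy = greedy_recall(sim)         # both rows align to column 0
optimized = one_to_one_recall(sim)  # column 0 can be claimed only once
print(greedy, optimized)
```

In this toy case the greedy score is higher because a single candidate token is counted as matching two reference tokens, which is the kind of excess credit a constrained alignment is meant to prevent.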
Anthology ID:
2022.acl-short.80
Volume:
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
Month:
May
Year:
2022
Address:
Dublin, Ireland
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
710–715
URL:
https://aclanthology.org/2022.acl-short.80
DOI:
10.18653/v1/2022.acl-short.80
Cite (ACL):
Lisa Jin and Daniel Gildea. 2022. Rewarding Semantic Similarity under Optimized Alignments for AMR-to-Text Generation. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 710–715, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
Rewarding Semantic Similarity under Optimized Alignments for AMR-to-Text Generation (Jin & Gildea, ACL 2022)
PDF:
https://aclanthology.org/2022.acl-short.80.pdf