Diversifying Neural Text Generation with Part-of-Speech Guided Softmax and Sampling

Zhixian Yang, Pengxuan Xu, Xiaojun Wan


Abstract
Neural text generation models are likely to suffer from the low-diversity problem. Various decoding strategies and training-based methods have been proposed to promote diversity only by exploiting contextual features, but rarely do they consider incorporating syntactic structure clues. In this work, we propose using linguistic annotation, i.e., part-of-speech (POS), to guide the text generation. In detail, we introduce POS Guided Softmax to explicitly model two posterior probabilities: (i) next-POS, and (ii) next-token from the vocabulary of the target POS. A POS Guided Sampling strategy is further proposed to address the low-diversity problem by enriching the diversity of POS. Extensive experiments and human evaluations show that, compared with existing state-of-the-art methods, our POS Guided Softmax and Sampling (POSG) can generate more diverse text while maintaining comparable quality.
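The factorization described in the abstract can be illustrated with a toy sketch: first sample a POS tag from the next-POS distribution (optionally flattened with a temperature to enrich POS diversity), then sample a token from that tag's sub-vocabulary. All names, vocabularies, and probability values below are hypothetical, not taken from the paper's implementation, and the disjoint POS partition is an assumption for illustration.

```python
import random

# Hypothetical vocabulary partitioned by POS tag (assumption:
# each token belongs to exactly one POS sub-vocabulary).
pos_vocab = {
    "NOUN": ["cat", "dog", "house"],
    "VERB": ["runs", "sleeps"],
    "ADJ":  ["happy", "quick"],
}

def pos_guided_sample(pos_probs, token_probs, temperature=1.0, rng=random):
    """Two-stage sampling sketch:
    P(token | context) = P(pos | context) * P(token | pos, context).
    A higher temperature on the POS distribution flattens it,
    enriching the diversity of sampled POS tags (and hence tokens)."""
    tags = list(pos_probs)
    # Temperature applies to the POS distribution only.
    weights = [pos_probs[t] ** (1.0 / temperature) for t in tags]
    tag = rng.choices(tags, weights=weights, k=1)[0]
    tokens = pos_vocab[tag]
    token_weights = [token_probs[tag][w] for w in tokens]
    token = rng.choices(tokens, weights=token_weights, k=1)[0]
    return tag, token

# Hypothetical distributions for a single decoding step.
pos_probs = {"NOUN": 0.6, "VERB": 0.3, "ADJ": 0.1}
token_probs = {
    "NOUN": {"cat": 0.5, "dog": 0.3, "house": 0.2},
    "VERB": {"runs": 0.7, "sleeps": 0.3},
    "ADJ":  {"happy": 0.6, "quick": 0.4},
}
tag, token = pos_guided_sample(pos_probs, token_probs, temperature=1.5)
```

This is only a schematic of the two posterior probabilities the abstract names; in the actual model both distributions are produced by the network conditioned on context, and the POS-level sampling strategy is what promotes diversity.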
Anthology ID:
2022.coling-1.570
Volume:
Proceedings of the 29th International Conference on Computational Linguistics
Month:
October
Year:
2022
Address:
Gyeongju, Republic of Korea
Venue:
COLING
Publisher:
International Committee on Computational Linguistics
Pages:
6547–6563
URL:
https://aclanthology.org/2022.coling-1.570
Cite (ACL):
Zhixian Yang, Pengxuan Xu, and Xiaojun Wan. 2022. Diversifying Neural Text Generation with Part-of-Speech Guided Softmax and Sampling. In Proceedings of the 29th International Conference on Computational Linguistics, pages 6547–6563, Gyeongju, Republic of Korea. International Committee on Computational Linguistics.
Cite (Informal):
Diversifying Neural Text Generation with Part-of-Speech Guided Softmax and Sampling (Yang et al., COLING 2022)
PDF:
https://aclanthology.org/2022.coling-1.570.pdf
Code
 fadedcosine/pos-guided-neural-text-generation
Data
PARANMT-50M
WikiText-103
WikiText-2