The Effect of Generalisation on the Inadequacy of the Mode

Bryan Eikema


Abstract
The highest-probability sequences of most neural language generation models tend to be degenerate in some way, a problem known as the inadequacy of the mode. While many approaches exist for tackling particular aspects of the problem, such as overly short sequences or excessive repetition, explanations of why it occurs in the first place are rarer and do not agree with each other. We believe none of the existing explanations paints a complete picture. In this position paper, we aim to bring to light the incredible complexity of the modelling task and the problems that generalising to previously unseen contexts brings. We argue that our desire for models to generalise to contexts they have never observed before is exactly what leads to spread of probability mass and inadequate modes. While we do not claim that adequate modes are impossible, we argue that they are not to be expected either.
Anthology ID:
2024.uncertainlp-1.9
Volume:
Proceedings of the 1st Workshop on Uncertainty-Aware NLP (UncertaiNLP 2024)
Month:
March
Year:
2024
Address:
St Julians, Malta
Editors:
Raúl Vázquez, Hande Celikkanat, Dennis Ulmer, Jörg Tiedemann, Swabha Swayamdipta, Wilker Aziz, Barbara Plank, Joris Baan, Marie-Catherine de Marneffe
Venues:
UncertaiNLP | WS
Publisher:
Association for Computational Linguistics
Pages:
87–92
URL:
https://aclanthology.org/2024.uncertainlp-1.9
Cite (ACL):
Bryan Eikema. 2024. The Effect of Generalisation on the Inadequacy of the Mode. In Proceedings of the 1st Workshop on Uncertainty-Aware NLP (UncertaiNLP 2024), pages 87–92, St Julians, Malta. Association for Computational Linguistics.
Cite (Informal):
The Effect of Generalisation on the Inadequacy of the Mode (Eikema, UncertaiNLP-WS 2024)
PDF:
https://aclanthology.org/2024.uncertainlp-1.9.pdf