disco: a toolkit for Distributional Control of Generative Models

Germán Kruszewski, Jos Rozen, Marc Dymetman


Abstract
Pre-trained language models and other generative models have revolutionized NLP and beyond. However, these models tend to reproduce undesirable biases present in their training data. Also, they may overlook patterns that are important but challenging to capture. To address these limitations, researchers have introduced distributional control techniques. These techniques, not limited to language, allow controlling the prevalence (i.e. expectations) of any features of interest in the model’s outputs. Despite their potential, the widespread adoption of these techniques has been hindered by the difficulty in adapting the complex, disconnected code. Here, we present disco, an open-source Python library that brings these techniques to the broader public.
Anthology ID:
2023.acl-demo.14
Volume:
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Danushka Bollegala, Ruihong Huang, Alan Ritter
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
144–160
URL:
https://aclanthology.org/2023.acl-demo.14
DOI:
10.18653/v1/2023.acl-demo.14
Cite (ACL):
Germán Kruszewski, Jos Rozen, and Marc Dymetman. 2023. disco: a toolkit for Distributional Control of Generative Models. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations), pages 144–160, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
disco: a toolkit for Distributional Control of Generative Models (Kruszewski et al., ACL 2023)
PDF:
https://aclanthology.org/2023.acl-demo.14.pdf
Video:
https://aclanthology.org/2023.acl-demo.14.mp4