Dynatask: A Framework for Creating Dynamic AI Benchmark Tasks

Tristan Thrush, Kushal Tirumala, Anmol Gupta, Max Bartolo, Pedro Rodriguez, Tariq Kane, William Gaviria Rojas, Peter Mattson, Adina Williams, Douwe Kiela


Abstract
We introduce Dynatask, an open-source system for setting up custom NLP tasks. It aims to greatly lower the technical knowledge and effort required for hosting and evaluating state-of-the-art NLP models, as well as for conducting model-in-the-loop data collection with crowdworkers. Dynatask is integrated with Dynabench, a research platform for rethinking benchmarking in AI that facilitates human-and-model-in-the-loop data collection and evaluation. To create a task, users only need to write a short task configuration file, from which the relevant web interfaces and model hosting infrastructure are automatically generated. The system is available at https://dynabench.org/ and the full library can be found at https://github.com/facebookresearch/dynabench.
Anthology ID: 2022.acl-demo.17
Volume: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics: System Demonstrations
Month: May
Year: 2022
Address: Dublin, Ireland
Editors: Valerio Basile, Zornitsa Kozareva, Sanja Stajner
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 174–181
URL: https://aclanthology.org/2022.acl-demo.17
DOI: 10.18653/v1/2022.acl-demo.17
Cite (ACL):
Tristan Thrush, Kushal Tirumala, Anmol Gupta, Max Bartolo, Pedro Rodriguez, Tariq Kane, William Gaviria Rojas, Peter Mattson, Adina Williams, and Douwe Kiela. 2022. Dynatask: A Framework for Creating Dynamic AI Benchmark Tasks. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics: System Demonstrations, pages 174–181, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
Dynatask: A Framework for Creating Dynamic AI Benchmark Tasks (Thrush et al., ACL 2022)
PDF: https://aclanthology.org/2022.acl-demo.17.pdf
Video: https://aclanthology.org/2022.acl-demo.17.mp4
Code: facebookresearch/dynabench
Data: ANLI, AdversarialQA, GLUE