Academics Can Contribute to Domain-Specialized Language Models

Mark Dredze, Genta Winata, Prabhanjan Kambadur, Shijie Wu, Ozan Irsoy, Steven Lu, Vadim Dabravolski, David Rosenberg, Sebastian Gehrmann


Abstract
Commercially available models dominate academic leaderboards. While impressive, this dominance has concentrated research on creating and adapting general-purpose models to improve NLP leaderboard standings for large language models. However, leaderboards collect many individual tasks, and general-purpose models often underperform in specialized domains; domain-specific or adapted models yield superior results. This focus on large general-purpose models excludes many academics and draws attention away from areas where they can make important contributions. We advocate for a renewed focus on developing and evaluating domain- and task-specific models, and highlight the unique role of academics in this endeavor.
Anthology ID: 2024.emnlp-main.293
Volume: Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Month: November
Year: 2024
Address: Miami, Florida, USA
Editors: Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 5100–5110
URL: https://aclanthology.org/2024.emnlp-main.293
Cite (ACL): Mark Dredze, Genta Winata, Prabhanjan Kambadur, Shijie Wu, Ozan Irsoy, Steven Lu, Vadim Dabravolski, David Rosenberg, and Sebastian Gehrmann. 2024. Academics Can Contribute to Domain-Specialized Language Models. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 5100–5110, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal): Academics Can Contribute to Domain-Specialized Language Models (Dredze et al., EMNLP 2024)
PDF: https://aclanthology.org/2024.emnlp-main.293.pdf