Corpus Complexity Matters in Pretraining Language Models

Ameeta Agrawal, Suresh Singh


Anthology ID:
2023.sustainlp-1.20
Volume:
Proceedings of The Fourth Workshop on Simple and Efficient Natural Language Processing (SustaiNLP)
Month:
July
Year:
2023
Address:
Toronto, Canada (Hybrid)
Editors:
Nafise Sadat Moosavi, Iryna Gurevych, Yufang Hou, Gyuwan Kim, Young Jin Kim, Tal Schuster, Ameeta Agrawal
Venue:
sustainlp
Publisher:
Association for Computational Linguistics
Pages:
257–263
URL:
https://aclanthology.org/2023.sustainlp-1.20
DOI:
10.18653/v1/2023.sustainlp-1.20
Cite (ACL):
Ameeta Agrawal and Suresh Singh. 2023. Corpus Complexity Matters in Pretraining Language Models. In Proceedings of The Fourth Workshop on Simple and Efficient Natural Language Processing (SustaiNLP), pages 257–263, Toronto, Canada (Hybrid). Association for Computational Linguistics.
Cite (Informal):
Corpus Complexity Matters in Pretraining Language Models (Agrawal & Singh, sustainlp 2023)
PDF:
https://aclanthology.org/2023.sustainlp-1.20.pdf