ELC-ParserBERT: Low-Resource Language Modeling Utilizing a Parser Network With ELC-BERT

Rufus Behr


Abstract
This paper investigates the effect of including a parser network, which produces syntactic heights and distances for unsupervised parsing, in the Every Layer Counts BERT (ELC-BERT) architecture trained on 10M tokens for the 2024 BabyLM challenge. Including the parser network yields little or no improvement over the ELC-BERT baseline on the BLiMP and GLUE evaluations, but in particular domains of the EWoK evaluation framework it shows promise and raises interesting questions about its effect on learning different concepts.
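To make the abstract's two ingredients concrete, the sketch below is a minimal, hypothetical illustration (not the paper's implementation) of (1) a parser network that predicts per-token syntactic heights and per-adjacent-pair distances from hidden states, in the style of StructFormer-like unsupervised parsing, and (2) an ELC-BERT-style layer input formed as a learned weighted sum of all preceding layers' outputs. All module and parameter names here are assumptions made for illustration.

```python
# Hypothetical sketch of the two components the abstract combines.
import torch
import torch.nn as nn


class ParserNetwork(nn.Module):
    """Predicts syntactic heights (one per token) and distances (one per
    adjacent token pair) from hidden states, for unsupervised parsing."""

    def __init__(self, hidden_size: int, conv_size: int = 9):
        super().__init__()
        padding = conv_size // 2
        self.height_net = nn.Sequential(
            nn.Conv1d(hidden_size, hidden_size, conv_size, padding=padding),
            nn.GELU(),
            nn.Conv1d(hidden_size, 1, 1),
        )
        self.distance_net = nn.Sequential(
            nn.Conv1d(hidden_size, hidden_size, 2),  # looks at adjacent token pairs
            nn.GELU(),
            nn.Conv1d(hidden_size, 1, 1),
        )

    def forward(self, hidden: torch.Tensor):
        # hidden: (batch, seq_len, hidden_size); Conv1d expects channels first
        x = hidden.transpose(1, 2)
        heights = self.height_net(x).squeeze(1)      # (batch, seq_len)
        distances = self.distance_net(x).squeeze(1)  # (batch, seq_len - 1)
        return heights, distances


class ELCLayerMix(nn.Module):
    """ELC-BERT-style input to a layer: a learned (softmax-normalised)
    weighted sum of the outputs of all preceding layers, embeddings included."""

    def __init__(self, layer_idx: int):
        super().__init__()
        # One weight per preceding output (embeddings + layers 1..layer_idx)
        self.weights = nn.Parameter(torch.zeros(layer_idx + 1))

    def forward(self, prior_outputs: list[torch.Tensor]) -> torch.Tensor:
        alphas = torch.softmax(self.weights, dim=0)
        stacked = torch.stack(prior_outputs, dim=0)  # (L, batch, seq, dim)
        return torch.einsum("l,lbsd->bsd", alphas, stacked)


if __name__ == "__main__":
    batch, seq_len, dim = 2, 16, 64
    embeddings = torch.randn(batch, seq_len, dim)

    parser = ParserNetwork(dim)
    heights, distances = parser(embeddings)
    print(heights.shape, distances.shape)  # (2, 16) and (2, 15)

    mix = ELCLayerMix(layer_idx=2)  # combining embeddings + 2 layer outputs
    combined = mix([embeddings, torch.randn_like(embeddings), torch.randn_like(embeddings)])
    print(combined.shape)  # (2, 16, 64)
```

In the full model, the predicted heights and distances would typically be turned into masks or biases on the attention pattern, while the layer-mixing weights let each transformer layer decide how much each earlier layer (or the embedding layer) contributes to its input.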
Anthology ID: 2024.conll-babylm.11
Volume: The 2nd BabyLM Challenge at the 28th Conference on Computational Natural Language Learning
Month: November
Year: 2024
Address: Miami, FL, USA
Editors: Michael Y. Hu, Aaron Mueller, Candace Ross, Adina Williams, Tal Linzen, Chengxu Zhuang, Leshem Choshen, Ryan Cotterell, Alex Warstadt, Ethan Gotlieb Wilcox
Venues: CoNLL | BabyLM | WS
Publisher: Association for Computational Linguistics
Pages: 140–146
URL: https://aclanthology.org/2024.conll-babylm.11/
Cite (ACL): Rufus Behr. 2024. ELC-ParserBERT: Low-Resource Language Modeling Utilizing a Parser Network With ELC-BERT. In The 2nd BabyLM Challenge at the 28th Conference on Computational Natural Language Learning, pages 140–146, Miami, FL, USA. Association for Computational Linguistics.
Cite (Informal): ELC-ParserBERT: Low-Resource Language Modeling Utilizing a Parser Network With ELC-BERT (Behr, CoNLL-BabyLM 2024)
PDF: https://aclanthology.org/2024.conll-babylm.11.pdf