Is German secretly a Slavic language? What BERT probing can tell us about language groups

Aleksandra Mysiak, Jacek Cyranka


Abstract
In the light of recent developments in NLP, the problem of understanding and interpreting large language models has gained a lot of urgency. Methods developed to study this area are subject to considerable scrutiny. In this work, we take a closer look at one such method, the structural probe introduced by Hewitt and Manning (2019). We run a series of experiments involving multiple languages, focusing principally on the group of Slavic languages. We show that probing results can be seen as a reflection of linguistic classification, and conclude that multilingual BERT learns facts about languages and their groups.
Anthology ID:
2023.bsnlp-1.11
Volume:
Proceedings of the 9th Workshop on Slavic Natural Language Processing 2023 (SlavicNLP 2023)
Month:
May
Year:
2023
Address:
Dubrovnik, Croatia
Editors:
Jakub Piskorski, Michał Marcińczuk, Preslav Nakov, Maciej Ogrodniczuk, Senja Pollak, Pavel Přibáň, Piotr Rybak, Josef Steinberger, Roman Yangarber
Venue:
BSNLP
Publisher:
Association for Computational Linguistics
Pages:
86–93
URL:
https://aclanthology.org/2023.bsnlp-1.11
DOI:
10.18653/v1/2023.bsnlp-1.11
Cite (ACL):
Aleksandra Mysiak and Jacek Cyranka. 2023. Is German secretly a Slavic language? What BERT probing can tell us about language groups. In Proceedings of the 9th Workshop on Slavic Natural Language Processing 2023 (SlavicNLP 2023), pages 86–93, Dubrovnik, Croatia. Association for Computational Linguistics.
Cite (Informal):
Is German secretly a Slavic language? What BERT probing can tell us about language groups (Mysiak & Cyranka, BSNLP 2023)
PDF:
https://aclanthology.org/2023.bsnlp-1.11.pdf
Video:
https://aclanthology.org/2023.bsnlp-1.11.mp4