An evaluation of Czech word embeddings

Karolína Hořeňovská


Abstract
We present an evaluation of Czech low-dimensional distributed word representations, also known as word embeddings. We describe five approaches to training the models and the three corpora used in training. We evaluate the resulting models on five datasets, report the results, and analyze them further.
Anthology ID:
W19-6107
Volume:
Proceedings of the 22nd Nordic Conference on Computational Linguistics
Month:
September–October
Year:
2019
Address:
Turku, Finland
Editors:
Mareike Hartmann, Barbara Plank
Venue:
NoDaLiDa
Publisher:
Linköping University Electronic Press
Pages:
65–75
URL:
https://aclanthology.org/W19-6107
Cite (ACL):
Karolína Hořeňovská. 2019. An evaluation of Czech word embeddings. In Proceedings of the 22nd Nordic Conference on Computational Linguistics, pages 65–75, Turku, Finland. Linköping University Electronic Press.
Cite (Informal):
An evaluation of Czech word embeddings (Hořeňovská, NoDaLiDa 2019)
PDF:
https://aclanthology.org/W19-6107.pdf