Distant Reading in Digital Humanities: Case Study on the Serbian Part of the ELTeC Collection

Ranka Stanković, Cvetana Krstev, Branislava Šandrih Todorović, Dusko Vitas, Mihailo Skoric, Milica Ikonić Nešić


Abstract
In this paper we present the Serbian part of the ELTeC multilingual corpus of novels written in the period 1840-1920. The corpus is being built to test various distant reading methods and tools, with the aim of re-thinking European literary history. We present the steps that led to the production of the Serbian sub-collection: novel selection and retrieval, text preparation, structural annotation, POS-tagging, lemmatization, and named entity recognition. The Serbian sub-collection was published on several platforms to make it freely available to various users. Several usage examples show that this sub-collection is useful for both close and distant reading approaches.
Anthology ID:
2022.lrec-1.356
Volume:
Proceedings of the Thirteenth Language Resources and Evaluation Conference
Month:
June
Year:
2022
Address:
Marseille, France
Editors:
Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
3337–3345
URL:
https://aclanthology.org/2022.lrec-1.356
Cite (ACL):
Ranka Stanković, Cvetana Krstev, Branislava Šandrih Todorović, Dusko Vitas, Mihailo Skoric, and Milica Ikonić Nešić. 2022. Distant Reading in Digital Humanities: Case Study on the Serbian Part of the ELTeC Collection. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 3337–3345, Marseille, France. European Language Resources Association.
Cite (Informal):
Distant Reading in Digital Humanities: Case Study on the Serbian Part of the ELTeC Collection (Stanković et al., LREC 2022)
PDF:
https://aclanthology.org/2022.lrec-1.356.pdf