German Parliamentary Corpus (GerParCor) Reloaded

Giuseppe Abrami, Mevlüt Bagci, Alexander Mehler


Abstract
In 2022, GerParCor was collected and published: the largest German-language corpus of parliamentary protocols, spanning three centuries and covering the national and federal levels of Germany, Austria, Switzerland and Liechtenstein. GerParCor made available, for the first time, various parliamentary protocols that had not been available digitally and could not be retrieved and processed in a uniform manner. In addition, the corpus was preprocessed using NLP methods and published in XMI format. In this paper, GerParCor is significantly updated: all new parliamentary protocols are included, and further protocols not previously covered are added and preprocessed, so that coverage now extends back to 1797. Besides integrating a new, state-of-the-art NLP preprocessing suited to handling large text corpora, this update also gives an overview of the further reuse of GerParCor by presenting various provisioning capabilities, including APIs.
Anthology ID:
2024.lrec-main.681
Volume:
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Month:
May
Year:
2024
Address:
Torino, Italia
Editors:
Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
Venues:
LREC | COLING
Publisher:
ELRA and ICCL
Pages:
7707–7716
URL:
https://aclanthology.org/2024.lrec-main.681
Cite (ACL):
Giuseppe Abrami, Mevlüt Bagci, and Alexander Mehler. 2024. German Parliamentary Corpus (GerParCor) Reloaded. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 7707–7716, Torino, Italia. ELRA and ICCL.
Cite (Informal):
German Parliamentary Corpus (GerParCor) Reloaded (Abrami et al., LREC-COLING 2024)
PDF:
https://aclanthology.org/2024.lrec-main.681.pdf