Parsing Icelandic Alþingi Transcripts: Parliamentary Speeches as a Genre

Kristján Rúnarsson, Einar Freyr Sigurðsson


Abstract
We introduce a corpus of transcripts from Alþingi, the Icelandic parliament. The corpus is syntactically parsed for phrase structure according to the annotation scheme of the Icelandic Parsed Historical Corpus (IcePaHC). This addition to IcePaHC makes it more diverse with respect to text types, and we argue that having a syntactically parsed corpus facilitates research on different types of texts. We furthermore argue that the speech corpus can be treated somewhat like spoken language even though the transcripts differ in various ways from daily spoken language. We also compare this text type to other types and argue that this genre can shed light on their properties. Finally, we show how this addition to IcePaHC has helped us identify and solve issues with our parsing scheme.
Anthology ID:
2020.parlaclarin-1.9
Volume:
Proceedings of the Second ParlaCLARIN Workshop
Month:
May
Year:
2020
Address:
Marseille, France
Editors:
Darja Fišer, Maria Eskevich, Franciska de Jong
Venue:
ParlaCLARIN
Publisher:
European Language Resources Association
Pages:
44–50
Language:
English
URL:
https://aclanthology.org/2020.parlaclarin-1.9
Cite (ACL):
Kristján Rúnarsson and Einar Freyr Sigurðsson. 2020. Parsing Icelandic Alþingi Transcripts: Parliamentary Speeches as a Genre. In Proceedings of the Second ParlaCLARIN Workshop, pages 44–50, Marseille, France. European Language Resources Association.
Cite (Informal):
Parsing Icelandic Alþingi Transcripts: Parliamentary Speeches as a Genre (Rúnarsson & Sigurðsson, ParlaCLARIN 2020)
PDF:
https://aclanthology.org/2020.parlaclarin-1.9.pdf