Rules and neural nets for morphological tagging of Norwegian - Results and challenges

Dag Haug, Ahmet Yildirim, Kristin Hagen, Anders Nøklestad


Abstract
This paper reports on efforts to improve the Oslo-Bergen Tagger for Norwegian morphological tagging. We train two deep neural network-based taggers using the recently introduced Norwegian pre-trained encoder (a BERT model for Norwegian). The first network is a sequence-to-sequence encoder-decoder and the second is a sequence classifier. We test both configurations on their own and in a hybrid system where they are combined with the existing rule-based system. The sequence-to-sequence system performs better in the hybrid configuration, whereas the classifier system performs so well on its own that combining it with the rules is slightly detrimental to performance.
Anthology ID:
2023.nodalida-1.43
Volume:
Proceedings of the 24th Nordic Conference on Computational Linguistics (NoDaLiDa)
Month:
May
Year:
2023
Address:
Tórshavn, Faroe Islands
Editors:
Tanel Alumäe, Mark Fishel
Venue:
NoDaLiDa
Publisher:
University of Tartu Library
Pages:
425–435
URL:
https://aclanthology.org/2023.nodalida-1.43
Cite (ACL):
Dag Haug, Ahmet Yildirim, Kristin Hagen, and Anders Nøklestad. 2023. Rules and neural nets for morphological tagging of Norwegian - Results and challenges. In Proceedings of the 24th Nordic Conference on Computational Linguistics (NoDaLiDa), pages 425–435, Tórshavn, Faroe Islands. University of Tartu Library.
Cite (Informal):
Rules and neural nets for morphological tagging of Norwegian - Results and challenges (Haug et al., NoDaLiDa 2023)
PDF:
https://aclanthology.org/2023.nodalida-1.43.pdf