Jóhannes B. Sigtryggsson


2025

Playing by the Rules: A Benchmark Set for Standardized Icelandic Orthography
Bjarki Ármannsson | Hinrik Hafsteinsson | Jóhannes B. Sigtryggsson | Atli Jasonarson | Einar Freyr Sigurðsson | Steinþór Steingrímsson
Proceedings of the Joint 25th Nordic Conference on Computational Linguistics and 11th Baltic Conference on Human Language Technologies (NoDaLiDa/Baltic-HLT 2025)

We present the Icelandic Standardization Benchmark Set: Spelling and Punctuation (IceStaBS:SP), a dataset designed to provide standardized text examples for Icelandic orthography. The dataset includes non-standard orthography examples and their standardized counterparts, along with detailed explanations based on official Icelandic spelling rules. IceStaBS:SP aims to support the development and evaluation of automatic spell and grammar checkers, particularly in educational settings. We evaluate various spell and grammar checkers using IceStaBS:SP, demonstrating its utility as a benchmarking tool and highlighting areas for future improvement.