Jóhannes B. Sigtryggsson
2025
Playing by the Rules: A Benchmark Set for Standardized Icelandic Orthography
Bjarki Ármannsson | Hinrik Hafsteinsson | Jóhannes B. Sigtryggsson | Atli Jasonarson | Einar Freyr Sigurðsson | Steinþór Steingrímsson
Proceedings of the Joint 25th Nordic Conference on Computational Linguistics and 11th Baltic Conference on Human Language Technologies (NoDaLiDa/Baltic-HLT 2025)
We present the Icelandic Standardization Benchmark Set: Spelling and Punctuation (IceStaBS:SP), a dataset designed to provide standardized text examples for Icelandic orthography. The dataset includes non-standard orthography examples and their standardized counterparts, along with detailed explanations based on official Icelandic spelling rules. IceStaBS:SP aims to support the development and evaluation of automatic spell and grammar checkers, particularly in educational settings. We evaluate various spell and grammar checkers using IceStaBS:SP, demonstrating its utility as a benchmarking tool and highlighting areas for future improvement.