Rules Ruling Neural Networks - Neural vs. Rule-Based Grammar Checking for a Low Resource Language

Linda Wiechetek, Flammie Pirinen, Mika Hämäläinen, Chiara Argese


Abstract
We investigate both rule-based and machine learning methods for the task of compound error correction and evaluate their efficiency for North Sámi, a low-resource language. The lack of error-free data needed for a neural approach is a challenge to the development of these tools that bigger languages do not face. To compensate for that, we used a rule-based grammar checker to remove erroneous sentences and inserted compound errors by splitting correct compounds. We describe how we set up the error detection rules and how we train a bi-RNN-based neural network. The precision of the rule-based model tested on a corpus with real errors (81.0%) is slightly better than that of the neural model (79.4%). The rule-based model is also more flexible with regard to fixing specific errors requested by the user community. However, the neural model has better recall (98%). The results suggest that an approach combining the advantages of both models would be desirable in the future. Our tools and data sets are open-source and freely available on GitHub and Zenodo.
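The data-generation idea in the abstract (inserting compound errors by splitting correct compounds) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `compound_boundaries` lookup, the `error_rate` parameter, and the toy English example are all assumptions made for illustration. In the paper's setting, the compound analysis and the filtering of erroneous sentences would come from the rule-based grammar checker.

```python
import random


def split_compound(word, boundary):
    """Insert a space at the given character index to simulate a compound error."""
    return word[:boundary] + " " + word[boundary:]


def make_training_pairs(sentences, compound_boundaries, rng, error_rate=0.5):
    """Build (noisy, clean) sentence pairs from sentences assumed to be error-free.

    compound_boundaries is a hypothetical lookup from a compound word to the
    character index where it may be split; rng is a random.Random instance.
    """
    pairs = []
    for sentence in sentences:
        noisy_tokens = []
        for token in sentence.split():
            boundary = compound_boundaries.get(token)
            # Split a known compound with probability error_rate.
            if boundary is not None and rng.random() < error_rate:
                noisy_tokens.append(split_compound(token, boundary))
            else:
                noisy_tokens.append(token)
        pairs.append((" ".join(noisy_tokens), sentence))
    return pairs


# Toy English example (not North Sámi data): always split known compounds.
pairs = make_training_pairs(
    ["we play football"], {"football": 4}, random.Random(0), error_rate=1.0
)
```

Each pair holds the synthetically corrupted sentence and its correct original, which is the form of parallel data a sequence-to-sequence correction model would typically be trained on.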
Anthology ID:
2021.ranlp-1.171
Volume:
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021)
Month:
September
Year:
2021
Address:
Held Online
Editors:
Ruslan Mitkov, Galia Angelova
Venue:
RANLP
Publisher:
INCOMA Ltd.
Pages:
1526–1535
URL:
https://aclanthology.org/2021.ranlp-1.171
Cite (ACL):
Linda Wiechetek, Flammie Pirinen, Mika Hämäläinen, and Chiara Argese. 2021. Rules Ruling Neural Networks - Neural vs. Rule-Based Grammar Checking for a Low Resource Language. In Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021), pages 1526–1535, Held Online. INCOMA Ltd.
Cite (Informal):
Rules Ruling Neural Networks - Neural vs. Rule-Based Grammar Checking for a Low Resource Language (Wiechetek et al., RANLP 2021)
PDF:
https://aclanthology.org/2021.ranlp-1.171.pdf