Rules Ruling Neural Networks - Neural vs. Rule-Based Grammar Checking for a Low Resource Language

Linda Wiechetek; Flammie A. Pirinen; Mika Hämäläinen; Chiara Argese

Abstract

We investigate both rule-based and machine learning methods for the task of compound error correction and evaluate their efficiency for North Sámi, a low-resource language. The lack of error-free data needed for a neural approach is a challenge to the development of these tools, one that bigger languages do not share. To compensate, we used a rule-based grammar checker to remove erroneous sentences and inserted compound errors by splitting correct compounds. We describe how we set up the error detection rules and how we train a bi-RNN-based neural network. The precision of the rule-based model tested on a corpus with real errors (81.0%) is slightly better than that of the neural model (79.4%). The rule-based model is also more flexible with regard to fixing specific errors requested by the user community. However, the neural model has better recall (98%). The results suggest that an approach combining the advantages of both models would be desirable in the future. Our tools and data sets are open-source and freely available on GitHub and Zenodo.

Anthology ID: 2021.ranlp-1.171
Volume: Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021)
Month: September
Year: 2021
Address: Held Online
Editors: Ruslan Mitkov, Galia Angelova
Venue: RANLP
Publisher: INCOMA Ltd.
Pages: 1526–1535
URL: https://aclanthology.org/2021.ranlp-1.171/
Cite (ACL): Linda Wiechetek, Flammie A Pirinen, Mika Hämäläinen, and Chiara Argese. 2021. Rules Ruling Neural Networks - Neural vs. Rule-Based Grammar Checking for a Low Resource Language. In Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021), pages 1526–1535, Held Online. INCOMA Ltd.
Cite (Informal): Rules Ruling Neural Networks - Neural vs. Rule-Based Grammar Checking for a Low Resource Language (Wiechetek et al., RANLP 2021)
PDF: https://aclanthology.org/2021.ranlp-1.171.pdf
