Chiara Argese
2021
Rules Ruling Neural Networks - Neural vs. Rule-Based Grammar Checking for a Low Resource Language
Linda Wiechetek | Flammie Pirinen | Mika Hämäläinen | Chiara Argese
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021)
We investigate both rule-based and machine learning methods for the task of compound error correction and evaluate their efficiency for North Sámi, a low-resource language. The lack of error-free data needed for a neural approach is a challenge to the development of these tools, one that bigger languages do not share. To compensate, we used a rule-based grammar checker to remove erroneous sentences and then inserted compound errors by splitting correct compounds. We describe how we set up the error detection rules and how we train a bi-RNN-based neural network. The precision of the rule-based model tested on a corpus with real errors (81.0%) is slightly better than that of the neural model (79.4%). The rule-based model is also more flexible with regard to fixing specific errors requested by the user community. However, the neural model has better recall (98%). The results suggest that an approach that combines the advantages of both models would be desirable in the future. Our tools and data sets are open-source and freely available on GitHub and Zenodo.
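The synthetic-error step mentioned in the abstract (splitting correct compounds to create training errors) can be illustrated with a minimal sketch. This is not the authors' pipeline: the lexicon `COMPOUND_PARTS`, the function `insert_compound_errors`, and the `rate` parameter are all hypothetical names chosen for illustration, and a real system would presumably obtain compound boundaries from a morphological analyser rather than a hand-written dictionary.

```python
import random

# Hypothetical toy lexicon mapping a compound to its parts.
# Illustrative only; not taken from the paper's data.
COMPOUND_PARTS = {
    "sámegiella": ("sáme", "giella"),
    "mánáidgárdi": ("mánáid", "gárdi"),
}


def insert_compound_errors(sentence, parts, rate=1.0, rng=None):
    """Return a copy of `sentence` in which known compounds are split.

    Each whitespace-separated token found in `parts` is replaced, with
    probability `rate`, by its parts joined with a space. The split form
    is emitted in lower case, which is a simplification of this sketch.
    """
    rng = rng or random.Random(0)  # fixed seed keeps the sketch reproducible
    out = []
    for token in sentence.split():
        pieces = parts.get(token.lower())
        if pieces is not None and rng.random() < rate:
            out.append(" ".join(pieces))  # inject the compound error
        else:
            out.append(token)  # leave the token untouched
    return " ".join(out)


def make_training_pair(sentence, parts):
    """Pair an error-injected sentence with its correct original."""
    return insert_compound_errors(sentence, parts), sentence
```

Calling `make_training_pair` on a sentence that has already passed the grammar checker would yield an (erroneous, correct) pair of the kind a neural corrector could be trained on, under the assumptions stated above.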
2020
Suoidne-varra-bleahkka-mála-bihkka-senet-dielku ‘hay-blood-ink-paint-tar-mustard-stain’ - Should compounds be lexicalized in NLP?
Linda Wiechetek | Chiara Argese | Tommi A Pirinen | Trond Trosterud
Proceedings of the Seventh Italian Conference on Computational Linguistics (CLiC-it 2020)
2018
Using authentic texts for grammar exercises for a minority language
Lene Antonsen | Chiara Argese
Proceedings of the 7th workshop on NLP for Computer Assisted Language Learning
Co-authors
- Linda Wiechetek 2
- Lene Antonsen 1
- Mika Hämäläinen 1
- Tommi A. Pirinen 1
- Flammie Pirinen 1