Comparing Neural Network Parsers for a Less-resourced and Morphologically-rich Language: Amharic Dependency Parser

Binyam Ephrem Seyoum, Yusuke Miyao, Baye Yimam Mekonnen


Abstract
In this paper, we compare four state-of-the-art neural network dependency parsers for the Semitic language Amharic. Because Amharic is a morphologically rich and less-resourced language, the out-of-vocabulary (OOV) problem is more severe when we develop data-driven models. This makes it difficult for researchers to develop neural network parsers, because neural networks require large quantities of training data. We empirically evaluate neural network parsers trained on a small Amharic treebank. In our experiment, we obtain a labeled attachment score (LAS) of 83.79 using the UDPipe system. Accuracy improves when the neural parsing system uses external resources such as word embeddings: with such resources, the LAS for UDPipe rises to 85.26. Our experiment shows that neural networks can learn dependency relations better from limited data, whereas segmentation and POS tagging require much more data.
Anthology ID:
2020.rail-1.5
Volume:
Proceedings of the first workshop on Resources for African Indigenous Languages
Month:
May
Year:
2020
Address:
Marseille, France
Venues:
LREC | RAIL | WS
Publisher:
European Language Resources Association (ELRA)
Pages:
25–30
Language:
English
URL:
https://aclanthology.org/2020.rail-1.5
Cite (ACL):
Binyam Ephrem Seyoum, Yusuke Miyao, and Baye Yimam Mekonnen. 2020. Comparing Neural Network Parsers for a Less-resourced and Morphologically-rich Language: Amharic Dependency Parser. In Proceedings of the first workshop on Resources for African Indigenous Languages, pages 25–30, Marseille, France. European Language Resources Association (ELRA).
Cite (Informal):
Comparing Neural Network Parsers for a Less-resourced and Morphologically-rich Language: Amharic Dependency Parser (Seyoum et al., RAIL 2020)
PDF:
https://aclanthology.org/2020.rail-1.5.pdf