Incorporating Inner-word and Out-word Features for Mongolian Morphological Segmentation

Na Liu, Xiangdong Su, Haoran Zhang, Guanglai Gao, Feilong Bao


Abstract
Mongolian morphological segmentation is regarded as a crucial preprocessing step in many Mongolian-related NLP applications and has received extensive attention. Recently, end-to-end segmentation approaches based on long short-term memory (LSTM) networks have achieved excellent results. However, the inner-word features among the characters of a word and the out-word features from its context are not well utilized in the segmentation process. In this paper, we propose a neural network that incorporates inner-word and out-word features for Mongolian morphological segmentation. The network consists of two encoders and one decoder. The inner-word encoder uses a self-attention mechanism to capture the inner-word features of the target word. The out-word encoder employs a two-layer BiLSTM network to extract out-word features from the sentence. The decoder then adopts a multi-head double attention layer to fuse the inner-word and out-word features and produce the segmentation result. The evaluation experiments compare the proposed network with the baselines and explore the effectiveness of the sub-modules.
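The fusion step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single attention head, uses no learned projections (each feature vector serves as both key and value), and fuses the two context vectors by plain concatenation. The function names `attend` and `double_attention` and the toy feature values are hypothetical and chosen only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, memory):
    """Scaled dot-product attention of one query over a list of vectors.

    Assumption: each memory vector is used as both key and value.
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, vec)) / scale for vec in memory]
    weights = softmax(scores)
    dim = len(memory[0])
    context = [sum(w * vec[i] for w, vec in zip(weights, memory))
               for i in range(dim)]
    return context, weights


def double_attention(query, inner_feats, outer_feats):
    """Attend separately over inner-word and out-word features, then fuse.

    Assumption: fusion is concatenation of the two context vectors.
    """
    inner_ctx, inner_w = attend(query, inner_feats)
    outer_ctx, outer_w = attend(query, outer_feats)
    return inner_ctx + outer_ctx, inner_w, outer_w


# Toy inputs (hypothetical): three character-level vectors for the target
# word and two context-level vectors for the surrounding sentence.
inner = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outer = [[0.5, 0.5], [0.0, 2.0]]
fused, w_in, w_out = double_attention([1.0, 0.0], inner, outer)
```

Here `fused` has twice the feature dimension because the two contexts are concatenated; a multi-head variant would repeat this per head with separate learned projections before combining the heads.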
Anthology ID:
2020.coling-main.408
Volume:
Proceedings of the 28th International Conference on Computational Linguistics
Month:
December
Year:
2020
Address:
Barcelona, Spain (Online)
Editors:
Donia Scott, Nuria Bel, Chengqing Zong
Venue:
COLING
Publisher:
International Committee on Computational Linguistics
Pages:
4638–4648
URL:
https://aclanthology.org/2020.coling-main.408
DOI:
10.18653/v1/2020.coling-main.408
Cite (ACL):
Na Liu, Xiangdong Su, Haoran Zhang, Guanglai Gao, and Feilong Bao. 2020. Incorporating Inner-word and Out-word Features for Mongolian Morphological Segmentation. In Proceedings of the 28th International Conference on Computational Linguistics, pages 4638–4648, Barcelona, Spain (Online). International Committee on Computational Linguistics.
Cite (Informal):
Incorporating Inner-word and Out-word Features for Mongolian Morphological Segmentation (Liu et al., COLING 2020)
PDF:
https://aclanthology.org/2020.coling-main.408.pdf